[zz]Python 使用 UTF-8 编码 & Defining Python Source Code Encodings

Python 使用 UTF-8 编码

发表于：2010年1月28日 | 分类：Python | 标签： utf8 | views(3,059)

版权信息: 可以任意转载, 转载时请务必以超链接形式标明文章原文出处, 即下面的声明.

原文出处：http://blog.chenlb.com/2010/01/python-use-utf-8.html

一般我喜欢用 utf-8 编码，在 python 怎么使用呢？

1、在 python 源码文件中用 utf-8 文字。一般会报错，如下：

File "F:\workspace\psh\src\test.py", line 2
SyntaxError: Non-ASCII character '\xe4' in file F:\workspace\psh\src\test.py on line 2, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details

test.py 的内容：

print "你好"

print "你好"

如果要正常运行在 test.py 文件前面加编码注释，如：

#!/usr/bin/python2.6
# -*- coding: utf-8 -*-
print "你好"

http://www.python.org/dev/peps/pep-0263/

Defining Python Source Code Encodings

Defining the Encoding

    Python will default to ASCII as standard encoding if no other
    encoding hints are given.

    To define a source code encoding, a magic comment must
    be placed into the source files either as first or second
    line in the file, such as:

          # coding=<encoding name>

    or (using formats recognized by popular editors)

          #!/usr/bin/python
          # -*- coding: <encoding name> -*-

    or

          #!/usr/bin/python
          # vim: set fileencoding=<encoding name> :

    More precisely, the first or second line must match the regular
    expression "coding[:=]\s*([-\w.]+)". The first group of this
    expression is then interpreted as encoding name. If the encoding
    is unknown to Python, an error is raised during compilation. There
    must not be any Python statement on the line that contains the
    encoding declaration.

    To aid with platforms such as Windows, which add Unicode BOM marks
    to the beginning of Unicode files, the UTF-8 signature
    '\xef\xbb\xbf' will be interpreted as 'utf-8' encoding as well
    (even if no magic encoding comment is given).

    If a source file uses both the UTF-8 BOM mark signature and a
    magic encoding comment, the only allowed encoding for the comment
    is 'utf-8'.  Any other encoding will cause an error.

Examples

    These are some examples to clarify the different styles for
    defining the source code encoding at the top of a Python source
    file:

    1. With interpreter binary and using Emacs style file encoding
       comment:

          #!/usr/bin/python
          # -*- coding: latin-1 -*-
          import os, sys
          ...

          #!/usr/bin/python
          # -*- coding: iso-8859-15 -*-
          import os, sys
          ...

          #!/usr/bin/python
          # -*- coding: ascii -*-
          import os, sys
          ...

    2. Without interpreter line, using plain text:

          # This Python file uses the following encoding: utf-8
          import os, sys
          ...

    3. Text editors might have different ways of defining the file's
       encoding, e.g.

          #!/usr/local/bin/python
          # coding: latin-1
          import os, sys
          ...

    4. Without encoding comment, Python's parser will assume ASCII
       text:

          #!/usr/local/bin/python
          import os, sys
          ...

    5. Encoding comments which don't work:

       Missing "coding:" prefix:

          #!/usr/local/bin/python
          # latin-1
          import os, sys
          ...

       Encoding comment not on line 1 or 2:

          #!/usr/local/bin/python
          #
          # -*- coding: latin-1 -*-
          import os, sys
          ...

       Unsupported encoding:

          #!/usr/local/bin/python
          # -*- coding: utf-42 -*-
          import os, sys
          ...

posted @ 2011-03-21 12:39 bettermanlu 阅读(2727) 评论(0) 编辑收藏举报

刷新页面返回顶部

Justify for QA

Who said testing is easy?

[zz]Python 使用 UTF-8 编码 & Defining Python Source Code Encodings

Python 使用 UTF-8 编码

Defining the Encoding

Examples

公告