CodePage and CharSet

37 IBM037 IBM EBCDIC (US-Canada)
437 IBM437 OEM United States
500 IBM500 IBM EBCDIC (International)
708 ASMO-708 Arabic (ASMO 708)
720 DOS-720 Arabic (DOS)
737 ibm737 Greek (DOS)
775 ibm775 Baltic (DOS)
850 ibm850 Western European (DOS)
852 ibm852 Central European (DOS)
855 IBM855 OEM Cyrillic
857 ibm857 Turkish (DOS)
858 IBM00858 OEM Multilingual Latin I
860 IBM860 Portuguese (DOS)
861 ibm861 Icelandic (DOS)
862 DOS-862 Hebrew (DOS)
863 IBM863 French Canadian (DOS)
864 IBM864 Arabic (864)
865 IBM865 Nordic (DOS)
866 cp866 Cyrillic (DOS)
869 ibm869 Greek, Modern (DOS)
870 IBM870 IBM EBCDIC (Multilingual Latin-2)
874 windows-874 Thai (Windows)
875 cp875 IBM EBCDIC (Greek Modern)
932 shift_jis Japanese (Shift-JIS)
936 gb2312 Chinese Simplified (GB2312)
949 ks_c_5601-1987 Korean
950 big5 Chinese Traditional (Big5)
1026 IBM1026 IBM EBCDIC (Turkish Latin-5)
1047 IBM01047 IBM Latin-1
1140 IBM01140 IBM EBCDIC (US-Canada-Euro)
1141 IBM01141 IBM EBCDIC (Germany-Euro)
1142 IBM01142 IBM EBCDIC (Denmark-Norway-Euro)
1143 IBM01143 IBM EBCDIC (Finland-Sweden-Euro)
1144 IBM01144 IBM EBCDIC (Italy-Euro)
1145 IBM01145 IBM EBCDIC (Spain-Euro)
1146 IBM01146 IBM EBCDIC (UK-Euro)
1147 IBM01147 IBM EBCDIC (France-Euro)
1148 IBM01148 IBM EBCDIC (International-Euro)
1149 IBM01149 IBM EBCDIC (Icelandic-Euro)
1200 utf-16 Unicode
1201 UnicodeFFFE Unicode (Big-Endian)
1250 windows-1250 Central European (Windows)
1251 windows-1251 Cyrillic (Windows)
1252 Windows-1252 Western European (Windows)
1253 windows-1253 Greek (Windows)
1254 windows-1254 Turkish (Windows)
1255 windows-1255 Hebrew (Windows)
1256 windows-1256 Arabic (Windows)
1257 windows-1257 Baltic (Windows)
1258 windows-1258 Vietnamese (Windows)
1361 Johab Korean (Johab)
10000 macintosh Western European (Mac)
10001 x-mac-japanese Japanese (Mac)
10002 x-mac-chinesetrad Chinese Traditional (Mac)
10003 x-mac-korean Korean (Mac)
10004 x-mac-arabic Arabic (Mac)
10005 x-mac-hebrew Hebrew (Mac)
10006 x-mac-greek Greek (Mac)
10007 x-mac-cyrillic Cyrillic (Mac)
10008 x-mac-chinesesimp Chinese Simplified (Mac)
10010 x-mac-romanian Romanian (Mac)
10017 x-mac-ukrainian Ukrainian (Mac)
10021 x-mac-thai Thai (Mac)
10029 x-mac-ce Central European (Mac)
10079 x-mac-icelandic Icelandic (Mac)
10081 x-mac-turkish Turkish (Mac)
10082 x-mac-croatian Croatian (Mac)
20000 x-Chinese-CNS Chinese Traditional (CNS)
20001 x-cp20001 TCA Taiwan
20002 x-Chinese-Eten Chinese Traditional (Eten)
20003 x-cp20003 IBM5550 Taiwan
20004 x-cp20004 TeleText Taiwan
20005 x-cp20005 Wang Taiwan
20105 x-IA5 Western European (IA5)
20106 x-IA5-German German (IA5)
20107 x-IA5-Swedish Swedish (IA5)
20108 x-IA5-Norwegian Norwegian (IA5)
20127 us-ascii US-ASCII
20261 x-cp20261 T.61
20269 x-cp20269 ISO-6937
20273 IBM273 IBM EBCDIC (Germany)
20277 IBM277 IBM EBCDIC (Denmark-Norway)
20278 IBM278 IBM EBCDIC (Finland-Sweden)
20280 IBM280 IBM EBCDIC (Italy)
20284 IBM284 IBM EBCDIC (Spain)
20290 IBM290 IBM EBCDIC (Japanese katakana)
20297 IBM297 IBM EBCDIC (France)
20420 IBM420 IBM EBCDIC (Arabic)
20423 IBM423 IBM EBCDIC (Greek)
20424 IBM424 IBM EBCDIC (Hebrew)
20833 x-EBCDIC-KoreanExtended IBM EBCDIC (Korean Extended)
20838 IBM-Thai IBM EBCDIC (Thai)
20866 koi8-r Cyrillic (KOI8-R)
20871 IBM871 IBM EBCDIC (Icelandic)
20880 IBM880 IBM EBCDIC (Cyrillic Russian)
20905 IBM905 IBM EBCDIC (Turkish)
20924 IBM00924 IBM Latin-1
20932 EUC-JP Japanese (JIS 0208-1990 and 0212-1990)
20936 x-cp20936 Chinese Simplified (GB2312-80)
20949 x-cp20949 Korean Wansung
21025 cp1025 IBM EBCDIC (Cyrillic Serbian-Bulgarian)
21866 koi8-u Cyrillic (KOI8-U)
28591 iso-8859-1 Western European (ISO)
28592 iso-8859-2 Central European (ISO)
28593 iso-8859-3 Latin 3 (ISO)
28594 iso-8859-4 Baltic (ISO)
28595 iso-8859-5 Cyrillic (ISO)
28596 iso-8859-6 Arabic (ISO)
28597 iso-8859-7 Greek (ISO)
28598 iso-8859-8 Hebrew (ISO-Visual)
28599 iso-8859-9 Turkish (ISO)
28603 iso-8859-13 Estonian (ISO)
28605 iso-8859-15 Latin 9 (ISO)
29001 x-Europa Europa
38598 iso-8859-8-i Hebrew (ISO-Logical)
50220 iso-2022-jp Japanese (JIS)
50221 csISO2022JP Japanese (JIS-Allow 1 byte Kana)
50222 iso-2022-jp Japanese (JIS-Allow 1 byte Kana - SO/SI)
50225 iso-2022-kr Korean (ISO)
50227 x-cp50227 Chinese Simplified (ISO-2022)
51932 euc-jp Japanese (EUC)
51936 EUC-CN Chinese Simplified (EUC)
51949 euc-kr Korean (EUC)
52936 hz-gb-2312 Chinese Simplified (HZ)
54936 GB18030 Chinese Simplified (GB18030)
57002 x-iscii-de ISCII Devanagari
57003 x-iscii-be ISCII Bengali
57004 x-iscii-ta ISCII Tamil
57005 x-iscii-te ISCII Telugu
57006 x-iscii-as ISCII Assamese
57007 x-iscii-or ISCII Oriya
57008 x-iscii-ka ISCII Kannada
57009 x-iscii-ma ISCII Malayalam
57010 x-iscii-gu ISCII Gujarati
57011 x-iscii-pa ISCII Punjabi
65000 utf-7 Unicode (UTF-7)
65001 utf-8 Unicode (UTF-8)
12000 utf-32 Unicode (UTF-32)
12001 utf-32BE Unicode (UTF-32 Big-Endian)
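
As a rough illustration of how the identifiers in the list above can be used, the following Python sketch maps a few code page numbers to codec names from Python's standard library and decodes bytes with them. The dictionary `CODE_PAGE_TO_CODEC` and the helper `decode_with_code_page` are hypothetical names introduced only for this example, and the mapping covers a small subset of the list.

```python
# Hypothetical lookup table: a few code page numbers from the list above,
# paired with the Python standard-library codec that handles them.
CODE_PAGE_TO_CODEC = {
    437: "cp437",
    850: "cp850",
    932: "cp932",
    1200: "utf-16-le",
    1201: "utf-16-be",
    1252: "cp1252",
    20127: "ascii",
    20866: "koi8-r",
    28591: "iso8859-1",
    54936: "gb18030",
    65001: "utf-8",
}


def decode_with_code_page(data: bytes, code_page: int) -> str:
    """Decode `data` using the codec registered for `code_page`.

    Raises KeyError for code pages missing from the table above.
    """
    codec_name = CODE_PAGE_TO_CODEC[code_page]
    return data.decode(codec_name)


# The same word stored under three different code pages.
word_1252 = decode_with_code_page(b"caf\xe9", 1252)
word_437 = decode_with_code_page(b"caf\x82", 437)
word_utf8 = decode_with_code_page("café".encode("utf-8"), 65001)
```

All three calls should yield the same string even though the stored bytes differ, which is why the code page has to be known before the bytes can be interpreted.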

The following Windows code pages exist:

  • 874 — Thai
  • 932 — Japanese
  • 936 — Chinese (simplified) (PRC, Singapore)
  • 949 — Korean
  • 950 — Chinese (traditional) (Taiwan, Hong Kong)
  • 1200 — Unicode (BMP of ISO 10646, UTF-16LE)
  • 1201 — Unicode (BMP of ISO 10646, UTF-16BE)
  • 1250 — Latin (Central European languages)
  • 1251 — Cyrillic
  • 1252 — Latin (Western European languages, replacing Code page 850)
  • 1253 — Greek
  • 1254 — Turkish
  • 1255 — Hebrew
  • 1256 — Arabic
  • 1257 — Latin (Baltic languages)
  • 1258 — Vietnamese
  • 65000 — Unicode (BMP of ISO 10646, UTF-7)
  • 65001 — Unicode (BMP of ISO 10646, UTF-8)
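
The difference between the Unicode entries in this list (1200, 1201, and 65001) is only in how the same code points are serialized. A minimal Python sketch, assuming the standard codec names `utf-16-le`, `utf-16-be`, and `utf-8` correspond to those three code pages:

```python
text = "A\u20ac"  # "A" followed by the Euro sign (U+20AC)

little_endian = text.encode("utf-16-le")  # byte order of code page 1200
big_endian = text.encode("utf-16-be")     # byte order of code page 1201
utf8 = text.encode("utf-8")               # code page 65001

# Each UTF-16 code unit is two bytes; only the byte order differs.
for label, data in [("1200", little_endian), ("1201", big_endian), ("65001", utf8)]:
    print(label, data.hex(" "))
```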

Table 2-3 ISO 8859 Character Sets

Standard       Languages Supported

ISO 8859-1     Western European (Albanian, Basque, Breton, Catalan, Danish, Dutch, English, Faroese, Finnish, French, German, Greenlandic, Icelandic, Irish Gaelic, Italian, Latin, Luxemburgish, Norwegian, Portuguese, Rhaeto-Romanic, Scottish Gaelic, Spanish, Swedish)

ISO 8859-2     Eastern European (Albanian, Croatian, Czech, English, German, Hungarian, Latin, Polish, Romanian, Slovak, Slovenian, Serbian)

ISO 8859-3     Southeastern European (Afrikaans, Catalan, Dutch, English, Esperanto, German, Italian, Maltese, Spanish, Turkish)

ISO 8859-4     Northern European (Danish, English, Estonian, Finnish, German, Greenlandic, Latin, Latvian, Lithuanian, Norwegian, Sámi, Slovenian, Swedish)

ISO 8859-5     Eastern European (Cyrillic-based: Bulgarian, Byelorussian, Macedonian, Russian, Serbian, Ukrainian)

ISO 8859-6     Arabic

ISO 8859-7     Greek

ISO 8859-8     Hebrew

ISO 8859-9     Western European (Albanian, Basque, Breton, Catalan, Cornish, Danish, Dutch, English, Finnish, French, Frisian, Galician, German, Greenlandic, Irish Gaelic, Italian, Latin, Luxemburgish, Norwegian, Portuguese, Rhaeto-Romanic, Scottish Gaelic, Spanish, Swedish, Turkish)

ISO 8859-10    Northern European (Danish, English, Estonian, Faroese, Finnish, German, Greenlandic, Icelandic, Irish Gaelic, Latin, Lithuanian, Norwegian, Sámi, Slovenian, Swedish)

ISO 8859-13    Baltic Rim (English, Estonian, Finnish, Latin, Latvian, Norwegian)

ISO 8859-14    Celtic (Albanian, Basque, Breton, Catalan, Cornish, Danish, English, Galician, German, Greenlandic, Irish Gaelic, Italian, Latin, Luxemburgish, Manx Gaelic, Norwegian, Portuguese, Rhaeto-Romanic, Scottish Gaelic, Spanish, Swedish, Welsh)

ISO 8859-15    Western European (Albanian, Basque, Breton, Catalan, Danish, Dutch, English, Estonian, Faroese, Finnish, French, Frisian, Galician, German, Greenlandic, Icelandic, Irish Gaelic, Italian, Latin, Luxemburgish, Norwegian, Portuguese, Rhaeto-Romanic, Scottish Gaelic, Spanish, Swedish)
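
One way to check which ISO 8859 part can represent a given piece of text is simply to try encoding it. The Python sketch below does this for the parts named in the table above; the function name `parts_that_can_encode` is hypothetical, and the codec names of the form `iso8859-N` are those of Python's standard library.

```python
# Parts of ISO 8859 listed in the table above.
ISO_8859_PARTS = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 13, 14, 15]


def parts_that_can_encode(text: str) -> list:
    """Return the ISO 8859 part numbers able to encode every character of `text`."""
    supported = []
    for part in ISO_8859_PARTS:
        try:
            text.encode(f"iso8859-{part}")
        except UnicodeEncodeError:
            continue  # at least one character is missing from this part
        supported.append(part)
    return supported


greek_parts = parts_that_can_encode("Ελλάδα")
cyrillic_parts = parts_that_can_encode("Москва")
ascii_parts = parts_that_can_encode("abc")
```

Plain ASCII text is accepted by every part, while Greek or Cyrillic text is accepted only by the part designed for that script.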

Code Page Identifiers

The following table defines the available code page identifiers.

Note: ANSI code pages can differ between computers, or can be changed on a single computer, which can lead to data corruption. For the most consistent results, applications should use Unicode, such as UTF-8 or UTF-16, instead of a specific code page.

Identifier  .NET Name  Additional information
037 IBM037 IBM EBCDIC US-Canada
437 IBM437 OEM United States
500 IBM500 IBM EBCDIC International
708 ASMO-708 Arabic (ASMO 708)
709   Arabic (ASMO-449+, BCON V4)
710   Arabic - Transparent Arabic
720 DOS-720 Arabic (Transparent ASMO); Arabic (DOS)
737 ibm737 OEM Greek (formerly 437G); Greek (DOS)
775 ibm775 OEM Baltic; Baltic (DOS)
850 ibm850 OEM Multilingual Latin 1; Western European (DOS)
852 ibm852 OEM Latin 2; Central European (DOS)
855 IBM855 OEM Cyrillic (primarily Russian)
857 ibm857 OEM Turkish; Turkish (DOS)
858 IBM00858 OEM Multilingual Latin 1 + Euro symbol
860 IBM860 OEM Portuguese; Portuguese (DOS)
861 ibm861 OEM Icelandic; Icelandic (DOS)
862 DOS-862 OEM Hebrew; Hebrew (DOS)
863 IBM863 OEM French Canadian; French Canadian (DOS)
864 IBM864 OEM Arabic; Arabic (864)
865 IBM865 OEM Nordic; Nordic (DOS)
866 cp866 OEM Russian; Cyrillic (DOS)
869 ibm869 OEM Modern Greek; Greek, Modern (DOS)
870 IBM870 IBM EBCDIC Multilingual/ROECE (Latin 2); IBM EBCDIC Multilingual Latin 2
874 windows-874 ANSI/OEM Thai (ISO 8859-11); Thai (Windows)
875 cp875 IBM EBCDIC Greek Modern
932 shift_jis ANSI/OEM Japanese; Japanese (Shift-JIS)
936 gb2312 ANSI/OEM Simplified Chinese (PRC, Singapore); Chinese Simplified (GB2312)
949 ks_c_5601-1987 ANSI/OEM Korean (Unified Hangul Code)
950 big5 ANSI/OEM Traditional Chinese (Taiwan; Hong Kong SAR, PRC); Chinese Traditional (Big5)
1026 IBM1026 IBM EBCDIC Turkish (Latin 5)
1047 IBM01047 IBM EBCDIC Latin 1/Open System
1140 IBM01140 IBM EBCDIC US-Canada (037 + Euro symbol); IBM EBCDIC (US-Canada-Euro)
1141 IBM01141 IBM EBCDIC Germany (20273 + Euro symbol); IBM EBCDIC (Germany-Euro)
1142 IBM01142 IBM EBCDIC Denmark-Norway (20277 + Euro symbol); IBM EBCDIC (Denmark-Norway-Euro)
1143 IBM01143 IBM EBCDIC Finland-Sweden (20278 + Euro symbol); IBM EBCDIC (Finland-Sweden-Euro)
1144 IBM01144 IBM EBCDIC Italy (20280 + Euro symbol); IBM EBCDIC (Italy-Euro)
1145 IBM01145 IBM EBCDIC Latin America-Spain (20284 + Euro symbol); IBM EBCDIC (Spain-Euro)
1146 IBM01146 IBM EBCDIC United Kingdom (20285 + Euro symbol); IBM EBCDIC (UK-Euro)
1147 IBM01147 IBM EBCDIC France (20297 + Euro symbol); IBM EBCDIC (France-Euro)
1148 IBM01148 IBM EBCDIC International (500 + Euro symbol); IBM EBCDIC (International-Euro)
1149 IBM01149 IBM EBCDIC Icelandic (20871 + Euro symbol); IBM EBCDIC (Icelandic-Euro)
1200 utf-16 Unicode UTF-16, little endian byte order (BMP of ISO 10646); available only to managed applications
1201 unicodeFFFE Unicode UTF-16, big endian byte order; available only to managed applications
1250 windows-1250 ANSI Central European; Central European (Windows)
1251 windows-1251 ANSI Cyrillic; Cyrillic (Windows)
1252 windows-1252 ANSI Latin 1; Western European (Windows)
1253 windows-1253 ANSI Greek; Greek (Windows)
1254 windows-1254 ANSI Turkish; Turkish (Windows)
1255 windows-1255 ANSI Hebrew; Hebrew (Windows)
1256 windows-1256 ANSI Arabic; Arabic (Windows)
1257 windows-1257 ANSI Baltic; Baltic (Windows)
1258 windows-1258 ANSI/OEM Vietnamese; Vietnamese (Windows)
1361 Johab Korean (Johab)
10000 macintosh MAC Roman; Western European (Mac)
10001 x-mac-japanese Japanese (Mac)
10002 x-mac-chinesetrad MAC Traditional Chinese (Big5); Chinese Traditional (Mac)
10003 x-mac-korean Korean (Mac)
10004 x-mac-arabic Arabic (Mac)
10005 x-mac-hebrew Hebrew (Mac)
10006 x-mac-greek Greek (Mac)
10007 x-mac-cyrillic Cyrillic (Mac)
10008 x-mac-chinesesimp MAC Simplified Chinese (GB 2312); Chinese Simplified (Mac)
10010 x-mac-romanian Romanian (Mac)
10017 x-mac-ukrainian Ukrainian (Mac)
10021 x-mac-thai Thai (Mac)
10029 x-mac-ce MAC Latin 2; Central European (Mac)
10079 x-mac-icelandic Icelandic (Mac)
10081 x-mac-turkish Turkish (Mac)
10082 x-mac-croatian Croatian (Mac)
12000 utf-32 Unicode UTF-32, little endian byte order; available only to managed applications
12001 utf-32BE Unicode UTF-32, big endian byte order; available only to managed applications
20000 x-Chinese-CNS CNS Taiwan; Chinese Traditional (CNS)
20001 x-cp20001 TCA Taiwan
20002 x-Chinese-Eten Eten Taiwan; Chinese Traditional (Eten)
20003 x-cp20003 IBM5550 Taiwan
20004 x-cp20004 TeleText Taiwan
20005 x-cp20005 Wang Taiwan
20105 x-IA5 IA5 (IRV International Alphabet No. 5, 7-bit); Western European (IA5)
20106 x-IA5-German IA5 German (7-bit)
20107 x-IA5-Swedish IA5 Swedish (7-bit)
20108 x-IA5-Norwegian IA5 Norwegian (7-bit)
20127 us-ascii US-ASCII (7-bit)
20261 x-cp20261 T.61
20269 x-cp20269 ISO 6937 Non-Spacing Accent
20273 IBM273 IBM EBCDIC Germany
20277 IBM277 IBM EBCDIC Denmark-Norway
20278 IBM278 IBM EBCDIC Finland-Sweden
20280 IBM280 IBM EBCDIC Italy
20284 IBM284 IBM EBCDIC Latin America-Spain
20285 IBM285 IBM EBCDIC United Kingdom
20290 IBM290 IBM EBCDIC Japanese Katakana Extended
20297 IBM297 IBM EBCDIC France
20420 IBM420 IBM EBCDIC Arabic
20423 IBM423 IBM EBCDIC Greek
20424 IBM424 IBM EBCDIC Hebrew
20833 x-EBCDIC-KoreanExtended IBM EBCDIC Korean Extended
20838 IBM-Thai IBM EBCDIC Thai
20866 koi8-r Russian (KOI8-R); Cyrillic (KOI8-R)
20871 IBM871 IBM EBCDIC Icelandic
20880 IBM880 IBM EBCDIC Cyrillic Russian
20905 IBM905 IBM EBCDIC Turkish
20924 IBM00924 IBM EBCDIC Latin 1/Open System (1047 + Euro symbol)
20932 EUC-JP Japanese (JIS 0208-1990 and 0212-1990)
20936 x-cp20936 Simplified Chinese (GB2312); Chinese Simplified (GB2312-80)
20949 x-cp20949 Korean Wansung
21025 cp1025 IBM EBCDIC Cyrillic Serbian-Bulgarian
21027   (deprecated)
21866 koi8-u Ukrainian (KOI8-U); Cyrillic (KOI8-U)
28591 iso-8859-1 ISO 8859-1 Latin 1; Western European (ISO)
28592 iso-8859-2 ISO 8859-2 Central European; Central European (ISO)
28593 iso-8859-3 ISO 8859-3 Latin 3
28594 iso-8859-4 ISO 8859-4 Baltic
28595 iso-8859-5 ISO 8859-5 Cyrillic
28596 iso-8859-6 ISO 8859-6 Arabic
28597 iso-8859-7 ISO 8859-7 Greek
28598 iso-8859-8 ISO 8859-8 Hebrew; Hebrew (ISO-Visual)
28599 iso-8859-9 ISO 8859-9 Turkish
28603 iso-8859-13 ISO 8859-13 Estonian
28605 iso-8859-15 ISO 8859-15 Latin 9
29001 x-Europa Europa 3
38598 iso-8859-8-i ISO 8859-8 Hebrew; Hebrew (ISO-Logical)
50220 iso-2022-jp ISO 2022 Japanese with no halfwidth Katakana; Japanese (JIS)
50221 csISO2022JP ISO 2022 Japanese with halfwidth Katakana; Japanese (JIS-Allow 1 byte Kana)
50222 iso-2022-jp ISO 2022 Japanese JIS X 0201-1989; Japanese (JIS-Allow 1 byte Kana - SO/SI)
50225 iso-2022-kr ISO 2022 Korean
50227 x-cp50227 ISO 2022 Simplified Chinese; Chinese Simplified (ISO 2022)
50229   ISO 2022 Traditional Chinese
50930   EBCDIC Japanese (Katakana) Extended
50931   EBCDIC US-Canada and Japanese
50933   EBCDIC Korean Extended and Korean
50935   EBCDIC Simplified Chinese Extended and Simplified Chinese
50936   EBCDIC Simplified Chinese
50937   EBCDIC US-Canada and Traditional Chinese
50939   EBCDIC Japanese (Latin) Extended and Japanese
51932 euc-jp EUC Japanese
51936 EUC-CN EUC Simplified Chinese; Chinese Simplified (EUC)
51949 euc-kr EUC Korean
51950   EUC Traditional Chinese
52936 hz-gb-2312 HZ-GB2312 Simplified Chinese; Chinese Simplified (HZ)
54936 GB18030 Windows XP and later: GB18030 Simplified Chinese (4 byte); Chinese Simplified (GB18030)
57002 x-iscii-de ISCII Devanagari
57003 x-iscii-be ISCII Bengali
57004 x-iscii-ta ISCII Tamil
57005 x-iscii-te ISCII Telugu
57006 x-iscii-as ISCII Assamese
57007 x-iscii-or ISCII Oriya
57008 x-iscii-ka ISCII Kannada
57009 x-iscii-ma ISCII Malayalam
57010 x-iscii-gu ISCII Gujarati
57011 x-iscii-pa ISCII Punjabi
65000 utf-7 Unicode (UTF-7)
65001 utf-8 Unicode (UTF-8)
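
Several entries in the table above are described as an older code page plus the Euro symbol (for example 858 and 1140). The Python sketch below compares two single-byte codecs byte by byte to show where they differ. The helper name `differing_bytes` is hypothetical; `cp850`, `cp858`, `cp037`, and `cp1140` are standard-library codec names assumed here to correspond to the identically numbered code pages.

```python
def differing_bytes(codec_a: str, codec_b: str) -> dict:
    """Map each byte value to (char_a, char_b) where two single-byte codecs disagree."""
    differences = {}
    for value in range(256):
        raw = bytes([value])
        # errors="replace" keeps the loop going if a byte is undefined in a codec.
        char_a = raw.decode(codec_a, errors="replace")
        char_b = raw.decode(codec_b, errors="replace")
        if char_a != char_b:
            differences[value] = (char_a, char_b)
    return differences


oem_diff = differing_bytes("cp850", "cp858")
ebcdic_diff = differing_bytes("cp037", "cp1140")

for value, (old, new) in oem_diff.items():
    print(f"0x{value:02X}: {old!r} -> {new!r}")
```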


IBM PC (OEM) code pages

These code pages were originally embedded directly in the text-mode hardware of the graphics adapters used with the IBM PC and its clones, including the original MDA and CGA adapters, whose character sets could only be changed by physically replacing the ROM chip that contained the font. The interface of those adapters (emulated by all later adapters such as VGA) was typically limited to single-byte character sets with only 256 characters in each font/encoding (although VGA added partial support for slightly larger character sets). Since the original IBM PC code page (number 437) was not really designed for international use, several partially compatible country- or region-specific variants emerged. Microsoft refers to these as the OEM code pages because they were defined by the OEMs who licensed MS-DOS for distribution with their hardware, not by Microsoft or a standards body. Examples from the table above include 437 (OEM United States), 737 (OEM Greek), 850 (OEM Multilingual Latin 1), 852 (OEM Latin 2), and 866 (OEM Russian).

When dealing with older hardware, protocols and file formats, it is often necessary to support these code pages, but use of newer code pages, in particular Unicode, is encouraged for new designs.
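
When reading such legacy data, the code page has to be chosen explicitly, because the same bytes mean different things under different OEM code pages. A minimal Python sketch, assuming the standard-library codecs `cp437` and `cp850`; the helper `decode_oem` and the sample byte values are made up for illustration:

```python
def decode_oem(data: bytes, codec: str = "cp437") -> str:
    """Decode legacy single-byte OEM text; `codec` is a Python codec name."""
    return data.decode(codec)


frame = bytes([0xC9, 0xCD, 0xCD, 0xBB])  # top edge of a DOS text-mode box
price = bytes([0x9B, 0x9D])              # two bytes that differ between 437 and 850

top_border = decode_oem(frame)           # box-drawing characters under 437
price_437 = decode_oem(price, "cp437")   # currency symbols
price_850 = decode_oem(price, "cp850")   # accented letters instead
```

Under code page 437 the second pair of bytes decodes to the cent and yen signs, while under code page 850 the same bytes decode to the letters ø and Ø.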

Code page 819 is identical to Latin-1 (ISO/IEC 8859-1) and, with slightly modified commands, permits MS-DOS machines to use that encoding. It was used with IBM AS/400 minicomputers.

Code pages for DBCS character sets

These code pages, such as 932, 936, 949, and 950 in the tables above, represent DBCS (double-byte character set) encodings for various CJK languages. In Microsoft operating systems, these are used as both the "OEM" and "ANSI" code page for the applicable locale.
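
In a DBCS encoding, ASCII characters still occupy one byte while CJK characters occupy two. The Python sketch below measures this per character; `byte_lengths` is a hypothetical helper, and the codec names `cp932`, `cp936`, `cp949`, and `cp950` are the standard-library codecs assumed to match the identically numbered code pages.

```python
def byte_lengths(text: str, codec: str) -> list:
    """Return the encoded length in bytes of each character of `text`."""
    return [len(character.encode(codec)) for character in text]


japanese = byte_lengths("A日本", "cp932")            # Shift-JIS
simplified_chinese = byte_lengths("A中", "cp936")    # Simplified Chinese
korean = byte_lengths("A한", "cp949")                # Korean
traditional_chinese = byte_lengths("A中", "cp950")   # Big5
```

Because character boundaries are not at fixed offsets, code that slices such byte strings at arbitrary positions can cut a two-byte character in half.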

Microsoft code page numbers for various other character encodings

The following code page numbers are specific to Microsoft Windows. IBM may use different numbers for these code pages.

Miscellaneous

Windows (ANSI) code pages

Microsoft defined a number of code pages known as the ANSI code pages (so called because the first one, 1252, was based on an apocryphal ANSI draft of what became ISO 8859-1). Code page 1252 is built on ISO 8859-1 but uses the range 0x80-0x9F for extra printable characters rather than the C1 control codes used in ISO 8859-1. Some of the others are based in part on other parts of ISO 8859 but are often rearranged to make them closer to 1252.
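
The difference in the 0x80-0x9F range can be seen by decoding the same bytes both ways. A small Python sketch, assuming the standard-library codecs `cp1252` and `iso8859-1`; the sample bytes are chosen only for illustration:

```python
sample = bytes([0x80, 0x93, 0x94])  # three bytes from the 0x80-0x9F range

as_windows_1252 = sample.decode("cp1252")   # printable: Euro sign and curly quotes
as_iso_8859_1 = sample.decode("iso8859-1")  # C1 control codes U+0080, U+0093, U+0094

# From 0xA0 upward the two encodings agree.
upper_range = bytes(range(0xA0, 0x100))
same_above_a0 = upper_range.decode("cp1252") == upper_range.decode("iso8859-1")
```

This is why text saved as Windows-1252 but labeled ISO 8859-1 tends to lose its quotation marks and Euro signs while everything else survives.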

Microsoft recommends applications use UTF-8 or UCS-2/UTF-16 instead of these code pages.[8]
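
The reason for this recommendation can be sketched in a few lines of Python: bytes written under one ANSI code page and read under another silently turn into different text, whereas a Unicode encoding does not depend on the reader's code page. The codec names `cp1251`, `cp1252`, and `utf-8` are standard-library codecs; the sample string is arbitrary.

```python
original = "Привет"

# Written on a machine whose ANSI code page is 1251 (Cyrillic) ...
stored = original.encode("cp1251")
# ... and read on a machine whose ANSI code page is 1252 (Western European).
misread = stored.decode("cp1252")

# With UTF-8 the bytes carry the same meaning on both machines.
round_tripped = original.encode("utf-8").decode("utf-8")

print(misread)        # garbled Latin letters, no error raised
print(round_tripped)
```

Note that the misreading raises no exception, which is what makes this kind of corruption easy to miss.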


