Sorting Chinese Text in Oracle
Before Oracle 9i, Chinese text was sorted by its binary encoding by default. Oracle 9i added several new options:
- Sort by pinyin: SCHINESE_PINYIN_M
- Sort by radical: SCHINESE_RADICAL_M
- Sort by stroke count: SCHINESE_STROKE_M
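Oracle 9i itself sorts Chinese by pinyin by default. This does not mean that NLS_SORT = SCHINESE_PINYIN_M is set. It means that when a SQL statement does not specify NLS_SORT, a Chinese column comes back in pinyin order, which differs from the earlier binary-encoding order. The sorts can be applied in two ways: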
- Write the sort directly in the SQL statement, for example:
  - SELECT * FROM TEAM ORDER BY NLSSORT(sort_column_name,'NLS_SORT = SCHINESE_PINYIN_M');
  - SELECT * FROM TEAM ORDER BY NLSSORT(sort_column_name,'NLS_SORT = SCHINESE_STROKE_M');
  - SELECT * FROM TEAM ORDER BY NLSSORT(sort_column_name,'NLS_SORT = SCHINESE_RADICAL_M');
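- Set it through the NLS_SORT initialization parameter. The value can be specified when the database is created, or changed with ALTER SESSION. In the former case it applies to all sessions. For example:
  - Run select * from NLS_SESSION_PARAMETERS; to see the current value of NLS_SORT.
  - Change the server parameter file: alter system set nls_sort='SCHINESE_PINYIN_M' scope=spfile; (with scope=spfile the new value takes effect only after the instance is restarted)
  - Change the current session: alter SESSION set NLS_SORT = SCHINESE_PINYIN_M;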
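The two approaches above can be tried side by side with a small script. This is only a minimal sketch: the table team_demo and its rows are hypothetical and do not come from the original text, and the order of the returned rows depends on the sort that is active.

```sql
-- Hypothetical demo table; the name and data are illustrative only.
CREATE TABLE team_demo (
    id   NUMBER PRIMARY KEY,
    name VARCHAR2(40) NOT NULL
);

INSERT INTO team_demo (id, name) VALUES (1, '张伟');
INSERT INTO team_demo (id, name) VALUES (2, '李娜');
INSERT INTO team_demo (id, name) VALUES (3, '王芳');
INSERT INTO team_demo (id, name) VALUES (4, '陈静');
COMMIT;

-- Approach 1: name the linguistic sort inside the statement itself.
SELECT name
FROM   team_demo
ORDER  BY NLSSORT(name, 'NLS_SORT = SCHINESE_PINYIN_M');

-- Approach 2: set the sort once for the session, then use a plain ORDER BY.
ALTER SESSION SET NLS_SORT = SCHINESE_STROKE_M;

SELECT name
FROM   team_demo
ORDER  BY name;

-- Confirm which sort the session is currently using.
SELECT parameter, value
FROM   nls_session_parameters
WHERE  parameter = 'NLS_SORT';

-- Return to binary ordering when finished.
ALTER SESSION SET NLS_SORT = BINARY;
```

Approach 1 keeps the choice of sort visible in each query. Approach 2 changes the behavior of every ORDER BY in the session, which is convenient but easy to forget about.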
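Pay particular attention to performance here. According to the Oracle documentation, an index on a Chinese column is built in binary order, so when NLS_SORT is set to BINARY the sort can use the index. With any of the three Chinese-specific sorts described above, Oracle cannot use that index for the sort and performs a full table scan instead. Keep this in mind, and compare execution plans and run times with a tool such as PL/SQL Developer. The fix is to create a linguistic index on the column, for example: CREATE INDEX nls_index ON my_table (NLSSORT(name, 'NLS_SORT = SCHINESE_PINYIN_M'));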
The original text from the Oracle documentation reads:
Note:
Setting NLS_SORT to anything other than BINARY causes a sort to use a full table scan, regardless of the path chosen by the optimizer. BINARY is the exception because indexes are built according to a binary order of keys. Thus the optimizer can use an index to satisfy the ORDER BY clause when NLS_SORT is set to BINARY. If NLS_SORT is set to any linguistic sort, the optimizer must include a full table scan and a full sort in the execution plan.
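One way to check whether a linguistic index helps is to look at the execution plan before relying on it. The script below is a sketch under stated assumptions: the definition of my_table is hypothetical (only the table name, the name column, and the index statement come from the text above), and whether the optimizer actually picks the index depends on statistics, data volume, and the Oracle version.

```sql
-- Hypothetical table definition. The column is declared NOT NULL because
-- the optimizer can only use a linguistic index for ORDER BY when it knows
-- that no rows are missing from the index; otherwise the query needs a
-- "WHERE name IS NOT NULL" predicate.
CREATE TABLE my_table (
    id   NUMBER PRIMARY KEY,
    name VARCHAR2(40) NOT NULL
);

-- Linguistic (function-based) index built on the pinyin sort key.
CREATE INDEX nls_index
    ON my_table (NLSSORT(name, 'NLS_SORT = SCHINESE_PINYIN_M'));

-- The ORDER BY expression must match the indexed expression.
EXPLAIN PLAN FOR
SELECT name
FROM   my_table
ORDER  BY NLSSORT(name, 'NLS_SORT = SCHINESE_PINYIN_M');

-- Show the plan that was just generated.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

If the plan still shows a full table scan followed by a sort, check that the sort name in the query (or in the session's NLS_SORT) is the same one used to build the index.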