How I can obtain the collation of a specific table in a database?
How I can obtain the collation of a specific table in a database?
回答1
Collation at the table level is on a per column basis, so it is possible to have a collation different than the database. If the collation is not defined at the column level, it defaults to the database collation setting.
SQL Server 2000:
SELECT c.name,
c.collation
FROM SYSCOLUMNS c
WHERE [id] = OBJECT_ID('your_table_name')
SQL Server 2005+:
SELECT c.name,
c.collation_name
FROM SYS.COLUMNS c
JOIN SYS.TABLES t ON t.object_id = c.object_id
WHERE t.name = 'your_table_name'
获取数据库默认的collation
Temp Table collation conflict - Error : Cannot resolve the collation conflict between Latin1* and SQL_Latin1*
回答1
This happens because the collations on #tempdb.temp_po.OD1
and STR_IndentDetail.D1
are different (and specifically, note that #tempdb
is a different, system database, which is generally why it will have a default opinion for collation, unlike your own databases and tables where you may have provided more specific opinions).
Since you have control over the creation of the temp table, the easiest way to solve this appears to be to create *char columns in the temp table with the same collation as your STR_IndentDetail
table:
CREATE TABLE #temp_po(
IndentID INT,
OIndentDetailID INT,
OD1 VARCHAR(50) COLLATE SQL_Latin1_General_CP1_CI_AS,
.. Same for the other *char columns
In the situation where you don't have control over the table creation, when you join the columns, another way is to add explicit COLLATE
statements in the DML where errors occur, either via COLLATE SQL_Latin1_General_CP1_CI_AS
or easier, using COLLATE DATABASE_DEFAULT
SELECT * FROM #temp_po t INNER JOIN STR_IndentDetail s
ON t.OD1 = s.D1 COLLATE SQL_Latin1_General_CP1_CI_AS;
OR, easier
SELECT * FROM #temp_po t INNER JOIN STR_IndentDetail s
ON t.OD1 = s.D1 COLLATE DATABASE_DEFAULT;
Collations and Case Sensitivity
Database collation
In most database systems, a default collation is defined at the database level; unless overridden, that collation implicitly applies to all text operations occurring within that database. The database collation is typically set at database creation time (via the CREATE DATABASE
DDL statement), and if not specified, defaults to a some server-level value determined at setup time. For example, the default server-level collation in SQL Server for the "English (United States)" machine locale is SQL_Latin1_General_CP1_CI_AS
, which is a case-insensitive, accent-sensitive collation. Although database systems usually do permit altering the collation of an existing database, doing so can lead to complications; it is recommended to pick a collation before database creation.
When using EF Core migrations to manage your database schema, the following in your model's OnModelCreating
method configures a SQL Server database to use a case-sensitive collation:
modelBuilder.UseCollation("SQL_Latin1_General_CP1_CS_AS");
Column collation
Collations can also be defined on text columns, overriding the database default. This can be useful if certain columns need to be case-insensitive, while the rest of the database needs to be case-sensitive.
When using EF Core migrations to manage your database schema, the following configures the column for the Name
property to be case-insensitive in a database that is otherwise configured to be case-sensitive:
modelBuilder.Entity<Customer>().Property(c => c.Name)
.UseCollation("SQL_Latin1_General_CP1_CI_AS");
SQL Server Collation介绍及其变更对数据的影响
字符的存储
在将collation之前,我们首先需要知道字符是如何被存储的。在计算机中,所有数据都是用0和1这样的位来描述。一个字节有8位,因此一个字节最多可以描述256个字符。在欧美国家,比如美国,他们的文字字符主要就是26个字母加上一些特殊符号(+-*/等),用一个字节就可以存储,一个国家使用的所有字符就是一个code page,用一个字节存储字符的code page 叫做single-byte code page。但是在亚洲的一些国家,比如中国,常用汉字有几万个,根本不能用一个字节来表示所有的汉字字符,因此需要用两个字节描述。因为两个字节有16位,最多可以描述65536个字符,足够用来描述所有汉语字符以及常用字符,这些字符也是一个code page,不过是double-byte code page,主要针对的是中国。每个国家都有一个code page来对应所使用的字符。比如欧美国家,他们使用拉丁,虽然a-z这26个字母所对应的二进制在code page中是相同的,但是在重音('é'和'á')方面是不相同的。所以如果code page不同,那么相同的二进制代码所表示出来的字符也可能不相同。如果数据在不同code page的计算机上传输,就需要进行code page的转换,如果接收方的code page上没有定义传输方传送过来的特定字符的二进制位,那么就会出现数据丢失。
Collation的组成
在讲完字符的存储形式以后,我们就可以讲什么是collation了。Collation描述了数据在数据库中是按照什么规则来描述字符,以及字符时如何被排序和比较的。在SQL Server中,Collation由两部分组成,比如中国的一个collation是 Chinese_PRC_CI_AI_WS ,前半部份是指的是所支持的字符集,与code page相对应,如Chinese_PRC 对应的代码页是936,在这个code page中定义了所有能够使用的字符。后半部CI_AI_WS用于表示排序规则,比如:
- _CI(CS)表示是否区分字母大小写,CI不区分,CS区分。如果区分大小写,那么排序的时候小写字母的排在大写的前面;如果不区分大小写,那么排序的时候视大小写字母相同。
- _AI(AS) 表示是否区分重音,AI不区分,AS区分。如果不区分重音,那么排序的时候视“a”和“ấ”为相同字符
- _KI(KS) 表示是否区分假名类型,KI不区分,KS区分。在日语中应用。
- _WI(WS) 表示是否区分全半角,WI不区分,WS区分。半角是单字节,全角是双字节。
Collation的四个级别
Collation一共有四个级别,分别是server-level, database-level, column-level和expression-level。
作者:Chuck Lu GitHub |
【推荐】国内首个AI IDE,深度理解中文开发场景,立即下载体验Trae
【推荐】编程新体验,更懂你的AI,立即体验豆包MarsCode编程助手
【推荐】抖音旗下AI助手豆包,你的智能百科全书,全免费不限次数
【推荐】轻量又高性能的 SSH 工具 IShell:AI 加持,快人一步
· 全程不用写代码,我用AI程序员写了一个飞机大战
· DeepSeek 开源周回顾「GitHub 热点速览」
· MongoDB 8.0这个新功能碉堡了,比商业数据库还牛
· 记一次.NET内存居高不下排查解决与启示
· 白话解读 Dapr 1.15:你的「微服务管家」又秀新绝活了
2018-11-14 使用sql compare生成的sql语句
2018-11-14 string interpolation in sql server
2015-11-14 Hearthstone-Deck-Tracker汉化处理技巧
2014-11-14 Asynchronous Programming Using Delegates使用委托进行异步编程
2014-11-14 通过委托来实现异步 Delegate的BeginInvoke和EndInvoke
2014-11-14 DataSet 和 DataTable 以及 DataRow