Database design best practice(1):关于primary key及其它
1. The job of the primary key is to
uniquely identify records, not to store business data ; any use of
business data in a primary key is a dangerous practice, since any
changes to such data will have large ripple effects (from javapractices)
这是不是意味着每一张表都要增加一个没有业务意义的column来做为primary key,这个column的类型是int的,然后可以设置为自增(在sql server里只要把这个字段的identity属性设为Yes就可以了).
讨论一下以上的做法:
(1) 表中如果有column符合传统的做为primary key的条件(值唯一,不可能为空值,极少会变动),比如学号,这种情况下是不是还要增加一个column来表征primary key呢?
似乎没什么必要了.但这个时候的primary key却是表达了一些业务信息.倘若有一天学号的格式发生了变化(如由5位升级为6位) ,的确会比较麻烦,好在这种事情不多.
(2) 一个表里任何一个column都不满足primary key的硬条件,只有数个columns联合起来才能共同确定一条记录.这种情况下倒是建议增加一个column来做primary key.否则,你要查询一条信息还要写好几个条件: where and group = ? and team = ?, 会比较郁闷的.
(3) 如果一个column专门用来做primary key(不包含商业信息),建议使用int类型,这样可以auto increment功能.如果不使用auto increment,插入记录时选定primary的值是比较麻烦的.(使用int类型可能会有数据溢出的风险,但现在int类型都是32位的,最大值有42亿多,一般不会有问题)
2. "数字标志符(numerical identifier)"的类型应该是整数类型还是字符类型?
比如学号,工号,信用卡号.
(1) 如果要保留leading zero,就要用字符类型.
(2) 如果位数很大,也要用字符类型.
(3) 字符串也支持between操作,但是要注意'9'>'10', '09'<'10'(sql server2005中)
3. Do not use SELECT * in your queries. Always write the required column names after the SELECT statement
For reduced disk I/O and better performance
4. 数据库能完成的操作,不要在代码里完成.
比如
排序(order by)
过滤数据(where)
聚合信息(Group by,count,avg,stddev标准方差)
5. Views are generally used to show specific data to specific users based on their interest.
Views are also used to restrict access to the base tables by granting permission only on views.
Yet another significant use of views is that they simplify your queries.
Incorporate your frequently required, complicated joins and calculations into a view so that you don't have to repeat those joins/calculations in all your queries. Instead, just select from the view.
这是不是意味着每一张表都要增加一个没有业务意义的column来做为primary key,这个column的类型是int的,然后可以设置为自增(在sql server里只要把这个字段的identity属性设为Yes就可以了).
讨论一下以上的做法:
(1) 表中如果有column符合传统的做为primary key的条件(值唯一,不可能为空值,极少会变动),比如学号,这种情况下是不是还要增加一个column来表征primary key呢?
似乎没什么必要了.但这个时候的primary key却是表达了一些业务信息.倘若有一天学号的格式发生了变化(如由5位升级为6位) ,的确会比较麻烦,好在这种事情不多.
(2) 一个表里任何一个column都不满足primary key的硬条件,只有数个columns联合起来才能共同确定一条记录.这种情况下倒是建议增加一个column来做primary key.否则,你要查询一条信息还要写好几个条件: where and group = ? and team = ?, 会比较郁闷的.
(3) 如果一个column专门用来做primary key(不包含商业信息),建议使用int类型,这样可以auto increment功能.如果不使用auto increment,插入记录时选定primary的值是比较麻烦的.(使用int类型可能会有数据溢出的风险,但现在int类型都是32位的,最大值有42亿多,一般不会有问题)
2. "数字标志符(numerical identifier)"的类型应该是整数类型还是字符类型?
比如学号,工号,信用卡号.
(1) 如果要保留leading zero,就要用字符类型.
(2) 如果位数很大,也要用字符类型.
(3) 字符串也支持between操作,但是要注意'9'>'10', '09'<'10'(sql server2005中)
3. Do not use SELECT * in your queries. Always write the required column names after the SELECT statement
For reduced disk I/O and better performance
4. 数据库能完成的操作,不要在代码里完成.
比如
排序(order by)
过滤数据(where)
聚合信息(Group by,count,avg,stddev标准方差)
5. Views are generally used to show specific data to specific users based on their interest.
Views are also used to restrict access to the base tables by granting permission only on views.
Yet another significant use of views is that they simplify your queries.
Incorporate your frequently required, complicated joins and calculations into a view so that you don't have to repeat those joins/calculations in all your queries. Instead, just select from the view.