CS3402 Integrity Constraints
Integrity Constraints
Motivation
- To ensure that changes (made by authorized users) to the database do not cause a loss of consistency
- Constraints are conditions that must hold on all valid relation instances.
- In general, an integrity constraint (IC) can be an arbitrary condition about the database
- In practice, they are limited to those that can be tested with minimal overhead
Some familiar ones from the ER model: key declarations | mapping cardinality (eg, M:N, 1:M, ...)
Main types of ICs:
- Non-null
- Key (uniqueness in column)
- Referential integrity (Foreign Key)
- Attribute-based
- Tuple-based
- General assertions
Key constraints
- Superkey of R: A set of attributes SK of R such that no two tuples in any valid relation instance r(R) will have the same value for SK. That is, for any distinct tuples t1 and t2 in r(R), t1[SK] <> t2[SK].
- Candidate Key of R: A "minimal" superkey; that is, a superkey K such that removal of any attribute from K results in a set of attributes that is not a superkey.
- Primary Key of R: choice by the DB designer when there are more than one candidate key
The attribute which are the part of CK are prime attribures CK: Minimum set of attribute used to uniquely differentiate records of the table( distinguish each record seperately/uniquely ) SK: CK + zero or more attributes (Every CK is SK, Every SK cannot be a CK)->Minimal SK is the CK PK: PK is a CK with no null values
Example: The CAR relation schema: CAR(State, Reg#, SerialNo, Make, Model, Year) has two (candidate) keys:
Key1 = {State, Reg#}, Key2 = {SerialNo}, which are also superkeys.
{SerialNo, Make} is a superkey but not a candidate key.
The primary key attributes are usually underlined.
Domain/Entity Constraints
Relational Database Schema: A set S of relation schemas that belong to the same database. S is the name of the database.
S = {R1, R2, ..., Rn}
Domain/Entity Integrity: specify the set of possible values that may be associated with an attribute
E.g.,
C-Name: string of char (30)
Balance: Number (6,2)
...
Domain/entity constraints may also prohibit “null” values for particular attributes
Note: The primary key attributes, PK of each relation schema R in S cannot have null values in any tuple of r(R). This is because primary key values are used to identify the individual tuples.
t[PK] <> null for any tuple t in r(R)
Other attributes of R may be similarly constrained to disallow null values, even though they are not members of the primary key.
Referential Integrity Constraints
- A constraint involving two relations (the previous constraints involve a single relation).
- Used to specify a relationship among tuples in two relations: the referencing relation and thereferenced relation.
- to ensure that a value appears in one relation also appears in another
-
typically this implies some “subset dependency”relationships between 2 sets of attributes in 2 tables
- Also called “foreign key constraint” because S# is the primary key on the foreign table (ie, Supplier)
super key 超键
A super key of an entity set is a set of one or more attributes whose values uniquely determine each entity
在关系中能唯一标识元组的属性集称为关系模式的超键
candidate key 候选键
A candidate key of an entity set is a minimal super key
不含有多余属性的超键称为候选键
举例:Customer_id is candidate key of customer
primary key 主键
Although several candidate keys may exist, one of the candidate keys is selected to be the primary key
用户选作元组标识的一个候选键程序主键
Foreign key 外键
完整性约束的一种,要求该属性在另一表的某个key中出现。
身份证号码 姓名 性别 年龄
身份证号码唯一,所以是一个超键
姓名唯一,所以是一个超键
(姓名,性别)唯一,所以是一个超键
(姓名,性别,年龄)唯一,所以是一个超键
--超键的组合是唯一的,但可能不是最小唯一的
身份证号码唯一,而且没有多余属性,所以是一个候选键
姓名唯一,而且没有多余属性,所以是一个候选键
(身份证号码, 姓名) 不能作为一个候选键, 因为身份证号码已经是唯一, 所以姓名就为多余属性
--候选键是没有多余属性的超键
考虑输入查询方便性,可以选择身份证号码为主键
也可以 考虑习惯 选择 姓名 为主键
--主键是选中的一个候选键
例子:
学号 姓名 性别 年龄 系别 专业 20020612 李辉 男 20 计算机 软件开发 20060613 张明 男 18 计算机 软件开发 20060614 王小玉 女 19 物理 力学 20060615 李淑华 女 17 生物 动物学 20060616 赵静 男 21 化学 食品化学 20060617 赵静 女 20 生物 植物学
问: 下列选项中哪个不能作为候选键?
a){学号}
b){学号、姓名}
c){年龄、系别}
d){姓名、性别}
e){姓名、专业}
a选项{学号}可以唯一的标识一个元组,比如说我们用20060616就可以知道是性别为男,名字为赵静的人。由于没有多余的属性,所以{学号}既可以称为一个超键,也可以称为一个候选键。
b选项{学号、姓名}也可以唯一的标识一个元组,但是这个集合里的姓名属性是不必要的,也就是多余的,即如果没有【姓名】属性,那么这个集合 {学号、姓名}也可以唯一确定一个元组。由于有多余的属性,所以{学号、姓名}可以称为超键,但是不能称为候选键。
c,d,e选项均可以唯一的标识一个元组,并且其中每一个属性均是必须的,去掉任意一个属性,那么其所在的集合便不能唯一的标识一个元组,所以c,d,e 均可以称为超键,也可以称为候选键。