数据库
Data Base
DDL.“DDL是数据定义语言的缩写。定义了不同的数据段数据库表格索引和其他数据库对象。 普通语句的关键词主要包括CreateDropalter等。
RDBMS Relational Database Management System
- Data Base
- Lecture1 Introduction
- Lecture2 Introduction to SQL
- Lecture 3 Retrieving Data from One Table
- Some Functions
- Lecture4 More on Retrieving Data
- Lecture5 Retrieving Data from Multiple Tables
- Lecture6 Update,Delete,Applications
- Lecture 7 - 1:More about NULL; Ordering; Window Function
- Lecture8 - Window Function
- Lection10 Function
- Lecture12 Trigger
- Lecture13: View, Access Control
- Lecture14:Indexing
Lecture1 Introduction
Ted Codd. defines the relational data model
Oracle releases first commercial relational database
Rational Algebrea is the theoretical foundation for relational databases
Key
Duplicates are forbidden in relational tables
key可以是单独的一列,也可以是由多列组合成的
Primary Key
Normalization
“First Normal Form” (1NF)
• Each column should only contain ONE piece of information
一般就把一列拆成几列或者几个表
Normal Form 有UNF(universal),1NF到6NF
Every non key attribute must provide a fact about the key, the whole key, and nothing but the key——William Kent (1936 – 2005)
Entity(实体) and Relationship
• Entity Relationship Diagram (E/R Diagram, ER Diagram, ERD)
A way of representing entity tables and their relationships (relationship tables)
Data Definition Language (DDL)
DDL compiler generates a set of table templates stored in a data dictionary
• Database schema
• Integrity constraints (primary key, etc.)
• Authorization (who can access it)
Data Manipulation Language (DML)
• Language for accessing and updating the data organized by the appropriate data model (also known as query language)
SQL query language
• Takes several tables as input (possibly only one) and always returns a single table
Database Applications
• Database applications are usually partitioned into two or three parts
• Application programs generally access databases through one of
Language extensions to allow embedded SQL
Application Program Interface (e.g., ODBC/JDBC) which allow SQL queries to be sent to a database
Lecture2 Introduction to SQL
history: ALPHA Codd’s querying language
SEQUEL: A Structured English Query Language
IBM Sequel language developed as part of System R project at the IBM San Jose Research Laboratory
It is now renamed as Structured Query Language (SQL)
Standardizaion of SQL
ANSI and ISO standard SQL:
• SQL-86
• SQL-89
• SQL-92
• SQL:1999 (language name became Y2K compliant!)
• SQL:2003
• Commercial systems offer most, if not all, SQL-92 features, plus varying feature sets from later standards and special proprietary features
• Not all examples here may work on your particular system.
其他类似的查询语言有:
- QUEL--Ingres (后来升级成了postgres的前身)
- QBE(Query by Example),IBM开发的。
Basic Syntax of SQL
select * from lab where time = '3-34';
数据库语言有两个重要的部分,contents (data) ,containers (tables) 所以就有了DDL 和DML
Query Language
Data Definition Language (DDL)
create
alter
drop
Data Manipulation Language (DML)
select
insert
delete
update
Data Types
Text data types
• char(length) -- fixed-length strings
• varchar(max length) -- non-fixed-length text
• varchar2(max length) -- Oracle’s transformation of varchar
• clob -- very long text (like GB-level text) 在MySQL中就是text
Numerical types
• int -- Integer (a finite subset of the integers that is machine-dependent)
• float(n) -- Floating point number, with user-specified precision of at least n digits
• real -- Floating point and double-precision floating point numbers, with machine-dependent precision
• numeric(p, d)
• Fixed point number, with user-specified precision of p digits, with d digits to the right of decimal point
• E.g., numeric(3,1), allows 44.5 to be stores exactly, but not 444.5 or 0.32 • In SQL Server, it is also called decimal
Date types
• date -- YYYY-MM-DD
• datetime -- YYYY-MM-DD HH:mm:SS
• timestamp -- YYYY-MM-DD HH:mm:SS
• But it is in the UNIX timestamp
• Value range: 1970-01-01 00:00:01 UTC - 2038-01-19 03:14:07 UTC
• More reading about the “Year 2038 Problem” of the timestamp data type: https://en.wikipedia.org/wiki/Year_2038_problem
Binary data types
• raw(max length)
• varbinary(max length)
• blob -- binary large object
• bytea -- used in PostgreSQL
DDL
CREATE TABLE
CREATE TABLE table_name(
column1 datatype,
column2 datatype,
column3 datatype,
column4 datatype
);
对大多数sql语言来说,table_name 是大小写不敏感的。keywords(create table)也是。
推荐使用下划线命名法。
双引号""代表着case-sensitive
Constraints 由 Chris Date 先提出的来保证数据的准确性
Constraints are declarative rules that the DBMS will check every time new data will be added, when data is changed, or even when data is deleted, in order to prevent any inconsistency.
DROP TABLE
DROP TABLE table_name;
ALTER TABLE
ALTER TABLE table_name ADD column_name datatype;
ALTER TABLE table_name DROP COLUMN column_name;
ALTER TABLE table_name ALTER COLUMN column_name TYPE datatype; -- 更改某列的类型。
ALTER TABLE table_name
ADD CONSTRAINT MyPrimaryKey PRIMARY KEY (column1, column2...);
ALTER TABLE table_name MODIFY column_name datatype NOT NULL;
ALTER TABLE table_name
ADD CONSTRAINT MyUniqueConstraint UNIQUE(column1, column2...);
ALTER TABLE table_name
ADD CONSTRAINT MyUniqueConstraint CHECK (CONDITION);
ALTER TABLE 主要用于添加,删除,或者修改现有表中的列,也可以用于在现有表格上添加和删除各种约束。
Constraints
Not NULL
CREATE TABLE table_name(
column1 datatype not null,
column2 datatype,
column3 datatype,
column4 datatype
);
Primary Key
CREATE TABLE table_name(
column1 datatype not null primary key,
column2 datatype,
column3 datatype,
column4 datatype,
);
primary key implies not null, 所以此处的 not null 是多余的
Unique
CREATE TABLE table_name(
column1 datatype not null primary key,
column2 datatype,
column3 datatype,
column4 datatype,
unique (column1, column2)
);
CREATE TABLE table_name(
column1 datatype not null primary key,
column2 datatype unique,
column3 datatype,
column4 datatype,
);
unique可以是一列也可以是多列
Check
CREATE TABLE table_name(
column1 datatype not null primary key,
column2 datatype,
column3 datatype,
column4 datatype,
unique (column1, column2),
check(column1 >= 0),
constraint validate_1 check(column2 >= 0)
);
trick: 在名字中存在大小写不好去重时,可以用upper(name)全部转化成大写。
可以给每一个check取名字。如果每取,postgreSQL会生成一个默认的名字。
Referential Integrity 参照完整性/应用完整性
check语句是静态的,为了应对动态的变化,所以引入了 foreign key
Foreign Key
CREATE TABLE table_name(
column1 datatype not null primary key,
column2 datatype,
column3 datatype,
column4 datatype,
unique (column1, column2),
check(column1 >= 0),
constraint validate_1 check(column2 >= 0),
foreign key(column1) references table_2 (column3)
);
在大数据应用时,外键的使用要格外注意。
外键与级联更新更适用于单机低并发,不适合分布式、高并发集群。
Comments 注释
/*Multi-line
comments */
-- Single line comments
// Some DBMS support this one.
DML
INSERT
insert into table1(column1, column3, column6) values('1',1,'ds-dfds')
escape charactor: 文本中含'单引号怎么办,外层用双引号。
Lecture 3 Retrieving Data from One Table
What are you installing when you install PostgreSQL?
• Server-side program: the database management system itself
• Client-side program: the client tools to manipulate the server via networks
Search for the keywords after the class:
• basics in computer networking (IP address, port, client, server, web browser, HTTP)
• client-server architecture, browser-server architecture
超文本传输协议(Hyper Text Transfer Protocol,HTTP)是一个简单的请求-响应协议,它通常运行在TCP之上。它指定了客户端可能发送给服务器什么样的消息以及得到什么样的响应。
SELECT
select * from movies where country = 'us'
select 在交互程序可以多用,但是在应用程序中不应使用。
字符串常量应该用单引号括起来。
用法:
用法1. A common way to test expressions
select '437'
用法2. Can give the column a name
select '437' as FOO
用法3. 用于生成一个n行一列的表格,每个元素值都为'A'
select 'A' from movies
用法4. 用于对列进行算术计算
select runtime from movies;
select runtime * 10 as runtime10 from movies;
Logical Connectives
and, or, not
not: < > 或者 !=
select * from movies
where ( country = 'us' or country = 'gb' ) and (year_released between 1940 and 1949);
in()
select * from movies
where country = 'us' or country = 'gb';
--equivalent to
select * from movies
where coutry in ('us', 'gb');
Negation
key: not
select * from movies
where coutry not in ('us', 'gb');
Like
a kind of regex (regular expression) 正则表达式
% meaning "any number of characters, including none”
_ meaning "one and only one character"
https://www.jianshu.com/p/6dfeb91d6b51
如果要查%和_,就用escape character
select * from movies where upper(title) not like '%A%';
select username from gg_user where username like '%xiao/_%' escape '/';
select username from gg_user where username like '%xiao/%%' escape '/';
Date
Date Formats
DD/MM/YYYY
MM/DD/YYYY
YYYY/MM/DD
select * from forum_posts where post_date >= '2018-03-12';
select * from forum_posts where post_date >= date('2018-03-12');
select * from forum_posts where post_date >= date('12 March, 2018');
select date_eq_timestamp(date('2018-03-12'), date('2018-02-12') + inteval '1 month'); --true
NULL
在其他很多语言中null是一个值
在sql中,null 不是一个值
select * from movies where runtime is null;
select * from movies where runtime = null; -- warning in DataGrip; not the same as 'is null'
Some Functions
Show DDL of a table
desc movies; -- Oracle, MySQL
describle table movies; --IBM DB2
\d movies -- PostgreSQL
.schema movies --SQLite
Concatenating Two Strings
Most products use || (two vertical bars) to indicate string concatenation
• SQL Server, though, uses +
• MySQL a special concat() function that also exists in some other products
select title
|| 'was released in '
|| year_released movie_release
from movies
where country = 'us';
相当于选择格式为title || 'was released in ' || year_released ,把他们连接所形成的string 放在一列叫做 movie_release
select title
|| 'was release in '
|| cast(year_released as varchar) movie_release
from movies
where country = 'us'
cast(year_released as varchar): 把 year_released 这个数据类型原本是integer的转换成string
When to use functions
Computing age
You should never store an age; it changes all the time!!!
select peopleid, surname,
date_part('year', now()) - born as age
from people
where died is null;
date_part 用于从一些时间标识符领域提取资源,like
- century
- decade
- year
- month
- day
- hour
- minute
- second
- microseconds
- milliseconds
- dow
- doy
- epoch
- isodow
- isoyear
- timezone
- timezone_hour
- timezone_minute
Numerical functions
round(3.141592,3) --3.142
trunc(3.141592,3) --3.141
round 四舍五入,trunc截断
String functions
upper('Citizen Kane')
lower('Citizen Kane')
substr('Citizen Kane',5,3) -- 'zen'
trim(' Oops ') -- 'Oops'
replace('Sheep', 'ee', 'i') --'Ship'
Type casting
select cast(born as char)||'abc' from people;
select cast(born as char(2)) ||'abc' from people;
Case When
select peopleid, surname,
case gender
when 'M' then 'male'
when 'F' then 'female'
end as gender_2
from people
where died is null;
select peopleid,surname,
case(date_part('year', now()) - born > 44)
when true then 'older than 44'
when false then 'younger than 44'
else '44 years old'
end as status
from people
where died is null;
Lecture4 More on Retrieving Data
retrieving data: 检索数据
it's not a relation because many rows cannot be distinguished
Retrieving Data from a Single Table
distinct
select distinct country from movies
where year_released < 2000;
select distince country, year_released from movies where year_released in (2000,2001);
计算先后:先算where,再去重
第二个语句中distinct是作用在(country,year_released)上的。
aggregate functions (count)
聚合函数
Aggregate function will aggregate all rows that share a feature (such as being movies from the same country) and return a characteristic of each group of aggregated rows.
select country, count(*) number_of_movies
from movies group by country;
select country, year_releases count(*) number_of_movies_per_year from movies group by country, year_released;
notice: group by 后面的列要和聚合函数里的列一一对应。
number_of_movies 是生成的列的名称。
注意:执行顺序是先aggregate 再select from,所以时间复杂度较高。
其他常见的aggregate function有 count(*)/count(col), min(col), max(col), stddev(col), avg(col), 例如:
select country,min(year_released) oldest_movie
from movies group by country;
select * from(
select country,
min(year_released) oldest_movie
from movies
group by country
) earliest_movies_per_country
where oldest_movie < 1940;
双重select,外层的select是在内层的结果集上进行查找。
However, 也可以不用双重select,用 having 来实现。
select country,
min(year_released) oldest_movie
from movies
group by country
having min(year_release d) < 1940
注意:aggregate rows 隐喻着排序。
在DMS(database management systems)中有个"query optimizer" 会搞一些事情。。。可能会使情况变好/变坏。
aggregate functions 会忽视 null
count(*) 会count所有row包括null,而count(col) 只count非null的row
select count(distinct colname) 只count colname 中有多少个distinct的
既是演员又是导演
select count(*) number_of_acting_directors
from (
select peopleid, count(*) as number_of_roles
from(
select distinct peopleid, credit_as
from credits
where credits_as in('A','D')
) all_actors_and_directors
group by peopleid
having count(*) == 2) acting_directors;
Lecture5 Retrieving Data from Multiple Tables
JOIN
select title, country_name, year_released
from movies
join countries
on country_code = country
where country_code <> 'us';
natural join
select * from people natural join credits;
-- The same as:
select *
from people join credits
on people.peopleid = credits.peopleid;
-- Or use "using"
select *
from people natural join credits using (peopleid);
-- A better practice : just write all of them in a unified way
select *
from people join credits
on people.peopleid = credits.peopleid;
if we don't specify the column.
如果要join的表中有两列名称相同,natural join 会自动找出来join,最终表格中只保留名字不变的一列。
Notice:
“If a column has the same name, then we should join on it”
• Bad idea!
• Same name != Same meaning
It is preferred not to depend on how database designers name their columns
Self Join
Join the same table together
• For example: How can we find all the pairs of people with the same first name?
select *
from people p1 join people p2 -- rename the tables, or you cannot refer to them respectively
on p1.first_name = p2.first_name -- p1 = first people table; p2 = second people table
where p1.peopleid <> p2.peopleid;
Chaining Joins Together
Example: Show names of actors and directors for Chinese movies
select m.title, c.credited_as, p.first_name, p.surname
from
movies m join credits c on movieid = c.movieid join people p on c.peopleid = p.peopleid
where m.country = 'cn';
上面的代码把 movies rename成 m
join was introduced in SQL-1999
join 默认是inner,不然就需要left join/right join/full join
Left outer join
select * from movies m left join credits c on m.movieid = c.movieid
where m.year_released = 2018;
select c.country_name, number_of_movies
from countries c left join(
select country as stat_country_code,
count(*) as number_of_movies
from movies
group by country
) stat
on c.country_code = stat_country_code;
在合并表格的同时replace nulls
select c.country_name,
case
when stat.number_of_movies is null then 0
else stat.number_of_movies
end
from countries c left join(
select country as stat_country_code,
count(*) as number_of_movies
from movies
group by country
) stat
on c.country_code = stat_country_code;
Set Operators
Unoin
Takes two result sets and combines them into a single result set
把两个五列的表格unoin在一起会得到一个五列的表格,把他们join在一起是十列表格。
Union requires two (commonsensical) conditions:
• They must return the same number of columns
• The data types of corresponding columns must match.
select movieid
from movies
where country = 'us'
union
select movieid
from movies
where country = 'gb'
Usage scenario: combine movies from two tables, one for standard accounts and one for VIP accounts
(select movieid
from movies limit 5 offset 0)
union
(select movieid
from movies limit 5 offset 3)
limit 表示读取5条数据,offset 表示跳过 3个数据,即从第4个数据开始读5个。limit也可单独使用。
union 会自动去重,如果不要自动去重,就用union all
intersect
Return the rows that appears in both tables
except
Return the rows that appear in the first table but not the second one
Sometimes written as minus in some database products
However, they are not used as much as union
• intersect -> inner join
• except -> left outer join with an “is null” condition
Subquery
Subquery after Select
Example: show titles, released years, and country names for non-US movies
-- solution 1 : join
select m.title, m.year_released, c.country_name
from movies m join countries c
on m.country = c.country_code
where m.country <> 'us';
-- solution 2 : Nested selection
select m.title,
m.year_released,
(
select c.country_name
from countries c
where c.country_code = m.country
) country_name
from movies m
where m.country <> 'us';
Subquery after Where
Example: Select all European movies
-- solution1:list them all
select country, year_released, title
from movies
where country in('fr','de',...)
-- proper solution: use subquery after where
select country, year_released, title
from movies
where country in(
select country_code
from countries
where continent = 'EUROPE'
);
Especially useful when the table in the subquery changes often
Some products (Oracle, DB2, PostgreSQL with some twisting) even allow comparing a set of column values (the correct word is "tuple") to the result of a subquery. e.g.
(col1, col2) in
(select col3, col4
from t
where ...)
Notice:
in() means an implicit distinct in the subquery
• in(‘cn’, ‘us’, ‘cn’, ‘us’, ‘us’) is equal to in(‘cn’, ‘us’)
null values in in()
• Be extremely cautious if you are using not in(…) with a null value in it
value not in(2, 3, null)
等价于 not (value=2 or value=3 or value=null)
等价于 value<>2 and value<>3 and value<>null
因为null不是一个数,所以对任意的value,value = null 和value <> null 都永远为 false。所以not in(2,3,null) 的返回值一直是false。
Lecture6 Update,Delete,Applications
CRUD/CURD(就是顺序不一样,都是指数据库中最基本的操作) crud是指在做计算处理时的增加(Create)、检索(Retrieve)、更新(Update)和删除(Delete)几个单词的首字母简写。crud主要被用在描述软件系统中数据库或者持久层的基本操作功能。
• In SQL: insert, select, update, delete
• In RESTful API: Post, Get, Put, Delete
Update
When you are doing any experiments with writing operations (update, delete), backup first
在操作之前,先用select看一下效果
select replace('von Neumann', 'von ', '') || '(von)';
然后再操作
update people
set surname = replace(surname, 'von ', '') || '(von)'
where surname like 'von %';
如果surname中不含von,就是在末尾加上(von)
如果没有where的限制,会对整张表进行更改。在一些IDEs中如datagrip会有warning
update people p
set num_movies = (
select count(*) from credits c where c.peopleid = p.peopleid
);
Delete
delete from countries
where country_code = 'us';
many database products provide a roll-back mechanism when deleting rows 意思是如果执行到某一行出错了,之前已经执行了的delete的操作也会被恢复。
Application Programs and User Interfaces
Applications split into
• front-end : user interface
• Forms
• Graphical user interfaces
• Many interfaces are Web-based
• middle layer
• backend
Application Architecture Evolution
Three distinct era’s of application architecture
• Mainframe (1960’s and 70’s)
• Personal computer era (1980’s)
• Web era (mid 1990’s onwards)
• Web and Smartphone era (2010 onwards)
Web Interface
Web browsers have become the de-facto(事实上) standard user interface to databases
The World Wide Web
Most Web documents are hypertext documents formatted via the HyperText Markup Language (HTML)
HTML documents contain
• text along with font specifications, and other formatting instructions
• hypertext links to other documents, which can be associated with regions of the text.
• forms, enabling users to enter data which can then be sent back to the Web server
Three-Layer Web Architecture
HTML and HTTP
• HTML provides formatting, hypertext link, and image display features
• including tables, stylesheets (to alter default formatting), etc.
• HTML also provides input features
• Select from a set of options
• Pop-up menus, radio buttons, check lists
• Enter values
• Text boxes • Filled in input sent back to the server, to be acted upon by an executable at the server
• HyperText Transfer Protocol (HTTP) used for communication with the Web server
JavaScript
Lecture 7 - 1:More about NULL; Ordering; Window Function
NULL
大部分和null一起的运算结果都为null
col+NULL -> NULL
(col > NULL) -> NULL
需要注意的logical operator(and,or)
TRUE and NULL -> NULL
FALSE and NULL -> FALSE
TRUE or NULL -> TRUE
FALSE or NULL -> NULL
col is NULL -> Ture or False
-- 注意
value not in (2,3,null) -- 这句表达式的值不可能为true
-- 可以理解为 value not in (2,3) and (value <> NULL)
-- (value <> NULL) 的取值只可能为NULL
Ordering in SQL
order ... by ... asc/desc
可以 order by 多列
单表可以 order by
多表 join 时,可以 order by join 前的一个表的一列。
还可以有自定义的顺序
select * from credits
order by
case credited_as
when 'D' then 1
when 'A' then 2
end desc
data types in ordering
NULL: 在不同的数据库中有不同的级别
Oracle/PostgreSQL:NULL greater than anything
SQL Server, MySQL, SQLite:NULL smaller than anything
limit offset
limit k offset p
Return the top-k rows in the result set and skip the first p rows
select * from movies
where country = 'us'
order by year_released
limit 10 offset 5
注意:limit offset 一定和要 order by 一起使用
Lecture8 - Window Function
Scalar Functions vs Aggregation Functions
Scalar Function: 使用某行的数据或某个数据
Aggregation Functions:使用多行的数据
Window Function
select title, country, year_released,
rank() over(partition by country order by year_released) movie_rank
from movies;
select country,title, year_released, min(year_released) over (partition by country order by year_released) movie_rank
from movies;
aggregate funcion 只能返回一个数据, window function会多加上一列
Lection10 Function
Procedural vs Declarative
Procedural language 过程式语言
Imperative: Describe the algorithms step-by-step (How to do)
• Procedural: C (and many other legacy languages) procedural 类似命令型的语言
• Object-oriented: Java
Declarative: Describe the result without specifying the detailed steps (What to do)
• (Pure) declarative: SQL, Regular Expressions, Markup (HTML, XML), CSS
• Functional: Scheme, Haskell, Scala, Erlang
• Logic programming: Prolog
Procedural Extension to SQL
Many DBMS products provide a procedural (私有的,每个DBMS会不一样)extension to the standard SQL
SQL Server Transact-SQL
ORACLE PL/SQL
PostgreSQL PL/PGSQL
MySQL (No specific name)
SQLite (Not supported) … well, sometimes SQLite is even not considered a DBMS
select case
when first_name is null then ''
else first_name || ''
end || case position('(' in surname)
when 0 then surname
else trim(')' from substr(surname, postition('(' in surname) + 1)) || '' || trim(substr(surname, 1, position('(' in surname) - 1))
end
from people
where surname not like '%(von)';
create function function_name(p_fname varchar, p_sname varchar)
returns varchar
as $$
begin
return case
when p_fname is null then ''
else p_fname || ''
end || case position('(' in surname)
when 0 then surname
else trim(')' from substr(surname, postition('(' in surname) + 1)) || '' || trim(substr(surname, 1, position('(' in surname) - 1))
end;
end;
$$ language plpgsql;
Language Type
PostgreSQL supports 4 procedural languages: PL/pgSQL, PL/Tcl, PL/Perl, and PL/Python (Tcl, Perl, Python 都是著名的scripting language)
Trim
built-in function 去掉一个字符串中的(第一个遇到的)指定字符。默认是去空格,直接用trim()的话,默认是使用BTRIM()
trim (LEADING FROM string)
LTRIM(string, [character])
trim (TRAILING FROM string)
RTRIM(string, [character])
trim (BOTH FROM string)
BTRIM(string, [character])
function 里也可以使用select等SQL语句
create function get_country_name(p_code varchar)
returns countries.country_name%type
-- %type 求这一列对应的类型
as $$
declare
v_name countries.country_name%type;
begin
select country_name
into v_name
from countries
where country_code = p.code;
return v_name;
end;
$$ language plpgsql;
drop function get_country_name(p_code varchar)
--调用函数
select get_country_name("dfs")
perform get_country_name("dfs")
Functions vs. Procedures
“Function” comes from mathematics (map a value to another,Thus, functions always have a return value )
”Procedure” comes from programming (describe a set of instructions that will be executed in order,does NOT (necessarily) have a return value)
在一些语言中他们是等价的
Functions and Procedures in (Postgre)SQL
- Function: return a value
- Procedure: return NO value
但是在postgreSQL中Procedure通过void function实现
用procedure的好处:
-
Network overhead
减少和服务器交互的时间消耗
-
Security
Access to the Oracle data can be restricted by allowing users to manipulate the data only through stored procedures that execute with their definer’s privileges
How can we pack them into a single execution unit?
• Minimize the communication between the client program we are using and the database server
• Client program = DataGrip, psql
insert into select
insert into table2
select * form table1
where condition;
在使用insert into 之后,可以判断一下是不是正确插入了。
get diagnostics n_rowcount = row_count;
if n_rowcount != n_people
then
raise exception 'Some people couldn't be found';
end if;
-- 写在函数里
Lecture12 Trigger
Trigger
Trigger 触发器,在对表格的某种修改后会自动触发
作用:
- Validate data
- Checking complex rules
- Manage data redundancy
On-the-fly modification(计算机:运行中):Change the input directly when the input arrives
比如:插入同一部电影的英文信息,中文信息等。
- PostgreSQL 触发器可以在下面几种情况下触发:在执行操作之前BEFROE(在检查约束并尝试INSERT、UPDATE或DELETE之前)。在执行操作之后AFTER(在检查约束并INSERT、UPDATE或DELETE之后)。更新操作(在对一个视图进行插入、更新、删除时)。
- 每个DBMS对允许触发trigger的时间会不一样。
- 触发器的 FOR EACH ROW 属性是可选的,如果选中,当操作修改时每行调用一次;相反,选中 FOR EACH STATEMENT,不管修改了多少行,每个语句标记的触发器执行一次。
- WHEN 子句和触发器操作在引用 NEW.column-name 和 OLD.column-name 表单插入、删除或更新时可以访问每一行元素。其中 column-name 是与触发器关联的表中的列的名称。
- 如果存在 WHEN 子句,PostgreSQL 语句只会执行 WHEN 子句成立的那一行,如果没有 WHEN 子句,PostgreSQL 语句会在每一行执行。
- BEFORE 或 AFTER 关键字决定何时执行触发器动作,决定是在关联行的插入、修改或删除之前或者之后执行触发器动作。
- 要修改的表必须存在于同一数据库中,作为触发器被附加的表或视图,且必须只使用 tablename,而不是 database.tablename。
- 当创建约束触发器时会指定约束选项。这与常规触发器相同,只是可以使用这种约束来调整触发器触发的时间。当约束触发器实现的约束被违反时,它将抛出异常。
Fire a trigger
create trigger test_trigger
before update
-- before/after/instead of + insert/update/delete
on people_1
for each row
-- for each row/statement
execute procedure fill_in_num_movies();
create or replace function fill_in_num_movies()
returns trigger
as
$$
begin
select count(distinct c.movieid)
into new.num_movies
-- "new" 和 "old" 是两个内部变量,that represent the row before and after the changes
from credits c
where c.peopleid = new.peopleid;
return new;
end;
$$ language plpgsql;
update people_l set num_movies = 0 where people_l.peopleid <= 100;
执行顺序:
- create the procedure fill_in_num_movies()
- then, create the trigger
- finally, we can run some test updates statements
常见使用场景:
Modify input on the fly
- before insert / update
- for each row
Check complex rules
- before insert / update / delete
- for each row
Manage data redundancy
- after insert / update / delete
- for each row
Lecture13: View, Access Control
View
View(视图)实际上是一个以预定义的 PostgreSQL 查询形式存在的表的组合。
View(视图)可以包含一个表的所有行或从一个或多个表选定行。
View(视图)可以从一个或多个表创建,这取决于要创建视图的 PostgreSQL 查询。
View(视图)是一种虚拟表,允许用户实现以下几点:
- 用户或用户组认为更自然或直观查找结构数据的方式。
- 限制数据访问,用户只能看到有限的数据,而不是完整的表。
- 汇总各种表中的数据,用于生成报告。
PostgreSQL 视图是只读的,因此可能无法在视图上执行 DELETE、INSERT 或 UPDATE 语句。但是可以在视图上创建一个触发器,当尝试 DELETE、INSERT 或 UPDATE 视图时触发,需要做的动作在触发器内容中定义。i
create view my_view(col1, col2, col3) as
select m.movieid,m.title,m.year_released, c.country_name
from movies m join countries c
on c.country_code = m.country;
--这样会把前三列重命名成col1,col2,col3,第四列会使用它原来的名字。
drop view my_view;
View(视图)实际上是一个以预定义的 PostgreSQL 查询形式存在的表的组合。
View(视图)可以包含一个表的所有行或从一个或多个表选定行。
View(视图)可以从一个或多个表创建,这取决于要创建视图的 PostgreSQL 查询。
View(视图)是一种虚拟表,允许用户实现以下几点:
- 用户或用户组认为更自然或直观查找结构数据的方式。
- 限制数据访问,用户只能看到有限的数据,而不是完整的表。
- 汇总各种表中的数据,用于生成报告。
PostgreSQL 视图是只读的,因此可能无法在视图上执行 DELETE、INSERT 或 UPDATE 语句。但是可以在视图上创建一个触发器,当尝试 DELETE、INSERT 或 UPDATE 视图时触发,需要做的动作在触发器内容中定义。
使用场景:
- Simplify Complex Queries
- An alternative way to implement E-R models
view可以作为另一种实现ER图的方式,比如 access control, dirty and messy original data.
Access Control
Operations (select, update, insert, delete, etc.)
Objects (table, database, views, trigger, etc.)
要向用户分配权限,可以使用 GRANT 命令。
grant select on movies to test_user;
GRANT privilege [,...]
on object [,...]
TO {PUBLIC | Group group | username}
- privilege − 值可以为:SELECT,INSERT,UPDATE,DELETE, RULE,ALL。
- object − 要授予访问权限的对象名称。可能的对象有: table, view,sequence。
- PUBLIC − 表示所有用户。
- GROUP group − 为用户组授予权限。
- username − 要授予权限的用户名。PUBLIC 是代表所有用户的简短形式。
可以使用 REVOKE 命令取消权限。
revoke select on movies from test_user;
REVOKE privilege [,...]
ON object [,...]
FROM {PUBLIC | Group groupname | username}
和view的联动:
create view user_view as
select movieid,title
from movies
where user_name = user;
--user 是一个内置的变量,返回现有的所有user_name
Lecture14:Indexing
Concept:An index is a data structure which improves the efficiency of retrieving data with specific values from a database
In PostgreSQL, indexes are built automatically on columns with primary key or unique constraints
create index index_name
on table_name (column_name [, ...])
create index salary_index on company(salary);
Index Taxonomy
索引的分类
-
能否完全和数据分开
能:external index(postgreSQL)
不能:integrated index
-
是否说明了数据的存储顺序,是不是排了序
是:clustered index
否:non-clustered index
-
Does every search key in the data file correspond to an index entry?
Yes:Dense Index
No: Sparse Index
-
Does the search key contain more than one attribute?
Yes:Multi-key/Multi-column index
No:Single-key/Single-column index
Index implementation
索引的实现
主要是两种数据结构:
B-tree 和B+-tree 中间那个是连字符,不是B-
Hash table
B tree
虽然索引的目的在于提高数据库的性能,但这里有几个情况需要避免使用索引。
使用索引时,需要考虑下列准则:
- 索引不应该使用在较小的表上。
- 索引不应该使用在有频繁的大批量的更新或插入操作的表上。
- 索引不应该使用在含有大量的 NULL 值的列上。
- 索引不应该使用在频繁操作的列上。