SQL COOKBOOK (Ch.1-10)

by Anthony Molinaro

 

---Preface---

 

平台与版本

  • DB2 v.8
  • Oracle Database 10g (with the exception of a handful of recipes, the solutions will work for Oracle8i Database and Oracle9i Database as well)
  • PostgreSQL 8
  • SQL Server 2005
  • MySQL 5

 

Supplemental files and examples for this book can be found at 

http://examples.oreilly.com/9780596009762/

 

Tables Used in This Book

select * from emp;

select * from dept;

 

---Chapter 1 Retrieving Records---

Retrieving Records, introduces very simple queries. Examples include how to use a WHERE clause to restrict rows from your result set, providing aliases for columns in your result set, using an inline view to reference aliased columns, using simple conditional logic, limiting the number of rows returned by a query, returning random records, and finding NULL values. Most of the examples are very simple, but some of them appear in more complex recipes, so it’s a good idea to read this chapter if you’re relatively new to SQL or aren’t familiar with any of the examples listed for this chapter. 

 

1.6. Referencing an Aliased Column in the WHERE Clause

Problem

You have used aliases to provide more meaningful column names for your result set and would like to exclude some of the rows using the WHERE clause. However, your attempt to reference alias names in the WHERE clause fails:

1 select sal as salary, comm as commission
2     from emp
3 where salary < 5000

 

Solution

By wrapping your query as an inline view you can reference the aliased columns:

1 select * 
2 from (
3 select sal as salary, comm as commission
4 from emp) x
5 where salary < 5000

Why do you need to do this? The WHERE clause is evaluated before the SELECT, thus, SALARY and COMMISSION do not yet exist when the “Problem” query’s WHERE clause is evaluated. Those aliases are not applied until after the WHERE clause processing is complete. However, the FROM clause is evaluated before the WHERE. By placing the original query in a FROM clause, the results from that query are generated before the outermost WHERE clause, and your outermost WHERE clause "sees"  the alias names. This technique is particularly useful when the columns in a table are not named particularly well.

 

1.7. Concatenating Column Values

Problem

You want to return values in multiple columns as one column. For example, you would like to produce this result set from a query against the EMP table:

CLARK WORKS AS A MANAGER
KING WORKS AS A PRESIDENT
MILLER WORKS AS A CLERK

 

DB2, Oracle, PostgreSQL

These databases use the double vertical bar as the concatenation operator:

1 select ename ||' WORKS AS A '||job as msg
2 from emp
3 where deptno=10

 

MySQL

This database supports a function called CONCAT:

1 select concat(ename, ' WORKS AS A ',job) as msg
2 from emp
3 where deptno=10

 

SQL Server

Use the "+” operator for concatenation:

1 select ename + ' WORKS AS A ' + job as msg
2 from emp
3 where deptno=10

 

 

1.8. Using Conditional Logic in a SELECT Statement

Problem

You want to perform IF-ELSE operations on values in your SELECT statement. For example, you would like to produce a result set such that, if an employee is paid \$2000 or less, a message of “UNDERPAID” is returned, if an employee is paid \$4000 or more, a message of “OVERPAID” is returned, if they make somewhere in between, then “OK” is returned. The result set should look like this

 

Solution

Use the CASE expression to perform conditional logic directly in your SELECT statement:

1 select ename, sal,
2 case when sal <= 2000 then 'UNDERPAID'
3 when sal >= 4000 then 'OVERPAID'
4 else 'OK'
5 end as status
6 from emp

 

1.12. Transforming Nulls into Real Values

Problem

You have rows that contain nulls and would like to return non-null values in place of those nulls.

Solution

Use the function COALESCE to substitute real values for nulls:

1 select coalesce(comm,0)
2 from emp

 

 

1.13. Searching for Patterns

Problem

Of the employees in departments 10 and 20, you want to return only those that have either an “I” somewhere in their name or a job title ending with “ER”: 

  

 

 

Solution

Use the LIKE operator in conjunction with the SQL wildcard operator (”%”):

select ename, job
from emp
where deptno in (10,20)
and (ename like '%I%' or job like '%ER')

 

 

 

 

---Chapter 2 Sorting Query Results---

Sorting Query Results, introduces recipes for sorting query results. The ORDER BY clause is introduced and is used to sort query results. Examples increase in complexity ranging from simple, single-column ordering, to ordering by substrings, to ordering based on conditional expressions.

 

2.3. Sorting by Substrings

Problem

You want to sort the results of a query by specific parts of a string. For example, you want to return employee names and jobs from table EMP and sort by the last two characters in the job field.

 

Solution

DB2, MySQL, Oracle, and PostgreSQL

Use the SUBSTR function in the ORDER BY clause:

1 select ename,job
2 from emp
3 order by substr(job,length(job)-2)

 

SQL Server

Use the SUBSTRING function in the ORDER BY clause:

1 select ename,job
2 from emp
3 order by substring(job,len(job)-2, 2)

 

 

2.4. Sorting Mixed Alphanumeric Data

Problem

You have mixed alphanumeric data and want to sort by either the numeric or character portion of the data. Consider this view:

 

Solution

Oracle and PostgreSQL

Use the functions REPLACE and TRANSLATE to modify the string for sorting:

1 /* ORDER BY DEPTNO */
2 select data
3 from V
4 order by replace(data, replace(translate(data,'0123456789','##########'),'#',''),'')

 

---Chapter 3 Working with Multiple Tables---

Working with Multiple Tables. Examples in this chapter include performing both inner and outer joins, identifying Cartesian productions, basic set operations (set difference, union, intersection), and the effects of joins on aggregate functions.

 

 

3.1. Stacking One Rowset atop Another

Problem

You want to return data stored in more than one table, conceptually stacking one result set atop the other. The tables do not necessarily have a common key, but their columns do have the same data types. For example, you want the result set to look like the following:

Solution

Use the set operation UNION ALL to combine rows from multiple tables:

1 select ename as ename_and_dname, deptno
2 from emp
3 where deptno = 10
4 union all
5 select '----------', null
6 from t1
7 union all
8 select dname, deptno
9 from dept

 

 

3.4. Retrieving Values from One Table That Do Not Exist in Another

Problem

You wish to find those values in one table, call it the source table, that do not also exist in some target table. For example, you want to find which departments (if any) in table DEPT do not exist in table EMP. In the example data, DEPTNO 40 from table DEPT does not exist in table EMP, so the result set should be the following:

DEPTNO

----------

40

 

Solution

Having functions that perform set difference is particularly useful for this problem. DB2, PostgreSQL, and Oracle support set difference operations. If your DBMS does not support a set difference function, use a subquery as shown for MySQL and SQL Server.

 

DB2 and PostgreSQL

Use the set operation EXCEPT:

select deptno from dept
except
select deptno from emp

 

Oracle

Use the set operation MINUS:

1 Use the set operation MINUS:
2 select deptno from dept
3 minus
4 select deptno from emp

 

MySQL and SQL Server

Use a subquery to return all DEPTNOs from table EMP into an outer query that searches table DEPT for rows that are not amongst the rows returned from the subquery:

1 select deptno
2 from dept
3 where deptno not in (select deptno from emp)

 

 

3.6. Adding Joins to a Query Without Interfering with Other Joins

Problem

You have a query that returns the results you want. You need additional information, but when trying to get it, you lose data from the original result set. For example, you want to return all employees, the location of the department in which they work, and the date they received a bonus. For this problem, the EMP_BONUS table contains the following data:

Your desired result set is the following:

  

Solution

You can use an outer join to obtain the additional information without losing the data from the original query. First join table EMP to table DEPT to get all employees and the location of the department they work, then outer join to table EMP_ BONUS to return the date of the bonus if there is one. Following is the DB2, MySQL, PostgreSQL, and SQL Server syntax:

1 select e.ename, d.loc, eb.received
2 from emp e join dept d
3 on (e.deptno=d.deptno)
4 left join emp_bonus eb
5 on (e.empno=eb.empno)
6 order by 2

 

 

You can also use a scalar subquery (a subquery placed in the SELECT list) to mimic an outer join:

select e.ename, d.loc,
(select eb.received from emp_bonus eb
where eb.empno=e.empno) as received
from emp e, dept d
where e.deptno=d.deptno
6 order by 2

 

 

---Chapter 4 Inserting, Updating, Deleting---

Inserting, Updating, Deleting, introduces recipes for inserting, updating, and deleting data, respectively. Most of the examples are very straightforward (perhaps even pedestrian). Nevertheless, operations such as inserting rows into one table from another table, the use of correlated subqueries in updates, an understanding of the effects of NULLs, and knowledge of new features such as multi-table inserts and the MERGE command are extremely useful for your toolbox. 

 

4.1. Inserting a New Record

Problem

You want to insert a new record into a table. For example, you want to insert a new record into the DEPT table. The value for DEPTNO should be 50, DNAME should be “PROGRAMMING”, and LOC should be “BALTIMORE”.

Solution

Use the INSERT statement with the VALUES clause to insert one row at a time:

1 insert into dept (deptno,dname,loc)
2 values (50,'PROGRAMMING','BALTIMORE')

 

 

For DB2 and MySQL you have the option of inserting one row at a time or multiple rows at a time by including multiple VALUES lists:

1 /* multi row insert */
2 insert into dept (deptno,dname,loc)
3 values (1,'A','B'), (2,'B','C')

As a shortcut, you can omit the column list in an INSERT statement:

1 insert into dept
2 values (50,'PROGRAMMING','BALTIMORE')

 

 

4.7. Blocking Inserts to Certain Columns

Problem

You wish to prevent users, or an errant software application, from inserting values into certain table columns. For example, you wish to allow a program to insert into EMP, but only into the EMPNO, ENAME, and JOB columns.

Solution

Create a view on the table exposing only those columns you wish to expose. Then force all inserts to go through that view. For example, to create a view exposing the three columns in EMP:

1 create view new_emps as
2 select empno, ename, job
3 from emp

Grant access to this view to those users and programs allowed to populate only the three fields in the view. Do not grant those users insert access to the EMP table. Users may then create new EMP records by inserting into the NEW_EMPS view, but they will not be able to provide values for columns other than the three that are specified in the view definition.

 

4.8. Modifying Records in a Table

Problem

You want to modify values for some or all rows in a table. For example, you might want to increase the salaries of everyone in department 20 by 10%.

Solution

Use the UPDATE statement to modify existing rows in a database table. For example:

1 update emp
2 set sal = sal*1.10
3 where deptno = 20

 

4.16. Deleting Duplicate Records

Problem

You want to delete duplicate records from a table. Consider the following table dupes:

Solution

Use a subquery with an aggregate function such as MIN to arbitrarily choose the ID to retain (in this case only the NAME with the smallest value for ID is not deleted):

delete from dupes
where id not in ( select min(id)
from dupes
group by name )

 

---Chapter 5 Metadata Queries---

Metadata Queries, introduces recipes for getting at your database metadata. It’s often very useful to find the indexes, constraints, and tables in your schema. The simple recipes here allow you to gain information about your schema. Additionally, “dynamic” SQL examples are shown here as well, i.e., SQL generated by SQL. 

5.1. Listing Tables in a Schema

Problem

You want to see a list all the tables you’ve created in a given schema.

Solution

The solutions that follow all assume you are working with the SMEAGOL schema. The basic approach to a solution is the same for all RDBMSs: you query a system table (or view) containing a row for each table in the database.

DB2

Query SYSCAT.TABLES:

1 select tabname
2 from syscat.tables
3 where tabschema = 'SMEAGOL'

 

Oracle

Query SYS.ALL_TABLES:

select table_name
from all_tables
where owner = 'SMEAGOL'

 

 

PostgreSQL, MySQL, and SQL Server

Query INFORMATION_SCHEMA.TABLES:

1 select table_name
2 from information_schema.tables
3 where table_schema = 'SMEAGOL'

 

 

5.6. Using SQL to Generate SQL

Problem

You want to create dynamic SQL statements, perhaps to automate maintenance tasks. You want to accomplish three tasks in particular: count the number of rows in your tables, disable foreign key constraints defined on your tables, and generate insert scripts from the data in your tables.

Solution

The concept is to use strings to build SQL statements, and the values that need to be filled in (such as the object name the command acts upon) will be supplied by data from the tables you are selecting from. Keep in mind, the queries only generate the statements; you must then run these statements via script, manually, or however you execute your SQL statements. The examples below are queries that would work on an Oracle system. For other RDBMSs the technique is exactly the same, the only difference being things like the names of the data dictionary tables and date formatting. 

1 /* generate SQL to count all the rows in all your tables */
2 select 'select count(*) from '||table_name||';' cnts
3 from user_tables;

1 /* disable foreign keys from all tables */
2 select 'alter table '||table_name|| ' disable constraint '||constraint_name||';' cons
3 from user_constraints
4 where constraint_type = 'R';

 

 

1 /* generate an insert script from some columns in table EMP */ 
2 select 'insert into emp(empno,ename,hiredate) '||chr(10)|| 'values( '||empno||','||''''||ename ||''',to_date('||''''||hiredate||''') );' inserts
3 from emp
4 where deptno = 10;

 

 

 

---Chapter 6 Working with Strings---

Working with Strings, introduces recipes for manipulating strings. SQL is not known for its string parsing capabilities, but with a little creativity (usually involving Cartesian products) along with the vast array of vendorspecific functions, you can accomplish quite a bit. This chapter is where the book begins to get interesting. Some of the more interesting examples include counting the occurrences of a character in a string, creating delimited lists from table rows, converting delimited lists and strings into rows, and separating numeric and character data from a string of alphanumeric characters. 

6.1. Walking a String

Problem

You want to traverse a string to return each character as a row, but SQL lacks a loop operation. For example, you want to display the ENAME “KING” from table EMP as four rows, where each row contains just characters from “KING”.

 

Solution

Use a Cartesian product to generate the number of rows needed to return each character of a string on its own line. Then use your DBMS’s built-in string parsing function to extract the characters you are interested in (SQL Server users will use SUBSTRING instead of SUBSTR and DATALENGTH instead of LENGTH):

1 select substr(e.ename,iter.pos,1) as C
2 from (select ename from emp where ename = 'KING') e,
3 (select id as pos from t10) iter
4 where iter.pos <= length(e.ename)

 

Discussion

The key to iterating through a string’s characters is to join against a table that has enough rows to produce the required number of iterations. This example uses table T10, which contains 10 rows (it has one column, ID, holding the values 1 through 10). The maximum number of rows that can be returned from this query is 10.

The following example shows the Cartesian product between E and ITER (i.e., between the specific name and the 10 rows from T10) without parsing ENAME: 

1 select ename, iter.pos
2 from (select ename from emp where ename = 'KING') e,
3 (select id as pos from t10) iter

 

6.3. Counting the Occurrences of a Character in a String

Problem

You want to count the number of times a character or substring occurs within a given string. Consider the following string:

10,CLARK,MANAGER

You want to determine how many commas are in the string.

Solution

Subtract the length of the string without the commas from the original length of the string to determine the number of commas in the string. Each DBMS provides functions for obtaining the length of a string and removing characters from a string. In most cases, these functions are LENGTH and REPLACE, respectively (SQL Server users will use the built-in function LEN rather than LENGTH):

1 select (length('10,CLARK,MANAGER')-length(replace('10,CLARK,MANAGER',',','')))/length(',')
2 as cnt
3 from t1

 

 

6.10. Creating a Delimited List from Table Rows

Problem

You want to return table rows as values in a delimited list, perhaps delimited by commas, rather than in vertical columns as they normally appear. You want to convert a result set from this:

Solution

Each DBMS requires a different approach to this problem. The key is to take advantage of the built-in functions provided by your DBMS. Understanding what is available to you will allow you to exploit your DBMS’s functionality and come up with creative solutions for a problem that is typically not solved in SQL.

MySQL

Use the built-in function GROUP_CONCAT to build the delimited list:

1 select deptno,
2 group_concat(ename order by empno separator, ',') as emps
3 from emp
4 group by deptno

 

 

SQL Server

Use recursive WITH to build the delimited list:

 1 with x (deptno, cnt, list, empno, len)
 2 as (
 3 select deptno, count(*) over (partition by deptno), cast(ename as varchar(100)),empno,1
 4 from emp
 5 union all
 6 select x.deptno, x.cnt, cast(x.list + ',' + e.ename as varchar(100)), e.empno, x.len+1
 7 from emp e, x
 8 where e.deptno = x.deptno
 9 and e.empno > x. empno
10 )
11 select deptno,list
12 from x
13 where len = cnt
14 order by 1

 

 

 

---Chapter 7 Working with Numbers---

Working with Numbers, introduces recipes for common number crunching. The recipes found here are extremely common and you’ll learn how easily window functions solve problems involving moving calculations and aggregations. Examples include creating running totals; finding mean, median, and mode; calculating percentiles; and accounting for NULL while performing aggregations.

 

7.9. Calculating a Mode

Problem

You want to find the mode (for those of you who don’t recall, the mode in mathematics is the element that appears most frequently for a given set of data) of the values in a column. For example, you wish to find mode of the salaries in DEPTNO 20. Based on the following salaries:

SAL

----------

800

1100

2975

3000

3000

 

the mode is 3000.

 

Solution

DB2 and SQL Server

Use the window function DENSE_RANK to rank the counts of the salaries to facilitate extracting the mode:

 1 select sal
 2 from (
 3 select sal, dense_rank()over( order by cnt desc) as rnk
 4 from (
 5 select sal, count(*) as cnt
 6 from emp
 7 where deptno = 20
 8 group by sal
 9 ) x
10 ) y
11 where rnk = 1

 

 The results are ranked based on the number of times each SAL occurs as is seen below:

 

MySQL and PostgreSQL

Use a subquery to find the mode:

1 select sal
2 from emp
3 where deptno = 20
4 group by sal
5 having count(*) >= all ( select count(*)
6 from emp
7 where deptno = 20
8 group by sal )

 

 

7.10. Calculating a Median

Problem

SAL

----------

800

1100

2975

3000

3000

 

the median is 2975.

 

Solution

Other than the Oracle solution (which uses supplied functions to compute a median), all of the solutions are based on the method described by Rozenshtein, Abramovich, and Birger in Optimizing Transact-SQL: Advanced Programming Techniques (SQL Forum Press, 1997). The introduction of window functions allows for a more efficient solution compared to the traditional self join.

 

MySQL and PostgreSQL

Use a self join to find the median:

1 select avg(sal)
2 from (
3 select e.sal
4 from emp e, emp d
5 where e.deptno = d.deptno and e.deptno = 20
6 group by e.sal
7 having sum(case when e.sal = d.sal then 1 else 0 end) >= abs(sum(sign(e.sal - d.sal)))
8 )

The median is computed by first self joining table EMP, which returns a Cartesian product for all the salaries (but the GROUP BY on E.SAL will prevent duplicates from being returned). The HAVING clause uses the function SUM to count the number of times E.SAL equals D.SAL; if this count is greater than or equal to the number of times E.SAL is greater than D.SAL then that row is the median.

 

SQL Server

Use the window functions COUNT(*) OVER and ROW_NUMBER to find the median:

 1 select avg(sal)
 2 from (
 3 select sal,
 4 count(*)over() total,
 5 cast(count(*)over() as decimal)/2 mid,
 6 ceiling(cast(count(*)over() as decimal)/2) next,
 7 row_number()over( order by sal) rn
 8 from emp
 9 where deptno = 20
10 ) x
11 where ( total%2 = 0 and rn in ( mid, mid+1 ) )
12 or ( total%2 = 1 and rn = next)

 

7.13. Computing Averages Without High and Low Values

Problem

You want to compute an average, but you wish to exclude the highest and lowest values in order to (hopefully) reduce the effect of skew. For example, you want to compute the average salary of all employees excluding the highest and lowest salaries.

 

Solution

MySQL and PostgreSQL

Use subqueries to exclude high and low values:

 

1  select avg(sal)
2  from emp
3  where sal not in (
4  (select min(sal) from emp),
5  (select max(sal) from emp)
6  )

 

 

DB2, Oracle, and SQL Server

Use an inline view with the windowing functions MAX OVER and MIN OVER to generate a result set from which you can easily eliminate the high and low values:

1 select avg(sal)
2  from (
3  select sal, min(sal)over() min_sal, max(sal)over() max_sal
4  from emp
5  ) x
6  where sal not in (min_sal,max_sal)

 

 

Chapter 8, Date Arithmetic, is the first of two chapters dealing with dates. Being able to perform simple date arithmetic is crucial to everyday tasks. Examples include determining the number of business days between two dates, calculating the difference between two dates in different units of time (day, month, year, etc.), and counting occurrences of days in a month.

Chapter 9, Date Manipulation, is the second of the two chapters dealing with dates. In this chapter you will find recipes for some of the most common date operations you will encounter in a typical work day. Examples include returning all days in a year, finding leap years, finding first and last days of a month, creating a calendar, and filling in missing dates for a range of dates. 

Chapter 10, Working with Ranges, introduces recipes for identifying values in ranges, and for creating ranges of values. Examples include automatically generating a sequence of rows, filling in missing numeric values for a range of values, locating the beginning and end of a range of values, and locating consecutive values. 

 

I skipped the above three chapters about date manipulation.

See the second part of the notes in <SQL COOKBOOK Ch.11-Appendix B>

posted @ 2017-05-15 17:23  sheepshaker  阅读(569)  评论(0编辑  收藏  举报