SQLZOO记录 1
0. SELECT basics
1. SELECT name
2. SELECT from world
3. SELECT from nobel
4. SELECT within SELECT
做这组训练之前可以先做Using nested SELECT和Nested SELECT Quiz了解进阶SELECT。
- List the name and continent of countries in the continents containing either Argentina or Australia. Order by name of the country.
错误
SELECT name, continent
FORM world
WHERE continent in
(SELECT continent
FROM world
WHERE name in ('Argentina', 'Australia'))
ORDER BY name
参考答案
SELECT name, continent
FROM world
WHERE continent in
(
(SELECT continent
FROM world
WHERE name = 'Argentina'),
(SELECT continent
FROM world
WHERE name = 'Australia')
)
ORDER BY name
- Which country has a population that is more than United Kingom but less than Germany? Show the name and the population.
SELECT name, population
FROM world
WHERE population >
(SELECT population
FROM world
WHERE name = 'United Kingdom')
AND population <
(SELECT population
FROM world
WHERE name = 'Germany')
BETWEEN...AND...是前后都包,所以不能用。
- Germany (population 80 million) has the largest population of the countries in Europe. Austria (population 8.5 million) has 11% of the population of Germany.
Show the name and the population of each country in Europe. Show the population as a percentage of the population of Germany.
提示:
Decimal places:You can use the function ROUND to remove the decimal places.
Percent symbol %:You can use the function CONCAT to add the percentage symbol.
ROUND 用法为 ROUND(列名,小数点),如果写成 ROUND(列名)在实践中也可行,小数点默认为 0。
错误
SELECT name, CONCAT(ROUND(100 * population/(SELECT population
FROM world
WHERE name = 'Germany')), '%') as percentage
FROM world
WHERE continent = 'Europe'
有结果但英文版里没显示correct,中文版是正确答案。
- Which countries have a GDP greater than every country in Europe? [Give the name only.] (Some countries may have NULL gdp values)
SELECT name
FROM world
WHERE GDP > ALL(SELECT GDP
FROM world
WHERE continent = 'Europe'
AND GDP > 0)
提示:
We can use the word ALL to allow >= or > or < or <=to act over a list. For example, you can find the largest country in the world, by population with this query:
SELECT name
FROM world
WHERE population >= ALL(SELECT population
FROM world
WHERE population>0)
You need the condition population>0 in the sub-query as some countries have null for population.
- Find the continents where all countries have a population <= 25000000. Then find the names of the countries associated with these continents. Show name, continent and population.
SELECT name, continent, population
FROM world x
WHERE 25000000 >= ALL
(SELECT population
FROM world y
WHERE y.continent = x.continent)
需要变换比较的思维
- Some countries have populations more than three times that of all of their neighbours (in the same continent). Give the countries and continents.
SELECT name, continent
FROM world x
WHERE population/3 > ALL
(SELECT population
FROM world y
WHERE y.continent = x.continent
AND y.name <> x.name) /* 别忘了把所选国家自己排除掉!不然就没有结果了 */
4..MORE ABOUT SUBQUERY
The nobel table can be used to practice more subquery
nobel(yr, subject, winner)
- 紅十字國際委員會 (International Committee of the Red Cross) 曾多次獲得和平獎。 試找出與紅十字國際委員會同年得獎的文學獎(Literature)得獎者和年份。
SELECT winner, yr
FROM nobel
WHERE yr IN (SELECT yr
FROM nobel
WHERE winner = 'International Committee of the Red Cross')
AND subject = 'Literature'
/*
1. 获奖不止一次,所以年份不唯一,这里要用 IN 而不是 =,注意两者区别
2. 清楚子查询的反括号 ) 在哪里,不是所有条件都包含在子查询中
3. 注意有多个查询条件
4. 引号打全
*/
- 日本物理學家益川敏英 (Toshihide Maskawa) 曾獲得物理獎。同年還有兩位日本人一同獲得物理獎。試列出這2位日本人的名稱。
SELECT winner
FROM nobel
WHERE yr = (SELECT yr
FROM nobel
WHERE winner = 'Toshihide Maskawa')
AND subject = 'Physics'
AND winner != 'Toshihide Maskawa'
/* 记得要排除本人 */
3.首次頒發的經濟獎 (Economics)的得獎者是誰?
SELECT winner
FROM nobel
WHERE subject = 'Economics'
AND yr <= ALL(SELECT yr
FROM nobel
WHERE subject = 'Economics')
用高级的复制表方法
SELECT winner
FROM nobel x
WHERE subject = 'Economics'
AND yr <= ALL(SELECT yr
FROM nobel y
WHERE y.subject = x.subject)
- 哪幾年頒發了物理獎,但沒有頒發化學獎?
SELECT DISTINCT yr
FROM nobel
WHERE subject = 'Physics'
AND yr NOT IN (SELECT yr
FROM nobel
WHERE subject = 'Chemistry')
/*别忘了年份去重*/
- 哪幾年的得獎者人數多於12人呢? 列出得獎人數多於12人的年份,獎項和得獎者。
提示:需要用到 GROUP BY and HAVING
By including a GROUP BY clause functions such as SUM and COUNT are applied to groups of items sharing values. When you specify GROUP BY continent the result is that you get only one row for each different value of continent. All the other columns must be "aggregated" by one of SUM, COUNT ...
The HAVING clause allows use to filter the groups which are displayed. The WHERE clause filters rows before the aggregation, the HAVING clause filters after the aggregation.
The WHERE filter takes place before the aggregating function.
The HAVING clause is tested after the GROUP BY.
SELECT yr, subject, winner
FROM nobel
WHERE yr IN (SELECT yr
FROM nobel
GROUP BY yr
HAVING COUNT(winner) > 12)
/* 可能不止一年里得奖人数超过 12 人,所以要用 IN */
- 哪些得獎者獲獎多於1次呢?他們是哪一年獲得哪項獎項呢? 列出他們的名字,獲獎年份及獎項。先按名字,再按年份順序排序。
SELECT winner, yr, subject
FROM nobel
WHERE winner IN(SELECT winner
FROM nobel
GROUP BY winner
HAVING COUNT(winner) > 1)
ORDER BY winner, yr
5. SUM and COUNT
- For each continent show the continent and number of countries.
SELECT continent, COUNT(name) as numberOfCountries
FROM world
GROUP BY continent
第7、8题看起来很相似,难度完全不同。
- For each continent show the continent and number of countries with populations of at least 10 million.
SELECT continent, COUNT(name)
FROM world
WHERE population >= 10000000
GROUP BY continent
- List the continents that have a total population of at least 100 million.
SELECT continent
FROM world
GROUP BY continent
HAVING SUM(population) >= 100000000
5.. MORE SUM and COUNT
The nobel table can be used to practice more SUM and COUNT functions
- For each subject show the first year that the prize was awarded.
错误
SELECT subject, yr
FROM nobel x
WHERE yr <= ALL(SELECT yr
FROM nobel y
WHERE y.subject = x.subject)
GROUP BY subject
参考答案
SELECT subject, MIN(yr)
FROM nobel
GROUP BY subject
- Show the number of different winners for each subject.
SELECT subject, COUNT(DISTINCT winner)
FROM nobel
GROUP BY subject
- For each subject show how many years have had prizes awarded.
SELECT subject, COUNT(DISTINCT yr)
FROM nobel
GROUP BY subject
/* 注意年份去重 */
- Show the years in which three prizes were given for Physics.
SELECT yr
FROM nobel
WHERE subject = 'Physics'
GROUP BY yr
HAVING COUNT(winner) = 3
- Show winners who have won more than one subject.
SELECT winner
FROM nobel
GROUP BY winner
HAVING COUNT(DISTINCT subject) > 1
/* 注意是不同奖项,要用到 DISTINCT */
- Show the year and subject where 3 prizes were given. Show only years 2000 onwards.
SELECT yr, subject
FROM nobel
WHERE yr >= 2000
GROUP BY yr, subject
HAVING COUNT(winner) = 3
6. JOIN
- Show player, teamid, coach, gtime for all goals scored in the first 10 minutes gtime<=10.
SELECT player, teamid, coach, gtime
FROM eteam JOIN goal ON (id = teamid)
WHERE gtime <= 10
- List the dates of the matches and the name of the team in which 'Fernando Santos' was the team1 coach.
SELECT mdate, teamname
FROM eteam JOIN game ON (eteam.id = team1)
WHERE coach = 'Fernando Santos'
/* 要清楚选哪几列 */
- For every match involving 'POL', show the matchid, date and the number of goals scored.
SELECT matchid, mdate, COUNT(gtime)
FROM game JOIN goal ON id = matchid
WHERE (team1 = 'POL' OR team2 = 'POL')
GROUP BY matchid, mdate
- List every match with the goals scored by each team as shown. This will use "CASE WHEN" which has not been explained in any previous exercises.
mdate | team1 | score1 | team2 | score2 |
---|---|---|---|---|
1 | July 2012 | ESP | 4 | 0 |
10 | June 2012 | ESP | 1 | 1 |
10 | June 2012 | IRL | 1 | 3 |
…… |
Notice in the query given every goal is listed. If it was a team1 goal then a 1 appears in score1, otherwise there is a 0. You could SUM this column to get a count of the goals scored by team1. Sort your result by mdate, matchid, team1 and team2.
SELECT mdate,
team1,
SUM(CASE WHEN teamid = team1 THEN 1 ELSE 0 END)
AS score1,
team2,
SUM(CASE WHEN teamid = team2 THEN 1 ELSE 0 END)
AS score2
FROM game
LEFT JOIN goal
ON id = matchid
GROUP BY mdate, matchid, team1, team2
/* 注意题干中的 notice */
6.. More JOIN
- Show the song for each track on the album 'Blur'.
提示:'Blur' 在 title 下
SELECT song
FROM album JOIN track ON album.asin = track.album
WHERE album.title = 'Blur'
- A "title track" is where the song is the same as the title. Find the title tracks.
SELECT song as TitleTrack
FROM album JOIN track ON (album.asin = track.album)
WHERE song = title
/* 主角是 song */
- Find the songs that appear on more than 2 albums. Include a count of the number of times each shows up.
SELECT song, COUNT(DISTINCT title)
FROM album JOIN track ON (album.asin = track.album)
GROUP BY song
HAVING COUNT(DISTINCT title) > 2
不懂为什么一定要用 title 而不是 track.album,用 track.album 就不用连接表了。
- A "good value" album is one where the price per track is less than 50 pence. Find the good value album - show the title, the price and the number of tracks.
SELECT title, price, COUNT(song)
FROM album JOIN track ON (album.asin = track.album)
GROUP BY title, price
HAVING price/COUNT(song) < 0.5
- List albums so that the album with the most tracks is first. Show the title and the number of tracks.
Where two or more albums have the same number of tracks you should order alphabetically.
错误
SELECT title, COUNT(song) AS num
FROM album JOIN track ON (album.asin = track.album)
GROUP BY title
ORDER BY num DESC
title 为 'The Greatest Hits'的歌曲数计数错误
7. More JOIN operations
- Obtain the cast list for the film 'Alien'
错误
SELECT actor.name
FROM actor JOIN movie ON actor.id = movie.director
WHERE title = 'Alien'
不懂为什么一定要连接三张表,数据库有讲 movie.director 与 actor.id 对应。
参考答案
SELECT actor.name
FROM actor JOIN casting ON (actorid = actor.id)
WHERE casting.movieid = (SELECT movie.id
FROM movie
WHERE title = 'Alien')
或者直接连接三张表
SELECT actor.name
FROM casting JOIN movie ON casting.movieid = movie.id
JOIN actor JOIN casting.actorid = actor.id
WHERE movie.title = 'Alien'
- List the films together with the leading star for all 1962 films.
SELECT movie.title, actor.name
FROM casting JOIN movie ON casting.movieid = movie.id
JOIN actor ON casting.actorid = actor.id
WHERE movie.yr = '1962'
AND casting.ord = 1
/* the leading star 就是主演,即 casting.ord = 1 */
- Which were the busiest years for 'Rock Hudson', show the year and the number of movies he made each year for any year in which he made more than 2 movies.
SELECT yr, COUNT(title)
FROM casting JOIN movie ON casting.movieid = movie.id
JOIN actor ON casting.actorid = actor.id
WHERE actor.name = 'Rock Hudson'
GROUP BY yr
HAVING COUNT(title) > 2
- List the film title and the leading actor for all of the films 'Julie Andrews' played in.
Title is not a unique field, create a table of IDs in your subquery.
SELECT movie.title, actor.name
FROM casting JOIN movie ON casting.movieid = movie.id
JOIN actor ON casting.actorid = actor.id
WHERE casting.movieid in (SELECT casting.movieid
FROM casting JOIN actor ON casting.actorid = actor.id
WHERE actor.name='Julie Andrews')
AND casting.ord = 1
- Obtain a list, in alphabetical order, of actors who've had at least 15 starring roles.
SELECT actor.name
FROM actor JOIN casting ON actor.id = casting.actorid
WHERE casting.ord = 1
GROUP BY actor.name
HAVING COUNT(actor.name) >= 15
/* 要清楚逻辑关系 */
- List the films released in the year 1978 ordered by the number of actors in the cast, then by title.
SELECT movie.title, COUNT(*)
FROM movie JOIN casting ON movie.id = casting.movieid
WHERE yr = 1978
GROUP BY movie.title
ORDER BY COUNT(casting.ord) DESC, movie.title
/*
COUNT(*) 可以写为 COUNT(casting.actorid)。
COUNT(*) 对行的数目进行计算,包含 null;
COUNT(字段名) 对特定的列的值具有的行数进行计算,不包含 null,得到的结果将是除去值为 null 和重复数据后的结果。
*/
- List all the people who have worked with 'Art Garfunkel'.
SELECT actor.name
FROM actor JOIN casting ON actor.id = casting.actorid
WHERE casting.movieid IN (SELECT casting.movieid
FROM casting JOIN actor ON casting.actorid = actor.id
WHERE actor.name = 'Art Garfunkel')
AND actor.name != 'Art Garfunkel'
/* 先求出 Art Garfunkel 演过的所有电影,其次查询所有演员,最后排除 Art Garfunkel */
8. Using NULL
- Note the INNER JOIN misses the teachers with no department and the departments with no teacher.
SELECT teacher.name, dept.name
FROM teacher INNER JOIN dept
ON (teacher.dept = dept.id)
/* 用 JOIN 也忽略 null */
- Use a different JOIN so that all teachers are listed.
SELECT teacher.name, dept.name
FROM teacher LEFT JOIN dept
ON (teacher.dept = dept.id)
- Use a different JOIN so that all departments are listed.
SELECT teacher.name, dept.name
FROM teacher RIGHT JOIN dept
ON (teacher.dept = dept.id)
- Use COALESCE to print the mobile number. Use the number '07986 444 2266' if there is no number given. Show teacher name and mobile number or '07986 444 2266'
SELECT name, COALESCE(mobile, '07986 444 2266') AS mobile
FROM teacher
- Use COUNT and GROUP BY dept.name to show each department and the number of staff. Use a RIGHT JOIN to ensure that the Engineering department is listed.
SELECT dept.name, COUNT(teacher.name)
FROM teacher RIGHT JOIN dept ON (teacher.dept = dept.id)
GROUP BY dept.name
/* COUNT() 里可以用 teacher.id、teacher.dept、teacher.name、teacher.phone */
- Use CASE to show the name of each teacher followed by 'Sci' if the teacher is in dept 1 or 2 and 'Art' otherwise.
SELECT name,
CASE WHEN (dept = 1 OR dept = 2) THEN 'Sci'
ELSE 'Art'
END
AS SA
FROM teacher
更好的:
SELECT name,
CASE WHEN dept IN (1, 2) THEN 'Sci'
ELSE 'Art'
END
AS SA
FROM teacher
8.. More Using NULL
- One MSP was kicked out of the Labour party and has no party. Find him.
提示:You might think that the phrase dept=NULL would work here. It doesn't. This is because NULL "propogates". Any normal expression that includes NULL is itself NULL, thus the value of the expressions 2+NULL and party || NULL and NULL=NULL for example are all NULL.
SELECT Name
FROM msp
WHERE Party IS NULL
- Obtain a list of all parties which have at least one MSP.
SELECT party.Name
FROM msp JOIN party ON (msp.Party = party.Code)
GROUP BY party.Name
HAVING COUNT(party.Name) >= 1
- Obtain a list of all MSPs by name, give the name of the MSP and the name of the party where available. Be sure that Canavan MSP, Dennis is in the list. Use ORDER BY msp.name to sort your output by MSP.
SELECT msp.Name, party.Name
FROM msp LEFT JOIN party ON (msp.Party = party.Code)
ORDER BY msp.name
/* 强调的原因是存在有 msp.Name 没有 party.Name 的情况 */
- Obtain a list of parties which have MSPs, include the number of MSPs.
SELECT party.Name, COUNT(*)
FROM msp JOIN party ON (msp.Party = party.Code)
GROUP BY party.Name
- A list of parties with the number of MSPs; include parties with no MSPs.
SELECT party.Name, COUNT(msp.Party) AS COUNT
FROM msp RIGHT JOIN party ON (msp.Party = party.Code)
GROUP BY party.Name
9. Self JOIN
- Give the id and the name for the stops on the '4' 'LRT' service.
SELECT stops.id, stops.name
FROM stops JOIN route ON (stops.id = route.stop)
WHERE route.num = '4'
AND route.company = 'LRT'
- The query shown gives the number of routes that visit either London Road (149) or Craiglockhart (53). Run the query and notice the two services that link these stops have a count of 2. Add a HAVING clause to restrict the output to these two routes.
SELECT company, num, COUNT(*)
FROM route
WHERE stop = 149 OR stop = 53
GROUP BY company, num
HAVING COUNT(num) = 2
- Execute the self join shown and observe that b.stop gives all the places you can get to from Craiglockhart, without changing routes. Change the query so that it shows the services from Craiglockhart to London Road.
SELECT a.company, a.num, a.stop, b.stop
FROM route a JOIN route b ON
(a.company = b.company AND a.num = b.num)
WHERE a.stop = 53
AND b.stop = 149
- The query shown is similar to the previous one, however by joining two copies of the stops table we can refer to stops by name rather than by number. Change the query so that the services between 'Craiglockhart' and 'London Road' are shown.
SELECT a.company, a.num, stopa.name, stopb.name
FROM route a JOIN route b ON (a.company = b.company
AND a.num = b.num)
JOIN stops stopa ON (a.stop = stopa.id)
JOIN stops stopb ON (b.stop = stopb.id)
WHERE stopa.name = 'Craiglockhart'
AND stopb.name = 'London Road'
- Give a list of all the services which connect stops 115 and 137 ('Haymarket' and 'Leith').
SELECT DISTINCT a.company, a.num
FROM route a JOIN route b ON (a.company = b.company
AND a.num = b.num)
WHERE a.stop = '115'
AND b.stop = '137'
/* 运行后对比正确输出才知道要加 DISTINCT */
- Give a distinct list of the stops which may be reached from 'Craiglockhart' by taking one bus, including 'Craiglockhart' itself, offered by the LRT company. Include the company and bus no. of the relevant services.
SELECT DISTINCT stops.name, route.company, route.num
FROM stops JOIN route ON (stops.id = route.stop)
WHERE route.num IN (SELECT route.num
FROM stops JOIN route ON (stops.id = route.stop)
WHERE stops.name = 'Craiglockhart')
AND route.company = 'LRT'
ORDER BY route.num
或直接链接
SELECT DISTINCT stopb.name, a.company, a.num
FROM route a JOIN route b ON (a.num = b.num
AND a.company = b.company)
JOIN stops stopa ON (a.stop = stopa.id)
JOIN stops stopb ON (b.stop = stopb.id)
WHERE stopa.name = 'Craiglockhart'
AND a.company = 'LRT'
如果是先连接两个 stops 表,要怎么表示?
- Find the routes involving two buses that can go from Craiglockhart to Lochend. Show the bus no. and company for the first bus, the name of the stop for the transfer,
and the bus no. and company for the second bus.
Hint: Self-join twice to find buses that visit Craiglockhart and Lochend, then join those on matching stops.
SELECT DISTINCT x.num, x.company, name, y.num, y.company
FROM (SELECT a.num, a.company, b.stop
FROM route a JOIN route b ON (a.num = b.num
AND a.company = b.company
AND a.stop != b.stop)
WHERE a.stop = (SELECT id
FROM stops
WHERE name ='Craiglockhart'))
AS x
JOIN (SELECT c.num, c.company, c.stop
FROM route c JOIN route d ON c.num = d.num
AND c.company = d.company
AND c.stop != d.stop
WHERE d.stop = (SELECT id
FROM stops
WHERE name = 'Lochend'))
AS y
ON x.stop = y.stop
JOIN stops ON x.stop = stops.id
ORDER BY x.num, stops.name, y.num
太太太难了……
8+ Numeric Examples
传送门:NSS Tutorial
- The example shows the number who responded for:
question 1
at 'Edinburgh Napier University'
studying '(8) Computer Science'
Show the the percentage who STRONGLY AGREE
SELECT A_STRONGLY_AGREE
FROM nss
WHERE question='Q01'
AND institution='Edinburgh Napier University'
AND subject='(8) Computer Science'
- Show the institution and subject where the score is at least 100 for question 15.
SELECT institution, subject
FROM nss
WHERE question='Q15'
AND score >= 100
- Show the institution and score where the score for '(8) Computer Science' is less than 50 for question 'Q15'.
SELECT institution, score
FROM nss
WHERE question='Q15'
AND score < 50
AND subject='(8) Computer Science'
- Show the subject and total number of students who responded to question 22 for each of the subjects '(8) Computer Science' and '(H) Creative Arts and Design'.
HINT: You will need to use SUM over the response column and GROUP BY subject
SELECT subject, SUM(response)
FROM nss
WHERE question='Q22'
AND subject IN ('(8) Computer Science', '(H) Creative Arts and Design')
GROUP BY subject
- Show the subject and total number of students who A_STRONGLY_AGREE to question 22 for each of the subjects '(8) Computer Science' and '(H) Creative Arts and Design'.
HINT: The A_STRONGLY_AGREE column is a percentage. To work out the total number of students who strongly agree you must multiply this percentage by the number who responded (response) and divide by 100 - take the SUM of that.
SELECT subject, SUM(response * A_STRONGLY_AGREE / 100)
FROM nss
WHERE question='Q22'
AND subject IN ('(8) Computer Science', '(H) Creative Arts and Design')
GROUP BY subject
/* 注意 A_STRONGLY_AGREE 是百分数,要得到正确的单位还要除以 100 */
- Show the percentage of students who A_STRONGLY_AGREE to question 22 for the subject '(8) Computer Science' show the same figure for the subject '(H) Creative Arts and Design'.
Use the ROUND function to show the percentage without decimal places.
SELECT subject, ROUND(SUM(response * A_STRONGLY_AGREE / 100) / SUM(response) * 100, 0) AS percentage
FROM nss
WHERE question='Q22'
AND subject IN ('(8) Computer Science', '(H) Creative Arts and Design')
GROUP BY subject
- Show the average scores for question 'Q22' for each institution that include 'Manchester' in the name.
The column score is a percentage - you must use the method outlined above to multiply the percentage by the response and divide by the total response. Give your answer rounded to the nearest whole number.
SELECT institution, ROUND(SUM(response * score / 100) / SUM(response) * 100)
FROM nss
WHERE question='Q22'
AND institution LIKE '%Manchester%'
GROUP BY institution
- Show the institution, the total sample size and the number of computing students for institutions in Manchester for 'Q01'.
SELECT institution, SUM(sample) AS size,
SUM(CASE WHEN subject = '(8) Computer Science' THEN sample
ELSE 0 END)
AS number
FROM nss
WHERE question = 'Q01'
AND institution LIKE '%Manchester%'
GROUP BY institution