DB2 table join query

Today I came across the following problem in a MapReduce exercise:

file:

CHILD      PARENT    
---------- ----------
tom        lucy      
tom        jack      
jone       lucy      
jone       jack      
lucy       mary      
lucy       ben       
jack       alice     
jack       jesse     
terry      alice     
terry      jesse     
philip     terry     
philip     alma      
mark       terry     
mark       alma

Required output:

GRANDCHILD GRANDPARENT
---------- -----------
jone       mary       
jone       ben        
jone       alice      
jone       jesse      
mark       alice      
mark       jesse      
philip     alice      
philip     jesse      
tom        mary       
tom        ben        
tom        alice      
tom        jesse  

 

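It occurred to me that if this data were a DB2 table, a table join should be enough to produce the required output. So I created a table named parent: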

[db2inst1@win ~]$ db2 "select * from parent"

CHILD      PARENT    
---------- ----------
tom        lucy      
tom        jack      
jone       lucy      
jone       jack      
lucy       mary      
lucy       ben       
jack       alice     
jack       jesse     
terry      alice     
terry      jesse     
philip     terry     
philip     alma      
mark       terry     
mark       alma      

  14 record(s) selected.

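To get this result, the table has to be joined with itself (a self-join). Whether DB2 actually executes it as a hash join is up to the optimizer. Here is my SQL: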

[db2inst1@win ~]$ db2 "select u.child GRANDCHILD, b.parent GRANDPARENT from (select * from parent where parent in (select child from parent)) as u ,(select * from parent where child in (select parent from parent)) as b where u.parent=b.child order by u.child"
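The two IN subqueries in the statement above only pre-filter rows that the join condition would drop anyway, so a plain self-join should return the same pairs. Below is a minimal sketch of that simpler form. It uses Python's sqlite3 module as a stand-in for DB2 purely so it can be run anywhere. The column types and the function name `grandparents` are my own assumptions, not part of the original setup.

```python
import sqlite3

# Sample rows copied from the parent table shown above.
ROWS = [
    ("tom", "lucy"), ("tom", "jack"),
    ("jone", "lucy"), ("jone", "jack"),
    ("lucy", "mary"), ("lucy", "ben"),
    ("jack", "alice"), ("jack", "jesse"),
    ("terry", "alice"), ("terry", "jesse"),
    ("philip", "terry"), ("philip", "alma"),
    ("mark", "terry"), ("mark", "alma"),
]


def grandparents(rows):
    """Return (grandchild, grandparent) pairs using a plain self-join."""
    # Assumption: an in-memory SQLite database stands in for DB2 here.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE parent (child VARCHAR(10), parent VARCHAR(10))")
    conn.executemany("INSERT INTO parent VALUES (?, ?)", rows)
    query = """
        SELECT c.child AS grandchild, p.parent AS grandparent
        FROM parent AS c
        JOIN parent AS p ON c.parent = p.child
        ORDER BY c.child
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for grandchild, grandparent in grandparents(ROWS):
        print(f"{grandchild:<10} {grandparent}")
```

The SELECT inside `query` is standard SQL, so it should also run unchanged when passed to the db2 command line. The order of grandparents within one grandchild is not fixed, because the ORDER BY only covers the first column.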

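The DB2 optimizer rewrote the statement as follows. The two IN subqueries have become joins against Q2 and Q4, and DISTINCT removes the duplicate rows those extra joins produce: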

Optimized Statement:
-------------------
SELECT 
  DISTINCT Q1.CHILD AS "GRANDCHILD",
  Q3.PARENT AS "GRANDPARENT",
  Q3.CHILD,
  Q1.PARENT 
FROM 
  DB2INST1.PARENT AS Q1,
  DB2INST1.PARENT AS Q2,
  DB2INST1.PARENT AS Q3,
  DB2INST1.PARENT AS Q4 
WHERE 
  (Q1.PARENT = Q2.CHILD) AND 
  (Q2.CHILD = Q4.PARENT) AND 
  (Q2.CHILD = Q3.CHILD) 
ORDER BY 
  Q1.CHILD
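To convince myself that the rewritten form is equivalent, the sketch below runs both statements against the same sample rows and compares the (GRANDCHILD, GRANDPARENT) pairs. As an assumption for the sake of a runnable check, it uses Python's sqlite3 module instead of DB2 and drops the DB2INST1 schema qualifier. The names `ORIGINAL`, `REWRITTEN` and `run` are hypothetical helpers.

```python
import sqlite3

# Sample rows copied from the parent table shown earlier.
ROWS = [
    ("tom", "lucy"), ("tom", "jack"),
    ("jone", "lucy"), ("jone", "jack"),
    ("lucy", "mary"), ("lucy", "ben"),
    ("jack", "alice"), ("jack", "jesse"),
    ("terry", "alice"), ("terry", "jesse"),
    ("philip", "terry"), ("philip", "alma"),
    ("mark", "terry"), ("mark", "alma"),
]

# The statement as I wrote it, with the two IN subqueries.
ORIGINAL = """
    SELECT u.child AS grandchild, b.parent AS grandparent
    FROM (SELECT * FROM parent WHERE parent IN (SELECT child FROM parent)) AS u,
         (SELECT * FROM parent WHERE child IN (SELECT parent FROM parent)) AS b
    WHERE u.parent = b.child
"""

# The optimizer's form: four-way join plus DISTINCT (schema name omitted).
REWRITTEN = """
    SELECT DISTINCT q1.child AS grandchild, q3.parent AS grandparent,
                    q3.child, q1.parent
    FROM parent AS q1, parent AS q2, parent AS q3, parent AS q4
    WHERE q1.parent = q2.child
      AND q2.child = q4.parent
      AND q2.child = q3.child
"""


def run(query):
    """Load the sample rows, run the query, return sorted (grandchild, grandparent) pairs."""
    conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for DB2
    conn.execute("CREATE TABLE parent (child VARCHAR(10), parent VARCHAR(10))")
    conn.executemany("INSERT INTO parent VALUES (?, ?)", ROWS)
    rows = conn.execute(query).fetchall()
    conn.close()
    # Keep only the first two columns; the rewritten form carries two extra ones.
    return sorted(row[:2] for row in rows)


if __name__ == "__main__":
    print(run(ORIGINAL) == run(REWRITTEN))
```

This only checks equivalence on the sample data. It says nothing about which access plan DB2 would choose for either form.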

I still have a lot to learn about how to optimize SQL.

 

 

 

posted @ 2013-06-25 01:15  胡.杰