hive 批量添加,删除分区
一.批量添加分区:
use bigdata;
alter table siebel_member
add if not exists
partition(dt='20180401') location '20180401'
partition(dt='20180402') location '20180402'
partition(dt='20180403') location '20180403'
partition(dt='20180404') location '20180404';
二.批量删除分区
alter table siebel_member drop if exists
partition(dt='20180401'),
partition(dt='20180402'),
partition(dt='20180403'),
partition(dt='20180404');
三.大量分区的添加
虽说是批量添加分区,但是一下子添加2000个分区,如果把2000个分区写在一个语句里面,hive是不允许的。
所以只好一个月一个月的加,尝试了使用shell 生成 line="hive -e "xxxxx"" ${line} ,但是解析错误,采用python + shell 解决了问题。
python 代码如下:
#!/usr/bin/env python
#encoding=utf8
import datetime
import os
import sys
def dateRange(beginDate, endDate):
dates = []
dt = datetime.datetime.strptime(beginDate, "%Y%m%d")
date = beginDate[:]
while date < endDate:
dates.append(date)
dt = dt + datetime.timedelta(1)
date = dt.strftime("%Y%m%d")
return dates
def genSql(datelist, table):
sql = """
hive -e "
use bigdata;
alter table """ + table + """
add if not exists """
for i in datelist:
sql = sql + """partition(dt='""" + i + """') location '""" + i + """' """
sql = sql.strip() + """;" """
os.system(sql)
if __name__ == '__main__':
if sys.argv.__len__() != 4:
print("参数错误")
sys.exit()
st = sys.argv[1]
et = sys.argv[2]
table = sys.argv[3]
datelist = dateRange(st,et)
genSql(datelist,table)
shell 用来调度 py
#添加分区,按月添加分区 会添加到et所在月的最后一天
function add_par(){
st=$1
et=$2
tbname=$3
while [ ${et} -ge ${st} ]
do
pet=$(date --date="+1 month ${et}" +"%Y%m%d")
python a.py ${et} ${pet} ${tbname}
et=$(date --date="1 month ago ${et}" +"%Y%m%d")
done
}
欢迎转载,不必注明出处