Himawari-8 Data: Introduction, Download, and Conversion

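1. Introduction to the Himawari-8 satellite

Himawari-8 is a geostationary satellite launched by Japan. The Japan Meteorological Agency (JMA) began operating Himawari-8 on 7 July 2015, and Himawari-9 entered backup operation on 10 March 2017. Both satellites sit in orbit at around 140.7°E and are planned to observe East Asia and the western Pacific for 15 years.

2. Satellite characteristics

The revisit interval is short (10 min), the spectral resolution is high (16 bands), and the primary sensor is the AHI (Advanced Himawari Imager). The data are mostly used for meteorological observation.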

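3. Data formats

Only Level-1 (L1) data are discussed here. The data shared on the FTP server come in two formats: .nc (NetCDF) and .DAT (Himawari Standard Data, HSD).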

1. nc
# Available Himawari  L1 Gridded Data 

## Full-disk
 Projection: EQR
 Observation area: 60S-60N, 80E-160W
 Temporal resolution: 10-minutes
 Spatial resolution: 5km (Pixel number: 2401, Line number: 2401)
                     2km (Pixel number: 6001, Line number: 6001)
 Data: albedo(reflectance*cos(SOZ) of band01~band06)
       Brightness temperature of band07~band16
       satellite zenith angle, satellite azimuth angle, 
       solar zenith angle, solar azimuth angle, observation hours (UT)

## Japan Area
 Projection: EQR
 Observation area: 23N-50N, 123E-150E
 Temporal resolution: 10-minutes
 Spatial resolution: 1km (Pixel number: 2701, Line number: 2601)
 Data: albedo(reflectance*cos(SOZ) of band01~band06)
       Brightness temperature of band07, 14, 15
       satellite zenith angle, satellite azimuth angle, 
       solar zenith angle, solar azimuth angle, observation hours (UT)

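The gridded data cover two regions: the full disk and the Japan area. Only the full disk is discussed here. Its images come at 5 km (2401 rows/columns) or 2 km (6001 rows/columns), and each file mainly contains albedo (reflectance, bands 1-6), brightness temperature (bands 7-16), the solar zenith and azimuth angles, the satellite zenith and azimuth angles (useful for atmospheric correction), and the lon/lat coordinates.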
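The pixel counts follow from the observation area quoted above. A minimal sketch of the arithmetic, assuming grid points fall on both edges of the 60S-60N, 80E-160W box (the odd counts 2401 and 6001 suggest this, but the README excerpt does not state it); the function name is illustrative:

```python
# Extent of the Full-disk grid as quoted in the README excerpt.
LON_SPAN_DEG = 120.0  # 80E to 160W crosses the date line: 80 -> 200 degrees east
LAT_SPAN_DEG = 120.0  # 60S to 60N


def cell_size_deg(span_deg, n_points):
    """Spacing between adjacent grid points, assuming both edges are grid points."""
    return span_deg / (n_points - 1)


for n_points in (2401, 6001):
    step = cell_size_deg(LON_SPAN_DEG, n_points)
    print(f"{n_points} points -> {step:.2f} degrees per pixel")
```

Under that assumption the "5 km" product is a 0.05° grid and the "2 km" product is a 0.02° grid, so the kilometre figures are nominal values rather than exact ground distances.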

## Full-disk
 NC_H08_YYYYMMDD_hhmm_Rbb_FLDK.xxxxx_yyyyy.nc

 where YYYY: 4-digit year of observation start time (timeline);
       MM: 2-digit month of timeline;
       DD: 2-digit day of timeline;
       hh: 2-digit hour of timeline;
       mm: 2-digit minutes of timeline;
       bb: 2-digit band number (varies from "01" to "16");
       xxxxx: pixel number; ("2401": 5km resolution, 
                             "6001": 2km resolution, )
       yyyyy: line number; ("2401": 5km resolution, 
                             "6001": 2km resolution, )

 Example: 
   NC_H08_20160831_0000_R21_FLDK.02401_02401.nc
   NC_H08_20160831_0000_R21_FLDK.06001_06001.nc
    
## Japan Area
 NC_H08_YYYYMMDD_hhmm_rbb_FLDK.xxxxx_yyyyy.nc

 where YYYY: 4-digit year of observation start time (timeline);
       MM: 2-digit month of timeline;
       DD: 2-digit day of timeline;
       hh: 2-digit hour of timeline;
       mm: 2-digit minutes of timeline;
       bb: 2-digit band number (fixed to "14");
       xxxxx: pixel number; (fixed to "2701" : 1km resolution)
       yyyyy: line number; (fixed to "2601" : 1km resolution)

 Example: 
   NC_H08_20160831_0000_r14_FLDK.02701_02601.nc

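These are the naming conventions for the gridded data. In NC_H08_YYYYMMDD_hhmm_Rbb_FLDK.xxxxx_yyyyy.nc, note that an uppercase R marks the full-disk product and a lowercase r marks the Japan area; the resolution can be read from the pixel and line counts (xxxxx_yyyyy). Times in the file names are UTC, which is 8 hours behind Beijing time: UTC = Beijing time - 8 hours.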
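The naming convention can be turned into a small parser. The sketch below is built only from the pattern and examples quoted above; the regular expression, the function name and the returned keys are illustrative choices, not part of any official tool:

```python
import re
from datetime import datetime, timedelta, timezone

# Pattern derived from NC_H08_YYYYMMDD_hhmm_Rbb_FLDK.xxxxx_yyyyy.nc
# (uppercase R: full disk, lowercase r: Japan area).
NC_NAME = re.compile(
    r"^NC_H08_(?P<date>\d{8})_(?P<time>\d{4})_(?P<area>[Rr])(?P<bb>\d{2})"
    r"_FLDK\.(?P<pixels>\d{5})_(?P<lines>\d{5})\.nc$"
)

BEIJING = timezone(timedelta(hours=8))  # Beijing time = UTC + 8 hours


def parse_nc_name(name):
    """Return the fields encoded in a gridded-data file name, or None if it does not match."""
    m = NC_NAME.match(name)
    if m is None:
        return None
    utc = datetime.strptime(m["date"] + m["time"], "%Y%m%d%H%M").replace(tzinfo=timezone.utc)
    return {
        "utc": utc,
        "beijing": utc.astimezone(BEIJING),
        "area": "full-disk" if m["area"] == "R" else "japan",
        "bb": m["bb"],
        "pixels": int(m["pixels"]),
        "lines": int(m["lines"]),
    }


info = parse_nc_name("NC_H08_20160831_0000_R21_FLDK.02401_02401.nc")
print(info["utc"].isoformat(), info["beijing"].isoformat(), info["area"], info["pixels"])
```

For the example file the observation starts at 00:00 UTC, which is 08:00 Beijing time on the same day.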

2. hsd
# Available Himawari Standard Data

## Full-disk
 Observation area: Full-disk
 Temporal resolution: 10-minutes
 Spatial resolution: 0.5km (band 3), 1km (band 1,2,4), 2km (band 5-16)

## Japan Area
 Observation area: Japan area (Region 1 & 2)
 Temporal resolution: 2.5-minutes
 Spatial resolution: 0.5km (band 3), 1km (band 1,2,4), 2km (band 5-16)

## Target Area
 Observation area: Target area (Region 3)
 Temporal resolution: 2.5-minutes
 Spatial resolution: 0.5km (band 3), 1km (band 1,2,4), 2km (band 5-16)

## Color Image Data
 png images of Full-disk, Japan area and Target area, compositing three visible
 bands (blue: 0.47 micron; green: 0.51 micron; red: 0.64 micron).

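The internal structure of the files is not covered for now; only the resolution is considered here. HSD matches or exceeds the nc product in resolution: 0.5 km (band 3), 1 km (bands 1, 2, 4) and 2 km (bands 5-16), so the finest resolution is 500 m. The available data are full-disk, Japan-area and target-area images, plus true-color composites.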

 where YYYY: 4-digit year of observation start time (timeline);
       MM: 2-digit month of timeline;
       DD: 2-digit day of timeline;
       hh: 2-digit hour of timeline;
       mm: 2-digit minutes of timeline;
       bb: 2-digit band number (varies from "01" to "16");
       jj: spatial resolution ("05": 0.5km, "10": 1.0km, "20": 2.0km);
       kk: segment number (varies from "01" to "10"); and
       ll: total number of segments (fixed to "10").

 example: HS_H08_20150728_2200_B01_FLDK_R10_S0110.DAT

On the FTP server the data are stored as separate .DAT files for each band.
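The field list and the example name are enough to sketch a parser for full-disk HSD file names. The overall pattern below is inferred from the single example HS_H08_20150728_2200_B01_FLDK_R10_S0110.DAT, so treat it as an assumption; the function name and dictionary keys are illustrative:

```python
import re

# Inferred layout: HS_H08_<date>_<time>_B<bb>_FLDK_R<jj>_S<kk><ll>.DAT
HSD_NAME = re.compile(
    r"^HS_H08_(?P<date>\d{8})_(?P<time>\d{4})_B(?P<band>\d{2})_FLDK"
    r"_R(?P<res>\d{2})_S(?P<seg>\d{2})(?P<total>\d{2})\.DAT$"
)

# jj codes as listed in the field description.
RES_KM = {"05": 0.5, "10": 1.0, "20": 2.0}


def parse_hsd_name(name):
    """Return the fields encoded in a full-disk HSD file name, or None if it does not match."""
    m = HSD_NAME.match(name)
    if m is None:
        return None
    return {
        "date": m["date"],
        "time": m["time"],
        "band": int(m["band"]),
        "resolution_km": RES_KM.get(m["res"]),
        "segment": int(m["seg"]),
        "total_segments": int(m["total"]),
    }


print(parse_hsd_name("HS_H08_20150728_2200_B01_FLDK_R10_S0110.DAT"))
```

The example decodes to band 1 at 1.0 km, segment 1 of 10, which agrees with the resolution table above (band 1 is a 1 km band).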

4. Automatic download and conversion

This section mainly follows these articles: https://blog.csdn.net/esa_dsq/article/details/105109487

https://blog.csdn.net/qq_44317919/article/details/108245097?ops_request_misc=%257B%2522request%255Fid%2522%253A%2522160566320519724836713938%2522%252C%2522scm%2522%253A%252220140713.130102334.pc%255Fall.%2522%257D&request_id=160566320519724836713938&biz_id=0&utm_medium=distribute.pc_search_result.none-task-blog-2allfirst_rank_v2~rank_v28_p-4-108245097.pc_first_rank_v2_rank_v28p&utm_term=himawari-8%E6%95%B0%E6%8D%AE%E8%87%AA%E5%8A%A8&spm=1018.2118.3001.4449

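The original article downloads L3 AOD data; here it has been changed to download the L1 nc data. Once a file is downloaded, it is recorded in a database and converted to tif, with the outputs stored in two subfolders, albedo and tbb.

The main code follows.

  1. Core module: downloading data over FTP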

    import os
    import ftplib

    class myFTP:

        def __init__(self, host, port=21):
            '''
            @desc: connect to the FTP server; host is the server address, port defaults to 21
            '''
            # Create the connection per instance; a class-level FTP object would be shared by every instance
            self.ftp = ftplib.FTP()
            self.ftp.connect(host, port)

        def Login(self, user, password):
            '''
            @desc: log in to the FTP server with the given user name and password
            '''
            self.ftp.login(user, password)
            print(self.ftp.getwelcome())  # show the server's welcome message

        def DownLoadFile(self, LocalFile, RemoteFile):
            '''
            @desc: download a single file; LocalFile is the local path and file name, RemoteFile is the path and file name on the FTP server
            '''
            bufSize = 102400

            # Receive the remote file and write it locally; the with-block closes the file even if the transfer fails
            with open(LocalFile, 'wb') as file_handler:
                self.ftp.retrbinary('RETR ' + RemoteFile, file_handler.write, bufSize)
            return True

        def DownLoadFileTree(self, LocalDir, tifDir, RemoteDir, choice, dateStr):
            '''
            @desc: download the matching files in a directory; LocalDir is the local directory, RemoteDir is the FTP directory
                   nc2tiff and insert2database are the functions defined later in this post
            '''
            # Create the local directory if it does not exist
            os.makedirs(LocalDir, exist_ok=True)

            # Get all file names in the FTP directory as a list
            # (the order seems to be arbitrary)
            self.ftp.cwd(RemoteDir)        # set the current FTP working directory
            RemoteNames = self.ftp.nlst()  # file names in that directory
            RemoteNames.reverse()

            for file in RemoteNames:
                # Download to a temporary .temp file first and rename it once the transfer completes.
                # Otherwise a file left incomplete by an interrupted run would be treated as
                # already downloaded on the next run.
                Local = os.path.join(LocalDir, file[0:-3] + ".temp")
                LocalNew = os.path.join(LocalDir, file)

                # Hourly files: only download 01-09 UTC (09-17 Beijing time),
                # only .nc files, and skip files that already exist.
                # Example name: NC_H08_20201210_1300_R21_FLDK.06001_06001.nc
                # 'R' marks the full-disk product, available at 5 km and 2 km
                if choice == 1:
                    # Skip anything that is not a full-disk .nc file before slicing out the hour
                    if not (file.endswith('.nc') and len(file) > 21
                            and file[21] == 'R' and file[16:18].isdigit()):
                        continue
                    if 1 <= int(file[16:18]) <= 9:
                        if not os.path.exists(LocalNew):
                            print("Downloading the file of %s" % file)
                            self.DownLoadFile(Local, file)
                            os.rename(Local, LocalNew)
                            print("The download of the file of %s has finished\n" % file)
                            albedoPath, tbbPath = nc2tiff(LocalNew, tifDir)
                            insert2database(file, LocalNew, albedoPath, tbbPath, dateStr)
                        else:
                            print("The file of %s has already existed!\n" % file)

            self.ftp.cwd("..")  # move back up one level on the FTP server
            return

        def close(self):
            self.ftp.quit()
    

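    This block was adapted from the articles linked above, with some modifications. The main difficulty lies in knowing the FTP commands and the image naming convention. Some commonly used ftplib calls are listed below.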

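    from ftplib import FTP            # import the FTP class

    # Connecting and logging in
    ftp = FTP()                       # create the client object
    ftp.set_debuglevel(2)             # debug level 2 prints detailed output
    ftp.connect("IP", 21)             # FTP server address and port (the port is an int)
    ftp.login("user", "password")     # user name and password
    print(ftp.getwelcome())           # welcome message
    ftp.cwd("xxx/xxx")                # change to a remote directory
    bufsize = 1024                    # buffer size
    filename = "filename.txt"         # file to download
    with open(filename, "wb") as file_handle:   # open the local file for binary writing
        ftp.retrbinary("RETR filename.txt", file_handle.write, bufsize)  # receive the remote file and write it locally
    ftp.set_debuglevel(0)             # turn debugging off
    ftp.quit()                        # close the connection

    # Other FTP operations
    ftp.cwd(pathname)                 # set the current FTP working directory
    ftp.dir()                         # print a detailed listing of the directory
    ftp.nlst()                        # get the file names in the directory
    ftp.mkd(pathname)                 # create a remote directory
    ftp.pwd()                         # return the current remote directory
    ftp.rmd(dirname)                  # remove a remote directory
    ftp.delete(filename)              # delete a remote file
    ftp.rename(fromname, toname)      # rename fromname to toname
    ftp.storbinary("STOR filename.txt", file_handle, bufsize)        # upload a file (file_handle opened with "rb")
    ftp.retrbinary("RETR filename.txt", file_handle.write, bufsize)  # download a file (file_handle opened with "wb")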
    

    Commonly used upload and download helpers:

    #!/usr/bin/python
    # -*- coding: utf-8 -*-

    from ftplib import FTP

    def ftpconnect(host, username, password):
        ftp = FTP()
        # ftp.set_debuglevel(2)
        ftp.connect(host, 21)
        ftp.login(username, password)
        return ftp

    # Download a file from the FTP server
    def downloadfile(ftp, remotepath, localpath):
        bufsize = 1024
        # The with-block closes the local file even if the transfer fails
        with open(localpath, 'wb') as fp:
            ftp.retrbinary('RETR ' + remotepath, fp.write, bufsize)

    # Upload a local file to the FTP server
    def uploadfile(ftp, remotepath, localpath):
        bufsize = 1024
        with open(localpath, 'rb') as fp:
            ftp.storbinary('STOR ' + remotepath, fp, bufsize)
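DownLoadFileTree writes each file under a temporary name and renames it only after the transfer finishes, so that an interrupted run does not leave a truncated file that looks complete. That idea can be isolated as a small helper. This is a sketch, not the author's code: `download_atomically` and `fake_fetch` are hypothetical names, and `fetch` stands in for any function that streams bytes into a write callback (for example `lambda w: ftp.retrbinary('RETR ' + name, w)`).

```python
import os
import tempfile


def download_atomically(fetch, final_path):
    """Stream data into final_path + '.temp', then rename it to final_path.

    Returns False without fetching when final_path already exists.
    """
    if os.path.exists(final_path):
        return False  # a finished download is already present
    temp_path = final_path + ".temp"
    try:
        with open(temp_path, "wb") as fh:
            fetch(fh.write)
    except Exception:
        # Remove the partial file so that it is never mistaken for a finished one.
        if os.path.exists(temp_path):
            os.remove(temp_path)
        raise
    os.replace(temp_path, final_path)
    return True


def fake_fetch(write):
    """Stand-in for an FTP transfer: delivers the payload in two chunks."""
    write(b"abc")
    write(b"def")


with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "sample.nc")
    print(download_atomically(fake_fetch, target))  # first call performs the download
    print(download_atomically(fake_fetch, target))  # second call skips the existing file
```

Removing the partial file on failure is a design choice of this sketch; keeping it and resuming the transfer would be another option.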
    
  2. Storing records in the database (MySQL)

    import uuid
    import pymysql

    def insert2database(ftpPath, filePath, albedoPath, tbbPath, dateTime):
        '''
        @description: record a downloaded file and its converted outputs in the database
        '''
        # uuid value for the database row
        png_uuid = str(uuid.uuid1())

        conn = pymysql.connect(
            database = 'himawari8',
            user = 'root',
            password = 'root',
            host = 'localhost',
            port = 3306
        )
        try:
            cur = conn.cursor()
            # First check whether a row (uuid) for this image already exists.
            # The table 'downloadHimawari8' must be created beforehand to hold these records.
            sql = 'SELECT uuid FROM downloadHimawari8 WHERE ftpPath=%s;'
            sql_data = (ftpPath,)  # one-element tuple; (ftpPath) alone is just a string
            cur.execute(sql, sql_data)
            sql_res = cur.fetchall()
            if len(sql_res) == 0:
                print('Inserting record...')
                sql = 'INSERT INTO downloadHimawari8' \
                    ' (uuid,ftpPath,filePath,albedoPath,tbbPath,dateTime) VALUES (%s,%s,%s,%s,%s,%s);'
                sql_data = (png_uuid,ftpPath,filePath,albedoPath,tbbPath,dateTime)
                # print(sql%sql_data)
                cur.execute(sql, sql_data)
            else:
                print('[warning] The file already exists; its record will be updated.')
                # Restrict the update to this file; without WHERE every row in the table is overwritten
                sql = 'UPDATE downloadHimawari8' \
                    ' SET uuid=%s,filePath=%s,albedoPath=%s,tbbPath=%s,dateTime=%s WHERE ftpPath=%s;'
                sql_data = (png_uuid,filePath,albedoPath,tbbPath,dateTime,ftpPath)
                cur.execute(sql, sql_data)
            conn.commit()
            cur.close()
        finally:
            conn.close()
    

    Note: MySQL must be installed first; see https://www.cnblogs.com/winton-nfs/p/11524007.html
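The select-then-insert-or-update logic can be tried without a MySQL server. The sketch below swaps in Python's built-in sqlite3 module as a stand-in, so the placeholder syntax is `?` rather than pymysql's `%s`. The table name `downloads` and its reduced column set are hypothetical, since the post does not show the schema of downloadHimawari8.

```python
import sqlite3
import uuid


def upsert_record(conn, ftp_path, file_path):
    """Insert a row for ftp_path, or update the existing row for that same ftp_path."""
    cur = conn.cursor()
    cur.execute("SELECT uuid FROM downloads WHERE ftpPath = ?", (ftp_path,))
    if cur.fetchone() is None:
        cur.execute(
            "INSERT INTO downloads (uuid, ftpPath, filePath) VALUES (?, ?, ?)",
            (str(uuid.uuid1()), ftp_path, file_path),
        )
    else:
        # The WHERE clause keeps the update from touching rows for other files.
        cur.execute(
            "UPDATE downloads SET filePath = ? WHERE ftpPath = ?",
            (file_path, ftp_path),
        )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE downloads (uuid TEXT, ftpPath TEXT UNIQUE, filePath TEXT)")
upsert_record(conn, "a.nc", "/data/a.nc")
upsert_record(conn, "b.nc", "/data/b.nc")
upsert_record(conn, "a.nc", "/new/a.nc")  # updates only the row for a.nc
rows = conn.execute("SELECT ftpPath, filePath FROM downloads ORDER BY ftpPath").fetchall()
print(rows)
```

After the three calls the table still has two rows, and only the row for a.nc carries the new path.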

  3. Converting nc to tif

    import os
    import h5py
    import numpy as np
    from osgeo import osr

    def nc2tiff(ifile, outDir):
        '''
        @desc: only for Himawari-8 full-disk (R) L1 files;
               reads the nc file and writes albedo/tbb tif files named after it
        @ifile: path to the nc file
        @outDir: output directory
        Note: raster2tif is a GeoTIFF-writing helper that is not shown in this post
        '''
        with h5py.File(ifile, mode='r') as ds:
            all_vars = list(ds.keys())
            firstData = ds['albedo_01'][:]
            anotherData = ds['tbb_10'][:]
            lon = ds['longitude'][:]
            lat = ds['latitude'][:]

            # Stack the six albedo bands into one (rows, cols, 6) array
            count = 0
            data1 = np.zeros((firstData.shape[0], firstData.shape[1], 6))
            for var in all_vars:
                if var.startswith('albedo_'):
                    data1[:,:,count] = ds[var][:]
                    count = count + 1

            # Stack the ten brightness-temperature bands into one (rows, cols, 10) array
            count = 0
            data2 = np.zeros((anotherData.shape[0], anotherData.shape[1], 10))
            for var in all_vars:
                if var.startswith('tbb_'):
                    data2[:,:,count] = ds[var][:]
                    count = count + 1

        # Cell size: n grid points span n-1 intervals
        xCell = (lon.max()-lon.min())/(len(lon)-1)
        yCell = (lat.max()-lat.min())/(len(lat)-1)
        # Assuming lon/lat hold pixel-center coordinates, shift by half a cell
        # to get the upper-left corner that the geotransform expects
        geotrans = (lon.min()-xCell/2, xCell, 0, lat.max()+yCell/2, 0, -yCell)
        srs = osr.SpatialReference()
        srs.ImportFromEPSG(4326)
        proj = srs.ExportToWkt()
        albedoPath = os.path.join(outDir, 'albedoPath')
        tbbPath = os.path.join(outDir, 'tbbPath')
        os.makedirs(albedoPath, exist_ok=True)
        os.makedirs(tbbPath, exist_ok=True)
        albedoPath = os.path.join(albedoPath, os.path.basename(ifile).replace('.nc', '_albedo.tif'))
        tbbPath = os.path.join(tbbPath, os.path.basename(ifile).replace('.nc', '_tbb.tif'))
        raster2tif(data1, None, type(data1), geotrans, proj, albedoPath)
        raster2tif(data2, None, type(data2), geotrans, proj, tbbPath)
        return albedoPath, tbbPath
    

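    After reading the albedo/tbb datasets, the lon/lat arrays are read to work out the resolution and the origin, which define the geotransform (geotrans). Once the projection and affine parameters are set, the tif files are written.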
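The geotransform step can be checked on its own with plain Python lists. The sketch below assumes that lon and lat are 1-D, regularly spaced vectors of pixel-center coordinates; under that assumption the range is divided by n-1 (n points span n-1 intervals) and the origin is moved half a cell outward to the upper-left corner. The sample coordinates are synthesized from the 5 km grid spec quoted earlier, not read from a real file, and the function name is illustrative.

```python
def geotransform_from_centers(lon, lat):
    """Build a GDAL-style six-element geotransform from 1-D pixel-center coordinates."""
    x_cell = (max(lon) - min(lon)) / (len(lon) - 1)
    y_cell = (max(lat) - min(lat)) / (len(lat) - 1)
    x_origin = min(lon) - x_cell / 2  # left edge of the first column
    y_origin = max(lat) + y_cell / 2  # top edge of the first row
    # (x origin, pixel width, row rotation, y origin, column rotation, negative pixel height)
    return (x_origin, x_cell, 0.0, y_origin, 0.0, -y_cell)


# Synthetic coordinates: 2401 points from 80E eastward and from 60N southward, 0.05 degrees apart.
lon = [80.0 + 0.05 * i for i in range(2401)]
lat = [60.0 - 0.05 * j for j in range(2401)]

gt = geotransform_from_centers(lon, lat)
print([round(v, 4) for v in gt])
```

The negative sixth element reflects that row indices increase southward while latitude decreases.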

5. Results

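With that, the corresponding tif files have been generated and the first step is complete.

6. Next steps

A few issues still need to be resolved: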

  1. Radiometric calibration and atmospheric correction for Himawari-8

  2. Parsing the HSD format
    
Posted 2020-12-11 02:38 by Rser_ljw