用Python计算栅格数据的真实面积
用Python计算栅格数据的真实面积
在地理空间分析中,栅格数据的像素值通常代表某种属性,比如土地利用比例、植被覆盖率等。这些数据往往基于经纬度网格表示的比例值,而为了更直观地理解这些数据的空间意义,我们需要将这些比例值转化为实际面积(如平方米或公顷)。
对于高分辨率的大尺寸栅格文件(GeoTIFF),一次性加载整个文件会耗费大量内存,甚至导致处理失败。为了解决这个问题,我们可以通过分块处理 的方式,逐步计算像素的真实面积,确保即使是大文件也能顺利处理。
一. 应用场景
该方法尤其适用于以下场景:
1. 比例值栅格数据:
当栅格像素值表示某个区域的比例(例如森林覆盖率)时,通过面积计算,可以得出真实的覆盖面积。
2. 土地利用分析:
计算农田、草地、建筑用地等各类土地利用类型的真实面积。
3. 遥感影像处理:
在遥感影像中将像素值的属性映射到真实的地理空间。
二.代码实现
import sys
import os
import rasterio
import numpy as np
from glob import glob
def haversine(coord1, coord2):
"""
Calculate the great circle distance in kilometers between two points
on the earth (specified in decimal degrees, with coordinates in the order of longitude, latitude).
Arguments:
coord1 -- tuple containing the longitude and latitude of the first point (lon1, lat1)
coord2 -- tuple containing the longitude and latitude of the second point (lon2, lat2)
Returns:
distance in kilometers.
"""
# Extract longitude and latitude, then convert from decimal degrees to radians
lon1, lat1 = np.radians(coord1)
lon2, lat2 = np.radians(coord2)
# Haversine formula
dlat = abs(lat2 - lat1)
dlon = abs(lon2 - lon1)
a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
c = 2 * np.arcsin(np.sqrt(a))
r = 6371 # Radius of earth in kilometers. Use 3956 for miles
return c * r
# Read geotiff file in data folder
# Calculate area for a single GeoTIFF file
def calculate_area_for_tif(input_tif, chunk_size=1024):
# Check if input file exists
if not os.path.isfile(input_tif):
raise FileNotFoundError(f"Input file '{input_tif}' does not exist.")
# Construct output file path
output_path = input_tif.split('.')[0] + '_area.tif'
# Skip if the area raster already exists
if os.path.exists(output_path):
print(f"Area of {input_tif} already calculated. Skipping...")
return
# Open the source raster
with rasterio.open(input_tif) as src:
# Check if the raster is in a geographic coordinate system
if not src.crs.is_geographic:
raise ValueError(f"The raster {input_tif} is not in a geographic coordinate system!")
# Get raster metadata and size
meta = src.meta.copy()
width, height = src.width, src.height
transform = src.transform
# Update metadata for output
meta.update(compress='lzw', dtype=rasterio.float32, count=1)
# Calculate the total number of chunks
total_chunks = ((height + chunk_size - 1) // chunk_size) * ((width + chunk_size - 1) // chunk_size)
chunk_count = 0 # Track processed chunks
# Create the output raster file
with rasterio.open(output_path, 'w', **meta) as dst:
print(f"Processing {input_tif} in chunks...")
for row_start in range(0, height, chunk_size):
row_end = min(row_start + chunk_size, height)
for col_start in range(0, width, chunk_size):
col_end = min(col_start + chunk_size, width)
# Read a chunk of the source raster
window = rasterio.windows.Window(
col_start, row_start, col_end - col_start, row_end - row_start
)
chunk_transform = src.window_transform(window)
rows, cols = np.meshgrid(
np.arange(row_start, row_end),
np.arange(col_start, col_end),
indexing='ij'
)
# Apply geotransform to get geographical coordinates
x_coords, y_coords = chunk_transform * (cols, rows)
x_coords_right, y_coords_right = chunk_transform * (cols + 1, rows)
x_coords_bottom, y_coords_bottom = chunk_transform * (cols, rows + 1)
xy_coords = np.stack((x_coords, y_coords), axis=0)
xy_coords_right = np.stack((x_coords_right, y_coords_right), axis=0)
xy_coords_bottom = np.stack((x_coords_bottom, y_coords_bottom), axis=0)
# Calculate the area for this chunk
length_right = haversine(xy_coords, xy_coords_right)
length_bottom = haversine(xy_coords, xy_coords_bottom)
area_chunk = length_right * length_bottom
# Write the chunk to the output raster
dst.write(area_chunk.astype(np.float32), 1, window=window)
# Update progress
chunk_count += 1
progress = (chunk_count / total_chunks) * 100
print(f"Progress: {progress:.2f}% ({chunk_count}/{total_chunks} chunks)")
print(f"\nArea of {input_tif} calculated and saved to {output_path}")
if __name__ == '__main__':
# Specify the input file path
input_tif = 'path/to/your/input_file.tif'
# Run the function
calculate_area_for_tif(input_tif, chunk_size=1024)
print('\nArea calculation complete!')
三、注意事项
结果的每个栅格的值代表该栅格的面积,单位:平方千米
! 本代码修改自迪肯大学王金柱博士的代码