netCDF4 and masked arrays
By default, netcdf4-python returns numpy masked arrays with values equal to the missing_value or _FillValue variable attributes masked. The set_auto_mask Dataset and Variable methods can be used to disable this feature so that numpy arrays are always returned, with the missing values included. Prior to version 1.4.0 the default behavior was to only return masked arrays when the requested slice contained missing values. This behavior can be recovered using the set_always_mask method. If a masked array is written to a netCDF variable, the masked elements are filled with the value specified by the missing_value attribute. If the variable has no missing_value, the _FillValue is used instead.
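The passage above is quoted from the official netCDF4 Python documentation. It says that, by default, netcdf4-python returns numpy masked arrays in which every element equal to missing_value or _FillValue is masked.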
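It is worth noting that netCDF and numpy masked arrays clearly approach the problem differently. netCDF fills the elements that should be masked with _FillValue, whereas a numpy masked array does not actively fill anything with fill_value; it only carries a mask array that records, for each element, whether it should be masked. Here is an example: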
>>> import numpy as np
>>> import numpy.ma as ma
>>> x = ma.MaskedArray([0, 1, 2, 3, 4], mask=[0, 1, 0, 0, 0])
>>> x
masked_array(data=[0, --, 2, 3, 4],
             mask=[False,  True, False, False, False],
       fill_value=999999)
>>> x.data
array([0, 1, 2, 3, 4])
As you can see, although the value is masked, it has not been replaced with fill_value.
Incidentally, when we need a specific value in place of the masked elements, we can use the filled method.
>>> x.filled(0)
array([0, 0, 2, 3, 4])
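A few related details, shown as a small sketch using only numpy. The sentinel -999 is an arbitrary value picked for illustration. It shows what filled does when called without an argument, how fill_value can be changed, and how to go back from sentinel-filled data to a masked array.

```python
import numpy.ma as ma

x = ma.MaskedArray([0, 1, 2, 3, 4], mask=[0, 1, 0, 0, 0])

# With no argument, filled() falls back to the array's own fill_value.
default_filled = x.filled()

# fill_value can be changed; the underlying data is still not touched.
x.fill_value = -999
custom_filled = x.filled()

# Going the other way: mask every element equal to a sentinel value.
restored = ma.masked_equal(custom_filled, -999)

print(default_filled)
print(custom_filled)
print(restored)
```

The last step is roughly what netcdf4-python does on read by default: values equal to the fill value are turned back into masked elements.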