netCDF4 and masked arrays

By default, netcdf4-python returns numpy masked arrays with values equal to the missing_value or _FillValue variable attributes masked. The set_auto_mask Dataset and Variable methods can be used to disable this feature so that numpy arrays are always returned, with the missing values included. Prior to version 1.4.0 the default behavior was to only return masked arrays when the requested slice contained missing values. This behavior can be recovered using the set_always_mask method. If a masked array is written to a netCDF variable, the masked elements are filled with the value specified by the missing_value attribute. If the variable has no missing_value, the _FillValue is used instead.

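The paragraph above is quoted from the official documentation of Python's netCDF4 package. It says that, by default, netcdf4-python returns numpy masked arrays in which the elements equal to missing_value or _FillValue are masked.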

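It is worth noting that netCDF and numpy masked arrays approach the problem differently. netCDF fills the elements that should be masked with _FillValue, whereas a numpy masked array does not substitute fill_value on its own; it only keeps a mask array that records whether each element is masked. Here is an example: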

>>> import numpy as np
>>> import numpy.ma as ma
>>> # a 1 in mask marks the element at that position as masked
>>> x = ma.MaskedArray([0, 1, 2, 3, 4], mask=[0, 1, 0, 0, 0])
>>> x
masked_array(data=[0, --, 2, 3, 4],
             mask=[False,  True, False, False, False],
       fill_value=999999)
>>> x.data
array([0, 1, 2, 3, 4])

As you can see, the value is masked, but it has not been replaced with fill_value.
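To make this concrete, here is a minimal sketch that looks at the raw data and the mask separately. The variable names are only for illustration. It also shows that a reduction such as mean() honors the mask while the underlying data stays untouched.

```python
import numpy.ma as ma

x = ma.MaskedArray([0, 1, 2, 3, 4], mask=[0, 1, 0, 0, 0])

raw_values = x.data.tolist()     # underlying values; the masked slot still holds 1
mask_flags = x.mask.tolist()     # True marks the masked element
masked_mean = float(x.mean())    # unmasked values only: (0 + 2 + 3 + 4) / 4
raw_mean = float(x.data.mean())  # all raw values: (0 + 1 + 2 + 3 + 4) / 5

print(raw_values, mask_flags, masked_mean, raw_mean)
```

The two means differ (2.25 versus 2.0) because the mask only hides the second element from the calculation; it never overwrites it.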

Incidentally, when the masked values need to be replaced with a specific value, you can use the filled method.

>>> x.filled(0)
array([0, 0, 2, 3, 4])
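
The two views can be bridged with plain numpy. The sketch below does not call netCDF4 at all. It only imitates what the quoted documentation describes: on write, masked elements are replaced by a fill value, and on read, elements equal to that fill value are masked again. The value -999 and the names FILL_VALUE, stored, and restored are made up for illustration.

```python
import numpy.ma as ma

FILL_VALUE = -999  # hypothetical stand-in for a variable's _FillValue attribute

x = ma.MaskedArray([0, 1, 2, 3, 4], mask=[0, 1, 0, 0, 0])

# "Write" step: masked elements are replaced by the fill value,
# which yields an ordinary ndarray.
stored = x.filled(FILL_VALUE)

# "Read" step: elements equal to the fill value are masked again.
restored = ma.masked_equal(stored, FILL_VALUE)

print(stored)
print(restored.mask)
```

After this round trip the mask is back, but the original value 1 underneath it is gone: restored.data holds -999 in that position. That is the practical difference between filling masked elements and merely masking them.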
posted @ 2020-05-10 23:11  gujianmu