pandas: showing how much memory an index uses with pd.Index.memory_usage()

>>> import numpy as np
>>> import pandas as pd
>>> s = pd.Series(np.zeros(10**6))
>>> s.index
RangeIndex(start=0, stop=1000000, step=1)
>>> s.index.memory_usage()       # in bytes
128                    # the same as for Series([0.])
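A RangeIndex stores only its start, stop, and step, so its reported size does not grow with the number of rows. A minimal sketch of that check follows; the variable names are illustrative, and the exact byte count can differ between Python and pandas versions, so no specific number is assumed here.

```python
import numpy as np
import pandas as pd

# Two Series of very different lengths, both with the default RangeIndex.
tiny = pd.Series([0.0])
huge = pd.Series(np.zeros(10**6))

tiny_index_bytes = tiny.index.memory_usage()
huge_index_bytes = huge.index.memory_usage()

# The index costs the same regardless of the number of rows...
print(tiny_index_bytes == huge_index_bytes)

# ...while the values themselves take 8 bytes per float64 element.
print(huge.memory_usage(index=False))
```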

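Now, if we delete a single element, the index is implicitly converted into a dict-like structure that stores every remaining label explicitly (an Int64Index; pandas 2.0 and later display it as Index with dtype='int64'), as shown below: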

>>> s.drop(1, inplace=True)
>>> s.index
Int64Index([     0,      2,      3,      4,      5,      6,      7,
            ...
            999993, 999994, 999995, 999996, 999997, 999998, 999999],
           dtype='int64', length=999999)
>>> s.index.memory_usage()
7999992
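The figure above is simply the number of remaining labels multiplied by the size of one int64 label. A minimal sketch of that arithmetic, using the non-inplace form of drop (variable names are illustrative):

```python
import numpy as np
import pandas as pd

s = pd.Series(np.zeros(10**6))
s = s.drop(1)  # non-inplace equivalent of s.drop(1, inplace=True)

labels = s.index
bytes_per_label = labels.dtype.itemsize       # 8 bytes for int64
expected_bytes = len(labels) * bytes_per_label

print(len(labels), bytes_per_label, expected_bytes)
print(labels.memory_usage())                  # reported size, for comparison
```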

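This explicit index consumes about 8 MB of memory (999,999 labels at 8 bytes each)! To get rid of it and return to the lightweight range-like structure, reset the index: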

>>> s.reset_index(drop=True, inplace=True)
>>> s.index
RangeIndex(start=0, stop=999999, step=1)
>>> s.index.memory_usage()
128
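The same reset can be written without inplace=True by reassigning the result. The sketch below (variable names are illustrative) also shows the trade-off: reset_index(drop=True) discards the old labels and renumbers the rows from 0, so it is only appropriate when the original labels carry no meaning.

```python
import numpy as np
import pandas as pd

s = pd.Series(np.arange(10**6, dtype="float64"))  # value i sits at label i
s = s.drop(1)

before = s.index.memory_usage()
value_at_label_2 = s.loc[2]           # the row stored under label 2

s = s.reset_index(drop=True)          # returns a new Series with a RangeIndex
after = s.index.memory_usage()

print(type(s.index).__name__)
print(before > after)

# Label 1 was dropped earlier; after the reset, label 1 refers to the row
# that used to be labelled 2.
print(s.loc[1] == value_at_label_2)
```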
posted @ 2024-02-12 05:53  myrj