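Some tools provided by PyTables
PyTables ships a few command-line tools that make it easy to inspect and analyze the files it generates. A brief overview follows.
ptdump
Shows both the data and the metadata of a file
- Command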
usage: ptdump [-h] [-v] [-d] [-a] [-s] [-c] [-i] [-R RANGE] filename[:nodepath]
The ptdump utility allows you look into the contents of your PyTables files. It lets you see not only the data but also the metadata
(that is, the *structure* and additional information in the form of *attributes*).
positional arguments:
filename[:nodepath] name of the HDF5 file to dump
options:
-h, --help show this help message and exit
-v, --verbose dump more metainformation on nodes
-d, --dump dump data information on leaves
-a, --showattrs show attributes in nodes (only useful when -v or -d are active)
-s, --sort sort output by node name
-c, --colinfo show info of columns in tables (only useful when -v or -d are active)
-i, --idxinfo show info of indexed columns (only useful when -v or -d are active)
-R RANGE, --range RANGE
select a RANGE of rows (in the form "start,stop,step") during the copy of *all* the leaves. Default values are
"None,None,1", which means a copy of all the rows.
- ptrepack
Copies data stored inside PyTables files (leaves, groups, or whole subtrees)
usage: ptrepack [-h] [-v] [-o] [-R RANGE] [--non-recursive] [--dest-title TITLE] [--dont-create-sysattrs] [--dont-copy-userattrs]
[--overwrite-nodes] [--complevel COMPLEVEL]
[--complib {zlib,lzo,bzip2,blosc,blosc:blosclz,blosc:lz4,blosc:lz4hc,blosc:zlib,blosc:zstd,blosc2,blosc2:blosclz,blosc2:lz4,blosc2:lz4hc,blosc2:zlib,blosc2:zstd}]
[--shuffle {0,1}] [--bitshuffle {0,1}] [--fletcher32 {0,1}] [--keep-source-filters] [--chunkshape CHUNKSHAPE]
[--upgrade-flavors] [--dont-regenerate-old-indexes] [--sortby COLUMN] [--checkCSI] [--propindexes]
[--dont-allow-padding]
sourcefile:sourcegroup destfile:destgroup
This utility is very powerful and lets you copy any leaf, group or complete subtree into another file. During the copy process you are
allowed to change the filter properties if you want so. Also, in the case of duplicated pathnames, you can decide if you want to
overwrite already existing nodes on the destination file. Generally speaking, ptrepack can be useful in may situations, like replicating
a subtree in another file, change the filters in objects and see how affect this to the compression degree or I/O performance,
consolidating specific data in repositories or even *importing* generic HDF5 files and create true PyTables counterparts.
positional arguments:
sourcefile:sourcegroup
source file/group
destfile:destgroup destination file/group
options:
-h, --help show this help message and exit
-v, --verbose show verbose information
-o, --overwrite overwrite destination file
-R RANGE, --range RANGE
select a RANGE of rows (in the form "start,stop,step") during the copy of *all* the leaves. Default values are
"None,None,1", which means a copy of all the rows.
--non-recursive do not do a recursive copy. Default is to do it
--dest-title TITLE title for the new file (if not specified, the source is copied)
--dont-create-sysattrs
do not create sys attrs (default is to do it)
--dont-copy-userattrs
do not copy the user attrs (default is to do it)
--overwrite-nodes overwrite destination nodes if they exist. Default is to not overwrite them
--complevel COMPLEVEL
set a compression level (0 for no compression, which is the default)
--complib {zlib,lzo,bzip2,blosc,blosc:blosclz,blosc:lz4,blosc:lz4hc,blosc:zlib,blosc:zstd,blosc2,blosc2:blosclz,blosc2:lz4,blosc2:lz4hc,blosc2:zlib,blosc2:zstd}
set the compression library to be used during the copy. Defaults to zlib
--shuffle {0,1} activate or not the shuffle filter (default is active if complevel > 0)
--bitshuffle {0,1} activate or not the bitshuffle filter (not active by default)
--fletcher32 {0,1} whether to activate or not the fletcher32 filter (not active by default)
--keep-source-filters
use the original filters in source files. The default is not doing that if any of --complevel, --complib,
--shuffle --bitshuffle or --fletcher32 option is specified
--chunkshape CHUNKSHAPE
set a chunkshape. Possible options are: "keep" | "auto" | int | tuple. A value of "auto" computes a sensible
value for the chunkshape of the leaves copied. The default is to "keep" the original value
--upgrade-flavors when repacking PyTables 1.x or PyTables 2.x files, the flavor of leaves will be unset. With this, such a leaves
will be serialized as objects with the internal flavor ('numpy' for 3.x series)
--dont-regenerate-old-indexes
disable regenerating old indexes. The default is to regenerate old indexes as they are found
--sortby COLUMN do a table copy sorted by the index in "column". For reversing the order, use a negative value in the "step"
part of "RANGE" (see "-r" flag). Only applies to table objects
--checkCSI force the check for a CSI index for the --sortby column
--propindexes propagate the indexes existing in original tables. The default is to not propagate them. Only applies to table
objects
--dont-allow-padding remove the possible padding in compound types in source files. The default is to propagate it. Only applies to
table objects
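A typical use of ptrepack is to recompress a file while copying it. The following is a minimal Python sketch that only assembles such a command line from a few of the options listed above; it does not run anything. The helper names `format_range` and `build_ptrepack_cmd` and the file names `raw.h5` and `packed.h5` are hypothetical and chosen for illustration.

```python
import shlex


def format_range(rows):
    """Render a Python slice as the "start,stop,step" string that -R/--range expects."""
    step = 1 if rows.step is None else rows.step
    return f"{rows.start},{rows.stop},{step}"


def build_ptrepack_cmd(src, dst, src_group="/", dst_group="/", *,
                       complevel=None, complib=None, rows=None,
                       overwrite=False, propindexes=False):
    """Assemble an argv list for ptrepack (hypothetical helper, not part of PyTables)."""
    cmd = ["ptrepack"]
    if overwrite:
        cmd.append("--overwrite")
    if complevel is not None:
        cmd += ["--complevel", str(complevel)]
    if complib is not None:
        cmd += ["--complib", complib]
    if rows is not None:
        cmd += ["--range", format_range(rows)]
    if propindexes:
        cmd.append("--propindexes")
    # Positional arguments use the file:group form shown in the usage line.
    cmd += [f"{src}:{src_group}", f"{dst}:{dst_group}"]
    return cmd


# Hypothetical file names; print the command instead of executing it.
cmd = build_ptrepack_cmd("raw.h5", "packed.h5", complevel=5,
                         complib="blosc:zstd", rows=slice(0, 1000),
                         overwrite=True)
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` on a machine where ptrepack is installed. Building a list rather than a shell string avoids quoting problems with paths.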
- pt2to3: mainly a version migration tool (it rewrites source code from the PyTables 2.x API to the 3.x API)
usage: pt2to3 [-h] [-r] [-p] [-o OUTPUT] [-i] filename
PyTables 2.x -> 3.x API transition tool This tool displays to standard out, so it is common to pipe this to another file: $ pt2to3
oldfile.py > newfile.py
positional arguments:
filename path to input file.
options:
-h, --help show this help message and exit
-r, --reverse reverts changes, going from 3.x -> 2.x.
-p, --no-ignore-previous
ignores previous_api() calls.
-o OUTPUT output file to write to.
-i, --inplace overwrites the file in-place.
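To give a feel for the kind of rewrite involved, here is a small Python sketch that renames a handful of camelCase PyTables 2.x calls to their snake_case 3.x names. It is an illustration only, not pt2to3's actual implementation: the `RENAMES` table is a hand-picked subset and `migrate_source` is a hypothetical helper.

```python
import re

# A few PyTables 2.x -> 3.x renames (camelCase -> snake_case).
# Hand-picked subset for illustration; the real tool covers far more names.
RENAMES = {
    "openFile": "open_file",
    "createTable": "create_table",
    "createGroup": "create_group",
    "createArray": "create_array",
}


def migrate_source(text, reverse=False):
    """Rename whole-word API names in `text`; reverse=True maps 3.x back to 2.x."""
    mapping = {new: old for old, new in RENAMES.items()} if reverse else RENAMES
    # \b keeps the match to whole identifiers, so "reopenFileX" is left alone.
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, mapping)) + r")\b")
    return pattern.sub(lambda match: mapping[match.group(1)], text)


old_code = 'h5 = tables.openFile("data.h5", mode="w")\nh5.createGroup("/", "detector")\n'
print(migrate_source(old_code))
```

The `reverse` flag mirrors the idea behind `-r, --reverse`: the same table is applied in the opposite direction.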
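Notes
The tools bundled with PyTables are really just calls into its own API, wrapped in a more convenient form. They make it easier to analyze generated data files and to perform some additional operations on them.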
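As a rough illustration of that point, the sketch below walks a file with the PyTables API and lists every node, which is similar in spirit to what ptdump prints. It assumes PyTables is installed; `list_nodes` and `demo.h5` are hypothetical names, and this is not ptdump's actual source.

```python
def list_nodes(filename, where="/"):
    """Return (pathname, class name) for every node under `where`, sorted by path."""
    # Third-party dependency (PyTables), imported lazily so this sketch
    # can be loaded even on machines where it is not installed.
    import tables

    with tables.open_file(filename, mode="r") as h5file:
        return sorted(
            (node._v_pathname, type(node).__name__)
            for node in h5file.walk_nodes(where)
        )


if __name__ == "__main__":
    import os
    import tempfile

    import tables

    # Build a tiny throwaway file so the listing has something to show.
    path = os.path.join(tempfile.mkdtemp(), "demo.h5")
    with tables.open_file(path, mode="w") as h5file:
        group = h5file.create_group("/", "grp")
        h5file.create_array(group, "arr", [1, 2, 3])

    for pathname, kind in list_nodes(path):
        print(pathname, kind)
```

For everyday inspection the command-line tool is still the quicker option; the API route is useful when the listing needs to feed further processing.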