Python Toolbox Series (48)

How to Work with docx Documents (Part 2)

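When a Word document needs finer-grained manipulation, the python-docx library starts to fall short. This is where the more powerful win32com library (part of pywin32) helps, because it drives Word itself and can handle more detailed and complex work. I often have to assemble large documents such as bid documents. The pictures inserted into them come in all shapes and sizes, and while writing there is no way to know where a picture will end up in the overall order of the document, so adding captions to every picture is very time-consuming. In addition, pictures of inconsistent size make the whole document look less polished. Adjusting them by hand is tedious and laborious. Python can take over most of this work automatically, leaving only a small amount of manual touch-up at the end. The following code shows how to adjust the pictures and tables in a docx file.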

import win32com.client as win32
from win32com.client import constants
import argparse
import sys


class action:

    def adjustfigures(self, infile, outfile):
        """
        Resize every inline picture in the document, add a caption below it and center it
        """
        doc_app = win32.gencache.EnsureDispatch('Word.Application')
        doc_app.Visible = True
        doc = doc_app.Documents.Open(infile)
        print(f'pythonwin32com adjust figure in {infile}')
        print(f'shape counts: {doc.Shapes.Count}')
        print(f'inlineshape counts: {doc.InlineShapes.Count}')

        for index, shape in enumerate(doc.InlineShapes):
            print(f'handle figure {index}')
            # Resize the picture to a fixed width, keeping its aspect ratio
            ratio = shape.Height/shape.Width
            shape.Width = 200
            shape.Height = round(shape.Width*ratio)

            rng = shape.Range
            # Insert a caption below the picture
            rng.InsertCaption(Label=constants.wdCaptionFigure,
                              Position=constants.wdCaptionPositionBelow, Title=f" InlineShapes-{index+1}")
            # Center the paragraph
            rng.ParagraphFormat.Alignment = constants.wdAlignParagraphCenter

        doc.SaveAs(outfile)
        doc.Close()
        doc_app.Quit()

    def adjusttables(self, infile, outfile):
        """
        Add a caption above every table in the document and center it
        """
        doc_app = win32.gencache.EnsureDispatch('Word.Application')
        doc_app.Visible = True
        doc = doc_app.Documents.Open(infile)
        print(f'pythonwin32com adjust tables in {infile}')
        print(f'table counts: {doc.Tables.Count}')
        for index, table in enumerate(doc.Tables):
            print(f'handle table {index}')
            rng = table.Range
            # Insert a caption above the table
            rng.InsertCaption(Label=constants.wdCaptionTable,
                              Position=constants.wdCaptionPositionAbove, Title=f" table-{index+1}")
            # Center the paragraphs
            rng.ParagraphFormat.Alignment = constants.wdAlignParagraphCenter

        doc.SaveAs(outfile)
        doc.Close()
        doc_app.Quit()

if __name__ == '__main__':
    def parser():
        """
        Parse the user's command line
        """
        parser = argparse.ArgumentParser()
        parser.add_argument("inputfilename", type=str, help="name of the document to process")
        parser.add_argument(
            "-t", "--table", action="store_true", help="adjust the tables")
        parser.add_argument("-f", "--figures",
                            action="store_true", help="adjust the pictures")

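        # If no arguments were given, show the help text and stop.
        # This check must run before parse_args(), which would otherwise exit
        # with an error because inputfilename is a required argument.
        if len(sys.argv) == 1:
            parser.print_help()
            return

        args = parser.parse_args()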

        updator = action()
        
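        docxfilename = args.inputfilename
        targetfilename = r'd:\test\demo.docx'
        if args.table:
            updator.adjusttables(docxfilename, targetfilename)
            # When -f is also given, continue from the saved result so that
            # the table changes are not overwritten.
            docxfilename = targetfilename

        if args.figures:
            updator.adjustfigures(docxfilename, targetfilename)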

    parser()
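
Both methods above open Word, do their work, and call Quit() at the end. If an exception is raised partway through, those last lines never run and a Word process may be left behind. One possible way to guard against this is a small context manager. The sketch below is an assumption layered on top of the script, not part of it: the helper name open_document is hypothetical, and app is assumed to behave like the object returned by win32.gencache.EnsureDispatch('Word.Application').

```python
from contextlib import contextmanager

# Numeric value of Word's wdDoNotSaveChanges constant
WD_DO_NOT_SAVE_CHANGES = 0


@contextmanager
def open_document(app, path):
    """Open `path` in a Word application object and always clean up.

    Hypothetical helper: `app` is assumed to expose Documents.Open() and
    Quit() the way a Word.Application COM object does.
    """
    doc = None
    try:
        doc = app.Documents.Open(path)
        yield doc
    finally:
        if doc is not None:
            # Close without saving; changes must be saved explicitly
            # (for example with doc.SaveAs) inside the with block.
            doc.Close(SaveChanges=WD_DO_NOT_SAVE_CHANGES)
        # Quit Word even if opening or processing failed
        app.Quit()
```

Inside adjustfigures or adjusttables, the body could then be wrapped as `with open_document(doc_app, infile) as doc:` followed by the loop and `doc.SaveAs(outfile)`, leaving Close() and Quit() to the helper.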

The script (saved here as office05.py) is used as follows.

# Resize the pictures in the document, center them, and add captions
python .\office05.py -f "d:\test\1.docx"

# Add captions to the tables and center each table as a whole
python .\office05.py -t "d:\test\1.docx"

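The script uses the argparse standard library, introduced earlier in this series, to parse the command-line arguments. The -f option processes the pictures in the document and the -t option processes the tables. The sample code can be modified freely to suit your own needs, for example by customizing the picture width and height or changing the alignment. When a large docx document contains hundreds or thousands of pictures, this script saves a great deal of time and effort, so it is worth spending a little time tuning it.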
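
As one illustration of the customization mentioned above, the width and alignment could be exposed on the command line instead of being hard-coded. This is only a sketch under stated assumptions: the --width and --align options and the helper names build_parser and scaled_size are hypothetical and do not exist in the original script. The alignment numbers are the values of Word's wdAlignParagraphLeft, wdAlignParagraphCenter and wdAlignParagraphRight constants.

```python
import argparse

# Numeric values of Word's WdParagraphAlignment constants
ALIGNMENTS = {"left": 0, "center": 1, "right": 2}


def build_parser():
    """Build a parser like the script's, plus two hypothetical options."""
    parser = argparse.ArgumentParser()
    parser.add_argument("inputfilename", type=str,
                        help="name of the document to process")
    parser.add_argument("-t", "--table", action="store_true",
                        help="adjust the tables")
    parser.add_argument("-f", "--figures", action="store_true",
                        help="adjust the pictures")
    # Hypothetical additions, not present in the original script
    parser.add_argument("--width", type=float, default=200.0,
                        help="target picture width in points")
    parser.add_argument("--align", choices=sorted(ALIGNMENTS),
                        default="center", help="paragraph alignment")
    return parser


def scaled_size(width, height, target_width):
    """Return (new_width, new_height) with the aspect ratio preserved."""
    if width <= 0:
        raise ValueError("width must be positive")
    return target_width, round(height * target_width / width)


# Parse an explicit argument list so the sketch runs without a real command line
args = build_parser().parse_args(["-f", "--width", "300", r"d:\test\1.docx"])
new_width, new_height = scaled_size(400, 300, args.width)
print(new_width, new_height, ALIGNMENTS[args.align])
```

In adjustfigures, the fixed 200 could then be replaced by the parsed width, and ALIGNMENTS[args.align] could be assigned to rng.ParagraphFormat.Alignment in place of the wdAlignParagraphCenter constant.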

posted @ 2024-01-18 11:00 西安衍舆航天
