Steps for Extracting Images from a Word Document with Python

import docx

doc = docx.Document('letter.docx')

for paragraph in doc.paragraphs:
    for run in paragraph.runs:
        # Pictures are <pic:pic> elements nested inside the run's <w:drawing>
        for pic in run.element.xpath('.//pic:pic'):
            image = pic.nvPicPr.cNvPr.name  # the picture's name attribute
            print(image)
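The name printed above is an attribute stored in the document's XML. As a rough sketch of where it comes from, the same names can be read with the standard library alone, since a .docx file is a ZIP archive. This assumes the main document part is stored at word/document.xml, which is the usual layout but not guaranteed; the function name list_picture_names is made up for this illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace URI behind the "pic" prefix used for DrawingML pictures
PIC_NS = "http://schemas.openxmlformats.org/drawingml/2006/picture"


def list_picture_names(docx_path):
    """Return the name attribute of every <pic:cNvPr> in the main document part.

    Assumption: the main part lives at word/document.xml.
    """
    with zipfile.ZipFile(docx_path) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    # <pic:cNvPr name="..."> holds the non-visual properties of each picture
    return [el.get("name") for el in root.iter(f"{{{PIC_NS}}}cNvPr")]
```

Calling list_picture_names('letter.docx') should give the same names as the loop above. Unlike that loop, it also picks up pictures placed inside tables.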
Getting the Image's Binary Data

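import os
import docx

doc = docx.Document('letter.docx')

for paragraph in doc.paragraphs:
    for run in paragraph.runs:
        # Pictures are <pic:pic> elements nested inside the run's <w:drawing>
        for pic in run.element.xpath('.//pic:pic'):
            blip = pic.blipFill.blip
            if blip is None or blip.embed is None:
                continue  # no embedded image data (e.g. a linked picture)
            image = pic.nvPicPr.cNvPr.name
            # The blip only stores a relationship id; the bytes live in the related image part
            image_part = doc.part.related_parts[blip.embed]
            ext = os.path.splitext(image_part.partname)[1]  # actual extension, e.g. ".png" or ".jpeg"
            with open(f"{image}{ext}", 'wb') as f:
                f.write(image_part.blob)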
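The blip element in the document XML does not contain the image bytes itself. It carries a relationship id (for example rId5) that points to a file inside the .docx package. The following is a minimal standard-library sketch of that lookup, assuming the main part's relationships are stored at word/_rels/document.xml.rels; the helper names image_targets and read_image are invented for this example.

```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET

REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
IMAGE_TYPE = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/image"


def image_targets(docx_path):
    """Map each image relationship id (e.g. 'rId5') to its path inside the package.

    Assumption: relationships of the main part are in word/_rels/document.xml.rels.
    """
    with zipfile.ZipFile(docx_path) as zf:
        rels = ET.fromstring(zf.read("word/_rels/document.xml.rels"))
    targets = {}
    for rel in rels.iter(f"{{{REL_NS}}}Relationship"):
        if rel.get("Type") != IMAGE_TYPE or rel.get("TargetMode") == "External":
            continue  # skip non-image parts and externally linked images
        # Targets are normally relative to the word/ folder, e.g. "media/image1.png"
        path = posixpath.normpath(posixpath.join("word", rel.get("Target")))
        targets[rel.get("Id")] = path.lstrip("/")
    return targets


def read_image(docx_path, r_id):
    """Return the raw bytes of the image that relationship r_id points to."""
    path = image_targets(docx_path)[r_id]
    with zipfile.ZipFile(docx_path) as zf:
        return zf.read(path)
```

For example, read_image('letter.docx', 'rId5') would return the bytes of whichever image rId5 refers to, provided that id exists in the document.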
To extract all of the image data from letter.docx, you can use the following code.

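import os
import docx

doc = docx.Document('letter.docx')

for paragraph in doc.paragraphs:
    for run in paragraph.runs:
        # Pictures are <pic:pic> elements nested inside the run's <w:drawing>
        for pic in run.element.xpath('.//pic:pic'):
            blip = pic.blipFill.blip
            if blip is None or blip.embed is None:
                continue  # no embedded image data (e.g. a linked picture)
            image = pic.nvPicPr.cNvPr.name
            # Look up the image part through the blip's relationship id
            image_part = doc.part.related_parts[blip.embed]
            ext = os.path.splitext(image_part.partname)[1]  # actual extension, e.g. ".png" or ".jpeg"
            with open(f"{image}{ext}", 'wb') as f:
                f.write(image_part.blob)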
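If the picture names do not matter and you only want every embedded image file, a different route is to treat the .docx as a ZIP archive and copy out everything under word/media/. This is not the python-docx approach used above, just a standard-library sketch; it assumes images are stored in word/media/, which is the usual convention, and extract_media is a hypothetical helper name.

```python
import os
import zipfile


def extract_media(docx_path, out_dir):
    """Copy every file under word/media/ in the .docx package into out_dir.

    Assumption: embedded images live in word/media/ (the conventional location).
    Returns the list of paths that were written.
    """
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    with zipfile.ZipFile(docx_path) as zf:
        for name in zf.namelist():
            if name.startswith("word/media/") and not name.endswith("/"):
                target = os.path.join(out_dir, os.path.basename(name))
                with open(target, "wb") as f:
                    f.write(zf.read(name))
                saved.append(target)
    return saved
```

Calling extract_media('letter.docx', 'images') would write files such as images/image1.png, keeping the file names and extensions stored in the package.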
To extract only one particular image from a specific Word document, change the document name and the image name in the following code.

import docx

doc = docx.Document('example.docx')

for paragraph in doc.paragraphs:
    for run in paragraph.runs:
        # Pictures are <pic:pic> elements nested inside the run's <w:drawing>
        for pic in run.element.xpath('.//pic:pic'):
            blip = pic.blipFill.blip
            if blip is None or blip.embed is None:
                continue  # no embedded image data (e.g. a linked picture)
            image = pic.nvPicPr.cNvPr.name
            # Only save the picture whose name matches
            if image == 'image.png':
                image_part = doc.part.related_parts[blip.embed]
                # The name already ends in .png, so use it as the file name directly
                with open(image, 'wb') as f:
                    f.write(image_part.blob)
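Images embedded in a Word document are not always PNG files, so a picture's name is not a reliable guide to its format. One way to choose a file extension is to look at the first few bytes of the data. The helper below is an illustrative sketch (guess_extension is not part of python-docx) and covers only a handful of common formats.

```python
def guess_extension(data):
    """Guess a file extension from the leading bytes of image data.

    Returns None when the signature is not one of the few formats listed here.
    """
    signatures = [
        (b"\x89PNG\r\n\x1a\n", ".png"),
        (b"\xff\xd8\xff", ".jpg"),
        (b"GIF87a", ".gif"),
        (b"GIF89a", ".gif"),
        (b"BM", ".bmp"),
    ]
    for magic, ext in signatures:
        if data.startswith(magic):
            return ext
    return None
```

With this, the bytes of an extracted image could be written to a file named after guess_extension(data), falling back to a default such as '.bin' when it returns None.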

 

Posted by 祺琪