Reading a PDF with Python

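import pdfplumber
import pandas as pd

# Open the PDF; the context manager closes the file when done
with pdfplumber.open("xue1.pdf") as pdf:
    # Load the first page (index 0)
    p0 = pdf.pages[0]
    # Extract the largest table on the page as a list of rows
    table = p0.extract_table()

# The first row is the header; the remaining rows are the data
df = pd.DataFrame(table[1:], columns=table[0])
# infer_objects is a method, so it must be called to get the converted frame
print(df.infer_objects())
# Remove spaces inside the values of these two columns
for column in ["Effective", "Received"]:
    df[column] = df[column].str.replace(" ", "")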
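
The table-to-DataFrame step can be tried without a PDF file. The sketch below is only an illustration: the helper name table_to_dataframe and the sample rows are made up, and it assumes the table arrives as a list of rows with the header first, which is the shape extract_table() returns. It also guards against the case where no table is found on the page, in which extract_table() returns None.

```python
import pandas as pd


def table_to_dataframe(table, columns_to_clean=()):
    """Build a DataFrame from a header row plus data rows, removing spaces.

    Hypothetical helper, not part of pdfplumber or pandas.
    """
    if not table or len(table) < 2:
        # No table found on the page, or only a header row
        return pd.DataFrame()
    df = pd.DataFrame(table[1:], columns=table[0])
    for column in columns_to_clean:
        if column in df.columns:
            # regex=False treats " " as a literal string, not a pattern
            df[column] = df[column].str.replace(" ", "", regex=False)
    return df


# Made-up rows in the same shape that extract_table() produces
sample = [
    ["Name", "Effective", "Received"],
    ["A", "2023 - 01 - 01", "1 000"],
    ["B", "2023 - 02 - 15", "2 500"],
]
df = table_to_dataframe(sample, ["Effective", "Received"])
print(df)
```

With a real file, the same helper could be called as table_to_dataframe(p0.extract_table(), ["Effective", "Received"]).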

 

posted @ 2023-03-19 12:13  myrj