PdfiumViewer Component Extension (Pdfium.Net.Free): Deleting or Editing PDF Content
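Articles in this series:
1. PdfiumViewer Component Extension (Pdfium.Net.Free): Overview
2. PdfiumViewer Component Extension (Pdfium.Net.Free): Quick start
3. PdfiumViewer Component Extension (Pdfium.Net.Free): Box selection in the PDF viewer
4. PdfiumViewer Component Extension (Pdfium.Net.Free): Handling large files
5. PdfiumViewer Component Extension (Pdfium.Net.Free): Loading fonts
6. PdfiumViewer Component Extension (Pdfium.Net.Free): Creating character subsets
7. PdfiumViewer Component Extension (Pdfium.Net.Free): Adding text
8. PdfiumViewer Component Extension (Pdfium.Net.Free): Adding images
9. PdfiumViewer Component Extension (Pdfium.Net.Free): Adding watermarks
10. PdfiumViewer Component Extension (Pdfium.Net.Free): Deleting or editing PDF content
11. PdfiumViewer Component Extension (Pdfium.Net.Free): PDF operations
12. PdfiumViewer Component Extension (Pdfium.Net.Free): Signatures
13. PdfiumViewer Component Extension (Pdfium.Net.Free): Annotations
14. PdfiumViewer Component Extension (Pdfium.Net.Free): Visual PDF editing
15. What a transformation matrix is and how to use it
16. Adding bblanchon.PDFium to Pdfium.Net.Free via NuGet

Project links: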
Pdfium.Net: https://github.com/1000374/Pdfium.Net.Free
PdfiumViewer: https://github.com/1000374/PdfiumViewer
Pdfium.Net.Free supports:
- .NET Framework 4.0
- .NET Framework 4.5
- .NET Standard 2.0
- .NET 8.0
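It can be used together with PdfiumViewer.Free to preview and edit PDFs, or you can reference Pdfium.Net.Free directly to manipulate PDFs. Pdfium.Net.Free wraps the existing PDFium functions and implements part of the PDF manipulation functionality; the rest is planned for later.
To delete or edit content in a PDF, first obtain the object inside the PDF that you want to modify or delete. Every edit to a page takes effect only after GenerateContent has been called.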
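Getting all objects in a PDF:
The returned information includes the current object's index, the text and font information (if the object is text), and the position.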
```csharp
var pathPdf = "./Pdfium.NetTests/resources/fontText.pdf";
using (var doc = PdfDocument.Load(new MemoryStream(File.ReadAllBytes(pathPdf))))
{
    var page0 = doc.Pages[0];
    var infos = page0.GetCharacterInformation();
}
```
If the above does not meet your needs, use the following example to walk the page objects:
```csharp
// How do you know which index to pass to GetObject?
var pathPdf = "./Pdfium.NetTests/resources/fontText.pdf";
using (var doc = PdfDocument.Load(new MemoryStream(File.ReadAllBytes(pathPdf))))
{
    var arr = "武则天".ToCharArray();
    var page0 = doc.Pages[0];
    var count = page0.GetObjectsCount();
    for (int i = 0; i < count; i++)
    {
        // Fetch the object at index i.
        var obj = page0.GetObject(i);
        if (!obj.IsNull)
        {
            var objType = obj.PageObjGetObjType();
            switch (objType)
            {
                case FpdfPageObj.FPDF_PAGEOBJ_UNKNOWN:
                    {
                    }
                    break;
                case FpdfPageObj.FPDF_PAGEOBJ_TEXT:
                    {
                        var txt = obj.TextObjGetText(page0.PageText);
                    }
                    break;
                case FpdfPageObj.FPDF_PAGEOBJ_PATH:
                    {
                        var res = obj.PathMoveTo(10, 20);
                    }
                    break;
                case FpdfPageObj.FPDF_PAGEOBJ_IMAGE:
                    {
                        /* Matrix: | a, c, e|  ==>  | width, 0,      offsetX|
                         *         | b, d, f|       | 0,     height, offsetY| */
                        var bitmap = obj.ImageObjGetBitmap();
                        var res = obj.ImageObjSetMatrix(bitmap.Width, 0.1, 0, bitmap.Height, 100, 100);
                        if (!bitmap.IsNull)
                        {
                            // There is a feature request discussing this: https://crbug.com/pdfium/1930 (disclaimer: I'm the reporter)
                            // TLDR: The functions you mention do provide the main data stream, but for some filters complementary data
                            // would be needed to actually re-construct the image, which pdfium does not provide.
                            // For CCITTDecode, as the TIFF format can use, pdfium's public API does not tell the CCITT group, but this
                            // would be needed to re-construct the TIFF header, which the PDF format strips. And I think BlackIs1 info
                            // would also be needed; possibly more.
                            // JBIG2Decode may optionally use a separate JBIG2Globals stream, which again pdfium does not provide.
                            // I had filed a separate bug about this: https://crbug.com/pdfium/1927. However, I guess the raw JBIG2 data
                            // might not be very useful except for re-insertion into a PDF.
                            // IIRC the way pikepdf handles JBIG2 extraction to files is to just decode the data and re-encode to some
                            // other format. From a programmatic POV that's not ideal, but I guess the context is that standalone JBIG2
                            // isn't really supported by end-user apps.
                            // Concerning FPDFImageObj_GetImageDataDecoded(), note that it does not fully decode images; it only applies
                            // "simple" filters (see https://crbug.com/pdfium/1203#c7), so the function name is a bit misleading.
                            // For the plain pixel data, you can use FPDFImageObj_GetBitmap(), FPDFBitmap_GetBuffer() & co, but note that
                            // FPDF_BITMAP is limited in supported pixel formats and bit depth (e.g. no CMYK, B/W, > 8bpc RGB(A), ...).
                        }
                    }
                    break;
                case FpdfPageObj.FPDF_PAGEOBJ_SHADING:
                    {
                    }
                    break;
                case FpdfPageObj.FPDF_PAGEOBJ_FORM:
                    {
                    }
                    break;
            }
        }
    }
    page0.GenerateContent();
}
```
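The matrix comment in the image case above can be read like this: PDF draws an image inside a 1 × 1 unit square, and the six numbers a, b, c, d, e, f map each point (x, y) of that square to x' = a·x + c·y + e and y' = b·x + d·y + f. The following is a minimal sketch in plain C# that makes no Pdfium.Net.Free calls; the `MatrixDemo.Apply` helper and the sample numbers are invented for illustration only. It shows where two corners of a 200 × 100 image placed at (100, 100) end up.

```csharp
using System;

static class MatrixDemo
{
    // Applies the PDF matrix [a b c d e f] to the point (x, y):
    //   x' = a*x + c*y + e
    //   y' = b*x + d*y + f
    public static void Apply(
        double a, double b, double c, double d, double e, double f,
        double x, double y, out double px, out double py)
    {
        px = a * x + c * y + e;
        py = b * x + d * y + f;
    }

    static void Main()
    {
        // Illustrative values: width and height scale the unit square,
        // offsetX and offsetY move it on the page.
        double width = 200, height = 100, offsetX = 100, offsetY = 100;

        double x0, y0, x1, y1;
        // Bottom-left corner of the unit square is (0, 0).
        Apply(width, 0, 0, height, offsetX, offsetY, 0, 0, out x0, out y0);
        // Top-right corner of the unit square is (1, 1).
        Apply(width, 0, 0, height, offsetX, offsetY, 1, 1, out x1, out y1);

        Console.WriteLine("bottom-left: " + x0 + ", " + y0);
        Console.WriteLine("top-right:   " + x1 + ", " + y1);
    }
}
```

With these numbers the image occupies the rectangle from (100, 100) to (300, 200). If ImageObjSetMatrix takes its arguments in the order a, b, c, d, e, f (an assumption based on the comment in the sample), a non-zero b such as the 0.1 passed above adds 0.1·x to y', which tilts the image instead of keeping it axis-aligned.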
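Example of editing an object (text). In testing, replacing Chinese text sometimes produced garbled characters, and no way of setting the font has been found so far. As a workaround, delete the object first and then add the text again.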
```csharp
var pathPdf = "./Pdfium.NetTests/resources/hello_world.pdf";
using (var doc = PdfDocument.Load(new MemoryStream(File.ReadAllBytes(pathPdf))))
{
    //var fontPath = @"c:\Windows\fonts\simhei.ttf";
    //doc.LoadFont(fontPath);
    var page0 = doc.Pages[0];
    var obj = page0.GetObject(0);
    if (!obj.IsNull)
    {
        var objType = obj.PageObjGetObjType();
        switch (objType)
        {
            case FpdfPageObj.FPDF_PAGEOBJ_TEXT:
                {
                    // Read the current text, then replace it.
                    var txt = obj.TextObjGetText(page0.PageText);
                    var res = obj.TextSetText("Changed for SetText test");
                }
                break;
        }
        // The edit only takes effect after GenerateContent is called.
        page0.GenerateContent();
        doc.Save("./Pdfium.NetTests/TextObjFontChange.pdf");
    }
}
```
Deleting an object:
```csharp
var pathPdf = "./Pdfium.NetTests/resources/fontText.pdf";
using (var doc = PdfDocument.Load(new MemoryStream(File.ReadAllBytes(pathPdf))))
{
    var page0 = doc.Pages[0];
    var obj = page0.GetObject(0);
    if (!obj.IsNull)
    {
        var res = page0.RemoveObject(obj);
        page0.GenerateContent();
        doc.Save("./Pdfium.NetTests/TextObjFont.pdf");
    }
}
```
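The delete example removes whatever object sits at index 0. In practice you usually want to remove objects chosen by their content. The sketch below combines the enumeration and delete examples from this post to remove every text object whose text contains a given string. It uses only calls that appear in this post, and it rests on several assumptions: that TextObjGetText returns the text as a string, that the search string and output path (both hypothetical) suit your file, and that your project already imports the Pdfium.Net.Free namespace. It has not been verified against the library.

```csharp
using System;
using System.IO;
// Also add the using directive for the Pdfium.Net.Free namespace used in your project.

var pathPdf = "./Pdfium.NetTests/resources/fontText.pdf";
var needle = "武则天"; // hypothetical search string
using (var doc = PdfDocument.Load(new MemoryStream(File.ReadAllBytes(pathPdf))))
{
    var page0 = doc.Pages[0];
    var removed = 0;

    // Walk backwards so that removing an object never shifts
    // the index of an object that has not been visited yet.
    for (int i = page0.GetObjectsCount() - 1; i >= 0; i--)
    {
        var obj = page0.GetObject(i);
        if (obj.IsNull)
            continue;
        if (obj.PageObjGetObjType() != FpdfPageObj.FPDF_PAGEOBJ_TEXT)
            continue;

        // Assumption: TextObjGetText returns the object's text as a string.
        var txt = obj.TextObjGetText(page0.PageText);
        if (txt != null && txt.Contains(needle))
        {
            page0.RemoveObject(obj);
            removed++;
        }
    }

    if (removed > 0)
    {
        // Edits only take effect after GenerateContent is called.
        page0.GenerateContent();
        doc.Save("./Pdfium.NetTests/TextObjRemoved.pdf"); // hypothetical output path
    }
}
```

Iterating from the last index down to 0 is a deliberate choice: if removal compacts the object list, a forward loop would skip the object that slides into the slot just vacated.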