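Formatting a document

1. Remove the references at the end, and the cover and table of contents at the front

2. Remove watermarks

3. Remove all anchors on the right (macro), clear embedded objects (macro), remove tables (macro), and remove all graphics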

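Sub DeleteAllShapes()
    ' Delete floating shapes from last to first, so that removing an item
    ' does not shift the ones still to be visited.
    Dim i As Long
    For i = ActiveDocument.Shapes.Count To 1 Step -1
        ActiveDocument.Shapes(i).Delete
    Next i
End Sub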

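Sub DeleteAllInlineShapesAndShapes()
    Dim i As Long

    ' Delete all inline (embedded) objects, last to first
    For i = ActiveDocument.InlineShapes.Count To 1 Step -1
        ActiveDocument.InlineShapes(i).Delete
    Next i

    ' Delete all floating shapes, last to first
    For i = ActiveDocument.Shapes.Count To 1 Step -1
        ActiveDocument.Shapes(i).Delete
    Next i
End Sub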

Sub DeleteAllTables()
    ' Delete tables from last to first so that none are skipped
    Dim i As Long
    For i = ActiveDocument.Tables.Count To 1 Step -1
        ActiveDocument.Tables(i).Delete
    Next i
End Sub

Sub RemoveAllShapes()
    Dim i As Long

    ' Delete inline graphics, last to first
    For i = ActiveDocument.InlineShapes.Count To 1 Step -1
        ActiveDocument.InlineShapes(i).Delete
    Next i

    ' Delete floating graphics, last to first
    For i = ActiveDocument.Shapes.Count To 1 Step -1
        ActiveDocument.Shapes(i).Delete
    Next i
End Sub

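4. Convert to PDF, crop the page numbers at the top, then convert back to Word

In Microsoft Office Word, save the document as a PDF, crop the PDF, then convert it back to Word.

5. Cleanup

5.1 Remove parentheses and their contents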

Sub DeleteParenthesesAndContent()
    Dim doc As Document
    Dim rng As Range
    Set doc = ActiveDocument
    Set rng = doc.Content
    With rng.Find
        .Text = "\(*\)"
        .Replacement.Text = ""
        .Forward = True
        .Wrap = wdFindContinue
        .Format = False
        .MatchWildcards = True
        .Execute Replace:=wdReplaceAll
    End With
End Sub
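
For text that has already been exported to a plain .txt file, the same cleanup can be sketched in Python. This is only an illustration, not part of the original workflow: the function name strip_parentheses is made up here, and handling the full-width pair （） is an added assumption, since the macro above only targets ASCII parentheses.

```python
import re

# Shortest-match pattern, similar in spirit to Word's wildcard "\(*\)".
# The second alternative (full-width brackets) is an assumption added here.
PAREN_RE = re.compile(r"\(.*?\)|（.*?）")


def strip_parentheses(text: str) -> str:
    """Remove each parenthesised span, including the brackets themselves."""
    return PAREN_RE.sub("", text)


if __name__ == "__main__":
    print(strip_parentheses("压力(pressure)与温度（见附录）有关"))
```

Unlike the Word search, "." here does not cross line breaks, so a bracket that opens on one line and closes on another is left alone.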

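5.2 Delete labels of the form "图1-8" (figure labels)
Ctrl + H > More > Use wildcards > Find what: 图[0-9]{1,}-[0-9]{1,} > replace with nothing. Then change "图" (figure) to "表" (table) and run the cleanup again.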
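
If the text is being processed outside Word, both passes can be folded into one regular expression. A minimal sketch, assuming plain text input; strip_labels is a hypothetical helper name.

```python
import re

# 图 = figure, 表 = table; one character class covers both passes.
# [0-9]+ plays the role of Word's [0-9]{1,}.
LABEL_RE = re.compile(r"[图表][0-9]+-[0-9]+")


def strip_labels(text: str) -> str:
    """Remove labels such as 图1-8 or 表12-3."""
    return LABEL_RE.sub("", text)
```

[0-9] is used rather than \d so that only ASCII digits match, as in the Word pattern.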

5.3 Remove text such as "续表" (table continued), and markers of the form "三 、"
Find and replace: [0-9一二三四五六七八九十] 、
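
The same two removals, sketched in Python for plain text. The helper name strip_markers is hypothetical, and the pattern keeps the literal space before "、" exactly as in the Word search, so "三、" without a space is left untouched.

```python
import re

# One digit or Chinese numeral, a space, then the enumeration comma.
NUMERAL_MARK_RE = re.compile(r"[0-9一二三四五六七八九十] 、")


def strip_markers(text: str) -> str:
    """Remove "续表" and markers such as "三 、"."""
    text = text.replace("续表", "")
    return NUMERAL_MARK_RE.sub("", text)
```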
5.4 Delete every paragraph containing "选择题参考答案:" (reference answers to multiple-choice questions)

Sub DeleteParagraphsContainingText()
    Dim i As Long
    Dim searchText As String

    ' Text to look for
    searchText = "选择题参考答案:"

    ' Walk the paragraphs from last to first so deletions do not skip any
    For i = ActiveDocument.Paragraphs.Count To 1 Step -1
        If InStr(ActiveDocument.Paragraphs(i).Range.Text, searchText) > 0 Then
            ActiveDocument.Paragraphs(i).Range.Delete
        End If
    Next i
End Sub
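
On a plain-text export, the paragraph filter can be sketched as follows. It assumes that each paragraph sits on its own line; drop_paragraphs_containing is a hypothetical name.

```python
def drop_paragraphs_containing(text: str, needle: str) -> str:
    """Keep only the lines (paragraphs) that do not contain `needle`."""
    kept = [line for line in text.split("\n") if needle not in line]
    return "\n".join(kept)


if __name__ == "__main__":
    sample = "第一段\n选择题参考答案:ABCD\n第二段"
    print(drop_paragraphs_containing(sample, "选择题参考答案:"))
```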

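5.5 Manually delete all end-of-chapter exercises, and manually delete any leftover anchors
Jump from chapter to chapter by searching for 第[一二三四五六七八九十]章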
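
To list where the chapter headings sit in a plain-text copy, something like the sketch below could be used. Note one deliberate difference, which is an assumption on my part: the "+" lets the pattern match multi-character numerals such as 第十二章, whereas the Word pattern above matches a single numeral only. chapter_offsets is a hypothetical name.

```python
import re

CHAPTER_RE = re.compile(r"第[一二三四五六七八九十]+章")


def chapter_offsets(text: str) -> list[tuple[str, int]]:
    """Return (heading, character offset) for every chapter heading found."""
    return [(m.group(), m.start()) for m in CHAPTER_RE.finditer(text)]
```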
pps:
5.6 Remove special symbols
[①②③④⑤⑥⑦⑧⑨⑩⑪⑫⑬⑭⑮⑯⑰⑱⑲⑳㉑㉒㉓㉔㉕㉖㉗㉘㉙㉚㉛㉜㉝㉞㉟㊱㊲㊳㊴㊵]
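
Outside Word, the same character list can be stripped with str.translate. A small sketch; the constant and function names are made up for illustration.

```python
# The forty circled numbers listed above, ① through ㊵.
CIRCLED = "①②③④⑤⑥⑦⑧⑨⑩⑪⑫⑬⑭⑮⑯⑰⑱⑲⑳㉑㉒㉓㉔㉕㉖㉗㉘㉙㉚㉛㉜㉝㉞㉟㊱㊲㊳㊴㊵"

# Translation table that maps each of those characters to "deleted".
_DELETE_CIRCLED = str.maketrans("", "", CIRCLED)


def strip_circled_numbers(text: str) -> str:
    """Remove every circled number from the text."""
    return text.translate(_DELETE_CIRCLED)
```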
ps:

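6. Remove numbering

Paste into another Word document keeping text only, then find and replace: [0-9]. (if the end-of-chapter questions have already been deleted, only the parentheses need cleaning). Then clean up parentheses and their contents once more.
6.1 Clean up using 【*】
6.2 Clean up using 附表[0-9][0-9], then 附表[0-9] (附表 = appendix table)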
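
The three searches in this step translate to regular expressions roughly as below. This is a sketch for plain text only, with hypothetical names. Two details differ from Word syntax: the dot has to be escaped, because an unescaped "." in a regular expression matches any character, and [0-9]{1,2} covers the two 附表 passes at once.

```python
import re

NUMBER_RE = re.compile(r"[0-9]\.")         # one digit followed by a literal dot
BRACKET_RE = re.compile(r"【.*?】")         # 【...】 with the shortest possible content
APPENDIX_RE = re.compile(r"附表[0-9]{1,2}")  # 附表 followed by one or two digits


def strip_numbering(text: str) -> str:
    """Apply the three removals in the order the steps are listed."""
    text = NUMBER_RE.sub("", text)
    text = BRACKET_RE.sub("", text)
    return APPENDIX_RE.sub("", text)
```

As with the Word search, [0-9]. only removes the last digit of a multi-digit number such as "12.", so the result is worth checking by eye.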

6. Remove spaces and line breaks

Set the layout to a single column, copy the text into Notepad, then run the Python script

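# Read the text that was pasted into Notepad and saved as UTF-8
with open("新建文本文档.txt", 'r', encoding='utf-8') as file:
    text = file.read()

# Remove every line break
text = text.replace('\n', '')

# Write the result out as one continuous line
with open('output.txt', 'w', encoding='utf-8') as file:
    file.write(text)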
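
The script above removes line breaks only. If spaces and tabs should go in the same pass, a variant could look like the sketch below. Which characters count as whitespace is my assumption (ASCII space, tab, CR, LF and the full-width space U+3000), and remove_whitespace is a hypothetical name.

```python
import re

# ASCII space, tab, carriage return, line feed, and the full-width space.
WHITESPACE_RE = re.compile(r"[ \t\r\n\u3000]+")


def remove_whitespace(text: str) -> str:
    """Delete all spaces, tabs and line breaks from the text."""
    return WHITESPACE_RE.sub("", text)


if __name__ == "__main__":
    print(remove_whitespace("第一行\n第二 行\t结束"))
```

This suits Chinese text; applied to English passages it would also glue the words together.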

p*4s:

7. Remove tab characters

Find and replace ^t

ppps:
Wildcard pattern for finding the note text beneath figures and tables: ???:《
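
In a regular expression, Word's "?" (any single character) becomes ".". The sketch below lists the lines of a plain-text copy that match; accepting the full-width colon as well as the ASCII one is an assumption added here, and find_caption_notes is a hypothetical name.

```python
import re

# Three arbitrary characters, a colon (ASCII or full-width), then 《.
NOTE_RE = re.compile(r".{3}[:：]《")


def find_caption_notes(text: str) -> list[str]:
    """Return every line that contains a match for the note pattern."""
    return [line for line in text.split("\n") if NOTE_RE.search(line)]
```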


Special symbols:

  1. ①②③④⑤⑥⑦⑧⑨⑩⑪⑫⑬⑭⑮⑯⑰⑱⑲⑳㉑㉒㉓㉔㉕㉖㉗㉘㉙㉚㉛㉜㉝㉞㉟㊱㊲㊳㊴㊵
posted @ 2024-10-02 06:36  基础狗