Python and Microsoft Word
An article I came across on a foreign website:
Accessing Microsoft Word with Python follows the same syntax that we used for Excel.
Let’s take a quick look at how to access Word.
from time import sleep
import win32com.client as win32

RANGE = range(3, 8)

def word():
    word = win32.gencache.EnsureDispatch('Word.Application')
    doc = word.Documents.Add()
    word.Visible = True
    sleep(1)

    rng = doc.Range(0,0)
    rng.InsertAfter('Hacking Word with Python\r\n\r\n')
    sleep(1)
    for i in RANGE:
        rng.InsertAfter('Line %d\r\n' % i)
        sleep(1)
    rng.InsertAfter("\r\nPython rules!\r\n")

    doc.Close(False)
    word.Application.Quit()

if __name__ == '__main__':
    word()
This particular example is also based on something from Chun’s book. However, there are lots of other examples on the web that look almost exactly like this too. Let’s unpack this code now. To get a handle on the Microsoft Word application, we call win32.gencache.EnsureDispatch('Word.Application'); then we add a new document by calling the word instance’s Documents.Add(). If you want to show the user what you’re up to, you can set the visibility of Word to True.
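One thing the example does not guard against: if an exception is raised between creating the Word instance and calling Quit(), the Word process can be left running in the background. Here is a minimal sketch of one way to handle that. The helper name word_session is hypothetical (it is not part of pywin32), and it assumes you pass in the object returned by win32.gencache.EnsureDispatch('Word.Application').

```python
from contextlib import contextmanager


@contextmanager
def word_session(app, visible=True):
    """Yield a Word application object and always quit it afterwards.

    `app` is assumed to be the object returned by
    win32.gencache.EnsureDispatch('Word.Application'). This helper is a
    hypothetical convenience, not a pywin32 API.
    """
    app.Visible = visible
    try:
        yield app
    finally:
        # Runs even if the body raised, so Word is not left running.
        # Close or save open documents first; Word may otherwise prompt.
        app.Quit()
```

You would use it as "with word_session(win32.gencache.EnsureDispatch('Word.Application')) as word:" and then call word.Documents.Add() inside the block, closing the document with doc.Close(False) before the block ends, just as the example above does.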
If you want to add text to the document, then you’ll want to tell Word where you want the text to go. That’s where the Range method comes in. Its two arguments are the start and end character positions of the span you want to work with, so to insert text at the very top of the document, we ask for Range(0,0), an empty range that sits before the first character. To add a new line in Word, we need to append "\r\n" to the end of our string. If you don’t know about the annoyances of line endings on different platforms, you should spend some time with Google and learn about them so you don’t get bitten by weird bugs!
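As a small illustration of the line-ending point, here is a sketch of a helper that rewrites whatever line endings a string happens to have as "\r\n" before you hand it to InsertAfter. The name to_crlf is made up for this example; it uses only the standard library and does not touch Word at all.

```python
import re


def to_crlf(text):
    """Return text with every line ending rewritten as Windows-style CRLF."""
    # Match '\r\n' first so an existing CRLF pair is not doubled.
    return re.sub(r'\r\n|\r|\n', '\r\n', text)


# Text written with plain '\n' endings, e.g. on Linux or macOS.
lines = to_crlf('Hacking Word with Python\n\nLine 3\n')
```

The result can then be passed straight to rng.InsertAfter(lines) in the example above.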
The rest of the code is pretty self-explanatory and will be left to the reader to interpret. We’ll move on to opening and saving documents now:
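# Based on examples from http://code.activestate.com/recipes/279003/
# "doc" is the path to an existing Word document; "win32" is the alias
# from "import win32com.client as win32" used earlier.
word.Documents.Open(doc)
word.ActiveDocument.SaveAs("c:\\a.txt",
                           FileFormat=win32.constants.wdFormatTextLineBreaks)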
Here we show how to open an existing Word document and save it as text. I haven’t tested this one fully, so your mileage may vary. If you want to read the text in the document, you can do the following:
# Content is a Range object; its Text property holds the actual string.
docText = word.ActiveDocument.Content.Text
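Putting the open and read steps together, a rough sketch might look like the following. The function name read_document_text is hypothetical, and app is assumed to be a Word application object obtained through EnsureDispatch as shown earlier. Content gives you a Range object, and the string itself lives in that Range’s Text property. The path is made absolute first because Word does not necessarily share your script’s working directory.

```python
import os


def read_document_text(app, path):
    """Open path in Word, return its text, and close it without saving.

    `app` is assumed to be a Word application object; this function is
    an illustrative helper, not part of pywin32.
    """
    # Hand Word an absolute path rather than relying on its own
    # working directory matching the script's.
    doc = app.Documents.Open(os.path.abspath(path))
    try:
        # Content is a Range; Text is the plain string it covers.
        return doc.Content.Text
    finally:
        # False means "do not save changes", as in the first example.
        doc.Close(False)
```

Like the save-as-text snippet, I have not run this against every version of Word, so treat it as a starting point.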
And that ends the Python hacking lesson on Word documents. A lot of the information I found on Microsoft Word and Python was old and crusty and didn’t seem to work half the time, so I hope I haven’t added to the mess of bad information. Hopefully this will get you started on your own journey into the wild wonders of Word manipulation.