Extracting attachments from Outlook .msg files with Python's extract_msg module
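Posted on 2020-09-19 11:04 by 520_1351
I have a number of emails saved from Outlook as .msg files and now need to extract the attachments from them.
Outlook can of course save the attachments of a message directly, but by default it does not support extracting attachments from many messages in bulk.
I considered several approaches. One of them is the extract_msg module for Python, which I record here.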
1. Install the extract_msg module:
    pip install extract-msg
At the time of writing, the latest version is extract-msg 0.27.4, released Sep 3, 2020.
Project page: https://pypi.org/project/extract-msg
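2. Once it is installed, the simplest usage is a single command on the command line, which extracts the contents of the msg file into a subdirectory of the current directory (the directory name is derived from the message's details):
    # Creates a directory under the current directory, then extracts the attachments and message.txt from the msg file into it
    python -m extract_msg qq_5201351.msg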
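To run the same command over every msg file in a folder, one option is a small wrapper script. This is only a minimal sketch: build_command and extract_folder_with_cli are hypothetical helper names of my own, not part of extract_msg, and it assumes extract_msg is installed for the interpreter that runs the script.

```python
import subprocess
import sys
from pathlib import Path


def build_command(msg_name):
    """Build the same command as step 2, using the current interpreter."""
    return [sys.executable, "-m", "extract_msg", str(msg_name)]


def extract_folder_with_cli(folder):
    """Run the command once per .msg file in `folder`.

    Returns the names of the files for which the command exited with a
    non-zero return code, so they can be checked by hand afterwards.
    """
    failed = []
    for msg_path in sorted(Path(folder).glob("*.msg")):
        # Run inside `folder` so the output directories are created there.
        result = subprocess.run(build_command(msg_path.name), cwd=str(folder))
        if result.returncode != 0:
            failed.append(msg_path.name)
    return failed
```

Calling extract_folder_with_cli(".") would process the msg files in the current directory. Each message still gets its own output directory, exactly as with the single command.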
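3. In a .py file, the following extracts only the attachments (the directory the attachments are saved to must be created beforehand):
    import extract_msg

    # Open the msg file and read its list of attachments
    msg = extract_msg.Message("qq_5201351.msg")
    msg_attachment = msg.attachments
    if msg_attachment:
        for attachment in msg_attachment:
            # Save each attachment into the existing target directory
            attachment.save(customPath="./qq_5201351_dir")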
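Since the original goal was batch extraction, the same calls can be put in a loop over a folder of msg files. The following is a minimal sketch under a few assumptions: save_attachments and save_all are hypothetical helper names, the one-directory-per-message layout is my own choice, and the open_msg parameter exists only so that the logic can be tried with a stand-in object. It relies on nothing from extract_msg beyond Message, attachments and save(customPath=...) as used in step 3.

```python
from pathlib import Path


def save_attachments(msg_path, out_root, open_msg=None):
    """Save all attachments of one msg file into out_root/<msg file stem>/.

    Returns the number of attachments found.
    """
    if open_msg is None:
        # Imported lazily so the helper can be exercised with a stand-in opener.
        import extract_msg
        open_msg = extract_msg.Message
    msg = open_msg(str(msg_path))
    attachments = msg.attachments
    if not attachments:
        return 0
    # Create the target directory first, since save() expects it to exist.
    target = Path(out_root) / Path(msg_path).stem
    target.mkdir(parents=True, exist_ok=True)
    for attachment in attachments:
        attachment.save(customPath=str(target))
    return len(attachments)


def save_all(msg_dir, out_root, open_msg=None):
    """Process every .msg file in msg_dir; map file name to attachment count."""
    counts = {}
    for msg_path in sorted(Path(msg_dir).glob("*.msg")):
        counts[msg_path.name] = save_attachments(msg_path, out_root, open_msg)
    return counts
```

For example, save_all("./mail", "./attachments") would write the attachments of ./mail/qq_5201351.msg into ./attachments/qq_5201351/. Both directory names here are placeholders.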
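++++++Unresolved problem>>>>:
1. The approach above extracts the attachments, or the message body, from most msg files without trouble, but some of my msg files fail during extraction with the error below.
I have not found a solution so far. If you have found one, please leave a comment below. Thank you very much!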
Traceback (most recent call last):
  File "C:\Users\QQ5201351\AppData\Local\Programs\Python\Python37\lib\site-packages\extract_msg\msg.py", line 422, in named
    return self.__namedProperties
AttributeError: 'Message' object has no attribute '_MSGFile__namedProperties'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\QQ5201351\Desktop\mail\test\test.py", line 5, in <module>
    msg = extract_msg.Message("Important_msg_from_qq5201351.msg")
  File "C:\Users\QQ5201351\AppData\Local\Programs\Python\Python37\lib\site-packages\extract_msg\message.py", line 28, in __init__
    MessageBase.__init__(self, path, prefix, attachmentClass, filename, delayAttachments, overrideEncoding)
  File "C:\Users\QQ5201351\AppData\Local\Programs\Python\Python37\lib\site-packages\extract_msg\message_base.py", line 61, in __init__
    self.named
  File "C:\Users\QQ5201351\AppData\Local\Programs\Python\Python37\lib\site-packages\extract_msg\msg.py", line 424, in named
    self.__namedProperties = Named(self)
  File "C:\Users\QQ5201351\AppData\Local\Programs\Python\Python37\lib\site-packages\extract_msg\named.py", line 63, in __init__
    self.__properties.append(StringNamedProperty(entry, names[entry['id']], msg._getTypedData(streamID)) if entry['pkind'] == constants.STRING_NAMED else NumericalNamedProperty(entry, msg._getTypedData(streamID)))
  File "C:\Users\QQ5201351\AppData\Local\Programs\Python\Python37\lib\site-packages\extract_msg\msg.py", line 177, in _getTypedData
    found, result = self._getTypedStream('__substg1.0_' + id, prefix, _type)
  File "C:\Users\QQ5201351\AppData\Local\Programs\Python\Python37\lib\site-packages\extract_msg\msg.py", line 246, in _getTypedStream
    raise NotImplementedError('The stream specified is of type {}. We don\'t currently understand exactly how this type works. If it is mandatory that you have the contents of this stream, please create an issue labled "NotImplementedError: _getTypedStream {}".'.format(_type, _type))
NotImplementedError: The stream specified is of type 1014. We don't currently understand exactly how this type works. If it is mandatory that you have the contents of this stream, please create an issue labled "NotImplementedError: _getTypedStream 1014".
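This does not solve the problem, but when processing many files it at least helps to stop one bad message from aborting the whole run. Because the exception is raised while the Message object is being constructed, it can be caught there. The sketch below is only an illustration: open_or_skip is a hypothetical helper name, and open_msg is a parameter added so the logic can be tried without real msg files. The files it reports as failed still have to be handled some other way, for example by saving their attachments manually in Outlook.

```python
def open_or_skip(paths, open_msg=None):
    """Try to open each msg file, collecting the ones that cannot be parsed.

    Returns (opened, failed): `opened` maps path to the message object,
    `failed` maps path to the text of the NotImplementedError raised for it.
    """
    if open_msg is None:
        # Imported lazily so the helper can be exercised with a stand-in opener.
        import extract_msg
        open_msg = extract_msg.Message
    opened, failed = {}, {}
    for path in paths:
        try:
            opened[path] = open_msg(path)
        except NotImplementedError as exc:
            # Unsupported stream type (such as 1014 above): record it and move on.
            failed[path] = str(exc)
    return opened, failed
```

After a run, the keys of failed give the list of msg files that hit the error, and their values keep the stream type from the message for later reference.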
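Please respect other people's work. When reposting, be sure to cite the source: https://www.cnblogs.com/5201351/p/13695389.html
Author: 一名卑微的IT民工
Source: https://www.cnblogs.com/5201351
All articles on this blog are for learning, research and exchange only. Non-commercial reposting is welcome.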