NLTK debug notes: "[nltk_data] Error loading xxx", dataset download fails
Problem: running nltk.download("xxx") fails with a connection/download error.
Solution:
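- Download the .zip package you need from a mirror on Gitee (for example, via the download links under nltk_data/packages/corpora/);
- When NLTK downloads or loads a dataset it automatically searches a few directories whose path ends in ./nltk_data/ (see the note below). Pick one of them and make sure you have write permission for it; if the parent directory has no nltk_data folder yet, create one with exactly that name. Upload the .zip file from the first step into ./nltk_data/, keeping the subfolder it had in the repository (e.g. corpora/), then run your code again.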
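The two steps above can be sketched in a few lines of Python. This is only a minimal sketch under some assumptions: the function name install_package_zip and the file name stopwords.zip are made up for illustration, and I assume the package belongs in a category subfolder such as corpora/ (the layout NLTK's data loader looks for). It uses the standard library only and does not call NLTK itself.

```python
import shutil
import zipfile
from pathlib import Path


def install_package_zip(zip_path, nltk_data_dir, category="corpora"):
    """Copy a manually downloaded package into <nltk_data_dir>/<category>/.

    `category` is an assumption: use the subfolder the package had in the
    repository (corpora, tokenizers, taggers, ...).
    """
    zip_path = Path(zip_path)
    target_dir = Path(nltk_data_dir).expanduser() / category
    # Create nltk_data/<category>/ if it does not exist yet.
    target_dir.mkdir(parents=True, exist_ok=True)

    target_zip = target_dir / zip_path.name
    # Skip the copy when the zip is already in place.
    if zip_path.resolve() != target_zip.resolve():
        shutil.copy2(zip_path, target_zip)

    # Also unpack it next to the zip, so that both nltk_data/<category>/<name>.zip
    # and nltk_data/<category>/<name>/ are available.
    with zipfile.ZipFile(target_zip) as archive:
        archive.extractall(target_dir)
    return target_zip


# Hypothetical usage (file name and target directory are examples only):
# install_package_zip("stopwords.zip", "~/nltk_data")
```

After that, rerunning the original code should find the data locally instead of trying to download it.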
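[Note] How to find the directories NLTK searches and stores datasets in:
Open the download code, downloader.py, in the nltk installation directory,
vim ~/.local/lib/python3.8/site-packages/nltk/downloader.py
and you will find that the docstring of the function that picks the download directory lists these candidate directories: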
``/usr/share/nltk_data``, ``/usr/local/share/nltk_data``,
``/usr/lib/nltk_data``, ``/usr/local/lib/nltk_data``, ``~/nltk_data``
... ...
def default_download_dir(self):
    """
    Return the directory to which packages will be downloaded by
    default.  This value can be overridden using the constructor,
    or on a case-by-case basis using the ``download_dir`` argument when
    calling ``download()``.

    On Windows, the default download directory is
    ``PYTHONHOME/lib/nltk``, where *PYTHONHOME* is the
    directory containing Python, e.g. ``C:\\Python25``.

    On all other platforms, the default directory is the first of
    the following which exists or which can be created with write
    permission: ``/usr/share/nltk_data``,
    ``/usr/local/share/nltk_data``, ``/usr/lib/nltk_data``,
    ``/usr/local/lib/nltk_data``, ``~/nltk_data``.
    """
... ...
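The site-packages path in the vim command above belongs to one particular installation and will differ on other machines. A small sketch for locating the file on your own setup, using only the standard library (the helper name find_module_file is made up here; it returns None when the package is not installed):

```python
import importlib.util
from pathlib import Path
from typing import Optional


def find_module_file(package: str, module: str) -> Optional[Path]:
    """Return the path of <package>/<module>.py, or None if it cannot be found."""
    # find_spec on a top-level name does not import the package.
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None
    for location in spec.submodule_search_locations:
        candidate = Path(location) / f"{module}.py"
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    # Prints the location of nltk/downloader.py, or None without nltk.
    print(find_module_file("nltk", "downloader"))
```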
Pick one of these directories that you have write permission for, create it if needed, and put the dataset's .zip file there.
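Checking the candidates by hand gets tedious, so here is a rough sketch that walks the list from the docstring and returns the first directory that either exists and is writable or looks like it could be created. It is a simplified approximation written for this note, not NLTK's own implementation, and the helper name first_writable_dir is made up. It only inspects permissions and creates nothing.

```python
import os
from pathlib import Path
from typing import Iterable, Optional

# Candidate directories copied from the default_download_dir docstring.
CANDIDATES = [
    "/usr/share/nltk_data",
    "/usr/local/share/nltk_data",
    "/usr/lib/nltk_data",
    "/usr/local/lib/nltk_data",
    "~/nltk_data",
]


def first_writable_dir(candidates: Iterable[str]) -> Optional[Path]:
    """Return the first candidate that is writable or could be created."""
    for raw in candidates:
        path = Path(raw).expanduser()
        if path.is_dir():
            if os.access(path, os.W_OK):
                return path
            continue
        # The directory is missing: find its nearest existing ancestor
        # and check whether we could create something inside it.
        parent = path.parent
        while not parent.exists() and parent != parent.parent:
            parent = parent.parent
        if parent.is_dir() and os.access(parent, os.W_OK | os.X_OK):
            return path
    return None


if __name__ == "__main__":
    print(first_writable_dir(CANDIDATES))
```

On a shared server without root access this will usually end up at ~/nltk_data. If NLTK is installed, nltk.data.path shows the list of directories it actually searches on your machine.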
Change the world with code! That's all, meow!
Posted on 2023-10-26 15:05 by Mju_halcyon