Setting up a Python natural language processing environment on a Mac
Part 1: Installing NLTK
Ⅰ. Tool installation steps
1. Download the setuptools build that matches your Python version from https://pypi.python.org/pypi/setuptools, then run it in a terminal: sudo sh Downloads/setuptools-0.6c11-py2.7.egg
2. Install pip. In a terminal, run: sudo easy_install pip
3. Install NumPy and matplotlib. Run: sudo pip install -U numpy matplotlib
4. Install pyyaml and nltk. Run: sudo pip install -U pyyaml nltk
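Once the four steps have run, it can help to confirm that each package is actually importable. The following is a minimal sketch, not part of the original steps; the helper name installed_versions is made up for illustration, and the module names assume that pyyaml is imported as yaml.

```python
import importlib


def installed_versions(names):
    """Map each module name to its version string, or None if the import fails."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
        else:
            # Not every module defines __version__, so fall back to "unknown".
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    # Import names for the packages from steps 3 and 4
    # (assumption: the pyyaml package is imported as "yaml").
    report = installed_versions(["numpy", "matplotlib", "yaml", "nltk"])
    for name in sorted(report):
        print("%s: %s" % (name, report[name] or "NOT INSTALLED"))
```

Any package reported as NOT INSTALLED points back to the step that needs to be repeated.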
Ⅱ. Problems encountered
1. Common problem when installing pip
Error: No available formula with the name "pip" Homebrew provides pip via: `brew install python`. However you will then have two Pythons installed on your Mac, so alternatively you can install pip via the instructions at: https://pip.readthedocs.org/en/stable/installing/#install-pip
Use this instead:
sudo easy_install pip
Remember to include sudo.
2. Common problem when installing pyyaml:
sudo pip install -U pyyaml nltk
runs into the following problem:
Installing collected packages: six
  Found existing installation: six 1.4.1
    DEPRECATION: Uninstalling a distutils installed project (six) has been deprecated and will be removed in a future version. This is due to the fact that uninstalling a distutils project will only partially uninstall the project.
    Uninstalling six-1.4.1: ...
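Install with the following command instead, replacing libName with the package you are installing (for example pyyaml):
sudo pip install libName --upgrade --ignore-installed six
3. Then install nltk the same way.
4. Updating numpy: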
pip install --upgrade numpy
>>> import sklearn.datasets
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Python/2.7/site-packages/sklearn/__init__.py", line 57, in <module>
    from .base import clone
  File "/Library/Python/2.7/site-packages/sklearn/base.py", line 11, in <module>
    from .utils.fixes import signature
  File "/Library/Python/2.7/site-packages/sklearn/utils/__init__.py", line 10, in <module>
    from .murmurhash import murmurhash3_32
  File "numpy.pxd", line 155, in init sklearn.utils.murmurhash (sklearn/utils/murmurhash.c:5029)
ValueError: numpy.dtype has the wrong size, try recompiling
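Reference: https://blog.wizchen.com/2016/06/17/Mac%E4%B8%8B%E6%9B%B4%E6%96%B0python%E7%A7%91%E5%AD%A6%E8%AE%A1%E7%AE%97%E5%BA%93numpy/
The fix is to disable SIP (System Integrity Protection):
Restart the computer and hold Command+R during startup until the Apple logo appears. A utilities window will open; open Terminal from there and enter: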
csrutil disable
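After the machine has restarted, run this in a terminal again: sudo pip install -U numpy
This time it should succeed. Be sure to include sudo.
5. Likewise, to install matplotlib: sudo pip install matplotlib
sudo is required here as well.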
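To see which copy of numpy Python is actually loading, the sketch below prints its version and file path. The notes above do not say why disabling SIP helps; one plausible explanation, treated here as an assumption, is that the copy bundled with macOS lives in a directory that pip cannot overwrite while SIP is on. The prefix lists and both helper names are illustrative simplifications, not an official definition of what SIP protects.

```python
import importlib

# Assumption: a simplified list of directory prefixes that SIP protects.
PROTECTED_PREFIXES = ("/System/", "/usr/", "/bin/", "/sbin/")
# Assumption: /usr/local stays writable even with SIP enabled.
UNPROTECTED_PREFIXES = ("/usr/local/",)


def looks_sip_protected(path):
    """Return True if path falls under one of the assumed protected prefixes."""
    if path.startswith(UNPROTECTED_PREFIXES):
        return False
    return path.startswith(PROTECTED_PREFIXES)


def describe_module(name):
    """Return (version, file path) for an importable module, or None."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    version = getattr(module, "__version__", "unknown")
    path = getattr(module, "__file__", None) or ""
    return version, path


if __name__ == "__main__":
    info = describe_module("numpy")
    if info is None:
        print("numpy is not importable")
    else:
        version, path = info
        print("numpy %s loaded from %s" % (version, path))
        if looks_sip_protected(path):
            print("this copy sits in a location that may be protected by SIP")
```

If the version printed after the upgrade is still the old one, Python is probably still picking up the older copy.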
Part 2: Using NLTK
1. Start Python
>>> import nltk
>>> nltk.download()
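This opens a dialog from which packages can be downloaded.
In practice, however, the download usually fails, and the data packages have to be fetched manually
(you can ask the author of this post for them or search on Baidu; copies are available). After that you can run all kinds of text experiments.
2. Run your own Python experiments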
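As a starting point for such experiments, here is a minimal sketch that tells NLTK where manually downloaded data lives and then counts word frequencies. The directory ~/Downloads/nltk_data is a hypothetical location, and the helper names are made up for illustration. The frequency count uses plain whitespace splitting so that it needs no downloaded data, and it falls back to collections.Counter if nltk cannot be imported.

```python
import os
from collections import Counter

try:
    import nltk
except ImportError:  # lets the sketch run even where nltk is missing
    nltk = None

# Hypothetical location of the manually downloaded data packages.
DATA_DIR = os.path.expanduser("~/Downloads/nltk_data")


def register_data_dir(path):
    """Add path to NLTK's data search list if it exists; return True on success."""
    if nltk is None or not os.path.isdir(path):
        return False
    if path not in nltk.data.path:
        nltk.data.path.append(path)
    return True


def word_frequencies(text):
    """Count lowercase whitespace-separated words; needs no downloaded data."""
    words = text.lower().split()
    # nltk.FreqDist is a Counter subclass, so both branches behave alike here.
    return nltk.FreqDist(words) if nltk is not None else Counter(words)


if __name__ == "__main__":
    print("data dir registered: %s" % register_data_dir(DATA_DIR))
    freq = word_frequencies("the cat sat on the mat")
    print(freq.most_common(2))
```

NLTK also searches ~/nltk_data by default, so unpacking the data there should make the registration step unnecessary.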