Python Natural Language Processing study notes (14): 2.6 Summary

When reposting, please credit the source, “一块努力的牛皮糖”: http://www.cnblogs.com/yuxc/

I'm a beginner; if any of the translation is off, please point it out. Thank you.

 

2.6 Summary

 

A text corpus is a large, structured collection of texts. NLTK comes with many corpora, e.g., the Brown Corpus, nltk.corpus.brown.

 

Some text corpora are categorized, e.g., by genre or topic; sometimes the categories of a corpus overlap each other.
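To make overlapping categories concrete, here is a minimal sketch built on a hand-made toy corpus rather than real NLTK data. The file names, the topics, and the helpers `categories` and `fileids` are all hypothetical; they are only loosely modeled on the methods of the same names that NLTK's categorized corpus readers offer.

```python
# Toy categorized corpus: each file id maps to one or more categories.
# All names here are invented for illustration; this is not NLTK data.
TOY_CORPUS = {
    "doc1.txt": ["grain", "trade"],
    "doc2.txt": ["trade"],
    "doc3.txt": ["grain", "corn"],
}


def categories(fileid=None):
    """Return the sorted categories of one file, or of the whole corpus."""
    if fileid is not None:
        return sorted(TOY_CORPUS[fileid])
    return sorted({c for cats in TOY_CORPUS.values() for c in cats})


def fileids(category):
    """Return the sorted file ids that belong to the given category."""
    return sorted(f for f, cats in TOY_CORPUS.items() if category in cats)


# doc1.txt carries two labels, so the "grain" and "trade" categories overlap.
overlap = set(fileids("grain")) & set(fileids("trade"))

print(categories())
print(fileids("grain"))
print(sorted(overlap))
```

Because one file can carry several labels, per-category counts may add up to more than the number of files.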

 

A conditional frequency distribution is a collection of frequency distributions, each one for a different condition. They can be used for counting word frequencies, given a context or a genre.
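The idea can be sketched with the standard library alone: one counter per condition. This is a stand-in written for illustration, not NLTK's own implementation, and the (condition, word) pairs are invented. NLTK's `nltk.ConditionalFreqDist` can be built from a list of pairs in a similar shape.

```python
from collections import Counter, defaultdict

# (condition, word) pairs; the data is invented for illustration.
pairs = [
    ("news", "the"), ("news", "market"), ("news", "the"),
    ("romance", "the"), ("romance", "love"), ("romance", "love"),
]

# One Counter (a frequency distribution) per condition.
cfd = defaultdict(Counter)
for condition, word in pairs:
    cfd[condition][word] += 1

print(sorted(cfd))                     # the conditions seen so far
print(cfd["news"]["the"])              # count of "the" under "news"
print(cfd["romance"].most_common(1))   # most frequent word under "romance"
```

Indexing first by condition and then by word mirrors the way the summary describes it: a frequency distribution for each condition.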

 

Python programs more than a few lines long should be entered using a text editor, saved to a file with a .py extension, and accessed using an import statement.
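The workflow can be simulated in one script. In this sketch the module name `textstats` and its function are hypothetical, and the file is written to a temporary directory only so the example is self-contained; in practice you would save the file from your editor next to your other code.

```python
import importlib
import os
import sys
import tempfile

# Source code we would normally type into a text editor.
# The module name "textstats" and its function are hypothetical.
MODULE_SOURCE = '''
def lexical_diversity(words):
    """Ratio of distinct words to total words."""
    return len(set(words)) / len(words)
'''

# Save the code to a file with a .py extension.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "textstats.py"), "w") as f:
    f.write(MODULE_SOURCE)

# Make the directory importable, then load the file with an import statement.
sys.path.insert(0, module_dir)
importlib.invalidate_caches()
import textstats

print(textstats.lexical_diversity(["a", "b", "a", "c"]))
```

Once the file exists, any later session can reuse it with a single `import textstats` instead of retyping the function.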

 

Python functions permit you to associate a name with a particular block of code, and reuse that code as often as necessary.
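As a small illustration, the sketch below gives the name `plural` to a block of code and then reuses it on several words. The rules are deliberately simplified and will get many English nouns wrong; the point is the reuse, not the linguistics.

```python
def plural(word):
    """Guess the plural of an English noun using a few rough rules."""
    if word.endswith("y"):
        return word[:-1] + "ies"
    elif word[-1] in "sx" or word[-2:] in ["sh", "ch"]:
        return word + "es"
    elif word.endswith("an"):
        return word[:-2] + "en"
    else:
        return word + "s"


# The same named block of code is reused for every word.
for noun in ["fairy", "box", "woman", "cat"]:
    print(noun, "->", plural(noun))
```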

 

Some functions, known as “methods,” are associated with an object, and we give the object name followed by a period followed by the method name, like this: x.funct(y), e.g., word.isalpha().
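A short example of the object.method pattern, using built-in string methods on an invented word list:

```python
words = ["Moby", "Dick", ",", "1851", "whale"]

# Call the string method isalpha() on each object named word.
alphabetic = [word for word in words if word.isalpha()]

# Methods can also take arguments, following the x.funct(y) pattern.
starts_with_w = [word for word in words if word.startswith("w")]

print(alphabetic)
print(starts_with_w)
```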

 

To find out about some variable v, type help(v) in the Python interactive interpreter to read the help entry for this kind of object.
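In an interactive session, `help(v)` pages the text to the screen. The sketch below instead captures roughly the same help text as a string with the standard `pydoc` module, so a script can inspect it; the variable contents are invented.

```python
import pydoc

v = ["a", "list", "of", "words"]

# In an interactive session you would simply type: help(v)
# Here similar help text is captured as a plain string for inspection.
help_text = pydoc.render_doc(v, renderer=pydoc.plaintext)

print(type(v).__name__)        # the kind of object v is
print("append" in help_text)   # the help entry documents list methods
```

Note that the help entry describes the type of `v` (here, a list), not the particular values stored in it.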

 

WordNet is a semantically oriented dictionary of English, consisting of synonym sets (synsets) organized into a network.
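The structure can be pictured with a hand-built toy network. This is not the real WordNet database: the dictionary below is abridged and simplified, the synset names merely imitate NLTK's naming style, and the helpers `synsets_for` and `hypernym_chain` are hypothetical. With NLTK installed and its data downloaded, the real resource is available as `nltk.corpus.wordnet`.

```python
# Hand-built toy network; an abridged illustration, not real WordNet data.
SYNSETS = {
    "car.n.01": {
        "lemmas": ["car", "auto", "automobile", "machine", "motorcar"],
        "hypernyms": ["motor_vehicle.n.01"],
    },
    "motor_vehicle.n.01": {
        "lemmas": ["motor_vehicle", "automotive_vehicle"],
        "hypernyms": ["vehicle.n.01"],  # intermediate levels omitted
    },
    "vehicle.n.01": {"lemmas": ["vehicle"], "hypernyms": []},
}


def synsets_for(word):
    """Return the names of all toy synsets that contain the word."""
    return sorted(n for n, s in SYNSETS.items() if word in s["lemmas"])


def hypernym_chain(name):
    """Follow the first hypernym link upward until the top is reached."""
    chain = [name]
    while SYNSETS[chain[-1]]["hypernyms"]:
        chain.append(SYNSETS[chain[-1]]["hypernyms"][0])
    return chain


print(synsets_for("automobile"))
print(hypernym_chain("car.n.01"))
```

Each synset groups words that share a meaning, and the hypernym links are what turn the set of synsets into a network.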

 

Some functions are not available by default, but must be accessed using Python’s import statement.
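Two common forms of the import statement, shown with standard-library modules chosen only as examples:

```python
import math                        # import a whole module
from collections import Counter    # import a single name from a module

# sqrt is reached through the module name.
print(math.sqrt(16))

# Counter can be used directly because it was imported by name.
counts = Counter("banana")
print(counts.most_common(1))
```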

posted @ 2011-08-05 21:24  牛皮糖NewPtone