牛皮糖NewPtone - 博客园

August 8, 2011

Summary: In this post we use the wave percolation algorithm implemented earlier to estimate, for a grid of fixed size, the probability that percolation occurs at different open-cell probabilities p. The specification for this part reads: How many trials are needed to make a prediction on whether a grid generated with probability p percolates? How many different values of p should be considered to determine the percolation probability q? …
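One way to approach the two questions above is a Monte Carlo sweep: generate many random grids for each p and report the fraction that percolate. The sketch below is not the author's code. `random_grid`, `percolates`, and `estimate_q` are hypothetical helper names, the grid is assumed to be a list of lists of booleans with True meaning open, and a plain breadth-first search stands in for the wave algorithm from the earlier post.

```python
import random
from collections import deque


def random_grid(n, p, rng):
    """Return an n x n grid; each cell is open (True) with probability p."""
    return [[rng.random() < p for _ in range(n)] for _ in range(n)]


def percolates(grid):
    """Flow from every open top-row cell; True if any bottom-row cell is reached."""
    n = len(grid)
    seen = [[False] * n for _ in range(n)]
    queue = deque()
    for c in range(n):
        if grid[0][c]:
            seen[0][c] = True
            queue.append((0, c))
    while queue:
        r, c = queue.popleft()
        if r == n - 1:
            return True
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < n and 0 <= nc < n and grid[nr][nc] and not seen[nr][nc]:
                seen[nr][nc] = True
                queue.append((nr, nc))
    return False


def estimate_q(n, p, trials, seed=0):
    """Estimate the percolation probability q as the fraction of percolating grids."""
    rng = random.Random(seed)
    hits = sum(percolates(random_grid(n, p, rng)) for _ in range(trials))
    return hits / trials


# Sweep a handful of p values; both the step and the trial count are arbitrary choices.
sweep = {p / 10: estimate_q(10, p / 10, trials=200) for p in range(0, 11)}
```

Because each trial is a yes/no outcome, the standard error of the estimate with T trials is at most 0.5/sqrt(T): about 0.05 for 100 trials and about 0.005 for 10,000. A coarse sweep over p followed by a finer one around the region where q rises is one reasonable way to decide how many values of p to sample.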

posted @ 2011-08-08 14:09 牛皮糖NewPtone Views (613) Comments (0) Recommended (0)

August 7, 2011

Implementing the Wave Probing Algorithm

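Summary: Implementing the wave probing algorithm. A reading guide for anyone who is lost: for a detailed description of the project and of the algorithm, see the previous post, "Project2 Percolation in Grids 网格渗透"; I have also written up documentation for the functions in percolation_provided.py. As the saying goes, everything is easier said than done. I opened the IDE and started typing without thinking things through, and then spent my time fixing one bug after another. Below I discuss the implementation of the algorithm, the problems I ran into, and how I solved them. Step 1: as usual, import the provided functions: from percolation_provided import *. Next, settle on the function name: percolation_wave(input_grid). With the name chosen we can get to work; for now the parameter is provisionally a single …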
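The summary stops before the algorithm itself, so the following is only a minimal sketch of a wave-style flood, not the author's `percolation_wave` and not built on `percolation_provided`. It assumes a grid given as a list of lists of booleans (True meaning open); `wave_fill` is a hypothetical name. Water starts in every open top-row cell as wave 0, and each later wave spreads to the open, still-dry neighbours of the previous wave.

```python
def wave_fill(grid):
    """Spread water wave by wave from the top row.

    Returns (flow, reached_bottom), where flow[r][c] is the wave number at
    which water reached the cell, or None if the cell stayed dry.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    flow = [[None] * n_cols for _ in range(n_rows)]

    # Wave 0: every open cell in the top row is wet.
    front = [(0, c) for c in range(n_cols) if grid[0][c]]
    for r, c in front:
        flow[r][c] = 0

    wave = 0
    while front:
        wave += 1
        next_front = []
        for r, c in front:
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                inside = 0 <= nr < n_rows and 0 <= nc < n_cols
                if inside and grid[nr][nc] and flow[nr][nc] is None:
                    flow[nr][nc] = wave
                    next_front.append((nr, nc))
        front = next_front

    reached_bottom = any(cell is not None for cell in flow[-1])
    return flow, reached_bottom
```

Recording the wave number in each cell, rather than a plain wet/dry flag, makes it easy to print the grid and see how the water advanced, which helps when tracking down bugs.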

posted @ 2011-08-07 23:10 牛皮糖NewPtone Views (733) Comments (0) Recommended (0)

Natural Language Processing with Python Study Notes (25): 3.9 Formatting: From Lists to Strings

Summary: 3.9 Formatting: From Lists to Strings. Often we write a program to report a single data item, such as a particular element in a corpus that meets some complicated criterion, or a single summary statistic such as a word-count or the performance of a tagger. More often, we write a program to …
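As a small illustration of the section title, a list of words can be turned into a single string with `str.join`, and a list of (word, count) pairs into aligned lines with `str.format`. This is a sketch in Python 3 syntax rather than the code from the post; `format_counts` and the sample data are made up for the example.

```python
def format_counts(counts, word_width=8, count_width=4):
    """Render (word, count) pairs as left/right aligned lines of text."""
    template = '{:<%d}{:>%d}' % (word_width, count_width)
    return '\n'.join(template.format(word, n) for word, n in counts)


words = ['We', 'called', 'him', 'Tortoise']
sentence = ' '.join(words)  # list of tokens -> one space-separated string

table = format_counts([('dog', 3), ('tortoise', 12)])
```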

posted @ 2011-08-07 20:19 牛皮糖NewPtone Views (2610) Comments (0) Recommended (1)

August 6, 2011

Natural Language Processing with Python Study Notes (24): 3.8 Segmentation

Summary: 3.8 Segmentation. This section discusses more advanced concepts, which you may prefer to skip on the first time through this chapter. Tokenization is an instance of a more general problem of segmentation. In this section, we will look at two other instances of this problem, which use radically (fundamentally) …

posted @ 2011-08-06 22:46 牛皮糖NewPtone Views (1697) Comments (0) Recommended (0)

Natural Language Processing with Python Study Notes (23): 3.7 Regular Expressions for Tokenizing Text

Summary: 3.7 Regular Expressions for Tokenizing Text. Tokenization is the task of cutting a string into identifiable linguistic units that constitute a piece of language data. Although it is a fundamental task, we have been able to delay it until now because many corpora are already tokenized, and …
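A very small regular-expression tokenizer can make the idea concrete. The pattern below is an assumption chosen for illustration, not necessarily the one used in the post: it treats each run of word characters as one token and each punctuation mark as its own token. `simple_tokenize` is a hypothetical name.

```python
import re


def simple_tokenize(text):
    """Split text into runs of word characters and single punctuation marks."""
    # \w+      : a run of letters, digits, or underscores
    # [^\w\s]  : any single character that is neither a word character nor whitespace
    return re.findall(r"\w+|[^\w\s]", text)


tokens = simple_tokenize("Hello, world!")
```

A pattern this simple splits contractions such as don't into three tokens, which is one reason real tokenizers need more carefully designed expressions.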

posted @ 2011-08-06 22:36 牛皮糖NewPtone Views (3706) Comments (0) Recommended (0)

Natural Language Processing with Python Study Notes (22): 3.6 Normalizing Text

Summary: 3.6 Normalizing Text. In earlier program examples we have often converted text to lowercase before doing anything with its words, e.g., set(w.lower() for w in text). By using lower(), we have normalized the text to lowercase so that the distinction between The and the is ignored. Often we want t…
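The expression quoted in the summary can be tried on a toy token list. The list and the helper name `lowercase_vocabulary` are invented for the example; only `set(w.lower() for w in text)` comes from the text.

```python
def lowercase_vocabulary(words):
    """Collapse case variants so that 'The' and 'the' count as one word type."""
    return set(w.lower() for w in words)


text = ['The', 'cat', 'saw', 'the', 'other', 'CAT']
raw_vocab = set(text)                 # 'The' and 'the' are distinct here
vocab = lowercase_vocabulary(text)    # case distinctions are ignored here
```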

posted @ 2011-08-06 22:27 牛皮糖NewPtone Views (2162) Comments (0) Recommended (0)

Natural Language Processing with Python Study Notes (21): 3.5 Useful Applications of Regular Expressions

Summary: 3.5 Useful Applications of Regular Expressions. The previous examples all involved searching for words w that match some regular expression regexp using re.search(regexp, w). Apart from checking whether a regular expression matches a word, we can use regular expressions to extract material …
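The difference between checking for a match and extracting material can be shown with the standard `re` module: `re.search` reports whether the pattern occurs, while `re.findall` returns every non-overlapping match. The helper name `extract_vowels` and the sample word are illustrative choices, not taken from the post.

```python
import re


def extract_vowels(word):
    """Return every vowel in the word, in order of appearance."""
    return re.findall(r'[aeiou]', word)


has_vowel = re.search(r'[aeiou]', 'regular') is not None  # yes/no check
vowels = extract_vowels('regular')                         # the matched pieces themselves
```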

posted @ 2011-08-06 16:08 牛皮糖NewPtone Views (2001) Comments (0) Recommended (0)

Natural Language Processing with Python Study Notes (20): 3.4 Regular Expressions for Detecting Word Patterns

Summary: When reposting, please credit the source, “一块努力的牛皮糖”: http://www.cnblogs.com/yuxc/. I am new to this; if any translation is inaccurate, please point it out, with my thanks. Updated log. 3.4 Regular Expressions for Detecting Word Patterns. Many linguistic processing tasks involve pattern matching. For example, we can find words ending with ed using endswith('ed'). We saw a variety o…
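The `endswith('ed')` test mentioned above has a direct regular-expression counterpart: the pattern `ed$`, where `$` anchors the match at the end of the string. The short word list and the function name `words_ending_in_ed` are assumptions made for this sketch.

```python
import re


def words_ending_in_ed(words):
    """Keep the words whose final two letters are 'ed', using a regex anchor."""
    return [w for w in words if re.search(r'ed$', w)]


sample = ['walked', 'edit', 'red', 'reading', 'needed']
by_regex = words_ending_in_ed(sample)
by_method = [w for w in sample if w.endswith('ed')]
```

For a fixed suffix the two approaches select the same words here; the regular expression becomes worthwhile once the pattern is more than a literal suffix.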

posted @ 2011-08-06 15:32 牛皮糖NewPtone Views (2213) Comments (0) Recommended (0)

Natural Language Processing with Python Study Notes (19): 3.3 Text Processing with Unicode

Summary: 3.3 Text Processing with Unicode. Our programs will often need to deal with different languages, and different character sets. The concept of “plain text” is a fiction. If you live in the English-speaking world you probably use ASCII, possibly without realizing it. If you live in Eu…
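To see why "plain text" depends on an encoding, the same character can be encoded under different character sets and compared. This sketch uses Python 3 `str`/`bytes` semantics, which differ from the Python 2 behaviour current when the post was written; `encoded_length` is a hypothetical helper and the sample character is an arbitrary choice.

```python
def encoded_length(text, encoding):
    """Return how many bytes the text occupies in the given encoding."""
    return len(text.encode(encoding))


n_acute = '\u0144'  # LATIN SMALL LETTER N WITH ACUTE, as used in Polish

utf8_len = encoded_length(n_acute, 'utf-8')         # multi-byte in UTF-8
latin2_len = encoded_length(n_acute, 'iso-8859-2')  # single byte in Latin-2

# ASCII has no code for this character, so encoding raises an error.
try:
    n_acute.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Decoding with the matching codec recovers the original string.
round_trip = n_acute.encode('utf-8').decode('utf-8')
```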

posted @ 2011-08-06 14:39 牛皮糖NewPtone Views (2819) Comments (0) Recommended (0)

August 5, 2011

Natural Language Processing with Python Study Notes (18): 3.2 Strings: Text Processing at the Lowest Level

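Summary: When reposting, please credit the source, “一块努力的牛皮糖”: http://www.cnblogs.com/yuxc/. I am new to this; if any translation is inaccurate, please point it out, with my thanks. Updated log: 1st 2011.8.6. 3.2 Strings: Text Processing at the Lowest Level. PS: I personally think this part is very important. String processing is the most basic part of NLP, so newcomers should read it carefully, and veterans can skip it... It’s time to study a fundamental data type that we’ve been studiously (deliberately) avoiding so far. In earlier …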

posted @ 2011-08-05 23:13 牛皮糖NewPtone Views (2552) Comments (0) Recommended (0)
