Winter Plan Report 01

Two projects:

1.  Financial data mining + cutting-edge pricing models for financial products

2.  Database + QTUI (for Jobs)

 

--------------------

Today is the second day:

1.  Finished reading the first 2 chapters on writing web crawlers in Java (plan to redo it with Scrapy, following the same approach)

2.  First 2 chapters of nltk

---------------------

Report:

nltk

1.  nltk words, sents... 

2.  len(set(w))

3.  [w.lower() for w in text if w.startswith('sb')]    functional programming is so cute and elegant
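
To make items 2 and 3 concrete, here is a minimal sketch in plain Python. The token list is a made-up stand-in for an nltk text (the notes do not say which corpus was used), so it runs without downloading any nltk data; an nltk text can be iterated token by token in the same way.

```python
# Toy token list standing in for an nltk text. This data is an
# assumption for illustration only, not taken from any real corpus.
text = ["The", "sbirro", "SAW", "the", "sbirro", "and", "the", "Sbarro", "sign"]

# Item 2: vocabulary size is the number of distinct tokens.
# set() is case-sensitive, so "The" and "the" count separately.
vocab_size = len(set(text))

# Item 3: lowercase every token that starts with 'sb'.
# startswith() is case-sensitive too, so "Sbarro" is not matched.
sb_words = [w.lower() for w in text if w.startswith("sb")]

print(vocab_size)
print(sb_words)
```

Lowercasing the tokens before building the set would merge "The" and "the" if a case-insensitive vocabulary is wanted.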

crawler

1.  Data structures chosen for each specific goal: HashSet, Queue, PriorityQueue

2.  HttpClient

3.  A complete sample for a crawler

4.  An introduction to Heritrix
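
As a rough sketch of how the three data structures from item 1 fit into a crawler, here is a Python version (Python because the plan is to redo this with Scrapy). It uses `set`, `collections.deque` and `heapq` as stand-ins for Java's HashSet, Queue and PriorityQueue. The `LINKS` graph, the function names and the `score` callback are all hypothetical; a real crawler would fetch pages over HTTP instead of reading a dictionary.

```python
from collections import deque
import heapq

# Hypothetical in-memory link graph standing in for real HTTP fetches.
LINKS = {
    "a": ["b", "c"],
    "b": ["c", "d"],
    "c": ["a"],
    "d": [],
}

def crawl_bfs(seed):
    """Breadth-first crawl: FIFO queue as frontier, set of seen URLs."""
    visited = {seed}            # plays the role of the HashSet
    frontier = deque([seed])    # plays the role of the Queue
    order = []
    while frontier:
        url = frontier.popleft()
        order.append(url)
        for link in LINKS.get(url, []):
            if link not in visited:
                visited.add(link)
                frontier.append(link)
    return order

def crawl_by_priority(seed, score):
    """Best-first crawl: heap as frontier, lowest score is visited first."""
    visited = {seed}
    frontier = [(score(seed), seed)]   # plays the role of the PriorityQueue
    order = []
    while frontier:
        _, url = heapq.heappop(frontier)
        order.append(url)
        for link in LINKS.get(url, []):
            if link not in visited:
                visited.add(link)
                heapq.heappush(frontier, (score(link), link))
    return order

# Hypothetical scores: lower means "crawl sooner".
SCORES = {"a": 0, "b": 2, "c": 1, "d": 0}

print(crawl_bfs("a"))
print(crawl_by_priority("a", SCORES.get))
```

The visited set is what keeps the crawler from looping on cycles such as a -> c -> a; swapping the queue for a heap changes only the order in which pages are visited.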

posted on 2013-01-03 18:18  surghost
