Winter Plan Report 01
Two projects:
1. Financial data mining + cutting-edge pricing models for financial products
2. Database + QTUI (for Jobs)
--------------------
Today is the second day:
1. Finished reading the first 2 chapters on web crawlers in Java. (Plan to build one with scrapy following the same line of thought.)
2. First 2 chapters of nltk
---------------------
Report:
nltk
1. nltk words, sents...
2. len(set(w)): number of distinct words in w
3. [w.lower() for w in text if w.startswith('sb')]: functional programming is so neat and elegant
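
A small sketch of notes 2 and 3, assuming only that a text is a list of word strings. The token list below is made up for illustration; in nltk the list would come from a corpus or a tokenizer instead.

```python
# Made-up token list standing in for an nltk text (assumption: a plain list of strings).
text = ["The", "sbatch", "job", "the", "sbrk", "Job", "sb", "sbatch"]

# Vocabulary size: a set keeps a single copy of each distinct token.
vocab_size = len(set(text))

# Same count, but ignoring case ("The" and "the" become one entry).
vocab_size_ci = len(set(w.lower() for w in text))

# List comprehension: lowercase every token that starts with 'sb'.
# startswith is case-sensitive, and duplicates are kept because this is a list.
sb_words = [w.lower() for w in text if w.startswith('sb')]

print(vocab_size, vocab_size_ci, sb_words)
```

Wrapping the comprehension in set(...) would drop the repeated token, which is the same trick as in note 2.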
crawler
1. Data structures suited to each specific purpose: HashSet, Queue, PriorityQueue
2. HttpClient
3. A complete sample for a crawler
4. An introduction to Heritrix
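
A rough Python sketch of how the three data structures from note 1 could fit together in a crawler, as preparation for the scrapy version. Assumptions: set, collections.deque and heapq stand in for HashSet, Queue and PriorityQueue; LINKS and PRIORITY are made-up data replacing real pages, so nothing is fetched over the network. This is not the book's code.

```python
from collections import deque
import heapq

# Hypothetical in-memory "web": page -> outgoing links (stands in for real HTTP fetches).
LINKS = {
    "a": ["b", "c"],
    "b": ["c", "d"],
    "c": ["a"],
    "d": [],
}

# Hypothetical priority per page; a lower number is crawled earlier.
PRIORITY = {"a": 0, "b": 5, "c": 1, "d": 2}


def crawl_bfs(seed, links):
    """Breadth-first crawl: set = visited table, deque = FIFO queue of pages to visit."""
    visited = {seed}
    frontier = deque([seed])
    order = []
    while frontier:
        url = frontier.popleft()
        order.append(url)
        for nxt in links.get(url, []):
            if nxt not in visited:  # the set prevents visiting a page twice
                visited.add(nxt)
                frontier.append(nxt)
    return order


def crawl_by_priority(seed, links, priority):
    """Best-first crawl: heapq keeps the pending page with the lowest priority on top."""
    visited = {seed}
    frontier = [(priority[seed], seed)]
    order = []
    while frontier:
        _, url = heapq.heappop(frontier)
        order.append(url)
        for nxt in links.get(url, []):
            if nxt not in visited:
                visited.add(nxt)
                heapq.heappush(frontier, (priority[nxt], nxt))
    return order


print(crawl_bfs("a", LINKS))
print(crawl_by_priority("a", LINKS, PRIORITY))
```

The only difference between the two functions is the container holding the pending pages, so swapping the queue for a priority queue changes the visiting order without touching the rest of the loop.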