More Cost-Effective Technologies

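  No technology is inherently better or worse than another, and different languages can build the same thing. My point is that, compared with C++ and Java, some tools simply get us to certain goals more efficiently. Here are the ones that have impressed me most: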

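  1. Use HTML5 to do what an app does. For example, Evan You (尤雨溪), a Chinese student in the US, built an HTML5 version of Clear in two days. The work itself is not complicated, but the personal PR payoff was huge. It is a clever shortcut with a remarkably high return on effort.
  2. Use scripts to do things that look very sophisticated, such as a web crawler. The convenience a language like Ruby gives us is hard to match. I don't know how crawling was done before, but how little Ruby script it takes genuinely surprised me. The script below uses the mechanize gem to page through a search form on datamirror.csdb.cn and download every file listed.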

    require 'rubygems'
    require 'mechanize'
    require 'pp'

    # Search results page that lists the files to download.
    LIST_URL = "http://datamirror.csdb.cn/list.dem?opType=search&type=gdem&mode=zb&txtMaxX=140&txtMinX=70&txtMaxY=60&txtMinY=10"

    # Running count of downloaded files.
    @num = 0

    # Download every file listed on one result page and log its ID to `file`.
    def parse_page(agent, page, file)
      raise "error" unless page
      table = page.search('table#gdem_list')
      puts "==Start Parse"
      raise "empty" if table.empty?
      table = table.first

      rows = table.search('tbody.tbody').search('tr')
      rows.each do |row|
        # The second cell holds the ID after an underscore; keep that part.
        id = row.search('td')[1].inner_text.strip
        id = id.split('_')[1]
        puts "== saving #{id}"
        agent.get("http://datamirror.csdb.cn/gdemDownload.dem?id=#{id}&fileType=gdem_utm&type=gdem_utm").save_as
        @num += 1
        puts "saved num:#{@num} id:#{id}"
        file.write(id + ",")
      end
    end

    # Dump a page and its size, for debugging.
    def print_page(page)
      pp page
      puts page.to_s.length
    end

    puts "Starting Grab.rb"
    file = File.open("test.txt", 'w')

    agent = Mechanize.new
    agent.user_agent = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_2) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.56 Safari/535.11"
    #agent.keep_alive = true

    # Reuse an existing browser session by setting its cookie by hand.
    cookie = Mechanize::Cookie.new("JSESSIONID", "BA58528B76124698AD033EE6DF12B986:-1")
    cookie.domain = "datamirror.csdb.cn"
    cookie.path = "/"
    agent.cookie_jar.add!(cookie)

    puts "Getting page for the first time"
    page = agent.get LIST_URL
    print_page(page)

    puts "Configuring FORM"
    form_sel = page.form_with(:name => "listForm")
    raise "listForm not found" unless form_sel
    puts form_sel

    # Form fields observed in the browser's request:
    # maxRows:80
    # maxRows:20
    # gdem_list_tr_:true
    # gdem_list_p_:1
    # gdem_list_mr_:80
    form_sel.maxRows = 80
    form_sel["gdem_list_tr_"] = true
    form_sel["gdem_list_mr_"] = 80

    puts "==Start Set Value of Each Select"
    # Walk result pages 1..35, resubmitting the form with a new page number each time.
    (1..35).each do |page_no|
      form_sel["gdem_list_p_"] = page_no
      puts "=========Posting FORM========="

      new_page = form_sel.submit
      pp new_page
      parse_page(agent, new_page, file)
      puts "Posted FORM"
    end
    file.close
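  3. Build browser extensions. For example, an extension that logs in automatically to the train-ticket site: the difficulty is low, but the impact is very large.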

    https://github.com/zzdhidden/12306
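
Going back to the crawler in item 2: the step that turns a table cell into a download URL can be tried without touching the network. This is only a minimal sketch using Ruby's standard library. The helper names `extract_id` and `download_url` and the sample cell texts are hypothetical, made up for illustration; the endpoint and query parameters are the ones from the script.

```ruby
require 'uri'

# Download endpoint, taken from the crawler script in item 2.
DOWNLOAD_URL = "http://datamirror.csdb.cn/gdemDownload.dem"

# Pull the ID out of a table cell's text (hypothetical helper).
# Assumes the text looks like "<prefix>_<id>"; returns nil when
# there is no second underscore-separated part.
def extract_id(cell_text)
  cell_text.strip.split('_')[1]
end

# Build the download URL for one ID, escaping the query parameters
# (hypothetical helper).
def download_url(id)
  query = URI.encode_www_form(id: id, fileType: "gdem_utm", type: "gdem_utm")
  "#{DOWNLOAD_URL}?#{query}"
end

# Sample cell texts, invented for illustration only.
cells = ["  ASTGTM_N39E116 ", "no-underscore"]
cells.each do |text|
  id = extract_id(text)
  if id.nil?
    puts "skipping #{text.inspect}: no ID found"
  else
    puts download_url(id)
  end
end
```

Compared with interpolating the ID straight into the URL string, going through `URI.encode_www_form` keeps the query valid even if an ID ever contains a character that needs escaping, and the `nil` check avoids requesting a URL with an empty ID when a row does not match the expected shape.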



posted @ 2012-02-23 09:54  stoned