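Scraping data with Scrapy, PhantomJS, and Selenium
1. Installing PhantomJS
Download: http://phantomjs.org/download.html
Extract the archive (the rename step below expects it under /usr/local):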
tar -jxvf phantomjs-2.1.1-linux-x86_64.tar.bz2
Rename the directory:
mv /usr/local/phantomjs-2.1.1-linux-x86_64/ /usr/local/phantomjs
Create a symlink so that phantomjs is on the PATH:
ln -s /usr/local/phantomjs/bin/phantomjs /usr/bin/
[root@izuf622gt8apcfsz7i1mqdz /]# phantomjs
phantomjs>
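The phantomjs> prompt above shows that the symlink works from an interactive shell. The same check can be sketched in Python, which is useful before starting a crawl. The helper name find_executable is hypothetical; shutil.which is part of the standard library.

```python
import shutil


def find_executable(name="phantomjs"):
    """Return the absolute path of `name` if it is on PATH, else None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_executable()
    if path is None:
        # The symlink step (ln -s ... /usr/bin/) is the most likely thing to re-check
        print("phantomjs not found on PATH")
    else:
        print("phantomjs found at", path)
```

If the lookup returns None, Selenium will most likely fail to start PhantomJS as well, since by default it looks for the binary on the PATH.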
2. Installing Selenium
Install with pip: pip install selenium
Usage (in a Scrapy downloader middleware):
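Note that webdriver.PhantomJS was deprecated in Selenium 3.8 and removed in Selenium 4, so a plain pip install selenium may pull a version without it. A minimal sketch for checking the installed major version follows; the function names installed_major_version and supports_phantomjs are hypothetical helpers, not part of Selenium.

```python
from importlib import metadata


def installed_major_version(package):
    """Return the major version of an installed package, or None if it is absent."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
    return int(version.split(".")[0])


def supports_phantomjs(major):
    """webdriver.PhantomJS is only available before Selenium 4."""
    return major is not None and major < 4


if __name__ == "__main__":
    major = installed_major_version("selenium")
    print("selenium major version:", major)
    print("webdriver.PhantomJS available:", supports_phantomjs(major))
```

If the check reports False, one option is to install a 3.x release instead, for example pip install "selenium<4".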
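from scrapy.http import HtmlResponse
from selenium import webdriver

# Method of a Scrapy downloader middleware class.
# webdriver.PhantomJS requires Selenium 3.x; it was removed in Selenium 4.
def process_request(self, request, spider):
    driver = webdriver.PhantomJS()
    # driver = webdriver.Chrome()
    try:
        driver.get(request.url)
        # Enter the stock code in the search box and submit the form
        input_first = driver.find_element_by_id('stockID_')
        input_first.clear()
        input_first.send_keys('000150')
        driver.find_element_by_id('button').click()
        # driver.switch_to.frame('i_nr')
        # print("Visited:", driver.page_source)
        # Read the page after the click so the response holds the query result
        body = driver.page_source
        url = driver.current_url
    finally:
        driver.quit()  # shut down the PhantomJS process started for this request
    # Returning a response here makes Scrapy skip its own download
    return HtmlResponse(url, body=body, encoding='utf-8', request=request)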
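For Scrapy to call process_request, the class that contains it has to be registered as a downloader middleware in the project settings. The fragment below is a sketch only: the module path myproject.middlewares and the class name PhantomJSMiddleware are hypothetical, and 543 is just the priority used in Scrapy's default project template.

```python
# settings.py
DOWNLOADER_MIDDLEWARES = {
    # Hypothetical path: replace with the module and class that define process_request
    "myproject.middlewares.PhantomJSMiddleware": 543,
}
```

Because process_request returns an HtmlResponse, Scrapy does not download the page itself and passes the rendered response on to the spider callback.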