Scraping Table Data with Pandas

A small example:

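import pandas as pd

# Read the first table from a local, GBK-encoded HTML file; row 0 holds the column names
df = pd.read_html(r"./素材/股票.html", encoding='GBK', header=0)[0]
# Show only the stock code ('代码') and short name ('简称') columns
print(df[['代码', '简称']])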
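To see what pd.read_html() returns without needing the local file, here is a minimal sketch that parses a small in-memory table. The HTML snippet and its rows are made up for illustration; only the column names '代码' and '简称' come from the example above. The converters argument is an optional extra, assuming you want stock codes kept as strings so that leading zeros survive.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-in for the real page: one table with a header row
html = """
<table>
  <tr><th>代码</th><th>简称</th><th>价格</th></tr>
  <tr><td>000001</td><td>示例A</td><td>10.5</td></tr>
  <tr><td>600000</td><td>示例B</td><td>20.0</td></tr>
</table>
"""

# read_html returns a list with one DataFrame per <table> it finds
tables = pd.read_html(StringIO(html), header=0, converters={'代码': str})
df = tables[0]

# Keep only the code and short-name columns, as in the example above
subset = df[['代码', '简称']]
print(subset)
```

Because read_html always returns a list, the trailing [0] in the original example simply picks the first table on the page.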

Reference:

https://www.cnblogs.com/amingcn/p/15271774.html 

 

 

Batch scraping

There are 47 pages in total. Build the 47 page URLs in a for loop, then call pd.read_html() on each one and merge the results.

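import pandas as pd

frames = []
for i in range(1, 48):  # pages 1 to 47
    url = f'http://vip.stock.finance.sina.com.cn/q/go.php/vComStockHold/kind/jgcg/index.phtml?p={i}'
    frames.append(pd.read_html(url)[0])  # scrape the first table on each page
df = pd.concat(frames, ignore_index=True)  # merge all pages into one DataFrame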
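If one of the 47 requests fails, the plain loop stops partway through. The following is a rough sketch of a more forgiving variant, not the original author's method: the names scrape_pages, read_first_table, and the delay parameter are hypothetical helpers introduced here. It records the pages that failed instead of aborting, and optionally pauses between requests.

```python
import time

import pandas as pd

BASE_URL = ('http://vip.stock.finance.sina.com.cn/q/go.php/'
            'vComStockHold/kind/jgcg/index.phtml?p={}')


def read_first_table(url):
    """Default reader: return the first table found on the page."""
    return pd.read_html(url)[0]


def scrape_pages(first, last, read_page=read_first_table, delay=0.0):
    """Scrape pages first..last (inclusive) and merge them.

    Returns (merged DataFrame, list of (page, exception) for failed pages).
    """
    frames, failed = [], []
    for page in range(first, last + 1):
        url = BASE_URL.format(page)
        try:
            frames.append(read_page(url))
        except Exception as exc:  # e.g. network error or no table on the page
            failed.append((page, exc))
        if delay:
            time.sleep(delay)  # optional pause between requests
    # Concatenate once at the end; an empty result yields an empty DataFrame
    merged = pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()
    return merged, failed
```

A call such as df, failed = scrape_pages(1, 47, delay=1.0) would cover the same 47 pages; the pages listed in failed can then be retried separately.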

 

posted @ 2022-10-21 17:27  yongqi-911