Learning Python Web Scraping: Using urllib.request and Regular Expressions to Scrape Job Postings

1: Use urllib.request and regular expressions to scrape job postings and write them to a local file

 

# coding: utf-8

import re
import urllib.request

# Extract data from web pages with urllib and re.

# Earlier experiment, kept disabled inside a string literal: fetch a single
# results page and print every job title found on it.
'''
url = 'https://search.51job.com/list/020000,000000,0124,01,9,99,%2520,2,1.html?lang=c&stype=&postchannel=0000&workyear=99&cotype=99&degreefrom=99&jobterm=99&companysize=99&providesalary=99&lonlat=0%2C0&radius=-1&ord_field=0&confirmdate=9&fromType=&dibiaoid=0&address=&line=&specialarea=00&from=&w'
# Alternative using the third-party requests library (needs "import requests"):
# response = requests.get(url)
# response.encoding = 'gbk'
# wbdata = response.text

wbdata = urllib.request.urlopen(url).read().decode('gbk')
# print(len(wbdata))

pat = '<a target="_blank" title="(.*?)"'
data = re.compile(pat).findall(wbdata)
# print(data)

# Write to a file
# with open('jobs.txt', 'w') as f:
#     for k in range(len(data)):
#         print(data[k])
#         f.write(data[k] + '\n')

# Print to the console
for k in range(len(data)):
    print(data[k])
'''
print("--" * 20)
# Timeout handling
# for i in range(0, 20):
#     try:
#         file = urllib.request.urlopen("http://baidu.com", timeout=0.2).read().decode('gbk')
#         print(len(file))
#     except Exception as err:
#         print("Exception: the page may have timed out! " + str(err))

# GET request in practice: fetch job postings from 51job
keywd = "Python"
# Captures (title, link, salary) for each posting; used with re.S so that
# ".*?" can span line breaks inside a <div class="el"> block.
pat1 = '<div class="el">.*?title="(.*?)" href="(http.*?)".*?<span class="t4">(.*?)</span>.*?</div>'
pat2 = '<span class="t4">(.*?)</span>'

# Percent-encode the keyword if it is not plain ASCII:
# keywd = urllib.request.quote(keywd)

# Open the output file once instead of reopening it for every record.
with open('jobs.txt', 'a', encoding='utf-8') as f:
    for i in range(1, 11):
        url = "https://search.51job.com/list/020000,000000,0000,00,9,99," + keywd + ",2," + str(i) + ".html"
        file = urllib.request.urlopen(url)
        # print(file.geturl())
        data = file.read().decode('gbk')
        # Page separator; the Chinese text reads "page i".
        print("----------------第" + str(i) + "页-----------------")
        rst1 = re.compile(pat1, re.S).findall(data)
        # rst2 = re.compile(pat2).findall(data)
        # rst = list(zip(rst1, rst2))
        for job in rst1:
            print(job)
            f.write(str(job) + '\n')

        # rst2 = re.compile(pat2).findall(data)
        # for z in range(0, len(rst2)):
        #     print(rst2[z])
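The extraction step can be tried offline, without hitting the site. The sketch below applies the same pat1 pattern with re.S to a small HTML sample and also shows percent-encoding a keyword before it goes into the URL. The sample markup is invented for illustration (it only mimics the attributes the pattern expects, not the real 51job page), build_url and extract_jobs are hypothetical helper names, and whether the site accepts UTF-8 percent-encoded keywords is an assumption.

```python
import re
from urllib.parse import quote

# Same pattern as pat1 in the script above.
PAT1 = ('<div class="el">.*?title="(.*?)" href="(http.*?)".*?'
        '<span class="t4">(.*?)</span>.*?</div>')

# Invented sample markup (an assumption, not the real 51job page).
SAMPLE_HTML = '''
<div class="el">
    <p class="t1">
        <a target="_blank" title="Python Developer" href="https://example.com/job/1.html">Python Developer</a>
    </p>
    <span class="t4">1-1.5万/月</span>
</div>
<div class="el">
    <p class="t1">
        <a target="_blank" title="Data Engineer" href="https://example.com/job/2.html">Data Engineer</a>
    </p>
    <span class="t4"></span>
</div>
'''


def build_url(keyword, page):
    """Build a results-page URL with a percent-encoded keyword (hypothetical helper)."""
    return ("https://search.51job.com/list/020000,000000,0000,00,9,99,"
            + quote(keyword) + ",2," + str(page) + ".html")


def extract_jobs(html):
    """Return a list of (title, link, salary) tuples found in html."""
    # re.S lets "." match newlines, so one match can span a multi-line block.
    return re.compile(PAT1, re.S).findall(html)


jobs = extract_jobs(SAMPLE_HTML)
for job in jobs:
    print(job)
print(build_url("Python", 3))
```

Without re.S the pattern would find nothing in this sample, because the title, link, and salary sit on different lines. A posting with an empty salary span still matches and yields an empty string in the third position.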

 

2: The scraped results are as follows

----------------第1页-----------------
('自动化测试工程师Selenium', 'https://jobs.51job.com/shanghai-ypq/114603381.html?s=01&t=5', '1-1.5万/月')
('大数据研发工程师', 'https://jobs.51job.com/shanghai/67963188.html?s=01&t=6', '')
('Python爬虫工程师', 'https://jobs.51job.com/shanghai-pdxq/121129060.html?s=01&t=0', '1-1.5万/月')
('Python高级开发工程师', 'https://jobs.51job.com/shanghai-pdxq/114332244.html?s=01&t=0', '2-4万/月')
('Python爬虫工程师', 'https://jobs.51job.com/shanghai/120028078.html?s=01&t=0', '1-1.5万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-xhq/119981428.html?s=01&t=0', '1-1.5万/月')
('python开发工程师/大数据建模', 'https://jobs.51job.com/shanghai-hpq/114718480.html?s=01&t=0', '6-8千/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai/120395604.html?s=01&t=0', '1.2-1.5万/月')
('C/C++/Python开发工程师', 'https://jobs.51job.com/shanghai-xhq/120909208.html?s=01&t=0', '15-20万/年')
('Python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/89716807.html?s=01&t=0', '1-1.5万/月')
('erlang/python服务器开发工程师', 'https://jobs.51job.com/shanghai-xhq/98416948.html?s=01&t=0', '1.5-3.5万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/120657799.html?s=01&t=0', '1.5-2万/月')
('Python开发工程师 (MJ000231)', 'https://jobs.51job.com/shanghai-jaq/117808653.html?s=01&t=0', '0.6-1万/月')
('python开发工程师-A0122', 'https://jobs.51job.com/shanghai-jaq/119864919.html?s=01&t=0', '1-1.5万/月')
('高级python后端工程师(AI平台)', 'https://jobs.51job.com/shanghai-ypq/120959109.html?s=01&t=0', '1.5-2万/月')
('初级Python工程师', 'https://jobs.51job.com/shanghai-pdxq/116357032.html?s=01&t=0', '10-15万/年')
('Python开发工程师', 'https://jobs.51job.com/shanghai/115980776.html?s=01&t=0', '')
('python数据分析', 'https://jobs.51job.com/shanghai-bsq/120583326.html?s=01&t=0', '1.5-2.5万/月')
('Python开发工程师', 'https://jobs.51job.com/hefei/121179694.html?s=01&t=0', '0.8-1.2万/月')
('Python开发专家', 'https://jobs.51job.com/shanghai-mhq/121173046.html?s=01&t=0', '3-3.5万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-jaq/120911076.html?s=01&t=0', '1-1.5万/月')
('python Web开发工程师', 'https://jobs.51job.com/shanghai-pdxq/120283479.html?s=01&t=0', '1-1.3万/月')
('python web后台开发', 'https://jobs.51job.com/shanghai-cnq/119461330.html?s=01&t=0', '0.6-1.5万/月')
('Python开发工程师(金融科技)', 'https://jobs.51job.com/shanghai-xhq/117251350.html?s=01&t=0', '0.7-1.5万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-ypq/107196999.html?s=01&t=0', '0.8-1.5万/月')
('Python高级软件工程师', 'https://jobs.51job.com/shanghai-mhq/120966514.html?s=01&t=0', '21-35万/年')
('P0076-python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/121163335.html?s=01&t=0', '1-2万/月')
('Python 应用开发工程师', 'https://jobs.51job.com/shanghai-cnq/121160736.html?s=01&t=0', '0.7-1.3万/月')
('软件开发工程师(GO/Lua/python)', 'https://jobs.51job.com/shanghai-jaq/119529417.html?s=01&t=0', '1.5-2.5万/月')
('运维开发工程师(Python开发)', 'https://jobs.51job.com/shanghai/119187608.html?s=01&t=0', '2.5-4万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/98528369.html?s=01&t=0', '1-1.5万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-jaq/120386104.html?s=01&t=0', '1-2.2万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-mhq/118338654.html?s=01&t=0', '0.8-1万/月')
('PFS-Python Developer', 'http://durrgroup.51job.com/jobinfo1.html?id=120187451', '')
('python数据分析师', 'https://jobs.51job.com/shanghai-pdxq/112471902.html?s=01&t=0', '1-1.5万/月')
('25923-Python高级工程师(深圳)', 'https://jobs.51job.com/shanghai-hpq/118208611.html?s=01&t=0', '')
('硬件工程师(python)', 'https://jobs.51job.com/shanghai/120159337.html?s=01&t=0', '')
('Python开发工程师', 'https://jobs.51job.com/shanghai-ypq/121106633.html?s=01&t=0', '1.3-1.9万/月')
('Python后端开发工程师', 'https://jobs.51job.com/shanghai-xhq/118007556.html?s=01&t=0', '1.2-1.8万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/120944265.html?s=01&t=0', '1-1.5万/月')
('Python全栈工程师', 'https://jobs.51job.com/shanghai-mhq/109925368.html?s=01&t=0', '0.8-1.6万/月')
('python开发', 'https://jobs.51job.com/shanghai-hkq/117623816.html?s=01&t=0', '2-3万/月')
('高级Python/Django后端软件工程师', 'https://jobs.51job.com/shanghai-cnq/109764789.html?s=01&t=0', '1.5-2万/月')
('软件工程师(汽车行业优先)精通python', 'https://jobs.51job.com/shanghai-pdxq/120189871.html?s=01&t=0', '0.8-2万/月')
('Python 爬虫工程师(薪智)', 'https://jobs.51job.com/shanghai-mhq/119329837.html?s=01&t=0', '1.5-2万/月')
('高级Python开发工程师', 'https://jobs.51job.com/shanghai/105294644.html?s=01&t=0', '2.5-5万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai/101535573.html?s=01&t=0', '1.5-2万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/115644674.html?s=01&t=0', '1.5-2.2万/月')
('Python开发经理', 'https://jobs.51job.com/shanghai-pdxq/114043106.html?s=01&t=0', '2-2.5万/月')
('Python高级开发工程师', 'https://jobs.51job.com/shanghai-pdxq/118673386.html?s=01&t=0', '1.8-3万/月')
('高级Python开发工程师', 'https://jobs.51job.com/shanghai-mhq/98639401.html?s=01&t=0', '1.5-2.5万/月')
('python开发', 'https://jobs.51job.com/shanghai-hkq/117622468.html?s=01&t=0', '2-3万/月')
----------------第2页-----------------
('python工程师', 'https://jobs.51job.com/shanghai-sjq/120620177.html?s=01&t=0', '0.8-2万/月')
('Python(Odoo)工程师', 'https://jobs.51job.com/shanghai-pdxq/119746269.html?s=01&t=0', '1-1.5万/月')
('Python/Odoo高级开发工程师', 'https://jobs.51job.com/shanghai/116881344.html?s=01&t=0', '2.5-3万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/116927659.html?s=01&t=0', '0.8-1.5万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-cnq/120895491.html?s=01&t=0', '1-1.5万/月')
('Python高级开发工程师', 'https://jobs.51job.com/shanghai-pdxq/120349450.html?s=01&t=0', '1.5-2万/月')
('中级后端工程师(python/odoo)', 'https://jobs.51job.com/shanghai-sjq/107147430.html?s=01&t=0', '1.5-2万/月')
('实习生(Python开发)', 'https://jobs.51job.com/shanghai/119951678.html?s=01&t=0', '1-5千/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-ypq/120474003.html?s=01&t=0', '0.8-1.2万/月')
('Python后端开发', 'https://jobs.51job.com/shanghai-xhq/121023917.html?s=01&t=0', '1.5-2.5万/月')
('Python软件工程师', 'https://jobs.51job.com/shanghai-xhq/117038800.html?s=01&t=0', '1-1.5万/月')
('初级python/R 工程师', 'https://jobs.51job.com/shanghai-xhq/116055667.html?s=01&t=0', '0.5-1万/月')
('python', 'https://jobs.51job.com/shanghai-pdxq/120778793.html?s=01&t=0', '6.5-9.5千/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/120941553.html?s=01&t=0', '0.8-1.5万/月')
('python开发项目经理', 'https://jobs.51job.com/shanghai-ypq/117944790.html?s=01&t=0', '2.6-4万/月')
('Python/PHP后端程序员', 'https://jobs.51job.com/shanghai-xhq/114897348.html?s=01&t=0', '1-1.5万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/119493296.html?s=01&t=0', '15-25万/年')
('Python开发工程师(外汇岗位)', 'https://jobs.51job.com/shanghai-pdxq/120797741.html?s=01&t=0', '1-1.6万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/112330285.html?s=01&t=0', '1-1.7万/月')
('Python开发工程师(***)', 'https://jobs.51job.com/shanghai/120694247.html?s=01&t=0', '1000元/天')
('python开发', 'https://jobs.51job.com/shanghai-hkq/117614339.html?s=01&t=0', '2-3万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/118308129.html?s=01&t=0', '1.1-2万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/119981957.html?s=01&t=0', '1.5-2万/月')
('Python开发工程师', 'https://jobs.51job.com/shenzhen/118443903.html?s=01&t=0', '3-4万/月')
('Python高级开发工程师', 'https://jobs.51job.com/shanghai-sjq/120173104.html?s=01&t=0', '1.5-2万/月')
('python 工程师', 'https://jobs.51job.com/shanghai-xhq/120570442.html?s=01&t=0', '1.5-2万/月')
('Python高级开发工程师', 'https://jobs.51job.com/shanghai-mhq/120105386.html?s=01&t=0', '1.5-3万/月')
('软件开发工程师(Python)', 'https://jobs.51job.com/shenzhen-nsq/118492627.html?s=01&t=0', '0.6-1.2万/月')
('python开发', 'https://jobs.51job.com/shanghai-pdxq/118924590.html?s=01&t=0', '6-8千/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-xhq/115691023.html?s=01&t=0', '1.3-1.8万/月')
('Python工程师', 'https://jobs.51job.com/shanghai-cnq/119488897.html?s=01&t=0', '1.5-2万/月')
('python开发', 'https://jobs.51job.com/shanghai-xhq/115310459.html?s=01&t=0', '6-8千/月')
('大数据算法开发/Python开发', 'https://jobs.51job.com/shanghai-pdxq/120055740.html?s=01&t=0', '1.5-2万/月')
('Python开发(09)', 'https://jobs.51job.com/shanghai-pdxq/120786708.html?s=01&t=0', '0.8-1.1万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/120472220.html?s=01&t=0', '1-1.5万/月')
('Python工程师', 'https://jobs.51job.com/shanghai/119281603.html?s=01&t=0', '2-3.5万/月')
('Python高级开发工程师', 'https://jobs.51job.com/shanghai-hkq/119893009.html?s=01&t=0', '1.5-2万/月')
('Python运维开发工程师', 'https://jobs.51job.com/shanghai-mhq/119681490.html?s=01&t=0', '1.5-2万/月')
('Python工程师', 'https://jobs.51job.com/shanghai/102367533.html?s=01&t=0', '1.5-2万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-ypq/120946896.html?s=01&t=0', '1.5-2万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-hpq/109016702.html?s=01&t=0', '2.3-2.8万/月')
('Senior Python Software Engineer', 'https://jobs.51job.com/shanghai/119066163.html?s=01&t=0', '1.5-3万/月')
('高级软件工程师  Golang/Python', 'https://jobs.51job.com/shanghai-cnq/120220700.html?s=01&t=0', '2-6万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai/114088389.html?s=01&t=0', '1-1.5万/月')
('Python 开发工程师', 'https://jobs.51job.com/shanghai-pdxq/120139836.html?s=01&t=0', '0.8-1.5万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/115129775.html?s=01&t=0', '1.5-2万/月')
('python开发', 'https://jobs.51job.com/shanghai-hkq/117611952.html?s=01&t=0', '2-3万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/110125924.html?s=01&t=0', '1-1.5万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-xhq/118610557.html?s=01&t=0', '1-2万/月')
('Python 架构', 'https://jobs.51job.com/shanghai-pdxq/120317481.html?s=01&t=0', '2.5-3.5万/月')
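Each record above is the string form of a Python 3-tuple (title, link, salary), which is also what the script writes to jobs.txt. If a spreadsheet-friendly file is wanted, one possible approach is to parse each line back into a tuple with ast.literal_eval and write it out with the csv module. This is only a sketch: tuple_lines_to_csv and the column names are hypothetical, and the two sample records are copied from the listing above.

```python
import ast
import csv
import io


def tuple_lines_to_csv(lines, out):
    """Parse lines holding tuple literals and write them to out as CSV rows.

    Returns the number of records written (hypothetical helper).
    """
    writer = csv.writer(out)
    writer.writerow(["title", "url", "salary"])  # assumed column names
    count = 0
    for line in lines:
        line = line.strip()
        if not line.startswith("("):
            continue  # skip blank lines and page separators
        # literal_eval only evaluates Python literals, so it is safer than eval.
        writer.writerow(ast.literal_eval(line))
        count += 1
    return count


sample = [
    "----------------第1页-----------------",
    "('Python爬虫工程师', 'https://jobs.51job.com/shanghai-pdxq/121129060.html?s=01&t=0', '1-1.5万/月')",
    "('大数据研发工程师', 'https://jobs.51job.com/shanghai/67963188.html?s=01&t=6', '')",
]
buffer = io.StringIO()
written = tuple_lines_to_csv(sample, buffer)
print(written)
print(buffer.getvalue())
```

To convert a real file, pass an open file object instead of the sample list, opened with the same encoding that was used when jobs.txt was written, and replace the StringIO buffer with a file opened with newline=''.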

 

Posted 2020-04-08 09:19 by MorePrograms