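The BeautifulSoup Library
1. Installing BeautifulSoup
First check which packages are already installed in the current Python environment (for example with pip list).
Then install BeautifulSoup, for example with pip install beautifulsoup4.
2. Official BeautifulSoup documentation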
https://www.crummy.com/software/BeautifulSoup/bs4/doc/
3. BeautifulSoup's main parsers
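The parser is chosen by the second argument to the BeautifulSoup constructor. The sketch below is only an illustration with a made-up snippet (markup and bold_text are hypothetical names); it uses "html.parser" because that parser ships with Python, which is also what every example in this post uses.

```python
from bs4 import BeautifulSoup

# Hypothetical snippet, not taken from the page used in this post
markup = "<p>Hello <b>world</b></p>"

# "html.parser" is Python's built-in parser, so nothing extra has to be installed
soup = BeautifulSoup(markup, "html.parser")
bold_text = soup.b.string
print(bold_text)

# Third-party parsers are selected the same way once their packages are installed,
# e.g. BeautifulSoup(markup, "lxml") or BeautifulSoup(markup, "html5lib")
```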
4. BeautifulSoup's find functions
Finding the title of the html

    from bs4 import BeautifulSoup

    html = """
    <html lang="en">
    <head>
        <meta charset="UTF-8">
        <title>bobby基本信息</title>
        <script src="jquery-3.5.1.min.js"></script>
    </head>
    <body>
        <div id="info">
            <p style="color: blue">讲师信息</p>
            <div class="teacher_info">
                Python全栈工程师
                <p class="age">年龄:29</p>
                <p class="name">姓名:bobby</p>
                <p class="work_years">工作年限:7年</p>
                <p class="position">职位:python开发工程师</p>
            </div>
            <p style="color:aquamarine">课程信息</p>
            <table class="courses">
                <tbody><tr><th>课程名称</th>
                    <th>讲师</th>
                    <th>地址</th>
                </tr><tr>
                    <td>django打造在线教育</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/78.html">访问</a></td>
                </tr><tr>
                    <td>python高级编程</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/200.html">访问</a></td>
                </tr><tr>
                    <td>scrapy分布式爬虫</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/92.html">访问</a></td>
                </tr><tr>
                    <td>diango rest framework打造生鲜电商</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/131.html">访问</a></td>
                </tr><tr>
                    <td>tornado从入门到精通</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/290.html">访问</a></td>
                </tr></tbody></table>
        </div>
    </body>
    </html>
    """

    bs = BeautifulSoup(html, "html.parser")

    # .string returns the text inside the <title> tag
    title_tag = bs.title.string
    print(title_tag)

    # Dot access returns only the first matching element
    div_tag1 = bs.title
    print("div_tag1:" + str(div_tag1))

Output: the title text (bobby基本信息), then div_tag1: followed by the whole <title> tag.
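Dot access is convenient, but it has two properties worth keeping in mind: it returns only the first match, and it returns None when the tag does not exist. A minimal sketch, assuming a small made-up document (page, first_p, missing and title_text are illustrative names):

```python
from bs4 import BeautifulSoup

# Hypothetical document with two <p> tags and no <table>
page = "<html><head><title>demo</title></head><body><p>a</p><p>b</p></body></html>"
soup = BeautifulSoup(page, "html.parser")

first_p = soup.p        # only the first <p> is returned
print(first_p.string)

missing = soup.table    # there is no <table>, so this is None
print(missing)

# Guard against a missing tag before reading .string
title_text = soup.title.string if soup.title else ""
print(title_text)
```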
Finding a div element in the html

    from bs4 import BeautifulSoup

    html = """
    <html lang="en">
    <head>
        <meta charset="UTF-8">
        <title>bobby基本信息</title>
        <script src="jquery-3.5.1.min.js"></script>
    </head>
    <body>
        <div id="info">
            <p style="color: blue">讲师信息</p>
            <div class="teacher_info">
                Python全栈工程师
                <p class="age">年龄:29</p>
                <p class="name">姓名:bobby</p>
                <p class="work_years">工作年限:7年</p>
                <p class="position">职位:python开发工程师</p>
            </div>
            <p style="color:aquamarine">课程信息</p>
            <table class="courses">
                <tbody><tr><th>课程名称</th>
                    <th>讲师</th>
                    <th>地址</th>
                </tr><tr>
                    <td>django打造在线教育</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/78.html">访问</a></td>
                </tr><tr>
                    <td>python高级编程</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/200.html">访问</a></td>
                </tr><tr>
                    <td>scrapy分布式爬虫</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/92.html">访问</a></td>
                </tr><tr>
                    <td>diango rest framework打造生鲜电商</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/131.html">访问</a></td>
                </tr><tr>
                    <td>tornado从入门到精通</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/290.html">访问</a></td>
                </tr></tbody></table>
        </div>
    </body>
    </html>
    """

    bs = BeautifulSoup(html, "html.parser")

    # find() returns the first matching tag
    div_tag2 = bs.find("div")
    print("div_tag2:" + str(div_tag2))

Output: div_tag2: followed by the first div in the document, that is, the outer <div id="info"> with everything it contains.
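Because find("div") stops at the first div, reaching the inner one needs an extra filter. The following sketch, on a reduced hypothetical snippet, shows two equivalent ways to filter by class and what find returns when nothing matches.

```python
from bs4 import BeautifulSoup

# Reduced, hypothetical version of the nested divs used in this post
page = '<div id="info"><div class="teacher_info">inner</div></div>'
soup = BeautifulSoup(page, "html.parser")

first_div = soup.find("div")                               # the outer div
by_keyword = soup.find("div", class_="teacher_info")       # class_ avoids the Python keyword
by_dict = soup.find("div", attrs={"class": "teacher_info"})
missing = soup.find("div", class_="no_such_class")         # no match gives None

print(first_div["id"], by_keyword.string, missing)
```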
Finding all p elements in the html

    from bs4 import BeautifulSoup

    html = """
    <html lang="en">
    <head>
        <meta charset="UTF-8">
        <title>bobby基本信息</title>
        <script src="jquery-3.5.1.min.js"></script>
    </head>
    <body>
        <div id="info">
            <p style="color: blue">讲师信息</p>
            <div class="teacher_info">
                Python全栈工程师
                <p class="age">年龄:29</p>
                <p class="name">姓名:bobby</p>
                <p class="work_years">工作年限:7年</p>
                <p class="position">职位:python开发工程师</p>
            </div>
            <p style="color:aquamarine">课程信息</p>
            <table class="courses">
                <tbody><tr><th>课程名称</th>
                    <th>讲师</th>
                    <th>地址</th>
                </tr><tr>
                    <td>django打造在线教育</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/78.html">访问</a></td>
                </tr><tr>
                    <td>python高级编程</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/200.html">访问</a></td>
                </tr><tr>
                    <td>scrapy分布式爬虫</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/92.html">访问</a></td>
                </tr><tr>
                    <td>diango rest framework打造生鲜电商</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/131.html">访问</a></td>
                </tr><tr>
                    <td>tornado从入门到精通</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/290.html">访问</a></td>
                </tr></tbody></table>
        </div>
    </body>
    </html>
    """

    bs = BeautifulSoup(html, "html.parser")

    # find_all() returns every matching element
    div_tag3 = bs.find_all("p")
    print("p:" + str(div_tag3))
    for p in div_tag3:
        print(p.string)

Output: p: followed by the list of all six <p> tags, then the text of each <p> on its own line.
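The loop above works because every p contains exactly one piece of text. As a hedged illustration on a made-up snippet: .string is None as soon as a tag has more than one child, while get_text() joins all nested text, and limit caps the number of results.

```python
from bs4 import BeautifulSoup

# Hypothetical snippet: the second <p> contains text plus a nested <b>
page = "<div><p>one</p><p>two <b>bold</b></p><p>three</p></div>"
soup = BeautifulSoup(page, "html.parser")

paragraphs = soup.find_all("p")
strings = [p.string for p in paragraphs]      # None where a tag has several children
texts = [p.get_text() for p in paragraphs]    # concatenates all nested text
first_two = soup.find_all("p", limit=2)       # stop after two matches

print(strings)
print(texts)
print(len(first_two))
```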
Finding an element by id

    from bs4 import BeautifulSoup

    html = """
    <html lang="en">
    <head>
        <meta charset="UTF-8">
        <title>bobby基本信息</title>
        <script src="jquery-3.5.1.min.js"></script>
    </head>
    <body>
        <div id="info">
            <p style="color: blue">讲师信息</p>
            <div class="teacher_info">
                Python全栈工程师
                <p class="age">年龄:29</p>
                <p class="name">姓名:bobby</p>
                <p class="work_years">工作年限:7年</p>
                <p class="position">职位:python开发工程师</p>
            </div>
            <p style="color:aquamarine">课程信息</p>
            <table class="courses">
                <tbody><tr><th>课程名称</th>
                    <th>讲师</th>
                    <th>地址</th>
                </tr><tr>
                    <td>django打造在线教育</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/78.html">访问</a></td>
                </tr><tr>
                    <td>python高级编程</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/200.html">访问</a></td>
                </tr><tr>
                    <td>scrapy分布式爬虫</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/92.html">访问</a></td>
                </tr><tr>
                    <td>diango rest framework打造生鲜电商</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/131.html">访问</a></td>
                </tr><tr>
                    <td>tornado从入门到精通</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/290.html">访问</a></td>
                </tr></tbody></table>
        </div>
    </body>
    </html>
    """

    bs = BeautifulSoup(html, "html.parser")

    # Match on the id attribute alone
    div_tag4 = bs.find(id="info")
    print("div_tag4:" + str(div_tag4))

    # Match on tag name and id together; find_all() returns a list
    div_tag5 = bs.find_all("div", id="info")
    print("div_tag5:" + str(div_tag5))

Output: div_tag4: followed by the <div id="info"> element, then div_tag5: followed by the same element wrapped in square brackets, because find_all() returns a list.
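The difference in return types matters when the id is absent. A small sketch under the assumption of a one-element hypothetical page: find gives a single tag or None, find_all always gives a list, possibly empty.

```python
from bs4 import BeautifulSoup

# Hypothetical one-element page
page = '<div id="info">hello</div>'
soup = BeautifulSoup(page, "html.parser")

single = soup.find(id="info")               # a Tag
many = soup.find_all("div", id="info")      # a list-like ResultSet with one Tag
nothing = soup.find(id="absent")            # None
empty = soup.find_all(id="absent")          # empty list

print(single.string, len(many), nothing, len(empty))
```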
Matching elements with a regular expression

    import re

    from bs4 import BeautifulSoup

    html = """
    <html lang="en">
    <head>
        <meta charset="UTF-8">
        <title>bobby基本信息</title>
        <script src="jquery-3.5.1.min.js"></script>
    </head>
    <body>
        <div id="info-955">
            <p style="color: blue">讲师信息</p>
            <div class="teacher_info">
                Python全栈工程师
                <p class="age">年龄:29</p>
                <p class="name">姓名:bobby</p>
                <p class="work_years">工作年限:7年</p>
                <p class="position">职位:python开发工程师</p>
            </div>
            <p style="color:aquamarine">课程信息</p>
            <table class="courses">
                <tbody><tr><th>课程名称</th>
                    <th>讲师</th>
                    <th>地址</th>
                </tr><tr>
                    <td>django打造在线教育</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/78.html">访问</a></td>
                </tr><tr>
                    <td>python高级编程</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/200.html">访问</a></td>
                </tr><tr>
                    <td>scrapy分布式爬虫</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/92.html">访问</a></td>
                </tr><tr>
                    <td>diango rest framework打造生鲜电商</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/131.html">访问</a></td>
                </tr><tr>
                    <td>tornado从入门到精通</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/290.html">访问</a></td>
                </tr></tbody></table>
        </div>
    </body>
    </html>
    """

    bs = BeautifulSoup(html, "html.parser")

    # Raw string so that \d is passed to the regex engine unchanged
    div_tag = bs.find("div", id=re.compile(r"info-\d+"))
    print(div_tag)

Output: the <div id="info-955"> element with everything it contains.
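A compiled pattern passed as an attribute filter is applied as a search, so it can match anywhere inside the attribute value. The sketch below uses three made-up ids to show the difference between the loose pattern from the example and an anchored one.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical ids chosen to show partial matches
page = '<div id="info-955">a</div><div id="myinfo-1x">b</div><div id="info">c</div>'
soup = BeautifulSoup(page, "html.parser")

# Unanchored: matches any id that contains "info-<digits>" somewhere
loose = [d["id"] for d in soup.find_all("div", id=re.compile(r"info-\d+"))]

# Anchored: the whole id must be "info-<digits>"
strict = [d["id"] for d in soup.find_all("div", id=re.compile(r"^info-\d+$"))]

print(loose)
print(strict)
```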
Locating a node by its text

    from bs4 import BeautifulSoup

    html = """
    <html lang="en">
    <head>
        <meta charset="UTF-8">
        <title>bobby基本信息</title>
        <script src="jquery-3.5.1.min.js"></script>
    </head>
    <body>
        <div id="info-955">
            <p style="color: blue">讲师信息</p>
            <div class="teacher_info">
                Python全栈工程师
                <p class="age">年龄:29</p>
                <p class="name">姓名:bobby</p>
                <p class="work_years">工作年限:7年</p>
                <p class="position">职位:python开发工程师</p>
            </div>
            <p style="color:aquamarine">课程信息</p>
            <table class="courses">
                <tbody><tr><th>课程名称</th>
                    <th>讲师</th>
                    <th>地址</th>
                </tr><tr>
                    <td>django打造在线教育</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/78.html">访问</a></td>
                </tr><tr>
                    <td>python高级编程</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/200.html">访问</a></td>
                </tr><tr>
                    <td>scrapy分布式爬虫</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/92.html">访问</a></td>
                </tr><tr>
                    <td>diango rest framework打造生鲜电商</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/131.html">访问</a></td>
                </tr><tr>
                    <td>tornado从入门到精通</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/290.html">访问</a></td>
                </tr></tbody></table>
        </div>
    </body>
    </html>
    """

    bs = BeautifulSoup(html, "html.parser")

    # string= matches a text node, not a tag; the match must be exact
    div_tag = bs.find(string="django打造在线教育")
    print(div_tag)

Output: django打造在线教育
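The value returned by find(string=...) is a text node rather than a tag, so getting back to the surrounding element takes one more step. A minimal sketch with a hypothetical two-cell row (all names and text are illustrative):

```python
import re

from bs4 import BeautifulSoup

# Hypothetical table row
page = "<table><tr><td>django course</td><td>bobby</td></tr></table>"
soup = BeautifulSoup(page, "html.parser")

text_node = soup.find(string="django course")       # exact match on the full text
cell_name = text_node.parent.name                   # the tag that holds the text
partial = soup.find(string="django")                # not the full text, so None
by_regex = soup.find(string=re.compile("django"))   # a pattern allows partial matches
teacher = text_node.parent.find_next_sibling("td").string

print(cell_name, partial, by_regex, teacher)
```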
Printing the tag names of child nodes in the DOM tree

    import re

    from bs4 import BeautifulSoup

    html = """
    <html lang="en">
    <head>
        <meta charset="UTF-8">
        <title>bobby基本信息</title>
        <script src="jquery-3.5.1.min.js"></script>
    </head>
    <body>
        <div id="info-955">
            <p style="color: blue">讲师信息</p>
            <div class="teacher_info">
                Python全栈工程师
                <p class="age">年龄:29</p>
                <p class="name">姓名:bobby</p>
                <p class="work_years">工作年限:7年</p>
                <p class="position">职位:python开发工程师</p>
            </div>
            <p style="color:aquamarine">课程信息</p>
            <table class="courses">
                <tbody><tr><th>课程名称</th>
                    <th>讲师</th>
                    <th>地址</th>
                </tr><tr>
                    <td>django打造在线教育</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/78.html">访问</a></td>
                </tr><tr>
                    <td>python高级编程</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/200.html">访问</a></td>
                </tr><tr>
                    <td>scrapy分布式爬虫</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/92.html">访问</a></td>
                </tr><tr>
                    <td>diango rest framework打造生鲜电商</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/131.html">访问</a></td>
                </tr><tr>
                    <td>tornado从入门到精通</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/290.html">访问</a></td>
                </tr></tbody></table>
        </div>
    </body>
    </html>
    """

    bs = BeautifulSoup(html, "html.parser")
    div_tag = bs.find("div", id=re.compile(r"info-\d+"))

    # Direct children only; text nodes have name None and are skipped
    children = div_tag.contents
    for child in children:
        if child.name:
            print(child.name)

    # All descendants, at any depth
    descendants = div_tag.descendants
    for descendant in descendants:
        if descendant.name:
            print(descendant.name)

Output: the names of the direct child tags (p, div, p, table), followed by the names of every descendant tag in document order.
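The `if child.name` check is there because whitespace between tags is parsed into text nodes. As a hedged sketch on a tiny hypothetical fragment, the counts below show how contents, children and descendants differ, and how find_all(recursive=False) returns only direct child tags.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Hypothetical fragment with newlines between the tags
page = "<div>\n<p>a</p>\n<ul><li>b</li></ul>\n</div>"
soup = BeautifulSoup(page, "html.parser")
div = soup.div

node_count = len(div.contents)   # tags AND the newline text nodes between them
child_tags = [c.name for c in div.children if isinstance(c, Tag)]
descendant_tags = [d.name for d in div.descendants if isinstance(d, Tag)]
direct_only = [t.name for t in div.find_all(recursive=False)]

print(node_count, child_tags, descendant_tags, direct_only)
```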
Printing the tag names of parent nodes in the DOM tree

    from bs4 import BeautifulSoup

    html = """
    <html lang="en">
    <head>
        <meta charset="UTF-8">
        <title>bobby基本信息</title>
        <script src="jquery-3.5.1.min.js"></script>
    </head>
    <body>
        <div id="info-955">
            <p style="color: blue">讲师信息</p>
            <div class="teacher_info">
                Python全栈工程师
                <p class="age">年龄:29</p>
                <p class="name">姓名:bobby</p>
                <p class="work_years">工作年限:7年</p>
                <p class="position">职位:python开发工程师</p>
            </div>
            <p style="color:aquamarine">课程信息</p>
            <table class="courses">
                <tbody><tr><th>课程名称</th>
                    <th>讲师</th>
                    <th>地址</th>
                </tr><tr>
                    <td>django打造在线教育</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/78.html">访问</a></td>
                </tr><tr>
                    <td>python高级编程</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/200.html">访问</a></td>
                </tr><tr>
                    <td>scrapy分布式爬虫</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/92.html">访问</a></td>
                </tr><tr>
                    <td>diango rest framework打造生鲜电商</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/131.html">访问</a></td>
                </tr><tr>
                    <td>tornado从入门到精通</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/290.html">访问</a></td>
                </tr></tbody></table>
        </div>
    </body>
    </html>
    """

    bs = BeautifulSoup(html, "html.parser")

    # .parents walks upward from the element to the document root
    parents = bs.find("p", {"class": "name"}).parents
    for parent in parents:
        print(parent.name)

Output: div, div, body, html and [document], one per line. The last entry is the BeautifulSoup object itself.
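When only one particular ancestor is needed, iterating over .parents can be replaced by find_parent, which takes the same filters as find. The sketch below assumes a small hypothetical nesting; html.parser does not add html or body wrappers, so the chain ends at [document] directly.

```python
from bs4 import BeautifulSoup

# Hypothetical nesting: outer div > inner div > p
page = '<div id="outer"><div class="inner"><p class="name">x</p></div></div>'
soup = BeautifulSoup(page, "html.parser")
p = soup.find("p", class_="name")

ancestor_names = [t.name for t in p.parents]   # nearest ancestor first
nearest_class = p.parent["class"]              # .parent is just the first ancestor
outer = p.find_parent("div", id="outer")       # jump straight to a specific ancestor

print(ancestor_names, nearest_class, outer["id"])
```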
Printing sibling nodes in the DOM tree
Printing the text of the following siblings

    from bs4 import BeautifulSoup

    html = """
    <html lang="en">
    <head>
        <meta charset="UTF-8">
        <title>bobby基本信息</title>
        <script src="jquery-3.5.1.min.js"></script>
    </head>
    <body>
        <div id="info-955">
            <p style="color: blue">讲师信息</p>
            <div class="teacher_info">
                Python全栈工程师
                <p class="age">年龄:29</p>
                <p class="name">姓名:bobby</p>
                <p class="work_years">工作年限:7年</p>
                <p class="position">职位:python开发工程师</p>
            </div>
            <p style="color:aquamarine">课程信息</p>
            <table class="courses">
                <tbody><tr><th>课程名称</th>
                    <th>讲师</th>
                    <th>地址</th>
                </tr><tr>
                    <td>django打造在线教育</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/78.html">访问</a></td>
                </tr><tr>
                    <td>python高级编程</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/200.html">访问</a></td>
                </tr><tr>
                    <td>scrapy分布式爬虫</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/92.html">访问</a></td>
                </tr><tr>
                    <td>diango rest framework打造生鲜电商</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/131.html">访问</a></td>
                </tr><tr>
                    <td>tornado从入门到精通</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/290.html">访问</a></td>
                </tr></tbody></table>
        </div>
    </body>
    </html>
    """

    bs = BeautifulSoup(html, "html.parser")

    # .next_siblings yields every node after the element, including whitespace text nodes
    next_siblings = bs.find("p", {"class": "age"}).next_siblings
    for sibling in next_siblings:
        print(sibling.string)

Output: the text of the three following <p> tags (name, work years, position), separated by blank lines that come from the whitespace text nodes between the tags.
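The blank lines in that output come from the text nodes between tags. One way to skip them, sketched here on a hypothetical fragment, is find_next_siblings, which returns only tags that match the filter.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment with a newline between each <p>
page = ('<div>\n<p class="age">29</p>\n<p class="name">bobby</p>\n'
        '<p class="years">7</p>\n</div>')
soup = BeautifulSoup(page, "html.parser")
age = soup.find("p", class_="age")

raw_nodes = list(age.next_siblings)         # newline text nodes are included
tag_only = age.find_next_siblings("p")      # only the matching tags
texts = [t.string for t in tag_only]

print(len(raw_nodes), texts)
```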
Printing the text of the preceding siblings

    from bs4 import BeautifulSoup

    html = """
    <html lang="en">
    <head>
        <meta charset="UTF-8">
        <title>bobby基本信息</title>
        <script src="jquery-3.5.1.min.js"></script>
    </head>
    <body>
        <div id="info-955">
            <p style="color: blue">讲师信息</p>
            <div class="teacher_info">
                Python全栈工程师
                <p class="age">年龄:29</p>
                <p class="name">姓名:bobby</p>
                <p class="work_years">工作年限:7年</p>
                <p class="position">职位:python开发工程师</p>
            </div>
            <p style="color:aquamarine">课程信息</p>
            <table class="courses">
                <tbody><tr><th>课程名称</th>
                    <th>讲师</th>
                    <th>地址</th>
                </tr><tr>
                    <td>django打造在线教育</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/78.html">访问</a></td>
                </tr><tr>
                    <td>python高级编程</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/200.html">访问</a></td>
                </tr><tr>
                    <td>scrapy分布式爬虫</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/92.html">访问</a></td>
                </tr><tr>
                    <td>diango rest framework打造生鲜电商</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/131.html">访问</a></td>
                </tr><tr>
                    <td>tornado从入门到精通</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/290.html">访问</a></td>
                </tr></tbody></table>
        </div>
    </body>
    </html>
    """

    bs = BeautifulSoup(html, "html.parser")

    # .previous_siblings yields every node before the element, nearest first
    previous_siblings = bs.find("p", {"class": "name"}).previous_siblings
    for sibling in previous_siblings:
        print(sibling.string)

Output: a blank line from the whitespace node, then 年龄:29, then the leading text of the div (Python全栈工程师) with its surrounding whitespace.
To print the text of the immediately preceding sibling tag with .previous_sibling, the line break between the two tags has to be removed

    from bs4 import BeautifulSoup

    html = """
    <html lang="en">
    <head>
        <meta charset="UTF-8">
        <title>bobby基本信息</title>
        <script src="jquery-3.5.1.min.js"></script>
    </head>
    <body>
        <div id="info-955">
            <p style="color: blue">讲师信息</p>
            <div class="teacher_info">
                Python全栈工程师
                <p class="age">年龄:29</p><p class="name">姓名:bobby</p>
                <p class="work_years">工作年限:7年</p>
                <p class="position">职位:python开发工程师</p>
            </div>
            <p style="color:aquamarine">课程信息</p>
            <table class="courses">
                <tbody><tr><th>课程名称</th>
                    <th>讲师</th>
                    <th>地址</th>
                </tr><tr>
                    <td>django打造在线教育</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/78.html">访问</a></td>
                </tr><tr>
                    <td>python高级编程</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/200.html">访问</a></td>
                </tr><tr>
                    <td>scrapy分布式爬虫</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/92.html">访问</a></td>
                </tr><tr>
                    <td>diango rest framework打造生鲜电商</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/131.html">访问</a></td>
                </tr><tr>
                    <td>tornado从入门到精通</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/290.html">访问</a></td>
                </tr></tbody></table>
        </div>
    </body>
    </html>
    """

    bs = BeautifulSoup(html, "html.parser")

    # With no whitespace between the two <p> tags, the previous sibling is the age tag itself
    previous_sibling = bs.find("p", {"class": "name"}).previous_sibling
    print(previous_sibling.string)

Note: the line break between the age and name tags has been removed from the html here. With the line break in place, .previous_sibling is the whitespace text node between the two tags, so only whitespace is printed.
Output: 年龄:29
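Editing the html is not always possible. As an alternative sketch (on a hypothetical fragment that keeps its line breaks), find_previous_sibling skips text nodes and returns the nearest preceding tag.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment; the newline between the two tags is kept on purpose
page = '<div>\n<p class="age">29</p>\n<p class="name">bobby</p>\n</div>'
soup = BeautifulSoup(page, "html.parser")
name = soup.find("p", class_="name")

raw_previous = name.previous_sibling            # the "\n" text node
tag_previous = name.find_previous_sibling("p")  # the nearest preceding <p> tag

print(repr(str(raw_previous)), tag_previous.string)
```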
Getting attribute values from a tag

    from bs4 import BeautifulSoup

    html = """
    <html lang="en">
    <head>
        <meta charset="UTF-8">
        <title>bobby基本信息</title>
        <script src="jquery-3.5.1.min.js"></script>
    </head>
    <body>
        <div id="info-955">
            <p style="color: blue">讲师信息</p>
            <div class="teacher_info">
                Python全栈工程师
                <p class="age">年龄:29</p>
                <p class="name">姓名:bobby</p>
                <p class="work_years">工作年限:7年</p>
                <p class="position">职位:python开发工程师</p>
            </div>
            <p style="color:aquamarine">课程信息</p>
            <table class="courses">
                <tbody><tr><th>课程名称</th>
                    <th>讲师</th>
                    <th>地址</th>
                </tr><tr>
                    <td>django打造在线教育</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/78.html">访问</a></td>
                </tr><tr>
                    <td>python高级编程</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/200.html">访问</a></td>
                </tr><tr>
                    <td>scrapy分布式爬虫</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/92.html">访问</a></td>
                </tr><tr>
                    <td>diango rest framework打造生鲜电商</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/131.html">访问</a></td>
                </tr><tr>
                    <td>tornado从入门到精通</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/290.html">访问</a></td>
                </tr></tbody></table>
        </div>
    </body>
    </html>
    """

    bs = BeautifulSoup(html, "html.parser")
    name_tag = bs.find("p", {"class": "name"})

    # Attributes can be read with dictionary syntax or with .get()
    print(name_tag["class"])
    print(name_tag.get("class"))

Output: ['name'] printed twice, because class is returned as a list.
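The two access styles behave differently when the attribute is missing. A minimal sketch, assuming a hypothetical link (the URL is a placeholder, not one from this post): indexing raises KeyError, while .get returns None or a supplied default.

```python
from bs4 import BeautifulSoup

# Hypothetical link; example.com is a placeholder URL
page = '<a href="https://example.com" id="link">visit</a>'
soup = BeautifulSoup(page, "html.parser")
link = soup.a

href = link["href"]                    # present, so indexing works
title = link.get("title")              # absent: .get() returns None
fallback = link.get("title", "n/a")    # or a default of your choice

try:
    link["title"]                      # absent: indexing raises KeyError
    raised = False
except KeyError:
    raised = True

all_attrs = link.attrs                 # every attribute as a dict
print(href, title, fallback, raised, all_attrs)
```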
Multi-valued attributes

    from bs4 import BeautifulSoup

    html = """
    <html lang="en">
    <head>
        <meta charset="UTF-8">
        <title>bobby基本信息</title>
        <script src="jquery-3.5.1.min.js"></script>
    </head>
    <body>
        <div id="info-955">
            <p style="color: blue">讲师信息</p>
            <div class="teacher_info">
                Python全栈工程师
                <p class="age">年龄:29</p>
                <p class="name bobbyname" data-bind="bobby">姓名:bobby</p>
                <p class="work_years">工作年限:7年</p>
                <p class="position">职位:python开发工程师</p>
            </div>
            <p style="color:aquamarine">课程信息</p>
            <table class="courses">
                <tbody><tr><th>课程名称</th>
                    <th>讲师</th>
                    <th>地址</th>
                </tr><tr>
                    <td>django打造在线教育</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/78.html">访问</a></td>
                </tr><tr>
                    <td>python高级编程</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/200.html">访问</a></td>
                </tr><tr>
                    <td>scrapy分布式爬虫</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/92.html">访问</a></td>
                </tr><tr>
                    <td>diango rest framework打造生鲜电商</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/131.html">访问</a></td>
                </tr><tr>
                    <td>tornado从入门到精通</td>
                    <td>bobby</td>
                    <td><a href="https://coding.imooc.com/class/290.html">访问</a></td>
                </tr></tbody></table>
        </div>
    </body>
    </html>
    """

    bs = BeautifulSoup(html, "html.parser")

    # Filtering on class matches any one of the tag's classes
    name_tag = bs.find("p", {"class": "name"})

    # class is multi-valued, so it comes back as a list
    print(name_tag["class"])
    print(name_tag.get("class"))

    # data-bind is an ordinary attribute, so it comes back as a string
    print(name_tag["data-bind"])
    print(name_tag.get("data-bind"))

Output: ['name', 'bobbyname'] printed twice, then bobby printed twice.
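Since class comes back as a list and most other attributes as strings, code that handles both needs a uniform way to read them. The sketch below, on a one-tag hypothetical snippet, shows three options: joining the list, get_attribute_list, and turning the list behaviour off with multi_valued_attributes=None.

```python
from bs4 import BeautifulSoup

# Hypothetical one-tag snippet mirroring the modified <p> above
page = '<p class="name bobbyname" data-bind="bobby">x</p>'
p = BeautifulSoup(page, "html.parser").p

class_list = p["class"]                          # multi-valued: a list
class_text = " ".join(p["class"])                # back to the original string
bind_value = p["data-bind"]                      # single-valued: a string
bind_list = p.get_attribute_list("data-bind")    # always a list, whatever the attribute

# Disable multi-valued handling so class stays a plain string
raw_class = BeautifulSoup(page, "html.parser", multi_valued_attributes=None).p["class"]

print(class_list, class_text, bind_value, bind_list, raw_class)
```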