Using children and descendants in the BeautifulSoup library
Posted on 2017-04-18 00:33 by 沉默改良者
Example page: http://www.pythonscraping.com/pages/page3.html
Page content: [screenshot of the rendered gift list table]
The key part of the page's source code:
    <table id="giftList">
    <tr><th>
    Item Title
    </th><th>
    Description
    </th><th>
    Cost
    </th><th>
    Image
    </th></tr>

    <tr id="gift1" class="gift"><td>
    Vegetable Basket
    </td><td>
    This vegetable basket is the perfect gift for your health conscious (or overweight) friends!
    <span class="excitingNote">Now with super-colorful bell peppers!</span>
    </td><td>
    $15.00
    </td><td>
    <img src="../img/gifts/img1.jpg">
    </td></tr>

    <tr id="gift2" class="gift"><td>
    Russian Nesting Dolls
    </td><td>
    Hand-painted by trained monkeys, these exquisite dolls are priceless! And by "priceless," we mean "extremely expensive"!
    <span class="excitingNote">8 entire dolls per set! Octuple the presents!</span>
    </td><td>
    $10,000.52
    </td><td>
    <img src="../img/gifts/img2.jpg">
    </td></tr>

    <tr id="gift3" class="gift"><td>
    Fish Painting
    </td><td>
    If something seems fishy about this painting, it's because it's a fish!
    <span class="excitingNote">Also hand-painted by trained monkeys!</span>
    </td><td>
    $10,005.00
    </td><td>
    <img src="../img/gifts/img3.jpg">
    </td></tr>

    <tr id="gift4" class="gift"><td>
    Dead Parrot
    </td><td>
    This is an ex-parrot!
    <span class="excitingNote">Or maybe he's only resting?</span>
    </td><td>
    $0.50
    </td><td>
    <img src="../img/gifts/img4.jpg">
    </td></tr>

    <tr id="gift5" class="gift"><td>
    Mystery Box
    </td><td>
    If you love suprises, this mystery box is for you! Do not place on light-colored surfaces. May cause oil staining.
    <span class="excitingNote">Keep your friends guessing!</span>
    </td><td>
    $1.50
    </td><td>
    <img src="../img/gifts/img6.jpg">
    </td></tr>
    </table>
1. Using the children attribute
    # -*- coding: utf-8 -*-
    from urllib.request import urlopen
    from bs4 import BeautifulSoup

    html = urlopen("http://www.pythonscraping.com/pages/page3.html")
    bsObj = BeautifulSoup(html, "lxml")

    # children is an attribute, so it is accessed without parentheses
    for child in bsObj.find("table", {"id": "giftList"}).children:
        print(child)
Running it produces:
    <tr><th>
    Item Title
    </th><th>
    Description
    </th><th>
    Cost
    </th><th>
    Image
    </th></tr>

    <tr class="gift" id="gift1"><td>
    Vegetable Basket
    </td><td>
    This vegetable basket is the perfect gift for your health conscious (or overweight) friends!
    <span class="excitingNote">Now with super-colorful bell peppers!</span>
    </td><td>
    $15.00
    </td><td>
    <img src="../img/gifts/img1.jpg"/>
    </td></tr>

    <tr class="gift" id="gift2"><td>
    Russian Nesting Dolls
    </td><td>
    Hand-painted by trained monkeys, these exquisite dolls are priceless! And by "priceless," we mean "extremely expensive"!
    <span class="excitingNote">8 entire dolls per set! Octuple the presents!</span>
    </td><td>
    $10,000.52
    </td><td>
    <img src="../img/gifts/img2.jpg"/>
    </td></tr>

    <tr class="gift" id="gift3"><td>
    Fish Painting
    </td><td>
    If something seems fishy about this painting, it's because it's a fish!
    <span class="excitingNote">Also hand-painted by trained monkeys!</span>
    </td><td>
    $10,005.00
    </td><td>
    <img src="../img/gifts/img3.jpg"/>
    </td></tr>

    <tr class="gift" id="gift4"><td>
    Dead Parrot
    </td><td>
    This is an ex-parrot!
    <span class="excitingNote">Or maybe he's only resting?</span>
    </td><td>
    $0.50
    </td><td>
    <img src="../img/gifts/img4.jpg"/>
    </td></tr>

    <tr class="gift" id="gift5"><td>
    Mystery Box
    </td><td>
    If you love suprises, this mystery box is for you! Do not place on light-colored surfaces. May cause oil staining.
    <span class="excitingNote">Keep your friends guessing!</span>
    </td><td>
    $1.50
    </td><td>
    <img src="../img/gifts/img6.jpg"/>
    </td></tr>
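Going by the literal meaning of the word:
children refers to the nodes nearest to the parent, that is, exactly one level below it. In this program, the children of the table are its tr tags (plus the whitespace between them).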
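To make the "one level below the parent" idea concrete, here is a minimal sketch. It assumes a small inline HTML string (a hypothetical, shortened stand-in for the giftList table) and the built-in "html.parser" instead of "lxml", so it runs without network access.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Hypothetical miniature of the giftList table, kept inline for illustration.
SAMPLE_HTML = (
    '<table id="giftList">'
    '<tr><th>Item Title</th><th>Cost</th></tr>'
    '<tr id="gift1" class="gift"><td>Vegetable Basket</td><td>$15.00</td></tr>'
    '</table>'
)

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
table = soup.find("table", {"id": "giftList"})

# children is an iterator over the direct child nodes only.
direct_children = list(table.children)

# Keep only Tag objects; on the real page, the whitespace between rows
# also appears among the children as text nodes.
child_tags = [node.name for node in direct_children if isinstance(node, Tag)]
print(child_tags)  # ['tr', 'tr']
```

The th and td cells do not appear in the list because they sit two levels below the table.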
Experiment: if we replace children with tr, do we get the same result as above?
    # -*- coding: utf-8 -*-
    from urllib.request import urlopen
    from bs4 import BeautifulSoup

    html = urlopen("http://www.pythonscraping.com/pages/page3.html")
    bsObj = BeautifulSoup(html, "lxml")

    # .tr selects only the first tr inside the table
    for child in bsObj.find("table", {"id": "giftList"}).tr:
        print(child)
The output is:
    <th>
    Item Title
    </th>
    <th>
    Description
    </th>
    <th>
    Cost
    </th>
    <th>
    Image
    </th>
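Analysis of this experiment: children lists all direct children of the table, whereas naming a tag directly does not. .tr returns only the first tr, and iterating over that tag yields its own children, here the th cells of the header row.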
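The difference can be checked on a small example. The sketch below again assumes a hypothetical inline HTML string and the built-in "html.parser". It also shows find_all with recursive=False as one possible way to get every row by tag name.

```python
from bs4 import BeautifulSoup

# Hypothetical miniature of the giftList table, kept inline for illustration.
SAMPLE_HTML = (
    '<table id="giftList">'
    '<tr><th>Item Title</th><th>Cost</th></tr>'
    '<tr id="gift1" class="gift"><td>Vegetable Basket</td><td>$15.00</td></tr>'
    '</table>'
)

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
table = soup.find("table", {"id": "giftList"})

# table.tr is shorthand for table.find("tr"): only the first tr is returned.
first_row = table.tr
same_as_find = first_row is table.find("tr")

# Iterating over a tag walks its direct children, here the header cells.
header_cells = [cell.get_text() for cell in first_row]
print(header_cells)  # ['Item Title', 'Cost']

# recursive=False restricts find_all to direct children of the table.
all_rows = table.find_all("tr", recursive=False)
print(len(all_rows))  # 2
```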
2. Using the descendants attribute
    # -*- coding: utf-8 -*-
    from urllib.request import urlopen
    from bs4 import BeautifulSoup

    html = urlopen("http://www.pythonscraping.com/pages/page3.html")
    bsObj = BeautifulSoup(html, "lxml")

    # descendants walks every node below the table, at any depth
    for child in bsObj.find("table", {"id": "giftList"}).descendants:
        print(child)
The output is:
    <tr><th>
    Item Title
    </th><th>
    Description
    </th><th>
    Cost
    </th><th>
    Image
    </th></tr>

    <th>
    Item Title
    </th>

    Item Title

    <th>
    Description
    </th>

    Description

    <th>
    Cost
    </th>

    Cost

    <th>
    Image
    </th>

    Image

    <tr class="gift" id="gift1"><td>
    Vegetable Basket
    </td><td>
    This vegetable basket is the perfect gift for your health conscious (or overweight) friends!
    <span class="excitingNote">Now with super-colorful bell peppers!</span>
    </td><td>
    $15.00
    </td><td>
    <img src="../img/gifts/img1.jpg"/>
    </td></tr>

    <td>
    Vegetable Basket
    </td>

    Vegetable Basket

    <td>
    This vegetable basket is the perfect gift for your health conscious (or overweight) friends!
    <span class="excitingNote">Now with super-colorful bell peppers!</span>
    </td>

    This vegetable basket is the perfect gift for your health conscious (or overweight) friends!

    <span class="excitingNote">Now with super-colorful bell peppers!</span>

    Now with super-colorful bell peppers!

    <td>
    $15.00
    </td>

    $15.00

    <td>
    <img src="../img/gifts/img1.jpg"/>
    </td>

    <img src="../img/gifts/img1.jpg"/>

    <tr class="gift" id="gift2"><td>
    Russian Nesting Dolls
    </td><td>
    Hand-painted by trained monkeys, these exquisite dolls are priceless! And by "priceless," we mean "extremely expensive"!
    <span class="excitingNote">8 entire dolls per set! Octuple the presents!</span>
    </td><td>
    $10,000.52
    </td><td>
    <img src="../img/gifts/img2.jpg"/>
    </td></tr>

    <td>
    Russian Nesting Dolls
    </td>

    Russian Nesting Dolls

    <td>
    Hand-painted by trained monkeys, these exquisite dolls are priceless! And by "priceless," we mean "extremely expensive"!
    <span class="excitingNote">8 entire dolls per set! Octuple the presents!</span>
    </td>

    Hand-painted by trained monkeys, these exquisite dolls are priceless! And by "priceless," we mean "extremely expensive"!

    <span class="excitingNote">8 entire dolls per set! Octuple the presents!</span>

    8 entire dolls per set! Octuple the presents!

    <td>
    $10,000.52
    </td>

    $10,000.52

    <td>
    <img src="../img/gifts/img2.jpg"/>
    </td>

    <img src="../img/gifts/img2.jpg"/>

    <tr class="gift" id="gift3"><td>
    Fish Painting
    </td><td>
    If something seems fishy about this painting, it's because it's a fish!
    <span class="excitingNote">Also hand-painted by trained monkeys!</span>
    </td><td>
    $10,005.00
    </td><td>
    <img src="../img/gifts/img3.jpg"/>
    </td></tr>

    <td>
    Fish Painting
    </td>

    Fish Painting

    <td>
    If something seems fishy about this painting, it's because it's a fish!
    <span class="excitingNote">Also hand-painted by trained monkeys!</span>
    </td>

    If something seems fishy about this painting, it's because it's a fish!

    <span class="excitingNote">Also hand-painted by trained monkeys!</span>

    Also hand-painted by trained monkeys!

    <td>
    $10,005.00
    </td>

    $10,005.00

    <td>
    <img src="../img/gifts/img3.jpg"/>
    </td>

    <img src="../img/gifts/img3.jpg"/>

    <tr class="gift" id="gift4"><td>
    Dead Parrot
    </td><td>
    This is an ex-parrot!
    <span class="excitingNote">Or maybe he's only resting?</span>
    </td><td>
    $0.50
    </td><td>
    <img src="../img/gifts/img4.jpg"/>
    </td></tr>

    <td>
    Dead Parrot
    </td>

    Dead Parrot

    <td>
    This is an ex-parrot!
    <span class="excitingNote">Or maybe he's only resting?</span>
    </td>

    This is an ex-parrot!

    <span class="excitingNote">Or maybe he's only resting?</span>

    Or maybe he's only resting?

    <td>
    $0.50
    </td>

    $0.50

    <td>
    <img src="../img/gifts/img4.jpg"/>
    </td>

    <img src="../img/gifts/img4.jpg"/>

    <tr class="gift" id="gift5"><td>
    Mystery Box
    </td><td>
    If you love suprises, this mystery box is for you! Do not place on light-colored surfaces. May cause oil staining.
    <span class="excitingNote">Keep your friends guessing!</span>
    </td><td>
    $1.50
    </td><td>
    <img src="../img/gifts/img6.jpg"/>
    </td></tr>

    <td>
    Mystery Box
    </td>

    Mystery Box

    <td>
    If you love suprises, this mystery box is for you! Do not place on light-colored surfaces. May cause oil staining.
    <span class="excitingNote">Keep your friends guessing!</span>
    </td>

    If you love suprises, this mystery box is for you! Do not place on light-colored surfaces. May cause oil staining.

    <span class="excitingNote">Keep your friends guessing!</span>

    Keep your friends guessing!

    <td>
    $1.50
    </td>

    $1.50

    <td>
    <img src="../img/gifts/img6.jpg"/>
    </td>

    <img src="../img/gifts/img6.jpg"/>
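The output above is long because descendants visits every nested tag and every text node, in document order. A minimal sketch of the contrast with children, assuming a hypothetical one-row inline table and the built-in "html.parser":

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Hypothetical one-row table, kept inline for illustration.
SAMPLE_HTML = (
    '<table id="giftList">'
    '<tr><td>Dead Parrot</td><td>$0.50</td></tr>'
    '</table>'
)

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
table = soup.find("table", {"id": "giftList"})

# children stops one level down: just the single tr.
children = list(table.children)

# descendants keeps going: the tr, each td, and the text inside each td.
descendants = list(table.descendants)

# Label tags by name and text nodes by their content.
labels = [node.name if isinstance(node, Tag) else str(node) for node in descendants]
print(len(children), len(descendants))  # 1 5
print(labels)  # ['tr', 'td', 'Dead Parrot', 'td', '$0.50']
```

Every child is also a descendant, but not the other way around.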