Merging table cells with Python BeautifulSoup4

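    from bs4 import BeautifulSoup

    str_table = """<table><tr><td>456</td><td>123</td></tr><tr><td>123</td><td>456</td></tr><tr><td>123</td><td>456</td><td>789</td></tr><tr><td>456</td><td>123</td></tr></table>"""
    soup = BeautifulSoup(str_table, "html.parser")

    # Merge vertically adjacent first-column cells that hold the same text.
    # `anchor` is the cell that the current run of equal rows merges into.
    # Tracking it avoids re-comparing a row whose first cell was already
    # removed, which would wrongly merge its second-column cell.
    anchor = None
    for tr in soup.find_all("tr"):
        first_td = tr.find("td")
        if first_td is None:
            anchor = None  # a row without cells ends the current run
        elif anchor is not None and first_td.get_text() == anchor.get_text():
            # Same value as the row above: grow the rowspan and drop this cell.
            anchor["rowspan"] = str(int(anchor.get("rowspan", 1)) + 1)
            first_td.extract()
        else:
            anchor = first_td  # a different value starts a new run
    print(soup.prettify())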
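
The same idea can be wrapped in a reusable function that works on any column. This is only a minimal sketch: the name `merge_column_cells` and its `col` parameter are hypothetical (not from the original post), and it assumes a simple table whose cells carry no existing `rowspan` or `colspan` attributes. It snapshots each row's cells before removing anything, so column positions do not shift while merging.

```python
from bs4 import BeautifulSoup


def merge_column_cells(html: str, col: int = 0) -> str:
    """Merge vertically adjacent <td> cells in column `col` with equal text.

    Hypothetical helper; assumes no pre-existing rowspan/colspan in the table.
    """
    soup = BeautifulSoup(html, "html.parser")
    # Snapshot every row's direct <td> children first, so that removing
    # a cell later does not change the column index of the remaining cells.
    rows = [tr.find_all("td", recursive=False) for tr in soup.find_all("tr")]

    anchor = None  # cell that the current run of equal values merges into
    for cells in rows:
        cell = cells[col] if col < len(cells) else None
        if cell is None:
            anchor = None  # a row too short for this column ends the run
        elif anchor is not None and cell.get_text() == anchor.get_text():
            # Equal to the cell above: extend its rowspan, remove this cell.
            anchor["rowspan"] = str(int(anchor.get("rowspan", 1)) + 1)
            cell.extract()
        else:
            anchor = cell  # a different value starts a new run
    return str(soup)
```

Calling `merge_column_cells(str_table)` on the sample table above should merge only the two adjacent `123` cells in the first column; passing `col=1` would apply the same rule to the second column instead.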

 

posted @ 2020-11-17 21:12  tinaleft