The difference between text() and string() in XPath
<table style="WIDTH: 95.45%; BORDER-COLLAPSE: collapse; EMPTY-CELLS: show; MARGIN-LEFT: 4.55%; MARGIN-TOP: 2pt" cellspacing="0" cellpadding="4"> <tbody> <tr style="PAGE-BREAK-INSIDE: avoid"> <td style="FONT-SIZE: 10pt; TEXT-DECORATION: none; FONT-FAMILY: Arial Narrow; WIDTH: 1.39%; VERTICAL-ALIGN: top; WHITE-SPACE: nowrap; TEXT-TRANSFORM: none; FONT-WEIGHT: normal; COLOR: #000000; FONT-STYLE: normal; TEXT-ALIGN: left; PADDING-LEFT: 0pt; LINE-HEIGHT: 13pt; PADDING-RIGHT: 2pt">• </td> <td style="FONT-SIZE: 10pt; TEXT-DECORATION: none; FONT-FAMILY: Arial Narrow; WIDTH: 98.61%; VERTICAL-ALIGN: top; TEXT-TRANSFORM: none; FONT-WEIGHT: normal; COLOR: #000000; FONT-STYLE: normal; TEXT-ALIGN: left; PADDING-LEFT: 2pt; LINE-HEIGHT: 13pt">Delaware VIP<sup style="FONT-SIZE: 85%; VERTICAL-ALIGN: text-top; TEXT-TRANSFORM: none; FONT-STYLE: normal"><font style="PADDING-LEFT: 1pt"></font>®</sup> Diversified Income Series (Service Class): Maximum long-term total return consistent with reasonable risk. </td></tr> <tr style="PAGE-BREAK-INSIDE: avoid"> <td style="FONT-SIZE: 10pt; TEXT-DECORATION: none; FONT-FAMILY: Arial Narrow; WIDTH: 1.39%; VERTICAL-ALIGN: top; WHITE-SPACE: nowrap; TEXT-TRANSFORM: none; FONT-WEIGHT: normal; COLOR: #000000; FONT-STYLE: normal; TEXT-ALIGN: left; PADDING-LEFT: 0pt; LINE-HEIGHT: 13pt; PADDING-RIGHT: 2pt">• </td> <td style="FONT-SIZE: 10pt; TEXT-DECORATION: none; FONT-FAMILY: Arial Narrow; WIDTH: 98.61%; VERTICAL-ALIGN: top; TEXT-TRANSFORM: none; FONT-WEIGHT: normal; COLOR: #000000; FONT-STYLE: normal; TEXT-ALIGN: left; PADDING-LEFT: 2pt; LINE-HEIGHT: 13pt">Delaware VIP<sup style="FONT-SIZE: 85%; VERTICAL-ALIGN: text-top; TEXT-TRANSFORM: none; FONT-STYLE: normal"><font style="PADDING-LEFT: 1pt"></font>®</sup> Emerging Markets Series (Service Class): Long-term capital appreciation. 
</td></tr> <tr style="PAGE-BREAK-INSIDE: avoid"> <td style="FONT-SIZE: 10pt; TEXT-DECORATION: none; FONT-FAMILY: Arial Narrow; WIDTH: 1.39%; VERTICAL-ALIGN: top; WHITE-SPACE: nowrap; TEXT-TRANSFORM: none; FONT-WEIGHT: normal; COLOR: #000000; FONT-STYLE: normal; TEXT-ALIGN: left; PADDING-LEFT: 0pt; LINE-HEIGHT: 13pt; PADDING-RIGHT: 2pt">• </td> <td style="FONT-SIZE: 10pt; TEXT-DECORATION: none; FONT-FAMILY: Arial Narrow; WIDTH: 98.61%; VERTICAL-ALIGN: top; TEXT-TRANSFORM: none; FONT-WEIGHT: normal; COLOR: #000000; FONT-STYLE: normal; TEXT-ALIGN: left; PADDING-LEFT: 2pt; LINE-HEIGHT: 13pt">Delaware VIP<sup style="FONT-SIZE: 85%; VERTICAL-ALIGN: text-top; TEXT-TRANSFORM: none; FONT-STYLE: normal"><font style="PADDING-LEFT: 1pt"></font>®</sup> Limited-Term Diversified Income Series (Service Class): Maximum total return, consistent with reasonable risk. </td></tr> <tr style="PAGE-BREAK-INSIDE: avoid"> <td style="FONT-SIZE: 10pt; TEXT-DECORATION: none; FONT-FAMILY: Arial Narrow; WIDTH: 1.39%; VERTICAL-ALIGN: top; WHITE-SPACE: nowrap; TEXT-TRANSFORM: none; FONT-WEIGHT: normal; COLOR: #000000; FONT-STYLE: normal; TEXT-ALIGN: left; PADDING-LEFT: 0pt; LINE-HEIGHT: 13pt; PADDING-RIGHT: 2pt">• </td> <td style="FONT-SIZE: 10pt; TEXT-DECORATION: none; FONT-FAMILY: Arial Narrow; WIDTH: 98.61%; VERTICAL-ALIGN: top; TEXT-TRANSFORM: none; FONT-WEIGHT: normal; COLOR: #000000; FONT-STYLE: normal; TEXT-ALIGN: left; PADDING-LEFT: 2pt; LINE-HEIGHT: 13pt">Delaware VIP<sup style="FONT-SIZE: 85%; VERTICAL-ALIGN: text-top; TEXT-TRANSFORM: none; FONT-STYLE: normal"><font style="PADDING-LEFT: 1pt"></font>®</sup> REIT Series (Service Class): Maximum long-term total return, with capital appreciation as a secondary objective. 
</td></tr> <tr style="PAGE-BREAK-INSIDE: avoid"> <td style="FONT-SIZE: 10pt; TEXT-DECORATION: none; FONT-FAMILY: Arial Narrow; WIDTH: 1.39%; VERTICAL-ALIGN: top; WHITE-SPACE: nowrap; TEXT-TRANSFORM: none; FONT-WEIGHT: normal; COLOR: #000000; FONT-STYLE: normal; TEXT-ALIGN: left; PADDING-LEFT: 0pt; LINE-HEIGHT: 13pt; PADDING-RIGHT: 2pt">• </td> <td style="FONT-SIZE: 10pt; TEXT-DECORATION: none; FONT-FAMILY: Arial Narrow; WIDTH: 98.61%; VERTICAL-ALIGN: top; TEXT-TRANSFORM: none; FONT-WEIGHT: normal; COLOR: #000000; FONT-STYLE: normal; TEXT-ALIGN: left; PADDING-LEFT: 2pt; LINE-HEIGHT: 13pt">Delaware VIP<sup style="FONT-SIZE: 85%; VERTICAL-ALIGN: text-top; TEXT-TRANSFORM: none; FONT-STYLE: normal"><font style="PADDING-LEFT: 1pt"></font>®</sup> Small Cap Value Series (Service Class): Capital appreciation. </td></tr> <tr style="PAGE-BREAK-INSIDE: avoid"> <td style="FONT-SIZE: 10pt; TEXT-DECORATION: none; FONT-FAMILY: Arial Narrow; WIDTH: 1.39%; VERTICAL-ALIGN: top; WHITE-SPACE: nowrap; TEXT-TRANSFORM: none; FONT-WEIGHT: normal; COLOR: #000000; FONT-STYLE: normal; TEXT-ALIGN: left; PADDING-LEFT: 0pt; LINE-HEIGHT: 13pt; PADDING-RIGHT: 2pt">• </td> <td style="FONT-SIZE: 10pt; TEXT-DECORATION: none; FONT-FAMILY: Arial Narrow; WIDTH: 98.61%; VERTICAL-ALIGN: top; TEXT-TRANSFORM: none; FONT-WEIGHT: normal; COLOR: #000000; FONT-STYLE: normal; TEXT-ALIGN: left; PADDING-LEFT: 2pt; LINE-HEIGHT: 13pt">Delaware VIP<sup style="FONT-SIZE: 85%; VERTICAL-ALIGN: text-top; TEXT-TRANSFORM: none; FONT-STYLE: normal"><font style="PADDING-LEFT: 1pt"></font>®</sup> Smid Cap Core Series (Service Class): Long-term capital appreciation. 
</td></tr> <tr style="PAGE-BREAK-INSIDE: avoid"> <td style="FONT-SIZE: 10pt; TEXT-DECORATION: none; FONT-FAMILY: Arial Narrow; WIDTH: 1.39%; VERTICAL-ALIGN: top; WHITE-SPACE: nowrap; TEXT-TRANSFORM: none; FONT-WEIGHT: normal; COLOR: #000000; FONT-STYLE: normal; TEXT-ALIGN: left; PADDING-LEFT: 0pt; LINE-HEIGHT: 13pt; PADDING-RIGHT: 2pt">• </td> <td style="FONT-SIZE: 10pt; TEXT-DECORATION: none; FONT-FAMILY: Arial Narrow; WIDTH: 98.61%; VERTICAL-ALIGN: top; TEXT-TRANSFORM: none; FONT-WEIGHT: normal; COLOR: #000000; FONT-STYLE: normal; TEXT-ALIGN: left; PADDING-LEFT: 2pt; LINE-HEIGHT: 13pt">Delaware VIP<sup style="FONT-SIZE: 85%; VERTICAL-ALIGN: text-top; TEXT-TRANSFORM: none; FONT-STYLE: normal"><font style="PADDING-LEFT: 1pt"></font>®</sup> U.S. Growth Series (Service Class): Long-term capital appreciation. </td></tr> <tr style="PAGE-BREAK-INSIDE: avoid"> <td style="FONT-SIZE: 10pt; TEXT-DECORATION: none; FONT-FAMILY: Arial Narrow; WIDTH: 1.39%; VERTICAL-ALIGN: top; WHITE-SPACE: nowrap; TEXT-TRANSFORM: none; FONT-WEIGHT: normal; COLOR: #000000; PADDING-BOTTOM: 0pt; FONT-STYLE: normal; TEXT-ALIGN: left; PADDING-LEFT: 0pt; LINE-HEIGHT: 13pt; PADDING-RIGHT: 2pt">• </td> <td style="FONT-SIZE: 10pt; TEXT-DECORATION: none; FONT-FAMILY: Arial Narrow; WIDTH: 98.61%; VERTICAL-ALIGN: top; TEXT-TRANSFORM: none; FONT-WEIGHT: normal; COLOR: #000000; PADDING-BOTTOM: 0pt; FONT-STYLE: normal; TEXT-ALIGN: left; PADDING-LEFT: 2pt; LINE-HEIGHT: 13pt">Delaware VIP<sup style="FONT-SIZE: 85%; VERTICAL-ALIGN: text-top; TEXT-TRANSFORM: none; FONT-STYLE: normal"><font style="PADDING-LEFT: 1pt"></font>®</sup> Value Series (Service Class): Long-term capital appreciation. </td></tr></tbody></table>
From the HTML table above, we want to extract the text content of the second td in each tr. The first XPath expression that came to mind was:
//td[contains(text(),':') and contains(text(),'(') and contains(text(),')') and (contains(text(),'Class') or contains(text(),'Shares'))]
This turned out to match nothing. After replacing the text() function with string(), the cells are matched:
//td[contains(string(),':') and contains(string(),'(') and contains(string(),')') and (contains(string(),'Class') or contains(string(),'Shares'))]
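In the source document, the text of some td elements contains line breaks and is interleaved with child elements (here, a sup element). In that situation text() may fail: text() selects every text-node child of the td, but in XPath 1.0 contains() converts that node-set to a string by taking only its first node, which here is "Delaware VIP" and contains no ':'. Use string() instead. Called without arguments, it returns the string-value of the current node, that is, the text of the element and all of its descendants concatenated in document order, which covers the vast majority of cases.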
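The difference can be reproduced in a few lines of Python. This is only a sketch using the standard library: ElementTree does not evaluate XPath functions such as contains(), so the two helpers below imitate the XPath 1.0 behaviour rather than run the expressions themselves. The row is a simplified copy of one row from the table, and the names `first_text_node`, `string_value` and `looks_like_fund_line` are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Simplified copy of one row from the table above; inline styles are dropped.
ROW = (
    "<tr><td>\u2022 </td>"
    "<td>Delaware VIP<sup><font></font>\u00ae</sup> Value Series "
    "(Service Class): Long-term capital appreciation. </td></tr>"
)


def first_text_node(elem):
    """Roughly what contains(text(), ...) tests in XPath 1.0:
    the first text-node child of the element, and nothing else."""
    if elem.text:
        return elem.text
    for child in elem:
        if child.tail:
            return child.tail
    return ""


def string_value(elem):
    """Roughly what string() returns: all descendant text, in document order."""
    return "".join(elem.itertext())


def looks_like_fund_line(s):
    """Hypothetical helper mirroring the contains(...) conditions of the predicate."""
    return (":" in s and "(" in s and ")" in s
            and ("Class" in s or "Shares" in s))


second = ET.fromstring(ROW).findall("td")[1]

print(repr(first_text_node(second)))
print(looks_like_fund_line(first_text_node(second)))
print(looks_like_fund_line(string_value(second)))
```

The first text node stops at "Delaware VIP", just before the sup element, so the check on it fails, while the check on the concatenated text succeeds.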