Locating Elements with XPath

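I. What is XPath

XPath is a language for finding information in XML documents.

Basic usage (Scrapy selectors)

xpath(query): returns a selector list containing every node matched by the query expression.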

>>> response.xpath('//div[@class="quote"]//small/text()')
[<Selector xpath='//div[@class="quote"]//small/text()' data='Albert Einstein'>, <Selector xpath='//div[@class="quote"]//small/text()' data='J.K. Rowling'>, <Selector xpath='//div[@class="quote"]//small/text()' data='Albert Einstein'>, <Selector xpath='//div[@class="quote"]//small/text()' data='Jane Austen'>, <Selector xpath='//div[@class="quote"]//small/text()' data='Marilyn Monroe'>, <Selector xpath='//div[@class="quote"]//small/text()' data='Albert Einstein'>, <Selector xpath='//div[@class="quote"]//small/text()' data='André Gide'>, <Selector xpath='//div[@class="quote"]//small/text()' data='Thomas A. Edison'>, <Selector xpath='//div[@class="quote"]//small/text()' data='Eleanor Roosevelt'>, <Selector xpath='//div[@class="quote"]//small/text()' data='Steve Martin'>]

extract(): serializes each matched node to a Unicode string and returns the results as a list.

>>> response.xpath('//div[@class="quote"]//small/text()').extract()
['Albert Einstein', 'J.K. Rowling', 'Albert Einstein', 'Jane Austen', 'Marilyn Monroe', 'Albert Einstein', 'André Gide', 'Thomas A. Edison', 'Eleanor Roosevelt', 'Steve Martin']

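extract()[0]: takes the first value from the list. It raises IndexError when nothing matched; Scrapy's extract_first() returns None in that case.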

>>> response.xpath('//div[@class="quote"][1]//small/text()').extract()[0]
'Albert Einstein'

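II. Ways to locate elements with XPath

1. Absolute paths

An absolute path spells out every level from the document root, so any change to the page structure breaks it. It is unstable and not recommended.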
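The fragility of absolute paths can be sketched with the standard-library xml.etree.ElementTree module. The markup below is hypothetical, and because ElementTree paths are evaluated relative to the root element, ./body/form/input stands in for /html/body/form/input.

```python
import xml.etree.ElementTree as ET

# Hypothetical page markup, before and after a layout change
# that wraps the form in an extra <div>.
BEFORE = "<html><body><form><input id='kw' name='wd'/></form></body></html>"
AFTER = "<html><body><div><form><input id='kw' name='wd'/></form></div></body></html>"

ABSOLUTE = "./body/form/input"   # spells out every level below the root
RELATIVE = ".//input[@id='kw']"  # searches at any depth, keyed on an attribute


def count_matches(markup, path):
    """Return how many elements `path` selects in `markup`."""
    return len(ET.fromstring(markup).findall(path))


for label, markup in (("before", BEFORE), ("after", AFTER)):
    print(label, count_matches(markup, ABSOLUTE), count_matches(markup, RELATIVE))
```

The level-by-level path stops matching once the wrapper appears, while the attribute-based path keeps working.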

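2. Tag + attribute: xpath = "//tag[@attribute='value']"

For example, the search box on the Baidu home page can be written as //*[@id="kw"], where * matches any tag name.
When a single attribute does not identify exactly one element, combine several attributes.
For example, the same search box can be written as //*[@id="kw" and @name="wd"].
Other logical operations work too: the or operator and the not() function.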
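A minimal sketch of combining conditions, assuming the third-party lxml package is installed (it implements XPath 1.0). The form markup is made up for illustration and only loosely modeled on a search page. Note that in XPath 1.0, and and or are operators while not() is a function.

```python
from lxml import etree  # third-party; assumed to be installed

# Hypothetical markup, not the real Baidu page.
HTML = """
<form>
  <input id="kw" name="wd" type="text"/>
  <input id="su" name="btn" type="submit"/>
  <input id="kw2" name="wd" type="hidden"/>
</form>
"""
root = etree.fromstring(HTML)


def ids(expr):
    """Return the id attributes of the elements selected by `expr`."""
    return [el.get("id") for el in root.xpath(expr)]


both = ids('//*[@id="kw" and @name="wd"]')   # both conditions must hold
either = ids('//*[@id="kw" or @id="su"]')    # at least one condition holds
negated = ids('//input[@name="wd" and not(@type="hidden")]')  # not() wraps a condition
print(both, either, negated)
```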

3. Locating by text()

For example, the link for the login button on a login page can be written as //*[text()='Login']
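One thing to watch with text() is that the comparison is exact, so surrounding whitespace in the markup prevents a match. The sketch below assumes lxml is installed and uses hypothetical markup; normalize-space() is a standard XPath 1.0 function that trims and collapses whitespace.

```python
from lxml import etree  # third-party; assumed to be installed

# Hypothetical markup: the second link has whitespace around its label.
HTML = """
<div>
  <a id="a1">Login</a>
  <a id="a2">
    Login
  </a>
  <a id="a3">Logout</a>
</div>
"""
root = etree.fromstring(HTML)


def ids(expr):
    """Return the id attributes of the elements selected by `expr`."""
    return [el.get("id") for el in root.xpath(expr)]


exact = ids("//*[text()='Login']")                      # a text node must equal 'Login' exactly
trimmed = ids("//*[normalize-space(text())='Login']")   # whitespace is trimmed before comparing
print(exact, trimmed)
```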

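4. Locating with contains(), also called fuzzy matching

xpath = "//tag[contains(@attribute, 'value')]"

For example, the login button above can also be written as //a[contains(@id,'loginlink')]

The element matches as long as the attribute value contains the given string.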
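A short sketch of contains() on both an attribute and on text, assuming lxml is installed. The ids and labels are invented for illustration. The match is a case-sensitive substring test, so it can select more elements than intended.

```python
from lxml import etree  # third-party; assumed to be installed

# Hypothetical markup with similar-looking ids.
HTML = """
<div>
  <a id="top_loginlink_1">Login</a>
  <a id="loginlink">Sign in</a>
  <a id="logoutlink">Logout</a>
</div>
"""
root = etree.fromstring(HTML)

# Substring match on an attribute value.
by_attr = [a.get("id") for a in root.xpath("//a[contains(@id, 'loginlink')]")]
# Substring match on the element's text; 'Log' matches both Login and Logout.
by_text = [a.get("id") for a in root.xpath("//a[contains(text(), 'Log')]")]
print(by_attr)
print(by_text)
```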

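5. Locating with starts-with and ends-with

starts-with matches attribute values that begin with the given string; ends-with matches values that end with it.

//*[starts-with(@value,'Logi')] locates the login button.

ends-with is an XPath 2.0 function, and browsers generally support only XPath 1.0.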
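Where only XPath 1.0 is available, ends-with can be emulated with substring() and string-length(). The sketch below assumes lxml is installed; the markup and the ends_with helper are hypothetical, and the helper does not escape quotes in the suffix.

```python
from lxml import etree  # third-party; assumed to be installed

# Hypothetical buttons.
HTML = """
<form>
  <input id="b1" value="Login"/>
  <input id="b2" value="Logout"/>
  <input id="b3" value="Sign in"/>
</form>
"""
root = etree.fromstring(HTML)


def ids(expr):
    """Return the id attributes of the elements selected by `expr`."""
    return [el.get("id") for el in root.xpath(expr)]


def ends_with(attr, suffix):
    """Build an XPath 1.0 predicate that behaves like ends-with(@attr, suffix)."""
    return (f"substring(@{attr}, string-length(@{attr}) - "
            f"string-length('{suffix}') + 1) = '{suffix}'")


starts = ids("//*[starts-with(@value, 'Logi')]")
ends = ids(f"//*[{ends_with('value', 'in')}]")
print(starts, ends)
```

The predicate takes the last len(suffix) characters of the attribute value and compares them with the suffix.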

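6. If an element cannot be located by its own attributes, locate its parent or an ancestor n levels up first, then walk down to it level by level

For example, to locate the serial number (SN) on a page: //div[@id='device_info']/table[@class='detail']/tbody/tr[@class='odd']/td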
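The path above can be tried against a small hypothetical page with the standard-library xml.etree.ElementTree module, which supports this subset of XPath. The markup and the serial number value are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical device page; the <td> itself has no attributes to match on.
HTML = """
<div>
  <div id="device_info">
    <table class="detail">
      <tbody>
        <tr class="odd"><th>SN</th><td>SN-0001</td></tr>
        <tr class="even"><th>Model</th><td>X1</td></tr>
      </tbody>
    </table>
  </div>
</div>
"""
root = ET.fromstring(HTML)

# Anchor on the ancestor that has an id, then walk down one level at a time.
path = ".//div[@id='device_info']/table[@class='detail']/tbody/tr[@class='odd']/td"
cells = [td.text for td in root.findall(path)]
print(cells)
```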

III. Selecting nodes by attribute

div[@attribute='value']

//th[@class='common']

Getting a node's attribute value with XPath

/@attribute

//th/em/following-sibling::a[1]/@href   # returns a selector list of the href attribute values

IV. Getting the text inside an element node

XPath locates the node itself; to get the text inside the node, append /text()

//tbody[@id ="separatorline"]/following-sibling::tbody//th/em/following-sibling::a[1]/text()
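The difference between selecting a node, its attribute, and its text can be sketched as follows, assuming lxml is installed. The table markup is hypothetical. With lxml, /@href and /text() return plain strings rather than Scrapy selectors.

```python
from lxml import etree  # third-party; assumed to be installed

# Hypothetical forum-style rows.
HTML = """
<table>
  <tr><th class="common"><em>[News]</em><a href="/t/1">First thread</a><a href="/u/9">author</a></th></tr>
  <tr><th class="common"><em>[Help]</em><a href="/t/2">Second thread</a></th></tr>
</table>
"""
root = etree.fromstring(HTML)

BASE = "//th/em/following-sibling::a[1]"   # first <a> after each <em>
nodes = root.xpath(BASE)                   # the element nodes themselves
hrefs = root.xpath(BASE + "/@href")        # attribute values
titles = root.xpath(BASE + "/text()")      # text inside the nodes
print([n.tag for n in nodes], hrefs, titles)
```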

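V. Locating sibling elements with XPath

following-sibling: siblings that come after the current node
preceding-sibling: siblings that come before the current node

Sometimes the target is one of several tbody elements that have no attributes of their own to match on. In that case, locate it relative to a sibling, for example: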

//tbody[@id="separatorline"]/following-sibling::tbody

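:: separates the axis name (here following-sibling) from the node test that follows it (here tbody)

To locate the first tbody after it, use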

//tbody[@id="separatorline"]/following-sibling::tbody[1]

To locate the Nth tbody after it, use:

//tbody[@id="separatorline"]/following-sibling::tbody[N]
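A small sketch of the sibling axes, assuming lxml is installed and using a hypothetical table. Positions on following-sibling count forward from the current node, while positions on preceding-sibling count backward, so [1] is always the nearest sibling.

```python
from lxml import etree  # third-party; assumed to be installed

# Hypothetical table: only two of the tbody elements carry an id.
HTML = """
<table>
  <tbody id="pinned"><tr><td>pinned</td></tr></tbody>
  <tbody id="separatorline"><tr><td>---</td></tr></tbody>
  <tbody><tr><td>row 1</td></tr></tbody>
  <tbody><tr><td>row 2</td></tr></tbody>
  <tbody><tr><td>row 3</td></tr></tbody>
</table>
"""
root = etree.fromstring(HTML)
BASE = '//tbody[@id="separatorline"]'


def cell_text(expr):
    """Return the cell text of every tbody selected by `expr`."""
    return root.xpath(expr + "/tr/td/text()")


after_all = cell_text(BASE + "/following-sibling::tbody")    # every later tbody
first = cell_text(BASE + "/following-sibling::tbody[1]")     # nearest later tbody
third = cell_text(BASE + "/following-sibling::tbody[3]")     # N = 3
before = cell_text(BASE + "/preceding-sibling::tbody[1]")    # nearest earlier tbody
print(after_all, first, third, before)
```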

VI. XPath operators

Operator	Description	Example	Return value
|	Union of two node-sets	//book | //cd	Returns a node-set containing all book and cd elements
=	Equal	price=9.80	true if price is 9.80; false if price is 9.90
!=	Not equal	price!=9.80	true if price is 9.90; false if price is 9.80
<	Less than	price<9.80	true if price is 9.00; false if price is 9.90
<=	Less than or equal	price<=9.80	true if price is 9.00; false if price is 9.90
>	Greater than	price>9.80	true if price is 9.90; false if price is 9.80
>=	Greater than or equal	price>=9.80	true if price is 9.90; false if price is 9.70
or	Logical or	price=9.80 or price=9.70	true if price is 9.80; false if price is 9.50
and	Logical and	price>9.00 and price<9.90	true if price is 9.80; false if price is 8.50
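Several of these operators can be tried inside predicates, as in the sketch below. It assumes lxml is installed; the store document and its prices are hypothetical.

```python
from lxml import etree  # third-party; assumed to be installed

# Hypothetical catalogue.
XML = """
<store>
  <book><title>A</title><price>9.80</price></book>
  <book><title>B</title><price>9.90</price></book>
  <book><title>C</title><price>8.50</price></book>
  <cd><title>D</title><price>9.70</price></cd>
</store>
"""
root = etree.fromstring(XML)


def titles(expr):
    """Return the titles of the elements selected by `expr`."""
    # Parentheses keep the union operator from binding to only one side.
    return root.xpath(f"({expr})/title/text()")


union = titles("//book | //cd")
equal = titles("//book[price=9.80]")
greater = titles("//book[price>9.80]")
in_range = titles("//book[price>9.00 and price<9.90]")
either = titles("//*[price=9.80 or price=9.70]")
print(union, equal, greater, in_range, either)
```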

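An alert is not a page element. It is a JavaScript dialog, so it cannot be inspected with right-click and cannot be located the usual way.
See also, an introduction to the three kinds of JS popup: https://blog.csdn.net/qq_33247435/article/details/85626051

I. Definition and usage
The alert() method displays an alert box with a specified message and an OK button.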

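selenium provides three ways to handle an alert

Note: first switch from the window to the alert. switch_to.alert is a property, so it is accessed without parentheses

driver.switch_to.alert

# Click OK
driver.switch_to.alert.accept()

# Click Cancel
driver.switch_to.alert.dismiss()

# Get the text of the popup
driver.switch_to.alert.text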

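The captured popup text can be used to check whether the operation succeeded

import time

time.sleep(5)  # crude fixed wait for the alert to appear
res = driver.switch_to.alert.text  # read the message before closing the alert
driver.switch_to.alert.accept()
print(res)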

posted @ 2020-08-11 22:43 Aline2
