Selenium 3 Notes: A First Look at the WebDriver Source Code

What has changed in Selenium 3?

Compared with Selenium 2, Selenium 3 does not change very much. The official announcement is quoted below for reference.

Reference: https://seleniumhq.wordpress.com/2013/08/28/the-road-to-selenium-3/

“We aim for Selenium 3 to be “a tool for user-focused automation of mobile and web apps”. Developers from projects such as Appium, ios-driver and selendroid will be working on the suite of tests to enable this.”
“Selenium 3 will see the removal of the original Selenium Core implementations, and consequently we’ll be deprecating the RC APIs too. The original implementation will be available as a download, but it will no longer be actively developed once we release 3.0.”

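So the biggest change in Selenium 3 is probably its sharper focus on mobile and web testing, and on mobile support in particular, since this is increasingly a mobile era.

As for the Remote Control (RC) implementation from Selenium 2: I went through the Selenium 3 source and it is indeed no longer supported. The project has instead moved toward the W3C standard, rather than maintaining a separate, Selenium-specific WebDriver API. This calls for a short detour on W3C WebDriver.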

About W3C WebDriver

References: https://www.w3.org/TR/webdriver/, https://www.w3.org/testing/Activity, https://github.com/w3c/webdriver

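The W3C has defined a browser automation specification called WebDriver, which browser vendors are expected to follow. It defines a large set of interfaces together with the core WebDriver concepts. Chrome, Firefox, Opera, Safari, and the rest all need to conform to the specification and implement its interfaces, and those implementations are usually developed alongside the browser itself.

So it should now be clear that Selenium's WebDriver and RemoteWebDriver are both just implementations of W3C WebDriver. The real browser interaction lives in each browser's own driver. Once you have that driver, you can operate the browser yourself by following the HTTP-based wire protocol in the W3C specification; it is simply client-server communication. The full list of supported commands is given in the specification.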

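Take ChromeDriver as an example.

First we need to get ChromeDriver, which can be downloaded from the Chromium project.

https://sites.google.com/a/chromium.org/chromedriver/ also documents the interfaces in detail, and that documentation is much the same as the W3C interface description above. You need to download the build that matches your browser. Below I use a Windows build as the example (download: http://chromedriver.storage.googleapis.com/2.23/chromedriver_win32.zip).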

Using WebDriver

1.1 Look at the command-line options that chromedriver.exe provides.

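The help output is detailed. Here we only need one option, --port, to start the ChromeDriver server. The command is: chromedriver.exe --port=9514. Alternatively, omit the port and just press Enter, in which case the default port is used.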
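As a rough sketch of the same step done from Java, the launch command can be assembled with ProcessBuilder. The class DriverLauncher and its command helper are made up for illustration and are not part of Selenium, and the sketch assumes chromedriver.exe can be found from the working directory. It only builds and prints the command; it does not start the process.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Selenium: assembles the ChromeDriver launch command.
class DriverLauncher {
    // Returns the argument list for starting the driver server on the given port.
    static List<String> command(String executable, int port) {
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("port out of range: " + port);
        }
        return Arrays.asList(executable, "--port=" + port);
    }

    public static void main(String[] args) {
        ProcessBuilder builder = new ProcessBuilder(command("chromedriver.exe", 9514));
        builder.inheritIO(); // if started, the driver's console output would appear here
        System.out.println(String.join(" ", builder.command()));
        // builder.start() would launch the server; omitted so the sketch has no side effects.
    }
}
```

Validating the port range up front is a small design choice: a value outside 1 to 65535 can never be a listening port, so it is cheaper to reject it before a process is spawned.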

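Once started, chromedriver listens on local port 9514 and forwards each command it receives to the browser, which carries it out. For example, the command that launches a Chrome browser is session. When talking to a standalone ChromeDriver, the HTTP URI is http://localhost:9514/session; when going through RemoteWebDriver, the URL is http://localhost:9514/wd/hub/session.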
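The two session endpoints differ only in their path prefix. A minimal sketch that builds both, using a hypothetical helper class SessionEndpoints (the name and methods are illustrative, not a Selenium API):

```java
import java.net.URI;

// Hypothetical helper: builds the two session endpoints described in the text.
class SessionEndpoints {
    // Endpoint used when talking to a standalone ChromeDriver directly.
    static URI standalone(String host, int port) {
        return URI.create("http://" + host + ":" + port + "/session");
    }

    // Endpoint used when going through RemoteWebDriver, which adds the /wd/hub prefix.
    static URI viaHub(String host, int port) {
        return URI.create("http://" + host + ":" + port + "/wd/hub/session");
    }

    public static void main(String[] args) {
        System.out.println(standalone("localhost", 9514));
        System.out.println(viaHub("localhost", 9514));
    }
}
```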

WebDriver: New Session

See the specification: https://www.w3.org/TR/webdriver/#dfn-new-session. The procedure is as follows:

The remote end steps are:

1. If the remote end is an intermediary node, take implementation-defined steps that either result in returning an error with error code session not created, or in returning a success with data that is isomorphic to that returned by remote ends according to the rest of this algorithm.
2. If the maximum active sessions is equal to the length of the list of active sessions, return error with error code session not created.
3. If there is a current user prompt, return error with error code session not created.
4. Let capabilities be the result of getting a property named "capabilities" from the parameters argument.
5. Let capabilities result be the result of processing capabilities with capabilities as an argument.
6. If capabilities result is an error, return error with error code session not created.
7. Let capabilities be capabilities result’s data.
8. Let session id be the result of generating a UUID.
9. Let session be a new session with the session ID of session id.
10. Set the current session to session.
11. Append session to active sessions.
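To make the quoted steps concrete, here is a simplified Java model of a remote end. Everything in it is hypothetical: the class NewSessionSketch, its fields, and the use of IllegalStateException to carry the error code are illustration only. The intermediary-node step is skipped, and "processing capabilities" is reduced to checking that the value is a JSON object (a Map).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Illustrative model of the remote-end steps; all names here are hypothetical.
class NewSessionSketch {
    static final String SESSION_NOT_CREATED = "session not created";

    private final int maxActiveSessions;
    private final List<String> activeSessions = new ArrayList<>();
    private String currentSession;
    boolean userPromptOpen = false;

    NewSessionSketch(int maxActiveSessions) {
        this.maxActiveSessions = maxActiveSessions;
    }

    // Returns the new session id, or throws with the error code from the spec text.
    String newSession(Map<String, Object> parameters) {
        if (activeSessions.size() == maxActiveSessions) {
            throw new IllegalStateException(SESSION_NOT_CREATED);
        }
        if (userPromptOpen) {
            throw new IllegalStateException(SESSION_NOT_CREATED);
        }
        Object capabilities = parameters.get("capabilities");
        // Stand-in for "processing capabilities": accept only a JSON object.
        if (!(capabilities instanceof Map)) {
            throw new IllegalStateException(SESSION_NOT_CREATED);
        }
        String sessionId = UUID.randomUUID().toString();
        currentSession = sessionId;      // set the current session
        activeSessions.add(sessionId);   // append to the active sessions
        return sessionId;
    }

    int activeCount() { return activeSessions.size(); }

    String currentSession() { return currentSession; }

    public static void main(String[] args) {
        NewSessionSketch remoteEnd = new NewSessionSketch(1);
        Map<String, Object> params = new HashMap<>();
        params.put("capabilities", new HashMap<String, Object>());
        System.out.println("created session " + remoteEnd.newSession(params));
    }
}
```

With a limit of one active session, a second call to newSession on the same instance is rejected with session not created, which mirrors the maximum-active-sessions check in the list.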

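The latest Selenium WebDriver already supports this flow. The session work involved in launching a browser can be followed in the core Selenium code below.

1. First, set the path to the ChromeDriver executable, which later code relies on: System.setProperty("webdriver.chrome.driver", "chromedriver.exe");

2. Second, build a command-line object used to run chromedriver.exe: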

org.openqa.selenium.remote.service.DriverService.Builder.build()

public DS build() {
    if (port == 0) {
        // Pick a free port, e.g. 23232; the command used later is then: chromedriver.exe --port=23232
        port = PortProber.findFreePort();
    }

    if (exe == null) {
        exe = findDefaultExecutable();
    }

    ImmutableList<String> args = createArgs();

    return createDriverService(exe, port, args, environment);
}
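The body of PortProber.findFreePort() is not shown in this excerpt. One common way to obtain a free port in plain Java is to bind a ServerSocket to port 0 and let the operating system choose. The sketch below illustrates that general idea only; it is an assumption, not Selenium's actual implementation, and the class name FreePortSketch is made up.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

// Hypothetical stand-in for a port prober; this is not Selenium's PortProber.
class FreePortSketch {
    // Asks the OS for an unused port by binding to port 0, then releases it again.
    static int findFreePort() {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int port = findFreePort();
        System.out.println("chromedriver.exe --port=" + port);
    }
}
```

Note that the port is released before the caller uses it, so another process could claim it in between. A real prober has to live with that race or retry.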

3. The core Selenium class that executes commands: org.openqa.selenium.remote.RemoteWebDriver.RemoteWebDriver(CommandExecutor, Capabilities, Capabilities)

public RemoteWebDriver(CommandExecutor executor, Capabilities desiredCapabilities,
                       Capabilities requiredCapabilities) {
    this.executor = executor;

    init(desiredCapabilities, requiredCapabilities);

    if (executor instanceof NeedsLocalLogs) {
      ((NeedsLocalLogs)executor).setLocalLogs(localLogs);
    }

    try {
      startClient(desiredCapabilities, requiredCapabilities);
    } catch (RuntimeException e) {
      try {
        stopClient(desiredCapabilities, requiredCapabilities);
      } catch (Exception ignored) {
        // Ignore the clean-up exception. We'll propagate the original failure.
      }

      throw e;
    }

    try {
      startSession(desiredCapabilities, requiredCapabilities);
    } catch (RuntimeException e) {
      try {
        quit();
      } catch (Exception ignored) {
        // Ignore the clean-up exception. We'll propagate the original failure.
      }

      throw e;
    }
}
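The constructor repeats one pattern twice: attempt a start-up step, and if it fails, run a best-effort clean-up whose own failure is swallowed, then rethrow the original exception. A minimal standalone sketch of that pattern follows. The helper runOrCleanUp is hypothetical and does not exist in Selenium.

```java
// Hypothetical helper illustrating the start / clean-up / rethrow pattern.
class StartupSketch {
    static void runOrCleanUp(Runnable start, Runnable cleanUp) {
        try {
            start.run();
        } catch (RuntimeException e) {
            try {
                cleanUp.run();
            } catch (Exception ignored) {
                // Ignore the clean-up failure so it does not mask the original one.
            }
            throw e;
        }
    }

    public static void main(String[] args) {
        try {
            runOrCleanUp(
                () -> { throw new IllegalStateException("session not created"); },
                () -> { throw new RuntimeException("quit failed"); });
        } catch (IllegalStateException e) {
            System.out.println("caller sees: " + e.getMessage());
        }
    }
}
```

The caller sees only the start-up failure, even though the clean-up also threw. That is the point of the nested try: the first error is usually the one worth diagnosing.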

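The code above does the following:

1. Initializes the desiredCapabilities object, which is the JSON data sent to the driver server.

2. Starts a session. This includes a check: if the command is NEW_SESSION, the chromedriver process is first launched from the service built above, and only then is the session command sent. The HTTP requests are made behind the scenes with the Apache HttpClient API.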
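The NEW_SESSION check amounts to starting the driver process lazily, just before the first command that needs it. A simplified model of that behavior, with a hypothetical class LazyStartExecutor; the real logic lives in Selenium's command executor and is not reproduced here, and the command names are plain strings chosen for illustration:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of an executor that starts the driver process lazily.
class LazyStartExecutor {
    static final String NEW_SESSION = "newSession"; // illustrative command name

    final List<String> log = new ArrayList<>();
    private boolean serviceRunning = false;

    void execute(String commandName) {
        if (NEW_SESSION.equals(commandName) && !serviceRunning) {
            serviceRunning = true;        // stands in for launching chromedriver
            log.add("start service");
        }
        log.add("send " + commandName);   // stands in for the HTTP request
    }

    public static void main(String[] args) {
        LazyStartExecutor executor = new LazyStartExecutor();
        executor.execute(NEW_SESSION);
        executor.execute("get");
        System.out.println(executor.log);
    }
}
```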

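As noted above, WebDriver communicates over HTTP, so all communication here goes through the JSON Wire Protocol in a RESTful style. In other words, every exchange is one RESTful request/response round trip.

See the Selenium documentation: https://github.com/SeleniumHQ/selenium/wiki/JsonWireProtocol#command-summary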

JSON Request: