Headless Chrome architecture
skyostil@
February 17th, 2016
go/ghost-rider (internal link), crbug.com/546953
Introduction
The Headless Chrome (a.k.a. Ghost Rider) project aims to make it possible to run Chrome in a headless/server environment. Expected use cases include automated regression, performance, and compatibility testing, as well as extracting data from web pages.
This document describes the overall architecture of the project.
Goals
- Allow headless applications to embed Chromium’s content layer and Blink with minimal memory and performance overhead.
- Make it possible to efficiently and deterministically load multiple independent web pages in a single process.
- Export fine-grained structural data about web pages (e.g., the DOM and layout geometry).
- Provide insulation against API churn.
- Minimize the number of invasive or headless-specific changes (e.g., #ifdefs) to Chromium’s code base.
Non-Goals
- Make it possible to embed Chromium into a graphical toolkit or framework.
Architecture
The project has two main deliverables:
- The headless library, which lets an embedding app control the browser and interact with web pages.
- A headless shell, which is a sample application exercising the various features of the headless API.
The headless library exports an API at three different levels:
- A C++ embedder API, which allows developers to integrate the headless library into their application environment. The headless library provides default implementations for low-level adaptation points such as networking and the run loop, and the embedder API allows replacing them with custom ones.
- A C++ client API, which is used to control the browser, e.g., by opening tabs, navigating to different pages, executing JavaScript, listening for load events, etc. Note that this API can also optionally be exported over the standard DevTools wire format for tools like WebDriver and Telemetry.
- A JavaScript extension API, which allows direct, synchronous access to the web content.
This library will be implemented on top of the content API. Note, however, that the content API itself will not be exposed to the embedder. The runtime will execute in a single process, but will have separate “browser” and “renderer” threads.
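The single-process, two-thread arrangement can be pictured as two task queues that post work to each other. The following is only a minimal sketch of that idea using the C++ standard library; TaskThread and RoundTrip are hypothetical names and do not correspond to Chromium's actual threading or message loop classes.

```cpp
#include <condition_variable>
#include <functional>
#include <future>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical stand-in for a thread with its own task queue.
class TaskThread {
 public:
  TaskThread() : worker_([this] { Run(); }) {}

  ~TaskThread() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      quit_ = true;
    }
    cv_.notify_one();
    worker_.join();  // Remaining tasks are drained before the thread exits.
  }

  void PostTask(std::function<void()> task) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      tasks_.push(std::move(task));
    }
    cv_.notify_one();
  }

 private:
  void Run() {
    for (;;) {
      std::function<void()> task;
      {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return quit_ || !tasks_.empty(); });
        if (tasks_.empty()) return;  // Quit was requested and the queue is empty.
        task = std::move(tasks_.front());
        tasks_.pop();
      }
      task();  // Run the task without holding the lock.
    }
  }

  std::mutex mutex_;
  std::condition_variable cv_;
  std::queue<std::function<void()>> tasks_;
  bool quit_ = false;
  std::thread worker_;  // Declared last so the other members exist before it starts.
};

// The "browser" thread asks the "renderer" thread to do some work, and the
// renderer posts the answer back to the browser thread.
int RoundTrip(int input) {
  std::promise<int> done;
  std::future<int> result = done.get_future();
  TaskThread browser;
  TaskThread renderer;
  browser.PostTask([&] {
    renderer.PostTask([&] {
      int doubled = input * 2;  // Stands in for work on the web content.
      browser.PostTask([&done, doubled] { done.set_value(doubled); });
    });
  });
  return result.get();  // Blocks until the reply has run on the browser thread.
}
```

Calling RoundTrip(21) from the embedding application's main thread blocks until the reply task has run on the "browser" thread and then returns the doubled value.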
Embedder API
The main classes of the embedder API are:
- headless::HeadlessBrowser
  - Global instance of the browser.
  - Has adaptation points for network layer and message loop behavior.
  - Runs the message loop and maintains the browser and renderer threads.
  - Optionally exposes a remote debugging port for Chrome DevTools.
- headless::HeadlessWebContents
  - Represents an individual top-level browsing context, i.e., a tab.
  - Provides the client API interface for this tab.
- headless::HeadlessNetwork
  - Provides helpers for integrating with socket-style or HTTP transactional network stacks.
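As an illustration of what an adaptation point with a default implementation could look like, here is a minimal sketch. All names in it (NetworkFetcher, DefaultFetcher, CannedFetcher, SketchBrowser) are hypothetical and are not part of the headless API; the sketch only shows the pattern of a library-provided default that the embedder may replace.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical adaptation point for the network layer.
class NetworkFetcher {
 public:
  virtual ~NetworkFetcher() = default;
  virtual std::string Fetch(const std::string& url) = 0;
};

// Stand-in for the default implementation the library would provide.
class DefaultFetcher : public NetworkFetcher {
 public:
  std::string Fetch(const std::string& url) override {
    return "default:" + url;
  }
};

// Embedder-supplied replacement, e.g. one that serves canned responses.
class CannedFetcher : public NetworkFetcher {
 public:
  std::string Fetch(const std::string& url) override {
    return "canned:" + url;
  }
};

// Stand-in for the browser object: it uses the default fetcher unless the
// embedder passes in its own.
class SketchBrowser {
 public:
  explicit SketchBrowser(std::unique_ptr<NetworkFetcher> fetcher = nullptr)
      : fetcher_(std::move(fetcher)) {
    if (!fetcher_) fetcher_ = std::make_unique<DefaultFetcher>();
  }

  std::string Load(const std::string& url) { return fetcher_->Fetch(url); }

 private:
  std::unique_ptr<NetworkFetcher> fetcher_;
};
```

An embedder that is happy with the defaults would construct SketchBrowser with no arguments, while one with its own network stack would pass std::make_unique<CannedFetcher>() or a similar custom implementation.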
Client API
The client API allows the application to interact with the headless browser by:
- navigating,
- executing JavaScript,
- reading from and writing to the DOM,
- observing network events,
- synthesizing user input,
- capturing screenshots,
- inspecting worker scripts,
- recording Chrome traces,
- etc.
The API is based on the Chrome DevTools remote debugging protocol and is similarly split into domains for the various types of functionality. Initially we will expose the standard DevTools domains and later introduce a new headless domain.
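To make the domain split concrete: on the DevTools wire format, a command is a JSON object whose method name has the form Domain.method, such as Page.navigate. The sketch below formats such a command by hand. The helper names EscapeJson and MakeNavigateCommand are hypothetical, the escaping is deliberately incomplete, and a real client would likely use a JSON library instead.

```cpp
#include <string>

// Hypothetical helper: escapes quotes and backslashes only. Control
// characters are not handled, so this is not a complete JSON escaper.
std::string EscapeJson(const std::string& in) {
  std::string out;
  for (char c : in) {
    if (c == '"' || c == '\\') out += '\\';
    out += c;
  }
  return out;
}

// Builds a Page.navigate command. |id| lets the caller match the eventual
// response to this request.
std::string MakeNavigateCommand(int id, const std::string& url) {
  return "{\"id\":" + std::to_string(id) +
         ",\"method\":\"Page.navigate\",\"params\":{\"url\":\"" +
         EscapeJson(url) + "\"}}";
}
```

For example, MakeNavigateCommand(1, "https://example.com") yields {"id":1,"method":"Page.navigate","params":{"url":"https://example.com"}}.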
The API interfaces are automatically generated from the DevTools protocol definition and run on the browser thread, asynchronously with respect to the web content. As an example, the interface for the Page domain could look like this:
namespace page {

// Forward declarations so that Agent can refer to these types.
class NavigateParams;
class NavigateResult;

class Agent {
 public:
  void Navigate(
      scoped_ptr<NavigateParams> params,
      base::Callback<void(scoped_ptr<NavigateResult>)> callback);
  ...
};

class NavigateParams {
 public:
  static scoped_ptr<NavigateParams> Create();
  NavigateParams* set_url(const std::string& url);
};

class NavigateResult {
 public:
  int frame_id() const;
};

}  // namespace page
For more details, see Remote debugging API.
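The calling pattern implied by such an interface (build a params object, hand it over, receive the result in a callback later) can be tried out with standard-library stand-ins. In this sketch std::unique_ptr and std::function replace scoped_ptr and base::Callback, and FakeAgent with its RunPendingReplies and last_url methods is hypothetical: it only queues replies so that the asynchronous behavior is visible, and does not reflect how the generated code works.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

namespace page {

class NavigateParams {
 public:
  static std::unique_ptr<NavigateParams> Create() {
    return std::make_unique<NavigateParams>();
  }
  NavigateParams* set_url(const std::string& url) {
    url_ = url;
    return this;
  }
  const std::string& url() const { return url_; }

 private:
  std::string url_;
};

class NavigateResult {
 public:
  explicit NavigateResult(int frame_id) : frame_id_(frame_id) {}
  int frame_id() const { return frame_id_; }

 private:
  int frame_id_;
};

using NavigateCallback = std::function<void(std::unique_ptr<NavigateResult>)>;

// Hypothetical agent: records the request and queues the reply instead of
// running the callback immediately.
class FakeAgent {
 public:
  void Navigate(std::unique_ptr<NavigateParams> params,
                NavigateCallback callback) {
    last_url_ = params->url();
    int frame_id = next_frame_id_++;
    pending_.push_back([callback, frame_id] {
      callback(std::make_unique<NavigateResult>(frame_id));
    });
  }

  // Delivers all queued replies, as a message loop would do later on.
  void RunPendingReplies() {
    std::vector<std::function<void()>> replies;
    replies.swap(pending_);
    for (auto& reply : replies) reply();
  }

  const std::string& last_url() const { return last_url_; }

 private:
  std::string last_url_;
  int next_frame_id_ = 1;
  std::vector<std::function<void()>> pending_;
};

}  // namespace page
```

A caller would create the parameters with page::NavigateParams::Create(), set the URL, and pass them to Navigate together with a lambda. In this sketch the lambda only runs once RunPendingReplies is called, which mirrors the fact that results arrive asynchronously.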
Extension API
The specific extension API has been superseded by embedder-provided mojo services and richer DevTools functionality for manipulating and observing the target page.
API diagram
The relationships between the API classes, as well as how they relate to content and Blink interfaces, are shown below.
Directory organization
The headless API will live in a new top-level directory with the following structure:
- headless/
  - public/: Headers for the API.
  - lib/: Implementation and tests for the API.
  - app/: Sample headless shell app.
  - test/: Test support.
The embedding app will also have a transitive dependency on base/ and net/.
Testing plan
Any new functionality added for headless mode must be covered by an automated test. These tests should either live in a headless test suite or (preferably) in an existing test suite such as the DevTools tests if the feature isn’t specific to headless. Headless tests are exercised as a part of the regular Chrome testing waterfall as well as on a custom bot which uses the headless build configuration and runs without a display server.
Larger features such as deterministic page loading should consider targeted testing, e.g., with page cyclers.
Open issues
- Should the client API objects be gin::Wrappables so they could be directly bound to JavaScript too? This would be convenient for PhantomJS.
  - Resolved: Client API objects are exposed as regular C++ objects. If access from JavaScript is required, a mojo service can be used to expose the bindings.
- How much functionality is needed in the extension API and how much is already covered by the existing DevTools domains?
  - Resolved: We will add any missing functionality to the protocol, either in existing domains or in a new headless domain.