How Browsers Work

Extract from:
  http://taligarsiel.com/Projects/howbrowserswork1.htm

lexer: lexical analyzer

Contents:
1. Introduction
2. The rendering engine
  2.1 Rendering engines
  2.2 The main flow
  2.3 Main flow examples
  2.4 Parsing and DOM tree construction
    2.4.1 Parsing - general
    2.4.2 HTML parser
    2.4.3 CSS parsing
    2.4.4 Parsing scripts
    2.4.5 The order of processing scripts and style sheets
  2.5 Render tree construction
  2.6 Layout
  2.7 Painting
  2.8 Dynamic changes
  2.9 The rendering engine's threads
  2.10 CSS2 visual model
  2.11 Resources

###############
Introduction
###############

Five major browsers are in use today:
  - Internet Explorer
  - Firefox
  - Safari
  - Chrome
  - Opera

The browser's main function is
  to present the web resource you choose,
  by requesting it from the server
  and displaying it in the browser window.

The resource format is
  usually HTML, but also PDF, images and more.

The location of the resource is
  specified by the user using a URI (Uniform Resource Identifier).
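
As a small illustration of what a URI contains, the sketch below splits the source URL of these notes into its parts with Python's standard `urllib.parse`. It only shows the structure of a URI; how a real browser handles the address is not covered here.

```python
from urllib.parse import urlsplit

# Split the URL quoted at the top of these notes into its components.
parts = urlsplit("http://taligarsiel.com/Projects/howbrowserswork1.htm")

print(parts.scheme)    # protocol used to fetch the resource
print(parts.hostname)  # server the request is sent to
print(parts.path)      # location of the resource on that server
```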

The way the browser interprets and displays HTML files is
  specified in the HTML and CSS specifications.

These specifications are maintained by the W3C (World Wide Web Consortium), which is the standards organization for the web.

The browser's main components are:
  1. The user interface
  2. The browser engine
  3. The rendering engine
  4. Networking
  5. UI backend
  6. JavaScript interpreter
  7. Data storage

###################
The rendering engine
###################

By default the rendering engine can display HTML and XML documents and images.

It can display other types through a plug-in (a browser extension).
An example is displaying PDF using a PDF viewer plug-in.

Our reference browsers (Firefox, Chrome and Safari) are built upon two rendering engines.

Firefox uses Gecko - a "home made" Mozilla rendering engine.
Both Safari and Chrome use WebKit.

WebKit is an open source rendering engine which started as an engine for the Linux platform and was modified by Apple to support Mac and Windows.

Basic flow of the rendering engine:
  1. Parsing HTML to construct the DOM tree
  2. Render tree construction
  3. Layout of the render tree
  4. Painting the render tree
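
Step 1 of the flow above can be sketched with Python's standard `html.parser`. This is a toy illustration, not how Gecko or WebKit work internally: `Node`, `TreeBuilder`, `build_tree` and `dump` are hypothetical names, the input is assumed to be well-formed with every tag explicitly closed, and steps 2 to 4 (render tree, layout, painting) are not modelled.

```python
from html.parser import HTMLParser


class Node:
    """A hypothetical, minimal stand-in for a DOM node."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []
        self.text = ""


class TreeBuilder(HTMLParser):
    """Builds a tree of Node objects; assumes every tag is explicitly closed."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, parent=self.current)
        self.current.children.append(node)
        self.current = node  # descend into the new element

    def handle_endtag(self, tag):
        if self.current.parent is not None:
            self.current = self.current.parent  # climb back to the parent

    def handle_data(self, data):
        self.current.text += data.strip()


def build_tree(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def dump(node, depth=0):
    """Return the tree as an indented outline, one node per line."""
    lines = ["  " * depth + node.tag]
    for child in node.children:
        lines.extend(dump(child, depth + 1))
    return lines


tree = build_tree("<html><body><p>Hello</p><div><span>World</span></div></body></html>")
print("\n".join(dump(tree)))
```

The printed outline mirrors the nesting of the markup: each element becomes a child of the element that was open when its start tag was read.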

Parsing can be separated into two sub-processes:
  - lexical analysis
  - syntax analysis

Lexical analysis is the process of breaking the input into tokens.
Tokens are the language vocabulary - the collection of valid building blocks.
In a human language, the tokens are all the words that appear in the dictionary for that language.

Syntax analysis is the application of the language's syntax rules.

Parsers usually divide the work between two components:
  - the lexer (sometimes called the tokenizer), which is responsible for breaking the input into valid tokens
  - the parser, which is responsible for constructing the parse tree by analyzing the document structure according to the language syntax rules

The lexer knows how to strip irrelevant characters such as whitespace and line breaks.
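
A minimal lexer sketch, assuming a tiny arithmetic language as the input (the token names in `TOKEN_SPEC` and the function `tokenize` are made up for illustration). It breaks the input into tokens and drops whitespace and line breaks along the way.

```python
import re

# Hypothetical token vocabulary for a tiny arithmetic language.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("OP", r"[+\-*/]"),
    ("LPAREN", r"\("),
    ("RPAREN", r"\)"),
    ("SKIP", r"\s+"),  # whitespace and line breaks: matched, then dropped
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Break text into (kind, value) tokens, raising on invalid characters."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"invalid character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("2 + 3\n * (4 - 1)"))
```

The parser would then consume this token list instead of raw characters, so it never has to deal with spacing.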

BNF: Backus–Naur Form
In computer science, BNF (Backus Normal Form or Backus–Naur Form) is a notation technique for context-free grammars, often used to describe the syntax of languages used in computing, such as computer programming languages, document formats, instruction sets and communication protocols.
It is applied wherever exact descriptions of languages are needed, for instance, in official language specifications, in manuals, and in textbooks on programming language theory.
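
To make the notation concrete, here is a small grammar written in BNF together with a hand-written parser for it. The grammar and the function names (`tokenize`, `parse_term`, `parse_expression`) are invented for this sketch; real HTML cannot be described this simply.

```python
import re

# The grammar, in BNF:
#
#   <expression> ::= <term> | <expression> <operator> <term>
#   <operator>   ::= "+" | "-"
#   <term>       ::= <digit> | <term> <digit>
#   <digit>      ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"


def tokenize(text):
    """Split text into number and operator tokens, ignoring whitespace."""
    tokens = re.findall(r"\d+|[+-]", text)
    if "".join(tokens) != re.sub(r"\s+", "", text):
        raise SyntaxError(f"invalid input: {text!r}")
    return tokens


def parse_term(tokens):
    """Parse <term>: a run of digits, already grouped by the tokenizer."""
    if not tokens or not tokens[0].isdigit():
        raise SyntaxError("expected a number")
    return int(tokens[0]), tokens[1:]


def parse_expression(tokens):
    """Parse <expression>; the left-recursive rule is handled with a loop."""
    tree, rest = parse_term(tokens)
    while rest and rest[0] in ("+", "-"):
        operator = rest[0]
        right, rest = parse_term(rest[1:])
        tree = (operator, tree, right)  # left-associative parse tree node
    if rest:
        raise SyntaxError(f"unexpected token {rest[0]!r}")
    return tree


print(parse_expression(tokenize("2 + 3 - 1")))
```

Each nested tuple is one node of the parse tree: the operator first, then its left and right operands. Input that does not match the grammar, such as `2 + + 3`, raises `SyntaxError`.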

posted @ 2012-01-12 12:00  万法自然~