【譯】12.2.4 解析狀態 Parse state - HTML Standard
HTML
Living Standard — Last Updated 20 August 2017
12.2.4 Parse state
Parts of this specification are © Copyright 2004-2014 Apple Inc., Mozilla Foundation, and Opera Software ASA.
You are granted a license to use, reproduce and create derivative works of this document.
12.2.4.1 插入模式 The insertion mode
The insertion mode is a state variable that controls the primary operation of the tree construction stage.
insertion mode 是一個狀態變量,它控制樹在構建階段的主要操作。
Initially, the insertion mode is "initial". It can change to "before html", "before head", "in head", "in head noscript", "after head", "in body", "text", "in table", "in table text", "in caption", "in column group", "in table body", "in row", "in cell", "in select", "in select in table", "in template", "after body", "in frameset", "after frameset", "after after body", and "after after frameset" during the course of the parsing, as described in the tree construction stage. The insertion mode affects how tokens are processed and whether CDATA sections are supported.
最初,insertion mode 為 initial。在解析過程中,它可以改變為 before html、before head、in head、in head noscript、after head、in body、text、in table、in table text、in caption、in column group、in table body、in row、in cell、in select、in select in table、in template、after body、in frameset、after frameset、after after body、after after frameset,正如樹構造階段中所描述的。insertion mode 影響如何處理 tokens,以及是否支持 CDATA 區段。
Several of these modes, namely "in head", "in body", "in table", and "in select", are special, in that the other modes defer to them at various times. When the algorithm below says that the user agent is to do something "using the rules for the m insertion mode", where m is one of these modes, the user agent must use the rules described under the m insertion mode's section, but must leave the insertion mode unchanged unless the rules in m themselves switch the insertion mode to a new value.
in head、in body、in table、in select 這幾種模式是特殊的,因為其他模式會在不同時候遵從它們的規則。當下面的算法表明用戶代理要「使用 m 插入模式的規則」做某事時(這里的 m 是上述特殊模式之一),用戶代理必需使用 m 插入模式章節中描述的規則,但必需保持插入模式不變,除非 m 中的規則自身將 insertion mode 切換為新值。
When the insertion mode is switched to "text" or "in table text", the original insertion mode is also set. This is the insertion mode to which the tree construction stage will return.
當插入模式切換為 text 或 in table text 時,也會設置原始插入模式(original insertion mode)。這是樹構建階段將返回的插入模式。
Similarly, to parse nested template elements, a stack of template insertion modes is used. It is initially empty. The current template insertion mode is the insertion mode that was most recently added to the stack of template insertion modes. The algorithms in the sections below will push insertion modes onto this stack, meaning that the specified insertion mode is to be added to the stack, and pop insertion modes from the stack, which means that the most recently added insertion mode must be removed from the stack.
類似的,為了解析嵌套的 template 元素,使用了一個模版插入模式的堆棧。它最初是空的。當前模版插入模式是最近添加到模版插入模式堆棧的插入模式。下面章節中的算法會將插入模式 push 到這個堆棧中,這意味著指定的插入模式將被添加到堆棧中;也會從堆棧中 pop 插入模式,這意味著必需從堆棧中移除最近添加的插入模式。
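The bookkeeping described above (the insertion mode, the original insertion mode, and the stack of template insertion modes) can be sketched as a small state holder. This is only an illustrative sketch: the class name `InsertionModeState` and its method names are hypothetical and do not come from the specification.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class InsertionModeState:
    """Hypothetical container for the insertion-mode variables (not a spec API)."""

    insertion_mode: str = "initial"
    original_insertion_mode: Optional[str] = None
    template_insertion_modes: List[str] = field(default_factory=list)

    def switch_to_text(self, text_mode: str = "text") -> None:
        # Switching to "text" or "in table text" also records the mode to return to.
        if text_mode not in ("text", "in table text"):
            raise ValueError("only the two text modes set the original insertion mode")
        self.original_insertion_mode = self.insertion_mode
        self.insertion_mode = text_mode

    def return_to_original(self) -> None:
        # The tree construction stage goes back to the remembered mode.
        if self.original_insertion_mode is None:
            raise ValueError("no original insertion mode has been set")
        self.insertion_mode = self.original_insertion_mode
        self.original_insertion_mode = None

    def push_template_mode(self, mode: str) -> None:
        self.template_insertion_modes.append(mode)

    def pop_template_mode(self) -> str:
        # Removes the most recently added template insertion mode.
        return self.template_insertion_modes.pop()

    @property
    def current_template_insertion_mode(self) -> Optional[str]:
        # The most recently pushed mode, or None while the stack is empty.
        if not self.template_insertion_modes:
            return None
        return self.template_insertion_modes[-1]
```

A Python list is used as the stack, so the "current template insertion mode" is simply the last list item.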
When the steps below require the UA to reset the insertion mode appropriately, it means the UA must follow these steps:
當下面的步驟要求用戶代理適當的重置插入模式,意味著用戶代理必需遵循這些步驟:
- Let last be false.
  設 last 為 false。
- Let node be the last node in the stack of open elements.
  設 node 為打開元素堆棧的最後一個節點。
- Loop: If node is the first node in the stack of open elements, then set last to true, and, if the parser was originally created as part of the HTML fragment parsing algorithm (fragment case), set node to the context element passed to that algorithm.
  Loop:如果 node 是打開元素堆棧的第一個節點,那麼將 last 設置為 true;並且,如果解析器最初是作為 HTML 片段解析算法(fragment case)的一部分創建的,那麼將 node 設置為傳遞給該算法的上下文元素。
- If node is a select element, run these substeps:
  如果 node 是一個 select 元素,運行這些子步驟:
  - If last is true, jump to the step below labeled done.
    如果 last 為 true,跳到下面標記為 Done 的步驟。
  - Let ancestor be node.
    設 ancestor 為 node。
  - Loop: If ancestor is the first node in the stack of open elements, jump to the step below labeled done.
    Loop:如果 ancestor 是打開元素堆棧的第一個節點,跳到下面標記為 Done 的步驟。
  - Let ancestor be the node before ancestor in the stack of open elements.
    設 ancestor 為打開元素堆棧中位於 ancestor 之前的節點。
  - If ancestor is a template node, jump to the step below labeled done.
    如果 ancestor 是一個 template 節點,跳到下面標記為 Done 的步驟。
  - If ancestor is a table node, switch the insertion mode to "in select in table" and abort these steps.
    如果 ancestor 是一個 table 節點,將插入模式切換為 in select in table,并終止這些步驟。
  - Jump back to the step labeled loop.
    跳轉回標記為 Loop 的步驟。
  - Done: Switch the insertion mode to "in select" and abort these steps.
    Done:切換插入模式為 in select,并終止這些步驟。
- If node is a td or th element and last is false, then switch the insertion mode to "in cell" and abort these steps.
  如果 node 是 td 或 th 元素,並且 last 為 false,那麼切換插入模式為 in cell,并終止這些步驟。
- If node is a tr element, then switch the insertion mode to "in row" and abort these steps.
  如果 node 是 tr 元素,那麼切換插入模式為 in row,并終止這些步驟。
- If node is a tbody, thead, or tfoot element, then switch the insertion mode to "in table body" and abort these steps.
  如果 node 是 tbody、thead 或 tfoot 元素,那麼切換插入模式為 in table body,并終止這些步驟。
- If node is a caption element, then switch the insertion mode to "in caption" and abort these steps.
  如果 node 是 caption 元素,那麼切換插入模式為 in caption,并終止這些步驟。
- If node is a colgroup element, then switch the insertion mode to "in column group" and abort these steps.
  如果 node 是 colgroup 元素,那麼切換插入模式為 in column group,并終止這些步驟。
- If node is a table element, then switch the insertion mode to "in table" and abort these steps.
  如果 node 是 table 元素,那麼切換插入模式為 in table,并終止這些步驟。
- If node is a template element, then switch the insertion mode to the current template insertion mode and abort these steps.
  如果 node 是 template 元素,那麼切換插入模式為當前模版插入模式,并終止這些步驟。
- If node is a head element and last is false, then switch the insertion mode to "in head" and abort these steps.
  如果 node 是 head 元素,並且 last 為 false,那麼切換插入模式為 in head,并終止這些步驟。
- If node is a body element, then switch the insertion mode to "in body" and abort these steps.
  如果 node 是 body 元素,那麼切換插入模式為 in body,并終止這些步驟。
- If node is a frameset element, then switch the insertion mode to "in frameset" and abort these steps. (fragment case)
  如果 node 是 frameset 元素,那麼切換插入模式為 in frameset,并終止這些步驟。(fragment case)
- If node is an html element, run these substeps:
  如果 node 是 html 元素,運行這些子步驟:
  - If the head element pointer is null, switch the insertion mode to "before head" and abort these steps. (fragment case)
    如果 head 元素指針為 null,切換插入模式為 before head,并終止這些步驟。(fragment case)
  - Otherwise, the head element pointer is not null, switch the insertion mode to "after head" and abort these steps.
    否則,head 元素指針不為 null,切換插入模式為 after head,并終止這些步驟。
- If last is true, then switch the insertion mode to "in body" and abort these steps. (fragment case)
  如果 last 為 true,那麼切換插入模式為 in body,并終止這些步驟。(fragment case)
- Let node now be the node before node in the stack of open elements.
  現在將 node 設為打開元素堆棧中位於 node 之前的節點。
- Return to the step labeled loop.
  回到標記為 Loop 的步驟。
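The "reset the insertion mode appropriately" steps above can be sketched as a single function. This is a minimal sketch under simplifying assumptions that are not part of the specification: elements are represented only by their tag names (all assumed to be in the HTML namespace), the stack is a Python list whose index 0 is the first node, and the function returns the new mode instead of mutating parser state. The function and parameter names are hypothetical.

```python
from typing import List, Optional


def reset_insertion_mode(
    open_elements: List[str],
    context: Optional[str] = None,        # context element tag name (fragment case only)
    head_pointer: Optional[object] = None,
    current_template_mode: Optional[str] = None,
) -> Optional[str]:
    """Return the insertion mode chosen by the reset steps (hypothetical helper)."""
    last = False
    index = len(open_elements) - 1        # start at the last node in the stack
    while True:
        node = open_elements[index]
        if index == 0:                    # node is the first node in the stack
            last = True
            if context is not None:       # fragment case: use the context element
                node = context
        if node == "select":
            if not last:
                ancestor_index = index
                while ancestor_index > 0:
                    ancestor_index -= 1
                    ancestor = open_elements[ancestor_index]
                    if ancestor == "template":
                        break             # jump to "done"
                    if ancestor == "table":
                        return "in select in table"
            return "in select"            # the step labeled "done"
        if node in ("td", "th") and not last:
            return "in cell"
        if node == "tr":
            return "in row"
        if node in ("tbody", "thead", "tfoot"):
            return "in table body"
        if node == "caption":
            return "in caption"
        if node == "colgroup":
            return "in column group"
        if node == "table":
            return "in table"
        if node == "template":
            return current_template_mode
        if node == "head" and not last:
            return "in head"
        if node == "body":
            return "in body"
        if node == "frameset":
            return "in frameset"
        if node == "html":
            return "before head" if head_pointer is None else "after head"
        if last:
            return "in body"
        index -= 1                        # move to the node before node
```

For example, with the stack html, body, table, tbody, tr, td, select, the inner loop walks up from select until it finds table, so the sketch returns "in select in table".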
12.2.4.2 打開元素的堆棧 The stack of open elements
Initially, the stack of open elements is empty. The stack grows downwards; the topmost node on the stack is the first one added to the stack, and the bottommost node of the stack is the most recently added node in the stack (notwithstanding when the stack is manipulated in a random access fashion as part of the handling for misnested tags).
最初,打開元素的堆棧是空的。堆棧向下生長;堆棧最頂部的 node 是第一個添加到堆棧的節點,並且堆棧最底部的 node 是最近添加到堆棧的節點(儘管在處理錯誤嵌套的標籤時,堆棧會以隨機訪問的方式被操作)。
Note: The "before html" insertion mode creates the html document element, which is then added to the stack.
Note: before html 插入模式會創建 html 文檔元素,然後將其添加到堆棧中。
Note: In the fragment case, the stack of open elements is initialized to contain an html element that is created as part of that algorithm. (The fragment case skips the "before html" insertion mode.)
Note: 在片段情況(fragment case)下,打開元素堆棧被初始化為包含一個 html 元素,該元素是作為該算法的一部分創建的。(片段情況會跳過 before html 插入模式。)
The html node, however it is created, is the topmost node of the stack. It only gets popped off the stack when the parser finishes.
無論 html 節點是如何創建的,它都是堆棧最頂部的節點。只有當解析器結束時,它才會從堆棧中彈出。
The current node is the bottommost node in this stack of open elements.
當前節點是這個打開元素堆棧中最底部的節點。
The adjusted current node is the context element if the parser was created by the HTML fragment parsing algorithm and the stack of open elements has only one element in it (fragment case); otherwise, the adjusted current node is the current node.
如果解析器是由 HTML 片段解析算法創建的,並且打開元素堆棧中只有一個元素(fragment case),那麼校正后的當前節點為上下文元素;否則,校正后的當前節點就是當前節點。
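The two definitions above are short enough to sketch directly. As an assumption of this sketch only, the stack of open elements is a Python list whose first item is the topmost node (html) and whose last item is the bottommost node, and elements are plain tag-name strings; the function names are hypothetical.

```python
from typing import List, Optional


def current_node(open_elements: List[str]) -> str:
    # The bottommost node is the most recently added one, i.e. the end of the list.
    return open_elements[-1]


def adjusted_current_node(open_elements: List[str], context: Optional[str] = None) -> str:
    # Fragment case: a context element exists and only one element is on the stack.
    if context is not None and len(open_elements) == 1:
        return context
    return current_node(open_elements)
```

Passing `context=None` models a parser that was not created by the HTML fragment parsing algorithm.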
Elements in the stack of open elements fall into the following categories:
在打開元素堆棧中的元素分為下列類別:
-
Special 特殊的
The following elements have varying levels of special parsing rules: HTML's address, applet, area, article, aside, base, basefont, bgsound, blockquote, body, br, button, caption, center, col, colgroup, dd, details, dir, div, dl, dt, embed, fieldset, figcaption, figure, footer, form, frame, frameset, h1, h2, h3, h4, h5, h6, head, header, hgroup, hr, html, iframe, img, input, keygen, li, link, listing, main, marquee, menu, meta, nav, noembed, noframes, noscript, object, ol, p, param, plaintext, pre, script, section, select, source, style, summary, table, tbody, td, template, textarea, tfoot, th, thead, title, tr, track, ul, wbr, xmp; MathML mi, MathML mo, MathML mn, MathML ms, MathML mtext, and MathML annotation-xml; and SVG foreignObject, SVG desc, and SVG title.
以下元素擁有不同程度的特殊解析規則:HTML 的 address、applet、area、article、aside、base、basefont、bgsound、blockquote、body、br、button、caption、center、col、colgroup、dd、details、dir、div、dl、dt、embed、fieldset、figcaption、figure、footer、form、frame、frameset、h1、h2、h3、h4、h5、h6、head、header、hgroup、hr、html、iframe、img、input、keygen、li、link、listing、main、marquee、menu、meta、nav、noembed、noframes、noscript、object、ol、p、param、plaintext、pre、script、section、select、source、style、summary、table、tbody、td、template、textarea、tfoot、th、thead、title、tr、track、ul、wbr、xmp;MathML mi、MathML mo、MathML mn、MathML ms、MathML mtext、MathML annotation-xml;以及 SVG foreignObject、SVG desc、SVG title。
Note: An image start tag token is handled by the tree builder, but it is not in this list because it is not an element; it gets turned into an img element.
Note: 樹構造器會處理 image 起始標籤令牌,但它不在這個列表中,因為它不是一個元素;它會被轉換為 img 元素。
-
Formatting 格式化
The following HTML elements are those that end up in the list of active formatting elements: a, b, big, code, em, font, i, nobr, s, small, strike, strong, tt, and u.
下列 HTML 元素是最終會進入現役格式化元素列表的元素:a、b、big、code、em、font、i、nobr、s、small、strike、strong、tt、u。
-
Ordinary 普通的
All other elements found while parsing an HTML document.
在解析 HTML 文檔時發現的所有其他元素。
Typically, the special elements have the start and end tag tokens handled specifically, while ordinary elements' tokens fall into "any other start tag" and "any other end tag" clauses, and some parts of the tree builder check if a particular element in the stack of open elements is in the special category. However, some elements (e.g., the option element) have their start or end tag tokens handled specifically, but are still not in the special category, so that they get the ordinary handling elsewhere.
通常,特殊元素的起始標籤和結束標籤令牌會得到專門處理,而普通元素的令牌則落入「任何其他起始標籤」和「任何其他結束標籤」子句;並且樹構造器的某些部分會檢查打開元素堆棧中的特定元素是否屬於特殊類別。然而,某些元素(例如 option 元素)的起始標籤或結束標籤令牌會得到專門處理,但它們仍然不在特殊類別中,因此在其他地方會得到普通處理。
The stack of open elements is said to have an element target node in a specific scope consisting of a list of element types list when the following algorithm terminates in a match state:
當以下算法終止於匹配狀態時,稱該打開元素堆棧在由元素類型列表 list 構成的特定作用域中存在元素目標節點:
- Initialize node to be the current node (the bottommost node of the stack).
  初始化 node 為當前節點(堆棧中最底部的節點)。
- If node is the target node, terminate in a match state.
  如果 node 是目標節點,終止於匹配狀態。
- Otherwise, if node is one of the element types in list, terminate in a failure state.
  否則,如果 node 是 list 中的元素類型之一,終止於失敗狀態。
- Otherwise, set node to the previous entry in the stack of open elements and return to step 2. (This will never fail, since the loop will always terminate in the previous step if the top of the stack, an html element, is reached.)
  否則,設置 node 為打開元素堆棧中的前一個條目,并返回到步驟 2。(這永遠不會失敗,因為如果到達了堆棧的頂部(一個 html 元素),這個循環總會在前一個步驟終止。)
The stack of open elements is said to have a particular element in scope when it has that element in the specific scope consisting of the following element types:
當該打開元素堆棧在由下列元素類型構成的特定作用域中存在某個元素時,稱它在作用域中存在該特定元素:
applet
caption
html
table
td
th
marquee
object
template
MathML mi
MathML mo
MathML mn
MathML ms
MathML mtext
MathML annotation-xml
SVG foreignObject
SVG desc
SVG title
The stack of open elements is said to have a particular element in list item scope when it has that element in the specific scope consisting of the following element types:
當該打開元素堆棧在由下列元素類型構成的特定作用域中存在某個元素時,稱它在列表條目作用域中存在該特定元素:
- All the element types listed above for the has an element in scope algorithm.
  上面為「在作用域中存在特定元素」算法列出的所有元素類型。
- ol in the HTML namespace
  HTML 命名空間中的 ol
- ul in the HTML namespace
  HTML 命名空間中的 ul
The stack of open elements is said to have a particular element in button scope when it has that element in the specific scope consisting of the following element types:
當該打開元素堆棧在由下列元素類型構成的特定作用域中存在某個元素時,稱它在按鈕作用域中存在該特定元素:
- All the element types listed above for the has an element in scope algorithm.
  上面為「在作用域中存在特定元素」算法列出的所有元素類型。
- button in the HTML namespace
  HTML 命名空間中的 button
The stack of open elements is said to have a particular element in table scope when it has that element in the specific scope consisting of the following element types:
當該打開元素堆棧在由下列元素類型構成的特定作用域中存在某個元素時,稱它在表格作用域中存在該特定元素:
- html in the HTML namespace
  HTML 命名空間中的 html
- table in the HTML namespace
  HTML 命名空間中的 table
- template in the HTML namespace
  HTML 命名空間中的 template
The stack of open elements is said to have a particular element in select scope when it has that element in the specific scope consisting of all element types except the following:
當該打開元素堆棧在由除下列元素類型以外的所有元素類型構成的特定作用域中存在某個元素時,稱它在選擇作用域中存在該特定元素:
- optgroup in the HTML namespace
  HTML 命名空間中的 optgroup
- option in the HTML namespace
  HTML 命名空間中的 option
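The generic "has an element in a specific scope" walk and the five scope variants above can be sketched as follows. The representation is an assumption made only for this sketch: each element is a `(namespace, local name)` tuple, the target is compared by equality rather than node identity, and the stack is a list with the html element first. All function and constant names are hypothetical.

```python
from typing import Callable, FrozenSet, List, Tuple

HTML, MATHML, SVG = "html", "mathml", "svg"
Element = Tuple[str, str]  # (namespace, local name); a simplification for this sketch

DEFAULT_SCOPE: FrozenSet[Element] = frozenset(
    [(HTML, n) for n in ("applet", "caption", "html", "table", "td", "th",
                         "marquee", "object", "template")]
    + [(MATHML, n) for n in ("mi", "mo", "mn", "ms", "mtext", "annotation-xml")]
    + [(SVG, n) for n in ("foreignObject", "desc", "title")]
)
LIST_ITEM_SCOPE = DEFAULT_SCOPE | {(HTML, "ol"), (HTML, "ul")}
BUTTON_SCOPE = DEFAULT_SCOPE | {(HTML, "button")}
TABLE_SCOPE = frozenset({(HTML, "html"), (HTML, "table"), (HTML, "template")})
SELECT_SCOPE_EXCEPTIONS = frozenset({(HTML, "optgroup"), (HTML, "option")})


def in_specific_scope(stack: List[Element], target: Element,
                      is_boundary: Callable[[Element], bool]) -> bool:
    # Walk from the current (bottommost) node towards the top of the stack.
    for node in reversed(stack):
        if node == target:
            return True       # match state
        if is_boundary(node):
            return False      # failure state: a scope boundary was hit first
    return False              # only reachable for an empty stack


def in_scope(stack, target):
    return in_specific_scope(stack, target, lambda n: n in DEFAULT_SCOPE)


def in_list_item_scope(stack, target):
    return in_specific_scope(stack, target, lambda n: n in LIST_ITEM_SCOPE)


def in_button_scope(stack, target):
    return in_specific_scope(stack, target, lambda n: n in BUTTON_SCOPE)


def in_table_scope(stack, target):
    return in_specific_scope(stack, target, lambda n: n in TABLE_SCOPE)


def in_select_scope(stack, target):
    # Select scope is defined by exclusion: everything except optgroup and option.
    return in_specific_scope(stack, target, lambda n: n not in SELECT_SCOPE_EXCEPTIONS)
```

Passing the boundary test as a predicate lets the select scope, which is defined as "all element types except" a short list, share the same walk as the other four scopes.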
Nothing happens if at any time any of the elements in the stack of open elements are moved to a new location in, or removed from, the Document tree. In particular, the stack is not changed in this situation. This can cause, amongst other strange effects, content to be appended to nodes that are no longer in the DOM.
在任何時候,打開元素堆棧中的任何元素移動到一個新的位置或者從文檔樹中移除,都不會觸發任何操作。要注意的是,在這種情況下,堆棧沒有變動。這可能導致一些奇怪的效果,內容被附加在DOM中已不存在的節點。
Note: In some cases (namely, when closing misnested formatting elements), the stack is manipulated in a random-access fashion.
Note: 在某些情況下(即,關閉錯誤嵌套的格式化元素時),堆棧是以隨機存取的方式進行操作的。
12.2.4.3 現役格式化元素的列表 The list of active formatting elements
Initially, the list of active formatting elements is empty. It is used to handle mis-nested formatting element tags.
起初,現役格式化元素的列表為空。它是用於處理錯誤嵌套的格式化元素標籤。
The list contains elements in the formatting category, and markers. The markers are inserted when entering applet, object, marquee, template, td, th, and caption elements, and are used to prevent formatting from "leaking" into applet, object, marquee, template, td, th, and caption elements.
該列表包含格式化類別中的元素,以及標記。當進入 applet、object、marquee、template、td、th、caption 元素時會插入標記,這用於防止格式化「洩漏」到 applet、object、marquee、template、td、th、caption 元素中。
In addition, each element in the list of active formatting elements is associated with the token for which it was created, so that further elements can be created for that token if necessary.
此外,現役格式化元素列表中的每個元素都與創建它的 token
關聯,所以當必要時可以為該 token
創建進一步的元素。
When the steps below require the UA to push onto the list of active formatting elements an element element, the UA must perform the following steps:
當下文的步驟要求用戶代理將元素 element 加入到現役格式化元素的列表中時,用戶代理必需執行以下步驟:
- If there are already three elements in the list of active formatting elements after the last marker, if any, or anywhere in the list if there are no markers, that have the same tag name, namespace, and attributes as element, then remove the earliest such element from the list of active formatting elements. For these purposes, the attributes must be compared as they were when the elements were created by the parser; two elements have the same attributes if all their parsed attributes can be paired such that the two attributes in each pair have identical names, namespaces, and values (the order of the attributes does not matter).
  如果在現役格式化元素列表中最後一個標記之後(如果沒有標記,則在列表中的任何位置)已經存在三個與 element 具有相同標籤名稱、命名空間和屬性的元素,那麼從現役格式化元素列表中移除其中最早的一個。為此,必需按解析器創建元素時的屬性進行比較;如果兩個元素所有已解析的屬性都能兩兩配對,使得每一對中的兩個屬性具有相同的名稱、命名空間和值(屬性的順序並不重要),則認為兩個元素具有相同的屬性。
  Note: This is the Noah's Ark clause. But with three per family instead of two.
  Note: 這是諾亞方舟條款。但每個家庭是三個,而不是兩個。
- Add element to the list of active formatting elements.
  將 element 添加到現役格式化元素列表。
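The push steps above, including the Noah's Ark clause, can be sketched like this. The `FormattingElement` class, the `MARKER` sentinel, and the function names are hypothetical stand-ins; attributes are modelled as a dict keyed by `(namespace, name)`, so comparing two dicts ignores attribute order as the clause requires.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

MARKER = object()  # hypothetical sentinel standing in for a marker entry


@dataclass(eq=False)  # eq=False keeps identity semantics for list entries
class FormattingElement:
    name: str
    namespace: str = "html"
    # Attributes as they were when the parser created the element.
    attributes: Dict[Tuple[str, str], str] = field(default_factory=dict)


def same_family(a: FormattingElement, b: FormattingElement) -> bool:
    # Same tag name, namespace, and attributes (dict equality ignores order).
    return (a.name == b.name and a.namespace == b.namespace
            and a.attributes == b.attributes)


def push_active_formatting_element(entries: List[object],
                                   element: FormattingElement) -> None:
    # Only entries after the last marker count; with no marker, the whole list counts.
    start = 0
    for index in range(len(entries) - 1, -1, -1):
        if entries[index] is MARKER:
            start = index + 1
            break
    matches = [i for i in range(start, len(entries))
               if same_family(entries[i], element)]
    if len(matches) >= 3:
        del entries[matches[0]]  # remove the earliest matching element
    entries.append(element)
```

Pushing a fourth identical `b` element therefore leaves three in the list, with the oldest one dropped.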
When the steps below require the UA to reconstruct the active formatting elements, the UA must perform the following steps:
當下文的步驟要求用戶代理重建現役格式化元素時,用戶代理必需執行以下步驟:
- If there are no entries in the list of active formatting elements, then there is nothing to reconstruct; stop this algorithm.
  如果現役格式化元素列表中沒有條目,那也沒有什麼可供重建的;終止這個算法。
- If the last (most recently added) entry in the list of active formatting elements is a marker, or if it is an element that is in the stack of open elements, then there is nothing to reconstruct; stop this algorithm.
  如果現役格式化元素列表中的最後一個(最近添加的)條目是一個標記,或者如果它是打開元素堆棧中的元素,那麼也沒有什麼可重建的;終止這個算法。
- Let entry be the last (most recently added) element in the list of active formatting elements.
  設 entry 為現役格式化元素列表中的最後一個(最近添加的)元素。
- Rewind: If there are no entries before entry in the list of active formatting elements, then jump to the step labeled create.
  Rewind:如果在現役格式化元素列表中,沒有條目在 entry 之前,那麼跳轉到標記為 Create 的步驟。
- Let entry be the entry one earlier than entry in the list of active formatting elements.
  設 entry 為現役格式化元素列表中比 entry 早一個的條目。
- If entry is neither a marker nor an element that is also in the stack of open elements, go to the step labeled rewind.
  如果 entry 既不是一個標記,也不是一個在打開元素堆棧中的元素,跳轉到標記為 Rewind 的步驟。
- Advance: Let entry be the element one later than entry in the list of active formatting elements.
  Advance:設 entry 為現役格式化元素列表中比 entry 晚一個的元素。
- Create: Insert an HTML element for the token for which the element entry was created, to obtain new element.
  Create:為創建 entry 元素的令牌插入一個 HTML 元素,得到 new element。
- Replace the entry for entry in the list with an entry for new element.
  用 new element 的條目替換列表中 entry 的條目。
- If the entry for new element in the list of active formatting elements is not the last entry in the list, return to the step labeled advance.
  如果 new element 的條目在現役格式化元素列表中不是列表最後的條目,返回到標記為 Advance 的步驟。
This has the effect of reopening all the formatting elements that were opened in the current body, cell, or caption (whichever is youngest) that haven't been explicitly closed.
這會重新打開在當前 body、單元格或 caption(取其中最年輕者)中打開的、尚未被顯式關閉的所有格式化元素。
Note: The way this specification is written, the list of active formatting elements always consists of elements in chronological order with the least recently added element first and the most recently added element last (except for while steps 7 to 10 of the above algorithm are being executed, of course).
Note: 這個規範的編寫方式,現役格式化元素列表的元素永遠按時間順序排序,並且較前添加的元素在前,最近添加的元素在後(當然,執行上述算法的 7 至 10 步時是例外的)。
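The rewind, advance, and create steps of the reconstruction algorithm can be sketched with an index into the list. This is a simplified sketch: the `Element` class and `MARKER` sentinel are hypothetical, and "insert an HTML element for the token" is stood in for by creating a new `Element` with the same name and appending it to the stack of open elements.

```python
from typing import List

MARKER = object()  # hypothetical sentinel standing in for a marker entry


class Element:
    """Hypothetical minimal element; a real one would keep its creating token."""

    def __init__(self, name: str) -> None:
        self.name = name


def reconstruct_active_formatting_elements(entries: List[object],
                                           open_elements: List[Element]) -> None:
    def is_open(entry: object) -> bool:
        return any(entry is node for node in open_elements)

    # Steps 1-2: nothing to reconstruct.
    if not entries:
        return
    if entries[-1] is MARKER or is_open(entries[-1]):
        return
    # Step 3: start from the last entry.
    i = len(entries) - 1
    # Steps 4-6 (rewind): walk back until a marker or an open element is found.
    while i > 0:
        i -= 1
        if entries[i] is MARKER or is_open(entries[i]):
            i += 1  # step 7 (advance): first entry that needs reopening
            break
    # Steps 8-10 (create, replace, advance) until the last entry is handled.
    while True:
        new_element = Element(entries[i].name)  # stand-in for inserting an element
        open_elements.append(new_element)
        entries[i] = new_element
        if i == len(entries) - 1:
            break
        i += 1
```

When the rewind loop reaches index 0 without finding a marker or an open element, it falls through to the create loop at index 0, which matches the jump from the rewind step to the create step.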
When the steps below require the UA to clear the list of active formatting elements up to the last marker, the UA must perform the following steps:
當下文的步驟要求用戶代理將現役格式化元素列表清除至最後一個標記處時,用戶代理必需執行以下步驟:
- Let entry be the last (most recently added) entry in the list of active formatting elements.
  設 entry 為現役格式化元素列表中最後(最近添加)的條目。
- Remove entry from the list of active formatting elements.
  從現役格式化元素列表中移除 entry。
- If entry was a marker, then stop the algorithm at this point. The list has been cleared up to the last marker.
  如果 entry 是一個標記,在這裡停止算法。該列表已被清除至最後一個標記。
- Go to step 1.
  回到步驟 1。
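The four steps above amount to popping entries until a marker has been removed. In this sketch, `MARKER` is a hypothetical sentinel, and the loop also stops on an empty list, a guard that is an addition of the sketch rather than something the steps state.

```python
from typing import List

MARKER = object()  # hypothetical sentinel standing in for a marker entry


def clear_to_last_marker(entries: List[object]) -> None:
    # Remove entries from the end; the marker itself is removed as well.
    while entries:
        entry = entries.pop()
        if entry is MARKER:
            break
```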
12.2.4.4 元素的指針 The element pointers
Initially, the head element pointer and the form element pointer are both null.
最初,head 元素指針和 form 元素指針都為 null。
Once a head element has been parsed (whether implicitly or explicitly) the head element pointer gets set to point to this node.
一旦一個 head 元素被解析(不論是隱式或是顯式),head 元素指針將被設置為指向這個節點。
The form element pointer points to the last form element that was opened and whose end tag has not yet been seen. It is used to make form controls associate with forms in the face of dramatically bad markup, for historical reasons. It is ignored inside template elements.
form 元素指針指向最後一個已打開並且尚未見到結束標籤的 form 元素。由於歷史原因,它被用來在標記嚴重錯誤的情況下使表單控件與表單相關聯。它在 template 元素內部會被忽略。
12.2.4.5 其他解析狀態標記 Other parsing state flags
The scripting flag is set to "enabled" if scripting was enabled for the Document with which the parser is associated when the parser was created, and "disabled" otherwise.
如果在解析器創建時,與解析器相關聯的 Document 啟用了腳本,那麼 scripting flag 被設置為 "enabled",否則被設置為 "disabled"。
Note: The scripting flag can be enabled even when the parser was originally created for the HTML fragment parsing algorithm, even though script elements don't execute in that case.
Note: 即使解析器最初是為 HTML 片段解析算法創建的,scripting flag 也可以是 enabled,哪怕在這種情況下 script 元素不會執行。
The frameset-ok flag is set to "ok" when the parser is created. It is set to "not ok" after certain tokens are seen.
在創建解析器時,frameset-ok flag 被設置為 "ok"。當看到某些特定的令牌時,它被設為 "not ok"。