Ace -- Syntax Highlighting
Creating a Syntax Highlighter for Ace
Creating a new syntax highlighter for Ace is straightforward. You'll need to define two pieces of code: a new mode, and a new set of highlighting rules.
Where to Start
We recommend using the Ace Mode Creator when defining your highlighter. It lets you inspect your code's tokens and provides a live preview of the syntax highlighter in action.
Ace Mode Creator: https://ace.c9.io/tool/mode_creator.html
Defining a Mode
Every language needs a mode. A mode contains the paths to a language's syntax highlighting rules, indentation rules, and code folding rules. Without a mode, Ace won't know anything about the finer aspects of your language.
Here is the starter template we'll use to create a new mode:
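A minimal sketch of such a mode, written so that it runs without an Ace checkout. The `oop` object and `TextMode` below are local stand-ins for what a real mode file would load with `require("../lib/oop")` and `require("./text").Mode`, and `MyNewHighlightRules` and `MyNewFoldMode` are hypothetical placeholders for the language-specific modules.

```javascript
// Stand-ins so the sketch runs on its own; a real mode file would require
// Ace's oop helper and TextMode instead of defining them here.
var oop = {
    inherits: function (ctor, superCtor) {
        ctor.super_ = superCtor;
        ctor.prototype = Object.create(superCtor.prototype, {
            constructor: { value: ctor, enumerable: false, writable: true, configurable: true }
        });
    }
};
function TextMode() {}

// Hypothetical placeholders for the language-specific pieces.
function MyNewHighlightRules() {}
function MyNewFoldMode() {}

var Mode = function () {
    // Point the mode at its highlighting rules and folding rules.
    this.HighlightRules = MyNewHighlightRules;
    this.foldingRules = new MyNewFoldMode();
};
oop.inherits(Mode, TextMode);

(function () {
    // Comment markers for the new language (assumed to be C-like here).
    this.lineCommentStart = "//";
    this.blockComment = { start: "/*", end: "*/" };
}).call(Mode.prototype);

// In an Ace module the last step would be: exports.Mode = Mode;
```

Note that the prototype is extended only after `oop.inherits` has run, because inheriting replaces `Mode.prototype`.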
What's going on here? First, you're defining the path to TextMode (more on this later). Then you're pointing the mode to your definitions for the highlighting rules, as well as your rules for code folding. Finally, you're setting everything up to find those rules, and exporting the Mode so that it can be consumed. That's it!
Regarding TextMode, you'll notice that it's only being used once, in oop.inherits(Mode, TextMode). If your new language depends on the rules of another language, you can choose to inherit the same rules, while expanding on them with your language's own requirements. For example, PHP inherits from HTML, since it can be embedded directly inside .html pages. You can inherit either from TextMode or from any other existing mode that already relates to your language.
All Ace modes can be found in the lib/ace/mode folder.
Defining Syntax Highlighting Rules
The Ace highlighter can be considered to be a state machine. Regular expressions define the tokens for the current state, as well as the transitions into another state. Let's define mynew_highlight_rules.js, which our mode above uses.
All syntax highlighters start off looking something like this:
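As a rough sketch of that starting shape: the `TextHighlightRules` below is a local stand-in for Ace's base class so the example runs on its own, and the two example states and their token names are assumptions chosen for illustration.

```javascript
// Stand-in for Ace's TextHighlightRules; a real rules file would require it.
function TextHighlightRules() {
    this.$rules = { start: [{ defaultToken: "text" }] };
}
TextHighlightRules.prototype.getRules = function () {
    return this.$rules;
};

function MyNewHighlightRules() {
    // Rules within a state are tried in order; the first match is used.
    this.$rules = {
        start: [
            {
                token: "comment",   // String, Array, or Function
                regex: "//.*$"      // String or RegExp
            },
            {
                token: "string",
                regex: '"',
                next: "string"      // optional: state to enter after a match
            }
        ],
        string: [
            { token: "string", regex: '"', next: "start" },
            { defaultToken: "string" }
        ]
    };
}
MyNewHighlightRules.prototype = Object.create(TextHighlightRules.prototype);
MyNewHighlightRules.prototype.constructor = MyNewHighlightRules;
```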
The token state machine operates on whatever is defined in this.$rules. The highlighter always begins at the start state, and progresses down the list, looking for a matching regex. When one is found, the matched text is wrapped in a <span class="ace_<token>"> tag, where <token> is the value of the token property. Note that all tokens are preceded by the ace_ prefix when they're rendered on the page.
Once again, we're inheriting from TextHighlightRules here. We could choose to inherit from any other language's rule set instead, if our new language requires previously defined syntaxes. For more information on extending languages, see "Extending Highlighters" below.
Defining Tokens
The Ace highlighting system is heavily inspired by the TextMate language grammar. Most tokens follow TextMate's naming conventions. A thorough (albeit incomplete) list of tokens can be found on the Ace Wiki.
Token list: https://github.com/ajaxorg/ace/wiki/Creating-or-Extending-an-Edit-Mode#commonTokens
For the complete list of tokens, see tool/tmtheme.js (https://github.com/ajaxorg/ace/blob/master/tool/tmtheme.js). It is possible to add new token names, but that is outside the scope of this document.
Multiple tokens can be applied to the same text by adding dots in the token, e.g. token: support.function wraps the text in a <span class="ace_support ace_function"> tag.
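The mapping from a dotted token name to CSS classes can be illustrated with a small helper. `tokenToClassName` is a hypothetical function written for this example, not part of Ace's API.

```javascript
// Hypothetical helper: derive the CSS classes that a dotted token name
// corresponds to, one ace_-prefixed class per dot-separated part.
function tokenToClassName(token) {
    return token
        .split(".")
        .map(function (part) { return "ace_" + part; })
        .join(" ");
}

var className = tokenToClassName("support.function");
// className is "ace_support ace_function"
```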
Defining Regular Expressions
Regular expressions can be written either as a RegExp literal or as a String.
If you're using a RegExp literal, remember to start and end the expression with the / character.
A caveat of using string regular expressions is that any \ character must be escaped. That means that even an innocuous regular expression that uses \s or \w must actually be written with each backslash doubled (\\s, \\w) when it is given as a string.
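The escaping difference can be checked directly in plain JavaScript. The pattern and token name below are illustrative choices, not taken from any particular Ace mode.

```javascript
// The same pattern written as a RegExp literal and as a string.
var literalRule = { token: "keyword", regex: /function\s*\(\w+\)/ };
var stringRule = { token: "keyword", regex: "function\\s*\\(\\w+\\)" };

// Forgetting to double the backslashes silently changes the pattern:
// inside a string literal, "\s" is just "s" and "\(" is just "(".
var brokenSource = "function\s*\(\w+\)";

var fromString = new RegExp(stringRule.regex);
var sameSource = fromString.source === literalRule.regex.source;
var brokenIsDifferent = brokenSource === "functions*(w+)";
// Both sameSource and brokenIsDifferent are true.
```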
Groupings
You can also include flat regexps--(var)--or have matching groups--((a+)(b+)). There is a strict requirement whereby matching groups must cover the entire matched string; thus, (hel)lo is invalid. If you want to create a non-matching group, simply start the group with the ?: prefix; thus, (hel)(?:lo) is okay. You can, of course, create longer non-matching groups. For example:
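A small sketch of a rule that uses a longer non-matching group. The token name and the list of alternatives are assumptions for illustration.

```javascript
var booleanRule = {
    token: "constant.language.boolean",
    // (?:...) groups the alternatives without creating a matching group.
    regex: "(?:true|false)\\b"
};

var match = new RegExp(booleanRule.regex).exec("false;");
// match[0] holds the matched text; match.length is 1 because the
// (?:...) group captured nothing.
```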
For flat regular expression matches, token can be a String, or a Function that takes a single argument (the match) and returns a string token.
If token is a function and the regular expression has groups, it should take the same number of arguments as there are groups, and return an array of tokens.
For grouped regular expressions, token can be a String, in which case all matched groups are given that same token.
More commonly, though, token is an Array (of the same length as the number of groups), whereby each group is given the token at the same position in the array. This suits a complicated regular expression, such as one that matches a function definition.
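The token forms described in this section can be compared side by side. `resolveTokens` is a hypothetical helper written for this example and is much simpler than Ace's tokenizer; the rules, the `colors` map, and the token names are likewise assumptions.

```javascript
// Hypothetical helper, NOT Ace's tokenizer: resolves the token(s) a single
// rule would assign to the first match in `text`.
function resolveTokens(rule, text) {
    var match = new RegExp(rule.regex).exec(text);
    if (!match) return [];
    // With groups, each group is one value; otherwise the whole match is.
    var values = match.length > 1 ? match.slice(1) : [match[0]];
    var types;
    if (typeof rule.token === "function") {
        types = [].concat(rule.token.apply(null, values));
    } else if (Array.isArray(rule.token)) {
        types = rule.token;
    } else {
        types = values.map(function () { return rule.token; });
    }
    return values.map(function (value, i) {
        return { type: types[i], value: value };
    });
}

var colors = { red: true, blue: true };

// Flat regex, token as a Function of the match.
var colorRule = {
    token: function (value) {
        return colors.hasOwnProperty(value.toLowerCase()) ? "support.constant.color" : "text";
    },
    regex: "[a-zA-Z]+"
};

// Grouped regex, token as a String: every group gets the same token.
var sameTokenRule = { token: "identifier", regex: "(\\w+\\s*:)(\\w*)" };

// Grouped regex, token as an Array: one token per group, by position.
var functionRule = {
    token: ["storage.type", "text", "entity.name.function"],
    regex: "(function)(\\s+)([a-zA-Z_][a-zA-Z0-9_]*\\b)"
};

var functionTokens = resolveTokens(functionRule, "function main");
// functionTokens pairs "function", " " and "main" with the three token names.
```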
Defining States
The syntax highlighting state machine stays in the start state until you define a next state for it to advance to. At that point, the tokenizer stays in that new state until it advances to another state. Afterwards, you should return to the original start state.
Here's an example:
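A minimal sketch of such a rule set, together with a small walker that records which states are visited on one line. `traceStates` is a hypothetical stand-in for illustration only and does far less than Ace's tokenizer.

```javascript
var rules = {
    start: [
        { token: "text", regex: "<!\\[CDATA\\[", next: "cdata" }
    ],
    cdata: [
        { token: "text", regex: "\\]\\]>", next: "start" },
        { defaultToken: "text" }
    ]
};

// Hypothetical walker: at each position, try the current state's rules in
// order; on a match, consume it and follow `next`. Otherwise skip one char.
function traceStates(rules, line) {
    var state = "start";
    var visited = [state];
    var pos = 0;
    while (pos < line.length) {
        var advanced = false;
        var stateRules = rules[state];
        for (var i = 0; i < stateRules.length; i++) {
            var rule = stateRules[i];
            if (!rule.regex) continue;               // e.g. defaultToken entries
            var re = new RegExp(rule.regex, "y");    // sticky: match at pos only
            re.lastIndex = pos;
            var m = re.exec(line);
            if (m && m[0].length > 0) {
                pos += m[0].length;
                if (rule.next) {
                    state = rule.next;
                    visited.push(state);
                }
                advanced = true;
                break;
            }
        }
        if (!advanced) pos += 1;
    }
    return visited;
}

var visitedStates = traceStates(rules, "a <![CDATA[ x ]]> b");
// visitedStates is ["start", "cdata", "start"]
```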
In this extremely short sample, we're defining some highlighting rules for when Ace detects a <![CDATA[ tag. When one is encountered, the tokenizer moves from start into the cdata state. It remains there, applying the text token to any string it encounters. Finally, when it hits the closing ]]> symbol, it returns to the start state and continues to tokenize anything else.
Using the TMLanguage Tool
There is a tool that will take an existing tmlanguage file and do its best to convert it into JavaScript for Ace to consume. Here's what you need to get started:
- In the Ace repository, navigate to the tool folder.
- Run npm install to install the required dependencies.
- Run node tmlanguage.js <path_to_tmlanguage_file>; for example, node tmlanguage.js /Users/Elrond/elven.tmLanguage
Two files are created and placed in lib/ace/mode: one for the language mode, and one for the set of highlight rules. You will still need to add the mode to ace/ext/modelist.js, and add a sample file for testing.
A Note on Accuracy
Your .tmlanguage file will be converted to the best of the converter's ability. It is an understatement to say that the tool is imperfect. Language mode creation will probably never be fully automatable. There's a list of non-determinable items; for example:
- The use of regular expression lookbehinds
  This is a concept that JavaScript historically did not have (lookbehind assertions were only added in ES2018), so it needs to be faked.
- Deciding which state to transition to
  While the tool does create new states correctly, it labels them with generic terms like state_2, state_10, etc.
- Extending modes
  Many modes say something like include source.c, to mean, "add all the rules in C highlighting." That syntax does not make sense to Ace or this tool (though of course you can extend existing highlighters).
- Rule preference order
- Gathering keywords
  Most likely, you'll need to take keywords from your language file and run them through createKeywordMapper().
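To give an idea of what the keyword-gathering step in the list above involves, here is a simplified stand-in for a keyword mapper. `createKeywordMapperSketch` is a hypothetical function written for this example; it only approximates what Ace's createKeywordMapper() does, and the keyword lists are assumptions.

```javascript
// Simplified stand-in: builds a function that maps a matched word to a
// token name, falling back to `defaultToken` for unknown words.
function createKeywordMapperSketch(map, defaultToken, ignoreCase, splitChar) {
    var keywords = Object.create(null);
    Object.keys(map).forEach(function (tokenName) {
        map[tokenName].split(splitChar || "|").forEach(function (word) {
            keywords[ignoreCase ? word.toLowerCase() : word] = tokenName;
        });
    });
    return function (value) {
        var key = ignoreCase ? value.toLowerCase() : value;
        return keywords[key] || defaultToken;
    };
}

var keywordMapper = createKeywordMapperSketch({
    "keyword": "if|else|while|return",
    "constant.language": "true|false|null"
}, "identifier");

// The mapper is then used as the token function of an identifier rule.
var identifierRule = {
    token: keywordMapper,
    regex: "[a-zA-Z_$][a-zA-Z0-9_$]*\\b"
};
```

Using a function as the token keeps the rule list short: one identifier rule covers every keyword class instead of one regex per keyword.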
However, the tool is an excellent way to get a quick start if you already have a tmlanguage file for your language.
Extending Highlighters
Suppose you're working on a LuaPage, PHP embedded in HTML, or a Django template. You'll need to create a syntax highlighter that takes all the rules from the original language (Lua, PHP, or Python) and extends it with some additional identifiers (<?lua, <?php, {%, for example). Ace allows you to easily extend a highlighter using a few helper functions.
Getting Existing Rules
To get the existing syntax highlighting rules for a particular language, use the getRules() function.
Extending a Highlighter
The addRules method does one thing, and it does it well: it adds new rules to an existing rule set, and prefixes any state with a given tag. For example, let's say you've got two sets of rules: the highlighter's own this.$rules, and a separate object called newRules.
If you want to incorporate newRules into this.$rules, you'd do something like this:
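A sketch of the effect, using plain objects. In Ace, addRules is a method called as this.addRules(newRules, prefix) inside the highlight rules constructor; the free-standing `addRules` function below is a simplified stand-in so the example runs on its own, and the rule contents are assumptions.

```javascript
// Simplified stand-in: copy `rules` into `target`, prefixing every state
// name and every string `next` so transitions stay inside the new set.
function addRules(target, rules, prefix) {
    Object.keys(rules).forEach(function (state) {
        target[prefix + state] = rules[state].map(function (rule) {
            var copy = Object.assign({}, rule);
            if (typeof copy.next === "string") copy.next = prefix + copy.next;
            return copy;
        });
    });
    return target;
}

var $rules = {
    start: [{ token: "text", regex: "\\s+" }]
};
var newRules = {
    start: [{ token: "string", regex: '"', next: "qstring" }],
    qstring: [{ token: "string", regex: '"', next: "start" }]
};

addRules($rules, newRules, "new-");
// $rules now has the states "start", "new-start" and "new-qstring".
```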
Extending Two Highlighters
The last function available to you combines both of these concepts, and it's called embedRules. It takes three parameters:
- An existing rule set to embed
- A prefix to apply to each state in the existing rule set
- A set of new states to add
Like addRules, embedRules adds on to the existing this.$rules object.
To explain this visually, let's take a look at the syntax highlighter for Lua pages, which combines all of these concepts:
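A sketch of that combination, using tiny stand-in rule sets so it runs without Ace. In a real highlighter the HTML rules would come from new HtmlHighlightRules().getRules(), and embedRules would be called as a method on the rules object; the free-standing `embedRules` function and the sample rules here are simplified assumptions.

```javascript
// Tiny stand-in rule sets for HTML and Lua.
var htmlRules = {
    start: [{ token: "meta.tag", regex: "<\\/?[a-zA-Z]+>" }]
};
var luaRules = {
    start: [{ token: "keyword", regex: "\\b(?:local|function|end)\\b" }]
};

// Simplified stand-in for embedRules: copy `embedded` into `target` under
// `prefix`, putting the escape rules first in each embedded state.
function embedRules(target, embedded, prefix, escapeRules) {
    Object.keys(embedded).forEach(function (state) {
        var copied = embedded[state].map(function (rule) {
            var copy = Object.assign({}, rule);
            if (typeof copy.next === "string") copy.next = prefix + copy.next;
            return copy;
        });
        target[prefix + state] = escapeRules.concat(copied);
    });
    return target;
}

var $rules = htmlRules;

// Entering Lua: these checks come first in every HTML state.
Object.keys($rules).forEach(function (state) {
    $rules[state].unshift(
        { token: "keyword", regex: "<%=?", next: "lua-start" },
        { token: "keyword", regex: "<\\?lua=?", next: "lua-start" }
    );
});

// Leaving Lua: these checks come first in every lua- state.
embedRules($rules, luaRules, "lua-", [
    { token: "keyword", regex: "%>", next: "start" },
    { token: "keyword", regex: "\\?>", next: "start" }
]);
```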
Here, this.$rules starts off as a set of HTML highlighting rules. To this set, we add two new checks for <%= and <?lua=. We also specify that if one of these rules is matched, we should move on to the lua-start state. Next, embedRules takes the already existing set of LuaHighlightRules and applies the lua- prefix to each state there. Finally, it adds two new checks for %> and ?>, allowing the state machine to return to start.
Code Folding
Adding new folding rules to your mode can be a little tricky. First, your mode definition has to load your folding rules and attach an instance of them to the mode, as the mode template does for its code folding rules.
You'll be defining your code folding rules in the lib/ace/mode/folding folder.
Just like TextMode for syntax highlighting, BaseFoldMode contains the starting point for code folding logic. foldingStartMarker defines your opening folding point, while foldingStopMarker defines the stopping point. For example, for a C-style folding system, these values might look like this:
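A sketch of such markers as plain variables. The two expressions are modelled on a C-style fold mode and should be treated as an assumption rather than a copy of any particular Ace release; in a fold mode they would be assigned to this.foldingStartMarker and this.foldingStopMarker.

```javascript
// Start: an opening { or [ with no closing bracket after it on the line,
// or a line that begins a block comment.
var foldingStartMarker = /(\{|\[)[^\}\]]*$|^\s*(\/\*)/;
// Stop: a closing } or ] before any opening bracket, or the end of a
// block comment.
var foldingStopMarker = /^[^\[\{]*(\}|\])|^[\s\*]*(\*\/)/;

var opening = foldingStartMarker.exec("hello_world() {");
var closing = foldingStopMarker.exec("}");
// opening[1] is "{" (found at opening.index) and closing[1] is "}".
```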
These regular expressions identify various symbols--{, [, //--to pay attention to. getFoldWidgetRange matches on these regular expressions and, when a match is found, returns the range of relevant folding points. For more information on the Range object, see the Ace API documentation.
Again, for a C-style folding mechanism, getFoldWidgetRange has to return a range for the starting fold. Let's say we stumble across the code block hello_world() {. Our range object here becomes:
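The exact values depend on Ace's Range object and session helpers, which this example does not use. As a rough sketch under that caveat, the hypothetical `findFoldRange` below works on an array of lines and returns a plain object shaped like a range (start and end, each with a row and a column), running from just after the opening bracket to just before its matching closing bracket. Block comments are ignored here.

```javascript
// Assumed C-style start marker (see the caveat above).
var foldingStartMarker = /(\{|\[)[^\}\]]*$|^\s*(\/\*)/;

// Hypothetical helper: find the fold range that starts on `row`.
function findFoldRange(lines, row) {
    var match = foldingStartMarker.exec(lines[row]);
    if (!match || !match[1]) return null;      // no bracket fold on this row
    var open = match[1];
    var close = open === "{" ? "}" : "]";
    var depth = 0;
    for (var r = row; r < lines.length; r++) {
        var line = lines[r];
        for (var c = (r === row ? match.index : 0); c < line.length; c++) {
            if (line[c] === open) {
                depth++;
            } else if (line[c] === close && --depth === 0) {
                return {
                    start: { row: row, column: match.index + 1 },
                    end: { row: r, column: c }
                };
            }
        }
    }
    return null;                               // unbalanced: nothing to fold
}

var lines = [
    "hello_world() {",
    "    return 1;",
    "}"
];
var range = findFoldRange(lines, 0);
// range is { start: { row: 0, column: 15 }, end: { row: 2, column: 0 } }
```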
Testing Your Highlighter
The best way to test your tokenizer is to see it live, right? To do that, you'll want to modify the live Ace demo to preview your changes. You can find this file in the root Ace directory with the name kitchen-sink.html.
- add an entry to supportedModes in ace/ext/modelist.js
- add a sample file to demo/kitchen-sink/docs/ with the same name as the mode file
Once you set this up, you should be able to witness a live demonstration of your new highlighter.
Adding Automated Tests
Adding automated tests for a highlighter is simple. You are not required to do it, but it can help during development.
In lib/ace/mode/_test, create a file named text_<modeName>.txt with some example code. (You can skip this if the document you have added in demo/docs both looks good and covers various edge cases in your language syntax.)
Run node highlight_rules_test.js -gen to preserve the current output of your tokenizer in tokens_<modeName>.json.
After this, running highlight_rules_test.js optionalLanguageName will compare the output of your tokenizer with the correct output you've created.
Any files ending with the _test.js suffix are automatically run by Ace's Travis CI server.