Lexer generators: lexertl [4] Adding file name and line number tracking
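Goal: store the file name and line number in each Token so that lexing and parsing can report more detailed information. This helps a great deal when you are debugging your analyzer.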
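Approach: I remembered that Boost.Spirit has a file_iterator class and a position_iterator class. After a closer look, they do satisfy the iterator requirements of lexertl's match_results class. So let's write a few lines of code to verify it.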
The file test.txt contains: abcd1234TTTT
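Running the program shows that the three tokens are parsed correctly, and that each one is printed with its start line and column and its end line and column.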
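The position-tracking idea can be sketched without Boost or lexertl. The block below is only an illustration under stated assumptions: pos_iterator, token, classify and tokenize are hypothetical names invented here, the iterator is a simplified stand-in for Boost.Spirit's position_iterator, and the hand-written grouping loop stands in for a lexertl state machine. It is not the code used in this post. End positions are reported as one past the last character, as with an iterator pair.

```cpp
#include <cctype>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical stand-in for a position-tracking iterator: it wraps a
// std::string::const_iterator and counts lines and columns (both 1-based).
class pos_iterator {
public:
    using iterator_category = std::forward_iterator_tag;
    using value_type = char;
    using difference_type = std::ptrdiff_t;
    using pointer = const char*;
    using reference = const char&;

    pos_iterator() = default;
    explicit pos_iterator(std::string::const_iterator it) : it_(it) {}

    reference operator*() const { return *it_; }

    pos_iterator& operator++() {
        // A newline moves to the start of the next line; any other
        // character advances the column by one.
        if (*it_ == '\n') { ++line_; column_ = 1; }
        else { ++column_; }
        ++it_;
        return *this;
    }
    pos_iterator operator++(int) { pos_iterator old = *this; ++*this; return old; }

    // Equality looks only at the underlying position in the buffer.
    bool operator==(const pos_iterator& o) const { return it_ == o.it_; }
    bool operator!=(const pos_iterator& o) const { return it_ != o.it_; }

    std::size_t line() const { return line_; }
    std::size_t column() const { return column_; }

private:
    std::string::const_iterator it_{};
    std::size_t line_ = 1;
    std::size_t column_ = 1;
};

// A token that remembers where it starts and where it ends (one past the end).
struct token {
    std::string text;
    std::size_t start_line, start_column, end_line, end_column;
};

enum class char_class { lower, digit, upper, other };

char_class classify(char c) {
    const unsigned char u = static_cast<unsigned char>(c);
    if (std::islower(u)) return char_class::lower;
    if (std::isdigit(u)) return char_class::digit;
    if (std::isupper(u)) return char_class::upper;
    return char_class::other;
}

// Groups runs of lowercase letters, digits, or uppercase letters into tokens
// and skips everything else. A real lexer would run a state machine here.
std::vector<token> tokenize(const std::string& input) {
    std::vector<token> tokens;
    pos_iterator cur(input.begin());
    const pos_iterator end(input.end());
    while (cur != end) {
        const char_class cls = classify(*cur);
        if (cls == char_class::other) { ++cur; continue; }
        const pos_iterator start = cur;  // copy keeps the start position
        std::string text;
        while (cur != end && classify(*cur) == cls) { text += *cur; ++cur; }
        tokens.push_back({text, start.line(), start.column(),
                          cur.line(), cur.column()});
    }
    return tokens;
}
```

Calling tokenize("abcd1234TTTT") yields three tokens, abcd, 1234 and TTTT, all on line 1. Because the position lives inside the iterator, copying the iterator at the start of a match is enough to remember where the token began. The same property is what lets a position-aware iterator be handed to a lexer that only expects an ordinary forward iterator.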
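Ben Hanson, the author of lexertl, appears to be preparing his own file_iterator for lexertl to replace the one in Boost.Spirit. I have copied his blog post below. If a separate file_iterator really is developed, we hope it will beat Boost.Spirit's file_iterator in both compile time and runtime performance.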
The lexertl Blog
29.09.2009
As I have recently started a revamp of lexertl I have decided to start a blog to keep everybody up to date. As this version is not feature complete yet, I have added a separate zip file which you can find here.
So far I have implemented the following improvements:
- Auto compression of wchar_t based state machines (overridable).
- A generic lookup mechanism based around iterators.
- Added the lexertl::skip token constant.
- Removed regex macro length limitation.
- Made the BOL (^) link a singleton (as it can only occur at the beginning of a token).
- debug::dump() now compresses ranges.
This dramatically reduces the list of (easier) features I wanted to add and just leaves the following for the immediate future:
- file_iterator (this will also replace the one in Boost.Spirit)
- Turn size_t into a templated type for state machine creation.
- Re-write the code generator.
- Redo serialisation.