Calcite(一):javacc语法框架及使用

  是一个动态数据管理框架。 它包含许多组成典型数据库管理系统的部分,但省略了存储原语。它提供了行业标准的SQL解析器和验证器,具有可插入规则和成本函数的可自定义优化器,逻辑和物理代数运算符,从SQL到代数(以及相反)的各种转换。

  以上是官方描述,用大白话描述就是,calcite实现了一套标准的sql解析功能,比如实现了标准hive sql的解析,可以避免繁杂且易出错的语法问题。并暴露了相关的扩展接口供用户自定义使用。其提供了逻辑计划修改功能,用户可以实现自己的优化。(害,好像还是很绕!不管了)

  

1. calcite的两大方向

  从核心功能上讲,或者某种程度上讲,我们可以将calicite分为两大块,一块是对sql语法的解析,另一块是对语义的转化与实现;

  为什么要将其分为几块呢?我们知道,基本上所有的分层,都是为了简化各层的逻辑。如果我们将所有的逻辑全放在一个层上,必然存在大量的耦合,互相嵌套,很难实现专业的人做专业的事。语法解析,本身是一件比较难的事情,但是因为有很多成熟的编译原理理论支持,所以,这方面有许多现成的实现可以利用,或者即使是自己单独实现这一块,也不会有太大麻烦。所以,这一层是一定要分出来的。

  而对语义的转化与实现,则是用户更关注的一层,如果说前面的语法是标准规范的话,那么语义才是实现者最关心的东西。规范是为了减轻使用者的使用难度,而其背后的逻辑则可能有天壤之别。当有了前面的语法解析树之后,再来进一步处理语义的东西,必然方便了许多。但也必定是个复杂的工作,因为上下文关联语义,并不好处理。

  而我们本篇只关注语法解析这一块功能,而calcite使用javacc作为其语法解析器,所以我们自然主关注向javacc了。与javacc类似的,还有antlr,这个留到我们后面再说。

  calcite中,javacc应该属于一阶段的编译,而java中引入javacc编译后的样板代码,再执行自己的逻辑,可以算作是二阶段编译。我们可以简单的参考下下面这个图,说明其意义。

 

 

2. javacc的语法框架

  本文仅站在一个使用者的角度来讲解javacc, 因为javacc本身必然是复杂的难以讲清的, 而且如果想要细致了解javacc则肯定是需要借助官网的。

  首先,来看下javacc的编码框架:

javacc_options                            /* javacc 的各种配置选项设置,需了解具体配置含义后以kv形式配置 */
"PARSER_BEGIN" "(" <IDENTIFIER> ")"        /* parser代码开始定义,标识下面的代码是纯粹使用java编写的 */
java_compilation_unit                    /* parser的入口代码编写,纯java, 此处将决定外部如何调用parser */
"PARSER_END" "(" <IDENTIFIER> ")"        /* parser代码结束标识,javacc将会把以上代码纯粹当作原文copy到parser中 */
( production )*                            /* 各种语法产生式,按照编译原理的类似样子定义语法产生式,由javacc去分析具体代码逻辑,嵌入到parser中,该部分产生式代码将被编译到上面的parser中,所以方法可以完全供parser调用 */
<EOF>                                    /* 文件结束标识 */

  以上就是javacc的语法定义的框架了,它是一个整个的parser.jj文件。即这个文件只要按照这种框架写了,然后调用javacc进行编译后,可以得到一系列的编译器样板代码了。

  但是,如何去编写去编写这些语法呢?啥都不知道,好尴尬。不着急,且看下一节。

 

3. javacc中的关键词及使用

  之所以我们无从下手写javacc的jj文件,是因为我们不知道有些什么关键词,以及没有给出一些样例。主要熟能生巧嘛。

  javacc中的关键词非常的少,一个是因为这种词法解析器的方法论是非常成熟的,它可以按照任意的语法作出解析。二一个是它不负责太多的业务实现相关的东西,它只管理解语义,翻译即可。而它其中仅有的几个关键词,也还有一些属于辅助类的功能。真正必须的关键词就更少了。列举如下:

TOKEN                        /* 定义一些确定的普通词或关键词,主要用于被引用 */
SPECIAL_TOKEN                /* 定义一些确定的特殊用途的普通词或关键词,主要用于被引用或抛弃 */
SKIP                        /* 定义一些需要跳过或者忽略的单词或短语,主要用于分词或者注释 */
MORE                        /* token的辅助定义工具,用于确定连续的多个token */
EOF                            /* 文件结束标识或者语句结束标识 */
IGNORE_CASE                    /* 辅助选项,忽略大小写 */
JAVACODE                    /* 辅助选项,用于标识本段代码是java */
LOOKAHEAD                    /* 语法二义性处理工具,用于预读多个token,以便明确语义 */
PARSER_BEGIN                /* 样板代码,固定开头 */
PARSER_END                    /* 样板代码,固定结尾 */
TOKEN_MGR_DECLS                /* 辅助选项 */

  有了这些关键词的定义,我们就可以来写个hello world 了。其主要作用就是验证语法是否是 hello world.

options {
    STATIC = false;
    ERROR_REPORTING = true;
    JAVA_UNICODE_ESCAPE = true;
    UNICODE_INPUT = false;
    IGNORE_CASE = true;
    DEBUG_PARSER = false;
    DEBUG_LOOKAHEAD = false;
    DEBUG_TOKEN_MANAGER = false;
}

PARSER_BEGIN(HelloWorldParser)

package my;

import java.io.FileInputStream;

/** 
 * hello world parser
 */
@SuppressWarnings({"nls", "unused"})
public class HelloWorldParser {

    /**
     * 测试入口
     */
    public static void main( String args[] ) throws Throwable {
        // 编译器会默认生成构造方法
        String sqlFilePath = args[0];
        final HelloWorldParser parser = new HelloWorldParser(new FileInputStream(sqlFilePath));
        try {
            parser.hello();
        } catch(Throwable t) {
            System.err.println(":1: not parsed");
            t.printStackTrace();
            return;
        }
        System.out.println("ok");
    }
    public void hello () throws ParseException {
        helloEof();
    }

} // end class

PARSER_END(HelloWorldParser)

void helloEof() :
{}
{
    // 匹配到hello world 后,打印文字,否则抛出异常
    (
    <HELLO>
    |
    "HELLO2"
    ) 
    <WORLD>
    { System.out.println("ok to match hello world."); }
}

TOKEN : 
{
    <HELLO: "hello">
|    <WORLD: "world">
}

SKIP:
{
    " "
|   "\t"
|   "\r"
|   "\n"
}

  命名为 hello.jj, 运行 javacc 编译该jj文件。

    > javacc hello.jj
    > javac my/*.java
    > java my.HelloWorldParser

  

4. javacc中的编译原理

  javacc作为一个词法解析器,其主要作用是提供词法解析功能。当然,只有它自己知道词是不够的,它还有一个非常重要的功能,能够翻译成java语言(不止java)的解析器,这样用户就可以调用这些解析器进行业务逻辑实现了。所以,从某种角度上说,它相当于是一个脚手架,帮我们生成一些模板代码。

  词法解析作为一个非常通用的话题,各种大牛科学家们,早就总结出非常多的方法论的东西了。即编译原理。但要想深入理解其理论,还是非常难的,只能各自随缘了。随便列举几个名词,供大家参考:

产生式
终结符与非终结符,运算分量
预测分析法,左递归,回溯,上下文无关
DFA, NFA, 正则匹配,模式,kmp算法,trie树
附加操作,声明
LL, LR, 二义性
词法
语法

  可以说,整个javacc就是编译原理的其中一小部分实现。当然了,我们平时遇到编译的地方非常多,因为我们所使用的语言,都需要被编译成汇编或机器语言,才能被执行,比如javacc, gcc...。所以,编译原理无处不在。

  这里,我们单说jj文件如何被编译成java文件?总体上,大的原理就按照编译原理来就好了。我们只说一些映射关系。

"a" "b" -> 代表多个连续token
| -> 对应if或者switch语义
(..)* -> 对应while语义
["a"] -> 对应if语句,可匹配0-1次
(): {} -> 对应语法的产生式
{} -> 附加操作,在匹配后嵌入执行
<id> 对应常量词或容易描述的token描述

  javacc 默认会生成几个辅助类:

XXConstants: 定义一些常量值,比如将TOKEN定义的值转换为一个个的数字;
HelloWorldParserTokenManager: token管理器, 用于读取token, 可以自定义处理;
JavaCharStream: CharStream的实现,会根据配置选项生成不同的类;
ParseException: 解析错误时抛出的类;
Token: 读取到的单词描述类;
TokenMgrError: 读取token错误时抛出的错误;

  具体看下javacc中有些什么选项配置,请查看官网。https://javacc.github.io/javacc/documentation/grammar.html#javacc-options

  从编写代码的角度来说,我们基本上只要掌握基本的样板格式和正则表达式就可以写出javacc的语法了。如果想要在具体的java代码中应用,则需要自己组织需要的语法树结构或其他了。

 

5. javacc 编译实现源码解析

  javacc本身也是用java写的,可读性还是比较强的。我们就略微扫一下吧。它的仓库地址: https://github.com/javacc/javacc

  其入口在: src/main/java/org/javacc/parser/Main.java

/**
   * A main program that exercises the parser.
   */
  public static void main(String args[]) throws Exception {
    int errorcode = mainProgram(args);
    System.exit(errorcode);
  }

  /**
   * The method to call to exercise the parser from other Java programs.
   * It returns an error code.  See how the main program above uses
   * this method.
   */
  public static int mainProgram(String args[]) throws Exception {

    if (args.length == 1 && args[args.length -1].equalsIgnoreCase("-version")) {
        System.out.println(Version.versionNumber);
        return 0;
    }
    
    // Initialize all static state
    reInitAll();

    JavaCCGlobals.bannerLine("Parser Generator", "");

    JavaCCParser parser = null;
    if (args.length == 0) {
      System.out.println("");
      help_message();
      return 1;
    } else {
      System.out.println("(type \"javacc\" with no arguments for help)");
    }

    if (Options.isOption(args[args.length-1])) {
      System.out.println("Last argument \"" + args[args.length-1] + "\" is not a filename.");
      return 1;
    }
    for (int arg = 0; arg < args.length-1; arg++) {
      if (!Options.isOption(args[arg])) {
        System.out.println("Argument \"" + args[arg] + "\" must be an option setting.");
        return 1;
      }
      Options.setCmdLineOption(args[arg]);
    }




    try {
      java.io.File fp = new java.io.File(args[args.length-1]);
      if (!fp.exists()) {
         System.out.println("File " + args[args.length-1] + " not found.");
         return 1;
      }
      if (fp.isDirectory()) {
         System.out.println(args[args.length-1] + " is a directory. Please use a valid file name.");
         return 1;
      }
      // javacc 本身也使用的语法解析器生成 JavaCCParser, 即相当于自依赖咯
      parser = new JavaCCParser(new java.io.BufferedReader(new java.io.InputStreamReader(new java.io.FileInputStream(args[args.length-1]), Options.getGrammarEncoding())));
    } catch (SecurityException se) {
      System.out.println("Security violation while trying to open " + args[args.length-1]);
      return 1;
    } catch (java.io.FileNotFoundException e) {
      System.out.println("File " + args[args.length-1] + " not found.");
      return 1;
    }

    try {
      System.out.println("Reading from file " + args[args.length-1] + " . . .");
      // 使用静态变量来实现全局数据共享
      JavaCCGlobals.fileName = JavaCCGlobals.origFileName = args[args.length-1];
      JavaCCGlobals.jjtreeGenerated = JavaCCGlobals.isGeneratedBy("JJTree", args[args.length-1]);
      JavaCCGlobals.toolNames = JavaCCGlobals.getToolNames(args[args.length-1]);
      // javacc 语法解析入口
      // 经过解析后,它会将各种解析数据放入到全局变量中
      parser.javacc_input();

      // 2012/05/02 - Moved this here as cannot evaluate output language
      // until the cc file has been processed. Was previously setting the 'lg' variable
      // to a lexer before the configuration override in the cc file had been read.
      String outputLanguage = Options.getOutputLanguage();
      // TODO :: CBA --  Require Unification of output language specific processing into a single Enum class
        boolean isJavaOutput = Options.isOutputLanguageJava();
        boolean isCPPOutput = outputLanguage.equals(Options.OUTPUT_LANGUAGE__CPP);

        // 2013/07/22 Java Modern is a
        boolean isJavaModern = isJavaOutput && Options.getJavaTemplateType().equals(Options.JAVA_TEMPLATE_TYPE_MODERN);

        if (isJavaOutput) {
        lg = new LexGen();
      } else if (isCPPOutput) {
        lg = new LexGenCPP();
      } else {
          return unhandledLanguageExit(outputLanguage);
      }

      JavaCCGlobals.createOutputDir(Options.getOutputDirectory());

      if (Options.getUnicodeInput())
      {
         NfaState.unicodeWarningGiven = true;
         System.out.println("Note: UNICODE_INPUT option is specified. " +
              "Please make sure you create the parser/lexer using a Reader with the correct character encoding.");
      }
      // 将词法解析得到的信息,重新语义加强,构造出更连贯的上下文信息,供后续使用
      Semanticize.start();
      boolean isBuildParser = Options.getBuildParser();

       // 2012/05/02 -- This is not the best way to add-in GWT support, really the code needs to turn supported languages into enumerations
      // and have the enumerations describe the deltas between the outputs. The current approach means that per-langauge configuration is distributed
      // and small changes between targets does not benefit from inheritance.
        if (isJavaOutput) {
            if (isBuildParser) {
                // 1. 生成parser框架信息
                new ParseGen().start(isJavaModern);
            }

            // Must always create the lexer object even if not building a parser.
            // 2. 生成语法解析信息
            new LexGen().start();
            // 3. 生成其他辅助类
            Options.setStringOption(Options.NONUSER_OPTION__PARSER_NAME, JavaCCGlobals.cu_name);
            OtherFilesGen.start(isJavaModern);
        } else if (isCPPOutput) { // C++ for now
            if (isBuildParser) {
                new ParseGenCPP().start();
            }
            if (isBuildParser) {
                new LexGenCPP().start();
            }
            Options.setStringOption(Options.NONUSER_OPTION__PARSER_NAME, JavaCCGlobals.cu_name);
            OtherFilesGenCPP.start();
        } else {
            unhandledLanguageExit(outputLanguage);
        }


      // 编译结果状态判定,输出
      if ((JavaCCErrors.get_error_count() == 0) && (isBuildParser || Options.getBuildTokenManager())) {
        if (JavaCCErrors.get_warning_count() == 0) {
            if (isBuildParser) {
                System.out.println("Parser generated successfully.");
            }
        } else {
          System.out.println("Parser generated with 0 errors and "
                             + JavaCCErrors.get_warning_count() + " warnings.");
        }
        return 0;
      } else {
        System.out.println("Detected " + JavaCCErrors.get_error_count() + " errors and "
                           + JavaCCErrors.get_warning_count() + " warnings.");
        return (JavaCCErrors.get_error_count()==0)?0:1;
      }
    } catch (MetaParseException e) {
      System.out.println("Detected " + JavaCCErrors.get_error_count() + " errors and "
                         + JavaCCErrors.get_warning_count() + " warnings.");
      return 1;
    } catch (ParseException e) {
      System.out.println(e.toString());
      System.out.println("Detected " + (JavaCCErrors.get_error_count()+1) + " errors and "
                         + JavaCCErrors.get_warning_count() + " warnings.");
      return 1;
    }
  }

  以上,就是javacc的编译运行框架,其词法解析仍然靠着自身的jj文件,生成的 JavaCCParser 进行解析:

    1. 生成的 JavaCCParser, 然后调用 javacc_input() 解析出词法信息;
    2. 将解析出的语法信息放入到全局变量中;
    3. 使用Semanticize 将词法语义加强,转换为javacc可处理的结构;
    4. 使用ParseGen 生成parser框架信息;
    5. 使用LexGen 生成语法描述方法;
    6. 使用OtherFilesGen 生成同级辅助类;

  下面我们就前面几个重点类,展开看看其实现就差不多了。

 

5.1. javacc语法定义

  前面说了,javacc在编译其他语言时,它自己又定义了一个语法文件,用于第一步的词法分析。可见这功能的普启遍性。我们大致看下入口即可,更多完整源码可查看: src/main/javacc/JavaCC.jj

void javacc_input() :
    {
      String id1, id2;
      initialize();
    }
{
  javacc_options()
  {
  }
    "PARSER_BEGIN" "(" id1=identifier()
    {
      addcuname(id1);
    }
                 ")"
    {
      processing_cu = true;
      parser_class_name = id1;

          if (!isJavaLanguage()) {
            JavaCCGlobals.otherLanguageDeclTokenBeg = getToken(1);
            while(getToken(1).kind != _PARSER_END) {
              getNextToken();
            }
            JavaCCGlobals.otherLanguageDeclTokenEnd = getToken(1);
          }
    }
    CompilationUnit()
    {
      processing_cu = false;
    }
    "PARSER_END" "(" id2=identifier()
    {
      compare(getToken(0), id1, id2);
    }
    ")"
  ( production() )+
  <EOF>
}
...

  可以看出,这种语法定义,与说明文档相差不太多,可以说是一种比较接近自然语言的实现了。

 

5.2. Semanticize 语义处理

  Semanticize 将前面词法解析得到数据,进一步转换成容易被理解的语法树或者其他信息。

  // org.javacc.parser.Semanticize#start
  static public void start() throws MetaParseException {

    if (JavaCCErrors.get_error_count() != 0) throw new MetaParseException();

    if (Options.getLookahead() > 1 && !Options.getForceLaCheck() && Options.getSanityCheck()) {
      JavaCCErrors.warning("Lookahead adequacy checking not being performed since option LOOKAHEAD " +
              "is more than 1.  Set option FORCE_LA_CHECK to true to force checking.");
    }

    /*
     * The following walks the entire parse tree to convert all LOOKAHEAD's
     * that are not at choice points (but at beginning of sequences) and converts
     * them to trivial choices.  This way, their semantic lookahead specification
     * can be evaluated during other lookahead evaluations.
     */
    for (Iterator<NormalProduction> it = bnfproductions.iterator(); it.hasNext();) {
      ExpansionTreeWalker.postOrderWalk(((NormalProduction)it.next()).getExpansion(),
                                        new LookaheadFixer());
    }

    /*
     * The following loop populates "production_table"
     */
    for (Iterator<NormalProduction> it = bnfproductions.iterator(); it.hasNext();) {
      NormalProduction p = it.next();
      if (production_table.put(p.getLhs(), p) != null) {
        JavaCCErrors.semantic_error(p, p.getLhs() + " occurs on the left hand side of more than one production.");
      }
    }

    /*
     * The following walks the entire parse tree to make sure that all
     * non-terminals on RHS's are defined on the LHS.
     */
    for (Iterator<NormalProduction> it = bnfproductions.iterator(); it.hasNext();) {
      ExpansionTreeWalker.preOrderWalk((it.next()).getExpansion(), new ProductionDefinedChecker());
    }

    /*
     * The following loop ensures that all target lexical states are
     * defined.  Also piggybacking on this loop is the detection of
     * <EOF> and <name> in token productions.  After reporting an
     * error, these entries are removed.  Also checked are definitions
     * on inline private regular expressions.
     * This loop works slightly differently when USER_TOKEN_MANAGER
     * is set to true.  In this case, <name> occurrences are OK, while
     * regular expression specs generate a warning.
     */
    for (Iterator<TokenProduction> it = rexprlist.iterator(); it.hasNext();) {
      TokenProduction tp = (TokenProduction)(it.next());
      List<RegExprSpec> respecs = tp.respecs;
      for (Iterator<RegExprSpec> it1 = respecs.iterator(); it1.hasNext();) {
        RegExprSpec res = (RegExprSpec)(it1.next());
        if (res.nextState != null) {
          if (lexstate_S2I.get(res.nextState) == null) {
            JavaCCErrors.semantic_error(res.nsTok, "Lexical state \"" + res.nextState +
                                                   "\" has not been defined.");
          }
        }
        if (res.rexp instanceof REndOfFile) {
          //JavaCCErrors.semantic_error(res.rexp, "Badly placed <EOF>.");
          if (tp.lexStates != null)
            JavaCCErrors.semantic_error(res.rexp, "EOF action/state change must be specified for all states, " +
                    "i.e., <*>TOKEN:.");
          if (tp.kind != TokenProduction.TOKEN)
            JavaCCErrors.semantic_error(res.rexp, "EOF action/state change can be specified only in a " +
                    "TOKEN specification.");
          if (nextStateForEof != null || actForEof != null)
            JavaCCErrors.semantic_error(res.rexp, "Duplicate action/state change specification for <EOF>.");
          actForEof = res.act;
          nextStateForEof = res.nextState;
          prepareToRemove(respecs, res);
        } else if (tp.isExplicit && Options.getUserTokenManager()) {
          JavaCCErrors.warning(res.rexp, "Ignoring regular expression specification since " +
                  "option USER_TOKEN_MANAGER has been set to true.");
        } else if (tp.isExplicit && !Options.getUserTokenManager() && res.rexp instanceof RJustName) {
            JavaCCErrors.warning(res.rexp, "Ignoring free-standing regular expression reference.  " +
                    "If you really want this, you must give it a different label as <NEWLABEL:<"
                    + res.rexp.label + ">>.");
            prepareToRemove(respecs, res);
        } else if (!tp.isExplicit && res.rexp.private_rexp) {
          JavaCCErrors.semantic_error(res.rexp, "Private (#) regular expression cannot be defined within " +
                  "grammar productions.");
        }
      }
    }

    removePreparedItems();

    /*
     * The following loop inserts all names of regular expressions into
     * "named_tokens_table" and "ordered_named_tokens".
     * Duplications are flagged as errors.
     */
    for (Iterator<TokenProduction> it = rexprlist.iterator(); it.hasNext();) {
      TokenProduction tp = (TokenProduction)(it.next());
      List<RegExprSpec> respecs = tp.respecs;
      for (Iterator<RegExprSpec> it1 = respecs.iterator(); it1.hasNext();) {
        RegExprSpec res = (RegExprSpec)(it1.next());
        if (!(res.rexp instanceof RJustName) && !res.rexp.label.equals("")) {
          String s = res.rexp.label;
          Object obj = named_tokens_table.put(s, res.rexp);
          if (obj != null) {
            JavaCCErrors.semantic_error(res.rexp, "Multiply defined lexical token name \"" + s + "\".");
          } else {
            ordered_named_tokens.add(res.rexp);
          }
          if (lexstate_S2I.get(s) != null) {
            JavaCCErrors.semantic_error(res.rexp, "Lexical token name \"" + s + "\" is the same as " +
                    "that of a lexical state.");
          }
        }
      }
    }

    /*
     * The following code merges multiple uses of the same string in the same
     * lexical state and produces error messages when there are multiple
     * explicit occurrences (outside the BNF) of the string in the same
     * lexical state, or when within BNF occurrences of a string are duplicates
     * of those that occur as non-TOKEN's (SKIP, MORE, SPECIAL_TOKEN) or private
     * regular expressions.  While doing this, this code also numbers all
     * regular expressions (by setting their ordinal values), and populates the
     * table "names_of_tokens".
     */

    tokenCount = 1;
    for (Iterator<TokenProduction> it = rexprlist.iterator(); it.hasNext();) {
      TokenProduction tp = (TokenProduction)(it.next());
      List<RegExprSpec> respecs = tp.respecs;
      if (tp.lexStates == null) {
        tp.lexStates = new String[lexstate_I2S.size()];
        int i = 0;
        for (Enumeration<String> enum1 = lexstate_I2S.elements(); enum1.hasMoreElements();) {
          tp.lexStates[i++] = (String)(enum1.nextElement());
        }
      }
      Hashtable table[] = new Hashtable[tp.lexStates.length];
      for (int i = 0; i < tp.lexStates.length; i++) {
        table[i] = (Hashtable)simple_tokens_table.get(tp.lexStates[i]);
      }
      for (Iterator<RegExprSpec> it1 = respecs.iterator(); it1.hasNext();) {
        RegExprSpec res = (RegExprSpec)(it1.next());
        if (res.rexp instanceof RStringLiteral) {
          RStringLiteral sl = (RStringLiteral)res.rexp;
          // This loop performs the checks and actions with respect to each lexical state.
          for (int i = 0; i < table.length; i++) {
            // Get table of all case variants of "sl.image" into table2.
            Hashtable table2 = (Hashtable)(table[i].get(sl.image.toUpperCase()));
            if (table2 == null) {
              // There are no case variants of "sl.image" earlier than the current one.
              // So go ahead and insert this item.
              if (sl.ordinal == 0) {
                sl.ordinal = tokenCount++;
              }
              table2 = new Hashtable();
              table2.put(sl.image, sl);
              table[i].put(sl.image.toUpperCase(), table2);
            } else if (hasIgnoreCase(table2, sl.image)) { // hasIgnoreCase sets "other" if it is found.
              // Since IGNORE_CASE version exists, current one is useless and bad.
              if (!sl.tpContext.isExplicit) {
                // inline BNF string is used earlier with an IGNORE_CASE.
                JavaCCErrors.semantic_error(sl, "String \"" + sl.image + "\" can never be matched " +
                        "due to presence of more general (IGNORE_CASE) regular expression " +
                        "at line " + other.getLine() + ", column " + other.getColumn() + ".");
              } else {
                // give the standard error message.
                JavaCCErrors.semantic_error(sl, "Duplicate definition of string token \"" + sl.image + "\" " +
                        "can never be matched.");
              }
            } else if (sl.tpContext.ignoreCase) {
              // This has to be explicit.  A warning needs to be given with respect
              // to all previous strings.
              String pos = ""; int count = 0;
              for (Enumeration<RegularExpression> enum2 = table2.elements(); enum2.hasMoreElements();) {
                RegularExpression rexp = (RegularExpression)(enum2.nextElement());
                if (count != 0) pos += ",";
                pos += " line " + rexp.getLine();
                count++;
              }
              if (count == 1) {
                JavaCCErrors.warning(sl, "String with IGNORE_CASE is partially superseded by string at" + pos + ".");
              } else {
                JavaCCErrors.warning(sl, "String with IGNORE_CASE is partially superseded by strings at" + pos + ".");
              }
              // This entry is legitimate.  So insert it.
              if (sl.ordinal == 0) {
                sl.ordinal = tokenCount++;
              }
              table2.put(sl.image, sl);
              // The above "put" may override an existing entry (that is not IGNORE_CASE) and that's
              // the desired behavior.
            } else {
              // The rest of the cases do not involve IGNORE_CASE.
              RegularExpression re = (RegularExpression)table2.get(sl.image);
              if (re == null) {
                if (sl.ordinal == 0) {
                  sl.ordinal = tokenCount++;
                }
                table2.put(sl.image, sl);
              } else if (tp.isExplicit) {
                // This is an error even if the first occurrence was implicit.
                if (tp.lexStates[i].equals("DEFAULT")) {
                  JavaCCErrors.semantic_error(sl, "Duplicate definition of string token \"" + sl.image + "\".");
                } else {
                  JavaCCErrors.semantic_error(sl, "Duplicate definition of string token \"" + sl.image +
                          "\" in lexical state \"" + tp.lexStates[i] + "\".");
                }
              } else if (re.tpContext.kind != TokenProduction.TOKEN) {
                JavaCCErrors.semantic_error(sl, "String token \"" + sl.image + "\" has been defined as a \"" +
                        TokenProduction.kindImage[re.tpContext.kind] + "\" token.");
              } else if (re.private_rexp) {
                JavaCCErrors.semantic_error(sl, "String token \"" + sl.image +
                        "\" has been defined as a private regular expression.");
              } else {
                // This is now a legitimate reference to an existing RStringLiteral.
                // So we assign it a number and take it out of "rexprlist".
                // Therefore, if all is OK (no errors), then there will be only unequal
                // string literals in each lexical state.  Note that the only way
                // this can be legal is if this is a string declared inline within the
                // BNF.  Hence, it belongs to only one lexical state - namely "DEFAULT".
                sl.ordinal = re.ordinal;
                prepareToRemove(respecs, res);
              }
            }
          }
        } else if (!(res.rexp instanceof RJustName)) {
          res.rexp.ordinal = tokenCount++;
        }
        if (!(res.rexp instanceof RJustName) && !res.rexp.label.equals("")) {
          names_of_tokens.put(new Integer(res.rexp.ordinal), res.rexp.label);
        }
        if (!(res.rexp instanceof RJustName)) {
          rexps_of_tokens.put(new Integer(res.rexp.ordinal), res.rexp);
        }
      }
    }

    removePreparedItems();

    /*
     * The following code performs a tree walk on all regular expressions
     * attaching links to "RJustName"s.  Error messages are given if
     * undeclared names are used, or if "RJustNames" refer to private
     * regular expressions or to regular expressions of any kind other
     * than TOKEN.  In addition, this loop also removes top level
     * "RJustName"s from "rexprlist".
     * This code is not executed if Options.getUserTokenManager() is set to
     * true.  Instead the following block of code is executed.
     */

    if (!Options.getUserTokenManager()) {
      FixRJustNames frjn = new FixRJustNames();
      for (Iterator<TokenProduction> it = rexprlist.iterator(); it.hasNext();) {
        TokenProduction tp = (TokenProduction)(it.next());
        List<RegExprSpec> respecs = tp.respecs;
        for (Iterator<RegExprSpec> it1 = respecs.iterator(); it1.hasNext();) {
          RegExprSpec res = (RegExprSpec)(it1.next());
          frjn.root = res.rexp;
          ExpansionTreeWalker.preOrderWalk(res.rexp, frjn);
          if (res.rexp instanceof RJustName) {
            prepareToRemove(respecs, res);
          }
        }
      }
    }

    removePreparedItems();

    /*
     * The following code is executed only if Options.getUserTokenManager() is
     * set to true.  This code visits all top-level "RJustName"s (ignores
     * "RJustName"s nested within regular expressions).  Since regular expressions
     * are optional in this case, "RJustName"s without corresponding regular
     * expressions are given ordinal values here.  If "RJustName"s refer to
     * a named regular expression, their ordinal values are set to reflect this.
     * All but one "RJustName" node is removed from the lists by the end of
     * execution of this code.
     */

    if (Options.getUserTokenManager()) {
      for (Iterator<TokenProduction> it = rexprlist.iterator(); it.hasNext();) {
        TokenProduction tp = (TokenProduction)(it.next());
        List<RegExprSpec> respecs = tp.respecs;
        for (Iterator<RegExprSpec> it1 = respecs.iterator(); it1.hasNext();) {
          RegExprSpec res = (RegExprSpec)(it1.next());
          if (res.rexp instanceof RJustName) {

            RJustName jn = (RJustName)res.rexp;
            RegularExpression rexp = (RegularExpression)named_tokens_table.get(jn.label);
            if (rexp == null) {
              jn.ordinal = tokenCount++;
              named_tokens_table.put(jn.label, jn);
              ordered_named_tokens.add(jn);
              names_of_tokens.put(new Integer(jn.ordinal), jn.label);
            } else {
              jn.ordinal = rexp.ordinal;
              prepareToRemove(respecs, res);
            }
          }
        }
      }
    }

    removePreparedItems();

    /*
     * The following code is executed only if Options.getUserTokenManager() is
     * set to true.  This loop labels any unlabeled regular expression and
     * prints a warning that it is doing so.  These labels are added to
     * "ordered_named_tokens" so that they may be generated into the ...Constants
     * file.
     */
    if (Options.getUserTokenManager()) {
      for (Iterator<TokenProduction> it = rexprlist.iterator(); it.hasNext();) {
        TokenProduction tp = (TokenProduction)(it.next());
        List<RegExprSpec> respecs = tp.respecs;
        for (Iterator<RegExprSpec> it1 = respecs.iterator(); it1.hasNext();) {
          RegExprSpec res = (RegExprSpec)(it1.next());
          Integer ii = new Integer(res.rexp.ordinal);
          if (names_of_tokens.get(ii) == null) {
            JavaCCErrors.warning(res.rexp, "Unlabeled regular expression cannot be referred to by " +
                    "user generated token manager.");
          }
        }
      }
    }

    if (JavaCCErrors.get_error_count() != 0) throw new MetaParseException();

    // The following code sets the value of the "emptyPossible" field of NormalProduction
    // nodes.  This field is initialized to false, and then the entire list of
    // productions is processed.  This is repeated as long as at least one item
    // got updated from false to true in the pass.
    boolean emptyUpdate = true;
    while (emptyUpdate) {
      emptyUpdate = false;
      for (Iterator<NormalProduction> it = bnfproductions.iterator(); it.hasNext();) {
        NormalProduction prod = (NormalProduction)it.next();
        if (emptyExpansionExists(prod.getExpansion())) {
          if (!prod.isEmptyPossible()) {
            emptyUpdate = prod.setEmptyPossible(true);
          }
        }
      }
    }

    if (Options.getSanityCheck() && JavaCCErrors.get_error_count() == 0) {

      // The following code checks that all ZeroOrMore, ZeroOrOne, and OneOrMore nodes
      // do not contain expansions that can expand to the empty token list.
      for (Iterator<NormalProduction> it = bnfproductions.iterator(); it.hasNext();) {
        ExpansionTreeWalker.preOrderWalk(((NormalProduction)it.next()).getExpansion(), new EmptyChecker());
      }

      // The following code goes through the productions and adds pointers to other
      // productions that it can expand to without consuming any tokens.  Once this is
      // done, a left-recursion check can be performed.
      for (Iterator<NormalProduction> it = bnfproductions.iterator(); it.hasNext();) {
        NormalProduction prod = it.next();
        addLeftMost(prod, prod.getExpansion());
      }

      // Now the following loop calls a recursive walk routine that searches for
      // actual left recursions.  The way the algorithm is coded, once a node has
      // been determined to participate in a left recursive loop, it is not tried
      // in any other loop.
      for (Iterator<NormalProduction> it = bnfproductions.iterator(); it.hasNext();) {
        NormalProduction prod = it.next();
        if (prod.getWalkStatus() == 0) {
          prodWalk(prod);
        }
      }

      // Now we do a similar, but much simpler walk for the regular expression part of
      // the grammar.  Here we are looking for any kind of loop, not just left recursions,
      // so we only need to do the equivalent of the above walk.
      // This is not done if option USER_TOKEN_MANAGER is set to true.
      if (!Options.getUserTokenManager()) {
        for (Iterator<TokenProduction> it = rexprlist.iterator(); it.hasNext();) {
          TokenProduction tp = (TokenProduction)(it.next());
          List<RegExprSpec> respecs = tp.respecs;
          for (Iterator<RegExprSpec> it1 = respecs.iterator(); it1.hasNext();) {
            RegExprSpec res = (RegExprSpec)(it1.next());
            RegularExpression rexp = res.rexp;
            if (rexp.walkStatus == 0) {
              rexp.walkStatus = -1;
              if (rexpWalk(rexp)) {
                loopString = "..." + rexp.label + "... --> " + loopString;
                JavaCCErrors.semantic_error(rexp, "Loop in regular expression detected: \"" + loopString + "\"");
              }
              rexp.walkStatus = 1;
            }
          }
        }
      }

      /*
       * The following code performs the lookahead ambiguity checking.
       */
      if (JavaCCErrors.get_error_count() == 0) {
        for (Iterator<NormalProduction> it = bnfproductions.iterator(); it.hasNext();) {
          ExpansionTreeWalker.preOrderWalk((it.next()).getExpansion(), new LookaheadChecker());
        }
      }

    } // matches "if (Options.getSanityCheck()) {"

    if (JavaCCErrors.get_error_count() != 0) throw new MetaParseException();

  }
  
  // org.javacc.parser.ExpansionTreeWalker#postOrderWalk
  // 后续遍历节点,与前序遍历类似
  /**
   * Visits the nodes of the tree rooted at "node" in post-order.
   * i.e., it visits the children first and then executes
   * opObj.action.
   */
  static void postOrderWalk(Expansion node, TreeWalkerOp opObj) {
    if (opObj.goDeeper(node)) {
      if (node instanceof Choice) {
        for (Iterator it = ((Choice)node).getChoices().iterator(); it.hasNext();) {
          postOrderWalk((Expansion)it.next(), opObj);
        }
      } else if (node instanceof Sequence) {
        for (Iterator it = ((Sequence)node).units.iterator(); it.hasNext();) {
          postOrderWalk((Expansion)it.next(), opObj);
        }
      } else if (node instanceof OneOrMore) {
        postOrderWalk(((OneOrMore)node).expansion, opObj);
      } else if (node instanceof ZeroOrMore) {
        postOrderWalk(((ZeroOrMore)node).expansion, opObj);
      } else if (node instanceof ZeroOrOne) {
        postOrderWalk(((ZeroOrOne)node).expansion, opObj);
      } else if (node instanceof Lookahead) {
        Expansion nested_e = ((Lookahead)node).getLaExpansion();
        if (!(nested_e instanceof Sequence && (Expansion)(((Sequence)nested_e).units.get(0)) == node)) {
          postOrderWalk(nested_e, opObj);
        }
      } else if (node instanceof TryBlock) {
        postOrderWalk(((TryBlock)node).exp, opObj);
      } else if (node instanceof RChoice) {
        for (Iterator it = ((RChoice)node).getChoices().iterator(); it.hasNext();) {
          postOrderWalk((Expansion)it.next(), opObj);
        }
      } else if (node instanceof RSequence) {
        for (Iterator it = ((RSequence)node).units.iterator(); it.hasNext();) {
          postOrderWalk((Expansion)it.next(), opObj);
        }
      } else if (node instanceof ROneOrMore) {
        postOrderWalk(((ROneOrMore)node).regexpr, opObj);
      } else if (node instanceof RZeroOrMore) {
        postOrderWalk(((RZeroOrMore)node).regexpr, opObj);
      } else if (node instanceof RZeroOrOne) {
        postOrderWalk(((RZeroOrOne)node).regexpr, opObj);
      } else if (node instanceof RRepetitionRange) {
        postOrderWalk(((RRepetitionRange)node).regexpr, opObj);
      }
    }
    opObj.action(node);
  }

  

5.3. ParseGen 生成parser框架

  ParseGen 生成一些header, 将java_compilation 写进去等。

    // org.javacc.parser.ParseGen#start
    public void start(boolean isJavaModernMode) throws MetaParseException {

        Token t = null;

        if (JavaCCErrors.get_error_count() != 0) {
            throw new MetaParseException();
        }

        if (Options.getBuildParser()) {
            final List<String> tn = new ArrayList<String>(toolNames);
            tn.add(toolName);

            // This is the first line generated -- the the comment line at the top of the generated parser
            genCodeLine("/* " + getIdString(tn, cu_name + ".java") + " */");

            boolean implementsExists = false;
            final boolean extendsExists = false;

            if (cu_to_insertion_point_1.size() != 0) {
                Object firstToken = cu_to_insertion_point_1.get(0);
                printTokenSetup((Token) firstToken);
                ccol = 1;
                for (final Iterator<Token> it = cu_to_insertion_point_1.iterator(); it.hasNext();) {
                    t = it.next();
                    if (t.kind == IMPLEMENTS) {
                        implementsExists = true;
                    } else if (t.kind == CLASS) {
                        implementsExists = false;
                    }

                    printToken(t);
                }
            }

            if (implementsExists) {
                genCode(", ");
            } else {
                genCode(" implements ");
            }
            genCode(cu_name + "Constants ");
            if (cu_to_insertion_point_2.size() != 0) {
                printTokenSetup((Token) (cu_to_insertion_point_2.get(0)));
                for (final Iterator<Token> it = cu_to_insertion_point_2.iterator(); it.hasNext();) {
                    printToken(it.next());
                }
            }

            genCodeLine("");
            genCodeLine("");

            new ParseEngine().build(this);

            if (Options.getStatic()) {
                genCodeLine("  static private " + Options.getBooleanType()
                        + " jj_initialized_once = false;");
            }
            if (Options.getUserTokenManager()) {
                genCodeLine("  /** User defined Token Manager. */");
                genCodeLine("  " + staticOpt() + "public TokenManager token_source;");
            } else {
                genCodeLine("  /** Generated Token Manager. */");
                genCodeLine("  " + staticOpt() + "public " + cu_name + "TokenManager token_source;");
                if (!Options.getUserCharStream()) {
                    if (Options.getJavaUnicodeEscape()) {
                        genCodeLine("  " + staticOpt() + "JavaCharStream jj_input_stream;");
                    } else {
                        genCodeLine("  " + staticOpt() + "SimpleCharStream jj_input_stream;");
                    }
                }
            }
            genCodeLine("  /** Current token. */");
            genCodeLine("  " + staticOpt() + "public Token token;");
            genCodeLine("  /** Next token. */");
            genCodeLine("  " + staticOpt() + "public Token jj_nt;");
            if (!Options.getCacheTokens()) {
                genCodeLine("  " + staticOpt() + "private int jj_ntk;");
            }
            if (Options.getDepthLimit() > 0) {
                genCodeLine("  " + staticOpt() + "private int jj_depth;");
            }
            if (jj2index != 0) {
                genCodeLine("  " + staticOpt() + "private Token jj_scanpos, jj_lastpos;");
                genCodeLine("  " + staticOpt() + "private int jj_la;");
                if (lookaheadNeeded) {
                    genCodeLine("  /** Whether we are looking ahead. */");
                    genCodeLine("  " + staticOpt() + "private " + Options.getBooleanType()
                            + " jj_lookingAhead = false;");
                    genCodeLine("  " + staticOpt() + "private " + Options.getBooleanType()
                            + " jj_semLA;");
                }
            }
            if (Options.getErrorReporting()) {
                genCodeLine("  " + staticOpt() + "private int jj_gen;");
                genCodeLine("  " + staticOpt() + "final private int[] jj_la1 = new int["
                        + maskindex + "];");
                final int tokenMaskSize = (tokenCount - 1) / 32 + 1;
                for (int i = 0; i < tokenMaskSize; i++) {
                    genCodeLine("  static private int[] jj_la1_" + i + ";");
                }
                genCodeLine("  static {");
                for (int i = 0; i < tokenMaskSize; i++) {
                    genCodeLine("       jj_la1_init_" + i + "();");
                }
                genCodeLine("    }");
                for (int i = 0; i < tokenMaskSize; i++) {
                    genCodeLine("    private static void jj_la1_init_" + i + "() {");
                    genCode("       jj_la1_" + i + " = new int[] {");
                    for (final Iterator it = maskVals.iterator(); it.hasNext();) {
                        final int[] tokenMask = (int[]) (it.next());
                        genCode("0x" + Integer.toHexString(tokenMask[i]) + ",");
                    }
                    genCodeLine("};");
                    genCodeLine("    }");
                }
            }
            if (jj2index != 0 && Options.getErrorReporting()) {
                genCodeLine("  " + staticOpt() + "final private JJCalls[] jj_2_rtns = new JJCalls["
                        + jj2index + "];");
                genCodeLine("  " + staticOpt() + "private " + Options.getBooleanType()
                        + " jj_rescan = false;");
                genCodeLine("  " + staticOpt() + "private int jj_gc = 0;");
            }
            genCodeLine("");

            if (Options.getDebugParser()) {
                genCodeLine("  {");
                genCodeLine("      enable_tracing();");
                genCodeLine("  }");
            }

            if (!Options.getUserTokenManager()) {
                if (Options.getUserCharStream()) {
                    genCodeLine("  /** Constructor with user supplied CharStream. */");
                    genCodeLine("  public " + cu_name + "(CharStream stream) {");
                    if (Options.getStatic()) {
                        genCodeLine("     if (jj_initialized_once) {");
                        genCodeLine("       System.out.println(\"ERROR: Second call to constructor of static parser.  \");");
                        genCodeLine("       System.out.println(\"       You must either use ReInit() "
                                + "or set the JavaCC option STATIC to false\");");
                        genCodeLine("       System.out.println(\"       during parser generation.\");");
                        genCodeLine("       throw new "+(Options.isLegacyExceptionHandling() ? "Error" : "RuntimeException")+"();");
                        genCodeLine("     }");
                        genCodeLine("     jj_initialized_once = true;");
                    }
                    if (Options.getTokenManagerUsesParser()) {
                        genCodeLine("     token_source = new " + cu_name
                                + "TokenManager(this, stream);");
                    } else {
                        genCodeLine("     token_source = new " + cu_name + "TokenManager(stream);");
                    }
                    genCodeLine("     token = new Token();");
                    if (Options.getCacheTokens()) {
                        genCodeLine("     token.next = jj_nt = token_source.getNextToken();");
                    } else {
                        genCodeLine("     jj_ntk = -1;");
                    }
                    if (Options.getDepthLimit() > 0) {
                        genCodeLine("    jj_depth = -1;");
                    }
                    if (Options.getErrorReporting()) {
                        genCodeLine("     jj_gen = 0;");
                        if (maskindex > 0) {
                            genCodeLine("     for (int i = 0; i < " + maskindex
                                    + "; i++) jj_la1[i] = -1;");
                        }
                        if (jj2index != 0) {
                            genCodeLine("     for (int i = 0; i < jj_2_rtns.length; i++) jj_2_rtns[i] = new JJCalls();");
                        }
                    }
                    genCodeLine("  }");
                    genCodeLine("");
                    genCodeLine("  /** Reinitialise. */");
                    genCodeLine("  " + staticOpt() + "public void ReInit(CharStream stream) {");

                    if (Options.isTokenManagerRequiresParserAccess()) {
                        genCodeLine("     token_source.ReInit(this,stream);");
                    } else {
                        genCodeLine("     token_source.ReInit(stream);");
                    }


                    genCodeLine("     token = new Token();");
                    if (Options.getCacheTokens()) {
                        genCodeLine("     token.next = jj_nt = token_source.getNextToken();");
                    } else {
                        genCodeLine("     jj_ntk = -1;");
                    }
                    if (Options.getDepthLimit() > 0) {
                        genCodeLine("    jj_depth = -1;");
                    }
                    if (lookaheadNeeded) {
                        genCodeLine("     jj_lookingAhead = false;");
                    }
                    if (jjtreeGenerated) {
                        genCodeLine("     jjtree.reset();");
                    }
                    if (Options.getErrorReporting()) {
                        genCodeLine("     jj_gen = 0;");
                        if (maskindex > 0) {
                            genCodeLine("     for (int i = 0; i < " + maskindex
                                    + "; i++) jj_la1[i] = -1;");
                        }
                        if (jj2index != 0) {
                            genCodeLine("     for (int i = 0; i < jj_2_rtns.length; i++) jj_2_rtns[i] = new JJCalls();");
                        }
                    }
                    genCodeLine("  }");
                } else {

                    if (!isJavaModernMode) {
                        genCodeLine("  /** Constructor with InputStream. */");
                        genCodeLine("  public " + cu_name + "(java.io.InputStream stream) {");
                        genCodeLine("      this(stream, null);");
                        genCodeLine("  }");
                        genCodeLine("  /** Constructor with InputStream and supplied encoding */");
                        genCodeLine("  public " + cu_name
                                + "(java.io.InputStream stream, String encoding) {");
                        if (Options.getStatic()) {
                            genCodeLine("     if (jj_initialized_once) {");
                            genCodeLine("       System.out.println(\"ERROR: Second call to constructor of static parser.  \");");
                            genCodeLine("       System.out.println(\"       You must either use ReInit() or "
                                    + "set the JavaCC option STATIC to false\");");
                            genCodeLine("       System.out.println(\"       during parser generation.\");");
                            genCodeLine("       throw new "+(Options.isLegacyExceptionHandling() ? "Error" : "RuntimeException")+"();");
                            genCodeLine("     }");
                            genCodeLine("     jj_initialized_once = true;");
                        }

                        if (Options.getJavaUnicodeEscape()) {
                            if (!Options.getGenerateChainedException()) {
                                genCodeLine("     try { jj_input_stream = new JavaCharStream(stream, encoding, 1, 1); } "
                                        + "catch(java.io.UnsupportedEncodingException e) {"
                                        + " throw new RuntimeException(e.getMessage()); }");
                            } else {
                                genCodeLine("     try { jj_input_stream = new JavaCharStream(stream, encoding, 1, 1); } "
                                        + "catch(java.io.UnsupportedEncodingException e) { throw new RuntimeException(e); }");
                            }
                        } else {
                            if (!Options.getGenerateChainedException()) {
                                genCodeLine("     try { jj_input_stream = new SimpleCharStream(stream, encoding, 1, 1); } "
                                        + "catch(java.io.UnsupportedEncodingException e) { "
                                        + "throw new RuntimeException(e.getMessage()); }");
                            } else {
                                genCodeLine("     try { jj_input_stream = new SimpleCharStream(stream, encoding, 1, 1); } "
                                        + "catch(java.io.UnsupportedEncodingException e) { throw new RuntimeException(e); }");
                            }
                        }
                        if (Options.getTokenManagerUsesParser() && !Options.getStatic()) {
                            genCodeLine("     token_source = new " + cu_name
                                    + "TokenManager(this, jj_input_stream);");
                        } else {
                            genCodeLine("     token_source = new " + cu_name
                                    + "TokenManager(jj_input_stream);");
                        }
                        genCodeLine("     token = new Token();");
                        if (Options.getCacheTokens()) {
                            genCodeLine("     token.next = jj_nt = token_source.getNextToken();");
                        } else {
                            genCodeLine("     jj_ntk = -1;");
                        }
                        if (Options.getDepthLimit() > 0) {
                            genCodeLine("    jj_depth = -1;");
                        }
                        if (Options.getErrorReporting()) {
                            genCodeLine("     jj_gen = 0;");
                            if (maskindex > 0) {
                                genCodeLine("     for (int i = 0; i < " + maskindex
                                        + "; i++) jj_la1[i] = -1;");
                            }
                            if (jj2index != 0) {
                                genCodeLine("     for (int i = 0; i < jj_2_rtns.length; i++) jj_2_rtns[i] = new JJCalls();");
                            }
                        }
                        genCodeLine("  }");
                        genCodeLine("");

                        genCodeLine("  /** Reinitialise. */");
                        genCodeLine("  " + staticOpt()
                                + "public void ReInit(java.io.InputStream stream) {");
                        genCodeLine("      ReInit(stream, null);");
                        genCodeLine("  }");

                        genCodeLine("  /** Reinitialise. */");
                        genCodeLine("  "
                                + staticOpt()
                                + "public void ReInit(java.io.InputStream stream, String encoding) {");
                        if (!Options.getGenerateChainedException()) {
                            genCodeLine("     try { jj_input_stream.ReInit(stream, encoding, 1, 1); } "
                                    + "catch(java.io.UnsupportedEncodingException e) { "
                                    + "throw new RuntimeException(e.getMessage()); }");
                        } else {
                            genCodeLine("     try { jj_input_stream.ReInit(stream, encoding, 1, 1); } "
                                    + "catch(java.io.UnsupportedEncodingException e) { throw new RuntimeException(e); }");
                        }

                        if (Options.isTokenManagerRequiresParserAccess()) {
                            genCodeLine("     token_source.ReInit(this,jj_input_stream);");
                        } else {
                            genCodeLine("     token_source.ReInit(jj_input_stream);");
                        }

                        genCodeLine("     token = new Token();");
                        if (Options.getCacheTokens()) {
                            genCodeLine("     token.next = jj_nt = token_source.getNextToken();");
                        } else {
                            genCodeLine("     jj_ntk = -1;");
                        }
                        if (Options.getDepthLimit() > 0) {
                            genCodeLine("    jj_depth = -1;");
                        }
                        if (jjtreeGenerated) {
                            genCodeLine("     jjtree.reset();");
                        }
                        if (Options.getErrorReporting()) {
                            genCodeLine("     jj_gen = 0;");
                            genCodeLine("     for (int i = 0; i < " + maskindex
                                    + "; i++) jj_la1[i] = -1;");
                            if (jj2index != 0) {
                                genCodeLine("     for (int i = 0; i < jj_2_rtns.length; i++) jj_2_rtns[i] = new JJCalls();");
                            }
                        }
                        genCodeLine("  }");
                        genCodeLine("");

                    }

                    final String readerInterfaceName = isJavaModernMode ? "Provider" : "java.io.Reader";
                    final String stringReaderClass = isJavaModernMode ? "StringProvider"
                            : "java.io.StringReader";


                    genCodeLine("  /** Constructor. */");
                    genCodeLine("  public " + cu_name + "(" + readerInterfaceName + " stream) {");
                    if (Options.getStatic()) {
                        genCodeLine("     if (jj_initialized_once) {");
                        genCodeLine("       System.out.println(\"ERROR: Second call to constructor of static parser. \");");
                        genCodeLine("       System.out.println(\"       You must either use ReInit() or "
                                + "set the JavaCC option STATIC to false\");");
                        genCodeLine("       System.out.println(\"       during parser generation.\");");
                        genCodeLine("       throw new "+(Options.isLegacyExceptionHandling() ? "Error" : "RuntimeException")+"();");
                        genCodeLine("     }");
                        genCodeLine("     jj_initialized_once = true;");
                    }
                    if (Options.getJavaUnicodeEscape()) {
                        genCodeLine("     jj_input_stream = new JavaCharStream(stream, 1, 1);");
                    } else {
                        genCodeLine("     jj_input_stream = new SimpleCharStream(stream, 1, 1);");
                    }
                    if (Options.getTokenManagerUsesParser() && !Options.getStatic()) {
                        genCodeLine("     token_source = new " + cu_name
                                + "TokenManager(this, jj_input_stream);");
                    } else {
                        genCodeLine("     token_source = new " + cu_name
                                + "TokenManager(jj_input_stream);");
                    }
                    genCodeLine("     token = new Token();");
                    if (Options.getCacheTokens()) {
                        genCodeLine("     token.next = jj_nt = token_source.getNextToken();");
                    } else {
                        genCodeLine("     jj_ntk = -1;");
                    }
                    if (Options.getDepthLimit() > 0) {
                        genCodeLine("    jj_depth = -1;");
                    }
                    if (Options.getErrorReporting()) {
                        genCodeLine("     jj_gen = 0;");
                        if (maskindex > 0) {
                            genCodeLine("     for (int i = 0; i < " + maskindex
                                    + "; i++) jj_la1[i] = -1;");
                        }
                        if (jj2index != 0) {
                            genCodeLine("     for (int i = 0; i < jj_2_rtns.length; i++) jj_2_rtns[i] = new JJCalls();");
                        }
                    }
                    genCodeLine("  }");
                    genCodeLine("");

                    // Add-in a string based constructor because its convenient (modern only to prevent regressions)
                    if (isJavaModernMode) {
                        genCodeLine("  /** Constructor. */");
                        genCodeLine("  public " + cu_name
                                + "(String dsl) throws ParseException, "+Options.getTokenMgrErrorClass() +" {");
                        genCodeLine("       this(new " + stringReaderClass + "(dsl));");
                        genCodeLine("  }");
                        genCodeLine("");

                        genCodeLine("  public void ReInit(String s) {");
                        genCodeLine("      ReInit(new " + stringReaderClass + "(s));");
                        genCodeLine("  }");

                    }


                    genCodeLine("  /** Reinitialise. */");
                    genCodeLine("  " + staticOpt() + "public void ReInit(" + readerInterfaceName
                            + " stream) {");
                    if (Options.getJavaUnicodeEscape()) {
                        genCodeLine("    if (jj_input_stream == null) {");
                        genCodeLine("       jj_input_stream = new JavaCharStream(stream, 1, 1);");
                        genCodeLine("    } else {");
                        genCodeLine("       jj_input_stream.ReInit(stream, 1, 1);");
                        genCodeLine("    }");
                    } else {
                        genCodeLine("    if (jj_input_stream == null) {");
                        genCodeLine("       jj_input_stream = new SimpleCharStream(stream, 1, 1);");
                        genCodeLine("    } else {");
                        genCodeLine("       jj_input_stream.ReInit(stream, 1, 1);");
                        genCodeLine("    }");
                    }

                    genCodeLine("    if (token_source == null) {");

                    if (Options.getTokenManagerUsesParser() && !Options.getStatic()) {
                        genCodeLine(" token_source = new " + cu_name + "TokenManager(this, jj_input_stream);");
                    } else {
                        genCodeLine(" token_source = new " + cu_name + "TokenManager(jj_input_stream);");
                    }

                    genCodeLine("    }");
                    genCodeLine("");

                    if (Options.isTokenManagerRequiresParserAccess()) {
                        genCodeLine("     token_source.ReInit(this,jj_input_stream);");
                    } else {
                        genCodeLine("     token_source.ReInit(jj_input_stream);");
                    }

                    genCodeLine("     token = new Token();");
                    if (Options.getCacheTokens()) {
                        genCodeLine("     token.next = jj_nt = token_source.getNextToken();");
                    } else {
                        genCodeLine("     jj_ntk = -1;");
                    }
                    if (Options.getDepthLimit() > 0) {
                        genCodeLine("    jj_depth = -1;");
                    }
                    if (jjtreeGenerated) {
                        genCodeLine("     jjtree.reset();");
                    }
                    if (Options.getErrorReporting()) {
                        genCodeLine("     jj_gen = 0;");
                        if (maskindex > 0) {
                            genCodeLine("     for (int i = 0; i < " + maskindex
                                    + "; i++) jj_la1[i] = -1;");
                        }
                        if (jj2index != 0) {
                            genCodeLine("     for (int i = 0; i < jj_2_rtns.length; i++) jj_2_rtns[i] = new JJCalls();");
                        }
                    }
                    genCodeLine("  }");

                }
            }
            genCodeLine("");
            if (Options.getUserTokenManager()) {
                genCodeLine("  /** Constructor with user supplied Token Manager. */");
                genCodeLine("  public " + cu_name + "(TokenManager tm) {");
            } else {
                genCodeLine("  /** Constructor with generated Token Manager. */");
                genCodeLine("  public " + cu_name + "(" + cu_name + "TokenManager tm) {");
            }
            if (Options.getStatic()) {
                genCodeLine("     if (jj_initialized_once) {");
                genCodeLine("       System.out.println(\"ERROR: Second call to constructor of static parser. \");");
                genCodeLine("       System.out.println(\"       You must either use ReInit() or "
                        + "set the JavaCC option STATIC to false\");");
                genCodeLine("       System.out.println(\"       during parser generation.\");");
                genCodeLine("       throw new "+(Options.isLegacyExceptionHandling() ? "Error" : "RuntimeException")+"();");
                genCodeLine("     }");
                genCodeLine("     jj_initialized_once = true;");
            }
            genCodeLine("     token_source = tm;");
            genCodeLine("     token = new Token();");
            if (Options.getCacheTokens()) {
                genCodeLine("     token.next = jj_nt = token_source.getNextToken();");
            } else {
                genCodeLine("     jj_ntk = -1;");
            }
            if (Options.getDepthLimit() > 0) {
                genCodeLine("    jj_depth = -1;");
            }
            if (Options.getErrorReporting()) {
                genCodeLine("     jj_gen = 0;");
                if (maskindex > 0) {
                    genCodeLine("     for (int i = 0; i < " + maskindex + "; i++) jj_la1[i] = -1;");
                }
                if (jj2index != 0) {
                    genCodeLine("     for (int i = 0; i < jj_2_rtns.length; i++) jj_2_rtns[i] = new JJCalls();");
                }
            }
            genCodeLine("  }");
            genCodeLine("");
            if (Options.getUserTokenManager()) {
                genCodeLine("  /** Reinitialise. */");
                genCodeLine("  public void ReInit(TokenManager tm) {");
            } else {
                genCodeLine("  /** Reinitialise. */");
                genCodeLine("  public void ReInit(" + cu_name + "TokenManager tm) {");
            }
            genCodeLine("     token_source = tm;");
            genCodeLine("     token = new Token();");
            if (Options.getCacheTokens()) {
                genCodeLine("     token.next = jj_nt = token_source.getNextToken();");
            } else {
                genCodeLine("     jj_ntk = -1;");
            }
            if (Options.getDepthLimit() > 0) {
                genCodeLine("    jj_depth = -1;");
            }
            if (jjtreeGenerated) {
                genCodeLine("     jjtree.reset();");
            }
            if (Options.getErrorReporting()) {
                genCodeLine("     jj_gen = 0;");
                if (maskindex > 0) {
                    genCodeLine("     for (int i = 0; i < " + maskindex + "; i++) jj_la1[i] = -1;");
                }
                if (jj2index != 0) {
                    genCodeLine("     for (int i = 0; i < jj_2_rtns.length; i++) jj_2_rtns[i] = new JJCalls();");
                }
            }
            genCodeLine("  }");
            genCodeLine("");
            genCodeLine("  " + staticOpt()
                    + "private Token jj_consume_token(int kind) throws ParseException {");
            if (Options.getCacheTokens()) {
                genCodeLine("     Token oldToken = token;");
                genCodeLine("     if ((token = jj_nt).next != null) jj_nt = jj_nt.next;");
                genCodeLine("     else jj_nt = jj_nt.next = token_source.getNextToken();");
            } else {
                genCodeLine("     Token oldToken;");
                genCodeLine("     if ((oldToken = token).next != null) token = token.next;");
                genCodeLine("     else token = token.next = token_source.getNextToken();");
                genCodeLine("     jj_ntk = -1;");
            }
            genCodeLine("     if (token.kind == kind) {");
            if (Options.getErrorReporting()) {
                genCodeLine("       jj_gen++;");
                if (jj2index != 0) {
                    genCodeLine("       if (++jj_gc > 100) {");
                    genCodeLine("         jj_gc = 0;");
                    genCodeLine("         for (int i = 0; i < jj_2_rtns.length; i++) {");
                    genCodeLine("           JJCalls c = jj_2_rtns[i];");
                    genCodeLine("           while (c != null) {");
                    genCodeLine("             if (c.gen < jj_gen) c.first = null;");
                    genCodeLine("             c = c.next;");
                    genCodeLine("           }");
                    genCodeLine("         }");
                    genCodeLine("       }");
                }
            }
            if (Options.getDebugParser()) {
                genCodeLine("       trace_token(token, \"\");");
            }
            genCodeLine("       return token;");
            genCodeLine("     }");
            if (Options.getCacheTokens()) {
                genCodeLine("     jj_nt = token;");
            }
            genCodeLine("     token = oldToken;");
            if (Options.getErrorReporting()) {
                genCodeLine("     jj_kind = kind;");
            }
            genCodeLine("     throw generateParseException();");
            genCodeLine("  }");
            genCodeLine("");
            if (jj2index != 0) {
                genCodeLine("  @SuppressWarnings(\"serial\")");
                genCodeLine("  static private final class LookaheadSuccess extends "+(Options.isLegacyExceptionHandling() ? "java.lang.Error" : "java.lang.RuntimeException")+" {");
                genCodeLine("    @Override");
                genCodeLine("    public Throwable fillInStackTrace() {");
                genCodeLine("      return this;");
                genCodeLine("    }");
                genCodeLine("  }");
                genCodeLine("  static private final LookaheadSuccess jj_ls = new LookaheadSuccess();");
                genCodeLine("  " + staticOpt() + "private " + Options.getBooleanType()
                        + " jj_scan_token(int kind) {");
                genCodeLine("     if (jj_scanpos == jj_lastpos) {");
                genCodeLine("       jj_la--;");
                genCodeLine("       if (jj_scanpos.next == null) {");
                genCodeLine("         jj_lastpos = jj_scanpos = jj_scanpos.next = token_source.getNextToken();");
                genCodeLine("       } else {");
                genCodeLine("         jj_lastpos = jj_scanpos = jj_scanpos.next;");
                genCodeLine("       }");
                genCodeLine("     } else {");
                genCodeLine("       jj_scanpos = jj_scanpos.next;");
                genCodeLine("     }");
                if (Options.getErrorReporting()) {
                    genCodeLine("     if (jj_rescan) {");
                    genCodeLine("       int i = 0; Token tok = token;");
                    genCodeLine("       while (tok != null && tok != jj_scanpos) { i++; tok = tok.next; }");
                    genCodeLine("       if (tok != null) jj_add_error_token(kind, i);");
                    if (Options.getDebugLookahead()) {
                        genCodeLine("     } else {");
                        genCodeLine("       trace_scan(jj_scanpos, kind);");
                    }
                    genCodeLine("     }");
                } else if (Options.getDebugLookahead()) {
                    genCodeLine("     trace_scan(jj_scanpos, kind);");
                }
                genCodeLine("     if (jj_scanpos.kind != kind) return true;");
                genCodeLine("     if (jj_la == 0 && jj_scanpos == jj_lastpos) throw jj_ls;");
                genCodeLine("     return false;");
                genCodeLine("  }");
                genCodeLine("");
            }
            genCodeLine("");
            genCodeLine("/** Get the next Token. */");
            genCodeLine("  " + staticOpt() + "final public Token getNextToken() {");
            if (Options.getCacheTokens()) {
                genCodeLine("     if ((token = jj_nt).next != null) jj_nt = jj_nt.next;");
                genCodeLine("     else jj_nt = jj_nt.next = token_source.getNextToken();");
            } else {
                genCodeLine("     if (token.next != null) token = token.next;");
                genCodeLine("     else token = token.next = token_source.getNextToken();");
                genCodeLine("     jj_ntk = -1;");
            }
            if (Options.getErrorReporting()) {
                genCodeLine("     jj_gen++;");
            }
            if (Options.getDebugParser()) {
                genCodeLine("       trace_token(token, \" (in getNextToken)\");");
            }
            genCodeLine("     return token;");
            genCodeLine("  }");
            genCodeLine("");
            genCodeLine("/** Get the specific Token. */");
            genCodeLine("  " + staticOpt() + "final public Token getToken(int index) {");
            if (lookaheadNeeded) {
                genCodeLine("     Token t = jj_lookingAhead ? jj_scanpos : token;");
            } else {
                genCodeLine("     Token t = token;");
            }
            genCodeLine("     for (int i = 0; i < index; i++) {");
            genCodeLine("       if (t.next != null) t = t.next;");
            genCodeLine("       else t = t.next = token_source.getNextToken();");
            genCodeLine("     }");
            genCodeLine("     return t;");
            genCodeLine("  }");
            genCodeLine("");
            if (!Options.getCacheTokens()) {
                genCodeLine("  " + staticOpt() + "private int jj_ntk_f() {");
                genCodeLine("     if ((jj_nt=token.next) == null)");
                genCodeLine("       return (jj_ntk = (token.next=token_source.getNextToken()).kind);");
                genCodeLine("     else");
                genCodeLine("       return (jj_ntk = jj_nt.kind);");
                genCodeLine("  }");
                genCodeLine("");
            }
            if (Options.getErrorReporting()) {
                if (!Options.getGenerateGenerics()) {
                    genCodeLine("  " + staticOpt()
                            + "private java.util.List jj_expentries = new java.util.ArrayList();");
                } else {
                    genCodeLine("  "
                            + staticOpt()
                            + "private java.util.List<int[]> jj_expentries = new java.util.ArrayList<int[]>();");
                }
                genCodeLine("  " + staticOpt() + "private int[] jj_expentry;");
                genCodeLine("  " + staticOpt() + "private int jj_kind = -1;");
                if (jj2index != 0) {
                    genCodeLine("  " + staticOpt() + "private int[] jj_lasttokens = new int[100];");
                    genCodeLine("  " + staticOpt() + "private int jj_endpos;");
                    genCodeLine("");
                    genCodeLine("  " + staticOpt()
                            + "private void jj_add_error_token(int kind, int pos) {");
                    genCodeLine("     if (pos >= 100) {");
                    genCodeLine("        return;");
                    genCodeLine("     }");
                    genCodeLine("");
                    genCodeLine("     if (pos == jj_endpos + 1) {");
                    genCodeLine("       jj_lasttokens[jj_endpos++] = kind;");
                    genCodeLine("     } else if (jj_endpos != 0) {");
                    genCodeLine("       jj_expentry = new int[jj_endpos];");
                    genCodeLine("");
                    genCodeLine("       for (int i = 0; i < jj_endpos; i++) {");
                    genCodeLine("         jj_expentry[i] = jj_lasttokens[i];");
                    genCodeLine("       }");
                    genCodeLine("");
                    if (!Options.getGenerateGenerics()) {
                        genCodeLine("       for (java.util.Iterator it = jj_expentries.iterator(); it.hasNext();) {");
                        genCodeLine("         int[] oldentry = (int[])(it.next());");
                    } else {
                        genCodeLine("       for (int[] oldentry : jj_expentries) {");
                    }

                    genCodeLine("         if (oldentry.length == jj_expentry.length) {");
                    genCodeLine("           boolean isMatched = true;");
                    genCodeLine("");
                    genCodeLine("           for (int i = 0; i < jj_expentry.length; i++) {");
                    genCodeLine("             if (oldentry[i] != jj_expentry[i]) {");
                    genCodeLine("               isMatched = false;");
                    genCodeLine("               break;");
                    genCodeLine("             }");
                    genCodeLine("");
                    genCodeLine("           }");
                    genCodeLine("           if (isMatched) {");
                    genCodeLine("             jj_expentries.add(jj_expentry);");
                    genCodeLine("             break;");
                    genCodeLine("           }");
                    genCodeLine("         }");
                    genCodeLine("       }");
                    genCodeLine("");
                    genCodeLine("       if (pos != 0) {");
                    genCodeLine("         jj_lasttokens[(jj_endpos = pos) - 1] = kind;");
                    genCodeLine("       }");
                    genCodeLine("     }");
                    genCodeLine("  }");
                }
                genCodeLine("");
                genCodeLine("  /** Generate ParseException. */");
                genCodeLine("  " + staticOpt() + "public ParseException generateParseException() {");
                genCodeLine("     jj_expentries.clear();");
                genCodeLine("     " + Options.getBooleanType() + "[] la1tokens = new "
                        + Options.getBooleanType() + "[" + tokenCount + "];");
                genCodeLine("     if (jj_kind >= 0) {");
                genCodeLine("       la1tokens[jj_kind] = true;");
                genCodeLine("       jj_kind = -1;");
                genCodeLine("     }");
                genCodeLine("     for (int i = 0; i < " + maskindex + "; i++) {");
                genCodeLine("       if (jj_la1[i] == jj_gen) {");
                genCodeLine("         for (int j = 0; j < 32; j++) {");
                for (int i = 0; i < (tokenCount - 1) / 32 + 1; i++) {
                    genCodeLine("           if ((jj_la1_" + i + "[i] & (1<<j)) != 0) {");
                    genCode("             la1tokens[");
                    if (i != 0) {
                        genCode((32 * i) + "+");
                    }
                    genCodeLine("j] = true;");
                    genCodeLine("           }");
                }
                genCodeLine("         }");
                genCodeLine("       }");
                genCodeLine("     }");
                genCodeLine("     for (int i = 0; i < " + tokenCount + "; i++) {");
                genCodeLine("       if (la1tokens[i]) {");
                genCodeLine("         jj_expentry = new int[1];");
                genCodeLine("         jj_expentry[0] = i;");
                genCodeLine("         jj_expentries.add(jj_expentry);");
                genCodeLine("       }");
                genCodeLine("     }");
                if (jj2index != 0) {
                    genCodeLine("     jj_endpos = 0;");
                    genCodeLine("     jj_rescan_token();");
                    genCodeLine("     jj_add_error_token(0, 0);");
                }
                genCodeLine("     int[][] exptokseq = new int[jj_expentries.size()][];");
                genCodeLine("     for (int i = 0; i < jj_expentries.size(); i++) {");
                if (!Options.getGenerateGenerics()) {
                    genCodeLine("       exptokseq[i] = (int[])jj_expentries.get(i);");
                } else {
                    genCodeLine("       exptokseq[i] = jj_expentries.get(i);");
                }
                genCodeLine("     }");


                if (isJavaModernMode) {
                    // Add the lexical state onto the exception message
                    genCodeLine("     return new ParseException(token, exptokseq, tokenImage, token_source == null ? null : " +cu_name+ "TokenManager.lexStateNames[token_source.curLexState]);");
                } else {
                    genCodeLine("     return new ParseException(token, exptokseq, tokenImage);");
                }

                genCodeLine("  }");
            } else {
                genCodeLine("  /** Generate ParseException. */");
                genCodeLine("  " + staticOpt() + "public ParseException generateParseException() {");
                genCodeLine("     Token errortok = token.next;");
                if (Options.getKeepLineColumn()) {
                    genCodeLine("     int line = errortok.beginLine, column = errortok.beginColumn;");
                }
                genCodeLine("     String mess = (errortok.kind == 0) ? tokenImage[0] : errortok.image;");
                if (Options.getKeepLineColumn()) {
                    genCodeLine("     return new ParseException("
                            + "\"Parse error at line \" + line + \", column \" + column + \".  "
                            + "Encountered: \" + mess);");
                } else {
                    genCodeLine("     return new ParseException(\"Parse error at <unknown location>.  "
                            + "Encountered: \" + mess);");
                }
                genCodeLine("  }");
            }
            genCodeLine("");

            genCodeLine("  " + staticOpt() + "private " + Options.getBooleanType()
                    + " trace_enabled;");
            genCodeLine("");
            genCodeLine("/** Trace enabled. */");
            genCodeLine("  " + staticOpt() + "final public boolean trace_enabled() {");
            genCodeLine("     return trace_enabled;");
            genCodeLine("  }");
            genCodeLine("");

            if (Options.getDebugParser()) {
                genCodeLine("  " + staticOpt() + "private int trace_indent = 0;");

                genCodeLine("/** Enable tracing. */");
                genCodeLine("  " + staticOpt() + "final public void enable_tracing() {");
                genCodeLine("     trace_enabled = true;");
                genCodeLine("  }");
                genCodeLine("");
                genCodeLine("/** Disable tracing. */");
                genCodeLine("  " + staticOpt() + "final public void disable_tracing() {");
                genCodeLine("     trace_enabled = false;");
                genCodeLine("  }");
                genCodeLine("");
                genCodeLine("  " + staticOpt() + "protected void trace_call(String s) {");
                genCodeLine("     if (trace_enabled) {");
                genCodeLine("       for (int i = 0; i < trace_indent; i++) { System.out.print(\" \"); }");
                genCodeLine("       System.out.println(\"Call:    \" + s);");
                genCodeLine("     }");
                genCodeLine("     trace_indent = trace_indent + 2;");
                genCodeLine("  }");
                genCodeLine("");
                genCodeLine("  " + staticOpt() + "protected void trace_return(String s) {");
                genCodeLine("     trace_indent = trace_indent - 2;");
                genCodeLine("     if (trace_enabled) {");
                genCodeLine("       for (int i = 0; i < trace_indent; i++) { System.out.print(\" \"); }");
                genCodeLine("       System.out.println(\"Return: \" + s);");
                genCodeLine("     }");
                genCodeLine("  }");
                genCodeLine("");
                genCodeLine("  " + staticOpt()
                        + "protected void trace_token(Token t, String where) {");
                genCodeLine("     if (trace_enabled) {");
                genCodeLine("       for (int i = 0; i < trace_indent; i++) { System.out.print(\" \"); }");
                genCodeLine("       System.out.print(\"Consumed token: <\" + tokenImage[t.kind]);");
                genCodeLine("       if (t.kind != 0 && !tokenImage[t.kind].equals(\"\\\"\" + t.image + \"\\\"\")) {");
                genCodeLine("         System.out.print(\": \\\"\" + "+Options.getTokenMgrErrorClass() + ".addEscapes("+"t.image) + \"\\\"\");");
                genCodeLine("       }");
                genCodeLine("       System.out.println(\" at line \" + t.beginLine + "
                        + "\" column \" + t.beginColumn + \">\" + where);");
                genCodeLine("     }");
                genCodeLine("  }");
                genCodeLine("");
                genCodeLine("  " + staticOpt() + "protected void trace_scan(Token t1, int t2) {");
                genCodeLine("     if (trace_enabled) {");
                genCodeLine("       for (int i = 0; i < trace_indent; i++) { System.out.print(\" \"); }");
                genCodeLine("       System.out.print(\"Visited token: <\" + tokenImage[t1.kind]);");
                genCodeLine("       if (t1.kind != 0 && !tokenImage[t1.kind].equals(\"\\\"\" + t1.image + \"\\\"\")) {");
                genCodeLine("         System.out.print(\": \\\"\" + "+Options.getTokenMgrErrorClass() + ".addEscapes("+"t1.image) + \"\\\"\");");
                genCodeLine("       }");
                genCodeLine("       System.out.println(\" at line \" + t1.beginLine + \""
                        + " column \" + t1.beginColumn + \">; Expected token: <\" + tokenImage[t2] + \">\");");
                genCodeLine("     }");
                genCodeLine("  }");
                genCodeLine("");
            } else {
                genCodeLine("  /** Enable tracing. */");
                genCodeLine("  " + staticOpt() + "final public void enable_tracing() {");
                genCodeLine("  }");
                genCodeLine("");
                genCodeLine("  /** Disable tracing. */");
                genCodeLine("  " + staticOpt() + "final public void disable_tracing() {");
                genCodeLine("  }");
                genCodeLine("");
            }

            if (jj2index != 0 && Options.getErrorReporting()) {
                genCodeLine("  " + staticOpt() + "private void jj_rescan_token() {");
                genCodeLine("     jj_rescan = true;");
                genCodeLine("     for (int i = 0; i < " + jj2index + "; i++) {");
                genCodeLine("       try {");
                genCodeLine("         JJCalls p = jj_2_rtns[i];");
                genCodeLine("");
                genCodeLine("         do {");
                genCodeLine("           if (p.gen > jj_gen) {");
                genCodeLine("             jj_la = p.arg; jj_lastpos = jj_scanpos = p.first;");
                genCodeLine("             switch (i) {");
                for (int i = 0; i < jj2index; i++) {
                    genCodeLine("               case " + i + ": jj_3_" + (i + 1) + "(); break;");
                }
                genCodeLine("             }");
                genCodeLine("           }");
                genCodeLine("           p = p.next;");
                genCodeLine("         } while (p != null);");
                genCodeLine("");
                genCodeLine("         } catch(LookaheadSuccess ls) { }");
                genCodeLine("     }");
                genCodeLine("     jj_rescan = false;");
                genCodeLine("  }");
                genCodeLine("");
                genCodeLine("  " + staticOpt() + "private void jj_save(int index, int xla) {");
                genCodeLine("     JJCalls p = jj_2_rtns[index];");
                genCodeLine("     while (p.gen > jj_gen) {");
                genCodeLine("       if (p.next == null) { p = p.next = new JJCalls(); break; }");
                genCodeLine("       p = p.next;");
                genCodeLine("     }");
                genCodeLine("");
                genCodeLine("     p.gen = jj_gen + xla - jj_la; ");
                genCodeLine("     p.first = token;");
                genCodeLine("     p.arg = xla;");
                genCodeLine("  }");
                genCodeLine("");
            }

            if (jj2index != 0 && Options.getErrorReporting()) {
                genCodeLine("  static final class JJCalls {");
                genCodeLine("     int gen;");
                genCodeLine("     Token first;");
                genCodeLine("     int arg;");
                genCodeLine("     JJCalls next;");
                genCodeLine("  }");
                genCodeLine("");
            }

            if (cu_from_insertion_point_2.size() != 0) {
                printTokenSetup((Token) (cu_from_insertion_point_2.get(0)));
                ccol = 1;
                for (final Iterator it = cu_from_insertion_point_2.iterator(); it.hasNext();) {
                    t = (Token) it.next();
                    printToken(t);
                }
                printTrailingComments(t);
            }
            genCodeLine("");

            saveOutput(Options.getOutputDirectory() + File.separator + cu_name
                    + getFileExtension(Options.getOutputLanguage()));

        } // matches "if (Options.getBuildParser())"

    }

  

5.4. LexGen 语法解析生成

  LexGen 生成语法树,将所有的产生式转换成相应的语法表示。

  // org.javacc.parser.LexGen#start
  public void start() throws IOException
  {
    if (!Options.getBuildTokenManager() ||
        Options.getUserTokenManager() ||
        JavaCCErrors.get_error_count() > 0)
      return;

    final String codeGeneratorClass = Options.getTokenManagerCodeGenerator();
    keepLineCol = Options.getKeepLineColumn();
    errorHandlingClass = Options.getTokenMgrErrorClass();
    List choices = new ArrayList();
    Enumeration e;
    TokenProduction tp;
    int i, j;

    staticString = (Options.getStatic() ? "static " : "");
    tokMgrClassName = cu_name + "TokenManager";

    if (!generateDataOnly && codeGeneratorClass == null) PrintClassHead();
    BuildLexStatesTable();

    e = allTpsForState.keys();

    boolean ignoring = false;

    while (e.hasMoreElements())
    {
      int startState = -1;
      NfaState.ReInit();
      RStringLiteral.ReInit();

      String key = (String)e.nextElement();

      lexStateIndex = GetIndex(key);
      lexStateSuffix = "_" + lexStateIndex;
      List<TokenProduction> allTps = (List<TokenProduction>)allTpsForState.get(key);
      initStates.put(key, initialState = new NfaState());
      ignoring = false;

      singlesToSkip[lexStateIndex] = new NfaState();
      singlesToSkip[lexStateIndex].dummy = true;

      if (key.equals("DEFAULT"))
        defaultLexState = lexStateIndex;

      for (i = 0; i < allTps.size(); i++)
      {
        tp = (TokenProduction)allTps.get(i);
        int kind = tp.kind;
        boolean ignore = tp.ignoreCase;
        List<RegExprSpec> rexps = tp.respecs;

        if (i == 0)
          ignoring = ignore;

        for (j = 0; j < rexps.size(); j++)
        {
          RegExprSpec respec = (RegExprSpec)rexps.get(j);
          curRE = respec.rexp;

          rexprs[curKind = curRE.ordinal] = curRE;
          lexStates[curRE.ordinal] = lexStateIndex;
          ignoreCase[curRE.ordinal] = ignore;

          if (curRE.private_rexp)
          {
            kinds[curRE.ordinal] = -1;
            continue;
          }

          if (!Options.getNoDfa() && curRE instanceof RStringLiteral &&
              !((RStringLiteral)curRE).image.equals(""))
          {
            ((RStringLiteral)curRE).GenerateDfa(this, curRE.ordinal);
            if (i != 0 && !mixed[lexStateIndex] && ignoring != ignore) {
              mixed[lexStateIndex] = true;
            }
          }
          else if (curRE.CanMatchAnyChar())
          {
            if (canMatchAnyChar[lexStateIndex] == -1 ||
                canMatchAnyChar[lexStateIndex] > curRE.ordinal)
              canMatchAnyChar[lexStateIndex] = curRE.ordinal;
          }
          else
          {
            Nfa temp;

            if (curRE instanceof RChoice)
              choices.add(curRE);

            temp = curRE.GenerateNfa(ignore);
            temp.end.isFinal = true;
            temp.end.kind = curRE.ordinal;
            initialState.AddMove(temp.start);
          }

          if (kinds.length < curRE.ordinal)
          {
            int[] tmp = new int[curRE.ordinal + 1];

            System.arraycopy(kinds, 0, tmp, 0, kinds.length);
            kinds = tmp;
          }
          //System.out.println("   ordina : " + curRE.ordinal);

          kinds[curRE.ordinal] = kind;

          if (respec.nextState != null &&
              !respec.nextState.equals(lexStateName[lexStateIndex]))
            newLexState[curRE.ordinal] = respec.nextState;

          if (respec.act != null && respec.act.getActionTokens() != null &&
              respec.act.getActionTokens().size() > 0)
            actions[curRE.ordinal] = respec.act;

          switch(kind)
          {
          case TokenProduction.SPECIAL :
            hasSkipActions |= (actions[curRE.ordinal] != null) ||
            (newLexState[curRE.ordinal] != null);
            hasSpecial = true;
            toSpecial[curRE.ordinal / 64] |= 1L << (curRE.ordinal % 64);
            toSkip[curRE.ordinal / 64] |= 1L << (curRE.ordinal % 64);
            break;
          case TokenProduction.SKIP :
            hasSkipActions |= (actions[curRE.ordinal] != null);
            hasSkip = true;
            toSkip[curRE.ordinal / 64] |= 1L << (curRE.ordinal % 64);
            break;
          case TokenProduction.MORE :
            hasMoreActions |= (actions[curRE.ordinal] != null);
            hasMore = true;
            toMore[curRE.ordinal / 64] |= 1L << (curRE.ordinal % 64);

            if (newLexState[curRE.ordinal] != null)
              canReachOnMore[GetIndex(newLexState[curRE.ordinal])] = true;
            else
              canReachOnMore[lexStateIndex] = true;

            break;
          case TokenProduction.TOKEN :
            hasTokenActions |= (actions[curRE.ordinal] != null);
            toToken[curRE.ordinal / 64] |= 1L << (curRE.ordinal % 64);
            break;
          }
        }
      }

      // Generate a static block for initializing the nfa transitions
      NfaState.ComputeClosures();

      for (i = 0; i < initialState.epsilonMoves.size(); i++)
        ((NfaState)initialState.epsilonMoves.elementAt(i)).GenerateCode();

      if (hasNfa[lexStateIndex] = (NfaState.generatedStates != 0))
      {
        initialState.GenerateCode();
        startState = initialState.GenerateInitMoves(this);
      }

      if (initialState.kind != Integer.MAX_VALUE && initialState.kind != 0)
      {
        if ((toSkip[initialState.kind / 64] & (1L << initialState.kind)) != 0L ||
            (toSpecial[initialState.kind / 64] & (1L << initialState.kind)) != 0L)
          hasSkipActions = true;
        else if ((toMore[initialState.kind / 64] & (1L << initialState.kind)) != 0L)
          hasMoreActions = true;
        else
          hasTokenActions = true;

        if (initMatch[lexStateIndex] == 0 ||
            initMatch[lexStateIndex] > initialState.kind)
        {
          initMatch[lexStateIndex] = initialState.kind;
          hasEmptyMatch = true;
        }
      }
      else if (initMatch[lexStateIndex] == 0)
        initMatch[lexStateIndex] = Integer.MAX_VALUE;

      RStringLiteral.FillSubString();

      if (hasNfa[lexStateIndex] && !mixed[lexStateIndex])
        RStringLiteral.GenerateNfaStartStates(this, initialState);

      if (generateDataOnly || codeGeneratorClass != null) {
        RStringLiteral.UpdateStringLiteralData(totalNumStates, lexStateIndex);
        NfaState.UpdateNfaData(totalNumStates, startState, lexStateIndex,
                               canMatchAnyChar[lexStateIndex]);
      } else {
        RStringLiteral.DumpDfaCode(this);
        if (hasNfa[lexStateIndex]) {
          NfaState.DumpMoveNfa(this);
        }
      }
      totalNumStates += NfaState.generatedStates;
      if (stateSetSize < NfaState.generatedStates)
        stateSetSize = NfaState.generatedStates;
    }

    for (i = 0; i < choices.size(); i++)
      ((RChoice)choices.get(i)).CheckUnmatchability();

    CheckEmptyStringMatch();

    if (generateDataOnly || codeGeneratorClass != null) {
      tokenizerData.setParserName(cu_name);
      NfaState.BuildTokenizerData(tokenizerData);
      RStringLiteral.BuildTokenizerData(tokenizerData);
      int[] newLexStateIndices = new int[maxOrdinal];
      StringBuilder tokenMgrDecls = new StringBuilder();
      if (token_mgr_decls != null && token_mgr_decls.size() > 0) {
        Token t = (Token)token_mgr_decls.get(0);
        for (j = 0; j < token_mgr_decls.size(); j++) {
          tokenMgrDecls.append(((Token)token_mgr_decls.get(j)).image + " ");
        }
      }
      tokenizerData.setDecls(tokenMgrDecls.toString());
      Map<Integer, String> actionStrings = new HashMap<Integer, String>();
      for (i = 0; i < maxOrdinal; i++) {
        if (newLexState[i] == null) {
          newLexStateIndices[i] = -1;
        } else {
          newLexStateIndices[i] = GetIndex(newLexState[i]);
        }
        // For java, we have this but for other languages, eventually we will
        // simply have a string.
        Action act = actions[i];
        if (act == null) continue;
        StringBuilder sb = new StringBuilder();
        for (int k = 0; k < act.getActionTokens().size(); k++) {
          sb.append(((Token)act.getActionTokens().get(k)).image);
          sb.append(" ");
        }
        actionStrings.put(i, sb.toString());
      }
      tokenizerData.setDefaultLexState(defaultLexState);
      tokenizerData.setLexStateNames(lexStateName);
      tokenizerData.updateMatchInfo(
          actionStrings, newLexStateIndices,
          toSkip, toSpecial, toMore, toToken);
      if (generateDataOnly) return;
      Class<TokenManagerCodeGenerator> codeGenClazz;
      TokenManagerCodeGenerator gen;
      try {
        codeGenClazz = (Class<TokenManagerCodeGenerator>)Class.forName(codeGeneratorClass);
        gen = codeGenClazz.newInstance();
      } catch(Exception ee) {
        JavaCCErrors.semantic_error(
            "Could not load the token manager code generator class: " +
            codeGeneratorClass + "\nError: " + ee.getMessage());
        return;
      }
      gen.generateCode(tokenizerData);
      gen.finish(tokenizerData);
      return;
    }

    RStringLiteral.DumpStrLiteralImages(this);
    DumpFillToken();
    NfaState.DumpStateSets(this);
    NfaState.DumpNonAsciiMoveMethods(this);
    DumpGetNextToken();

    if (Options.getDebugTokenManager())
    {
      NfaState.DumpStatesForKind(this);
      DumpDebugMethods();
    }

    if (hasLoop)
    {
      genCodeLine(staticString + "int[] jjemptyLineNo = new int[" + maxLexStates + "];");
      genCodeLine(staticString + "int[] jjemptyColNo = new int[" + maxLexStates + "];");
      genCodeLine(staticString + "" + Options.getBooleanType() + "[] jjbeenHere = new " + Options.getBooleanType() + "[" + maxLexStates + "];");
    }

    DumpSkipActions();
    DumpMoreActions();
    DumpTokenActions();

    NfaState.PrintBoilerPlate(this);

    String charStreamName;
    if (Options.getUserCharStream())
      charStreamName = "CharStream";
    else
    {
      if (Options.getJavaUnicodeEscape())
        charStreamName = "JavaCharStream";
      else
        charStreamName = "SimpleCharStream";
    }

    writeTemplate(BOILERPLATER_METHOD_RESOURCE_URL,
      "charStreamName", charStreamName,
      "lexStateNameLength", lexStateName.length,
      "defaultLexState", defaultLexState,
      "noDfa", Options.getNoDfa(),
      "generatedStates", totalNumStates);

    DumpStaticVarDeclarations(charStreamName);
    genCodeLine(/*{*/ "}");

    // TODO :: CBA --  Require Unification of output language specific processing into a single Enum class
    String fileName = Options.getOutputDirectory() + File.separator +
                      tokMgrClassName +
                      getFileExtension(Options.getOutputLanguage());

    if (Options.getBuildParser()) {
      saveOutput(fileName);
    }
  }

  

5.5. 生成其他辅助类

  OtherFilesGen 生成其他几个辅助类,比如Toekn, ParseException...

  // org.javacc.parser.OtherFilesGen#start
  static public void start(boolean isJavaModern) throws MetaParseException {

    JavaResourceTemplateLocations templateLoc = isJavaModern ? JavaFiles.RESOURCES_JAVA_MODERN : JavaFiles.RESOURCES_JAVA_CLASSIC;

    Token t = null;

    if (JavaCCErrors.get_error_count() != 0) throw new MetaParseException();

      // Added this if condition -- 2012/10/17 -- cba
    if ( Options.isGenerateBoilerplateCode()) {

        if (isJavaModern) {
            JavaFiles.gen_JavaModernFiles();
        }

        JavaFiles.gen_TokenMgrError(templateLoc);
        JavaFiles.gen_ParseException(templateLoc);
        JavaFiles.gen_Token(templateLoc);
    }


    if (Options.getUserTokenManager()) {
        // CBA -- I think that Token managers are unique so will always be generated
      JavaFiles.gen_TokenManager(templateLoc);
    } else if (Options.getUserCharStream()) {
        // Added this if condition -- 2012/10/17 -- cba
        if (Options.isGenerateBoilerplateCode()) {
            JavaFiles.gen_CharStream(templateLoc);
        }
    } else {
           // Added this if condition -- 2012/10/17 -- cba

        if (Options.isGenerateBoilerplateCode()) {
          if (Options.getJavaUnicodeEscape()) {
            JavaFiles.gen_JavaCharStream(templateLoc);
          } else {
            JavaFiles.gen_SimpleCharStream(templateLoc);
          }
        }
    }

    try {
      ostr = new java.io.PrintWriter(
                new java.io.BufferedWriter(
                   new java.io.FileWriter(
                     new java.io.File(Options.getOutputDirectory(), cu_name + CONSTANTS_FILENAME_SUFFIX)
                   ),
                   8192
                )
             );
    } catch (java.io.IOException e) {
      JavaCCErrors.semantic_error("Could not open file " + cu_name + "Constants.java for writing.");
      throw new Error();
    }

    List<String> tn = new ArrayList<String>(toolNames);
    tn.add(toolName);
    ostr.println("/* " + getIdString(tn, cu_name + CONSTANTS_FILENAME_SUFFIX) + " */");

    if (cu_to_insertion_point_1.size() != 0 &&
        ((Token)cu_to_insertion_point_1.get(0)).kind == PACKAGE
       ) {
      for (int i = 1; i < cu_to_insertion_point_1.size(); i++) {
        if (((Token)cu_to_insertion_point_1.get(i)).kind == SEMICOLON) {
          printTokenSetup((Token)(cu_to_insertion_point_1.get(0)));
          for (int j = 0; j <= i; j++) {
            t = (Token)(cu_to_insertion_point_1.get(j));
            printToken(t, ostr);
          }
          printTrailingComments(t, ostr);
          ostr.println("");
          ostr.println("");
          break;
        }
      }
    }
    ostr.println("");
    ostr.println("/**");
    ostr.println(" * Token literal values and constants.");
    ostr.println(" * Generated by org.javacc.parser.OtherFilesGen#start()");
    ostr.println(" */");

    if(Options.getSupportClassVisibilityPublic()) {
        ostr.print("public ");
    }
    ostr.println("interface " + cu_name + "Constants {");
    ostr.println("");

    RegularExpression re;
    ostr.println("  /** End of File. */");
    ostr.println("  int EOF = 0;");
    for (java.util.Iterator<RegularExpression> it = ordered_named_tokens.iterator(); it.hasNext();) {
      re = it.next();
      ostr.println("  /** RegularExpression Id. */");
      ostr.println("  int " + re.label + " = " + re.ordinal + ";");
    }
    ostr.println("");
    if (!Options.getUserTokenManager() && Options.getBuildTokenManager()) {
      for (int i = 0; i < Main.lg.lexStateName.length; i++) {
        ostr.println("  /** Lexical state. */");
        ostr.println("  int " + LexGen.lexStateName[i] + " = " + i + ";");
      }
      ostr.println("");
    }
    ostr.println("  /** Literal token values. */");
    ostr.println("  String[] tokenImage = {");
    ostr.println("    \"<EOF>\",");

    for (java.util.Iterator<TokenProduction> it = rexprlist.iterator(); it.hasNext();) {
      TokenProduction tp = (TokenProduction)(it.next());
      List<RegExprSpec> respecs = tp.respecs;
      for (java.util.Iterator<RegExprSpec> it2 = respecs.iterator(); it2.hasNext();) {
        RegExprSpec res = (RegExprSpec)(it2.next());
        re = res.rexp;
        ostr.print("    ");
        if (re instanceof RStringLiteral) {
          ostr.println("\"\\\"" + add_escapes(add_escapes(((RStringLiteral)re).image)) + "\\\"\",");
        } else if (!re.label.equals("")) {
          ostr.println("\"<" + re.label + ">\",");
        } else {
          if (re.tpContext.kind == TokenProduction.TOKEN) {
            JavaCCErrors.warning(re, "Consider giving this non-string token a label for better error reporting.");
          }
          ostr.println("\"<token of kind " + re.ordinal + ">\",");
        }

      }
    }
    ostr.println("  };");
    ostr.println("");
    ostr.println("}");

    ostr.close();

  }

  以上解析,感悟,待完善中。

 

posted @ 2021-08-08 17:21  阿牛20  阅读(2006)  评论(0编辑  收藏  举报