Lexical analysis with regular expressions
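A while ago I had to build a control for condition filtering. One of the requirements was to take an expression string and build a UI from it.
To build a UI from a string, the first step is obviously to parse the string. In compiler theory, parsing is done in two steps: lexical analysis, then syntax analysis.
I was given very little time for this task, the expression syntax is not complicated, and there were no real performance requirements. All things considered, I did the lexical analysis with regular expressions.
The lexical analysis function is as follows: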
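using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Splits a filter expression into tokens. Place this method inside a class.
public static string[] Syntax(string exp)
{
    // Token rules, tried in order; place your custom syntax here.
    // Every pattern starts with ^ so it can only match at the cursor.
    Regex[] regs = {
        new Regex(@"^\s*\("),          // '('
        new Regex(@"^\s*\)"),          // ')'
        new Regex(@"^\s*AND(?=[\W])"), // AND
        new Regex(@"^\s*NOT(?=[\W])"), // NOT
        new Regex(@"^\s*OR(?=[\W])"),  // OR
        // field=1234 or field =N'abc_1234'
        new Regex(@"^\s*\w*\s*(=|<=|<>|>=|>)\s*(\w*|N'[\w]*')(?=\s*\))"),
        // pairmatch.method1(field,value); the dot is escaped so it matches only a literal '.'
        new Regex(@"^\s*\w*\.\w*\(\s*\w*,(\s*\w*|N'\w*')\)(?=\s*\))"),
    };
    List<string> words = new List<string>();
    int cursor = 0;
    while (cursor < exp.Length)
    {
        bool matched = false;
        string rest = exp.Substring(cursor); // unread part of the input
        foreach (Regex item in regs)
        {
            Match m = item.Match(rest);
            if (m.Success)
            {
                // The patterns accept any leading whitespace, so trim all of it.
                words.Add(m.Value.Trim());
                cursor += m.Length;
                matched = true;
                break;
            }
        }
        if (!matched)
        {
            // Show at most 12 characters of context; a fixed length of 12
            // would throw when fewer characters remain.
            string near = exp.Substring(cursor, Math.Min(12, exp.Length - cursor));
            System.Console.WriteLine("Error happened near {0}", near);
            throw new Exception("Syntax Error in expression:\n" + exp);
        }
    }
    return words.ToArray();
}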
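This is a very crude lexer. To add a new tokenization rule, just add the corresponding regular expression to regs. The function returns a string array, that is, the tokens the expression was split into.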
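To make "just add a regular expression" concrete, here is a minimal, self-contained sketch of the same rule-table idea with one extra rule. The `LexerDemo` class, the `Tokenize` name, the `LIKE` rule, and the sample expression are all made up for illustration and are not part of the original control; the method-call rule is left out to keep the sketch short.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LexerDemo
{
    // Rule table in the same style as regs; rules are tried in order.
    private static readonly Regex[] Rules =
    {
        new Regex(@"^\s*\("),          // '('
        new Regex(@"^\s*\)"),          // ')'
        new Regex(@"^\s*AND(?=[\W])"), // AND
        new Regex(@"^\s*NOT(?=[\W])"), // NOT
        new Regex(@"^\s*OR(?=[\W])"),  // OR
        // field=1234 or field =N'abc_1234'
        new Regex(@"^\s*\w*\s*(=|<=|<>|>=|>)\s*(\w*|N'[\w]*')(?=\s*\))"),
        // Hypothetical new rule: field LIKE N'text'
        new Regex(@"^\s*\w+\s+LIKE\s+N'\w*'(?=\s*\))"),
    };

    public static string[] Tokenize(string exp)
    {
        var words = new List<string>();
        int cursor = 0;
        while (cursor < exp.Length)
        {
            string rest = exp.Substring(cursor);
            Match hit = null;
            foreach (Regex rule in Rules)
            {
                Match m = rule.Match(rest);
                if (m.Success) { hit = m; break; }
            }
            if (hit == null)
                throw new FormatException("Syntax error near position " + cursor);
            words.Add(hit.Value.Trim());
            cursor += hit.Length;
        }
        return words.ToArray();
    }

    public static void Main()
    {
        string[] tokens = Tokenize("(age>=18) AND (NOT (name LIKE N'bob'))");
        // Prints: ( | age>=18 | ) | AND | ( | NOT | ( | name LIKE N'bob' | ) | )
        Console.WriteLine(string.Join(" | ", tokens));
    }
}
```

Rule order matters in this design: the first pattern that matches at the cursor wins, so a new rule should be placed before any existing rule that could match the same prefix.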