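Syntax parser, continued: evaluating case..when expressions
I previously wrote a post on implementing a parser for SQL-like syntax; see: https://www.cnblogs.com/yougewe/p/13774289.html
That parser mostly translated one language into another and did no actual data computation. It also dropped much of the context-dependent syntax handling, so it stayed relatively simple.
So what if we want to compute on data? Say I give you some values and an expression: can you produce the result?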
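1. How hard is expression evaluation?
Take the expression field1 > 0 and field2 > 0, with the known values field1 = 1, field2 = 2. The result is obviously true.
But that is a manual evaluation, done in your head. What does it take to do the same thing in code?
I think we need to do at least the following:
1. Parse out all fields: field1, field2;
2. Parse out the comparison operator >;
3. Parse out the concrete comparison value on the right-hand side;
4. Parse out the connecting operator and;
5. Perform every comparison;
6. Combine the comparison results according to operator precedence to get the final result;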
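The steps above can be sketched in a few lines for the narrow case of numeric comparisons joined only by and. This is a minimal illustration, not the implementation used later in this post: the class and method names (SimpleConditionSketch, evalComparison, evalAndChain) are made up for the example, tokens are assumed to be separated by spaces, and a missing field is assumed to fail the comparison.

```java
import java.math.BigDecimal;
import java.util.HashMap;
import java.util.Map;

class SimpleConditionSketch {

    // Evaluates one comparison such as "field1 > 0" against the supplied values.
    static boolean evalComparison(String comparison, Map<String, String> values) {
        String[] parts = comparison.trim().split("\\s+");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected '<field> <op> <number>': " + comparison);
        }
        String raw = values.get(parts[0]);
        if (raw == null) {
            return false; // assumption: a missing field never satisfies a comparison
        }
        int cmp = new BigDecimal(raw).compareTo(new BigDecimal(parts[2]));
        switch (parts[1]) {
            case ">":  return cmp > 0;
            case ">=": return cmp >= 0;
            case "<":  return cmp < 0;
            case "<=": return cmp <= 0;
            case "=":  return cmp == 0;
            default:
                throw new IllegalArgumentException("unsupported operator: " + parts[1]);
        }
    }

    // Evaluates comparisons joined only by "and"; stops at the first false one.
    static boolean evalAndChain(String expression, Map<String, String> values) {
        for (String comparison : expression.split("(?i)\\s+and\\s+")) {
            if (!evalComparison(comparison, values)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> values = new HashMap<>();
        values.put("field1", "1");
        values.put("field2", "2");
        // prints "true"
        System.out.println(evalAndChain("field1 > 0 and field2 > 0", values));
    }
}
```

Even this toy version already has to decide what a missing field means and which operators exist, which hints at how quickly the general problem grows.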
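Well? Still think it is simple? If so, I bow to you!
Handling this in full generality is very complex. In heavyweight products such as HIVE, syntax parsing is a major component in its own right. In practice it involves a large amount of language specification, so it goes well beyond our simplified view.
So here I pick just one simple scenario to parse: as the title says, case..when..
That narrows the scope to evaluating an expression such as: case when field1 > 0 then 'f1' else 'fn' end; Given field1=1 the result should be f1, and given field1=0 the result should be fn.
With the scope drawn, the goal feels clearer. But is the problem really simple now? In fact there are still many branches to handle, because case..when.. can nest other syntax. So we can only do our best.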
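2. Implementing case..when.. expression evaluation
With the problem defined, we can start on the implementation. As described above, we have two known inputs: the expression and the base values.
Building on the parser from the previous post, we can quickly obtain the tokens that make up a case..when.. That saves a lot of work. Here I focus on how to collect the whole case..when.. phrase so that it forms one self-contained group of tokens.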
    // case..when.. is handled as part of the SQL keyword handler
    public SqlKeywordAstHandler(TokenDescriptor masterToken,
                                Iterator<TokenDescriptor> candidates,
                                TokenTypeEnum tokenType) {
        super(masterToken, candidates, tokenType);
        String word = masterToken.getRawWord().toLowerCase();
        if("case".equals(word)) {
            completeCaseWhenTokens(candidates);
        }
    }

    /**
     * Collect the token list of a case...when... phrase
     *
     * @param candidates tokens still to be consumed
     */
    private void completeCaseWhenTokens(Iterator<TokenDescriptor> candidates) {
        boolean syntaxClosed = false;
        while (candidates.hasNext()) {
            TokenDescriptor token = candidates.next();
            addExtendToken(token);
            if("end".equalsIgnoreCase(token.getRawWord())) {
                syntaxClosed = true;
                break;
            }
        }
        if(!syntaxClosed) {
            throw new SyntaxException("syntax error: case..when.. is not closed");
        }
    }
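That is how the case..when.. phrase is collected: starting at case and ending at end, every token in between is assigned to it. Another important point is to pick out the data fields and put them somewhere they can be retrieved.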
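Note that a loop which stops at the first end would cut a nested case..when.. short. One possible extension, which is not part of the original handler, is to track nesting depth. The sketch below works on plain strings instead of TokenDescriptor so that it is self-contained; CaseWhenSpanSketch and collectCaseWhenWords are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class CaseWhenSpanSketch {

    // Collects the words of one case..end phrase. The leading "case" is assumed
    // to have been consumed already, mirroring the handler's constructor.
    static List<String> collectCaseWhenWords(Iterator<String> candidates) {
        List<String> words = new ArrayList<>();
        words.add("case");
        int depth = 1; // number of "case" keywords still waiting for their "end"
        while (candidates.hasNext()) {
            String word = candidates.next();
            words.add(word);
            if ("case".equalsIgnoreCase(word)) {
                depth++;
            } else if ("end".equalsIgnoreCase(word) && --depth == 0) {
                return words;
            }
        }
        throw new IllegalStateException("syntax error: case..when.. is not closed");
    }

    public static void main(String[] args) {
        Iterator<String> rest = Arrays.asList(
                "when", "a", ">", "0", "then",
                "case", "when", "b", ">", "0", "then", "'x'", "else", "'y'", "end",
                "else", "'z'", "end", "from").iterator();
        List<String> phrase = collectCaseWhenWords(rest);
        System.out.println(String.join(" ", phrase));
        System.out.println("next word after the phrase: " + rest.next());
    }
}
```

The evaluator shown later does not evaluate nested case expressions either, so this only fixes where the phrase ends, not how it is computed.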
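With the individual elements in hand, we can do the semantic analysis. It could live inside the parser, but that may not be very reusable, so here I pull it out into a separate value-calculation class that is instantiated wherever it is needed. The core problem is the one described in the previous section; the implementation follows: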
import com.my.mvc.app.common.exception.SyntaxException;
import com.my.mvc.app.common.helper.parser.SyntaxStatement;
import com.my.mvc.app.common.helper.parser.TokenDescriptor;
import com.my.mvc.app.common.helper.parser.TokenTypeEnum;
import lombok.extern.slf4j.Slf4j;
import org.apache.commons.lang3.StringUtils;

import java.math.BigDecimal;
import java.util.*;
import java.util.stream.Collectors;

/**
 * Helper that evaluates a case..when.. expression against real data
 */
@Slf4j
public class CaseWhenElDataCalcHelper {

    /**
     * The complete case..when.. statement
     */
    private SyntaxStatement caseWhenStmt;

    public CaseWhenElDataCalcHelper(SyntaxStatement caseWhenStmt) {
        this.caseWhenStmt = caseWhenStmt;
    }

    /**
     * Compute the result of the case..when.. expression
     *
     * @param suppliers all raw field values
     * @return the computed value
     */
    public String calcCaseWhenData(Map<String, String> suppliers) {
        List<TokenDescriptor> allTokens = caseWhenStmt.getAllTokens();
        TokenDescriptor masterToken = allTokens.get(0);
        if(!"case".equalsIgnoreCase(masterToken.getRawWord())) {
            throw new SyntaxException("not a case..when.. expression");
        }
        int tokenLen = allTokens.size();
        if(tokenLen < 3) {
            throw new SyntaxException("syntax error in case..when.. expression");
        }
        TokenDescriptor closureToken = allTokens.get(tokenLen - 1);
        if(!"end".equalsIgnoreCase(closureToken.getRawWord())) {
            throw new SyntaxException("case..when.. expression is not closed");
        }
        // Only "case when xxx then xxx ... end" is supported for now.
        // "case field_name when 1 then '1' ... end" (matching on a single field) is not.
        List<TokenDescriptor> whenExpressionCandidates;
        for (int i = 1; i < tokenLen - 1; i++) {
            whenExpressionCandidates = new ArrayList<>();
            TokenDescriptor currentToken = allTokens.get(i);
            if("when".equalsIgnoreCase(currentToken.getRawWord())) {
                // collect the condition tokens up to "then"
                while (i + 1 < tokenLen) {
                    TokenDescriptor nextToken = allTokens.get(i + 1);
                    if("then".equalsIgnoreCase(nextToken.getRawWord())) {
                        break;
                    }
                    whenExpressionCandidates.add(nextToken);
                    ++i;
                }
                if(judgeWhenExpression(whenExpressionCandidates, suppliers)) {
                    List<TokenDescriptor> resultCandidates
                            = scrapeCaseWhenResultCandidates(allTokens, i + 1);
                    return calcExpressionData(resultCandidates, suppliers);
                }
                // go straight to the next iteration; the tokens after "then" are skipped as no-ops
            }
            if("else".equalsIgnoreCase(currentToken.getRawWord())) {
                List<TokenDescriptor> resultCandidates
                        = scrapeCaseWhenResultCandidates(allTokens, i);
                return calcExpressionData(resultCandidates, suppliers);
            }
        }
        return null;
    }

    /**
     * Collect all tokens that make up a result expression
     *
     * @param allTokens the full token list
     * @param start index of the "then"/"else" token that precedes the result
     * @return the tokens of the result expression
     */
    private List<TokenDescriptor> scrapeCaseWhenResultCandidates(List<TokenDescriptor> allTokens,
                                                                 int start) {
        List<TokenDescriptor> resultCandidates = new ArrayList<>();
        while (start + 1 < allTokens.size()) {
            TokenDescriptor nextToken = allTokens.get(start + 1);
            String word = nextToken.getRawWord();
            if("when".equalsIgnoreCase(word)
                    || "else".equalsIgnoreCase(word)
                    || "end".equalsIgnoreCase(word)) {
                break;
            }
            resultCandidates.add(nextToken);
            ++start;
        }
        return resultCandidates;
    }

    /**
     * Decide whether a when condition holds
     *
     * @param operatorCandidates tokens of the condition expression
     * @param suppliers source of the raw field values
     * @return true: condition holds, false: condition fails
     */
    private boolean judgeWhenExpression(List<TokenDescriptor> operatorCandidates,
                                        Map<String, String> suppliers) {
        List<AndOrOperatorSupervisor> supervisors = partitionByPriority(operatorCandidates);
        Boolean prevJudgeSuccess = null;
        for (AndOrOperatorSupervisor calc1 : supervisors) {
            Map<String, List<TokenDescriptor>> unitGroup = calc1.getUnitGroupTokens();
            String op = unitGroup.get("OP").get(0).getRawWord();
            if(calc1.getPrevType() != null
                    && prevJudgeSuccess != null && !prevJudgeSuccess) {
                // the previous operand of this "and" was false, so this one is false too
                if("and".equalsIgnoreCase(calc1.getPrevType().getRawWord())) {
                    continue;
                }
            }
            String leftValue = calcExpressionData(unitGroup.get("LEFT"), suppliers);
            TokenTypeEnum resultType = getPriorDataTypeByTokenList(unitGroup.get("RIGHT"));
            boolean myJudgeSuccess;
            if("in".equals(op)) {
                myJudgeSuccess = checkExistsIn(leftValue, unitGroup.get("RIGHT"), resultType);
            }
            else if("notin".equals(op)) {
                myJudgeSuccess = !checkExistsIn(leftValue, unitGroup.get("RIGHT"), resultType);
            }
            else {
                String rightValue = calcExpressionData(unitGroup.get("RIGHT"), suppliers);
                myJudgeSuccess = checkCompareTrue(leftValue, op, rightValue, resultType);
            }
            TokenDescriptor prevType = calc1.getPrevType();
            TokenDescriptor nextType = calc1.getNextType();
            // single condition
            if(prevType == null && nextType == null) {
                return myJudgeSuccess;
            }
            // last condition in the chain
            if(nextType == null) {
                return myJudgeSuccess;
            }
            prevJudgeSuccess = myJudgeSuccess;
            if("and".equalsIgnoreCase(nextType.getRawWord())) {
                if(!myJudgeSuccess && calc1.getNeedMoreStepsAfterMyFail() <= 0) {
                    return false;
                }
                continue;
            }
            if("or".equalsIgnoreCase(nextType.getRawWord())) {
                if(myJudgeSuccess) {
                    return true;
                }
                continue;
            }
            log.warn("unknown connector after condition: {}", nextType);
            throw new SyntaxException("syntax parsing error");
        }
        log.warn("no result could be determined, returning the default; please check");
        return false;
    }

    /**
     * Infer the data type to compare with from the value tokens
     *
     * @param tokenList tokens of the (not yet evaluated) right-hand side
     * @return the data type to use: number or string
     */
    private TokenTypeEnum getPriorDataTypeByTokenList(List<TokenDescriptor> tokenList) {
        for (TokenDescriptor token : tokenList) {
            if(token.getTokenType() == TokenTypeEnum.WORD_STRING) {
                return TokenTypeEnum.WORD_STRING;
            }
        }
        return TokenTypeEnum.WORD_NUMBER;
    }

    /**
     * Compute the concrete value of an expression
     *
     * @param resultCandidates tokens of the expression
     * @param suppliers source of the raw field values
     * @return the field value or literal, or null
     */
    private String calcExpressionData(List<TokenDescriptor> resultCandidates,
                                      Map<String, String> suppliers) {
        // for now, assume the expression contains no further operations
        TokenDescriptor first = resultCandidates.get(0);
        if(first.getTokenType() == TokenTypeEnum.WORD_NORMAL) {
            if("null".equalsIgnoreCase(first.getRawWord())) {
                return null;
            }
            return suppliers.get(first.getRawWord());
        }
        return unwrapStringToken(first.getRawWord());
    }

    /**
     * Check whether the given value is in the list
     *
     * @param aValue value to look for
     * @param itemList list of candidates
     * @param valueType value type, number or string
     * @return true: found, false: not in the list
     */
    private boolean checkExistsIn(String aValue,
                                  List<TokenDescriptor> itemList,
                                  TokenTypeEnum valueType) {
        if(aValue == null) {
            return false;
        }
        BigDecimal aValueNumber = null;
        for (TokenDescriptor tk1 : itemList) {
            if(valueType == TokenTypeEnum.WORD_NUMBER) {
                if(aValueNumber == null) {
                    aValueNumber = new BigDecimal(aValue);
                }
                if(aValueNumber.compareTo(
                        new BigDecimal(tk1.getRawWord())) == 0) {
                    return true;
                }
                continue;
            }
            if(aValue.equals(unwrapStringToken(tk1.getRawWord()))) {
                return true;
            }
        }
        return false;
    }

    /**
     * Strip the quotes around a string literal
     *
     * @param wrappedStr quoted string such as 'abc' or "abc"
     * @return the string without surrounding quotes, e.g. abc
     */
    private String unwrapStringToken(String wrappedStr) {
        if(wrappedStr == null || wrappedStr.length() == 0) {
            return null;
        }
        char[] values = wrappedStr.toCharArray();
        int i = 0;
        while (i < values.length - 1
                && (values[i] == '"' || values[i] == '\'')) {
            i++;
        }
        int j = values.length - 1;
        while (j > 0
                && (values[j] == '"' || values[j] == '\'')) {
            j--;
        }
        return new String(values, i, j - i + 1);
    }

    /**
     * Check whether "a op b" holds
     *
     * @param aValue left value
     * @param op comparison operator
     * @param bValue right value
     * @param valueType value type, mainly to tell numbers from strings
     * @return true: the comparison holds, false: it does not
     */
    private boolean checkCompareTrue(String aValue, String op,
                                     String bValue, TokenTypeEnum valueType) {
        // equality checks first
        if("null".equals(bValue)) {
            bValue = null;
        }
        switch(op) {
            case "=":
                if(bValue == null) {
                    return aValue == null;
                }
                return bValue.equals(aValue);
            case "!=":
            case "<>":
                if(bValue == null) {
                    return aValue != null;
                }
                return !bValue.equals(aValue);
        }
        if(bValue == null) {
            log.warn("null cannot be used with ordering comparison operators");
            throw new SyntaxException("syntax error");
        }
        // >=, <=, >, <
        int compareResult = compareTwoData(aValue, bValue, valueType);
        switch(op) {
            case ">":
                return compareResult > 0;
            case ">=":
                return compareResult >= 0;
            case "<=":
                return compareResult <= 0;
            case "<":
                return compareResult < 0;
        }
        throw new SyntaxException("unknown operator");
    }

    // compare a with b
    private int compareTwoData(String aValue,
                               String bValue,
                               TokenTypeEnum tokenType) {
        bValue = unwrapStringToken(bValue);
        if(bValue == null) {
            // rule: any value is greater than null
            return aValue == null ? 0 : 1;
        }
        // a null left value counts as smaller
        if(aValue == null) {
            return -1;
        }
        if(tokenType == TokenTypeEnum.WORD_NUMBER) {
            return new BigDecimal(aValue).compareTo(
                    new BigDecimal(bValue));
        }
        return aValue.compareTo(bValue);
    }

    // regroup the tokens so that each group is one atomic comparison
    private List<AndOrOperatorSupervisor> partitionByPriority(List<TokenDescriptor> tokens) {
        // 1. take the tokens of the left-hand side
        // 2. take the comparison operator
        // 3. take the tokens of the right-hand side
        // 4. build one expression as the smallest group
        // 5. check for a following operator; if present it must be and|or|(
        // 6. remember the connector and open a new group
        // 7. repeat steps 1-6 until all tokens are consumed

        // Preceding connector: decides whether this node must be evaluated
        // and how results are merged. For "and", this node must take part;
        // if the previous node was false the result is false right away.
        // Otherwise this node is evaluated first.
        TokenDescriptor preType = null;
        // After this node is evaluated, decides whether the next one is needed:
        // for "and" it is needed when this node is true, for "or" when it is false.
        TokenDescriptor nextType = null;
        // keys are LEFT, OP, RIGHT; each value holds the tokens of that part
        Map<String, List<TokenDescriptor>> unitGroup = new HashMap<>();
        String currentReadPos = "LEFT";
        List<TokenDescriptor> smallGroupTokenList = new ArrayList<>();
        // the above describes a single comparison; a plain list of them
        // is enough to describe an expression without parentheses
        List<AndOrOperatorSupervisor> bracketGroup = new ArrayList<>();
        AndOrOperatorSupervisor supervisor
                = new AndOrOperatorSupervisor(null, unitGroup);
        bracketGroup.add(supervisor);
        for (int i = 0; i < tokens.size(); i++) {
            TokenDescriptor token = tokens.get(i);
            String word = token.getRawWord().toLowerCase();
            TokenTypeEnum tokenType = token.getTokenType();
            // skip separators: assume a single level and ignore
            // the precedence issues that grouping would bring
            if(tokenType == TokenTypeEnum.CLAUSE_SEPARATOR) {
                continue;
            }
            // plain comparison operator
            if(tokenType == TokenTypeEnum.COMPARE_OPERATOR
                    && !",".equals(word)) {
                unitGroup.put("OP", Collections.singletonList(token));
                currentReadPos = "RIGHT";
                continue;
            }
            // is null, is not null
            if("is".equals(word)) {
                while (i + 1 < tokens.size()) {
                    TokenDescriptor nextToken = tokens.get(i + 1);
                    if("null".equalsIgnoreCase(nextToken.getRawWord())) {
                        TokenDescriptor opToken = new TokenDescriptor("=",
                                TokenTypeEnum.COMPARE_OPERATOR);
                        unitGroup.put("OP", Collections.singletonList(opToken));
                        currentReadPos = "RIGHT";
                        List<TokenDescriptor> curTokenList = unitGroup.computeIfAbsent(
                                currentReadPos, r -> new ArrayList<>());
                        curTokenList.add(nextToken);
                        // skip 1 token
                        i += 1;
                        break;
                    }
                    if("not".equalsIgnoreCase(nextToken.getRawWord())) {
                        if(i + 2 >= tokens.size()) {
                            throw new SyntaxException("syntax error 3: is");
                        }
                        nextToken = tokens.get(i + 2);
                        if(!"null".equalsIgnoreCase(nextToken.getRawWord())) {
                            throw new SyntaxException("syntax error 4: is");
                        }
                        TokenDescriptor opToken = new TokenDescriptor("!=",
                                TokenTypeEnum.COMPARE_OPERATOR);
                        unitGroup.put("OP", Collections.singletonList(opToken));
                        currentReadPos = "RIGHT";
                        List<TokenDescriptor> curTokenList = unitGroup.computeIfAbsent(
                                currentReadPos, r -> new ArrayList<>());
                        curTokenList.add(nextToken);
                        // skip 2 tokens
                        i += 2;
                        break;
                    }
                    // anything else after "is" is unsupported; without this
                    // the loop would never advance
                    throw new SyntaxException("syntax error: is must be followed by [not] null");
                }
                continue;
            }
            // in (x,x,xx)
            if("in".equals(word)) {
                TokenDescriptor opToken = new TokenDescriptor("in",
                        TokenTypeEnum.COMPARE_OPERATOR);
                unitGroup.put("OP", Collections.singletonList(opToken));
                currentReadPos = "RIGHT";
                List<TokenDescriptor> curTokenList = unitGroup.computeIfAbsent(
                        currentReadPos, r -> new ArrayList<>());
                i = parseInItems(tokens, curTokenList, i);
                continue;
            }
            // not in (x,xxx,xx)
            if("not".equals(word)) {
                if(i + 1 >= tokens.size()) {
                    throw new SyntaxException("syntax error: not");
                }
                TokenDescriptor nextToken = tokens.get(i + 1);
                // "not exists" and the like are not supported yet
                if(!"in".equalsIgnoreCase(nextToken.getRawWord())) {
                    throw new SyntaxException("unsupported syntax: not");
                }
                TokenDescriptor opToken = new TokenDescriptor("notin",
                        TokenTypeEnum.COMPARE_OPERATOR);
                unitGroup.put("OP", Collections.singletonList(opToken));
                currentReadPos = "RIGHT";
                List<TokenDescriptor> curTokenList = unitGroup.computeIfAbsent(
                        currentReadPos, r -> new ArrayList<>());
                i = parseInItems(tokens, curTokenList, i + 1);
                continue;
            }
            // only one level is parsed for now, no parentheses
            if("and".equals(word) || "or".equals(word)) {
                supervisor.setNextType(token);
                // roll over to the next comparison
                unitGroup = new HashMap<>();
                supervisor = new AndOrOperatorSupervisor(token, unitGroup);
                bracketGroup.add(supervisor);
                currentReadPos = "LEFT";
                continue;
            }
            List<TokenDescriptor> curTokenList = unitGroup.computeIfAbsent(
                    currentReadPos, r -> new ArrayList<>());
            curTokenList.add(token);
        }
        sortAndSetCalcFlag(bracketGroup);
        return bracketGroup;
    }

    /**
     * Reorder by precedence or do other preparation
     *
     * @param bracketGroup atomic comparisons in source order
     */
    private void sortAndSetCalcFlag(List<AndOrOperatorSupervisor> bracketGroup) {
        for (int i = bracketGroup.size() - 1; i > 0; i--) {
            AndOrOperatorSupervisor calc1 = bracketGroup.get(i);
            TokenDescriptor prevUnionDesc = calc1.getPrevType();
            if(prevUnionDesc == null) {
                continue;
            }
            if(!"or".equalsIgnoreCase(prevUnionDesc.getRawWord())) {
                continue;
            }
            // every earlier node gets one more attempt after it fails
            for (int j = 0; j < i; j++) {
                AndOrOperatorSupervisor calcTemp = bracketGroup.get(j);
                calcTemp.incrStepAfterMyFail();
            }
        }
    }

    /**
     * Collect all items of an in (...) list
     *
     * @param tokens all tokens
     * @param curTokenList list that receives the items
     * @param start index of the "in" token
     * @return index where the in (...) list ends
     */
    private int parseInItems(List<TokenDescriptor> tokens,
                             List<TokenDescriptor> curTokenList,
                             int start) {
        while (start + 1 < tokens.size()) {
            TokenDescriptor nextToken = tokens.get(++start);
            String nextWord = nextToken.getRawWord();
            if("(".equals(nextWord) || ",".equals(nextWord)) {
                // start of the list, or an item separator
                continue;
            }
            if(")".equals(nextWord)) {
                break;
            }
            curTokenList.add(nextToken);
        }
        return start;
    }

    /**
     * Descriptor of the smallest evaluation unit
     */
    private class AndOrOperatorSupervisor {
        // Preceding connector: decides whether this node must be evaluated
        // and how results are merged. For "and", this node must take part;
        // if the previous node was false the result is false right away.
        // Otherwise this node is evaluated first.
        private TokenDescriptor prevType;
        // After this node is evaluated, decides whether the next one is needed:
        // for "and" it is needed when this node is true, for "or" when it is false.
        private TokenDescriptor nextType;
        // keys are LEFT, OP, RIGHT; each value holds the tokens of that part
        private Map<String, List<TokenDescriptor>> unitGroupTokens;

        /**
         * Number of further comparisons to try after this node fails (e.g. for "or")
         */
        private int needMoreStepsAfterMyFail;

        public AndOrOperatorSupervisor(TokenDescriptor prevType,
                                       Map<String, List<TokenDescriptor>> unitGroupTokens) {
            this.prevType = prevType;
            this.unitGroupTokens = unitGroupTokens;
        }

        public void setNextType(TokenDescriptor nextType) {
            this.nextType = nextType;
        }

        public TokenDescriptor getPrevType() {
            return prevType;
        }

        public TokenDescriptor getNextType() {
            return nextType;
        }

        public Map<String, List<TokenDescriptor>> getUnitGroupTokens() {
            return unitGroupTokens;
        }

        public int getNeedMoreStepsAfterMyFail() {
            return needMoreStepsAfterMyFail;
        }

        public void incrStepAfterMyFail() {
            this.needMoreStepsAfterMyFail++;
        }

        @Override
        public String toString() {
            return StringUtils.join(
                        unitGroupTokens.get("LEFT").stream()
                            .map(TokenDescriptor::getRawWord)
                            .collect(Collectors.toList()), ' ')
                    + unitGroupTokens.get("OP").get(0).getRawWord()
                    + StringUtils.join(
                        unitGroupTokens.get("RIGHT").stream()
                            .map(TokenDescriptor::getRawWord)
                            .collect(Collectors.toList()), ' ')
                    + ", prev=" + prevType
                    + ", next=" + nextType
                    ;
        }
    }
}
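For each use, construct a new calculator instance from the case..when.. statement, then call calcCaseWhenData(rawData) with the known values to compute the final case..when.. result.
To keep things simple, nested logic is not handled in depth and parentheses are ignored outright. Arithmetic on values is also left out for now: in field1 > 1+1, for example, the right-hand side is not evaluated to 2. Readers who need these features can add them with moderate effort.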
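For a flat condition without parentheses, the precedence rule to respect is that and binds tighter than or, so a and b or c means (a and b) or c. The following is a minimal sketch of that rule over already-computed comparison results. It is a different, simpler formulation than the needMoreStepsAfterMyFail counter used in the class above, and AndOrPrecedenceSketch and evaluate are names invented for the example.

```java
import java.util.Arrays;
import java.util.List;

class AndOrPrecedenceSketch {

    // results:    outcome of each atomic comparison, in source order
    // connectors: "and"/"or" between consecutive results (one fewer than results)
    static boolean evaluate(List<Boolean> results, List<String> connectors) {
        if (results.isEmpty() || connectors.size() != results.size() - 1) {
            throw new IllegalArgumentException("need n results and n-1 connectors");
        }
        boolean currentAndGroup = results.get(0);
        for (int i = 0; i < connectors.size(); i++) {
            String connector = connectors.get(i);
            boolean next = results.get(i + 1);
            if ("and".equalsIgnoreCase(connector)) {
                currentAndGroup = currentAndGroup && next;
            } else if ("or".equalsIgnoreCase(connector)) {
                if (currentAndGroup) {
                    return true; // a finished and-group that is true decides the chain
                }
                currentAndGroup = next; // start the next and-group
            } else {
                throw new IllegalArgumentException("unknown connector: " + connector);
            }
        }
        return currentAndGroup;
    }

    public static void main(String[] args) {
        // field1 != '0' and field2 != '0' or field3 != '0', with field1 = "0"
        System.out.println(evaluate(
                Arrays.asList(false, true, true),
                Arrays.asList("and", "or"))); // prints "true"
    }
}
```

Supporting parentheses would mean applying the same routine recursively to each bracketed group, which the implementation above deliberately leaves out.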
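For null handling, the convention here is that null is smaller than any value and any value is greater than null. Types are split into string and numeric, and the type is inferred from the right-hand value. This means a numeric value must not be written as a string literal unless both forms give the same result. Non-numeric fields must not be used in arithmetic (although the implementation above does not do arithmetic anyway).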
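The effect of these conventions is easiest to see in isolation. The sketch below restates the comparison rule in a self-contained form; NullAwareCompareSketch and its boolean numeric flag are stand-ins invented for the example (the class above passes a TokenTypeEnum instead).

```java
import java.math.BigDecimal;

class NullAwareCompareSketch {

    // Returns <0, 0 or >0 like Comparable.compareTo, treating null as the smallest value.
    static int compare(String left, String right, boolean numeric) {
        if (right == null) {
            return left == null ? 0 : 1; // any value is greater than null
        }
        if (left == null) {
            return -1; // null is smaller than any value
        }
        if (numeric) {
            // BigDecimal.compareTo ignores scale, so "1.0" and "1" compare as equal
            return new BigDecimal(left).compareTo(new BigDecimal(right));
        }
        return left.compareTo(right); // plain lexicographic order
    }

    public static void main(String[] args) {
        System.out.println(compare("10", "9", true) > 0);   // prints "true": 10 > 9 as numbers
        System.out.println(compare("10", "9", false) > 0);  // prints "false": "10" < "9" as strings
        System.out.println(compare(null, "1", true));       // prints "-1"
        System.out.println(compare("1.0", "1", true));      // prints "0"
    }
}
```

The second line is why writing a number as a string literal is discouraged: field > '9' would be compared as text and "10" would lose.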
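Because the case when syntax is fairly clear-cut, we simply read the tokens in order and evaluate each condition until one produces the result. The single-value form of case when is not supported either, so the implementation is not complicated. That does not get in the way of understanding the overall approach to syntax processing, and I hope readers who need it find it useful.
3. Unit tests for expression evaluation
The code above is only the implementation; it needs tests for the various scenarios before it can be considered working. They mainly cover and/or, in, is null and similar cases, as follows: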
import com.my.mvc.app.common.helper.CaseWhenElDataCalcHelper;
import com.my.mvc.app.common.helper.SimpleSyntaxParser;
import com.my.mvc.app.common.helper.parser.ParsedClauseAst;
import lombok.extern.slf4j.Slf4j;
import org.junit.Assert;
import org.junit.Test;

import java.util.HashMap;
import java.util.Map;

@Slf4j
public class CaseWhenElDataCalcHelperTest {

    @Test
    public void testCaseWhenSimple1() {
        String condition;
        ParsedClauseAst parsedClause;
        CaseWhenElDataCalcHelper helper;
        Map<String, String> rawData;

        condition = "case \n" +
                "\twhen (kehu_phone is null or field1 != 'c') then m_phone \n" +
                "\telse kehu_phone\n" +
                "end";
        parsedClause = SimpleSyntaxParser.parse(condition);
        helper = new CaseWhenElDataCalcHelper(parsedClause.getAst().get(0));
        rawData = new HashMap<>();
        rawData.put("kehu_phone", "kehu_phone_v1");
        rawData.put("field1", "field1_v");
        rawData.put("m_phone", "m_phone_v");
        Assert.assertEquals("wrong number of fields parsed from case..when..",
                3, parsedClause.getIdMapping().size());
        Assert.assertEquals("wrong case..when.. result",
                rawData.get("m_phone"),
                helper.calcCaseWhenData(rawData));

        condition = "case \n" +
                "\twhen (kehu_phone is null) then m_phone \n" +
                "\telse kehu_phone\n" +
                "end";
        parsedClause = SimpleSyntaxParser.parse(condition);
        helper = new CaseWhenElDataCalcHelper(parsedClause.getAst().get(0));
        rawData = new HashMap<>();
        rawData.put("kehu_phone", "kehu_phone_v1");
        rawData.put("field1", "field1_v");
        rawData.put("m_phone", "m_phone_v");
        Assert.assertEquals("wrong number of fields parsed from case..when..",
                2, parsedClause.getIdMapping().size());
        Assert.assertEquals("wrong case..when.. result",
                rawData.get("kehu_phone"),
                helper.calcCaseWhenData(rawData));

        rawData.remove("kehu_phone");
        Assert.assertEquals("wrong case..when.. result",
                rawData.get("m_phone"),
                helper.calcCaseWhenData(rawData));

        condition = " case \n" +
                " \twhen is_sx_emp='Y' then 'Y1' \n" +
                " \twhen is_sx_new_custom!='Y' then 'Y2' \n" +
                " \twhen is_sx_fort_promot_custom='Y' then 'Y3' \n" +
                " \twhen promotion_role_chn in ('10','11') and first_tenthousand_dt is not null then 'Y4' \n" +
                " \telse 'N' \n" +
                " end";
        parsedClause = SimpleSyntaxParser.parse(condition);
        helper = new CaseWhenElDataCalcHelper(parsedClause.getAst().get(0));
        rawData = new HashMap<>();
        rawData.put("is_sx_emp", "N");
        rawData.put("is_sx_new_custom", "Y");
        rawData.put("is_sx_fortune_promot_custom", "N");
        rawData.put("promotion_role_chn", "10");
        rawData.put("first_tenthousand_dt", "10");
        Assert.assertEquals("wrong number of fields parsed from case..when..",
                5, parsedClause.getIdMapping().size());
        Assert.assertEquals("wrong case..when.. result for in",
                "Y4",
                helper.calcCaseWhenData(rawData));

        rawData = new HashMap<>();
        rawData.put("is_sx_emp", "N");
        rawData.put("is_sx_new_custom", "Y");
        rawData.put("is_sx_fortune_promot_custom", "N");
        rawData.put("first_tenthousand_dt", "10");
        rawData.put("promotion_role_chn", "9");
        Assert.assertEquals("wrong case..when.. result for else",
                "N",
                helper.calcCaseWhenData(rawData));

        rawData = new HashMap<>();
        rawData.put("is_sx_new_custom", "Y");
        rawData.put("is_sx_fortune_promot_custom", "N");
        rawData.put("first_tenthousand_dt", "10");
        rawData.put("promotion_role_chn", "9");
        rawData.put("is_sx_emp", "Y");
        Assert.assertEquals("wrong case..when.. result for =",
                "Y1",
                helper.calcCaseWhenData(rawData));

        rawData = new HashMap<>();
        rawData.put("is_sx_emp", "N");
        rawData.put("is_sx_new_custom", "N");
        rawData.put("is_sx_fortune_promot_custom", "N");
        rawData.put("first_tenthousand_dt", "10");
        rawData.put("promotion_role_chn", "9");
        Assert.assertEquals("wrong case..when.. result for !=",
                "Y2",
                helper.calcCaseWhenData(rawData));

        rawData = new HashMap<>();
        rawData.put("is_sx_emp", "N");
        rawData.put("is_sx_new_custom", "Y");
        rawData.put("is_sx_fortune_promot_custom", "N");
        // rawData.put("first_tenthousand_dt", "10");
        rawData.put("promotion_role_chn", "11.1");
        Assert.assertEquals("wrong case..when.. result for in+and+null",
                "N",
                helper.calcCaseWhenData(rawData));

        condition = " case \n" +
                " \twhen is_sx_emp='Y' then 'Y1' \n" +
                " \twhen or_emp != null or or_emp2 > 3 then 'Y2_OR' \n" +
                " \twhen and_emp != null and and_emp2 > 3 or or_tmp3 <= 11 then 'Y3_OR' \n" +
                " \twhen promotion_role_chn not in ('10','11') and first_tenthousand_dt is not null then 'Y4' \n" +
                " \twhen promotion_role_chn not in ('10') then 'Y5_NOTIN' \n" +
                " \telse 'N_ELSE' \n" +
                " end";
        parsedClause = SimpleSyntaxParser.parse(condition);
        helper = new CaseWhenElDataCalcHelper(parsedClause.getAst().get(0));
        rawData = new HashMap<>();
        rawData.put("is_sx_emp", "N");
        rawData.put("or_emp", "Y");
        Assert.assertEquals("wrong number of fields parsed from case..when..",
                8, parsedClause.getIdMapping().size());
        Assert.assertEquals("wrong case..when.. result for in",
                "Y2_OR",
                helper.calcCaseWhenData(rawData));

        rawData = new HashMap<>();
        rawData.put("is_sx_emp", "N");
        // rawData.put("or_emp", "Y");
        rawData.put("or_emp2", "2");
        rawData.put("or_tmp3", "12");
        Assert.assertEquals("wrong case..when.. result for or>2",
                "Y5_NOTIN",
                helper.calcCaseWhenData(rawData));

        rawData = new HashMap<>();
        rawData.put("is_sx_emp", "N");
        // rawData.put("or_emp", "Y");
        rawData.put("or_emp2", "2");
        rawData.put("or_tmp3", "13");
        rawData.put("promotion_role_chn", "10");
        Assert.assertEquals("wrong case..when.. result for not in",
                "N_ELSE",
                helper.calcCaseWhenData(rawData));

        condition = " case \n" +
                " \twhen (is_sx_emp='Y' or a_field=3) then 'Y1' \n" +
                " \telse 'N_ELSE' \n" +
                " end";
        parsedClause = SimpleSyntaxParser.parse(condition);
        helper = new CaseWhenElDataCalcHelper(parsedClause.getAst().get(0));
        rawData = new HashMap<>();
        rawData.put("is_sx_emp", "N");
        rawData.put("or_emp", "Y");
        Assert.assertEquals("wrong number of fields parsed from case..when..",
                2, parsedClause.getIdMapping().size());
        Assert.assertEquals("wrong case..when.. result with parentheses",
                "N_ELSE",
                helper.calcCaseWhenData(rawData));

        rawData = new HashMap<>();
        rawData.put("is_sx_emp", "Y");
        rawData.put("or_emp", "Y");
        Assert.assertEquals("wrong number of fields parsed from case..when..",
                2, parsedClause.getIdMapping().size());
        Assert.assertEquals("wrong case..when.. result with parentheses 2",
                "Y1",
                helper.calcCaseWhenData(rawData));

        condition = " case \n" +
                " \twhen (field1 != '0' and field2 != '0' or field3 != '0') then field3 \n" +
                " \telse 'N_ELSE' \n" +
                " end";
        parsedClause = SimpleSyntaxParser.parse(condition);
        helper = new CaseWhenElDataCalcHelper(parsedClause.getAst().get(0));
        rawData = new HashMap<>();
        rawData.put("field1", "f1");
        rawData.put("field2", "f2");
        rawData.put("field3", "f3");
        Assert.assertEquals("wrong number of fields parsed from case..when..",
                3, parsedClause.getIdMapping().size());
        Assert.assertEquals("wrong case..when.. result for and+or precedence",
                rawData.get("field3"),
                helper.calcCaseWhenData(rawData));

        rawData = new HashMap<>();
        rawData.put("field1", "0");
        rawData.put("field2", "f2");
        rawData.put("field3", "f3");
        Assert.assertEquals("wrong case..when.. result for and+or precedence",
                rawData.get("field3"),
                helper.calcCaseWhenData(rawData));
    }
}
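For more scenarios, we only need to add tests and then fill in the corresponding logic. All of these tests can follow standard SQL; where something is missing, the feature should be added rather than asking users to adapt to our own conventions. Standards are a good thing, after all.
4. More expression evaluation
In practice, we do not necessarily have to implement expression evaluation ourselves; it is a lot of effort. Open-source products cover it, for example aviator: https://www.oschina.net/p/aviator?hmsr=aladdin1e1 https://www.jianshu.com/p/02403dd1f4c4
If the engine does not support a given syntax, convert it into a syntax the engine does support first, then let the engine evaluate it.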
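As an illustration of such a conversion, a searched case..when.. can be rewritten as a nested ternary expression, a form many expression engines accept. The sketch below only restructures the text. It assumes words are separated by whitespace and that string literals contain no spaces or keywords, and it uses nil for a missing else branch on the assumption that the target engine spells null that way (check this against the engine's documentation). Operators such as and, or, = and <> would still need mapping to the engine's own spelling. CaseWhenToTernarySketch and toTernary are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

class CaseWhenToTernarySketch {

    // Rewrites "case when C1 then R1 [when C2 then R2 ...] [else E] end"
    // into "(C1 ? R1 : (C2 ? R2 : E))".
    static String toTernary(String caseWhen) {
        String[] words = caseWhen.trim().split("\\s+");
        if (words.length < 2
                || !"case".equalsIgnoreCase(words[0])
                || !"end".equalsIgnoreCase(words[words.length - 1])) {
            throw new IllegalArgumentException("not a closed case..when.. expression");
        }
        List<String> conditions = new ArrayList<>();
        List<String> results = new ArrayList<>();
        String elseResult = "nil"; // assumption: the engine's literal for null
        StringBuilder current = new StringBuilder();
        String section = null; // keyword that the words in `current` belong to
        for (int i = 1; i < words.length; i++) {
            String lower = words[i].toLowerCase();
            boolean keyword = lower.equals("when") || lower.equals("then")
                    || lower.equals("else") || lower.equals("end");
            if (!keyword) {
                if (current.length() > 0) {
                    current.append(' ');
                }
                current.append(words[i]);
                continue;
            }
            // a keyword closes the section collected so far
            if ("when".equals(section)) {
                conditions.add(current.toString());
            } else if ("then".equals(section)) {
                results.add(current.toString());
            } else if ("else".equals(section)) {
                elseResult = current.toString();
            }
            current.setLength(0);
            section = lower;
        }
        if (conditions.isEmpty() || conditions.size() != results.size()) {
            throw new IllegalArgumentException("each when needs a matching then");
        }
        // build from the innermost (last) branch outwards
        String expression = elseResult;
        for (int i = conditions.size() - 1; i >= 0; i--) {
            expression = "(" + conditions.get(i) + " ? " + results.get(i) + " : " + expression + ")";
        }
        return expression;
    }

    public static void main(String[] args) {
        // prints "(field1 > 0 ? 'f1' : 'fn')"
        System.out.println(toTernary("case when field1 > 0 then 'f1' else 'fn' end"));
    }
}
```

The resulting string, together with a map of field values, is what would then be handed to the engine.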
Expression evaluation looks like computation, but at its core it is still translation: the expression is simply translated into Java and then executed.