Semantic Analysis: Generating Syntax Trees for C Expressions (Python Implementation)
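Linghu Chong walked slowly toward him. The man trembled all over, his knees buckled, and he fell kneeling in the snow. Linghu Chong said angrily, "You insulted my junior sister; I cannot spare you." He held his sword at the man's throat, then a thought struck him. He stepped closer and asked in a low voice, "What were the words written on the snowman?"
The man answered in a shaking voice, "It... it was... 'Though the seas... the seas run dry... and the rocks crumble, our... love... love will... will not change.'" Ever since the words "Though the seas run dry and the rocks crumble, our love will not change" came into the world, this was surely the first time anyone had spoken them in such terror and despair.
Linghu Chong stood dazed for a moment and said, "Yes. Though the seas run dry and the rocks crumble, our love will not change." With an ache in his heart, he thrust the sword forward into the man's throat.
From The Smiling, Proud Wanderer (《笑傲江湖》)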
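The fundamental reason semantic analysis is hard is that grammars are recursive: deep recursion makes decomposing the problem look quite complicated. If the recursive problem can be turned into an iterative one, however, the model becomes much simpler. The key to this conversion is to identify every feature of the innermost recursive structure and process it iteratively; the rest then follows.
When people face a complex recursive problem, they usually work from its grammar rules in the same way: they find the innermost structure, resolve it, and iterate step by step until the problem is solved. Human thinking is quite good at turning recursion into iteration, and learning can be seen as the process of understanding and mastering all kinds of grammar rules.
The recursion introduced by unary and binary operators is easy to convert into iteration; operators with more operands are slightly more involved.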
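As a small illustration of the unary case, the sketch below folds a chain of prefix operators iteratively by always reducing the innermost operator first, that is, the one whose right neighbour is already an operand. The function name `fold_unary_chain` and the simplified token format (plain strings instead of the two-element lists used later) are assumptions made for this example only; they are not part of the code in this article.

```python
UNARY_OPS = ('+', '-', '!', '~')

def fold_unary_chain(tokens):
    """Fold a list such as ['!', '-', 'x'] into nested nodes, innermost first.

    Hypothetical helper: tokens are plain strings, and the last token is
    taken to be the operand.
    """
    tokens = list(tokens)              # work on a copy
    while len(tokens) > 1:
        # the innermost operator sits directly in front of the operand
        i = len(tokens) - 2
        op, operand = tokens[i], tokens[i + 1]
        if not isinstance(op, str) or op not in UNARY_OPS:
            raise ValueError('not a unary chain: %r' % (tokens,))
        # replace "op operand" with a single node; no recursion needed
        tokens[i:i + 2] = [['@expr', op, operand]]
    return tokens[0]

print(fold_unary_chain(['!', '-', 'x']))
# ['@expr', '!', ['@expr', '-', 'x']]
```

Each pass removes one level of nesting, so the loop runs once per operator.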
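All the operators and their precedence levels are shown in the figure below:
Operators such as typeof, address-of and pointer access are not implemented here. What is implemented covers arithmetic expressions, logical expressions, function calls and parentheses, which is enough for understanding the semantic analysis process.
For simple expressions that contain no parentheses or function calls, our semantic analysis proceeds as follows: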
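The reduction order for such a simple expression can be sketched in a few lines: scan the flat token list once per precedence level, from the highest level down, and replace every `A op A` triple with a single node. The version below is a simplified sketch that handles only two binary levels; the names `LEVELS` and `reduce_binary` are hypothetical and do not appear in the article's code.

```python
# Binary operators grouped from highest to lowest precedence (a subset only).
LEVELS = (('*', '/', '%'), ('+', '-'))

def reduce_binary(tokens):
    """Reduce a flat token list to one tree, one precedence level at a time."""
    tokens = list(tokens)
    for ops in LEVELS:
        i = 1
        while i < len(tokens) - 1:
            left, mid, right = tokens[i - 1], tokens[i], tokens[i + 1]
            if mid[0] in ops and left[0][0] == '@' and right[0][0] == '@':
                # replace "A op B" with one @expr node and stay at the same
                # index, so that left-associative chains keep folding
                tokens[i - 1:i + 2] = [['@expr', mid[0], left, right]]
            else:
                i += 1
    if len(tokens) != 1:
        raise ValueError('could not reduce: %r' % (tokens,))
    return tokens[0]

# 31 + 8 * 9
tokens = [['@num', '31'], ['+', None], ['@num', '8'], ['*', None], ['@num', '9']]
print(reduce_binary(tokens))
# ['@expr', '+', ['@num', '31'], ['@expr', '*', ['@num', '8'], ['@num', '9']]]
```

The multiplication is folded during the first pass, so by the time the `+` level is scanned its right operand is already a single node.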
Our data structures:
'''
____________________________ Syntax Tree
Parentheses:
    ["(",None]
    [")",None]
Operators(grouped by precedence):
    Unary :
        1  + - ! ~               ["+",None] ["-",None] ["!",None] ["~",None]
    Binary :
        2  * / %                 ["*",None] ["/",None] ["%",None]
        3  + -                   ["+",None] ["-",None]
        4  << >>                 ["<<",None] [">>",None]
        5  > >= < <=             [">",None] [">=",None] ["<",None] ["<=",None]
        6  == !=                 ["==",None] ["!=",None]
        7  &                     ["&",None]
        8  ^                     ["^",None]
        9  |                     ["|",None]
        10 &&                    ["&&",None]
        11 ||                    ["||",None]
    Ternary :
        12 expr ? expr : expr    ["?",None] [":",None] ["@expr","?:",listPtr0,listPtr1,listPtr2]
        13 expr , expr , expr...
Var,Num,Expr,Function:
    ["@var","varName"]
    ["@num","num_string"]
    ["@expr","Operator",listPtr,...]
    ["@func","funcName",listPtr1,...]
    ["@expr_list",["@var"|"@num"|"@expr"|"@func",...],...]
'''
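The parser in this article starts from a token list that is already in the format above. For experimenting, a small tokenizer that produces such a list from a C expression string can be handy. The sketch below is not part of the original code: the name `tokenize`, the regular expression, and the restriction to decimal integers and plain identifiers are all assumptions made for illustration.

```python
import re

# Two-character operators come first so that '<<' is not read as '<', '<'.
TOKEN_RE = re.compile(
    r"\s*(?:(?P<num>\d+)|(?P<var>[A-Za-z_]\w*)"
    r"|(?P<op><<|>>|>=|<=|==|!=|&&|\|\||[-+!~*/%<>&^|?:,()]))"
)

def tokenize(text):
    """Turn a C expression string into the list-of-lists token format."""
    tokens = []
    text = text.rstrip()
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError('unexpected input at position %d' % pos)
        if m.group('num') is not None:
            tokens.append(['@num', m.group('num')])
        elif m.group('var') is not None:
            tokens.append(['@var', m.group('var')])
        else:
            # operators and parentheses carry None as their second field
            tokens.append([m.group('op'), None])
        pos = m.end()
    return tokens

print(tokenize('31 + 8 * 9'))
# [['@num', '31'], ['+', None], ['@num', '8'], ['*', None], ['@num', '9']]
```

Unary and binary `+`/`-` get the same token here; telling them apart is left to the parser, which looks at the neighbouring tokens.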
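This is our final code module diagram:
In function names of the form module_x_y, x is the precedence level of the operator and y is its index within that level, starting from zero. The code is commented in detail, so please read the source: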
######################################## global list
OperatorList=['+','-','!','~',
              '*','/','%',
              '+','-',
              '<<','>>',
              '>','>=','<','<=',
              '==','!=',
              '&',
              '^',
              '|',
              '&&',
              '||',
              '?',':',
              ',']
''' 31 + 8 * 9 '''
listToParse=[ ['@num','31'] , ['+',None] , ['@num','8'] , ['*',None] , ['@num','9'] ]

########### return value :
############# 0 parsed some expressions
############# 1 done nothing but no errors happened
################# + =: ^+A... | ...Op+A...
def module_1_0(lis,i):

    # left, i and right are all indexes
    left=i-1
    right=i+1

    # process: ^+A...
    if i==0 and len(lis)>=2:
        if lis[right][0][0]=='@':
            rightPtr=lis[right]
            del lis[0:2]
            lis.insert(0,["@expr","+",rightPtr])
            return 0
    # process: ...Op+A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[left][0] in OperatorList:
            if lis[right][0][0]=='@':
                rightPtr=lis[right]
                del lis[i:i+2]
                lis.insert(i,["@expr","+",rightPtr])
                return 0

    return 1

########### return value :
############# 0 parsed some expressions
############# 1 done nothing but no errors happened
################# - =: ^-A... | ...Op-A...
def module_1_1(lis,i):

    # left, i and right are all indexes
    left=i-1
    right=i+1

    # process: ^-A...
    if i==0 and len(lis)>=2:
        if lis[right][0][0]=='@':
            rightPtr=lis[right]
            del lis[0:2]
            lis.insert(0,["@expr","-",rightPtr])
            return 0
    # process: ...Op-A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[left][0] in OperatorList:
            if lis[right][0][0]=='@':
                rightPtr=lis[right]
                del lis[i:i+2]
                lis.insert(i,["@expr","-",rightPtr])
                return 0

    return 1

########### return value :
############# 0 parsed some expressions
############# 1 done nothing but no errors happened
################# ! =: ...!A...
def module_1_2(lis,i):

    # left, i and right are all indexes
    left=i-1
    right=i+1

    # process: ...!A...
    if len(lis)>=2 and right<len(lis):
        if lis[right][0][0]=='@':
            rightPtr=lis[right]
            del lis[i:i+2]
            lis.insert(i,["@expr","!",rightPtr])
            return 0

    return 1

########### return value :
############# 0 parsed some expressions
############# 1 done nothing but no errors happened
################# ~ =: ...~A...
def module_1_3(lis,i):

    # left, i and right are all indexes
    left=i-1
    right=i+1

    # process: ...~A...
    if len(lis)>=2 and right<len(lis):
        if lis[right][0][0]=='@':
            rightPtr=lis[right]
            del lis[i:i+2]
            lis.insert(i,["@expr","~",rightPtr])
            return 0

    return 1

########### return value :
############# 0 parsed some expressions
############# 1 done nothing but no errors happened
################# * =: ...A*A...
def module_2_0(lis,i):

    # left, i and right are all indexes
    left=i-1
    right=i+1

    # process: ...A*A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","*",leftPtr,rightPtr])
            return 0

    return 1

########### return value :
############# 0 parsed some expressions
############# 1 done nothing but no errors happened
################# / =: ...A/A...
def module_2_1(lis,i):

    # left, i and right are all indexes
    left=i-1
    right=i+1

    # process: ...A/A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","/",leftPtr,rightPtr])
            return 0

    return 1

########### return value :
############# 0 parsed some expressions
############# 1 done nothing but no errors happened
################# % =: ...A%A...
def module_2_2(lis,i):

    # left, i and right are all indexes
    left=i-1
    right=i+1

    # process: ...A%A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","%",leftPtr,rightPtr])
            return 0

    return 1

########### return value :
############# 0 parsed some expressions
############# 1 done nothing but no errors happened
################# + =: ...A+A...
def module_3_0(lis,i):

    # left, i and right are all indexes
    left=i-1
    right=i+1

    # process: ...A+A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","+",leftPtr,rightPtr])
            return 0

    return 1

########### return value :
############# 0 parsed some expressions
############# 1 done nothing but no errors happened
################# - =: ...A-A...
def module_3_1(lis,i):

    # left, i and right are all indexes
    left=i-1
    right=i+1

    # process: ...A-A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","-",leftPtr,rightPtr])
            return 0

    return 1

########### return value :
############# 0 parsed some expressions
############# 1 done nothing but no errors happened
################# << =: ...A<<A...
def module_4_0(lis,i):

    # left, i and right are all indexes
    left=i-1
    right=i+1

    # process: ...A<<A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","<<",leftPtr,rightPtr])
            return 0

    return 1

########### return value :
############# 0 parsed some expressions
############# 1 done nothing but no errors happened
################# >> =: ...A>>A...
def module_4_1(lis,i):

    # left, i and right are all indexes
    left=i-1
    right=i+1

    # process: ...A>>A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr",">>",leftPtr,rightPtr])
            return 0

    return 1

########### return value :
############# 0 parsed some expressions
############# 1 done nothing but no errors happened
################# > =: ...A>A...
def module_5_0(lis,i):

    # left, i and right are all indexes
    left=i-1
    right=i+1

    # process: ...A>A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr",">",leftPtr,rightPtr])
            return 0

    return 1

########### return value :
############# 0 parsed some expressions
############# 1 done nothing but no errors happened
################# >= =: ...A>=A...
def module_5_1(lis,i):

    # left, i and right are all indexes
    left=i-1
    right=i+1

    # process: ...A>=A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr",">=",leftPtr,rightPtr])
            return 0

    return 1

########### return value :
############# 0 parsed some expressions
############# 1 done nothing but no errors happened
################# < =: ...A<A...
def module_5_2(lis,i):

    # left, i and right are all indexes
    left=i-1
    right=i+1

    # process: ...A<A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","<",leftPtr,rightPtr])
            return 0

    return 1

########### return value :
############# 0 parsed some expressions
############# 1 done nothing but no errors happened
################# <= =: ...A<=A...
def module_5_3(lis,i):

    # left, i and right are all indexes
    left=i-1
    right=i+1

    # process: ...A<=A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","<=",leftPtr,rightPtr])
            return 0

    return 1

########### return value :
############# 0 parsed some expressions
############# 1 done nothing but no errors happened
################# == =: ...A==A...
def module_6_0(lis,i):

    # left, i and right are all indexes
    left=i-1
    right=i+1

    # process: ...A==A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","==",leftPtr,rightPtr])
            return 0

    return 1

########### return value :
############# 0 parsed some expressions
############# 1 done nothing but no errors happened
################# != =: ...A!=A...
def module_6_1(lis,i):

    # left, i and right are all indexes
    left=i-1
    right=i+1

    # process: ...A!=A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","!=",leftPtr,rightPtr])
            return 0

    return 1

########### return value :
############# 0 parsed some expressions
############# 1 done nothing but no errors happened
################# & =: ...A&A...
def module_7_0(lis,i):

    # left, i and right are all indexes
    left=i-1
    right=i+1

    # process: ...A&A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","&",leftPtr,rightPtr])
            return 0

    return 1

########### return value :
############# 0 parsed some expressions
############# 1 done nothing but no errors happened
################# ^ =: ...A^A...
def module_8_0(lis,i):

    # left, i and right are all indexes
    left=i-1
    right=i+1

    # process: ...A^A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","^",leftPtr,rightPtr])
            return 0

    return 1

########### return value :
############# 0 parsed some expressions
############# 1 done nothing but no errors happened
################# | =: ...A|A...
def module_9_0(lis,i):

    # left, i and right are all indexes
    left=i-1
    right=i+1

    # process: ...A|A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","|",leftPtr,rightPtr])
            return 0

    return 1

########### return value :
############# 0 parsed some expressions
############# 1 done nothing but no errors happened
################# && =: ...A&&A...
def module_10_0(lis,i):

    # left, i and right are all indexes
    left=i-1
    right=i+1

    # process: ...A&&A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","&&",leftPtr,rightPtr])
            return 0

    return 1

########### return value :
############# 0 parsed some expressions
############# 1 done nothing but no errors happened
################# || =: ...A||A...
def module_11_0(lis,i):

    # left, i and right are all indexes
    left=i-1
    right=i+1

    # process: ...A||A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","||",leftPtr,rightPtr])
            return 0

    return 1

########### return value :
############# 0 parsed some expressions
############# 1 done nothing but no errors happened
################# ?: =: ...A?A:A...
#################             ^
def module_12_0(lis,i):

    # first, leftOp, left and right are all indexes; i points at ':'
    first=i-3
    leftOp=i-2
    left=i-1
    right=i+1

    # process: ...A?A:A...
    #                ^
    if i>=3 and len(lis)>=5 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' and\
           lis[leftOp][0]=='?' and lis[first][0][0]=='@':
            firstPtr=lis[first]
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[first:first+5]
            lis.insert(first,["@expr","?:",firstPtr,leftPtr,rightPtr])
            return 0

    return 1

########### return value :
############# 0 parsed some expressions
############# 1 done nothing but no errors happened
################# , =: A,A,...A,A
def module_13_0(lis,i):

    # process: A,A,...A,A
    if len(lis)==1 and lis[0][0][0]!='@':
        return 1
    if len(lis)==1 and lis[0][0][0]=='@':
        return 0
    if (len(lis)%2)==1 :
        i=1
        if lis[0][0][0]!='@':
            return 1
        # every odd index must hold ',' and be followed by an operand
        while i<len(lis):
            if lis[i+1][0][0]=='@' and lis[i][0]==',':
                i=i+2
            else:
                return 1
        # collect the operands at the even indexes into one @expr_list node
        ls=[['@expr_list']]
        i=0
        while i<len(lis):
            ls[0].append(lis[i])
            i=i+2
        del lis[:]
        lis[:]=ls[:]
        return 0
    return 1
The code above is long, but it is actually the simplest part. Its size could be reduced considerably with a few techniques, but time was limited.
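One possible way to shrink the listing, given here only as a sketch and not as the approach taken in this article, is to generate the binary reducers from a single factory, since they differ only in the operator string. The names `make_binary_module`, `BINARY_LEVELS` and `binary_modules` are hypothetical.

```python
def make_binary_module(op):
    """Build a reducer for the pattern ...A op A... (hypothetical factory)."""
    def module(lis, i):
        left = i - 1
        right = i + 1
        if i >= 1 and right < len(lis):
            if lis[left][0][0] == '@' and lis[right][0][0] == '@':
                # replace the three tokens "A op A" with one @expr node
                lis[left:right + 1] = [['@expr', op, lis[left], lis[right]]]
                return 0   # parsed one sub-expression
        return 1           # nothing to do at this position
    return module

# precedence level -> binary operators on that level
BINARY_LEVELS = {
    2: ('*', '/', '%'),
    3: ('+', '-'),
    4: ('<<', '>>'),
    5: ('>', '>=', '<', '<='),
    6: ('==', '!='),
    7: ('&',),
    8: ('^',),
    9: ('|',),
    10: ('&&',),
    11: ('||',),
}

binary_modules = {
    pri: {op: make_binary_module(op) for op in ops}
    for pri, ops in BINARY_LEVELS.items()
}

lis = [['@num', '8'], ['*', None], ['@num', '9']]
print(binary_modules[2]['*'](lis, 1), lis)
# 0 [['@expr', '*', ['@num', '8'], ['@num', '9']]]
```

The generated functions keep the same `(lis, i)` signature and 0/1 return convention as the hand-written module_x_y functions, so they could be dropped into the same kind of dispatch table.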
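Below is the semantic analysis of unary operators, binary operators, the ternary operator and the comma separator. This is one of the core pieces of code in this article: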
######################################## global list
# construct a module dictionary
# module_dic_tuple[priority]['Operator'](lis,i)
module_dic_tuple=({}, { '+':module_1_0,'-':module_1_1,'!':module_1_2,'~':module_1_3 },
                  { '*':module_2_0,'/':module_2_1,'%':module_2_2 },
                  { '+':module_3_0,'-':module_3_1 },
                  { '<<':module_4_0,'>>':module_4_1 },
                  { '>':module_5_0,'>=':module_5_1,'<':module_5_2,'<=':module_5_3 },
                  { '==':module_6_0,'!=':module_6_1 },
                  { '&':module_7_0 },
                  { '^':module_8_0 },
                  { '|':module_9_0 },
                  { '&&':module_10_0 },
                  { '||':module_11_0 },
                  { '?:':module_12_0 },
                  { ',':module_13_0 } )

# one tuple of operators per precedence level; single-operator levels need
# the trailing comma, otherwise they would be plain strings
operator_priority_tuple=( () , ('+', '-', '!', '~') , ('*','/','%'),
                          ('+','-'),('<<','>>'),
                          ('>','>=','<','<='),('==','!='),
                          ('&',),('^',),('|',),('&&',),('||',),('?',':'),(',',) )

############################# parse:unary,binary,ternary,comma expr
########### return value :
############# 0 parsed successfully
############# 1 syntax error
def parse_simple_expr(lis):
    if len(lis)==0:
        return 1
    #if lis[len(lis)-1][0][0]!='@':
    #    return 1
    #if lis[0][0][0]!='@' and lis[0][0] not in ('+','-','!','~'):
    #    return 1
    for pri in range(1,12): # pri 1,2,3,4,5,6,7,8,9,10,11
        i=0
        while 1:
            if len(lis)==1 and lis[0][0][0]=='@':
                return 0
            if i>=len(lis):
                break
            if lis[i][0] in operator_priority_tuple[pri]:
                if module_dic_tuple[pri][lis[i][0]](lis,i)==0:
                    # something was reduced: rescan from the start
                    i=0
                    continue
                else:
                    i=i+1
                    continue
            else:
                i=i+1
    for pri in range(12,13): # pri 12 # parse ...A?A:A...
        i=0
        while 1:
            if len(lis)==1 and lis[0][0][0]=='@':
                return 0
            if i>=len(lis):
                break
            if lis[i][0]==':':
                if module_dic_tuple[pri]['?:'](lis,i)==0:
                    i=0
                    continue
                else:
                    i=i+1
                    continue
            else:
                i=i+1
    # pri 13: whatever is left must be a comma-separated list
    return module_dic_tuple[13][','](lis,0)
The code above uses a tuple of dictionaries of function references to reduce the amount of code in this part.
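One Python detail matters for a precedence table of this shape: parentheses alone do not create a tuple, so a level that holds a single operator needs a trailing comma. Without it the level is a plain string, and `in` performs a substring test instead of an element comparison. The snippet below only illustrates this behaviour; the variable names are made up for the example.

```python
level_as_string = ('&&')     # just the string '&&'
level_as_tuple = ('&&',)     # a one-element tuple

print(type(level_as_string).__name__)    # str
print(type(level_as_tuple).__name__)     # tuple

# A leftover '&' token would wrongly match the '&&' level when that level
# is a string, and a dictionary lookup for '&' on that level would then fail.
print('&' in level_as_string)            # True  (substring match)
print('&' in level_as_tuple)             # False (element comparison)
```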
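This part is not demonstrated here; the process is similar to the one described in the earlier article "A Simple Semantic Analysis Algorithm: The Single-Step Algorithm (Python Implementation)".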
With parse_simple_expr in place, the remaining analysis of function calls and parentheses becomes simpler. The process is as follows:
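The idea behind handling parentheses iteratively is that the first `)` in the token list always closes an innermost group, and the nearest `(` to its left opens it. Everything between the two contains no parentheses, so it can be handed to the simple-expression parser. The sketch below shows only the search step; the name `innermost_parens` is an assumption for this example and it is not the function used in the article.

```python
def innermost_parens(tokens):
    """Return (open_index, close_index) of an innermost pair, or None."""
    try:
        close = tokens.index([')', None])      # first closing parenthesis
    except ValueError:
        return None
    # walk left from it to the nearest opening parenthesis
    for open_ in range(close - 1, -1, -1):
        if tokens[open_] == ['(', None]:
            return open_, close
    return None

# f ( ( 1 ) )
tokens = [['@var', 'f'], ['(', None], ['(', None], ['@num', '1'],
          [')', None], [')', None]]
print(innermost_parens(tokens))
# (2, 4)
```

After the group is replaced by a single node, the same search is simply run again, which is how the nesting is consumed without recursion.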
The implementation:
########### return value :[intStatusCode,indexOf'(',indexOf')']
############# intStatusCode
############# 0 successfully
############# 1 no parenthesis matched
############# 2 list is null :(
def module_parenthesis_place(lis):
    length=len(lis)
    err=0
    x=0
    y=0
    if length==0:
        return [2,None,None]
    # x: index of the first ')'
    try:
        x=lis.index([")",None])
    except ValueError:
        err=1
    # search the reversed list to find the nearest '(' to the left of x
    lis.reverse()
    try:
        y=lis.index(["(",None],length-x-1)
    except ValueError:
        err=1
    lis.reverse()
    y=length-y-1
    if err==1:
        return [1,None,None]
    else:
        return [0,y,x]


############################# parse:unary binary ternary parenthesis function expr
########### return value :
############# 0 parsed successfully
############# 1 syntax error
############################# find first ')'
def parse_comp_expr(lis):
    while 1:
        if len(lis)==0:
            return 1
        if len(lis)==1:
            if lis[0][0][0]=='@':
                return 0
            else:
                return 1
        place=module_parenthesis_place(lis)
        if place[0]==0:
            # parse a copy of the tokens between the innermost '(' and ')'
            mirror=lis[(place[1]+1):place[2]]
            if parse_simple_expr(mirror)==0:
                if place[1]>=1 and lis[place[1]-1][0]=='@var':
                    # function call: a variable directly in front of '('
                    funcName=lis[place[1]-1][1]
                    del lis[place[1]-1:(place[2]+1)]
                    lis.insert(place[1]-1,["@func",funcName,mirror[0]])
                else:
                    del lis[place[1]:(place[2]+1)]
                    lis.insert(place[1],mirror[0])
            else:
                return 1
        else:
            return parse_simple_expr(lis)
That is the end of the code.
The experimental results are given below:
>>> ls=[['(',None],['@var','f'],['(',None],['@num','1'],[',',None],['@num','2'],[',',None],['@num','3'],[',',None],['!',None],['-',None],['@var','x'],['?',None],['@var','y'],[':',None],['~',None],['@var','z'],[')',None],['-',None],['@num','3'],[')',None],['/',None],['@num','4']]
>>> ls
[['(', None], ['@var', 'f'], ['(', None], ['@num', '1'], [',', None], ['@num', '2'], [',', None], ['@num', '3'], [',', None], ['!', None], ['-', None], ['@var', 'x'], ['?', None], ['@var', 'y'], [':', None], ['~', None], ['@var', 'z'], [')', None], ['-', None], ['@num', '3'], [')', None], ['/', None], ['@num', '4']]
>>> len(ls)
23
>>> parse_comp_expr(ls);ls
0
[['@expr', '/', ['@expr', '-', ['@func', 'f', ['@expr_list', ['@num', '1'], ['@num', '2'], ['@num', '3'], ['@expr', '?:', ['@expr', '!', ['@expr', '-', ['@var', 'x']]], ['@var', 'y'], ['@expr', '~', ['@var', 'z']]]]], ['@num', '3']], ['@num', '4']]]
>>> len(ls)
1
>>>
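A nested list like the one above is hard to check by eye. One way to verify it is to render the tree back into a fully parenthesized C expression and compare that with the input. The helper below is a sketch written for this purpose; the name `to_c` is hypothetical and it is not part of the article's source.

```python
def to_c(node):
    """Render a tree in the list format used here back into C source text."""
    kind = node[0]
    if kind in ('@num', '@var'):
        return node[1]
    if kind == '@expr_list':
        return ', '.join(to_c(child) for child in node[1:])
    if kind == '@func':
        return '%s(%s)' % (node[1], ', '.join(to_c(arg) for arg in node[2:]))
    if kind == '@expr':
        op, args = node[1], node[2:]
        if op == '?:':
            return '(%s ? %s : %s)' % tuple(to_c(a) for a in args)
        if len(args) == 1:                      # unary operator
            return '(%s%s)' % (op, to_c(args[0]))
        return '(%s %s %s)' % (to_c(args[0]), op, to_c(args[1]))
    raise ValueError('unknown node: %r' % (node,))

tree = ['@expr', '/',
        ['@expr', '-',
         ['@func', 'f',
          ['@expr_list', ['@num', '1'], ['@num', '2'], ['@num', '3'],
           ['@expr', '?:',
            ['@expr', '!', ['@expr', '-', ['@var', 'x']]],
            ['@var', 'y'],
            ['@expr', '~', ['@var', 'z']]]]],
         ['@num', '3']],
        ['@num', '4']]
print(to_c(tree))
# ((f(1, 2, 3, ((!(-x)) ? y : (~z))) - 3) / 4)
```

The printed text corresponds to the input expression (f(1,2,3,!-x?y:~z)-3)/4 with every grouping made explicit.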
629 i=0 630 while 1: 631 if len(lis)==1 and lis[0][0][0]=='@': 632 return 0 633 if i>=len(lis): 634 break 635 if lis[i][0]==':': 636 if module_dic_tuple[pri]['?:'](lis,i)==0: 637 i=0 638 continue 639 else: 640 i=i+1 641 continue 642 else: 643 i=i+1 644 return module_dic_tuple[13][','](lis,0) 645 return 1 646 647 ########### return value :[intStatusCode,indexOf'(',indexOf')'] 648 ############# intStatusCode 649 ############# 0 sucessfully 650 ############# 1 no parenthesis matched 651 ############# 2 list is null :( 652 def module_parenthesis_place(lis): 653 length=len(lis) 654 err=0 655 x=0 656 y=0 657 if length==0: 658 return [2,None,None] 659 try: 660 x=lis.index([")",None]) 661 except: 662 err=1 663 lis.reverse() 664 try: 665 y=lis.index(["(",None],length-x-1) 666 except: 667 err=1 668 lis.reverse() 669 y=length-y-1 670 if err==1: 671 return [1,None,None] 672 else: 673 return [0,y,x] 674 675 676 ############################# parse:unary binary ternary prenthesis function expr 677 ########### return value : 678 ############# 0 parsed sucessfully 679 ############# 1 syntax error 680 ############################# find first ')' 681 def parse_comp_expr(lis): 682 while 1: 683 if len(lis)==0: 684 return 1 685 if len(lis)==1: 686 if lis[0][0][0]=='@': 687 return 0 688 else: 689 return 1 690 place=module_parenthesis_place(lis) 691 if place[0]==0: 692 mirror=lis[(place[1]+1):place[2]] 693 if parse_simple_expr(mirror)==0: 694 if place[1]>=1 and lis[place[1]-1][0]=='@var': 695 '''func''' 696 funcName=lis[place[1]-1][1] 697 del lis[place[1]-1:(place[2]+1)] 698 lis.insert(place[1]-1,["@func",funcName,mirror[0]]) 699 else: 700 del lis[place[1]:(place[2]+1)] 701 lis.insert(place[1],mirror[0]) 702 else: 703 return 1 704 else: 705 return parse_simple_expr(lis) 706 return 1
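The binary modules above (module_6_0 through module_11_0) share one body and differ only in the operator string. As a minimal sketch, and not the code this post actually uses, they could be produced by a small factory. The names `make_binary_module`, `reduce_level` and `BINARY_LEVELS` are hypothetical and introduced only for illustration; the token and node layout is the one from the data-structure comment.

```python
def make_binary_module(op):
    """Return a reducer for the pattern ...A op A..., where i indexes op."""
    def module(lis, i):
        left, right = i - 1, i + 1
        if i >= 1 and right < len(lis):
            # Both neighbours must already be operands ('@var', '@num', '@expr', ...).
            if lis[left][0][0] == '@' and lis[right][0][0] == '@':
                lis[left:right + 1] = [["@expr", op, lis[left], lis[right]]]
                return 0   # reduced one sub-expression
        return 1           # nothing to do at this position
    return module

# One entry per precedence level, tightest-binding first (levels 6..11 above).
BINARY_LEVELS = (('==', '!='), ('&',), ('^',), ('|',), ('&&',), ('||',))

def reduce_level(lis, ops):
    """Scan left to right, restarting after every successful reduction."""
    modules = {op: make_binary_module(op) for op in ops}
    i = 0
    while i < len(lis):
        if lis[i][0] in modules and modules[lis[i][0]](lis, i) == 0:
            i = 0
        else:
            i += 1

# a == b && c | d
tokens = [['@var', 'a'], ['==', None], ['@var', 'b'], ['&&', None],
          ['@var', 'c'], ['|', None], ['@var', 'd']]
for ops in BINARY_LEVELS:
    reduce_level(tokens, ops)
print(tokens)
```

After the loop, `tokens` holds a single `&&` node whose children are the `==` and `|` sub-expressions. Restarting the scan after each reduction mirrors what parse_simple_expr does, and it makes operators of the same level group from left to right.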
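Because reading a tree's structure by eye becomes time-consuming as soon as the tree is even slightly complex, our next step will be to develop a rudimentary tool that displays the tree structures in the code graphically.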
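Until such a tool exists, a plain-text dump already makes nested results easier to read. The following is only a minimal sketch, not the graphical tool announced above. `dump_tree` is a hypothetical helper, and it assumes the node layout from the data-structure comment: `['@var', name]`, `['@num', text]`, `['@expr', op, child, ...]`, `['@func', name, child, ...]` and `['@expr_list', child, ...]`.

```python
def dump_tree(node, depth=0):
    """Return the lines of an indented text rendering of one syntax-tree node."""
    pad = '  ' * depth
    kind = node[0]
    if kind in ('@var', '@num'):
        # Leaves carry their text in slot 1.
        return [f"{pad}{kind[1:]} {node[1]}"]
    if kind == '@expr_list':
        label, children = 'list', node[1:]
    else:
        # '@expr' and '@func' keep the operator / function name in slot 1.
        label, children = f"{kind[1:]} {node[1]}", node[2:]
    lines = [pad + label]
    for child in children:
        lines.extend(dump_tree(child, depth + 1))
    return lines

# Tree for 31 + 8 * 9
tree = ['@expr', '+', ['@num', '31'],
        ['@expr', '*', ['@num', '8'], ['@num', '9']]]
print('\n'.join(dump_tree(tree)))
# expr +
#   num 31
#   expr *
#     num 8
#     num 9
```

Each level of nesting adds two spaces of indentation, so the depth of a sub-expression can be read directly from its column.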
If you have questions or suggestions, feel free to leave a comment :)