Introduction to Compilers

[binaryterms dot com] A compiler transforms a program from its source language into a target language without changing the program's meaning. ('Transform' is just a fancier, more mathematical word for 'translate'.) For example, from
for (int a = 1, b = 2; b >= 0; b = b - 1) a = a + 1
to
( 0, 0, 0, 0, 0, 0, 0, 1 ), # store a 1
( 0, 0, 1, 0, 0, 0, 1, 0 ), # store b 2
( 0, 1, 0, 0, 0, 0, 0, 1 ), # add a 1
( 0, 1, 1, 1, 1, 1, 1, 1 ), # add b -1
( 1, 0, 0, 0, 0, 0, 1, 0 ), # jlz 2
( 1, 1, 0, 1, 1, 1, 0, 1 ), # jmp -3
( 1, 1, 0, 0, 0, 0, 0, 0 ), # jmp 0
Those 0s and 1s are bits. Strung together, 00000001001000100100000101111111100000101101110111000000 is called machine code; machines are too stupid to speak a real language.
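To make the table above concrete, here is a minimal sketch that reproduces its seven rows. The encoding is an assumption inferred from the rows themselves, not something the source states: a 2-bit opcode, a 1-bit register (a = 0, b = 1), and a 5-bit two's-complement immediate. The names `OPCODES`, `REGISTERS`, and `encode` are hypothetical.

```python
# Hypothetical encoding inferred from the table above (not stated in the source):
# 2-bit opcode, 1-bit register (a=0, b=1), 5-bit two's-complement immediate.
OPCODES = {"store": 0b00, "add": 0b01, "jlz": 0b10, "jmp": 0b11}
REGISTERS = {"a": 0, "b": 1}


def encode(mnemonic, *operands):
    """Encode one instruction as an 8-character bit string."""
    opcode = OPCODES[mnemonic]
    if mnemonic in ("store", "add"):
        register, value = REGISTERS[operands[0]], operands[1]
    else:
        # Jumps carry only a relative offset; the register bit stays 0.
        register, value = 0, operands[0]
    immediate = value & 0b11111  # wrap negatives into 5-bit two's complement
    return f"{opcode:02b}{register:01b}{immediate:05b}"


PROGRAM = [
    ("store", "a", 1),
    ("store", "b", 2),
    ("add", "a", 1),
    ("add", "b", -1),
    ("jlz", 2),
    ("jmp", -3),
    ("jmp", 0),
]

# Concatenating the encoded rows yields the machine code as one bit string.
machine_code = "".join(encode(*instruction) for instruction in PROGRAM)
print(machine_code)
```

Each tuple in `PROGRAM` corresponds to one commented row of the table, and the printed string is simply the seven 8-bit rows joined end to end, 56 bits in total.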

A compiler is used to transform a program in a high-level language to a program in a low-level language. Java, C, C++, C#, Perl, and Python are high-level languages.

The compiler operates in phases. Each phase takes the representation produced by the previous one and moves the source program a step closer to the target program.

Lexical analysis is the first phase in the analysis of the source program. The source program is a stream of characters; the lexer takes it as input and outputs a sequence of tokens. Consider a small assignment statement that calculates simple interest on a principal amount.

Interest = Principal * Rate * Time

The assignment statement above can be broken into the following tokens: Identifier, Assignment_Symbol, Identifier, Operator, Identifier, Operator, Identifier.
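A lexer for this one statement can be sketched with regular expressions. This is only an illustration, assuming that identifiers are letters, digits, and underscores and that the operators are the four arithmetic symbols; the token names are taken from the list above, while `TOKEN_SPEC` and `tokenize` are hypothetical names.

```python
import re

# Token kinds use the names from the text; the patterns are assumptions.
TOKEN_SPEC = [
    ("Identifier", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("Assignment_Symbol", r"="),
    ("Operator", r"[+\-*/]"),
    ("Whitespace", r"\s+"),
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize(source):
    """Turn a stream of characters into a list of (kind, text) tokens."""
    tokens = []
    position = 0
    while position < len(source):
        match = TOKEN_RE.match(source, position)
        if match is None:
            raise SyntaxError(
                f"unexpected character {source[position]!r} at {position}"
            )
        if match.lastgroup != "Whitespace":  # whitespace only separates tokens
            tokens.append((match.lastgroup, match.group()))
        position = match.end()
    return tokens


tokens = tokenize("Interest = Principal * Rate * Time")
print([kind for kind, _ in tokens])
```

The lexer keeps both the kind and the text of each token, since later phases need to know which identifier was seen and not only that an identifier was seen.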

The syntax analysis phase is also referred to as hierarchical analysis or parsing. Just as we analyze the constituents of a sentence and study how its words are formed, the parser groups the tokens into grammatical phrases and finally into a parse tree.
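As a rough sketch of that grouping, the hand-written parser below builds a tree for the statement Interest = Principal * Rate * Time. The grammar is an assumption made for this example (statement -> Identifier Assignment_Symbol expression, expression -> Identifier (Operator Identifier)*), it groups operators left to right and ignores precedence, and `parse_statement` is a hypothetical name. The token list is written out by hand so the block runs on its own.

```python
def parse_statement(tokens):
    """Parse: statement -> Identifier Assignment_Symbol expression."""
    position = 0

    def expect(kind):
        # Consume the next token if it has the expected kind, else fail.
        nonlocal position
        if position >= len(tokens) or tokens[position][0] != kind:
            found = tokens[position] if position < len(tokens) else "end of input"
            raise SyntaxError(f"expected {kind}, found {found}")
        text = tokens[position][1]
        position += 1
        return text

    target = expect("Identifier")
    expect("Assignment_Symbol")

    # expression -> Identifier (Operator Identifier)*, grouped left to right.
    tree = expect("Identifier")
    while position < len(tokens):
        operator = expect("Operator")
        right = expect("Identifier")
        tree = (operator, tree, right)

    return ("=", target, tree)


# Tokens for the example statement, as (kind, text) pairs.
TOKENS = [
    ("Identifier", "Interest"), ("Assignment_Symbol", "="),
    ("Identifier", "Principal"), ("Operator", "*"),
    ("Identifier", "Rate"), ("Operator", "*"),
    ("Identifier", "Time"),
]

tree = parse_statement(TOKENS)
print(tree)
# ('=', 'Interest', ('*', ('*', 'Principal', 'Rate'), 'Time'))
```

Nested tuples stand in for tree nodes here: each node is (operator, left, right), so the innermost tuple is the phrase that gets grouped first.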

How do you write a parser? GNU Bison is a parser generator: it reads a specification of a context-free language and generates a parser in C, C++, or Java. You can find some examples by searching "bison calculator example". Bison is not a toy; the following projects are known to use it, or to have used it:

* MySQL and PostgreSQL
* CMake
* The Bash shell uses a yacc grammar for parsing command input.
* GCC started out using Bison, but switched to a hand-written recursive-descent parser for C++ in 2004 (version 3.4), and for C and Objective-C in 2006 (version 4.1)
* The Go programming language (GC) used Bison, but switched to a hand-written scanner and parser in version 1.5.
* Perl 5 uses a Bison-generated parser starting in 5.10.
* The PHP programming language (Zend Parser).
* Bison's own grammar parser is generated by Bison.

posted @ 2021-11-27 16:30  Fun_with_Words