Said compiler from Lex & Yacc (3. Parameter Work)
Author: tangl_99
QQ: 8664220
MSN: TANGL_99@hotmail.com
Email: tangl_99@sohu.com
From this section, we have to enter the topic of the compiler. Have to say, the previous lexical scanner is only a small consisting of the entire compiler, and the language constructor described in these two is true Play an important role in our compilation work. These things believe that the course in the university's compilation has been studied, then this article is just roughly roughly, let everyone recall the knowledge of university, important YACC usage skills Wait, I will tell it later.
Example 3.1
Exp -> EXP OP EXP | (Exp) | Number
OP -> | - | *
Here is a well-defined simple integer arithmetic expression with addition, subtraction, multiplication. The bold indicates the end symbol, which is the symbol of the generated generated. And the exp, op is the non-final, they It is generated by a "->" symbol.
For example, 100 222 * 123123 - (888 11) is a specific expression that meets the above grammar.
Note that in the designation definition, it is possible to recatmends. So the EXP generated in the flash generated on the right side of the EXP generated.
Here | and regular expression, the meaning of the choice, that is, Exp may be exp or (exp) or Number.
Let us take a look at << Compilation Principles and Practices >> The book on BNF literacy.
For example, we have a mathematical expression (34-3) * 42, then let's take a look at how the above EXP grammar is derived to identify it.
(1) Exp => Exp [Exp -> EXP OP EXP]
(2) => exp p Number [exp -> number]
(3) => exp * number [op -> *]
(4) => (exp) * Number [exp -> (exp)]
(5) => (Exp) * Number [exp -> exp]
(6) => (Exp) * Number [exp -> number]
(7) => (exp - number) * Number [op -> -]
(8) => (NUMBER-NUMBER) * NUMBER [EXP -> Number]
In the end, all the non-final symbols in EXP turned all the end symbols. So the derivation is completed.
This derivation is very similar to the proposition reasoning that we speak in discrete mathematics. In fact, the mathematical foundation of the derived form of language is the proposition reasoning of our discrete mathematics.
During the derivation process, it is actually to launch the recursion in the original literary law. Then we are deriving the process, it is easy to achieve the generation of the tree. And analyzing the tree is a very important source of information in our compiler. We are The front and lexical analysis, and the goal of speaking syntax is to generate the analysis tree. With it, our compiler will become easier during the rear code generation process.
Please see:
Example 3.2
Also, examples of << Compilation Principles and Practices >>
Set E-> E A | A representation of the grammatism G, then consider the expression of it generated L (g)
If we are defined by the standard mathematical, we represent a grammap G with formula L (g) = {s | Exp => * s}.
S Represents any array of strings of the marker symbol, which is our end symbol. And Explicit end symbols, => * indicates a series of derived processes from non-ending to end symbols. Here * is a bit like we talk about the regular expression The * symbol in the formula, which represents 0 to unlimited repetition. So => * is a derived process of 0 times to unlimited times.
L (g) = {a, a a, a a a, a a a a, ...} E => E a => E A A => E A A A
At the same time, in our compilation class, another mathematical expression is often explained to explain the definition of cultural law.
G = (t, n, p, s)
Note that the T, N, P, and S here is a collection.
T Represents terminal symbol (TERMINAL), that is, {a, }
N represents a nonTerminal, that is, {E}, but n cannot be intersected with T.
P Represents a collection of production or grammar rules, here it has only one element: e-> E A
S represents the start symbol of the set N. About S, I also don't know its use, so I am sorry!
Example 3.3
This is what IF ELSE is often used in our C program language.
Statement -> IF-STMT | Other
IF-Stmt -> IF (Exp) Statement | IF (Exp) Statement Else Statement
EXP -> 0 | 1
Statement is the use of statements in our C language. It includes two possibilities. First, the IF-STMT statement, the second is other. Then we also define the generating style of the IF-STMT statement. There are two situations here, one There is no else, and there is an else. We use recursive. IF-stmt is originally included in Statement, but we use Statement in the IF-STMT. It is because of the intensive recursive Therefore, it has a broader representation of the regular expressions compared to our previously speaking, but at the same time, the derived identification of literacy is more complicated. According to the principle of compilation, it will focus on the literacy method. Division algorithm. A total of two, one is the LL algorithm, the top downward algorithm, the second is the LR algorithm, the algorithm from the bottom. The algorithm is relatively simple, there is a special case, that is, our next section The algorithm of recursive decline. Since the function in the C language can be recursive, the algorithm for achieving the recursive decline in this is very simple, and for our general programming language, although its algorithm is very weak. But it is already enough. And the algorithm for LR is a big problem. Its algorithm is the strongest, but it is very difficult to achieve, but also a scientist provides us with Yacc (or Bison) this Tools can come to automatically generate LR's grammar derivation algorithms. This is the YACC tool we have been mentioned.
Tall back, let's take a look at the procedure below
IF (0) Other Else Other
Analysis tree
Thinking: Why do you want to analyze the grammar into a tree structure?
Because the grammar itself is recursive, the best data structure represented is tree, so after we make a tree structure, we can use it on the problem of processing code generation, or it can be easily completed.
Example 3.4
Here I gave Microsoft's statement for the C language in the MSDN.
Note, here: symbol replaces our front yield ->
STATEMENT:
labeled-statementcompound-statementexpression-statementselection-statementiteration-statementjump-statementtry-except-statement / * Microsoft Specific * / try-finally-statement / * Microsoft Specific * /
JUMP-statement:
Goto Identifier; Continue; Break; Return Expression Opt;
Compound-Statement:
{Declaration-List Opt Statement-List Opt}
Declaration-list:
DeclarationDeclaration-List Declaration
Statement-List:
StatementStatement-List StatementExpression-Statement:
Expression Opt;
Iteration-Statement:
While (Expression) StatementDo Statement While (Expression); for (Expression Opt; Expression Opt; Expression Opt) Statement
Selection-Statement:
IF (Expression) Statementif (Expression) Statement Else StatementsWitch (Expression) Statement
Labeled-statement:
Identifier: StatementCase Constant-Expression: StatementDefault: Statement
Try-Except-statement: / * Microsoft Specific * /
__TRY Compound-Statement__except (Expression) Compound-Statement
Try-finally-statement: / * Microsoft Specific * /
__TRY Compound-Statement__finally compound-statement
2003-10-11
Chengdu, Sichuan University