PL0 language compiler analysis

zhaozj2021-02-08  226

The PL / 0 language is a subset of the PASCAL language. We analyze the compiler of PL / 0 herein includes analyzing the PL / 0 language source program, compiling the class PCODE code, and interprets running on the virtual machine The function of the class PCode code. The PL / 0 language compiler uses syntax analysis as the core, and the compilation method of scanning over again. The lexical analysis and code generation are used as a stand-alone subroutine for symbol analysis program. The function of errors and error recovery is provided while the gramatic analysis is provided. In the case where the source program has no error compilation, call class PCode interpreter explanation to perform the generated class PCode code.

Late Analysis Subprogrammatic Analysis: The Lord Analysis Subprogramme is getSym, the function is to read a word symbol (token) from the source program, put it in the global variable SYM, ID, and NUM, the syntax analyzer needs words. It is obtained directly from these three variables directly. (Note! Syntax Analyzer Removes the GetSym subroutine immediately to obtain a new word for a new word every time you use the value of the three variables. Instead of calling the getSym process when you need a new word.) The getSym process passes the getch by repeated call. The child process acquires characters from the source program and puts them into words. Buffer buffer technology is used during GetCH to improve program operation efficiency. The analysis process of the lexical analyzer: When calling getSym, it gets a character from the source program via the getch process. If this character is a letter, continue to get character or numbers, and finally you can form a word, check the stamp, if you find the reserved word, give the SYM variable into the corresponding word type value; if not found, Then this word should be a user-defined identifier (may be the name of the variable name, the constant name, or the process of the process), put the SYM as an Ident, and store this word into the ID variable. Check the two-point method to improve efficiency when checking the stamp. If the character obtained by getch is a number, continue to get the number with getch and put them into an integer, then set the SYM to Number and put the value of the numerical value into the NUM variable. If other legal symbols (such as: assignment number, larger than the number, less than or equal to the number, etc.), then the SYM is corresponding type. If you encounter a non-legal character, put the SYM into NUL.

Grammatical Analysis Subprogram Analysis: The grammatical analysis subroutine uses the auto-down recursive sub-program method, and the grammatical analysis is also based on the language of the program, and the mechanism for error processing is provided. The syntax analysis is mainly based on the sub-program analysis process, constdeclaration, variable definition analysis process, statement analysis process, expression processing (Expression), item processing (TERM), Factor Processing Process (Factor) and conditional processing process constitutes. These processes constitute a nesting hierarchy. In addition, there is an error reporting process (ERROR), code generation process (GEN), test word legitimacy and error recovery process (TEST), login name table process, query name table function (position) and column The secondary PCode code process (ListCode) is used to analyze the auxiliary process. It can be seen from the syntax graph of PL / 0: A complete PL / 0 program is composed of a sub-program and a sentence. Therefore, this compilation program is running, and the sub-program processing process block is called by the main program to analyze the divider section (the sub-program analysis process may also recursively call the block process), and then determine whether the last read symbol is period. If it is an error in the subsection and the sub-program analysis, it is a legal PL / 0 program that can run the generated code, otherwise the source PL / 0 program is illegal, output an error prompt. The operation mechanism of the PL / 0 compiler is analyzed according to each grammar unit.

Subprofer processing: After the syntax analysis begins, first call the Sub-Process Processing Process (Block) process division. Process entrance parameters are set to: 0 layers, symbol table position 0, error recovery word collection is the beginning of the sentence, the declaration, or statement start. After entering the Block process, first assign the partial data segment to 3, prepare 3 units to store the static chain SL, dynamic chain DL, and return address RA to allocate three units. Then use TX0 to record the current symbol table location and generate a JMP instruction, ready to jump to the start position of the main program, because the main program is currently started, so JMP's target is temporarily filled with 0, later Change again. At the same time, this JMP instruction is recorded in the code segment in the current location of the symbol table. After determining that the number of nested layers does not exceed the specified number of layers, the source program is started. First, it is judged whether or not a constant declaration is encountered, and if you encounter, the constant is stored in the symbol table. The same method is used to analyze the variable declaration, and the number of spatial data allocated under the DX variable is recorded in the variable definition. Then if you encounter Procedure reserved words, the process declaration and definition, the declaration method is to record the name and the hierarchy of the process into the symbol table, the process defined by the process is to call the Block process by recursive, because each process is a division program. Since this is the division in the division, it is necessary to pass the current level number LEV to the Block process when calling Block. After the division is completed, the process of entering the statement is completed. At this time, the code allocated pointer CX is the point of pointing the statement, which is the position that the previous JMP instruction requires jump to. Then, the jump position of this JMP instruction is changed to the current CX position via the address value recorded in front. And record the current code segment allocation address and the size of the local data segment to be assigned in the symbol table (value of DX). Generate an int instruction to assign DX space as the first instruction of this sub-block. The following is called the statement processing process statly analyzing statement. After the analysis is completed, the OPR command to generate an operand is 0, which is used to return from the sub-program (for the main program of the 0 layer, that is, the program is run, exit). Constant definition process: deposit, repeatedly obtain identifiers and corresponding values, and deposit into the symbol table. The name of the identifier and its corresponding value are recorded in the symbol table.

Variable definition process: Similar to constant definitions, by loop, repeatedly obtain identifiers, deposit into symbol tables. The name of the identifier is recorded in the symbol table, which is the offset address there in it in the layer.

Statement Processing Process: The statement processing is a nested subroutine that implements the analysis of the statement by calling expressions, item processing, factor processing, and the recursive calls. Statement Processing Processes can identify statements including assignment statements, READ statements, WRITE statements, call statements, if statements, and While statements. When you encounter a Begin / End statement, it is recursively call yourself. The corresponding class PCODE instruction is generated simultaneously.

Processing of assigning statements: First get the identifier left on the left, find it from the symbol table, and confirm that this identifier is indeed a variable name. Then by calling the expression processing process to calculate the value of the value of the value of the value, the corresponding instruction is generated to ensure that this value is placed at the top of the running period. Finally, the corresponding STO instruction is generated by the position information of the left variable before the previous, and the stack is stored in the space of the specified variable, and the assignment operation is implemented.

The processing of the READ statement: Determines the premise of the read statement syntax (otherwise error), generate the corresponding instruction: The first one is an OPR directive of the 16th operation, and implements a integer value from the standard input device, put it in the data stack top . The second is the STO instruction that stores the value of the top of the stack to the unit where the variable in the READ statement parentheses.

Write statement processing: Similar to the READ statement. Under the premise of the grammar, generate instructions: Analyze each expression in the Write statement parentheses by looping, generate the corresponding instruction to ensure that the value of the expression is calculated and put in the data stack and generates operation of the 14th operation. OPR instructions, the value of the output expression. Finally, the OPR instruction of the 15th operation is generated and outputs a wrap. Treatment of the Call statement: Find the identifier of the right of the Call statement from the symbol table to obtain the hierarchical and offset addresses. The corresponding CAL instruction is then generated. As for the protection site, the protection site required to call the sub-process is automatically completed by the class PCODE interpreter in explaining the execution of the CAL instruction.

Processing of the IF statement: Press the syntax of the IF statement, first call the logical expression processing process to process the condition of the IF statement, put the corresponding true and false value to the top of the data stack. Next, record the code segment allocation position (the position of the JPC instruction generated below), then generate the condition transfer JPC command (0 or fake transfer), and the transfer address is not known to be temporarily filled. Then call the statement processing procedure to process the statement or statement blocks behind the THEN statement. After the statement is processed, the location of the current code segment allocates the pointer should be the transfer location of the above JPC instruction. Change its jump position to the current code segment pointer position by recording the position of the JPC instruction in the previous record.

Process of begin / end statement: Analysis of process analysis by recursively calling statement by looping through each statement in the begin / End statement block, and generates the corresponding code.

Way of WHILE statement: Firstly, use CX1 variables to record the current code segment allocation position as the start position of the loop. The conditional expression in the WHILE statement is then processed to generate the corresponding code on the data stack top, and then record the current position with the CX2 variable, generate the condition transfer instruction, the transfer position is unknown, and 4. Analyze the statement or statement blocks after the DO statement by recursive call statement and generate the corresponding code. Finally, a unconditional jump command JMP is generated, and the jump to the CX1 finger, and change the jump position of the condition jump instruction referred to in CX2 to the current code segment allocation position.

Expression, Item, Factor Process: According to PL / 0 syntax, the expression should be started by the positive or unsigned, and is connected by several items to add teak. The item is connected by a number of factors to be connected by a ride number, and the factor may be an identifier or a number, or a sub-expression that is enclosed in parentheses. According to such a structure, the corresponding process is constructed, and the recursive call is completed, the expression is completed. Solve the item and factor independently solve the priority problem of the addition and decrease and the multiplier number. In the repeated calls of these processes, the value of the FSYS variable is always transmitted, ensuring that the error symbol can be skipped in the event of an error, so that the analytical process can be done.

Logical expression processing: First determine if it is a dollar logical expression: If so, calculate the value of the expression by calling the expression processing process, and then generates a judgment command. If not, it is definitely a binary logical operator. By calling the expression processing process, analyze the value of the arithmetic arithmetic, placed in two spaces of the stack, then generates corresponding logic accordingly, generates corresponding logic Judgment the instruction, put into the code segment.

Judging the word legality and error recovery process analysis: This process has three parameters, S1, S2 is two symbol set, and n is an error code. The function of this process is to test whether the current symbol (ie the value in the SYM variable) is in the S1 collection, if not, the error code N is output by calling the error reporting process, and abandon the current symbol, obtain a word through the lexical analysis process. Until this word appears in the S1 or S2 collection. This process is flexible in actual use, there are two main uses: When entering a syntax unit, call this procedure and check if the current symbol belongs to the start symbol collection of the symbol unit. If not, filter all symbols outside the start symbol and the successor symbol collection. At the end of the grammatical unit analysis, call this procedure, check if the current symbol belongs to the successor symbol set when calling the symbol unit. If not, filter all symbols outside of the successor symbol and start symbol collection. Through such a mechanism, it can be skipped in time when an error occurs in the source program, ensuring that the syntax analysis can continue. Other sub-processes called during syntax analysis are relatively simple, please refer to the comments of the source program.

Class PCode Code Interpretation Process Analysis This process simulates a stack computer that can run the PCode instruction. It has a stack data segment for storing running period data, which has a code segment for storing class PCode program code. At the same time, the data segment allocation pointer, command pointer, command register, partial segment pointer and other registers. When explaining the execution class PCode code, the data segment storage allocation is as follows: For each process of the source program, including the main program), when the call is called, three spaces are opened in the data segment, store the static chain SL, dynamic chain DL And return the address RA. The static chain records the base address of the latest data segment that defines the direct external process (or main program) of the process. The dynamic chain record calls the data segment base address of the process being run before the process. The return address records the breakpoint position of the program run when calling the process. For the main program, the values ​​of SL, DL, and RA are between 0. The function of the static chain is to reference its direct or indirect parent in one child (the parent of which is the nesting situation when defined, rather than the modem period). You can find a data segment base address that contains the variable to be referenced by a static chain, a data segment containing the variable to be referenced, and then access it by an offset address. When the process returns, the interpreter is recovered by the current segment base address by the current segment base address by returning the address to the address before the address recovery instruction pointer. Recovered the data segment by the current segment base address. Realize the return of the child process. For the main program, the interpreter will encounter the return address of 0, and it is considered that the program ends. Interpret the function of the base function during the program, is used to look forward to the partial data segment base address of the phase specified layer number along the static chain. This is often used in an instruction to access a local variable using STO, LOD, etc. The part of the class PCode code explained that the part performed by the loop and simple case judgment different instructions. When you encounter return instructions in the main program, the instruction pointer refers to 0 position, which uses such a condition as the condition of the end to the loop, and ensures that the program runs normally.

The following source program is based on the source code based on the "Compilation Principles" in Tsinghua University Press.

The program is compiled in Turbo Pascal 7.0.

*********************************************************** ****************************************

Program PL0 (FA, FA1, FA2); (* PL / 0 Compiler with Code Generation Interpretation Running Operations *)

(* PL / 0 compiler with code generation *)

Label 99; (* Declaring the wrong jump mark *) (* in Turbo Pascal 7.0 is not allowed to transfer, so the back GOTO statement is removed from me, so the label here is meaningless *)

Const (* constant definition *)

Norw = 13; (* of reserved words *) (* Number of reserved words *)

TXMAX = 100; (* length of identifier table *) (* identifier table length (capacity) *)

Nmax = 14; (* max number of digits in numbers *) (* Number allowed to maximum number *)

Al = 10; (* longth of identifiers *) (* identifier longest *)

AMAX = 2047; (* maximum address *) (* Addressing space *)

Levmax = 3; (* max depth of block Nesting *) (* maximum number of block nested layers *)

CXMAX = 200; (* size of code array *) (* class PCode target code array length (can accommodate code line number) *)

TYPE (* Type Definition *)

Symbol = (NUL, Ident, Number, Plus, Minus, Times, Slash, Oddsym,

EQL, NEQ, LSS, LEQ, GTR, GEQ, LPAREN, RPAREN, COMMA,

SEMICOLON, Period, Becomes, Beginsym, Endsym, IFSYM,

Thesym, WhileSym, WriteSym, Readsym, Dosym, Callsym,

Constsym, Varsym, Procsym; (* Symobl type identifies different types of vocabulary *)

Alfa = packed array [1..Al] of char; (* ALFA type for identifier *)

Object1 = (constant, variable, procedur); (* Object1 is the type of three identifiers *)

(* The original program uses Object as the type name, compiled in supporting object-oriented Turbo Pascal 7.0 *)

(* Wirth "the word" procedure "there, for won't work! *)

(* The above line is the annotation in the program list, saying that the original writer Wirth of this program uses Procedure this word as an identifier type, is not possible.

In fact, WIRTH was originally used here, it is possible. *)

Symset = set of symbol; (* Symset is a collection type of the Symbol type, which can be used to store a group of Symbol *)

FCT = (LIT, OPR, LOD, STO, CAL, INT, JMP, JPC); (* FCT type identifies the various instructions of class PCODE *)

INSTRUCTION = PACKED Record

f: fct; (* function code *)

L: 0..levmax; (* level *)

A: 0..Amax; (* Displacement addr *)

End; (* class PCode instruction type, contains three fields: command f, layer difference L, and another operand a *)

(*

LIT 0, A Load Constant A

OPR 0, A Execute Opr A

Lod L, A Load Variable L, Asto L, A Store Variable L, A

Cal L, A Call Procedure A at Level L

INT 0, A Increment T-Register By A

JMP 0, A jump to a

JPC 0, A Jump Conditional TO A

*)

VAR (* global variable definition *)

FA: text; (* text file FA is used to list the source program *)

FA1, FA2: TEXT; (* text file FA1 is used to list class PCODE code, FA2 is used to record the process of explaining the execution class PCODE code *)

Listswitch: boolean; (* True Set List Object Code *) (* If you set true, the program is compiled, and the class PCODE code will be listed.

Otherwise, not listed, Pcode code *)

CH: char; * Last Char READ *) (* Mainly used for the lexical analyzer, store characters that have been read from the file *)

SYM: SYMBOL; (* Last Symbol Read *) (* The use of the lexical analyzer output, store the Type of the TOKEN that recognizes the most recently *)

ID: ALFA; (* Last Identifier READ *) (* The use of the word method analyzer output, store the name of the identifier identifier "*)

Num: Integer; (* Last Number Read *) (* The use of the lexical analyzer output, store the value of the most recently identified numbers *)

Cc: integer; (* character count *) (* line buffer pointer *)

LL: Integer; (* line length *) (* line buffer length *)

KK: INTEGER; (* introduced this variable is for program performance consideration, see getSym process notes *)

CX: Integer; * Code Allocation Index *) (* Code Assignment Pointer, Code Generation Module always generates new code *)

Line: Array [1..81] of char; (* row buffer, used to read a line from the file, the use of words for the word "*)

A: ALFA; (* Words for temporary storage in analyzers *)

Code: array [0..cxmax] of instruction; (* The generated class PCode code table, store the compiled class PCode code *)

Word: array [1..norw] of Alfa; (* Reserved Word Form *)

Wsym: array [1..norw] of symbol; (* Symbol type corresponding to each reserved word in the word table *)

Ssym: array ['' .. '^'] of symbol; (* Some symbols corresponding to the Symbol type table *)

(* Wirth Uses "Array [char]" Here *)

MNEMonic: array [fct] of packed array [1..5] of char; (* class PCode instructions to help menu *)

DECLBEGSYS, STATBEGSYS, FACBEGSYS: SYMSET; (* Start, express start, start symbol collection *)

Table: Array [0..txmax] of record (* symbol table *)

Name: ALFA; (* Name of symbol *)

Case Kind: Object1 of (* Symbols *)

Constant: (* if it is a constant name *)

(VAL: integer); (* VAL in Val in the value of constant *)

Variable, procedur: (* If it is a variable name or process name *)

(LEVEL, ADR, SIZE: Integer (* Storage layer difference, offset address, and size *) (* "size" lacking in Orginal. I think it belons here *)

END;

FIN, Fout: Text; (* Fin text file is used to point to the input source file file, no * in the fout program *)

FNAME: STRING; (* File name of the PL / 0 source program file) *)

(* I modified code: The original program uses the ALFA type here, which cannot be passed in Turbo Pascal 7.0. The parameters of the READLN function cannot be ALFA type *)

Err: integer; (* Error total *)

(* Error handling process error *)

(* Parameters: N: Error code *)

Procedure Error (N: Integer);

Begin

Writeln ('****', '': CC-1, '!', n: 2); (* Displayed on the screen CC-1 position! With the error code prompt, due to CC

It is a buffer pointer, so it is the wrong position in the position *)

Writeln (FA1, '****', '': CC-1, '!', N: 2); (* Output in file CC-1 position! "with the error code prompt *)

Err: = Err 1 (* Error Total number plus one *)

End (* error *);

(* Lexical analysis process getsym *)

PROCEDURE GETSYM;

VAR

I, J, K: Integer;

(* Read Character Process in the original program getch *)

Procedure getch;

Begin

IF cc = ll tell (* If the row buffer pointer pointing to the last character of the row buffer is read from the file to the row buffer *)

Begin

If Eof (FIN) THEN (* If the end of the file *)

Begin

Write ('Program Incomplete'); (* error, exit program *)

Close (FA);

Close (FA1);

Close (FIN);

Halt (0);

{goto 99}

(* I modified code, because of the Turbo Pascal 7.0 does not allow cross-process GOTO, only the program can only be exited with the above method. *)

END;

LL: = 0; (* line buffer length 0 *)

Cc: = 0; (* line buffer pointer to the head *)

Write (CX: 4, ''); (* Output CX value, width is 4 *)

Write (FA1, CX: 4, ''); (* Output CX value, width is 4 to file *)

While Not Eoln (FIN) Do (* When not in the end of the line *)

Begin

LL: = LL 1; (* line buffer length plus one *)

Read (FIN, CH); (* reads a character from the file to ch *)

Write (CH); (* Output CH * on the screen)

Write (FA1, CH); (* Output CH output to file *)

LINE [LL]: = CH; (* Put the read character to the key position *)

END;

(* Visible, PL / 0 source program requires less than 81 characters per line *)

Writeln;

LL: = LL 1; (* line buffer length plus one, used to accommodate the ready-to-read carriage return Cr *)

READ (FIN, LINE [LL]); (* Putting # 13 (CR) reading row buffer tail *)

Read (FIN, CH); (* I added code. Since the text file in the PC is represented by # 13 # 10 (CR LF),

So it is necessary to read excess LF from the file, where the CH variable is due to the value of the CH variable is so will be changed, and this extra value is placed in the CH. *)

Writeln (FA1);

END;

CC: = CC 1; (* row buffer pointer plus one, point to the upcoming character *)

CH: = line [cc] (* reads the character, put into the global variable ch *)

End (* getch *);

Begin (* getsym *)

While (ch = '') or (ch = # 13) Do (* I modified code: This sentence is used to read an effective character

(Skip the extra spaces in the read characters), but actually want to jump

Excessive carriage return *)

GetCh;

If Chin ['a' .. 'z'] (* If the read character is a letter, it is a reserved word or identifier *)

Begin

K: = 0; (* Identifier buffer finger pin 0 *)

REPEAT (* This loop is used to read the character identifier *) in the source file.

IF K

Begin

K: = k 1;

a [k]: = CH;

END;

Getch (* read a character *)

Until NOT (CH I '.' 9 ']); (* until reading is not letters or numbers, so that the identifier constituent rules of PL / 0 are:

Start with letters, followed by several letters or numbers *)

IF K> = kk Then (* If the current identifier length is greater than or equal to KK *)

KK: = k (* Let KK is the current identifier length *)

Else

Repeat (* This loop is used to fill the identifier buffer space for space for space for the corresponding letter or space.

a [kk]: = '';

KK: = KK - 1

Until KK = K;

(* When the process is running, the value of KK is Al, that is, the maximum identifier length, if the identifier is less than KK,

The space with the rear of the A array does not have a space for spaces.

At this time, the value of KK becomes a number of A array front non-space character. When you run GetSym in the future, if the identifier of the read is greater than or equal to KK,

The value of KK becomes the length of the current identifier.

At this time, you don't have to fill the space behind, because it is definitely a space behind it. Vioence, if the recently read the identifier length is less than KK, then you need to go forward from the KK position.

Fill the space that exceeds the current identification length is filled with spaces.

Such a logic above is entirely considered for program performance. In fact, it can be fully simply simply puts the space in the A [K] element in the A array regardless of the three seven twenty-one full fs.

*)

(* Below starts, two-point findings see what the read identifier is not one of the reserved words *)

ID: = a; (* Last read identifier is equal to A *)

I: = 1; (* i pointing to the first reserved word *)

J: = norw; (* J point to the last reserved word *)

Repeat

K: = (i j) DIV 2; (* K pointing to a reserved word *)

If ID <= word [k] "(* If the current identifier is less than the reserved word referred to in K)

J: = K - 1; (* Mobile J Point *)

If ID> = word [k] "If the current identifier is greater than the reserved word referred to K *)

i: = k 1 (* Mobile I pointer *)

Until i> j; (* loop until the reserved word is found *) IF i - 1> j Then (* If I - 1> J indicates that the corresponding item is found in the reserved language, the id is reserved * )

SYM: = Wsym [K] (* Find the reserved word, set the SYM to the corresponding reserved word value *)

Else

SYM: = Ident (* does not find the reserved word, set the SYM as an identity, indicating that it is the identifier *)

END (* to this reading character is the end of the reserved word or identifier for letters *)

Else (* If the character is not letters *)

IF chin ['0' .. '9'] THEN (* If the character is digital *)

Begin (* Number *) (* Starts the number to process *)

K: = 0; (* number number *)

Num: = 0; (* number is 0 *)

SYM: = Number; (* Set SYM is Number, indicating that this time is digital *)

REPEAT (* This loop reads the character from the source file, constitutes a number *)

Num: = 10 * Num (ORD (CH) - ORD ('0')); (* Num * 10 plus the recently read character ASCII minus '0' ASCII get the corresponding value *)

K: = k 1; (* Digital digits plus one *)

Getch

Until NOT (CH IN ['0' .. '9']); (* until the characters read out are not numbers *)

IF K> Nmax Then (* If the components of the components are greater than the maximum number of digital bits *)

Error (30) (* issued 30 "wrong *)

End (* ends to the digital recognition processing *)

Else

IF ch = ':' Then (* If you read unsteres, not numbers but colon *)

Begin

GetCh; (* read a character *)

if ch = '=' THEN (* If you read is the equal sign, you can confer the value number with the colon) *)

Begin

Sym: = becomes; (* SYM type is set to assignment number Becomes *)

Getch (* read the next word *)

end

Else

SYM: = NUL; (* If you are not reading the equal sign, the separate colon is not what is *)

End (* Processing to the assignment number above *)

Else (* If you are not letting the letter, it is not a number, it is not a colon.

If ch = '<' Then (* If you read less than the number *)

Begin

GetCh; (* read a character *)

if ch = '=' THEN (* If you read the equal number *)

Begin

SYM: = Leq; (* purchased a less than or equal number *)

Getch (* read a character *)

end

Else (* If you are not followed by the number *)

SYM: = LSS (* That is a separate smaller *)

end

Else (* If it is not letters, it is not a number, not a colon, not less than the number *)

If ch = '>' THEN (* If you read more than the number, the processing process is similar to the processing less than the number *)

Begin

GetCh; (* read a character *)

if ch = '=' THEN (* If you read the equal number *)

Begin

SYM: = geq; (* purchased a larger than equal to the number *)

Getch (* read a character *)

end

Else (* If it is not the same after greater than the number *)

Sym: = gtr (* That is a separate greater than the number *)

end

Else (* If you are not letting the letter, it is not a number, no colon, no less than the number, no big than the number *)

Begin (* That explains it is not an identifier / reserved word, nor is the complex double-byte operator, should be a normal symbol *)

SYM: = SSYM [CH]; (* Isolated on the symbol table to find it, assigning Sym *)

Getch (* read a character *)

end

(* Judgment of the entire IF statement *)

End (* getsym *);

(* The word method analysis process getSym summary: read a number of effective characters from the source file, form a token string, identify the type of it

To reserve word / identifier / numbers or other symbols. If it is a reserved word, put the SYM into the corresponding reserved word type, if

The identifier is indicated by the SYM is indicated by the identifier. At the same time, it is stored in the ID variable to reserve the word string or the identification.

The name. If it is a number, the SYM is set to Number, and the value of the number is stored in the NUM variable. If it is other operator,

Then put the SYM directly into the corresponding type. After this process, the CH variable is stored in the next charm *)

(* Target code generation process gen *)

(* Parameters: x: Help of a line of code to generate *)

(* Y, z: Two operands of the code *)

(* This process is used to write the generated target code to the target code array, and the subsequent interpreter explanation is executed *)

PROCEDURE GEN (X: fct; y, z: integer);

Begin

IF CX> CXMAX THEN (* If CX> CXMAX indicates that the currently generated code line number is greater than the maximum number of lines allowed *)

Begin

Write ('Program Too Long'); (* Output "Up Too Long", exit *)

Close (FA);

Close (FA1);

Close (FIN);

Halt (0)

{goto 99}

(* I modified code, because of the Turbo Pascal 7.0 does not allow cross-process GOTO, only the program can only be exited with the above method. *)

END;

WITH code [cx] do (* write code to the current CX of the target code array "*)

Begin

f: = x;

L: = Y;

A: = z;

END;

CX: = CX 1 (* Mobile CX Pointer pointing down one vacancy *)

End (* gen *);

(* Test whether the current word is legal process TEST *)

(* Parameters: S1: When the syntax analysis enters or exits a synthetic unit, the current word meets the collection *)

(* S2: In an error, you can restore syntax to analyze the normally complement of the formation *)

(* N: Error information number, when the current symbol does not belong to the wrong information of the legal S1 collection *)

Procedure Test (S1, S2: Symset; N: Integer);

Begin

IF not (Sym in S1) THEN (* If the current symbol is not in S1 *)

Begin

Error (N); * Send N number error *)

S1: = S1 S2; (* add S2 collection to S1 collection *)

While Not (Sym in S1) Do (* Find the next legal symbol by loop to recover grammar analysis work *)

Getsym

end

End (* test *);

(* Syntax Analysis Process Block *)

(* Parameters: Lev: The level of this syntax analysis *)

(* TX: Symbol table pointer *)

(* fsys: Word Collection for Error Recovery *)

Procedure Block (Lev, TX: Integer; fsys: symset);

VAR

DX: Integer; * Data segment memory allocation pointer, pointing to the offset position of the next assigned space in the data segment *) Tx0: Integer; (* Initial Table Index *) (* recordbook Layer start symbol table location *)

CX0: Integer; (* Initial Code Index *) (* Record this layer starts with code segment allocation location *)

(* Landing symbol table process Enter *)

(* Parameters: k: To log in to the symbol type symbol type *)

Procedure Enter (k: Object1);

Begin (* Enter Object Into Table *)

TX: = TX 1; (* Symbol table pointer points to a new vacancy *)

WITH TABLE [TX] Do (* Start Login *)

Begin

Name: = ID; (* Name is the name of the symbol, for the identifier, here is the name of the identifier *)

Kind: = k; (* symbol type, may be constant, variable or process name *)

Case K is (* Different operations according to different types *)

Constant: (* if it is a constant name *)

Begin

If Num> Amax THEN (* in the case where the value of constants is greater than the maximum allowed value *)

Begin

Error (31); (* throws 31 errors *)

Num: = 0; (* The number actually logged in is replaced by 0)

END;

Val: = NUM ​​(* If the legal value, log in to the symbol table *)

END;

Variable: (* if it is a variable name *)

Begin

Level: = lev; (* note the level number it belongs *)

ADR: = Dx; (* note it in the offset in the current layer *)

DX: = DX 1; (* The offset is from one, ready for the next time *)

END;

Procedur: (* If you want to log in, the process name *)

Level: = Lev (* Location of this process *)

end

end

End (* enter *);

(* The login symbol process does not take into account the problem of repeated definitions. If you repeat the definition, it is based on the last definition. *)

(* Find the function position * of the location of the specified symbol in the symbol table

(* Parameters: ID: Symbols you want to find *)

(* Return value: The symbol you want to find in the symbol table, if you can't find it, return 0 *)

Function Position (ID: ALFA): Integer;

VAR

i: integer;

Begin (* Find Identifier In Table *)

Table [0] .name: = ID; (* put the ID into the symbol table 0 position *)

I: = TX; (* From the current location of the symbol table, the last symbol is starting to find *)

While Table [i] .name <> ID DO (* If the current symbol is not consistent with the inconsistency to find *)

i: = i - 1; (* find a front *)

Position: = i (* Return the location number found, if you don't find it, you must be 0 *)

End (* position *);

(* Constant declaration processing process constdeclaration *)

PROCEDURE CONSTDECLATION;

Begin

If SYM = Ident Then (* The first symbol encountered by the constant declaration process must be identifier *)

Begin

Getsym; (* Get the next token *)

IF SYM IN [EQL, Becomes] THEN (* if it is equal or assignment number *)

Begin

if Sym = Becomes Then (* If it is assignment number (constant should be the alignment) *) Error (1); (* throws 1 error *)

(* Here, there is actually the error correction, making the compilation continue, and the assignment number is used as an equal sign *)

Getsym; (* Get the next token, equivalent or assignment number, should be connected to the number *)

if Sym = Number Then (* If it is true *)

Begin

ENTER (constant); (* Log in to the symbol table *)

Getsym (* Get the next token, prepare for the back *)

end

Else

Error (2) (* If the equal number is not a number, throw the No. 2 error *)

end

Else

Error (3) (* If the constant identifier is not equal or assignment number, throw 3 error *)

end

Else

Error (4) (* If the first symbol encountered in the constant declaration is not an identifier, throw an error *)

End (* constdeclaration *);

(* Variable declaration process VArDeclaration *)

PROCEDURE VARDECLATION;

Begin

If SYM = Ident Then (* The first symbol encountered by the variable declaration process must be identifier *)

Begin

ENTER (Variable); (* Log in to the symbol table *)

Getsym (* Get the next token, prepare for the back *)

end

Else

Error (4) (* If the first symbol encountered in the variable declaration process is not an identifier, throw an error *)

END (* VARDECLATION *);

(* List the current layer of PCODE target code process listcode *)

Procedure Listcode;

VAR

i: integer;

Begin (* List code generated for this block)

If ListSwitch Then (* If the user selects is to list the code *)

Begin

For i: = cx0 to CX - 1 DO (* starts from the current layer code to the current code location - 1, ie the local sub-block *)

WITH code [i] do

Begin

Writeln (i: 4, mnemonic [f]: 5, l: 3, a: 5); (* shows the macrper and the L and A operands of the code code *)

(* I modified code: When the original program is output I, no designated 4 characters width, not beautiful, and the following sentence is not supported. *)

Writeln (Fa, I: 4, MNemonic [F]: 5, L: 3, A: 5) (* Print the screen to the file *)

END;

end

End (* listcode *);

(* Statement Process Statement *)

(* Parameter Description: fsys: If the error can be used to recover symbolic symbols for syntax analysis *)

Procedure Statement (fsys: symset);

VAR

I, CX1, CX2: Integer;

(* Expression Procedure Expression *)

(* Parameter Description: fsys: If the error can be used to recover symbolic symbols for syntax analysis *)

Procedure Expression (fsys: symset);

VAR

Addop: Symbol;

(* Item Process Term *)

(* Parameter Description: fsys: If the error can be used to recover symbolic symbols for syntax analysis *)

Procedure Term (fsys: symset);

VAR

Mulop: Symbol;

(* Factor processing process Factor *)

(* Parameter Description: fsys: If the error can be used to recover symbolic symbols for syntax analysis *)

Procedure Factor (fsys: symset); var

i: integer;

Begin

Test (Facborgsys, FSYS, 24); (* Before starting factor processing, check if the current token is in the Facborgsys collection. *)

(* If it is not a legal token, throw the No. 24 error, and recover the syntax processing through FSYS sets to continue *)

While Sym in Facbegsys Do (* Cycle Processing Factor *)

Begin

if Sym = Ident Then (* If you encounter the identifier *)

Begin

i: = position (ID); (* check the symbol table, find the current identifier in the symbol table *)

IF i = 0 THEN (* If the symbol table returns 0, the identifier *) is not found.

Error (11) (* throws 11 errors *)

Else

WITH TABLE [i] do (* If the location of the current identifier is found in the symbol table, start to generate the corresponding code *)

Case Kind of

Constant: Gen (LIT, 0, VAL); (* If the identifier corresponds to constant, the value is VAL, generates the LIT instruction, put the VAL to the top *)

Variable: Gen (LOD, LEV - Level, ADR); (* If the identifier is a variable name, generate the LOD instruction, *)

(* Placed the variable of the offset address of the layer from the current layer Level to the top of the stack *)

Procedur: Error (21) (* If the identifier encountered in the factor processing is the process name, error, throw 21 words *)

END;

Getsym (* Get the next token, continue loop processing *)

end

Else

IF SYM = Number the (* If you encounter numbers when the factor is processed)

Begin

if Num> Amax Then (* If the size exceeds allowing maximum AMAX *)

Begin

Error (31); (* throws 31 "error *)

Num: = 0 (* Point 0 value by 0) *)

END;

GEN (LIT, 0, NUM); (* Generate the LIT Direction, put this value field constant on the top *)

Getsym (* Get the next token *)

end

Else

if Sym = lparen kiln (* If you encounter the left bracket *)

Begin

Getsym; (* Get a token *)

Expression ([RPAREN] FSYS); (* Recursive call Expression subroutine analysis a child expression *)

IF SYM = rparen kiln (* After child expression analysis, you should encounter the right bracket *)

Getsym (* If you do to meet the right brackets, read the next token *)

Else

Error (22) (* Otherwise I will throw the 22nd error *)

END;

TEST (FSYS, FACBEGSYS, 23) (* A factor is handled, the token encountered should be in the fsys collection *)

(* If not, throw 23, and find the beginning of the next factor, grammar analysis can continue to run *)

end

End (* factory *);

Begin (* term *)

FACTOR ([Times, Slash] fsys; (* Every item should start by factor, so calling Factor subroutine analysis factor *)

While Sym in [Times, Slash] Do (* After a factor, you should encounter the number or division *)

Begin

Mulop: = SYM; (* Save the current operator *)

Getsym; (* Get the next token *)

Factor (fsys [Times, Slash]); (* should be a factor after the operator, so that Factor subroutine analysis factor *) if mulop = Times Then (* If you have encountered a number *)

Gen (OPR, 0, 4) (* Generate Multiplication Directive *)

Else

GEN (OPR, 0, 5) (* is not a multidistance must be divided, generating division instructions *)

end

End (* term *);

Begin (* Expression *)

IF SYM IN [Plus, Minus] THEN (* an expression may start from the plus sign or minus sign, indicating the positive and negative number *)

Begin

Addop: = SYM; (* Save the current number or negative, so as to generate the corresponding code *)

Getsym; (* Get a token *)

Term (fsys [plus, minus]); (* Negative negative is behind, adjust TERM subroutine analysis *)

if addop = minus kil (* If saved, the symbol is loaded *)

GEN (OPR, 0, 1) (* Generate a 1st Operation Directive: Refueling Operation *)

(* If it is not a negative, it is the positive, no need to generate the corresponding instruction *)

end

Else (* If it is not starting by the log, it should be an item *)

Term (fsys [plus, minus]); (* Call Term subroutine analysis *)

While Sym In [Plus, Minus] Do (* should be added after the addition or decline *)

Begin

Addop: = SYM; (* Save the operator *)

Getsym; (* Get the next token, then follow the addition to the operation, it should be one item *)

Term (fsys [plus, minus]); (* Tune Term subroutine analysis *)

if addop = plus the (* If the operator between items and items is plus sign *)

GEN (OPR, 0, 2) (* Generate 2 Operation Directive: Add *)

Else (* Otherwise it is subtraction *)

GEN (OPR, 0, 3) (* Generates 3 Operation Directive: subtraction *)

end

End (* expression *);

(* Condition processing process *)

(* Parameter Description: fsys: If the error can be used to recover symbolic symbols for syntax analysis *)

Procedure Condition (fsys: symset);

VAR

Relop: Symbol; (* is used to temporarily record token (here is a binary logical operator) content *)

Begin

if Sym = ODDSYM THEN (* if it is an ODD operator (1 yuan) *)

Begin

Getsym; (* Get the next token *)

Expression (fsys); (* Processing calculation of ODD expression *)

Gen (opr, 0, 6); (* Generate 6 Operation Directive: Parity Judgment *)

end

Else (* If it is not an ODD operator (then it is a binary logical operator) *)

Begin

Expression ([EQL, NEQ, LSS, LEQ, GTR, GEQ] FSYS); (* Proceed in the left of the expression *)

IF not (SYM IN [EQL, NEQ, LSS, LEQ, GTR, GEQ]) THEN (* If TOKEN is not a logical operator)

Error (20) (* throws 20 errors *)

Else

Begin

RELOP: = SYM; (* Record the current logical operator *)

Getsym; (* Get the next token *)

Expression (fsys); (* Processing the right part of the expression *)

Case Relop of (* If the operator just now is one of the following *) EQL: Gen (opr, 0, 8); (* Iso: Producing an instruction of 8 bodies *)

NEQ: Gen (opr, 0, 9); (* 不 号: Generates No. 9 judgment or other instructions *)

LSS: GEN (OPR, 0, 10); (* less than number: generates 10 judgment small instructions *)

Geq: Gen (opr, 0, 11); (* is greater than the equal number: generated 11 "not less than the instruction *)

GTR: GEN (OPR, 0, 12); (* bigger than number: generated 12 "greater than the instruction *)

Leq: Gen (opr, 0, 13); (* less than or equal to: generating 13 cases no more than the instruction *)

end

end

end

End (* condition *);

Begin (* Statement *)

if Sym = Ident Then (* So-called "Statement" may be assigning statement, starting with identifier *)

Begin

I: = Position (ID); (* check the identifier in the symbol table *)

IF i = 0 THEN (* If not found *)

Error (11) (* throws 11 errors *)

Else

If Table [i] .kind <> variable the (* If this identifier is found in the symbol table, the identifier is not a variable name *)

Begin

Error (12); (* throws 12 wrong *)

I: = 0 (* i set 0 as an error flag *)

END;

Getsym; (* get the next token, normal should be assigned *)

if Sym = Becomes Then (* If it is true to assign the value *)

Getsym (* Get the next token, normal should be an expression *)

Else

Error (13); (* If the left identifier symbol of the assignment statement is not assigned, throws 13 incorrect *)

Expression (fsys); (* Processing Expression *)

IF i <> 0 THEN (* If you have not erroneous, i will not be 0, i refers to the location of the current name left identifier in the symbol table *)

With table [i] do

Gen (STO, Lev - Level, ADR) (* Generate a line of expression value written to the designated memory STO target code *)

end

Else

if Sym = ReadSym Then (* If it is not a value of the statement, but encountered a read statement *)

Begin

Getsym; (* Get the next token, normal condition should be the left bracket *)

IF sym <> lparen kiln (* If the read statement is not the left bracket *)

Error (34) (* throws 34 errors *)

Else

Repeat (* loop gets the parameter table in the READ statement parentheses, which generates the corresponding "read" target code from the keyboard "*)

Getsym; (* Get a token, normal should be a variable name *)

if Sym = Ident Then (* if it is an identifier *)

(* There is a problem here, it should also be judged that this identifier is not a variable name, if it is a constant name or process name, error *)

i: = position (ID) (* check the symbol table, find it to give i, find if I will be 0 *)

Else

I: = 0; (* is not an identifier, there is a problem, i set 0 as an error mark *)

IF i = 0 THEN (* if there is error *)

Error (35) (* throws 35 error *)

ELSE (* Otherwise generates the corresponding target code *)

With table [i] do

Begin

GEN (OPR, 0, 16); (* Generate 16 Operation Directive: Read Number *) Gen (STO, LEV - Level, ADR) (* Generate a STO Directive, save the read value into the specified variable Space*)

END;

Getsym (* Get the next token, if it is a comma, read language is not finished, otherwise it should be the right bracket *)

Until Sym <> COMMA; (* constantly generating code until the variable in the parameter table of the READ statement is traversed, here is not a comma, should be the right bracket *)

IF sym <> rparen kiln (* If it is not our expected right bracket *)

Begin

Error (33); (* throws 33 error *)

While Not (Sym in fsys) Do (* Relying on the fsys set, find the next legal token, restore syntax analysis *)

Getsym

end

Else

Getsym (* If the read statement ends normally, get the next token, generally semicolon or end *)

end

Else

if Sym = WriteSym Then (* If you encounter WRITE statement *)

Begin

Getsym; (* Get the next token, should be the left bracket *)

IF sym = lparen the (* If you are right, left brackets *)

Begin

REPEAT (* gets each value in parentheses, perform output *)

Getsym; (* get a token, here should be an identifier *)

Expression ([RPAREN, COMMA] FSYS); (* Call the Expression process analysis expression, using the right bracket and comma *) for error recovery

GEN (OPR, 0, 14) (* Generates No. 14 Directive: Output to Screen *)

Until Sym <> Comma; (* is no longer comma that is cycled until it is, it should be the right bracket *)

IF sym <> rparen kiln (* if not the right bracket *)

Error (33) (* throws 33 error *)

Else

Getsym (* to get the next token under normal circumstances, ready for the back *)

END;

GEN (OPR, 0, 15) (* Generate a target code for an operation, function is to output a wrap *)

(* This can be seen that the WRITE statement in PL / 0 is similar to the WriteLn statement in Pascal, which is *) with output wrap.

end

Else

if Sym = Callsym Then (* If it is a Call statement *)

Begin

Getsym; (* Get token, should be a process name identifier *)

IF sym <> identity (* if the calil is not an identifier *)

Error (14) (* throws 14 wrong *)

Else

Begin

i: = position (ID); (* Find the corresponding identifier * from the symbol table *)

IF i = 0 THEN (* If not found *)

Error (11) (* throws 11 errors *)

Else

WITH TABLE [I] Do (* If the identifier is located in the first location of the symbol table *)

if Kind = Procedur Then (* If this identifier is a process name *)

Gen (CAL, Lev-Level, ADR) (* Generates CAL target code, call this process *)

Else

Error (15); (* If the Call is not the process name, throw the 15th error *)

Getsym (* Get the next token, prepare for the back *)

end

end

Else

IF SYM = ifsym the (* if it is an IF statement *) Begin

Getsym; (* Get a token should be a logical expression *)

Condition ([THENSYM, DOSYM] FSYS); (* Analytical calculation of logical expressions, error recovery set to the THEN and DO statement *)

If SYM = Tensym Then (* should encounter THEN statement after expression *)

Getsym (* get the token after the Toon, should be a statement *)

Else

Error (16); (* If there is no THEN after IF, throw a 16th error *)

CX1: = CX; (* note the current code allocation pointer position *)

GEN (JPC, 0, 0); (* Generate condition jump instructions, jump position temporarily fill in 0, then fill in *)

Statement (fsys); (* Analysis of the statement after the THEN *)

Code [cx1] .a: = CX (* The jump position of the last line instruction (CX1 referred to) should be the current CX indicated by the current CX *)

end

Else

If SYM = Beginsym Then (* If you encounter Begin *)

Begin

Getsym; (* Get the next token *)

Statement ([semicolon, endsym] fsys); (* Analysis processing on the statement between BeGin and END *)

While Sym in [SEMICOLON] Statbegsys do (* If the analysis is completed, the semicolon or statement starts loop analysis Next statement *)

Begin

if Sym = SEMICOLON THEN (* If the statement is a semicolon (may be empty statement) *)

Getsym (* Get Next Token Continue Analysis *)

Else

Error (10); (* If there is no semicolon between the statement and the statement, No. 10 wrong *)

Statement ([semicolon, endsym] fsys) (* Analyze a statement *)

END;

if Sym = endsym Then (* If the statement is completely analyzed, you should encounter END *)

Getsym (* is indeed end, read a token *)

Else

Error (17) (* If it is not an end, throw a 17th error *)

end

Else

if Sym = WhileSym Then (* If you encounter while statement *)

Begin

CX1: = CX; (* Move down the current code allocation location, this is the start position of the While loop *)

Getsym; (* Get the next token, should be a logical expression *)

Condition ([DOSYM] FSYS); (* Analyze this logical expression *)

CX2: = CX; (* Remember the current code allocation location, this is the start position of the statement in the While Do *)

GEN (JPC, 0, 0); (* Generates the condition jump instruction, the jump position is temporarily filled with 0 *)

IF SYM = dosym THEN (* should be a DO statement *)

Getsym (* Get the next token *)

Else

Error (18); (4 after IF, throwing 18 errors *)

Statement (fsys); (* Analyze the statement block after DO *)

GEN (JMP, 0, CX1); (* loop jump to the CX1 position, that is, logic judgment again *)

Code [cx2] .a: = CX (* change the jump position just fills 0 to the current position, complete the processing of the While statement *)

END;

Test (fsys, [], 19) (* to this statement processing, you will be able to meet the symbols in the fsys set, if you don't encounter, you will throw 19 wrong *) End (* statement *);

Begin (* block *)

DX: = 3; (* address indicator gives the relative position where each layer is partially allocated.

The reason why the initial value is 3 is: each layer has three spaces with three spaces for static chain SL, dynamic chain DL, and return address RA *)

Tx0: = TX; (* Initial symbol table pointer points to the start position of the current layer in the symbol table *)

Table [TX] .adr: = cx; (* Symbol table Current location The start position of the current layer code *)

GEN (JMP, 0, 0); (* produces a row of jump instructions, the jump position is temporarily unknown to fill 0 *)

if Lev> Levmax Then (* If the number of nesting layers in the current process is greater than the maximum allowable sleeve *)

Error (32); (* issued 32 error *)

Repeat (* Start all the declarations in the source program *)

if Sym = Constsym Then (* If the current token is a const reserved word, start constant declaration *)

Begin

Getsym; (* Get the next token, normal should be an identifier used as a constant name *)

Repeat (* repeated constant declaration *)

ConstdeClaration; (* declares that the current token is the constant of the identifier *)

While Sym = Comma Do (* If you have encountered a comma, repeatedly declare the next constant *)

Begin

Getsym; (* Get the next token, here should be identifier *)

ConstdeClaration (* declares with the current token as the constant of the identifier *)

END;

if Sym = SEMICOLON THEN (* If the constant declaration is end, you should encounter a semicolon *)

Getsym (* Get the next token, ready for the next round of loop *)

Else

Error (5) (* If the constant statement is not encountered after the semicolon is encountered, No. 5 error *)

Until Sym <> Ident (* If you encounter a non-identifier, the constant declaration ends *)

END;

(* The syntax of the constant declaration here is different from the EBNF paradigm on the class:

It can accept the following declaration method, and the following grammar is not available according to the EBNF paradigm on the class:

Const a = 3, b = 3; c = 6; D = 7, E = 8;

That is, it can accept constant declarations separated by a semicolon or comma, and only accept the comma-separated statement *) according to the EBNF paradigm *)

If SYM = VARSYM THEN (* If the current token is a VAR reserved word, starting a variable declaration, similar to the constant statement *)

Begin

Getsym; (* Get the next token, which is normal to be used as an identifier for the variable name *)

REPEAT (* repeated variable declaration *)

Vardeclaration; (* declared a variable as the identifier for the current token *)

While Sym = Comma Do (* If you encounter a comma, repeatedly declare the next variable *)

Begin

Getsym; (* Get the next token, here should be identifier *)

VArDeclaration; (* declares that the current token is the identifier of the identifier *)

END;

If SYM = SEMICOLON THEN (* If the variable declaration ends, the semicolon *)

Getsym (* Get the next token, ready for the next round of loop *)

Else

Error (5) (* If the variable declaration statement has not encountered a semicolil, No. 5 error *)

Until Sym <> Ident; (* If the non-identifier is encountered, the variable declaration ends *) (* Here is the same as the above constant declaration: conflict with the grammar specifications of PL / 0. *)

END;

While Sym = procsym do (* Cycle declarations each sub-process *)

Begin

Getsym; (* Get the next token, normal should be the identifier of the process name *)

if Sym = Ident Then (* If the token is confirmed as identifier *)

Begin

Enter; (* Log in to the name table *)

Getsym (* Get the next token, normal condition should be a semicolon *)

end

Else

Error (4); (* otherwise throwing 4 wrong *)

if Sym = SEMICOLON THEN (* If the current token is a semicolon *)

Getsym (* Get the next token, prepare recursive calls for grammatical analysis *)

Else

Error (5); (* otherwise throwing 5 wrong *)

Block (Lev 1, TX, [SEMICOLON] FSYS); (* recursive calling grammar analysis process, current level plus one, simultaneously passing the table index, legal word *)

If SYM = SEMICOLON THEN (* Remissive Return, the current token should be a semicolon after the last end of the recursive call.

Begin

Getsym; (* Get the next token *)

TEST (Statbegsys [Ident, Procsym], FSYS, 6); (* Check if the current token is legal, unlaveiled, use fsys recovery syntax analysis while throwing 6 wrong *)

end

Else

Error (5) (* If the symbol after the process declares is not a semicolon, throw 5 wrong *)

END;

TEST (Statbegsys [Ident], DECLBEGSYS, 7) (* Check if the current state is legal, unlaminated, use the declaration start symbol to make a fault recovery, throw 7 wrong *)

Until NOT (SYM in Declbegsys); (* until the declarative source program is completed, continue to execute, analyze the main program *)

Code [TABLE [TX0] .adr] .a: = CX; (* change the jump position of the previously generated jump statement to the current position *)

WITH TABLE [TX0] DO (* Record in the symbol table *)

Begin

ADR: = CX; (* address is current code allocation address *)

Size: = dx; (* length is the current data generation allocation location *)

END;

Cx0: = CX; (* note the current code allocation location *)

Gen (int, 0, dx); (* Generate allocation space instructions, assign DX space *)

Statement ([semicolon, endsym] fsys); (* Processes the currently encountered statement or statement *)

Gen (opr, 0, 0); (* Generates returns from subroutines *)

Test (fsys, [], 8); (* Check if the current state is legal, unlaminated, throw 8 wrong *)

Listcode (* lists the class PCODE code *)

End (* block *);

(* PL / 0 compiler generated class PCode target code interpretation running process interpret *)

PROCEDURE INTERPRET;

Const

Stacksize = 500; (* constant definition, imaginary stack computer has 500 stack unit *)

VAR

P, B, T: Integer; (* Program base topstack registers *)

(* P is the program command pointer, pointing to the code to run next *)

(* B is the base pointer, pointing to the partial variable data segment base address assigned to it during each process being called, * T is the top register, and the class PCode is in a fashionable stack calculation. Run, this variable records the current stack of this computer *)

i: instruction; (* I am in the I variables to save the current instruction *)

S: array [1..stacksize] of integer; (* Datastore *) (* s is a stack computer data memory area *)

(* Substrate Base *) by static link

(* Parameter Description: L: The layer of the required data area and the current layer *)

(* Return value: required data area base address *)

Function Base (L: Integer): Integer;

VAR

B1: Integer;

Begin

B1: = B; (* Find Base 1 Level Down *) (* begins with the current layer *)

While L> 0 do (* If l is greater than 0, loop through the data area base address required to find forward in the static chain *)

Begin

B1: = S [B1]; (* as the new current layer based on the content (exactly the static chain SL data, the base address of the previous layer) * )

l: = L - 1 (* upward, L minus one *)

END;

Base: = B1 (* Return the data area base address found by *)

End (* base *);

Begin

Writeln ('Start PL0'); (* PL / 0 program starts running *)

T: = 0; (* Program starts running time Stack Top Register 0 *)

B: = 1; (* Data segment base address is 1 *)

P: = 0; (* start executing from 0) *)

s [1]: = 0;

s [2]: = 0;

s [3]: = 0; (* is SL, DL, RA three units in data memory is 0, identified as main program *)

REPEAT (* Start running program target code *)

I: = code [p]; (* Get a row of target code *)

P: = P 1; (* instruction pointer plus one, pointing the next code *)

With i do

Case f of (* If I's F, that is, the instruction assistance is the following case, perform different functions *)

Lit: (* If it is a LIT Directive *)

Begin

T: = t 1; (* Stack top pointer is shifted, allocated in the stack *)

s [t]: = a (* The content of the unit stores the A operand of the I instruction, that is, put the constant value in the roof of the running stack *)

END;

OPR: (* if it is OPR instruction *)

Case A of (*) (* Different operations depending on the operand of the A operand *)

0: (* 0 is operational to return operation from the sub-process *)

Begin (* return *)

T: = B - 1; (* Release the data memory space occupied by this segment process *)

P: = S [T 3]; (* Take the instruction pointer to the value of the RA, point to the return address *)

B: = S [T 2] (* takes the data segment base address to the value of DL, pointing to the data segment base address of the priorction sub-process *)

END;

1: (* No. 1 operation is a stack top data to take anti-operation *)

s [t]: = -s [t]; (* Take the stack top data)

2: (* 2 operation is the top two data plus operation *)

Begin

T: = T - 1; (* Stack top pointer down *)

s [t]: = s [t] s [t 1] (* plus two unit data into the top *)

END;

3: (* No. 3 operation is the top two data subtraction operations *) Begin

T: = T - 1; (* Stack top pointer down *)

s [t]: = s [t] - s [t 1] (* Substall the two unit data into the top *)

END;

4: (* No. 4 operation is the top two data multiplication operations *)

Begin

T: = T - 1; (* Stack top pointer down *)

s [t]: = s [t] * s [t 1] (* multiply the two unit data in the top *)

END;

5: (* 5 operation is the top two data division operations *)

Begin

T: = T - 1; (* Stack top pointer down *)

s [t]: = s [t] DIV S [T 1] (* Adjust the two unit data in the top *)

END;

6: (* 6 operation is a judgment operation *)

s [t]: = ORD (ODD (S [t])); (* The value of the data stack is odd, set the stack of tops 1, otherwise 0 *)

8: (* No. 8 operation is the top two data judgment of the stack *)

Begin

T: = T - 1; (* Stack top pointer down *)

s [t]: = ORD (s [t] = s [t 1]) (* Judgment, equal stack is set, no waiting 0 *)

END;

9: (* No. 9 operation is not the operation of the top two data in the stack *)

Begin

T: = T - 1; (* Stack top pointer down *)

s [t]: = ORD (S [t] <> s [t 1]) (* If you don't wait, you don't wait for the top 1, equally 0 *)

END;

10: (* No. 10 operation is smaller than the operation *)

Begin

T: = T - 1; (* Stack top pointer down *)

s [t]: = ORD (S [T]

END;

11: (* No. 11 operation is the top two data of the stack is greater than or equal to operation *)

Begin

T: = T - 1; (* Stack top pointer down *)

s [t]: = ORD (S [T]> = S [T 1]) (* Judiece is greater than or equal, if the following value is greater than or equal to the above value, the stack is set, otherwise 0 *)

END;

12: (* 12) is more than the operation of the top two data for the stack *)

Begin

T: = T - 1; (* Stack top pointer down *)

s [t]: = ORD (S [T]> S [T 1]) (* Judging greater than, if the following value is greater than the above value, the stack is set 1, otherwise 0 *)

END;

13: (* No. 13 operation for the top two data of the stack is less than or equal to operation *)

Begin

T: = T - 1; (* Stack top pointer down *)

S [T]: = ORD (S [T] <= S [T 1]) (* is less than or equal, if the lower value below is equal to the value equal to the above, the stack is set, otherwise 0 *)

END;

14: (* 14 operation is output stack of top value *)

Begin

Write (s [t]); (* Output Stack Total *)

Write (FA2, S [T]); (* prints to the file *)

T: = T - 1 (* Stack Top Down *)

END;

15: (* No. 15 operation is the output wrap operation *)

Begin

Writeln; (* output wrap *)

Writeln (* * simultaneously outputs to file *)

END;

16: (* 16 operation is to accept keyboard value input to the top *)

Begin

T: = T 1; (* Stack top shift, allocation space *)

Write ('?'); (* screen display question mark *) WRITE (FA2, '?'); (* At the same time output to file *)

Readln (s [t]); (* Get input *)

Writeln (FA2, S [T]); (* Print the user input value to the file *)

END;

END; (* OPR instruction analysis run *)

LOD: (* If it is the LOD instruction: put the variable on the top *)

Begin

T: = T 1; (* Stack top shift, open space *)

s [t]: = s [Base (L) a] (* Find data by data area layer L and offset address a, store the new space opened above (ie on top) *)

END;

STO: (* if it is a STO directive *)

Begin

S [Base (L) A]: = S [T]; (* Put the value of the value of the stack to the position of the data area layer difference L offset address A. *)

T: = T - 1 (Shown down, release space *)

END;

CAL: (* if it is CAL instruction *)

Begin (* generat new block mark *)

s [t 1]: = base (L); (* is pressing the static chain SL * in the top of the stack

S [T 2]: = B; (* then press the current data area base address as dynamic chain DL *)

s [t 3]: = P; (* finally pressed into the current breakpoint, as return address RA *)

(* The above work is the protection site before the process call *)

B: = T 1; (* Point the current data area base address to the location of SL *)

P: = a; (* Continue the execution instruction from the position indicated by A, that is, the jump * of the program is implemented *)

END;

INT: (* if it is int instruction *)

T: = T A; (* Top A space is shifted on top, that is, open up A new memory unit *)

JMP: (* if it is JMP instruction *)

p: = a; (* as the value of the JMP instruction operand as the next time the instruction address to be executed, realize unconditional jump *)

JPC: (* if it is JPC instruction *)

Begin

IF S [T] = 0 THEN (* Judgment Stack Total *)

P: = a; (* If you are 0, you jump, otherwise you don't jump *)

T: = T - 1 (* Release Stack Top Space *)

END;

End (* with, casage *)

Until P = 0; (* If P is equal to 0, it means that the command returns from the subroutine when the main program is run, that is, the end of the entire program running *)

Close (FA2) (* Close the FA2 file used to record the input and output of the screen *)

(* The interpretation of the PCode code ends *)

End (* interpret *);

Begin (* main *)

For ch: = '' to '!' do (* This loop puts all the SSYM arrays all NUL *)

SSYM [CH]: = NUL;

(* Changed Because Of Different Character Set

Note The Typos Below in The Original Where

The Alfas Were Not Given The Correct Space *)

(* The reserved word table below is initialized, the word length is less than 10 characters, and the unnecessary position is filled with spaces. It is easy to find the reserved word when the lexical analysis is available.

Word [1]: = 'begin';

Word [2]: = 'Call';

Word [3]: = 'Const';

Word [4]: ​​= 'DO'; Word [5]: = 'end';

Word [6]: = 'IF';

Word [7]: = 'odd';

Word [8]: = 'procedure';

Word [9]: = 'read';

Word [10]: = 'TEN';

Word [11]: = 'Var';

Word [12]: = 'while';

Word [13]: = 'WRITE';

(* Reserved word symbol list, find the reserved word in the corresponding position in this table after the reserved word is found in the above reserved language.

Wsym [1]: = beginsym;

Wsym [2]: = Callsym;

Wsym [3]: = Constsym;

Wsym [4]: ​​= dosym;

Wsym [5]: = endsym;

Wsym [6]: = IFSYM;

Wsym [7]: = ODDSYM;

Wsym [8]: = procsym;

Wsym [9]: = readsym;

Wsym [10]: = THENSYM;

Wsym [11]: = VARSYM;

Wsym [12]: = WhileSym;

Wsym [13]: = WriteSYM;

(* Initialize the symbol table, assign the corresponding type of symbol, and the remaining symbols are NUL * due to the types assigned to the beginning of the cycle.

SSYM [' ']: = Plus;

SSYM ['-']: = minus;

SSYM ['*']: = TIMES;

SSYM ['/']: = Slash;

SSYM [']: = lparen;

SSYM [')']: = rparen;

SSYM ['=']: = EQL;

SSYM [',']: = COMMA;

SSYM ['.']: = period;

SSYM ['#']: = neq;

SSYM [';']: = SEMICOLON;

(* Initialization class PCODE Help Comet, this table mainly supplies the output of PCODE code *)

MNEMONIC [LIT]: = 'lit';

MNEMONIC [OPR]: = 'opr';

MNEMONIC [LOD]: = 'LOD';

MNEMONIC [STO]: = 'STO';

MNEMONIC [CAL]: = 'CAL';

MNEMONIC [INT]: = 'int';

MNEMONIC [JMP]: = 'JMP';

MNEMONIC [JPC]: = 'JPC';

(* I have modified code: The book is the form of 'xxx', that is, two spaces after the assistance, through the Internet query the original program is confirmed as each space before and after the speaker. *)

(* The purpose of this is to make the result of the latter output results *)

Declbegsys: = [Constsym, Varsym, Procsym];

Statbegsys: = [Beginsym, Callsym, IFSYM, WhileSym]; Facbegsys: = [Ident, Number, LParen];

(* Page (Output) *)

(* Since the text file processing method of Turbo Pascal 7.0 is very different from the method used in the source program, the following related file processing has made a lot of change. *)

Assign (FA1, 'FA1.TXT'); (* Associate the text file FA1 with the fa1.txt file, used to output the generated class PCode code *)

ReWrite (FA1); (* established and open Fa1.txt file *)

Write ('INPUT file?'); (* prompts to enter the PL / 0 source name *)

Write (FA1, 'INPUT file?'); (* the same prompt output to fa1.txt file *)

Readln (FNAME); (* Get file name entered by the keyboard *)

Writeln (FA1, FNAME); (* Print the keyboard into the fa1.txt file *)

{OpenF (FIN, FNAME, 'R');

Assign (FIN, FNAME); (* Pl / 0 source program file with fin *)

RESET (FIN); (* Open the PL / 0 source file file associated with the FIN)

Write ('List Object Code?'); (* Tip Do you want to list class PCode code *)

Readln (FNAME); (* Get User Enter *)

Write (FA1, 'LIST Object Code?'); (* The same prompt is written in the fa1.txt file *)

Listswitch: = (fname [1] = 'y'); (* If you enter the string of 'Y' starting, set the listSwitch flag, otherwise false *)

Err: = 0; (* Error number 0 *)

Cc: = 0; (* lexical analysis row buffer pointer 0 *)

CX: = 0; (* class PCode code table finger pin 0 *)

LL: = 0; (* lexical analysis row buffer length 0 *)

CH: = ''; (* lexical analysis Current character is space *)

KK: = Al; (* Set the value of KK is the longest length of the allowable identifier, specifically used the view GetSyM process notes *)

Assign (Fa, 'Fa.txt'); (* Associate FA.TXT with FA. FA is used for output source *)

ReWrite (FA); (* established and open Fa.txt *)

Getsym; (* First call the lexical analysis subroutine, get the first word of the source program (token) *)

Block (0, 0, [PERIOD] DECLBEGSYS STATBEGSYS); (* Syntax analysis of the main program (that is, the first division) *)

(* The layer where the main program is located, the symbol table is temporarily empty, the symbolic table pointer finger 0 position *)

Close (FA); * Close file *)

Close (FA1); (* Close file *)

If Sym <> period the (* Main program analysis ends, you should encounter the end of the program ending "*)

Error (9); (* If not the number, No. 9 error *)

(* It is understood that a legal PL / 0 source program should be composed of division procedures and sentences. *)

if Err = 0 THEN (* If the number of error is 0, you can start explaining the code of the execution compilation generation *)

Begin

Assign (FA2, 'FA2.TXT'); (* Associate the text file FA2 with the fa2.txt file, used to output the class PCode code running result *) REWRITE (FA2); (* established and open FA2 file *)

Interpret (* Start Interpretation of Pcode Code *)

end

Else

Write ('ErrorS in PL / 0 Program "; (* If there is an error, the prompt has an error *)

99: (* This label is originally used to exit the program, because Turbo Pascal does not support cross-process jumps, this is not used here. *)

{Closef (FIN);

Close (FIN); (* Close source file *)

Writeln

End.

*********************************************************** *********************

转载请注明原文地址:https://www.9cbs.com/read-1889.html

New Post(0)