My Compiler: The C3 Compiler

xiaoxiao2021-03-05  33

C3 compiler

2.1 Structure

Now that the language has been defined, the rest is straightforward. The next step is to implement the C3 compiler by hand; afterwards we will use our automated analyzer to handle other languages so that we can see the difference.

Since this is meant to be an internal representation, we need to design a set of classes to represent these things. The first one is Rule, without question.

Class Rule
    Public Name As String
    Public RuleText As String
End Class

I don't think anything could make use of such a class. The fundamental problem is that nothing has been compiled at all: the rules are still in text form. We need a real representation of a grammar rule! After thinking about it for a long time, I chose the road map method (in later study I found that the same approach was already in use in 1976).

The figure on the left is a basic element. On its own, however, it is not good enough, for the same reason as the Rule class above: it is too simple. We need something practical that carries useful information for the later stages of the compiler; after all, this is only a service program. We want to adapt it so that it is close to C3 and as close as possible to a computer language. After a large number of trials and continual improvement, I arrived at the following representation scheme.

Representation of the basic road map elements (the figures are not preserved here):

Normal element
Optional element
* closure
Positive closure

Note: illegal exits are not drawn.

A mark in the lower right corner of an element indicates that it is optional: it may or may not match.

The match exit of a closure points back to the closure itself. As long as the element matches, it keeps matching; only when it no longer matches does control pass through the mismatch exit to the next element. A positive closure is simply a copy of the element placed in front of the closure, and that copy is required to match, which agrees with its mathematical meaning.

Below are a few rules that are expressed as road maps:

Integer => DIGIT

String => "" "" ("" ""! "" "" "" "" "" "

Float => DIGIT* "." DIGIT

RuleMap and RuleMapItem correspond to the road map and to its elements, respectively. RuleMap is therefore a collection class and, following .NET conventions, it needs to implement the IEnumerable interface. RuleMapItem needs an optional flag, a match exit, a mismatch exit and a name.

' Interface outline; member bodies are omitted.
Class RuleMap
    Implements IEnumerable
    Public Function GetEnumerator() As IEnumerator
    Public Name As String
    Public Count As Integer
    Public Item(Index As Integer) As RuleMapItem
    Public Function Add(Name As String) As RuleMapItem
End Class

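Class RuleMapItem
    Public Name As String
    Public [Optional] As Boolean   ' Optional is a reserved word in VB, hence the brackets
    Public Match As RuleMapItem    ' exit taken when the element matches
    Public Dismatch As RuleMapItem ' exit taken when the element does not match
End Class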
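To make the role of the two exits concrete, here is a minimal sketch in Python (an illustration only, not the author's VB.NET code). It mirrors the RuleMap and RuleMapItem outlines and wires up the Float rule by hand. The kind tokenizer, the accepts walker and the convention that an empty match exit ends the rule are assumptions made for this example.

```python
class RuleMapItem:
    """One element of a road map (field names follow the article)."""

    def __init__(self, name, optional=False):
        self.name = name          # token kind this element matches
        self.optional = optional  # marked optional in the diagram
        self.match = None         # exit taken when the element matches
        self.dismatch = None      # exit taken when it does not match


class RuleMap:
    """An iterable collection of RuleMapItem objects."""

    def __init__(self, name):
        self.name = name
        self._items = []

    def add(self, name):
        item = RuleMapItem(name)
        self._items.append(item)
        return item

    def __getitem__(self, index):
        return self._items[index]

    def __len__(self):
        return len(self._items)

    def __iter__(self):
        return iter(self._items)


def kind(ch):
    # Hypothetical tokenizer: digits become DIGIT, anything else is itself.
    return "DIGIT" if ch.isdigit() else ch


def accepts(rule, text):
    # Assumed conventions: an empty match exit ends the rule,
    # an empty mismatch exit is illegal and rejects the input.
    item, pos = rule[0], 0
    while True:
        if pos < len(text) and kind(text[pos]) == item.name:
            pos += 1
            if item.match is None:
                return pos == len(text)
            item = item.match
        else:
            if item.dismatch is None:
                return False
            item = item.dismatch


# Float => DIGIT* "." DIGIT
float_rule = RuleMap("Float")
digits = float_rule.add("DIGIT")
dot = float_rule.add(".")
fraction = float_rule.add("DIGIT")
digits.match = digits    # closure: keep matching while digits arrive
digits.dismatch = dot    # leave the closure through the mismatch exit
dot.match = fraction

print(accepts(float_rule, "12.5"), accepts(float_rule, "12."))
```

With this wiring "12.5" and ".5" are accepted, while "12." is rejected because the final DIGIT element has no legal mismatch exit.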

The definition of grammar rules is now basically complete. Next comes the container for rules: the section. Semantically a section is also a rule, one whose content is the set of all its members; that is, it matches whenever any of its inner rules matches. Section therefore inherits from Rule. As a container, it should also implement the IEnumerable interface.

' Interface outline; member bodies are omitted.
Class Section
    Inherits Rule
    Implements IEnumerable
    Public Function GetEnumerator() As IEnumerator
    Public Count As Integer
    Public Sub Add(r As Rule)
    Public Sub Remove(Index As Integer)
    Public Item(Index As Integer) As Rule
End Class

Since the only difference between a Class and a Section is where it is located, an IsClass property is added to Section.
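The idea that a section is both a rule and a container of rules can be sketched as follows. This is a Python illustration rather than the author's VB.NET code. The matches method and its literal comparison against rule_text are hypothetical; they exist only to show how a section can defer to its members.

```python
class Rule:
    def __init__(self, name, rule_text=""):
        self.name = name
        self.rule_text = rule_text

    def matches(self, token):
        # Hypothetical stand-in for real rule matching: compare literally.
        return token == self.rule_text


class Section(Rule):
    """A rule that contains other rules and matches if any member matches."""

    def __init__(self, name, is_class=False):
        super().__init__(name)
        self.is_class = is_class   # the IsClass flag described in the text
        self._rules = []

    def add(self, rule):
        self._rules.append(rule)

    def remove(self, index):
        del self._rules[index]

    def __getitem__(self, index):
        return self._rules[index]

    def __len__(self):
        return len(self._rules)

    def __iter__(self):
        return iter(self._rules)

    def matches(self, token):
        return any(rule.matches(token) for rule in self._rules)


keywords = Section("Keywords")
keywords.add(Rule("If", "if"))
keywords.add(Rule("Else", "else"))

# Because a Section is a Rule, sections can be nested.
language = Section("Language")
language.add(keywords)
language.add(Rule("Semicolon", ";"))

print(language.matches("else"), language.matches("while"))
```

Because Section inherits from Rule, the outer section does not need to know whether a member is a plain rule or another section.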

2.2 Road map combination principles

The most important question inside a road map is this: during compilation, which exits of the road map are legal and which are illegal? For example: (A | (B [C])) D

Once A has been analyzed, why does its mismatch exit lead to B? Because the two are in an "or" relationship: the B path is considered only when A does not match.

What about C? Since it is attached to B, it belongs on a legal exit of B, namely the match exit. Then why does the match exit of A not point to C? Because A and C are not in the same expression! One might object that A and B are not in the same expression either. That is true, but the expressions that contain A and B sit inside the same expression. The optional C is in fact an expression with only one element.

Now for the most complex and troublesome part: D. It is concatenated with the preceding expression, so it should be attached to a legal exit. But which exits are legal? By the C3 convention, the mismatch exit of an optional element is also a legal exit. Should both exits of C point to D? Consider what "optional" means: whether or not the element matches, the next element must still be matched. That is to say, both the match exit and the mismatch exit must point to the following element, namely D.

Is that all? What if the input is a string that starts with A? According to the rule, AD is acceptable. But if you scan it with the road map built so far, only A is accepted and D is not. This is obviously wrong, so the match exit of A should also point to D.

However, the problem is not as simple as it looks above: how do we find all three of these legal exits? Obviously we need to find all legal exits and assign the following element to every one of them. There are two problems. First, because the road map is a mesh structure, how do we traverse it? This is covered in algorithm textbooks, so I will not go into detail. Second, how do we judge whether an exit is legal? A match exit is legal as long as it does not yet point to any element. The main difficulty is the mismatch exit (C above is a typical example), and the solution is to treat a mismatch exit as legal only when it belongs to an optional element.
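The search for legal exits can be sketched in Python as follows (an illustration under stated assumptions, not the author's implementation). The Item class, the legal_exits and concatenate helpers and the accepts walker are hypothetical names. The traversal is an ordinary depth-first walk with a visited set, so self-loops such as closures do not cause trouble.

```python
class Item:
    def __init__(self, name, optional=False):
        self.name = name
        self.optional = optional
        self.match = None      # exit taken on a match
        self.dismatch = None   # exit taken on a mismatch


def legal_exits(entry):
    """Collect (item, exit_name) for every open legal exit reachable from entry."""
    found, seen, stack = [], set(), [entry]
    while stack:
        item = stack.pop()
        if id(item) in seen:
            continue
        seen.add(id(item))
        if item.match is None:
            found.append((item, "match"))        # open match exits are legal
        else:
            stack.append(item.match)
        if item.dismatch is None:
            if item.optional:                    # only optional elements have
                found.append((item, "dismatch"))  # a legal mismatch exit
        else:
            stack.append(item.dismatch)
    return found


def concatenate(entry, following):
    # Collect first, then assign, so the walk never wanders into `following`.
    for item, exit_name in legal_exits(entry):
        setattr(item, exit_name, following)


def accepts(entry, tokens):
    # Assumed conventions: an empty match exit ends the rule,
    # an empty mismatch exit rejects the input.
    item, pos = entry, 0
    while True:
        if pos < len(tokens) and tokens[pos] == item.name:
            pos += 1
            if item.match is None:
                return pos == len(tokens)
            item = item.match
        else:
            if item.dismatch is None:
                return False
            item = item.dismatch


# (A | (B [C])) D
a, b, c, d = Item("A"), Item("B"), Item("C", optional=True), Item("D")
a.dismatch = b   # alternation: try B only when A does not match
b.match = c      # C follows B
exits = [(item.name, name) for item, name in legal_exits(a)]
concatenate(a, d)

print(exits)
print([accepts(a, list(s)) for s in ("AD", "BD", "BCD", "A", "CD")])
```

For this example the walk finds exactly the three exits discussed above: the match exit of A and both exits of C. The mismatch exit of B stays empty because B is not optional.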

With the framework design basically complete, we can consider the implementation issues, the software engineering issues and the logical issues. First, implementation: because the classes, the interfaces and the key, difficult algorithms have already been designed, all that remains is to fill in the code, which is relatively simple. Next, software engineering: robustness, maintainability and extensibility. Extensibility deserves the most attention, because this experimental project does not rest on mature technology, and we never know when we will discover that something was forgotten and has to be added. Finally, the logic of the design is very clear.

During implementation I did not pay much attention to anything other than extensibility. In the actual implementation I integrated the compilation code into these classes. This is actually not a good idea, because a dedicated class is still needed to handle compilation, and scattering the compilation code makes things more troublesome. Using a completely separate C3 compiler, although it may be relatively more complex, is much more maintainable. The dedicated compilation class mentioned above is the Grammar class, which was born later. It is entirely responsible for compiling C3 and for serializing and deserializing the compiled result. For a more detailed description, see the matching rules documentation.

While building the analyzer we needed error messages, so a facility and the corresponding classes were added to the architecture above to generate error information automatically. However, since the information can be generated automatically here, it can certainly also be generated automatically when the analyzer runs, so this code is actually of little use. It was kept from version 1.0 onward in case of future need, but it is no longer used as of version 1.2.

After completing the above work, I wrote C-, a simplified version of C, and made the corresponding changes for .NET; see Program Test Report 1. Of course, that report was written only after the scanner was finished; before that, the tests only checked whether the functions worked and whether serialization and deserialization were correct. From this you can see how important test cases are: without them we cannot even trust our own code. Without tests, how do you know that it is right?

When reposting, please cite the original address: https://www.9cbs.com/read-34110.html
