Expression calculation: analysis and design

xiaoxiao2021-03-06 98

The problem is from a data acquisition system I have done to network devices and hosts, some captured data need to be saved after a certain calculation, rather than only saving its original value. In order to provide the user's greatest flexibility, I envisaged a user interface that allows the user to input to calculate expressions (or calculated formulas). Thus, in addition to the need to comply with a small amount, the user can get the maximum flexibility.

What is the characteristic of such an expression? It is generally not purely calculated expression (simple as: 1 2 * 3-4). It contains elements I call variables. Variables typically have special internal syntax, such as the total number of physical memory, "@TotalMemory" means device or host (below as device), then expressions "(@ TotalMemory- @ FreeMEMORY) / @ TotalMemory * 100" means The percentage of the current memory usage rate. If you connect to the alarm system, monitor this value that exceeds 80 systems, WARNING, then this is a meaningful thing. Different kinds of acquisition data may require a different computation of different complications. However, it is clear that when the final evaluation, those specific variables must be replaced with specific values (ie, the specific values collected), otherwise the expression is not calculated. This process occurs at runtime.

The generality of the problem I think expression computing is a general topic and is not a new topic. We may have encountered it. I wrote an expression to the conversion and calculation program when I was studying. At the time, I was a pro-work. I have seen some reporting systems, whether it is alone or is included in the MIS system, financial software, many of the calculation formulas. I think the calculation formulas in these systems and the problems I have encountered are substantially the same. For me, I encountered this problem in the data collection project, and next time I will encounter it in other projects. Since it is already more than once, I hope to find a general solution.

Some existing design and implementation cannot meet the requirements after designing and implementing the first version, I feel unattractive. Subsequently, I spent some time to search for (Keywords: Expression is infixed by the suffix or expression infix postfix). Slightly disappointed is that some of the conversions found, the calculated program cannot meet my requirements. Many such procedures only support pure immediately calculated expression, and variables are not supported. Moreover, expression parsing and calculation is coupled together, it is difficult to expand. Add new operators (or new variable syntax) almost certainly to modify the source code. In my opinion, this is the biggest defect (in fact, the expression conversion and calculation of the expression in that year, although it is proud, it seems to have the same defect). However, the conversion of the expression and the calculation itself have mature, stack-based classic algorithms, many computer books or textbooks. The expression written in natural way is a prefixed form, first convert the infix expression into a suffix expression, and then calculate the value of the suffix expression. I intend to use this classic process and algorithm.

My design idea and the target of the target are mature, I am eager to extract them out, remove (related to parsing) coupling! Imagine if things have relatively complete connotation and independence, it produces this need, and can also be demonstrated by the form of separation and extraction, which is inseparable from abstraction. I soon realized that I actually designed a small-scale framework for expression calculations.

The expression is to support the addition and subtraction, this is basic, immediately thinking. Maybe it should also support the square, the opening (SQRT), the triangle operator such as SiN, COS, etc. So what happens if there is anything else, including custom operators? Can you determine how to consider? As a custom operator, it is completely, reasonable needs. In the data acquisition system, I think about the introduction of a DIFF operator, indicating the same accumulated collection, the difference in the most recent twice (ie collected cycle). The above thinking prompted me to decide, the design of the operator must be open. Users (herein referring to user programmers, the same) can expand, add new operators. The variables are allowed in the expression. For the support of variables run through the entire process of expression parsing, conversion, and calculation. In the parsing phase, users should allow users to use their own variable syntax, I should not implement variable identification based on certain syntax.

Due to support scalable operators, unknown variable syntax, even basic values (icons 123, 12.3456, 1.217), there are also various types and precision (Integer / long / float / double / biginteger / bigdecimal), this Decided that a cured expression parsing method was not available. Expression parsing also needs to be expandable. The best result is to provide an easy-to-use and extended parsing framework. For new operators, new variable syntax, users extends on this framework to provide enhanced parsing capabilities.

From an abstract point of view, I plan to support the expression of only four elements: parentheses (including left and right brackets), operators, values, and variables. A end user gives a character string, after parsing, it may generate an internal representation, which is easy to follow up, and each element of the expression can only be one of the above four.

The value I wrote an expression of the value of the value, called Numeral. I am an integer, floating point or double precision. From a relatively vague sense, I hope it can express the value of any type and precision. But I also hope that it can explicitly express which type and accuracy of the representative, if needed. Even I think that Numeral can also express Biginteger and BigDecimal (idea in some case, we need to resolve and calculate a such expression, it allows the accuracy and range of the value, so that Long or Double is not covered ), Otherwise we will encounter trouble under special occasions. In scalability, numerical classes are not like operator classes, which should be mature, so almost no extension needs to be extended.

After repeated trial chaos (Numeral later has been modified and even rewriting), I found a wise approach. Use java.lang.Number directly as a value class (actually this is an interface). I am glad to see that in Java, INTEGER, Long, Float, Double, and even the numerical classes such as Biginteger, BigDecimal enabled Java.Lang.Number interface, and the user uses what type and accuracy of Number. Look and use, the right to master in his / her hands, I should not determine the type and accuracy of the value in advance. Select the Number class to express the value, which seems to be the best, the minimum choice is selected, and has maintained considerable flexibility. As a result of a band, the Numeral class is discarded. In brackets in the expression, the role of brackets is not overlooked. It can change the natural priority of the operation, calculate in the order of the user. I use the Bracket class to indicate parentheses, which can be seen as Final because it does not need to be extended. Braces are divided into parentheses and right brackets, I use two static instance variables (and are only two instance variables of the Bracket class) as the Bracket class.

Public Class Bracket

{

PRIVATE STRING NAME;

Private bracket (String name) {

THIS.NAME = Name;

}

Public Static Final Bracket

LEFT_BRACKET = New Bracket ("),

Right_bracket = new bracket (")");

Public string toString () {

Return Name;

}

The design requirements of operator operators are open, which is almost immediately mean that it must be abstract. I once hesitated as an interface or an abstract class definition, and finally I chose an abstract class.

Public Abstract Class Operator

{

PRIVATE STRING NAME;

Protected Operator (String Name) {

THIS.NAME = Name;

}

Public Abstract Int getDimension ();

Public Abstract Number Eval (Number [] Oprands, INT Offset; // THROWS ARITHMETICEXCEPTION?

Public Number Eval (Number [] Oprands) {

Return Eval (Oprands, 0);

}

Public string toString () {

Return Name;

}

The design of this operator contains two primary interface methods. This is conveyed by getDimion () interface: Is the operator a few yuan? That is, several operands are needed. Obviously, the most common is a dollar and binary operators. This interface method seems to also mean more than binary operators, but I have no more in-depth investment for more than binary operators. I can't determine whether the transformation of the stack-based expression and whether the algorithm fully supports the operator of more than binary. Despite this concern, I still retain the current interface method.

The main interface method of the operator is EVAL (), which is the calculation interface of the operator, reflecting the essence of the operator. In this interface method, you must pass all the required operands to it, the operator is a few yuan, requiring several operands, which should be consistent. Then, perform calculations that meet the meanings, return the result. If new operators are added, the user needs the above interface method of the operator. Variables say in a sense, the variable is "to be determined". Should I design a Variable class (or interface)? I did this. When is the variable, what specific value is replaced, these procedures I don't know, it should be left to the user. I have almost zero for variables, so the meaning of the Variable class is not big. If you continue to keep this class / interface, you also bring a limit to the user, he / she must inherit or implement the Varibale class / interface, so I have discarded the variable. I just declare and stick to this: In an expression, if an element is not parentheses, it is not a value, nor an operator, then I will treat it as a variable.

Expression parsing interface expression analysis The basic problem to be resolved is: For the expression string given by the user, the value, operator, parentheses, and variables are identified, and then converted into internal, easy to process expressions. form. I provided a general expression parsing interface as follows.

Public Interface Parser

{

Object [] PARSE (STRING EXPR) Throws IllegalexpressionException;

}

In this parsing interface I only define a method PARSE (). Expression is used as input parameters, and the method returns an array Object [] as a resolution result. If parsing passes, you can affirm the elements in the array or Number, or Operator, or Bracket. If it is not one of the above, it is considered to a variable.

Perhaps this is too general for expression analysis. Because it avoids the key issue of "how to resolve", it seems that it is not great to help users. How to resolve the expression, in my opinion, a complex, even difficult problem.

The main difficulties are in that it is unable to provide an out-of-analysis implementation of various expressions. Considering that users may add new operators to introduce new variable syntax, and even support different types and accuracy. As mentioned earlier, if you can design an expression parsing framework, let the user can easily expand on this framework. But I have not been able to do this. Next, a default parser (SimpleParser) that has been implemented will be mentioned. This default implementation tries to establish such a framework, I think there may be certain limitations.

Conversion of infix expressions to suffixes This is done by a converter class Converter. I can independent of the conversion algorithm (and the calculation algorithm below), so it does not rely on the extension of operators or variables, which is benefited from the previously doing basic work - for expression elements (numerical, parentheses, operators and Analysis and abstraction of variables). The basic process of the algorithm is such (readers can check from online or related books, I am slightly changed, because of the support variables): Create a workplace and an output queue. Read the expression from left to right, when you read a value or variable, send it directly to the output queue; when you read the operator T, pop up all the priorities in the stack above or equal to T, send it to the output In the queue, then t in stack; when reading the left parentheses, it always puts it into the stack; when reading the right bracket, it will pop up the operators above the first left brackets of the stack, send it to the output queue. After, discard the left parentheses. In the Converter class, the core method control () executes the above algorithm, the input is a prefix expression, the output is a suffix expression, and the process of completing the conversion. Public Abstract Class Converter

{

Public Abstract Int PrecedenceCompare (Operator OP1, Operator OP2)

Throws unknownoperatoruence;

Public Object [] Convert (Object [] InfixExpr)

Throws IllegaleXpressionException, UnknownOperatorexception

{

Return Convert (InfixExpr, 0, InfixExpr.length);

}

Public Object [] Convert (object [] infixExpr, int offset, int LEN

Throws IllegaleXpressionException, UnknownOperatorexception

{

IF (InfixExpr.length - Offset

Throw new illegalargumentException ();

// Create an output expression to store results

ArrayList Output = new arraylist ();

// Create a work stack

Stack stack = new stack ();

INT currinputposition = offset; // Current location (in the input queue)

System.out.println ("----------- Begin conversion procedure ------------"); // Temp!

While (CurrinputPosition

{

Object currinputelement = infixExpr [currinputposition ];

If (currinputelement instanceof number) // numerical element direct output

{

Output.add (currinputelement);

System.out.println ("Number:" currinputelement; // Temp!

} Else if (currinputelement instanceof bracket) / / encounter parentheses, put in stack or match

{

Bracket currinputbracket = (bracket) currinputelement; if (currinputbracket.equals) {// 括进进进

Stack.push (currInputelement);

} else {// Right brackets, seek matching (left bracket)

// Pop up all stack elements (operator) until you encounter (left) brackets

Object stackelt;

{

IF (! stack.empty ())

STACKELEMENT = stack.pop ();

Else

Throw New IllegalexpressionException ("Bracket (s) mismatch";

STACKELEMENT InstanceOf Bracket

Break;

Output.add (stackelement);

System.out.println ("Operator Popup:" stackelement); // Temp!

WHILE (TRUE);

}

} else if (currinputelement instanceof operator)

{

Operator currinputoperty = (operator) Currinputelement;

// Pop up all priorities above or equal to all operators

// (until all pop up all the conditions or the left bracket)

While (! stack.empty ()) {

Object stackelement = stack.peek ();

IF (Stackelement InstanceOf Bracket) {

Break; // This must be a left bracket, no pop-up

} else {

Operator stackoperator = (operator) stackelement;

IF (precedenceCompare (StackOperator, currinputoperty> = 0) {

// The priority is higher than the current, pop-up (to output queue)

Stack.pop ();

Output.add (stackelement);

System.out.println ("Operator:" stackelement); // Temp!

} Else {// The priority is lower than the current, no pop-up

Break;

}

// Current operator into the stack

Stack.push (currInputelement);

} else // if (currinputelement instanceof variable)

// Other people are considered variables, and variables are also directly output.

{

Output.add (currinputelement);

System.out.println ("Variable:" currinputelement; // Temp!

}

/ / Pop up the remaining elements in the stack to the output queue

While (! stack.empty ())

{

Object stackelement = stack.pop ();

Output.add (stackelement);

System.out.println ("Left Stack Operator:" stackelement); // Temp!

}

System.out.println ("----------------------"); // Temp!

Return Output.toArray ();

}

The reader may soon notice that the Converter class is not a specific class. Since the algorithm is mature and stable, we also have independently, why is the CONVERTER class not a stable concrete class? Because I found a problem that I have to face an operator priority during the conversion process, this is a problem that cannot be ignored. According to convention, if there is no bracket explicitly determine the order in which the calculated calculation is used, the calculated order is determined by the priority of the comparison operator. Because I can't determine how much the user is in specific use, how much is his / her operator, and what is the priority order between two operators. This knowledge can only be told me by the user. To be wrong, tell the Converter class, so the Converter class provides an abstract operator comparison interface precedenceCompare () is implemented by the user.

During a period of time, I was confused for how to verify the validity of the expression. I realized that the conversion did not necessarily mean that the expression is certain, effective. Even the next successful calculation of the value of the suffix expression is also proved to be valid. Of course, in the case of certain transition failures or calculations, such as the number of operators do not match the number of operands, left and right parentheses do not match, etc., it is sure that the original expression is invalid. However, proves that an expression is effective, and the conditions should be demanding. Unfortunately, I have not found the theoretical basis for testing the validity of expressions.

Calculate the suffix expression This is done by a calculator class Calculator. The core method of the Calculator class is evAl (), and the parameters transmitted to it must be a suffix expression. Before calling this method, if the variable is contained in the expression, it should be replaced by the corresponding value, otherwise the expression is uncapable, which will result in improving IncalculableExpressionException. The basic process of the algorithm is this: Create a work stack, read the expression from left to right, press the value to press into the stack; read the operator to pop up N numbers from the stack, calculate the result, then press In the stack, n is the element of the operator; repeating the above process, the last output stack top is used as the calculation result, end.

Public Class Calculator

{

Public Number Eval (Object [] PostfixExpr) Throws IncalculableExpressionException

{

Return Eval (postfixexpr, 0, postfixexpr.length);

}

Public Number Eval (Object [] PostfixExpr, int offset, int LEN

Throws IncalculableExpressionException

{

PostFixExpr.length - Offset

Throw new illegalargumentException ();

Stack stack = new stack ();

INT currposition = offset

While (CurrPosition

{

Object element = postfixExpr [currposition ];

Element InstanceOf Number {

Stack.push (element);

} else if (Element InstanceOf Operator)

{

Operator op = (operator) Element;

INT Dimensions = op.getdimension (); if (Dimensions <1 || stack.size ()

Throw New IncalculableExpressionException ("Lack Operand (s) for operator '" op "");

Number [] OPERANDS = New Number [DIMENSIONS];

For (int J = Dimensions - 1; J> = 0; J -)

{

OPERANDS [J] = (Number) stack.pop ();

}

Stack.push (OP.EVAL (Operands));

Else

Throw New IncalculableExpressionException ("Unknown Element: Element);

}

IF (stack.size ()! = 1)

Throw New IncalculableExpressionException ("Redundant Operand (s));

Return (Number) stack.pop ();

}

The default implementation I mainly discussed the design of expression calculations. A good design and implementation often include some default implementations. In this case, I provide the implementation of the basic four operators and a default parser implementation (SimpleParser).

The operator achieves four basic operators of the addition and subtraction. It should be noted that for each basic operator, the current default implementation only supports Number, long, float, and double. Moreover, it is necessary to pay attention to how different types and accuracy values are calculated, how to determine the type and accuracy of the result value. The default implementation has a certain handling.

The parser This default parser is implemented, I think it is simple, and it is named SimpleParser. Its basic idea is to consist of expressions as composed of parentheses, values, operators, and variables, each of which can be parsed independently, providing an expression element parser (ElementParser). SIMPLEPARSER calls four element parsers, completes all parsing work.

ElementParser provides an expression element level parsing interface. This interface implements this interface.

Public Interface ElementParser

{

Object [] parse (char [] expr, int OFF);

}

Analysis Method PARSE () The input parameter indicates the string to be parsed, as well as the start offset. The returned result is stored in the specific element (Number, Operator, Bracket or Object) obtained, and the cutoff shift of the parsed. This time, the offset is likely to be the start offset of the next resolution, and if you don't consider the blank character.

So when the parser is called by SimpleParser during the entire parser process? My solution is: it calls each element parser session. It can be said that this is an attempt. The order of tries is to pay attention to, in turn, the variable parser -> operator parser -> numerical parser -> Brand parser.

Why do you perform such a order? From the deep level, I reflected my concern. This is: The analysis of the expression may be quite complicated. There may be such an expression that is parsing for "divided into rules", because there is a place that needs "overall parsing". For example, consider a substring such as "DIFF (@TotalBytesReceived". Users may use it possible to express such a meaning: take the difference between the two acquisitions collected before and after the acquisition amount TotalBytesReceive. DIFF does not even understand the operator in the traditional sense. The final reasonable choice is probably, "Diff (@TotalBytesReceived" is deemed to parse and handle the entire "DIFF (@TotalBytesReceiveD". In such a case, the resolution is disassembled to "DIFF", "(", "@ Bytereceived", ")". This is why the variable parser is called first, so that the user is allowed to intercept, redefine the regular parsing method to meet the actual needs. In fact, I arranged some of the possible parts (such as variables), such as variables), their parsers were first called, minimized portions (such as parentheses) their parsers were last called. On each step, if the resolution is successful, then subsequent parsers will not be called. If all the element parsers cannot be resolved in a location of the expression string, the expression will not be resolved and will throw an IllegaleXpressionException.

Extended implementation due to the relationship between the space, not here, the exemplary implementation is discussed. This does not mean no extension implementation. In the data acquisition project mentioned earlier, since the basic original intention is to support the variable of special grammar, I implements an extension implementation that supports variables, and supports some other operators, except for four operators. I believe that the reader can see that the work I do is reflected and satisfied with scalability. The scalability is mainly reflected in operators and variables.

Summary For expression calculations, I have a challenging requirement, but it is not too high. However, in order to close or reach this goal, I have a good job in design, and the number is easy to manal. As mentioned earlier, I dropped the Numeral class, the Variable class. In fact, it is not there. I have also designed the Element class, and the expression is internally embodied as an array ELEMENT []. In the ELEMENT class, it indicates what type of element it contains through an enumeration variable (value, parentheses, operator or variable). But I smell this practice is not clear enough, natural (if you are chasing it, you can say it is not an objective), and finally discarding this class. Accordingly, ELEMENT [] is replaced by more direct object [].

My continuous improvement is to strive to keep design simultaneously in the pursuit of other goals. Note that this is not equal to the pursuit of too simple! I hope that my efforts make me basically reach this goal. I have removed the main coupling, allowing relatively unchanged parts - expression conversion and calculating part independently, and opens the changed part - operators and variables. Although I have regretted in expression analysis, the general parsing interface of the expression is too broad, and it is not possible to provide substantive help for the user's extension. Fortunately, the default parser is made up.

Finally, I hope this design and implementation of expressions can be used and expanded for others. I am willing to see it can withstand the test.

转载请注明原文地址:https://www.9cbs.com/read-91640.html

9cbs

New Post(0)