| iMatix home page
| Libero home page | Libero documentation
| << | < | > | >>
Libero Libero
Version 2.32

Example of Parsing An Arithmetic Expression

The next example is for an arithmetic expression evaluator that was originally proposed by Leif Svalgaard. The program works out the value of a string like '2-SQR(2)*1.414'. The program translates easily into most of the languages I've worked with so far, from assembler to Basic to C and COBOL.

Here I describe the program lrcalc.c, which is a subroutine that I use in Libero. You can find the source for this program along with the other sources for Libero in the lrsrc.zip archive. There are also versions of this program in various languages in the examples archive.

Lrcalc chops the expression into tokens, each representing an operand (numbers) or operator. The operators are classified as:

There are various ways of parsing an expression like this; lrcalc combines two basic techniques: push-down stacks for the operands and operators, and states to indicate how tokens are handled. Each state accepts specific tokens and rejects others. For instance, at the start of the expression, an operator like '*' is not valid. When a state accepts an operand or operator, it adds it to the appropriate stack. When a state stacks an operator, it evaluates any previous operators that have the same, or higher priority. '*' and '/' have a higher priority than '+' and '-'.

Operators like '(' and ')' are placeholders that group parts of the expression together with a higher priority. Operators like '*', '/', '+', and '-' are binary operators that take two values off the operand stack, do their work, and place the result back on the stack.

The program basically takes tokens one by one, stacks and evaluates them according to the priority rules, until it reaches the end of the expression. To make this clean, the program places a special end-mark token at the end of the expression when it starts. When it reaches the end-mark, it evaluates any remaining operators, which leaves the result of the expression sitting on the stack.

After-Init:
    (--) Ok                         -> Expecting-Initial
          + Get-Next-Token
    (--) Error                      ->
          + Terminate-The-Program

The two states Expecting-Initial and Expecting-Operand are similar, except that the first allows End-Mark while the latter does not. I.e. we accept an empty expression (End-Mark in Expecting-Initial), but don't accept an expression that ends when we expect an operand:

Expecting-Initial:
    (--) Term-Op                    ->
          + Allow-Signed-Number
          + Signal-Invalid-Token
          + Terminate-The-Program
    (--) Number                     -> Expecting-Operator
          + Stack-The-Number
          + Get-Next-Token
    (--) Left-Par                   -> Expecting-Operand
          + Stack-The-Operator
          + Get-Next-Token
    (--) End-Mark                   ->
          + Terminate-The-Program

These two states handle signed numbers (e.g.. -12, +100) by accepting Term-Op tokens (ie. '+' or '-') so long as these are stuck to a following number. The module Allow-Signed-Number gets the next token, and if this is a number, it kicks the dialog into accepting a number in the same state. It does this using an exception event called Number:

Expecting-Operand:
    (--) Term-Op                    ->
          + Allow-Signed-Number
          + Signal-Invalid-Token
          + Terminate-The-Program
    (--) Number                     -> Expecting-Operator
          + Stack-The-Number
          + Get-Next-Token
    (--) Left-Par                   -> Expecting-Operand
          + Stack-The-Operator
          + Get-Next-Token

After stacking an operand, the dialog expects an operator:

Expecting-Operator:
    (--) Term-Op                    -> Expecting-Operand
          + Unstack-Ge-Operators
          + Stack-The-Operator
          + Get-Next-Token
    (--) Factor-Op                  -> Expecting-Operand
          + Unstack-Ge-Operators
          + Stack-The-Operator
          + Get-Next-Token
    (--) End-Mark                   ->
          + Unstack-All-Operators
          + Unstack-If-End-Mark
          + Terminate-The-Program
   (--) Right-Par                  -> Expecting-Operator
          + Unstack-All-Operators
          + Unstack-If-Left-Par
          + Get-Next-Token

The Defaults state lists all tokens. If a state does not explicitly accept some token, the Defaults state handles it: it issues an error message and terminates the program:

Defaults:
    (--) Number                     ->
          + Signal-Invalid-Token
          + Terminate-The-Program
    (--) Term-Op                    ->
          + Signal-Invalid-Token
          + Terminate-The-Program
    (--) Factor-Op                  ->
          + Signal-Invalid-Token
          + Terminate-The-Program
    (--) End-Mark                   ->
          + Signal-Token-Missing
          + Terminate-The-Program
    (--) Left-Par                   ->
          + Signal-Invalid-Token
          + Terminate-The-Program
    (--) Right-Par                  ->
          + Signal-Invalid-Token
          + Terminate-The-Program
    (--) Exception                  ->
          + Terminate-The-Program

The expression parsing technique shown here is easy to elaborate to support variables, functions, other operators, strings, etc. It is true that some languages have easier ways of evaluating expressions, but this technique is easily applied to languages like assembler and COBOL that do not have support from tools like lex and yacc.


| << | < | > | >>
| Introduction to Libero | The Coke Machine Example | Example of Using a Telephone | Example of Controlling a Telephone | Source Code For Phone.c | Example of a C/C++ Comment Stripper | Example of Parsing An Arithmetic Expression | Dialogs For Dummies | Frequently Asked Questions
iMatix
Copyright © 1996-97 iMatix