Expression Parsing in Data Structure

Free Data Structures and Algorithms courses with real-time projects Start Now!!

A+B-(C*D)/E+X*Y-

I am sure you must have seen this kind of expression at some point in your life. When you approach solving it, you assign the values to the variable and perform the operations based on the BODMAS rule. But that is not how expression solving works in computer systems. When you enter an expression like this in the system, it reads it as a string.

In this article, we will be learning types of expression notations, precedence order, and associativity. Later we will be focusing on how a computer system approaches solving an arithmetic or logical expression

Expression Parsing

Expression parsing in data structure refers to the evaluation of arithmetic and logical expressions. The arithmetic expression can be written in three forms:

1. Infix notation

2. Postfix notation

3. Prefix notation

The final answers to the arithmetic or logical expression remain the same no matter in which form they are represented.

Associativity in Expression Parsing

It is used when two operators of the same precedence are present in an expression. Associativity can be:

1. left to right

2. right to left.

For example: 

Expression 40/20*20 is evaluated as :

 = (40/20)*20

 = 2*20

 = 40

Since the order of precedence for * and / is the same, expression is solved according to associativity, which for ✱ and / is left to right.

Order of precedence in Expression parsing

Order of precedence is the order that the operators in an expression will be following during execution. If two or more symbols of the same precedence level are preset in an expression, then the expression is read according to associativity.

OperatorDescriptionAssociativity
()parenthesisleft-to-right
[]brackets
.Object 
->pointer
++  – –Postfix increment, decrement
++  – –Prefix increment/decrementright-to-left
+ – Unary plus/minus
! ~Logical negation/bitwise complement
(type)Cast
*Dereference
&Address (of operand)
sizeofreturn size in bytes 
*  /  %Multiplication/division/modulusleft-to-right
+  –Addition/subtractionleft-to-right
<<  >>Bitwise shift left, Bitwise shift rightleft-to-right
<  <=Relational less than/less than or equal toleft-to-right
>  >=Relational greater than/greater than or equal to
==  !=Relational is equal to/is not equal toleft-to-right
&Bitwise ANDleft-to-right
^Bitwise exclusive ORleft-to-right
|Bitwise inclusive ORleft-to-right
&&Logical ANDleft-to-right
| |Logical ORleft-to-right
? :Ternary conditionalright-to-left
=Assignmentright-to-left
+=  -=Addition/subtraction assignment
*=  /=Multiplication/division assignment
%=  &=Modulus/bitwise AND assignment
^=  |=Bitwise exclusive/inclusive OR assignment
<<=  >>=Bitwise shift left/right assignment
,Comma (separate expressions)left-to-right

Types of notations in Expression Parsing

1. Infix notation in Expression Parsing

The normal form of expression that we generally use is the infix notation. Operators in infix notation are used in between the operands. Infix notation is easy for humans but for computer systems, it might require more memory and time to solve.

For example:  (a+b) * (c-d)

2. Prefix notation in Expression parsing

In prefix notation, also known as polish notation, the operator is written before the operands.

For example: *+ab-cd

3. Postfix notation in Expression Parsing

In prefix notation, also known as reverse polish notation, the operator is written after the operands.

For example:  ab+cd-*

Expression Parsing

Writing algorithms for parsing expressions written in infix notation is very tough and inefficient. For this reason, infix expressions are first converted to postfix or prefix notation and then evaluated.

Converting of expression is done on basis of associativity and order of precedence.

For example:

Convert the following infix to postfix:

X+(R-Q)*S-Y/Z 

Step 1: start from the expression inside the brackets

=X+(RQ-)*S-Y/Z

Step 2: scanning from left to right, convert the expression for highest precedence and associativity.

=X+RQ-S*-YZ/

=XRQ-S*+YZ/-

Input symbolstackpostfix
XemptyX
++X
(+(X
R+(XR
+(-XR
Q+(-XRQ
)+XRQ-
*+*XRQ-
S+*XRQ-S
XRQ-S*+
YXRQ-S*+Y
/-/XRQ-S*+Y
Z-/XRQ-S*+YZ
nullEMPTYXRQ-S*+YZ/-

Convert the following infix to prefix:

X+(R-Q)*S-Y/Z 

STEP 1: reverse the expression

Z/Y-S*)Q-R(+X

STEP 2:

Input Symbolstackprefix
ZemptyZ
//Z
Y/ZY
ZY/
SZY/S
*-*ZY/S
)-*)ZY/S
Q-*)ZY/SQ
-*)-ZY/SQ
R-*)-ZY/SQR
(-*ZY/SQR-
+-+ZY/SQR-*
X-+ZYSQR-*X
nullemptyZYSQR-*X+-

Reverse the above result: -+X*-RQS/YZ (final prefix expression)

Postfix evaluation

Evaluate:  5 2 + 6 3 / * using stack

InputCurrent stackCurrent symbolOperation on stack
5 2 + 6 3 / *empty5Push 5
2 + 6 3 / *52Push 2
+ 6 3 / *2 5+Pop 2 and 5, push 5+2
6 3 / *76Push 6
3 / *6 73Push 3
/ *3 6 7/Pop 3 and 6, push 6/3
*2 7*Pop 2 and 7, push 2*7

The Final result of the above expression is 14.

Algorithm of Expression Parsing

Below are the steps to do Expression Parsing:

1: start scanning one symbol at a time of the expression from left to right.
2: if the symbol is an operand, push it to stack
3: if the symbol is an operator, pop operand from the stack and perform the operation
4: push the result of step 3, into the stack
5: Repeat steps 1 to 4 until all operands are consumed
6: pop the stack and perform the operation

Postfix evaluation in C++

#include <bits/stdc++.h>
using namespace std;
 
int precedence_order(char op){
    if(op == '+'||op == '-')
    return 1;
    if(op == '*'||op == '/')
    return 2;
    return 0;
}
 
int operation(int a, int b, char op){
    switch(op){
        case '+': return a + b;
        case '-': return a - b;
        case '*': return a * b;
        case '/': return a / b;
    }
}
 
int evaluate(string expression){
    int i;
     
    stack <int> values;
     
    stack <char> ops;
     
    for(i = 0; i < expression.length(); i++){
        
        if(expression[i] == ' ')
            continue;
         
        else if(expression[i] == '('){
            ops.push(expression[i]);
        }
         
        else if(isdigit(expression[i])){
            int val = 0;
             

            while(i < expression.length() &&
                        isdigit(expression[i]))
            {
                val = (val*10) + (expression[i]-'0');
                i++;
            }
             
            values.push(val);
             
              i--;
        }
         
       
        else if(expression[i] == ')')
        {
            while(!ops.empty() && ops.top() != '(')
            {
                int val2 = values.top();
                values.pop();
                 
                int val1 = values.top();
                values.pop();
                 
                char op = ops.top();
                ops.pop();
                 
                values.push(operation(val1, val2, op));
            }
             
            
            if(!ops.empty())
               ops.pop();
        }
         
        
        else
        {
            
            while(!ops.empty() && precedence_order(ops.top())
                                >= precedence_order(expression[i])){
                int val2 = values.top();
                values.pop();
                 
                int val1 = values.top();
                values.pop();
                 
                char op = ops.top();
                ops.pop();
                 
                values.push(operation(val1, val2, op));
            }
             
            
            ops.push(expression[i]);
        }
    }
     
    
    while(!ops.empty()){
        int val2 = values.top();
        values.pop();
                 
        int val1 = values.top();
        values.pop();
                 
        char op = ops.top();
        ops.pop();
                 
        values.push(operation(val1, val2, op));
    }
     
    
    return values.top();
}
 
int main() {
    cout << evaluate("34 + 13 * 93") << "\n";
    cout << evaluate("85 / 23 - 16") << "\n";
    cout << evaluate("100 * ( 2 + 12 )") << "\n";
    cout << evaluate("121 * ( 5 - 1 ) / 11");
    return 0;
}

Expression Parsing Postfix evaluation in Python

def precedence_order(op):
    
    if op == '+' or op == '-':
        return 1
    if op == '*' or op == '/':
        return 2
    return 0
 
def operation(a, b, op):
     
    if op == '+': return a + b
    if op == '-': return a - b
    if op == '*': return a * b
    if op == '/': return a // b
 

def evaluate(expression):
     
    values = []
    ops = []
    i = 0
     
    while i < len(expression):
         
        if expression[i] == ' ':
            i += 1
            continue
         
        elif expression[i] == '(':
            ops.append(expression[i])
         
        elif expression[i].isdigit():
            val = 0
             
            while (i < len(expression) and
                expression[i].isdigit()):
             
                val = (val * 10) + int(expression[i])
                i += 1
             
            values.append(val)
             
            i-=1
        
        elif expression[i] == ')':
         
            while len(ops) != 0 and ops[-1] != '(':
             
                val2 = values.pop()
                val1 = values.pop()
                op = ops.pop()
                 
                values.append(operation(val1, val2, op))
            ops.pop()
         
        else:
        
            while (len(ops) != 0 and
                precedence_order(ops[-1]) >=
                   precedence_order(expression[i])):
                         
                val2 = values.pop()
                val1 = values.pop()
                op = ops.pop()
                 
                values.append(operation(val1, val2, op))
             
            
            ops.append(expression[i])
         
        i += 1
     
    while len(ops) != 0:
         
        val2 = values.pop()
        val1 = values.pop()
        op = ops.pop()
                 
        values.append(operation(val1, val2, op))
     
    return values[-1]
 
if __name__ == "__main__":
     
    print(evaluate("34 + 13 * 93"))
    print(evaluate("85 / 23 - 16"))
    print(evaluate("100 * ( 2 + 12 )"))
    print(evaluate("121 * ( 5 - 1 ) / 11"))



Conclusion

Expression parsing is the method that a computer uses to solve an arithmetic expression, Unlike humans, a computer system has completely different approaches as we discussed in this article since infix expression is not understandable to the system.

We also discussed different components of expression parsing, notations, associativity, and precedence. The precedence order is similar to Mathematical concert of BODMAS, just with a lot more symbols.

Did you like our efforts? If Yes, please give DataFlair 5 Stars on Google

follow dataflair on YouTube

Leave a Reply

Your email address will not be published. Required fields are marked *