Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

regex - Regular Expression for Mathematical Expression Python

I would like to come up with a regex that matches the mathematical expression stored in exp.

The expression starts and ends with a parenthesis and cannot contain any letters. It may contain integers or floats, and the operators can be +, -, *, / and **. The length of the expression is not limited.

This is my regex:

import re
re.match(r'^[(]([(-]?([0-9]+)[)]??)([(]?([-+/*]([0-9]))?([.]?[0-9]+)?[)])[)]*$', exp)

However, my regex does not match some of the strings. For example:

  • exp = '(( 200 + (4 * 3.14)) / ( 2 ** 3 ))'
  • exp = '(23.23+23)'
  • exp = '((23**2)/23)'
  • exp = '(23.34-(3*2))'

I am new to regular expressions and I am not sure which part of my pattern is wrong. Please forgive me for the trouble; I hope someone can help me with it. Thank you so much!

question from:https://stackoverflow.com/questions/65870655/regular-expression-for-mathematical-expression-python

1 Reply

You could approach it as "splitting" the string with operators being your separators. This would save you from trying to represent numbers in your regular expression.

So you only need a pattern that picks up the five operators and the parentheses. This can be written as an alternation (a pipe between the symbols), with the longest operator (**) listed first so that it is not split into two separate * tokens.

import re
symbols   = ["**","+","-","*","/","(",")"]    # longest first
tokens    = re.compile("("+"|".join(map(re.escape,symbols))+")")
            # placing symbols in a group makes re.split keep the separators

def tokenize(exp):
  parts = map(str.strip,tokens.split(exp)) # split and strip spaces
  return list(filter(None,parts))          # remove empty parts

exp = '(( 200 + (4 * 3.14)) / ( 2 ** 3 ))'
print(tokenize(exp))
['(', '(', '200', '+', '(', '4', '*', '3.14', ')', ')', '/', '(', '2', '**', '3', ')', ')']


exp = '(23.23+23)'
print(tokenize(exp))
['(', '23.23', '+', '23', ')']

exp = '((23**2)/23)'
print(tokenize(exp))
['(', '(', '23', '**', '2', ')', '/', '23', ')']

exp = '(23.34-(3*2))'
print(tokenize(exp))
['(', '23.34', '-', '(', '3', '*', '2', ')', ')']        
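
The order of the symbols matters because Python's re module tries the alternatives from left to right and takes the first one that matches. As a small sketch (the helper name split_with is made up for this illustration and is not part of the answer's code), the two orderings can be compared on 2**3:

```python
import re

def split_with(symbols, exp):
    # Hypothetical helper: same splitting approach as tokenize() above,
    # but the symbol order is passed in so that two orderings can be compared.
    pattern = re.compile("(" + "|".join(map(re.escape, symbols)) + ")")
    parts = map(str.strip, pattern.split(exp))
    return list(filter(None, parts))

longest_first  = split_with(["**", "*"], "2**3")
shortest_first = split_with(["*", "**"], "2**3")
print(longest_first)    # ['2', '**', '3']
print(shortest_first)   # ['2', '*', '*', '3']
```

With * listed first, the power operator is never seen as a single token, so a later validation pass would report a missing operand between the two * tokens.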

Then you can perform a second pass and validate that the components are either an operator or a valid number as well as check that the expression is well formed with matching parentheses and alternating operator/operand. At that point you will know exactly what part of the expression is incorrect.

For example:

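def validate(exp):
    parts    = tokenize(exp)
    error    = ""
    pLevel   = 0      # current parenthesis nesting depth
    previous = "$"    # "$" marks the start of the expression
    operators   = ("**", "*", "+", "-", "/")
    needOperand = ("$", "(") + operators   # tokens that must be followed by an operand
    for errorPos,part in enumerate(parts):
        pLevel += (part=="(")-(part==")")
        if pLevel<0:
            error = "too many closing parentheses"
            break
        # a binary operator or ")" needs an operand before it
        # ("-" is not listed so that it can act as a unary minus)
        if part in ("**","*","+","/",")") and previous in needOperand:
            error = "missing operand"
            break
        # a number or "(" needs an operator or an opening before it
        if part not in operators+(")",) and previous not in needOperand:
            error = "missing operator"
            break
        previous = part
        if part in operators+("(",")"): continue
        # operand: digits, optionally followed by "." and more digits
        if all(p.isdigit() for p in part.split(".",1)): continue
        error = "invalid operand: " + part
        break
    if not error and pLevel!=0:
        errorPos,error = len(parts),"unbalanced parentheses"
    if not error and previous in operators:
        errorPos,error = len(parts),"missing operand"
    if error:
        # print the normalized expression with a marker under the offending token
        print("".join(parts))
        indent = " " * sum(map(len,parts[:errorPos]))
        print(indent+"^")
        print(indent+"|__ Error!",error)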

...

validate('(( 200 + (4 * 3,14)) / ( 2 ** 3 ))')
                       
((200+(4*3,14))/(2**3))
         ^
         |__ Error! invalid operand: 3,14


validate('(( 200 + (4 * 3.14)) / ( 2 ** 3 )') 

((200+(4*3.14))/(2**3)
                      ^
                      |__ Error! unbalanced parentheses


validate('(( 200 + (4 * 3.14)))) / ( 2 ** 3 )') 

((200+(4*3.14))))/(2**3)
                ^
                |__ Error! too many closing parentheses


validate('(( 200 + *(4 * 3,14)) / ( 2 ** 3 ))')

((200+*(4*3,14))/(2**3))
      ^
      |__ Error! missing operand


validate('(( 200 + ()(4 * 3,14)) / ( 2 ** 3 ))')

((200+()(4*3,14))/(2**3))
       ^
       |__ Error! missing operand


validate('(( (200 + )(4 * 3,14)) / ( 2 ** 3 ))')

(((200+)(4*3,14))/(2**3))
       ^
       |__ Error! missing operand


validate('(( (200 + 2)(4 * 3,14)) / ( 2 ** 3 ))')

(((200+2)(4*3,14))/(2**3))
         ^
         |__ Error! missing operator
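
If only a yes/no answer is needed, as with the re.match call in the question, the same checks can be condensed into a function that returns a boolean. The following is a minimal sketch rather than part of the original answer: is_valid, SYMBOLS, TOKENS and NUMBER are names made up for this illustration, and it assumes, as the question states, that the expression must start with ( and end with ).

```python
import re

SYMBOLS = ["**", "+", "-", "*", "/", "(", ")"]            # longest first
TOKENS  = re.compile("(" + "|".join(map(re.escape, SYMBOLS)) + ")")
NUMBER  = re.compile(r"\d+(\.\d+)?")                       # integer or float, e.g. 23 or 3.14

def is_valid(exp):
    # split on the symbols, strip spaces, drop empty pieces
    parts = [p.strip() for p in TOKENS.split(exp)]
    parts = [p for p in parts if p]
    # assumption taken from the question: starts with "(" and ends with ")"
    if not parts or parts[0] != "(" or parts[-1] != ")":
        return False
    depth = 0
    previous = "$"                                         # "$" marks the start
    for part in parts:
        depth += (part == "(") - (part == ")")
        if depth < 0:
            return False                                   # too many closing parentheses
        expects_operand = previous in ("$", "(", "**", "*", "+", "-", "/")
        if part in ("**", "*", "+", "/", ")") and expects_operand:
            return False                                   # missing operand
        if part not in ("**", "*", "+", "-", "/", ")") and not expects_operand:
            return False                                   # missing operator
        if part not in SYMBOLS and not NUMBER.fullmatch(part):
            return False                                   # invalid operand
        previous = part
    return depth == 0                                      # parentheses must balance

for exp in ['(( 200 + (4 * 3.14)) / ( 2 ** 3 ))', '(23.23+23)',
            '((23**2)/23)', '(23.34-(3*2))', '(4 * 3,14)']:
    print(exp, is_valid(exp))
```

Like validate above, this sketch lets - follow an operator or an opening parenthesis, so that it can act as a unary minus.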
