Bash regular expression execution hangs on long expressions

159 views Asked by At

I need to validate a 38 field comma seperated string. Fields can be numeric, decimal or empty allowed strings.

Problem is when I construct a regular expression for 38 fields and try to execute, it hangs forever and it hangs.

I use following per field reg exps:

INT="[0-9]+"
TIM="[0-9]+"
NUM="[0-9]+(\.[0-9]+)?"
STR=".*"  # --> (also tried "[^,]*" but no change)

I constructed my regexps with above expressions.

1) This is working fine: (Output: "matches")

[[ "str1,1.1,5,6,7,8,9,str2,str3,str4,str1,1.1,5,6,7,8,9,str2,str3,str4,str1,1.1,5,6,7,8,9,str2,str3,str4,str1,1.1,5,6,7,8,9,str2,str3,str4" =~ ^.*\,[0-9]+(\.[0-9]+)?\,[0-9]+\,[0-9]+\,[0-9]+\,[0-9]+\,[0-9]+\,.*\,.*\,.*\,.*\,[0-9]+(\.[0-9]+)?\,[0-9]+\,[0-9]+\,[0-9]+\,[0-9]+\,[0-9]+\,.*\,.*\,.*\,.*\,[0-9]+(\.[0-9]+)?\,[0-9]+\,[0-9]+\,[0-9]+\,[0-9]+\,[0-9]+\,.*\,.*\,.*\,.*\,[0-9]+(\.[0-9]+)?\,[0-9]+\,[0-9]+\,[0-9]+\,[0-9]+\,[0-9]+\,.*\,.*\,.*$ ]] && echo matches

2) This hangs and execution wont complete !!!:

[[ "str1,1.1,5,6,7,8,9,str2,str3,str4,str5,str6,str7,str8,str9,str10,str11,2.0,str12,0.0,5.0,str13,12312545645,45456456478,78979754545,12312545645,45456456478,78979754545,78979754545,4.74,0.1245,4.174,0.4245,6,80,str14,str15" =~ ^.*\,[0-9]+(\.[0-9]+)?\,[0-9]+\,[0-9]+\,[0-9]+\,[0-9]+\,[0-9]+\,.*\,.*\,.*\,.*\,.*\,.*\,.*\,.*\,.*\,.*\,[0-9]+(\.[0-9]+)?\,.*\,[0-9]+(\.[0-9]+)?\,[0-9]+(\.[0-9]+)?\,.*\,[0-9]+\,[0-9]+\,[0-9]+\,[0-9]+\,[0-9]+\,[0-9]+\,[0-9]+\,[0-9]+(\.[0-9]+)?\,[0-9]+(\.[0-9]+)?\,[0-9]+(\.[0-9]+)?\,[0-9]+(\.[0-9]+)?\,[0-9]+\,[0-9]+\,[0-9]+(\.[0-9]+)?\,.*\,.*$ ]] && echo matches

I thought .* is too generic then tried [^,]* but nothing changed.

Please advice how can I solve this without splitting by "," once then compare one by one.


!!! Correction !!!

Above I stated:

STR="." # --> (also tried "[^,]" but no change)

This is wrong. Noticed that, I failed to replace all of them. When I replace all .* to [^,] problem is resolved. See below:

3) This is fixed version and working as expected:

[[ "str1,1.1,5,6,7,8,9,str2,str3,str4,str5,str6,str7,str8,str9,str10,str11,2.0,str12,0.0,5.0,str13,12312545645,45456456478,78979754545,12312545645,45456456478,78979754545,78979754545,4.74,0.1245,4.174,0.4245,6,80,1,str15,str16" =~ ^[^,]*\,[0-9]+(\.[0-9]+)?\,[0-9]+\,[0-9]+\,[0-9]+\,[0-9]+\,[0-9]+\,[^,]*\,[^,]*\,[^,]*\,[^,]*\,[^,]*\,[^,]*\,[^,]*\,[^,]*\,[^,]*\,[^,]*\,[0-9]+(\.[0-9]+)?\,[^,]*\,[0-9]+(\.[0-9]+)?\,[0-9]+(\.[0-9]+)?\,[^,]*\,[0-9]+\,[0-9]+\,[0-9]+\,[0-9]+\,[0-9]+\,[0-9]+\,[0-9]+\,[0-9]+(\.[0-9]+)?\,[0-9]+(\.[0-9]+)?\,[0-9]+(\.[0-9]+)?\,[0-9]+(\.[0-9]+)?\,[0-9]+\,[0-9]+\,[0-9]+(\.[0-9]+)?\,[^,]*\,[^,]*$ ]] && echo matches

Watch out for Catastrophic Backtracking that I learned from this issue.

1

There are 1 answers

5
Liu On

Sorry I am not able to comment since my reputation is lower than 50. :(

Will the following regex work for you?

^([A-Za-z0-9\s\.]+\,){37}[A-Za-z0-9\s\.]+$