I'm trying the following: I have a string that can look like this: 'a, b, (c, d, (e, f), g), (h, i)', and I want to split it only at the commas on the first (outermost) layer, i.e. those not inside any parentheses:
a b (c, d, (e, f), g) (h, i)
I just can't figure out how to do this. The logical solution I came up with is to find the commas that have the same number of opening and closing brackets behind them. How can I implement this with regular expressions?
Best Regards
Here are a couple of options:
Option 1: If your data has a consistent pattern of commas and parentheses across rows, you can parse it quite easily with a regex. The downside is that if your pattern changes, you have to change the regex, but it is quite fast (even for very large cell arrays):
And the result for this example:
Note that I used the pattern [-\d\.]+ to match an arbitrary number which may have a negative sign or decimal point.
Option 2: You can use regexprep to repeatedly remove pairs of parentheses that don't contain other parentheses, replacing them with whitespace to maintain the same size string. Then, find the positions of the commas in the final processed string and break up the original string using these positions. You won't have to change the regex for each new pattern of commas and parentheses, but this will be a little slower than the above (but still only taking a second or two for arrays of up to 15,000 cells):
This gives the same result as Option 1.
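For readers who want to try the idea behind Option 2 on the string from the question, here is a minimal sketch in Python. It is not the code from this answer: Python's re.sub stands in for regexprep, the function name split_top_level is made up for illustration, and the sketch assumes the parentheses in the input are balanced.

```python
import re


def split_top_level(text):
    """Split text at commas that are not inside any parentheses.

    Hypothetical helper for illustration; assumes balanced parentheses.
    """
    # An innermost pair: parentheses whose contents hold no other parentheses.
    innermost = re.compile(r"\([^()]*\)")
    masked = text
    while True:
        # Blank out each innermost pair with spaces of equal length so that
        # character positions still line up with the original string.
        blanked = innermost.sub(lambda m: " " * len(m.group()), masked)
        if blanked == masked:
            break
        masked = blanked
    # Commas that survive the masking are the top-level ones.
    cuts = [i for i, ch in enumerate(masked) if ch == ","]
    pieces = []
    start = 0
    for cut in cuts + [len(text)]:
        # Slice the ORIGINAL string, not the masked copy.
        pieces.append(text[start:cut].strip())
        start = cut + 1
    return pieces


print(split_top_level("a, b, (c, d, (e, f), g), (h, i)"))
# prints ['a', 'b', '(c, d, (e, f), g)', '(h, i)']
```

Each pass removes one level of nesting, so the loop runs once per nesting depth plus a final pass that detects no change. With unbalanced input, a stray parenthesis is never masked and commas next to it are treated as top-level.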