Regex extraction from Right to Left

Question

Asked by Jotne on 08 October 2023 at 14:42

I have some data from which I would like to extract values from right to left. Sample data:

1,4,34
5,15
22

Expected output:

One=34  Two=4  Three=1
One=15  Two=5
One=22

This is as far as I have got with my regex experience.

(?:(?<three>\d+),)?(?:(?<two>\d+),)?(?<one>\d+)$

But this gives:

One=34  Two=4  Three=1
One=15  Three=5
One=22

So it fails when there are only two values to extract. Any good ideas? PS: I do not have any reverse tools available.

There are 5 answers

Vivick On 08 October 2023 at 14:56

^((?:(?<three>\d+),)(?:(?<two>\d+),)|(?:(?<two2>\d+),)?)(?<one>\d+)$ is the only potential solution I can think of. Since capture groups must all have different names, you end up with two "two" groups under different names (two and two2), and only one of them is set for any given line.
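A minimal Python sketch of how the two "two" groups could be merged after matching. It assumes Python's `re` module, so the named groups are written as `(?P<name>...)` instead of `(?<name>...)`; the helper name `extract` is made up for illustration and is not part of the answer above.

```python
import re

# Vivick's pattern, with named groups rewritten in Python's (?P<name>...) form.
PATTERN = re.compile(
    r"^((?:(?P<three>\d+),)(?:(?P<two>\d+),)|(?:(?P<two2>\d+),)?)(?P<one>\d+)$"
)

def extract(line):
    """Return the fields found in line, merging 'two2' into 'two' (hypothetical helper)."""
    m = PATTERN.match(line)
    if m is None:
        return None
    groups = m.groupdict()
    # Only one of 'two' / 'two2' takes part in a given match, so pick whichever is set.
    two = groups["two"] or groups["two2"]
    fields = {"one": groups["one"], "two": two, "three": groups["three"]}
    # Drop fields that did not take part in the match.
    return {name: value for name, value in fields.items() if value is not None}

for sample in ("1,4,34", "5,15", "22"):
    print(sample, "=>", extract(sample))
```

The merge step is the post-processing cost of this approach: the regex alone cannot deliver a single "two" field.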

sln On 08 October 2023 at 20:41

Naming groups in the reverse order is OK.
If you want to match in reverse order, this is a direct way.

This is a template regex that can be expanded as needed. It scans the string left to
right (LTR) but captures the values from last to first in ascending group order.

This removes the post-processing steps.

For example, these strings produce these match arrays:

1,4,34 => [34,4,1]
5,15 => [15,5]
22 => [22]

https://regex101.com/r/uo04VM/1

^(?=(?&D_n){0,2}(\d+)$)(?=(?:(?&D_n){0,1}(\d+)(?&n_D)$)?)(?=(?:(\d+)(?&n_D){2}$)?).+$(?(DEFINE)(?<D_n>\d+[^\d\r\n]+)(?<n_D>[^\d\r\n]+\d+))

Expanded

^
(?=
   (?&D_n){0,2}
   ( \d+ )                       # (1)
   $
)
(?=
   (?:
      (?&D_n){0,1}
      ( \d+ )                       # (2)
      (?&n_D) $
   )?
)
(?=
   (?:
      ( \d+ )                       # (3)
      (?&n_D){2} $
   )?
)
.+ $
(?(DEFINE)
   (?<D_n> \d+ [^\d\r\n]+ )      # (4)
   (?<n_D> [^\d\r\n]+ \d+ )      # (5)
)
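For engines without subroutine calls, the same lookahead idea can be sketched with the two helper sub-patterns inlined. The version below is an adaptation for Python's standard `re` module, which does not support `(?&name)` or `(?(DEFINE)...)`; the constant names `D_N` and `N_D` and the function `last_to_first` are illustrative, not part of the regex above.

```python
import re

# Inlined stand-ins for the two DEFINE subroutines (an adaptation, not the exact regex above).
D_N = r"(?:\d+[^\d\r\n]+)"   # a number followed by a separator
N_D = r"(?:[^\d\r\n]+\d+)"   # a separator followed by a number

PATTERN = re.compile(
    rf"^(?={D_N}{{0,2}}(\d+)$)"            # group 1: last number
    rf"(?=(?:{D_N}{{0,1}}(\d+){N_D}$)?)"   # group 2: second from last, if present
    rf"(?=(?:(\d+){N_D}{{2}}$)?)"          # group 3: third from last, if present
    r".+$"
)

def last_to_first(line):
    """Return the captured numbers in group order, i.e. last value first."""
    m = PATTERN.match(line)
    if m is None:
        return None
    # Groups that did not participate in the match are None and are skipped.
    return [g for g in m.groups() if g is not None]

for sample in ("1,4,34", "5,15", "22"):
    print(sample, "=>", last_to_first(sample))
```

Each lookahead is anchored at the start of the string and counts separators up to the end, which is why the group numbers run from the rightmost value to the leftmost.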

warren On 17 October 2023 at 13:57

You want a variable list of field names extracted from delimited data in reverse order?

How many entries could you possibly have? Three? Five? Two hundred seventy four?

Are you trying to do this at search time (i.e. in the SPL you are writing/running), or in props.conf?

If you are trying to do this at search time, I would not try to use a regular expression at all - use split() (or makemv) and mvindex() (with negative indexing) to find the items you want:

...
| eval mvlist=split(delimited_field,",")
...
| eval three=mvindex(mvlist,-3)
...
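Outside Splunk, the same split-and-index-from-the-end idea can be sketched in Python. This is only an analogy to `split()` and `mvindex()` with negative indexes, not SPL; the function name `fields_from_right` and its default field names are assumptions chosen to match the question.

```python
def fields_from_right(value, names=("one", "two", "three"), sep=","):
    """Name the delimited entries starting from the rightmost one (hypothetical helper)."""
    parts = value.split(sep)
    # zip() stops at the shorter input, so missing fields are simply absent
    # and entries beyond len(names) are ignored.
    return dict(zip(names, reversed(parts)))

for sample in ("1,4,34", "5,15", "22"):
    print(sample, "=>", fields_from_right(sample))
```

Because no regex is involved, supporting more entries only means passing a longer `names` tuple.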

Jotne On 21 October 2023 at 09:18

To avoid using regex from right to left, I found a way to reverse the string.

Sed by itself seems to be limited to 9 numbered back references.

echo "AbCdEfG" | sed  -r 's/(.)(.)?(.)?(.)?(.)?(.)?(.)?/\7\6\5\4\3\2\1/'
GfEdCbA

But sed mode in Splunk's rex does not have this limit (not that I need so many), so:

| makeresults 
| eval test="abcdefghijkl"
| rex mode=sed field=test "s/(.)(.)?(.)?(.)?(.)?(.)?(.)?(.)?(.)?(.)?(.)?(.)?/\12\11\10\9\8\7\6\5\4\3\2\1/"

gives: test=lkjihgfedcba

Then using regex from left to right works fine.
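A rough Python sketch of the reverse-then-match idea, applied to the sample data from the question. One detail it makes explicit: reversing the whole line also reverses the digits inside each number ("34" becomes "43"), so each captured value is reversed back. The pattern and the function name `extract_via_reverse` are assumptions for illustration, not the SPL used above.

```python
import re

# Plain left-to-right pattern, applied to the reversed line.
PATTERN = re.compile(r"^(?P<one>\d+)(?:,(?P<two>\d+))?(?:,(?P<three>\d+))?")

def extract_via_reverse(line):
    """Reverse the line, match left to right, then restore each captured value."""
    m = PATTERN.match(line[::-1])
    if m is None:
        return None
    # The digits of each number were reversed along with the line, so flip them back.
    return {name: value[::-1] for name, value in m.groupdict().items() if value is not None}

for sample in ("1,4,34", "5,15", "22"):
    print(sample, "=>", extract_via_reverse(sample))
```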

**The fourth bird** · Accepted Answer · 2023-10-08T15:28:17+00:00

You can make the first 2 groups optional as a whole:

^(?:(?:(?<three>\d+),)?(?<two>\d+),)?(?<one>\d+)$

The pattern matches:

^ Start of string
(?: Non capture group
- (?:(?<three>\d+),)? Optionally capture 1+ digits in group "three" and match a comma
- (?<two>\d+), Capture 1+ digits in group "two" and match a comma
)? Close the non capture group
(?<one>\d+) Capture 1+ digits in group "one"
$ End of string
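To see the difference between the pattern from the question and the nested version, here is a small Python sketch that runs both against the sample data. It assumes Python's `re` module, so the groups use the `(?P<name>...)` spelling; the names `ORIGINAL`, `NESTED` and `named_fields` are illustrative only.

```python
import re

# Pattern from the question: "three" and "two" are independently optional.
ORIGINAL = re.compile(r"(?:(?P<three>\d+),)?(?:(?P<two>\d+),)?(?P<one>\d+)$")
# Nested version: "three" can only match if "two" matches as well.
NESTED = re.compile(r"^(?:(?:(?P<three>\d+),)?(?P<two>\d+),)?(?P<one>\d+)$")

def named_fields(pattern, line):
    """Return only the named groups that took part in the match."""
    m = pattern.search(line)
    if m is None:
        return None
    return {name: value for name, value in m.groupdict().items() if value is not None}

for sample in ("1,4,34", "5,15", "22"):
    print(sample, "original:", named_fields(ORIGINAL, sample),
          "nested:", named_fields(NESTED, sample))
```

With two values, the original pattern fills "three" first because it is tried first and is allowed to succeed on its own; nesting it inside the optional group for "two" removes that possibility.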
