Regex Match text within a Capture Group

Sample Text:

- !ruby/object:DynamicAttribute
  attributes: 
    resource_id: "1"
    resource_type: Applicant
    string_value: "Michael"
    int_value: 
    id: "35972390"
    date_value: 
    name: first_name
  attributes_cache: {}

- !ruby/object:DynamicAttribute
  attributes: 
    resource_id: "1"
    resource_type: Applicant
    string_value: "Johnson"
    int_value: 
    id: "35533149"
    date_value: 
    name: last_name
  attributes_cache: {}

Target:

I'm trying to extract the value after "string_value" from the record whose "name" equals a given string, say last_name. The attributes are not in any particular order. I've explored using capture groups, but I did not get very far.

Any help on this would be appreciated. Thanks!

There is 1 answer

Mustofa Rizwan On

You can try this regex:

string_value:(?=(?:(?!attributes_cache).)*name: last_name)\s+\"(\w+)\".*?attributes_cache

Explanation

  1. string_value: matches the literal characters string_value:
  2. The positive lookahead (?=(?:(?!attributes_cache).)*name: last_name) checks that name: last_name appears later in the same record. The inner (?!attributes_cache) stops the scan at attributes_cache, so the lookahead cannot run into the next record and pick up its name: last_name. Because a record spans several lines, the dot must be allowed to match newlines (the /m flag in Ruby).
  3. \s+ matches one or more whitespace characters (equivalent to [\r\n\t\f\v ]). The + quantifier is greedy: it matches as many characters as possible and gives some back only if needed.
  4. \" matches a literal " character.
  5. The first capturing group (\w+) matches one or more word characters (equivalent to [a-zA-Z0-9_]). This is the text that you want to capture.
  6. The second \" matches the closing quote, and .*?attributes_cache lazily consumes the rest of the record, so the next match starts in the following record.

Capture group 1 contains the text that you are looking for.
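
To see why the attributes_cache guard inside the lookahead matters, here is a minimal sketch that compares the guarded lookahead with a plain .* lookahead on a shortened two-record sample. The variable names and the trimmed records are only for illustration, and the trailing .*?attributes_cache is left out to keep the comparison focused on the lookahead.

```ruby
# Two shortened records: the first is first_name, the second is last_name.
text = <<~YAML
  - !ruby/object:DynamicAttribute
    attributes:
      string_value: "Michael"
      name: first_name
    attributes_cache: {}

  - !ruby/object:DynamicAttribute
    attributes:
      string_value: "Johnson"
      name: last_name
    attributes_cache: {}
YAML

# Guarded lookahead: the search for "name: last_name" stops at attributes_cache.
guarded = /string_value:(?=(?:(?!attributes_cache).)*name: last_name)\s+"(\w+)"/m

# Unguarded lookahead: .* is free to run into the next record.
unguarded = /string_value:(?=.*name: last_name)\s+"(\w+)"/m

# scan returns one array of capture groups per match; flatten to plain strings.
guarded_values = text.scan(guarded).flatten
unguarded_values = text.scan(unguarded).flatten

puts guarded_values.inspect   # prints ["Johnson"]
puts unguarded_values.inspect # prints ["Michael", "Johnson"]
```

The unguarded pattern also returns "Michael" because its lookahead finds the name: last_name that belongs to the second record.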

You haven't said which programming language you are using, so the following sample is written in Ruby:

re = /string_value:(?=(?:(?!attributes_cache).)*name: last_name)\s+\"(\w+)\".*?attributes_cache/m
str = '- !ruby/object:DynamicAttribute
  attributes: 
    resource_id: "1"
    resource_type: Applicant
    string_value: "Johnson1"
    int_value: 
    id: "35533149"
    date_value: 
    name: last_name
  attributes_cache: {}

- !ruby/object:DynamicAttribute
  attributes: 
    resource_id: "1"
    resource_type: Applicant
    string_value: "Michael"
    int_value: 
    id: "35972390"
    date_value: 
    name: first_name
  attributes_cache: {}

- !ruby/object:DynamicAttribute
  attributes: 
    resource_id: "1"
    resource_type: Applicant
    string_value: "Johnson2"
    int_value: 
    id: "35533149"
    date_value: 
    name: last_name
  attributes_cache: {}'

# Print the captured value from each match.
# scan yields an array of capture groups per match, so take the first element.
str.scan(re) do |match|
  puts match.first
end
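
One caveat: the lookahead only searches forward, so the regex above finds a value only when name: comes after string_value: within the record. Since the question says the attributes can appear in any order, the following is a rough sketch of a different technique (not the regex above): split the text into records first, then test each record separately. The helper name string_values_for and the shortened sample are hypothetical, and the sketch assumes every record starts with the "- !ruby/object:DynamicAttribute" header line.

```ruby
# Hypothetical helper: return the string_value of every record whose name
# field equals wanted_name, whatever order the fields appear in.
def string_values_for(text, wanted_name)
  # Split in front of each record header; the lookahead keeps the header text.
  records = text.split(/^(?=- !ruby\/object:DynamicAttribute)/)

  values = records.map do |record|
    # Skip records whose name field is something else.
    next unless record.match?(/^\s*name:\s*#{Regexp.escape(wanted_name)}\s*$/)

    # Capture group 1 holds the text between the quotes (nil if absent).
    record[/^\s*string_value:\s*"([^"]*)"/, 1]
  end

  # Drop the nil entries left by skipped records.
  values.compact
end

# Shortened sample: the first record lists name before string_value.
sample = <<~YAML
  - !ruby/object:DynamicAttribute
    attributes:
      name: last_name
      string_value: "Johnson"
    attributes_cache: {}

  - !ruby/object:DynamicAttribute
    attributes:
      string_value: "Michael"
      name: first_name
    attributes_cache: {}
YAML

puts string_values_for(sample, "last_name").inspect # prints ["Johnson"]
```

Working record by record keeps each pattern short and removes the need for the attributes_cache guard, at the cost of an extra pass over the text.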