Using regex in Perl to extract a substring or line from a blob of text

I have a variable with a blob of text in it:

$foo = "
    Garbage directory
    /test/this/is/a/directory
    /this/is/another/foo\nThisd is is\nDrop stuff testing\nRandom stuff emacs is great";

How do I use a regex to get the line that is /test/this/is/a/directory?

I tried this:

my $foo = "
    Garbage directory
    /test/this/is/a/directory
    /this/is/another/foo\nThisd is is\nDrop stuff testing\nRandom stuff emacs is great";
$foo =~ /^\/test.*$/;
print "\n$foo\n";

But this just keeps on printing the whole blob of text.

There are 3 answers

4
Paul Carlton (best answer)

Your regular expression should be:

/\/test.*\n/

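The reason is that, without the m modifier, ^ matches only at the very start of the entire text, not at the start of each line, so your anchored pattern never finds the /test line (which may also be preceded by whitespace). Dropping the ^ lets the pattern match anywhere in the text. Note that . does not match a newline by default, so .* already stops at the end of the line; the trailing \n in this regular expression just means the newline is included in the match.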

With regular expressions there are different ways to do this, so it depends on what you're trying to accomplish. You could add the m modifier at the end. This treats the string as multiple lines, so ^ and $ match at the start and end of each line instead of only at the start and end of the entire text. A pattern that ends with .* or with $ under the m modifier also does not include the newline in the match.

/\/test.*/m would suffice.
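
To make the effect of the m modifier concrete, here is a minimal sketch that runs the same anchored pattern with and without it. It assumes the sample text is indented with four spaces, as it appears in the question; the variable names $plain and $multiline are just for illustration.

```perl
use strict;
use warnings;

# Shortened version of the sample text from the question
# (assumption: each line is indented with four spaces).
my $foo = "\n    Garbage directory\n    /test/this/is/a/directory\n    /this/is/another/foo\n";

# Without m, ^ can only match at the very start of the string.
my $plain     = ($foo =~ /^\s*\/test.*$/)  ? "match" : "no match";

# With m, ^ and $ also match at the start and end of every line.
my $multiline = ($foo =~ /^\s*\/test.*$/m) ? "match" : "no match";

print "without m: $plain\n";
print "with m:    $multiline\n";
```

The first pattern fails because the only place ^ can anchor is before the leading newline, and the text after the optional whitespace there is "Garbage", not "/test".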

For more info: https://perldoc.perl.org/perlre.html

Furthermore, print "$foo"; will not print the match: in scalar context the =~ operator only reports whether the match succeeded and does not assign the matched text back to the variable. You'll need to add a capture group to the regex and print the first capture, $1:

if ($foo =~ /(\/test.*)/m) {
    print $1;   # $1 is only meaningful when the match succeeded
}
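
As a variation on the capture approach, a match in list context returns its captured groups, so the result can be assigned directly instead of going through $1. This is a sketch under the same assumption about the sample text; $line is a name chosen here for illustration.

```perl
use strict;
use warnings;

# Shortened version of the sample text from the question.
my $foo = "\n    Garbage directory\n    /test/this/is/a/directory\n    /this/is/another/foo\n";

# In list context a successful match returns the captured groups;
# on failure the list is empty and $line stays undef.
my ($line) = $foo =~ /(\/test.*)/;

print defined $line ? "$line\n" : "no match\n";
```

Checking defined $line avoids accidentally printing a stale $1 left over from an earlier successful match.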
2
Jan

Change your expression to

$foo =~ m~^\s*/test.*$~m;

See a demo on regex101.com.


This uses a different delimiter (~) so that you don't need to escape the /, allows for leading whitespace (\s*), and turns on multiline mode (m).
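
To illustrate the delimiter point, the sketch below writes the same pattern three ways. The capture group is added here only so the results can be compared; it is not part of the expression above, and the variable names are hypothetical.

```perl
use strict;
use warnings;

# Shortened version of the sample text from the question.
my $foo = "\n    Garbage directory\n    /test/this/is/a/directory\n    /this/is/another/foo\n";

# Same pattern, three delimiters. Only the default / delimiter
# forces the slash inside the pattern to be escaped.
my ($slashes) = $foo =~ /^\s*(\/test.*)$/m;
my ($tildes)  = $foo =~ m~^\s*(/test.*)$~m;
my ($braces)  = $foo =~ m{^\s*(/test.*)$}m;

print "$_\n" for $slashes, $tildes, $braces;
```

All three forms capture the same line; the choice of delimiter is purely about readability when the pattern itself contains slashes.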

1
merlin2011

The OP seems to want the specified line to be printed, rather than the whole blob of text. For this, we need to modify Jan's answer to capture and extract the actual match.

my $foo = "
    Garbage directory
    /test/this/is/a/directory
    /this/is/another/foo\nThisd is is\nDrop stuff testing\nRandom stuff emacs is great";
# Keep \s* outside the capture group so $1 holds the path without leading whitespace.
if ($foo =~ m~^\s*(/test.*)$~m) {
    $foo = $1;
}
print "\n$foo\n";

Output:

/test/this/is/a/directory