quantify a captured regex

I am looking for file paths within scripts, so I plan to write a script that cats each file and looks for a "/".

I would rather use a Perl regex and grep out just the file paths.

foo@foohost:~ $ cat /sbcimp/dyn/data/FOO/GSD/scripts/FOOonoff.pl | grep "/"

#!/usr/bin/perl
my $output_file = "/sbcimp/dyn/data/stmFOO3/dailymetrics/PartRates/file6.csv";
my $input_file_name_ESTATE = "/sbcimp/dyn/sym/data/stmFOO3/part_rates/FOO_estate.$year$month1$day1.1630.csv";
my $input_file_name_ESTATE = "/sbcimp/data/stmFOO3/part_rates/FOO_estate.20140829.1630.csv";
my $input_file_name_ESTATE2 = "/sbcimp/part_rates/FOO_estate.$year$month1$day2.1630.csv";
my $input_file_name_ESTATE3 = "/sbcimp/FOO_estate.$year$month2$day3.1630.csv";
my $input_file_name_NEW = "/sbcimp/dyn/data/stmFOO3/dailymetrics/RiskTiers/new_terms.csv";
    $argVal =~ s/\s+$//;
    $argVal =~ s/^\s+//;
    $argVal =~ s/\"$//;
    $argVal =~ s/^\"//;
    $argVal =~ s/\'$//;
    $argVal =~ s/^\'//;

If I cat the file and pipe it through a Perl one-liner, I get just the root directory.

foo@foohost:~ $ cat /sbcimp/dyn/data/FOO/GSD/scripts/FOOonoff.pl | perl -nle 'print /(\/\w+\/)/' | sort -u

/sbcimp/

I understand regex quantifiers, but 'print /(\/\w+\/){1,9}/' does not give me "/\w+/" repeated anywhere from 1 to 9 times. The paths I am looking for can be one or many levels below the root. How do I quantify the whole captured group, not just the last character?

There is 1 answer

Answered by Miller

I recommend using PPI rather than a regular expression to parse Perl code.

The following parses the Perl lines that you provided for string literals, reduces them to just their base content, and then pulls out the path information:

use strict;
use warnings;

use PPI;
use File::Basename;

my $src = do {local $/; <DATA>};

# Load a document
my $doc = PPI::Document->new( \$src );

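# Find all the string literals within the doc (find returns false when there are none)
my $strings = $doc->find( 'PPI::Token::Quote' ) || [];
for (@$strings) {
    # Evaluate the literal to strip its quotes; undefined variables interpolate as empty.
    # eval runs any code embedded in the string, so use this only on trusted scripts.
    my $str = eval 'no strict; no warnings; '. $_->content;
    next if $@ || !defined $str || $str !~ /\//;

    # fileparse returns (filename, directory, suffix); only the directory is wanted
    my ($name, $path) = fileparse($str);

    print "$path\n";
}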

__DATA__
#!/usr/bin/perl
my $output_file = "/sbcimp/dyn/data/stmFOO3/dailymetrics/PartRates/file6.csv";
my $input_file_name_ESTATE = "/sbcimp/dyn/sym/data/stmFOO3/part_rates/FOO_estate.$year$month1$day1.1630.csv";
my $input_file_name_ESTATE = "/sbcimp/data/stmFOO3/part_rates/FOO_estate.20140829.1630.csv";
my $input_file_name_ESTATE2 = "/sbcimp/part_rates/FOO_estate.$year$month1$day2.1630.csv";
my $input_file_name_ESTATE3 = "/sbcimp/FOO_estate.$year$month2$day3.1630.csv";
my $input_file_name_NEW = "/sbcimp/dyn/data/stmFOO3/dailymetrics/RiskTiers/new_terms.csv";
    $argVal =~ s/\s+$//;
    $argVal =~ s/^\s+//;
    $argVal =~ s/\"$//;
    $argVal =~ s/^\"//;
    $argVal =~ s/\'$//;
    $argVal =~ s/^\'//;

Outputs:

/sbcimp/dyn/data/stmFOO3/dailymetrics/PartRates/
/sbcimp/dyn/sym/data/stmFOO3/part_rates/
/sbcimp/data/stmFOO3/part_rates/
/sbcimp/part_rates/
/sbcimp/
/sbcimp/dyn/data/stmFOO3/dailymetrics/RiskTiers/
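
On the literal regex question: a quantifier placed after a capture group does repeat the whole group, but the capture keeps only the last repetition. In addition, (\/\w+\/){1,9} requires every repetition to both start and end with a slash, so two repetitions in a row would need "//", which is why only /sbcimp/ comes back. One way around this, as a minimal sketch, is to quantify a non-capturing group and wrap it in a single outer capture. The sketch assumes that path components contain only \w characters, and the helper name extract_dirs is made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: return the directory part of every path-like
# substring in a line. Assumes components contain only \w characters.
sub extract_dirs {
    my ($line) = @_;
    # (?:/\w+)+ repeats "slash + component" one or more times without
    # capturing; the outer parentheses capture the whole repeated run
    # plus the trailing slash, not just the last repetition.
    return $line =~ m{((?:/\w+)+/)}g;
}

my $line = 'my $output_file = "/sbcimp/dyn/data/stmFOO3/dailymetrics/PartRates/file6.csv";';
print "$_\n" for extract_dirs($line);
# prints /sbcimp/dyn/data/stmFOO3/dailymetrics/PartRates/
```

The same pattern should fit the original pipeline as perl -nle 'print for m{((?:/\w+)+/)}g' piped into sort -u. It is still a plain text match, so it will also pick up things like the shebang line (/usr/bin/), which is one reason the PPI approach above is more reliable.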