How to find strings between two specified texts

I have HTML code like this:

<html>
<body>
<a href="one">frist</a>
<a href="two">second</a>
<a href="three">third</a>
<a href="four">fourth</a>
</body>
</html>

I want to write a Perl script that reads this code and prints the string between

<a href="

and

">

For this code, the output would be:

one
two
three
four

How can I do that? Sorry for my bad English.

There are 2 answers

serenesat On

This is not the right way to parse an HTML file, but for your given data this code will give you the desired output:

use warnings;
use strict;

my $file = $ARGV[0];
open my $fh, "<", $file or die $!;

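while ( my $line = <$fh> ) {
    # [^"]+ stops at the closing quote, so the capture cannot run past it;
    # the inner /g loop picks up every link when a line holds more than one.
    while ( $line =~ m/<a href="([^"]+)"/g ) {
        print "$1\n";
    }
}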

data

<html>
<body>
<a href="one">frist</a>
<a href="two">second</a>
<a href="three">third</a>
<a href="four">fourth</a>
</body>
</html>

Output

one
two
three
four
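
The more general task in the question title, pulling out whatever sits between two fixed pieces of text, can be sketched with a non-greedy match. This is only a minimal illustration using core Perl: the helper name extract_between and the sample string are assumptions made for the example, and it is still a regex shortcut rather than real HTML parsing.

```perl
use strict;
use warnings;

# Hypothetical helper: return every substring found between $start and $end.
sub extract_between {
    my ( $text, $start, $end ) = @_;

    # \Q...\E quotes any regex metacharacters in the delimiters.
    # (.*?) is non-greedy, so each match stops at the nearest $end.
    # /g in list context returns all captures; /s lets "." cross newlines.
    my @found = $text =~ /\Q$start\E(.*?)\Q$end\E/gs;
    return @found;
}

# Sample input (assumed for illustration); note two links on the first line.
my $html = <<'HTML';
<a href="one">first</a> <a href="two">second</a>
<a href="three">third</a>
HTML

print "$_\n" for extract_between( $html, '<a href="', '">' );
```

Because the whole document is matched as one string, several links on the same line are all found, which a greedy (.+) per line would merge into one capture.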

Borodin On

Use HTML::LinkExtor like this:

use strict;
use warnings;

use HTML::LinkExtor;

my $extor = HTML::LinkExtor->new;
$extor->parse_file(\*DATA);

for ( $extor->links ) {
  my ($tag, $att, $val) = @$_;
  print $val, "\n" if $tag eq 'a' and $att eq 'href';
}

__DATA__
<html>
<body>
<a href="one">frist</a>
<a href="two">secnod</a>
<a href="three">thrid</a>
<a href="four">furoth</a>
</body>
</html>

output

one
two
three
four
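
If the HTML is already in a string rather than a file or the DATA handle, the same module can be fed through the parse and eof methods it inherits from HTML::Parser. A minimal sketch, assuming the HTML-Parser distribution is installed from CPAN and using a made-up sample string:

```perl
use strict;
use warnings;

use HTML::LinkExtor;

# Sample input (assumed for illustration).
my $html = <<'HTML';
<html>
<body>
<a href="one">first</a> <a href="two">second</a>
</body>
</html>
HTML

my $extor = HTML::LinkExtor->new;
$extor->parse($html);    # feed the document text to the parser
$extor->eof;             # signal end of input so parsing is completed

# Each link is an array ref: the tag name followed by attribute/value pairs.
for my $link ( $extor->links ) {
    my ( $tag, %attr ) = @$link;
    print "$attr{href}\n" if $tag eq 'a' and defined $attr{href};
}
```

Unpacking the pairs into a hash avoids relying on href being the first link attribute reported for a tag.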