Removing code duplication in detecting EOR vs EOF

247 views Asked by At

I am looping through multi-line records and loading them into an array. I happen to be using Perl , but language is irrelevant as I am looking for an optimization to the algorithm. Specifically, I am bothered by the fact that I am writing the array push twice. Once in the loop when I find an end of record (eor) and again when I run out of file (eof, not eor). I know this doesn't affect the speed, I just don't like have to repeat the code in two places. It means that if it changes, I have to modify in two places.

The approach I am taking is this:

my $data = []; #data object array
my $record = {};
my $line;
while (my $line = <$file>){
    if($line =~ /marker-a:(.*)/){
        # Update data object
        $$record{'a'} = $1;
    }
    if($line =~ /marker-b:(.*)/){
        # Update data object
        $$record{'b'} = $1;
    }
    if($line =~ /eor/){
        # End of record; add to data array
        push(@$data,$record);
        $record = {};
    }
}
#Update leftover data to data array
push(@$data,$record);

Is there a better way to do this? I know I could just create a function, but I am looking for something more elegant. I haven't tested this code, but it should give enough of an idea of what I am doing. Let me know if any questions.

2

There are 2 answers

3
Borodin On BEST ANSWER

All that is necessary is to change

if ( $line =~ /eor/ )

to

if ( $line =~ /eor/ or eof )

and remove the push outside the loop


Update

Here's a more complete solution that uses best practices and avoids pushing empty records onto the array

my ($data, $record);

while ( <$file> ) {

    if ( /marker-([ab]):(.*)/ ) {
        $record->{$1} = $2;
    }

    if ( ( /eor/ or eof ) and $record ) {
        push @$data, $record;
        $record = undef;
    }
}
1
ikegami On

So you want EOR and EOF to be handled at the same place. That means that checking for EOF can't be done at the top of the loop anymore. The trick to moving the condition into the loop is to switch to an infinite loop.

my $data = [];
my $record = {};
while (1) {
    my $line = <$file>;
    if (!defined($line) || $line =~ /eor/) {
        push(@$data, $record) if keys(%$record);
        last if !defined($line);
        $record = {};
    }
    elsif ($line =~ /marker-a:(.*)/) {
        $record->{a} = $1;
    }
    elsif ($line =~ /marker-b:(.*)/) {
        $record->{b} = $1;
    }
}