Perl: Iterating through large hash, runs out of memory


I have been trying to find the values that appear in both of two columns (column a and column b) of a large file and print those common values together with the corresponding column d. I have been doing this by iterating through hashes; however, because the file is so large, there is not enough memory to produce the output file. Is there any other way to do the same thing using less memory?

Any help is much appreciated.

The script I have written thus far is below:

#!/usr/bin/perl
use warnings;
use strict;

open (FILE1, "<input.txt") || die "$!\n Couldn't open input.txt\n";
open (Output, ">output.txt")||die "Can't Open output.txt ";
my $hash1={};
my $hash2={};

while (<FILE1>) {
    chomp (my $line=$_);
    my ($a, $b, $c, $d) = split (/\t/, $line);

    if ($a) {
        $hash1->{$a}{info1} = "$d"; #original_ID-> YOB
    }
    if ($b) {
        $hash2->{$b}{info2} = "$a"; #original_ID-> sire
    }

    foreach my $key (keys %$hash2) {
        if (exists $hash1->{$key}) {
            my $info1 = $hash1->{$key}{info1};
            print Output "$key\t$info1\n";
        }
    }
}

close FILE1;
close Output;
print "Done\n";

To clarify, the input file is a large pedigree file. An example is:

1    2   3   1977
2    4   5   1944
3    4   5   1950
4    5   6   1930
5    7   6   1928

An example of the output file is:

2   1944
4   1950
5   1928
1

1 Answer

Georgi Rangelov:

Does the below work for you? It uses DBM::Deep to keep the ID-to-year hash in a file on disk (foo.db) rather than in RAM, so only the list of column-b values has to be held in memory.

#!/usr/local/bin/perl

use strict;
use warnings;
use DBM::Deep;
use List::MoreUtils qw(uniq);

my @seen;

# Disk-backed hash: the data lives in foo.db instead of in memory.
my $db = DBM::Deep->new(
    file => "foo.db",
    autoflush => 1
);

while (<>) {
    chomp;
    my @fields = split /\s+/;
    $$db{$fields[0]} = $fields[3];   # map column a (ID) to column d (year)
    push @seen, $fields[1];          # remember every column b value
}

# Print each distinct column b value that also appears in column a,
# together with its year from the disk-backed hash.
for (uniq @seen) {
    print $_ . " " . $$db{$_} . "\n" if exists $$db{$_};
}