Ordered hash of hashes - setting and accessing key/value pairs

I want to implement an ordered hash where the value of each key/value pair is another nested hash. I am unable to do so. I am not getting any errors, but nothing is being printed.

use Hash::Ordered;
use constant { lead_ID => 44671 , lag_ID => 11536 , start_time => time };

my $dict_lead=Hash::Ordered->new;
my $dict_lag=Hash::Ordered->new;

open(my $f1,"<","tcs_07may_nse_fo") or die "cant open input file";
open(my $f2,">","bid_ask_".&lead_ID) or die "cant open output file";
open(my $f3,">","ema_data/bid_ask_".&lag_ID) or die "cant open output file";

while(my $line =<$f1>){
    my @data=split(/,/,$line);
    chomp(@data);
    my ($tstamp, $instr) = (int($data[0]), $data[1]); 
    if($instr==&lead_ID){
      $dict_lead->set($tstamp=>{"bid"=>$data[5],"ask"=>$data[6]});
    }
    if($instr==&lag_ID){
      $dict_lag->set($tstamp=>{"bid"=>$data[5],"ask"=>$data[6]});
    }
}
close $f1;
foreach my $key ($dict_lead->keys){
    my $spread=$dict_lead{$key}{"ask"}-$dict_lead{$key}{"bid"};
    %hash=$dict_lead->get($key);
    print $key.",".$hash{"ask"}."\n";
    print $f2 $key.",".$dict_lead{$key}{"bid"}.","
          .$dict_lead{$key}{"ask"}.",".$spread."\n";
}
foreach my $key ($dict_lag->keys){
    my $spread=$dict_lag{$key}{"ask"}-$dict_lag{$key}{"bid"};
    print $f3 $key.",".$dict_lag{$key}{"bid"}.","
          .$dict_lag{$key}{"ask"}.",".$spread."\n";
}
close $f2;
close $f3;
print "Ring destroyed in " , time() - &start_time   , " seconds\n";

The output printed on my terminal is :

1430992791,  
1430992792,  
1430992793,  
1430992794,  
1430992795,  
1430992796,  
1430992797,  
1430992798,  
1430992799,  
Ring destroyed in 24 seconds

I can see from the first column of the output that I am able to insert the keys into the ordered hash. But I don't understand how to insert another hash as the value for those keys. Also, how would I access those values while iterating through the keys of the hash?

The output in the file corresponding to file handle $f2 is:

1430970394,,,0  
1430970395,,,0  
1430970396,,,0  
1430970397,,,0  
1430970398,,,0  
1430970399,,,0  
1430970400,,,0  

There are 2 answers

G. Cito On

With ordered hashes constructed using Hash::Ordered, the hash is an object. Those objects have properties (e.g. an index; if you examine a Hash::Ordered object it will have more than just hash elements inside of it) and they provide methods for you to manipulate and access their data. So you need to use the supplied methods, such as set, to access the hash, as you do in this line:

$dict_lead->set($tstamp=>{"bid"=>$data[5],"ask"=>$data[6]});

where you create a key using the scalar $tstamp and then associate it with an anonymous hash as its value.

But while you are using Hash::Ordered objects, your script also makes use of a plain data structure (%hash) that you populate using $dict_lead->get($key) in your first foreach loop. All the normal techniques, idioms and rules for adding keys to a hash still apply in this case. You don't want to repeatedly copy the nested hash out of the $dict_lead Hash::Ordered object into %hash here; you want to add the nested hash to %hash and associate it with a unique key.
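To see why the original %hash=$dict_lead->get($key); line leaves $hash{"ask"} empty, here is a minimal sketch in core Perl. It relies on the fact that get returns the stored value, which in the question is a hash reference. The $quote variable and its prices are made-up stand-ins for that return value, not data from the question.

```perl
use strict;
use warnings;

# Hypothetical stand-in for what $dict_lead->get($key) returns:
# a reference to the nested hash, not a list of key/value pairs.
my $quote = { bid => 2520.5, ask => 2521.25 };

# Wrong: assigning a single reference to a hash. The reference is
# stringified into one key (something like "HASH(0x...)") whose value
# is undef, so there is no "ask" key to look up afterwards.
my %wrong;
{
    no warnings 'misc';    # "Reference found where even-sized list expected"
    %wrong = $quote;
}
my $wrong_ask = $wrong{ask};

# Right: dereference the hash reference to copy its key/value pairs.
my %copy      = %$quote;
my $right_ask = $copy{ask};

print "wrong: ", ( defined $wrong_ask ? $wrong_ask : '(undef)' ), "\n";
print "right: $right_ask\n";
```

With warnings enabled and not silenced, Perl reports the faulty assignment, which is one reason to start scripts like this with use strict and use warnings.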

Without sample data to test or a description of the expected output to compare against it is difficult to know for sure, but you probably just need to change:

 %hash=$dict_lead->get($key);

to something like:

  $hash{$key} = $dict_lead->get($key); 

to populate your temporary %hash correctly. Then, since each key's value is a nested anonymous hash, you would also change print $key.",".$hash{"ask"}."\n"; to:

  print $key.",".$hash{$key}{"ask"}."\n"

There are other ways to "deeply" copy part of one nested data structure to another (see the Stack Overflow reference below) and you may be able to avoid using the temporary variable altogether, but these small changes might be all that is necessary in your case.
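As a rough sketch of skipping the temporary %hash, the loop can work directly with the hash reference that get returns. This assumes Hash::Ordered is installed and uses only its new, set, get and keys methods. The sample rows and the @lines array are invented for illustration and stand in for the parsed input file and the output file handle.

```perl
use strict;
use warnings;
use Hash::Ordered;

my $dict_lead = Hash::Ordered->new;

# Made-up rows (timestamp, bid, ask) standing in for the parsed file.
for my $row ( [ 1430970394, 100.5, 101.0 ], [ 1430970395, 100.75, 101.5 ] ) {
    my ( $tstamp, $bid, $ask ) = @$row;
    $dict_lead->set( $tstamp => { bid => $bid, ask => $ask } );
}

my @lines;
for my $key ( $dict_lead->keys ) {
    my $quote  = $dict_lead->get($key);    # hash reference stored by set()
    my $spread = $quote->{ask} - $quote->{bid};
    push @lines, join ',', $key, $quote->{bid}, $quote->{ask}, $spread;
}

print "$_\n" for @lines;
```

The same pattern should apply to $dict_lag. Note that $dict_lead{$key} in the question refers to an unrelated plain hash named %dict_lead, which is why those lookups came back empty.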


In general, in order to "insert another hash as a value for ... keys" you need to use a reference or an anonymous hash constructor ({ k => "v" , ... }). So e.g. to add one key:

use Data::Dump qw(dd);

my %sample_hash ;
$sample_hash{"key_0"} = { bid => "1000000" , timestamp => 1435242285 }; 
dd %sample_hash ;

Output:

("key_0", { bid => 1000000, timestamp => 1435242285 })

To add multiple keys from one hash to another:

my %to_add = ( key_1 => { bid => "1500000" , timestamp => 1435242395 }, 
               key_2 => { bid => "2000000" , timestamp => 1435244898 } );

for my $key ( keys %to_add ) {  
   $sample_hash{$key} = $to_add{$key}  
}

dd %sample_hash ;

Output:

(
  "key_1",
  { bid => 1500000, timestamp => 1435242395 },
  "key_0",
  { bid => 1000000, timestamp => 1435242285 },
  "key_2",
  { bid => 2000000, timestamp => 1435244898 },
)

References

Borodin On

First of all, I don't see why you want to use a module that keeps your hash in order. I presume you want your output ordered by the timestamp fields, and the data that you are reading from the input file is already ordered like that, but it would be simple to sort the keys of an ordinary hash and print the contents in order without relying on the incoming data being presorted
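As an illustration of that point, here is a small sketch that stores the quotes in an ordinary hash and sorts the timestamp keys numerically at output time. The %quotes data is made up and deliberately inserted out of order. It is not the questioner's real data.

```perl
use strict;
use warnings;

# Made-up quotes keyed by timestamp, deliberately out of order.
my %quotes = (
    1430970396 => { bid => 100.75, ask => 101.25 },
    1430970394 => { bid => 100.5,  ask => 101.0 },
    1430970395 => { bid => 100.25, ask => 101.0 },
);

# Numeric sort of the keys gives chronological order regardless of
# the order in which the rows were read.
my @ordered = sort { $a <=> $b } keys %quotes;

for my $tstamp (@ordered) {
    my $quote  = $quotes{$tstamp};
    my $spread = $quote->{ask} - $quote->{bid};
    print join( ',', $tstamp, $quote->{bid}, $quote->{ask}, $spread ), "\n";
}
```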

You have read an explanation of why your code isn't behaving as it should. This is how I would write a solution that hopefully behaves properly (although I haven't been able to test it beyond checking that it compiles)

Instead of a hash, I have chosen to use a two-element array to contain the ask and bid prices for each timestamp. That should make the code run fractionally faster as well as making it simpler and easier to read

It's also noteworthy that I have added use autodie, which makes perl check the status of IO operations such as open and chdir automatically and removes the clutter caused by coding those checks manually. I have also defined a constant for the path to the root directory of the files, and used chdir to set the working directory there. That removes the need to repeat that part of the path and reduces the length of the remaining file path strings

#!/usr/bin/perl

use strict;
use warnings;
use 5.010;
use autodie;

use Hash::Ordered;

use constant DIR     => '../tcs_nse_fo_merged';
use constant LEAD_ID => 44671;
use constant LAG_ID  => 11536;

chdir DIR;

my $dict_lead = Hash::Ordered->new;
my $dict_lag  = Hash::Ordered->new;

{
    open my $fh, '<', 'tcs_07may_nse_fo';

    while ( <$fh> ) {

        chomp;
        my @data = split /,/;

        my $tstamp = int $data[0];
        my $instr  = $data[1];

        if ( $instr == LEAD_ID ) {
            $dict_lead->set( $tstamp => [ @data[5,6] ] );
        }
        elsif ( $instr == LAG_ID ) {
            $dict_lag->set( $tstamp => [ @data[5,6] ] );
        }
    }
}

{
  my $file = 'ema_data/bid_ask_' . LEAD_ID;
  open my $out_fh, '>', $file;

  for my $key ( $dict_lead->keys ) {
      my $val = $dict_lead->get($key);
      my ($bid, $ask) = @$val;    # stored as [ @data[5,6] ], i.e. bid then ask
      my $spread = $ask - $bid;
      print join(',', $key, $ask), "\n";
      print $out_fh join(',', $key, $bid, $ask, $spread), "\n";
  }
}

{
  my $file = 'ema_data/bid_ask_' . LAG_ID;
  open my $out_fh, '>', $file;

  for my $key ( $dict_lag->keys ) {
      my $val = $dict_lag->get($key);
      my ($bid, $ask) = @$val;    # stored as [ @data[5,6] ], i.e. bid then ask
      my $spread = $ask - $bid;
      print $out_fh join(',', $key, $bid, $ask, $spread), "\n";
  }
}

printf "Ring destroyed in %d seconds\n", time - $^T;