Why does the output differ between the terminal window and the text file?

Question

115 views · Asked by Victor.H on 16 November 2018 at 09:05

I am trying to merge two files with Perl.

Code so far:

my $hash_ref;
open (my $I_fh, "<", "File1.txt") or die $!;

my $line = <$I_fh>;
while ($line = <$I_fh>) {
  chomp $line;
  my @cols = split ("\t", $line);
  my $key = $cols[1];
  $hash_ref -> {$key} = \@cols;
}
close $I_fh;

open (my $O_fh, "<", "File2.txt") or die $!;
while ($line = <$O_fh>) {
  chomp $line;
  my @cols = split ("\t", $line);
  my $key = shift (@cols);
  push (@{$hash_ref -> {$key}}, @cols);
}
close $O_fh;

open (my $out, ">", "merged.txt") or die $!;

foreach my $key (sort keys %$hash_ref) {
  my $row = join ("\t", @{$hash_ref -> {$key}});
  print $out "$key\t$row\n";
}
close $out;

I use print or the Dumper function to check every step. In the terminal window everything looks fine, but in my output file (merged.txt) the format is different. I would like to merge the two files by adding columns, not rows. How can I fix my code?

File 1.txt:
Index   Name    Column1 Column2
1       A1              AB
2       A2              CD
3       B1              EF
4       B2              GH

File 2.txt:
Name    Type
A1      1
A2      1
B1      2
B2      1

Merged file:
A1      1       AB
        1
A2      2       CD
        1
B1      3       EF
        2
B2      4       GH
        1

Wanted file:
Name    Type    Column2
A1      1       AB
A2      1       CD
B1      2       EF
B2      1       GH

**Shawn** · Accepted Answer · 2018-11-16T12:01:33+00:00

Assuming the files are sorted based on the name column, this is really easy to do thanks to the join(1) program:

$ join --header -t $'\t' -o 2.1,2.2,1.4 -1 2 -2 1 file1.tsv file2.tsv
Name    Type    Column2
A1  1   AB
A2  1   CD
B1  2   EF
B2  1   GH

The --header option is a GNU extension that excludes the first line of each file from the join and treats it as column titles instead. -t sets the column separator, -o controls which columns are included in the output (a list of FILE.COLUMN specifiers), and -1 and -2 choose the column in each file that is used to join the two files.

If they're not sorted, or if you're set on Perl, your code is very close. Besides the typos, you're printing every column rather than just the ones your desired output suggests you care about. Consider:

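#!/usr/bin/perl
use warnings;
use strict;
use feature qw/say/;
use autodie;    # open() dies with a message on failure

# Maps each name to a list of rows (array refs), in the order the files are read.
my %names;

# Read a tab-separated file, skip its header line, and store each row
# under the value found in column $idx (0-based).
sub read_file {
  my ($file, $idx) = @_;
  open my $in, "<", $file;
  my $header = <$in>;
  while (<$in>) {
    chomp;
    my @F = split /\t/;
    push @{$names{$F[$idx]}}, \@F;
  }
}

read_file "file1.tsv", 1;   # Name is the second column of file1
read_file "file2.tsv", 0;   # Name is the first column of file2

say "Name\tType\tColumn2";
for my $n (sort keys %names) {
  my $row = $names{$n};
  # $row->[0] is the file1 row, $row->[1] is the file2 row.
  say "$n\t$row->[1][1]\t$row->[0][3]";
}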

I also suspect your strange output might be explained by running your program on data files that use Windows-style line endings when your OS uses Unix-style line endings.
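If line endings are the cause, the reason would be that chomp removes only "\n" by default, so a file with Windows-style "\r\n" endings leaves a stray "\r" on the last field of every row. A terminal and a text editor can render that character differently. The following is a minimal sketch of one way to guard against it, not a confirmed diagnosis of the asker's files. The helper name split_fields is hypothetical, and the sample line is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original code): split one
# tab-separated line into fields after removing either a Unix ("\n")
# or a Windows ("\r\n") line ending.
sub split_fields {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;           # chomp alone would leave the "\r" behind
    return split /\t/, $line, -1;   # -1 keeps trailing empty fields
}

# A line as it would look when a CRLF file is read on a Unix-like system.
my $crlf_line = "A1\t1\r\n";

# With plain chomp, the last field still ends in "\r".
my $chomped = $crlf_line;
chomp $chomped;
my @with_cr = split /\t/, $chomped;

# With the helper, the carriage return is gone.
my @clean = split_fields($crlf_line);

printf "chomp only:  last field has %d characters\n", length $with_cr[-1];
printf "CR stripped: last field has %d characters\n", length $clean[-1];
```

With the sample line above, the first count is 2 (the digit plus the carriage return) and the second is 1. Replacing chomp with the substitution inside the reading loop would apply the same cleanup to real input; converting the data files to Unix line endings beforehand is another option.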
