I am trying to merge two files with Perl.
Code so far:
my $hash_ref;
open (my $I_fh, "<", "File1.txt") or die $!;
my $line = <$I_fh>;
while ($line = <$I_fh>) {
    chomp $line;
    my @cols = split ("\t", $line);
    my $key = $cols[1];
    $hash_ref -> {$key} = \@cols;
}
close $I_fh;
open (my $O_fh, "<", "File2.txt") or die $!;
while ($line = <$O_fh>) {
    chomp $line;
    my @cols = split ("\t", $line);
    my $key = shift (@cols);
    push (@{$hash_ref -> {$key}}, @cols);
}
close $O_fh;
open (my $out, ">", "merged.txt") or die $!;
foreach my $key (sort keys %$hash_ref) {
    my $row = join ("\t", @{$hash_ref -> {$key}});
    print $out "$key\t$row\n";
}
close $out;
I am using print and Dumper to check every step. In the terminal window, everything looks fine. However, in my output file (merged.txt), the format is changed. I would like to merge the two files by adding more columns, not more rows. How can I fix the code?
File1.txt:
Index Name Column1 Column2
1 A1 AB
2 A2 CD
3 B1 EF
4 B2 GH
File2.txt:
Name Type
A1 1
A2 1
B1 2
B2 1
Merged file:
A1 1 AB
1
A2 2 CD
1
B1 3 EF
2
B2 4 GH
1
Wanted file:
Name Type Column2
A1 1 AB
A2 1 CD
B1 2 EF
B2 1 GH
Assuming the files are sorted based on the name column, this is really easy to do thanks to the join(1) program:
The --header option is a GNU extension that excludes the first lines of the two files from being joined and treats them as column titles instead. -t sets the column separator, -o controls which columns are included in the output (a list of FILE.COLUMN specifiers), and -1 and -2 choose the columns that are used to join the two files.
If they're not sorted, or if you're set on Perl, your code is very close; besides the typos and such, you're printing out every column, not just the ones your desired output suggests you care about. Consider:
I also suspect your strange output might be explained by running your program on data files that use Windows-style line endings when your OS uses Unix-style line endings.
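To illustrate that suspicion, here is a small sketch of what chomp does to a CRLF-terminated line when the input record separator is the default "\n". The sample string is made up to resemble a row of File2.txt.

```perl
use strict;
use warnings;

# A line as it would be read from a file with Windows line endings.
my $line = "B2\t1\r\n";
chomp $line;                        # removes only the trailing "\n"
my @cols = split /\t/, $line;
# $cols[1] is now "1\r"; the stray carriage return ends up in the output.

# Stripping both kinds of line ending explicitly avoids the problem.
my $clean = "B2\t1\r\n";
$clean =~ s/\r?\n\z//;
my @clean_cols = split /\t/, $clean;
# $clean_cols[1] is "1" with nothing trailing.
```

Running the input files through a tool such as dos2unix before merging is another option, if it is available on your system.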