How to parse multiple CSV files with Perl and print only the unique results


I have a bunch of CSV files in a simple format, say 'Name,Country,Currency'. I need to read all of them and print only the unique union. If an entry shows up in several files, the copies are identical. I tried to use Hash::Merge, but it seems to work only for two hashes. I assume I have to reinitialize it in the loop while opening these files for reading, but I am not sure how. In the end I want a file in the same format that contains all the entries without repetition. Many thanks.

Input looks like:

EDL,Finland,Euro

I want the output in the same format. I made a loop reading the files, and at any stage I have two hashes, %A and %B, with $name as keys (after splitting).

$A{$name}=$coun and $B{$name}=$curr 

I also have two %merged hashes defined as

$merged1 = Hash::Merge->new('LEFT_PRECEDENT'); 
my %merged1 = %{ $merged1->merge( \%merged1, \%A ) }; 

The error I get complains about an unknown function "merge". It must be something simple, but I cannot see it.


There are 2 answers

Answer by TLP (best answer)

Assuming the lines considered duplicates are identical in all fields, and the data is uniform, you can get away with something simple like:

perl -ne'print unless $seen{$_}++' universe* > out.csv 

This is a simple dedupe routine (deduping by hash key); the shell then redirects the output to out.csv.

Answer by mivk

It seems you don't really need Perl for what you describe. This should work on any Mac or Linux system:

sort -u universe*

The -u option removes duplicate lines.
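One thing to keep in mind is that sort reorders the lines, so the result is not in input order. A minimal sketch, again with made-up file names and rows (only EDL,Finland,Euro comes from the question):

```shell
# Scratch directory with two hypothetical input files.
dir=$(mktemp -d)
cd "$dir"
printf 'EDL,Finland,Euro\nABC,France,Euro\n' > universe1.csv
printf 'XYZ,Japan,Yen\nEDL,Finland,Euro\n' > universe2.csv

# Sort all lines from both files and drop exact duplicates.
sort -u universe*.csv > sorted.csv

cat sorted.csv
```

Here the ABC row ends up first even though EDL came first in the input. If the files had a header line, it would be sorted in with the data, so it would have to be handled separately.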