How to parse multiple csv files with Perl and print only the unique results

Question

142 views Asked by rolleikid At 22 November 2020 at 10:29

I have a bunch of CSV files in a simple format, say 'Name,Country,Currency'. I need to read all of them and print only the unique union of their rows. If a row shows up in several files, the copies are identical. I tried to use Hash::Merge, but it seems to work only for two hashes. I assume I have to reinitialize it in the loop while opening these files for reading, but I am not sure how. In the end I want a file in the same format containing all the rows without repetition. Many thanks.

Input looks like:

EDL,Finland,Euro

I want the output in the same format. I made a loop that reads the files, and at any stage I have two hashes, %A and %B, keyed by $name (after splitting each line):

$A{$name}=$coun and $B{$name}=$curr

I also have two %merged hashes, each defined like this:

$merged1 = Hash::Merge->new('LEFT_PRECEDENT'); 
my %merged1 = %{ $merged1->merge( \%merged1, \%A ) };

The error I get complains about an unknown function "merge". It must be something simple, but I cannot see it.

There are 2 answers

mivk On 22 November 2020 at 11:53

It seems you don't really need Perl for what you describe. This should work on any Mac or Linux system:

sort -u universe*

The -u option removes duplicate lines.
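To make the behaviour concrete, here is a small sketch you can run in a scratch directory. The file names and the extra rows (ABC, XYZ) are made up for illustration; only the EDL row comes from the question. Note that the result comes out sorted, so the original row order is not kept.

```shell
# Hypothetical sample files in a temporary directory; the names mirror the universe* glob.
tmpdir=$(mktemp -d)
printf 'EDL,Finland,Euro\nABC,Japan,Yen\n' > "$tmpdir/universe1.csv"
printf 'ABC,Japan,Yen\nXYZ,Norway,Krone\n' > "$tmpdir/universe2.csv"
printf 'EDL,Finland,Euro\n' > "$tmpdir/universe3.csv"

# sort -u reads every file, sorts the combined lines and keeps one copy of each.
# LC_ALL=C forces byte-wise ordering so the result does not depend on the locale.
sorted_unique=$(LC_ALL=C sort -u "$tmpdir"/universe*)
printf '%s\n' "$sorted_unique"
```

If the files carried a header row, it would be sorted in with the data, so this approach fits best when every line is a data row, as in the sample input.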

TLP On 22 November 2020 at 11:38 BEST ANSWER

Assuming the lines considered duplicates are identical in all fields, and the data is uniform, you can get away with something simple like:

perl -ne'print unless $seen{$_}++' universe* > out.csv

This is a simple dedupe routine (deduping by hash key); the shell then redirects the output to out.csv.
