Using Text::CSV_XS perl module for parsing CSV with newlines

270 views Asked by At

We are using Text::CSV_XS module for parsing CSV with the following options:

my $csv = Text::CSV_XS->new ({ 
                              binary=> 1, eol => "\n", 
                              allow_loose_quotes => 1, 
                              allow_loose_escapes=> 1,
                              escape_char => "\\" 
                            });

The CSV is created using the mysql LOAD DATA infile. If there are NULL values then MYSQL adds \N in the fields

Before Parsing data :

1973127,99,\N

If we use the escape_char => "\\" then the \N is replaced with N

Data after parsing:

1973127,99,N

Also, the data with double '\' are removed

eg: D\\'Mello changed to D'Mello

If we remove escape_char => "\\" then the data is changed as shown below

Before Parsing :

1539190,125,"\" SHIDDH - SHILA \"\
\"126"

After parsing:

1539190,125,\\" SHIDDH - SHILA \\"\\n\\"126

What is the best way to resolve this issue?

1

There are 1 answers

0
goneri On

You can clean up the CSV file first.

#!/usr/bin/perl

use strict;
use warnings;

my $file = shift;
open CSV, "<$file" or die;
foreach my $line (<CSV>) {
    $line =~ s/\\N/NULL/g;
    print $line;
}
close CSV;

perl cleanup.pl my.csv > new.csv