Data extraction from a CSV file is missing some data


I've got a script that extracts data from a CSV file and writes it to another file, removing the extra fields from the last record when its first field matches a search string. See below:

@echo off
setlocal EnableDelayedExpansion
pause
set cur=0
FOR /F "delims=" %%A in (INPUT.csv) DO (
    set line=%%A
    set line=!line:,,=, ,!

    FOR /F "tokens=1-11 delims=," %%G in (^"!line!^") DO (
        if "%%G"=="" (echo.)
        if "%%G"==""FILENAME_YYYYMMDD.CSV"" (
            echo %%G,%%H,%%I,%%J >> output.csv
            goto EOF
        ) else (
            echo %%G,%%H,%%I,%%J,%%K,%%L,%%M,%%N,%%O,%%P,%%Q >> output.csv
        )
        set /a cur=cur+1
    )
)
:EOF
echo %cur%
pause

My problem is twofold.

  1. The FILENAME_YYYYMMDD value changes with the date the input file was created. How do I make the comparison a partial match on FILENAME? That is, %%G should match whether it is FILENAME_20150610, FILENAME_20150611, or FILENAME_XYZ.
  2. The script mostly works, but some records are missing the last field: 7 of the 190 records are missing %%Q. These incomplete records are randomly spread throughout my output file.

Example below:

BEFORE

"Parent","CODE1","Child ONE",CODEA,"COMPANY","","Address1",,"SUBURB","STATE","2000"
"FILENAME_20150529.csv","20150529","15:09:30",187,"","","","","","",""

AFTER

"Parent","CODE1","Child ONE",CODEA,"COMPANY","","Address1", ,"SUBURB1","STATE2" 
"FILENAME_20150529.csv","20150529","15:09:30",187
1

There is 1 answer

PA.
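  1. Run help set and read the section on substring extraction (the %var:~start,length% syntax). Skip the leading quote in %%G, take the next nine characters, and compare them with the fixed prefix:

    set fn=%%G
    set fn=!fn:~1,9!
    if /i "!fn!"=="FILENAME_" (
        echo %%G,%%H,%%I,%%J >> output.csv
        goto EOF
    )
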
  2. Batch parsing (with either for or set) is not well suited to complex CSV, because the rules governing commas and quotes are intricate. In your case, unbalanced quotes or commas inside fields may be what causes the parsing to fail.
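
One more comma-related case may be worth checking, although nothing in the question confirms it applies: a record with two or more adjacent empty fields. As far as I know, the substitution !line:,,=, ,! does not rescan text it has already replaced, so three commas in a row can still leave a ,, pair behind. FOR /F then treats that pair as a single delimiter, the later fields shift left, and the last token comes out empty. The sketch below uses a made-up record (a,,,b) and hypothetical variable names (once, twice) to compare one substitution pass against two.

```batch
@echo off
setlocal EnableDelayedExpansion

rem Made-up sample record: two adjacent empty fields, i.e. three commas in a row.
set "line=a,,,b"

rem One pass, as in the original script. This may leave a ",," pair behind.
set "once=!line:,,=, ,!"

rem A second pass over the result of the first fills any remaining gap.
set "twice=!once:,,=, ,!"

echo once:  !once!
echo twice: !twice!

rem Tokenise both versions; brackets make empty or shifted tokens visible.
for /F "tokens=1-4 delims=," %%G in ("!once!") do echo once:  [%%G] [%%H] [%%I] [%%J]
for /F "tokens=1-4 delims=," %%G in ("!twice!") do echo twice: [%%G] [%%H] [%%I] [%%J]
```

If the 7 affected records turn out to contain runs of empty fields, repeating the set line=!line:,,=, ,! line in the original script would be the smallest change to try. It would not help with commas inside quoted fields, which still need a real CSV parser.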