I've got a script that extracts data from a CSV file and writes it to another file, removing the extra fields from the last record when its first field matches a search string. See below:
echo off
setlocal EnableDelayedExpansion
pause
set cur=0
FOR /F "delims=" %%A in (INPUT.csv) DO (
    set line=%%A
    set line=!line:,,=, ,!
    FOR /F "tokens=1-11 delims=," %%G in (^"!line!^") DO (
        if "%%G"=="" (echo.)
        if "%%G"==""FILENAME_YYYYMMDD.CSV"" (
            echo %%G,%%H,%%I,%%J >> output.csv
            goto EOF
        ) else (
            echo %%G,%%H,%%I,%%J,%%K,%%L,%%M,%%N,%%O,%%P,%%Q >> output.csv
        )
        set /a cur=cur+1
    )
)
:EOF
echo %cur%
pause
My problem is twofold.
- The FILENAME_YYYYMMDD value changes depending on the date the input file was created. How do I get it to partially match FILENAME? I.e., %%G should count as a match when %%G is FILENAME_20150610, FILENAME_20150611, or FILENAME_XYZ.
- The script mostly works but a number of the records are missing the last field. In total 7/190 records are missing %%Q. These incomplete records are randomly spread throughout my output file.
Example below:
BEFORE
"Parent","CODE1","Child ONE",CODEA,"COMPANY","","Address1",,"SUBURB","STATE","2000"
"FILENAME_20150529.csv","20150529","15:09:30",187,"","","","","","",""
AFTER
"Parent","CODE1","Child ONE",CODEA,"COMPANY","","Address1", ,"SUBURB1","STATE2"
"FILENAME_20150529.csv","20150529","15:09:30",187
Read help set and use the set substring extraction syntax, %var:~start,length%.
BAT parsing (either with for or with set) is not suitable for a complex CSV parser, as there are complex rules regarding commas and quotes. In your case you might have some unbalanced quotes or some commas inside fields that cause your parsing to fail.
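As a minimal sketch of the substring approach, the snippet below tests whether a field starts with FILENAME_ by comparing only its first 9 characters. The three sample values and the variable name first are made up for illustration; in the script from the question, the value would come from %%G instead.

```bat
@echo off
setlocal EnableDelayedExpansion

rem Hypothetical sample values standing in for the first field of each record.
for %%G in ("Parent" "FILENAME_20150529.csv" "FILENAME_XYZ") do (
    rem %%~G removes the surrounding double quotes from the field.
    set "first=%%~G"
    rem Compare only the first 9 characters, which is the length of FILENAME_.
    if /I "!first:~0,9!"=="FILENAME_" (
        echo %%~G : trailer record
    ) else (
        echo %%~G : data record
    )
)
```

Delayed expansion (!first:~0,9!) is needed because the variable is set and read inside the same parenthesized block. The /I switch makes the comparison case-insensitive, which may or may not be what you want for your file names.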