I'm trying to convert the eBay File Exchange download into a tab-delimited format my shipping software can read.
If each and every column were quoted, this would be easy--but they're not. Only some columns (name, item listing title, etc) are quoted, and some quoted columns contain commas. The rest are bare of quotes.
I need a way to parse and convert this in a .bat file, but using comma as a delimiter splits the quoted fields if they contain a comma too, giving me unusable data. I'm certain there's a simple fix for this, I just can't figure it out.
Eric J is correct - solving this kind of problem with batch is not simple. But it is possible :-)
The main problem is how to differentiate between quoted and unquoted commas - jeb solved a similar problem with quoted vs. unquoted semicolons at 'Pretty print' windows %PATH% variable - how to split on ';' in CMD shell. The code below looks very different, but the fundamental concept is the same.
The code below should work for pretty much any CSV as long as all lines are less than ~8000 bytes long. Batch variable values are limited to 8191 bytes, and some characters are temporarily expanded to two bytes.
The code assumes there are not any existing TABs within the CSV file.
It does not modify any existing quotes.
As I say, the code should work, but it will be painfully SLOW if you have a large file. You would be much better off with a .NET solution as Eric J suggested.