I'm just trying out csvkit for converting Excel to csv. However, it's not taking into account formatting on dates and times, and producing different results from Excel's own save-as-csv. For example, this is a row of a spreadsheet:
And this what Excel's save-as produces:
22/04/1959,Bar,F,01:32.00,01:23.00,00:59.00,00:47.23
The date has no special formatting, and the time is formatted as [mm].ss.00
. However, this is in2csv
's version of the csv:
1959-04-22,Bar,F,0.00106481481481,0.000960648148148,0.00068287037037,0.000546643518519
which is of course of no use at all. Any ideas? There don't seem to be any command-line options for this - no-inference
doesn't help. Thanks.
EDIT
Both csvkit
ans xlrd
do seem to take into account formatting, but they're not smart about it. A date of 21/02/1066 is passed though as the text string '21/02/1066' in both cases, but a date '22/04/1959' is turned into '21662.0' by xlrd
, and 1959-04-22
by csvkit
. Both of them just give up on small elapsed times and pass through the float representation. This is Ok if you know that the cell should contain an elapsed time, because you can just multiply by 24*60*60
to get the right answer.
I don't think xlrd
would be much help here since its date tuple functions only handle seconds, and not centiseconds.
EDIT 2
Found out something interesting. I started with a base spreadsheet containing times. In one of them I formatted the times as [m:]ss.00
, and in the other I formatted them as [mm:]ss.00
. I then saved each as a .xls
and a .xlsx
, giving a total of 4 spreadsheets. Excel could convert all 4 to csv, and all the time text in the csv's appeared as originally written (ie. 0:21.0
, for example, for 0m 21.0s).
in2csv
can't handle the two .xls
versions at all; this time appears as 00:00:21
. It also can't handle the [m:]ss.00
version of the .xlsx
- conversion gives the catch-all 'index out of range' error. The only one of the 4 spreadsheets that in2csv
can handle is the .xlsx
one, with [mm:]ss.00
formatting.
The optional
-I
argument should be working to avoid this issue. When testing your sample data, I get what Excel's save-as produces.Command:
in2csv sample.csv -I > sample-output-i.csv
Output:
22/04/1959,Bar,F,01:32.00,01:23.00,00:59.00,00:47.23
-I, --no-inference
Disable type inference when parsing CSV input.https://csvkit.readthedocs.io/en/latest/scripts/in2csv.html