I'm an R user with a great interest in Julia. I don't have a computer science background. I just tried to read a CSV file in Juno with the following commands:
using CSV
using DataFrames
df = CSV.read(joinpath(Pkg.dir("DataFrames"), "path/to/database.csv"));
and got the following error message:
CSV.CSVError("error parsing a 'Int64' value on column 26, row 289; encountered '.'")
in read at CSV/src/Source.jl:294
in #read#29 at CSV/src/Source.jl:299
in stream! at DataStreams/src/DataStreams.jl:145
in stream!#5 at DataStreams/src/DataStreams.jl:151
in stream! at DataStreams/src/DataStreams.jl:187
in streamto! at DataStreams/src/DataStreams.jl:173
in streamfrom at CSV/src/Source.jl:195
in paresefield at CSV/src/paresefield.jl:107
in paresefield at CSV/src/paresefield.jl:127
in checknullend at CSV/src/paresefield.jl:56
I looked at the entries indicated in the data frame: rows 287 and 288 are 30 and 33 respectively (they seem to be of type Integer), and row 289 is 30.445 (which is of type float).
Is the problem that DataFrames started filling the column with Int and stopped when it saw a Float?
Many thanks in advance
The problem is that the float appears too late in the data set. By default CSV.jl uses a rows_for_type_detect value equal to 100, which means that only the first 100 rows are used to determine the type of a column in the output. Here the first non-integer value is at row 289, so the column was already fixed as Int64 when the parser reached it. Set the rows_for_type_detect keyword parameter in CSV.read to e.g. 300 and all should work correctly.

Alternatively, you can pass the types keyword argument to manually set the column type (in this case Float64 for this column would be appropriate).
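
To make the mechanism concrete, here is a minimal sketch in plain Julia of how sample-based type detection can fail in this way. It is a toy model and not CSV.jl's actual implementation: the helper names detect_type and parse_column are hypothetical, and the column data is made up to mimic the question (288 integer-looking rows followed by one float).

```julia
# Toy model of sample-based type detection (NOT CSV.jl's real code).
# `detect_type` and `parse_column` are hypothetical helper names.

# Infer a column type by looking only at the first `nrows` raw values.
function detect_type(vals::Vector{String}, nrows::Int)
    sample = vals[1:min(nrows, length(vals))]
    all(v -> tryparse(Int64, v) !== nothing, sample) && return Int64
    all(v -> tryparse(Float64, v) !== nothing, sample) && return Float64
    return String
end

# Parse the whole column with the detected type.
# A value that does not fit the type makes `parse` throw an ArgumentError.
parse_column(vals::Vector{String}, T::Type) =
    T === String ? vals : [parse(T, v) for v in vals]

# Made-up data: 288 integer-looking rows, then a float at row 289.
column = vcat(fill("30", 288), ["30.445"])

T_default = detect_type(column, 100)  # sample contains integers only
T_wider   = detect_type(column, 300)  # sample now includes row 289

# Parsing with the type detected from 100 rows fails at row 289.
failed = try
    parse_column(column, T_default)
    false
catch err
    err isa ArgumentError
end

# Parsing with the type detected from the wider sample succeeds.
parsed = parse_column(column, T_wider)
```

In this sketch, widening the sample plays the role of raising rows_for_type_detect, and calling parse_column(column, Float64) directly plays the role of supplying the type by hand through the types keyword.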