I have a huge sensor data set and am working in Python. The problem is the date formats. Basically, this is what the date column looks like:
07/ 7/15 06:51
07/ 7/15 06:53
07/ 7/15 06:55
07/ 7/15 06:57
07/ 7/15 06:59
2015-07-07 07:00:46.047
07/ 7/15 07:03
07/ 7/15 07:05
07/ 7/15 07:07
07/ 7/15 07:09
07/ 7/15 07:11
07/ 7/15 07:13
2015-07-07 07:15:53.007
2015-11-14 23:33:43.000
2015-11-14 23:35:44.000
2015-11-14 23:37:43.000
2015-11-14 23:39:43.000
2015-11-14 23:41:43.000
11/14/15 23:42
2015-11-14 23:45:43.000
11/14/15 23:46
2015-11-14 23:49:43.000
2015-11-14 23:51:44.000
I want to parse the dates so I can work with weekdays and weekends, and as an extra I may convert them to a Julian date format (which uses the numbers 1 to 365 instead of regular dates).
I have tried:
Parsing dates while reading the CSV
The dateutil parser, e.g. dateutil.parser.parse(x)
datetime.strptime
but none of them worked, and I still cannot parse the dates. The data is spread across 10 Excel files. When I read them with pd.read_csv(......, parse_dates('date')), the date column is read as 'object' in some files and as 'datetime64' in others. But even in the files where it is read as 'datetime64', the dates cannot be parsed, and I get an error:
"Unknown String Format".
Any idea would help!
You're probably going to have to munge this with several approaches. I haven't done a significant amount of testing, but I was able to convert two of your different dates (07/ 7/15 06:51 and 2015-11-14 23:45:43.000) to datetime objects using parser.parse. The date parameter passed to the parser.parse method would be whichever of the varied string formats you have for dates. There might be a better way to do this, but try applying this approach as a lambda on the date column to see the result.
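For what it's worth, here is a minimal sketch of one way to handle both formats using only the standard library. Note that it swaps parser.parse for explicit datetime.strptime formats tried in turn, so it is a variation on the approach above rather than the same thing. The helper name parse_mixed and the KNOWN_FORMATS tuple are made up for illustration, and the month/day/year order is an assumption based on rows like 11/14/15 in the sample. It also shows how the weekday, a weekend flag, and the day of the year could be pulled out once parsing works.

```python
import re
from datetime import datetime

# Formats seen in the sample data (assumed month/day/year for the short form).
# Extend this tuple if other files turn out to use something else.
KNOWN_FORMATS = (
    "%m/%d/%y %H:%M",        # e.g. 11/14/15 23:42
    "%Y-%m-%d %H:%M:%S.%f",  # e.g. 2015-11-14 23:45:43.000
)


def parse_mixed(text):
    """Return a datetime for either sample format, or None if neither matches."""
    # Collapse the padding space in values such as "07/ 7/15 06:51".
    cleaned = re.sub(r"/\s+", "/", str(text).strip())
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt)
        except ValueError:
            continue  # not this format, try the next one
    return None


samples = ["07/ 7/15 06:51", "2015-11-14 23:45:43.000"]
parsed = [parse_mixed(s) for s in samples]

for dt in parsed:
    is_weekend = dt.weekday() >= 5        # Monday is 0, so 5 and 6 are Sat/Sun
    day_of_year = dt.timetuple().tm_yday  # 1 to 365 (366 in leap years)
    print(dt, dt.strftime("%A"), is_weekend, day_of_year)
```

On a DataFrame, something like df['date'].apply(parse_mixed) should do the same per row, assuming the column is named date and was read in as plain strings. Values that match neither format come back as None, which makes the leftover problem rows easy to find.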