Is it possible to have dateutil.parser.parse accept only ISO 8601 strings and raise an error for other formats?


I am using dateutil.parser. I only want to parse a date string which has date, time and timezone info.

For example, I only want to accept a string such as "2014-11-11T18:28:50.588Z". If the user passes "2013-12-11" (which dateutil also treats as a valid date), I want to raise an error.

P.S. I know I can use a regex, but I was hoping to do this with the dateutil library.


There is 1 answer

Paul On

No, this is not possible with dateutil.parser. dateutil's parser is primarily used for processing anything that looks like a date and currently has minimal customization options.

One thing to note is that both 2013-12-11 and 2014-11-11T18:28:50.588Z are valid ISO-8601 dates, so even if you have a rule saying, "parse only ISO-8601 dates", it would still catch both of these.

My general recommendation is that if you know the exact format of the date string, you should use strptime. For example:

from datetime import datetime
from dateutil import tz

def parse_datetime(dt_str):
    return datetime.strptime(dt_str, '%Y-%m-%dT%H:%M:%S.%fZ').replace(tzinfo=tz.tzutc())


if __name__ == "__main__":
    print(parse_datetime("2014-11-11T18:28:50.588Z"))

    try:
        parse_datetime("2013-12-11")
    except ValueError:
        print("Failed to parse!")

# Returns:
#
# 2014-11-11 18:28:50.588000+00:00
# Failed to parse!
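
As a variation on the above, here is a minimal standard-library-only sketch. It assumes Python 3.7 or later, where strptime's `%z` directive also matches a literal "Z" as UTC. The names `STRICT_FORMAT` and `parse_strict` are hypothetical, and unlike the version above this one also accepts numeric offsets such as "+02:00", so it is slightly looser about the timezone part.

```python
from datetime import datetime

# Assumption: Python 3.7+, where %z in strptime also matches a literal "Z".
STRICT_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"


def parse_strict(dt_str):
    """Parse a string containing a date, a time with fractional seconds, and a UTC offset."""
    dt = datetime.strptime(dt_str, STRICT_FORMAT)
    # Defensive check: reject the result if no offset was actually captured.
    if dt.tzinfo is None:
        raise ValueError("Missing timezone in date: {}".format(dt_str))
    return dt


if __name__ == "__main__":
    print(parse_strict("2014-11-11T18:28:50.588Z"))

    try:
        parse_strict("2013-12-11")
    except ValueError:
        print("Failed to parse!")
```

The result is timezone-aware without needing `dateutil.tz`, at the cost of accepting any valid offset rather than only "Z".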

If you want to be a little more flexible and allow dates without the Z suffix, or without the fractional seconds, I have found the fastest way is to use an if/elif block that checks the length of the string (this assumes exactly three fractional digits when they are present). Here is how I would do that:

from datetime import datetime
from dateutil import tz

def parse_datetime(dt_str):
    tzinfo = None
    if dt_str.endswith('Z'):
        tzinfo = tz.tzutc()
        dt_str = dt_str[:-1]

    if len(dt_str) == 23:
        fmt = '%Y-%m-%dT%H:%M:%S.%f'
    elif len(dt_str) == 19:
        fmt = '%Y-%m-%dT%H:%M:%S'
    else:
        raise ValueError("Unknown format for date: {}".format(dt_str))

    return datetime.strptime(dt_str, fmt).replace(tzinfo=tzinfo)


if __name__ == "__main__":
    print(parse_datetime("2014-11-11T18:28:50.588"))
    print(parse_datetime("2014-11-11T18:28:50.588Z"))
    print(parse_datetime("2014-11-11T18:28:50"))
    print(parse_datetime("2014-11-11T18:28:50Z"))

    try:
        parse_datetime("2013-12-11")
    except ValueError as e:
        print(e)

# Returns:
#
# 2014-11-11 18:28:50.588000
# 2014-11-11 18:28:50.588000+00:00
# 2014-11-11 18:28:50
# 2014-11-11 18:28:50+00:00
# Unknown format for date: 2013-12-11
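
The length check above only matches fractional seconds with exactly three digits. If other precisions need to be accepted, one possible sketch is to try each candidate format in turn, trading some speed for flexibility. The names `FORMATS` and `parse_datetime_any_precision` are hypothetical, and the standard library's `timezone.utc` is swapped in for `tz.tzutc()` so the example has no third-party dependency.

```python
from datetime import datetime, timezone

# Candidate formats, tried in order; %f accepts one to six fractional digits.
FORMATS = ("%Y-%m-%dT%H:%M:%S.%f", "%Y-%m-%dT%H:%M:%S")


def parse_datetime_any_precision(dt_str):
    """Parse a date-time with an optional fraction and an optional trailing Z."""
    tzinfo = None
    body = dt_str
    if body.endswith("Z"):
        tzinfo = timezone.utc  # stdlib stand-in for tz.tzutc()
        body = body[:-1]

    for fmt in FORMATS:
        try:
            return datetime.strptime(body, fmt).replace(tzinfo=tzinfo)
        except ValueError:
            continue  # try the next candidate format

    raise ValueError("Unknown format for date: {}".format(dt_str))


if __name__ == "__main__":
    print(parse_datetime_any_precision("2014-11-11T18:28:50.588123Z"))
    print(parse_datetime_any_precision("2014-11-11T18:28:50"))
```

A date-only string such as "2013-12-11" matches neither format and still raises ValueError.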