extract elasticsearch date from a <start-date>/<duration> XBRL-JSON format

I am storing XBRL JSON using elasticsearch.

This xBRL-JSON OIM spec describes the oim:period property:

Otherwise, an ISO 8601 time interval representing the {interval} property, expressed in one of the following forms:

<start>/<end>

<start>/<duration>

<duration>/<end>

Where <start> and <end> are valid according to the xsd:dateTime datatype, and <duration> is valid according to xsd:duration.

Examples from arelle's plugin look like this:

  • 2016-01-01T00:00:00/PT0S
  • 2015-01-01T00:00:00/P1Y

I notice that arelle's plugin exclusively produces this format:

  • <start>/<duration>
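
For illustration, here is a minimal Python sketch of splitting such a value before it is indexed. It assumes only the `<start>/<duration>` form appears and that `<start>` has no features beyond what `datetime.fromisoformat` accepts. The function name `split_period` and the field names `period_start` and `period_duration` are hypothetical, not part of the OIM spec or arelle.

```python
from datetime import datetime


def split_period(period: str) -> dict:
    """Split a '<start>/<duration>' period into two separate fields.

    Assumption: only the <start>/<duration> form is handled; the other
    two forms from the spec are rejected with ValueError.
    """
    start_text, sep, duration_text = period.partition("/")
    if not sep or not duration_text.startswith("P"):
        raise ValueError(f"not a <start>/<duration> period: {period!r}")
    # fromisoformat validates the start part and raises ValueError if it is bad.
    start = datetime.fromisoformat(start_text)
    return {
        "period_start": start.isoformat(),  # hypothetical field, mapped as a date
        "period_duration": duration_text,   # hypothetical field, kept as a plain string
    }


if __name__ == "__main__":
    print(split_period("2015-01-01T00:00:00/P1Y"))
    print(split_period("2016-01-01T00:00:00/PT0S"))
```

With the two arelle examples above, `period_start` comes out as `2015-01-01T00:00:00` and `2016-01-01T00:00:00`, which a default date mapping should be able to parse.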

My question

Is there a way to save at least the <start> part as a date type in elasticsearch?

Ideas I had:

elasticsearch only (my preference)

  • Use a custom date format which anticipates the /<duration> part, but ignores it
    • I haven't checked Joda yet; will it ignore characters in the date format if they aren't one of its special pattern characters? Like the "/" delimiter or the "P" which precedes any duration value (like PT0S and P1Y above)?
    • EDIT: The single-quote character escapes literals, so this works: the format yyyy'/P' accepts the value '2015/P'. However, the rest of the duration varies from value to value.
    • Re: the varying part; will Joda accept a regex or wildcard, such as "\d" or a "+" quantifier, so I can ignore all the possible variations following the P?
  • Use a character filter to strip out the /<duration> part, then save only <start> as a datetime. But I don't know if character filters run before the value is parsed as type: date. If they don't, the '/' part isn't stripped, and I wouldn't be passing valid date strings.
  • Don't use date type: Use a pattern tokenizer to split on /, and at least the two parts will be saved as separate tokens. Can't use date math, though.
  • Use a transformation, although it seems like this is deprecated. I read about using copy_to instead, but that seems to combine terms, and I want to break this term apart.
  • Some sort of plugin? Maybe a plugin which will fully support this "interval" datatype described by the OIM spec... maybe a plugin which will store its separate parts...?

change my application (I prefer to use elasticsearch-only techniques if possible)

  • I could edit this plugin or produce my own plugin which uses exclusively <start> and <end> parts, and saves both into separate fields;
    • But this breaks the OIM spec, which says they should be combined in a single field
    • Moreover it can be awkward to express an "instant" fact (with no duration; the PT0S examples above); I guess I just use the same value for end property as start property... Not more awkward than a 0-length duration (PT0S) I guess.
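
A rough Python sketch of the application-side option, turning a `<start>/<duration>` value into explicit start and end datetimes. It is an illustration under assumptions, not a full xsd:duration implementation: negative durations are not handled, fractions are accepted only on seconds, and when adding years or months the day is clamped to the last day of the target month. The names `DURATION_RE` and `period_to_range` are hypothetical.

```python
import calendar
import re
from datetime import datetime, timedelta
from typing import Tuple

# Assumption: a subset of xsd:duration (no sign, fractions only on seconds).
DURATION_RE = re.compile(
    r"^P(?:(?P<years>\d+)Y)?(?:(?P<months>\d+)M)?(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?"
    r"(?:(?P<seconds>\d+(?:\.\d+)?)S)?)?$"
)


def period_to_range(period: str) -> Tuple[datetime, datetime]:
    """Return (start, end) for a '<start>/<duration>' period string."""
    start_text, _, duration_text = period.partition("/")
    start = datetime.fromisoformat(start_text)
    match = DURATION_RE.match(duration_text)
    # Reject non-matching text and a bare "P" with no components at all.
    if match is None or not any(match.groupdict().values()):
        raise ValueError(f"unsupported duration: {duration_text!r}")
    parts = match.groupdict(default="0")

    # Add years and months with calendar arithmetic, clamping the day.
    month_index = start.month - 1 + int(parts["years"]) * 12 + int(parts["months"])
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    shifted = start.replace(year=year, month=month, day=day)

    # Days and time components have a fixed length, so timedelta suffices.
    end = shifted + timedelta(
        days=int(parts["days"]),
        hours=int(parts["hours"]),
        minutes=int(parts["minutes"]),
        seconds=float(parts["seconds"]),
    )
    return start, end


if __name__ == "__main__":
    print(period_to_range("2015-01-01T00:00:00/P1Y"))
    print(period_to_range("2016-01-01T00:00:00/PT0S"))
```

An instant fact (PT0S) simply yields an end equal to its start, which matches the "same value for both properties" idea above.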
There is 1 answer

pdw

Not a direct answer, but it's worth noting that the latest internal drafts of the xBRL-JSON specification have moved away from the single-field representation. Although the "/"-separated notation is an ISO standard, tool support for it appears to be extremely poor, so the working group has chosen to switch to separate fields for start and end dates. I would expect Arelle support to follow suit in due course.