I am storing XBRL JSON using elasticsearch.
This xBRL-JSON OIM spec describes the oim:period property:
Otherwise, an ISO 8601 time interval representing the {interval} property, expressed in one of the following forms:
<start>/<end>
<start>/<duration>
<duration>/<end>
Where <start> and <end> are valid according to the xsd:dateTime datatype, and <duration> is valid according to xsd:duration.
Examples from arelle's plugin look like this:
- 2016-01-01T00:00:00/PT0S
- 2015-01-01T00:00:00/P1Y
I notice that arelle's plugin exclusively produces this format:
- <start>/<duration>
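To make the three forms quoted above concrete, here is a minimal Python sketch that reports which form a given value uses. The function name `period_form` is hypothetical and not part of Arelle or the OIM spec. It relies only on the fact that an xsd:duration starts with "P" (optionally preceded by "-"), while an xsd:dateTime never does.

```python
def period_form(period: str) -> str:
    """Return which of the three interval forms a period string uses."""
    left, sep, right = period.partition("/")
    if not sep:
        raise ValueError(f"not an interval (no '/' found): {period!r}")

    def is_duration(text: str) -> bool:
        # xsd:duration values start with "P", optionally preceded by "-".
        return text.lstrip("-").startswith("P")

    if is_duration(left) and is_duration(right):
        raise ValueError(f"both sides look like durations: {period!r}")
    if is_duration(left):
        return "duration/end"
    if is_duration(right):
        return "start/duration"
    return "start/end"
```

Both Arelle examples above would be reported as "start/duration" by this check.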
My question
Is there a way to save at least the <start> part as a date type in elasticsearch?
Ideas I had:
elasticsearch only (my preference)
- Use a custom date format which anticipates the /<duration> part, but ignores it.
  - I haven't checked Joda yet; will it ignore characters in the date format if they aren't special characters? Like the "/" delimiter or the "P" which precedes any duration value (like PT0S and P1Y above)?
  - EDIT: So the single-quote character escapes literals; this works: yyyy'/P' will accept a value '2015/P'. However, the rest of the duration could be more dynamic.
  - Re: dynamic; will Joda accept a regex or wildcard character like "\d" or a "+" quantifier so I can ignore all the possible variations following the P?
- Use a character filter to strip out the /<duration> part before saving only <start> as datetime. But I don't know if character filters happen before saving as type: date. If they don't, the '/' part isn't stripped, and I wouldn't be passing valid date strings.
- Don't use date type: use a pattern tokenizer to split on /, and at least the two parts will be saved as separate tokens. Can't use date math, though.
- Use a transformation; although it seems like this is deprecated. I read about using copy_to instead, but that seems to combine terms, and I want to break this term apart.
- Some sort of plugin? Maybe a plugin which will fully support this "interval" datatype described by the OIM spec... maybe a plugin which will store its separate parts...?
change my application (I prefer to use elasticsearch-only techniques if possible)
- I could edit this plugin or produce my own plugin which uses exclusively the <start> and <end> parts, and saves both into separate fields.
  - But this breaks the OIM spec, which says they should be combined in a single field.
  - Moreover, it can be awkward to express an "instant" fact (with no duration; the PT0S examples above); I guess I'd just use the same value for the end property as for the start property... Not more awkward than a 0-length duration (PT0S), I guess.
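A minimal sketch of the application-side option, assuming Python and only the <start>/<duration> form that Arelle's plugin produces. The function name `split_period` and the output field names `start` and `end` are hypothetical. The duration pattern covers non-negative durations only, and the day-of-month clamping when adding years or months is an assumption about the desired calendar arithmetic.

```python
import calendar
import re
from datetime import datetime, timedelta

# Non-negative xsd:duration such as P1Y, PT0S or P1Y2M3DT4H5M6S.
_DURATION_RE = re.compile(
    r"^P(?:(?P<years>\d+)Y)?(?:(?P<months>\d+)M)?(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?"
    r"(?:(?P<seconds>\d+(?:\.\d+)?)S)?)?$"
)


def split_period(period: str) -> dict:
    """Split '<start>/<duration>' into separate ISO 8601 start and end strings."""
    start_text, sep, duration_text = period.partition("/")
    if not sep:
        raise ValueError(f"expected '<start>/<duration>': {period!r}")
    start = datetime.fromisoformat(start_text)

    match = _DURATION_RE.match(duration_text)
    # A bare "P" or a trailing "T" matches the pattern but is not a valid duration.
    if match is None or duration_text == "P" or duration_text.endswith("T"):
        raise ValueError(f"unsupported duration: {duration_text!r}")
    parts = {
        name: float(value) if name == "seconds" else int(value)
        for name, value in match.groupdict(default="0").items()
    }

    # Add years and months on the calendar, clamping the day to the
    # last day of the target month (e.g. Feb 29 + 1 year -> Feb 28).
    month_index = (start.year + parts["years"]) * 12 + (start.month - 1) + parts["months"]
    year, month0 = divmod(month_index, 12)
    day = min(start.day, calendar.monthrange(year, month0 + 1)[1])
    end = start.replace(year=year, month=month0 + 1, day=day)

    # Add the fixed-length components.
    end += timedelta(
        days=parts["days"],
        hours=parts["hours"],
        minutes=parts["minutes"],
        seconds=parts["seconds"],
    )
    return {"start": start.isoformat(), "end": end.isoformat()}
```

With this, an instant such as 2016-01-01T00:00:00/PT0S yields identical start and end values, and both could then be indexed as ordinary date fields.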
Not a direct answer, but it's worth noting that the latest internal drafts of the xBRL-JSON specification have moved away from the single-field representation. Although the "/" separated notation is an ISO standard, tool support for it appears to be extremely poor, and so the working group has chosen to switch to separate fields for start and end dates. I would expect Arelle support to follow suit in due course.