XML Schema for scientific instrumentation time-series logging


Q:

I know there is no single perfect answer to all of this; I am hoping for some experienced insight to narrow down the possible flavors, a general strategy to avoid conversion nightmares, and any ideas for reducing my data-storage footprint on the CPU/disk (large string operations are expensive and tedious). I am on restricted hardware and somewhat new to XML standards. I can read and write XML just fine (usually for websites), but never really as a dataset encapsulation.


I have given this weeks of thought, and I am 92.3% sure that XML files are my ideal storage destination. I am logging various instrumentation readings/analysis and holding the data for months at a time, although I do have concerns about my data-collection nodes having limited hardware resources (excessive string operations get slow; 512 kB RAM, 3.2 GB flash storage).

I am trying to find a well-formed markup layout with a minimal footprint that can handle raw numerical datatypes. I do not need fully compliant files, but I am looking for a best-fit solution, so let's not deviate too far from proper form.

Primary Data Model Factors

(and why I think XML is a better fit than packed binary, flat text, or even CSV; a rough sketch of the kind of file I have in mind follows this list)

  • Up to 8 different datapoints (different measurements, brands, and sensor types)
  • various raw datatypes (REAL32, DINT, DWORD, BYTE, STRING of arbitrary length)
  • datasets need to keep absolute timestamps within each file (I have a directory full of hundreds of XML files that will eventually be merged)
  • datapoint configuration/quantity could change, so I need to be able to note alterations to the schema with minimal verbosity/confusion.
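
To make this concrete, here is a rough sketch of the kind of layout I have been imagining. Every element and attribute name in it (log, schema, pt, r, and so on) is a placeholder of my own, not an existing standard: datatypes and units are declared once in a header block so the repeating sample rows stay terse, and a schema id lets the point configuration change from file to file.

    <?xml version="1.0" encoding="UTF-8"?>
    <!-- placeholder layout: element/attribute names are made up, not a published schema -->
    <log device="node-07" schema="2">
      <!-- declared once per file: id, datatype and unit for each datapoint -->
      <schema id="2">
        <pt id="1" name="pressure" type="REAL32" unit="kPa"/>
        <pt id="2" name="flow"     type="REAL32" unit="L/min"/>
        <pt id="3" name="status"   type="DWORD"/>
        <pt id="4" name="operator" type="STRING"/>
      </schema>
      <!-- repeating sample rows: t is an absolute timestamp, pN refers to a declared point id -->
      <r t="2015-06-01T00:00:00Z" p1="101.325" p2="12.5" p3="00000001" p4="auto"/>
      <r t="2015-06-01T00:00:10Z" p1="101.311" p2="12.6" p3="00000001" p4="auto"/>
    </log>

Keeping each value as an attribute rather than a child element holds the per-row overhead down to a handful of characters per datapoint.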

Performance Constraints/Considerations

  • I will normally only write the XML out from the embedded platform, so readability is not paramount, although if I ever do need to handle a query, scanning and parsing 3.0 GB of text is not going to be fun even at its very cleanest.
    • I believe that intermittent DATE-TIME nodes will help me index such a query (see the sketch after this list).
  • Compressing the data aggressively can actually become a problem at export time, because it just adds yet more calculations to unzip my laziness later.
  • Excessively verbose XML only gives me 111 days of storage. I would like to get that up to 180 days or longer, so I do need to condense the text better.
  • There are 3 potential targets once the data is offloaded, and I don't want to run into conversion bottlenecks/mistakes by over-complicating things.
    • Microsoft Excel (it doesn't have to understand the files perfectly, but we don't want to spend hours manually importing non-compliant schema types/maps into a 2D grid).
    • RRD backend server (I will be able to run any conversions needed, but hopefully the format is already close to what RRD wants).
    • Some cute JavaScript/Android tools. Although I expect these to perform custom datatype handling, well-formed XML will make retrieval and parsing simpler during development.
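
Here is a rough illustration of the intermittent DATE-TIME idea (again, every name is a placeholder of my own rather than a standard): absolute timestamps appear only in periodic block markers, and each row carries a small offset in seconds, so the per-row text stays short while a query tool can seek by time just by scanning the block attributes instead of parsing every row.

    <!-- placeholder sketch: periodic absolute markers, compact offset rows -->
    <block t0="2015-06-01T00:00:00Z">
      <!-- dt is seconds since this block's t0; pN values follow the point ids
           declared in the file header -->
      <r dt="0"  p1="101.325" p2="12.5"/>
      <r dt="10" p1="101.311" p2="12.6"/>
      <r dt="20" p1="101.298" p2="12.6"/>
    </block>
    <block t0="2015-06-01T01:00:00Z">
      <r dt="0"  p1="101.290" p2="12.7"/>
    </block>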

1 Answer

Bert Schultheiss:

Did you consider storing your XML files in an XML database such as eXist?