Store big arrays (Lists) of data in XML format into SQL server 2016

2.7k views Asked by At

I need to save the result from several measurements generated in C#.
The results are big Lists of data of different types:

List<double, double> [20000 records]
List<string, double, double> [20000 records]
List<int, double, double> [20000 records]

My first thought was to store these lists in XML format into a table (xml field). So, for each List I generate a XML file and insert it into the table.

<MeasurementList>
  <item>
    <left_measurement>0.1264</left_measurement>
    <right_measurement>6.500</right_measurement>
  </item>
  <item>
    <left_measurement>0.2314</left_measurement>
    <right_measurement>6.968</right_measurement>
  </item>
  <item>
    <left_measurement>0.2365</left_measurement>
    <right_measurement>7.598</right_measurement>
  </item>
  ...
</MeasurementList> 

This will occur 3000 times a day, so I need 9000 records each day.

I know that the disk space is an issue, but my main concern is:
Is this is the best aproach or there are any other solutions

1

There are 1 answers

1
Shnugo On BEST ANSWER

As your own answer was deleted as a only-link-answer I'll place my comments as answer (and delete them above).

Stuffing millions of values into millions of rows will still need less place than to store the same amount of data nested in about four times the characters you'd need for the pure data.

At least I'd use <lm> instead of <left_measurement>. The shortest in (character) space was

<L v="0.2314"/><R v="6.968"/>
or 
<L>0.2314</L><R>6.968</R>
or 
<v l="0.2314" r="6.968">

The last option would reflect your List<double, double> [20000 records] best...

The main question is: What are you going to do with this after some days? You can store millions of values somehow, but will you ever have to read them again? If not: Why do you store it at all? If yes (and that's what I assume): Don't put this in a string based container! You will get into sever performance troubles...

About performance and disc space:

  • a FLOAT will take 4 bytes, while 0.2314 consumes 6 characters (12 bytes in NVARCHAR!) and you do not need extra space for the meta data!
  • SQL Server will store an XML as a hidden hierachy table with NVARCHAR strings (2 bytes per character!)
  • JSON is still stored as the string you see (If plain latin you can use VARCHAR with 1 byte per character).
  • XML is much faster in queries than JSON (no structure parsing needed)...
  • Both (XML and JSON) store their values string based. Any attempt to read these values will need a type conversion ("0.1234" is not a number, but a string, that looks like a number!) This is not very expensive with a few values, but this will tear down your system with billions of data...

You might read When can I save JSON or XML data in an SQL Table