C# ServiceStack.Text: analyze a stream of JSON


I am creating a JSON deserializer. I am deserializing a pretty big JSON file (25 MB) that contains a lot of information: an array of words, with a lot of duplicates. With Newtonsoft.Json, I can deserialize the input as a stream:

using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

using (var fs = new FileStream(@"myfile.json", FileMode.Open, FileAccess.Read))
using (var sr = new StreamReader(fs))
using (var reader = new JsonTextReader(sr))
{
    while (reader.Read())
    {
        // Read until I find the narrow subset I need, then parse and analyze
        // it directly. Depth > 0 skips the root object, so JObject.Load only
        // materializes one small object at a time instead of the whole file.
        if (reader.TokenType == JsonToken.StartObject && reader.Depth > 0)
        {
            var obj = JObject.Load(reader); // Analyze this object
        }
    }
}

This allows me to keep reading small parts of the JSON, analyze them as they arrive, and check for duplicates on the fly; a sketch of what I mean follows below.
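
For example, a minimal sketch of the kind of on-the-fly de-duplication I mean. The HashSet and the assumption that the W property holds the word text are illustrative, not my exact code:

var seen = new HashSet<string>();

using (var fs = new FileStream(@"myfile.json", FileMode.Open, FileAccess.Read))
using (var sr = new StreamReader(fs))
using (var reader = new JsonTextReader(sr))
{
    while (reader.Read())
    {
        // Skip everything except the small per-word objects
        if (reader.TokenType != JsonToken.StartObject || reader.Depth == 0)
            continue;

        var obj = JObject.Load(reader);
        var word = (string)obj["W"]; // assumes "W" is the word text (see Word below)

        // Duplicates are dropped here, before they ever accumulate in memory
        if (seen.Add(word))
        {
            // ... analyze obj ...
        }
    }
}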

If I want to do the same with ServiceStack.Text, I am doing something like:

using (var fs = new FileStream(@"myfile.json", FileMode.Open, FileAccess.Read))
using (var sr = new StreamReader(fs))
{
    // Deserializes the entire file into a single object graph in one call
    var result = ServiceStack.Text.JsonSerializer.DeserializeFromReader<MyObject>(sr);
}

MyObject contains only the subset of the JSON I am interested in, but this still creates massive overhead: I get one big array that contains a lot of duplicates.

With the first method I can filter the duplicates out immediately and thus avoid keeping them in memory.

The memory footprints of the two approaches (including the console program overhead) are:

  • Newtonsoft.Json: 30 MB
  • ServiceStack.Text: 215 MB

And the running times are:

  • Newtonsoft.Json: 2.5 s
  • ServiceStack.Text: 1.5 s

The memory footprint is quite important, as I will be processing a lot of these.

I do understand that the ServiceStack method gives me type safety, but the memory footprint matters more to me.

Since ServiceStack.Text is clearly a lot faster, I would like to know whether I can recreate the Newtonsoft.Json streaming example with ServiceStack.Text. My closest attempt so far is sketched below.
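
The closest I have found is ServiceStack.Text's untyped JsonObject API, but as far as I can tell it still needs the whole document in memory as a string first. A sketch, using JsonObject.Parse and Get<T> as I understand them:

using ServiceStack.Text;

// The whole 25 MB file has to be read into a string before parsing,
// so nothing is streamed the way JsonTextReader streams it.
var json = File.ReadAllText(@"myfile.json");
var jsonObj = JsonObject.Parse(json);
var words = jsonObj.Get<List<List<Word>>>("Words");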

Edit (added the object I am trying to parse):

public class MyObject
{
    public List<List<Word>> Words { get; set; }
}

public class Word
{
    public string B { get; set; }
    public string W { get; set; }
    public string E { get; set; }
    public string P { get; set; }
}

My test file (which is representative of the use case) contains 29,000 words, but only around 8,500 unique ones. I am only analyzing this data, so I cannot change its structure: it is a file containing arrays of arrays of words. A sketch of the de-duplication step follows below.
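
For completeness, this is roughly how I collapse the duplicates once the words are materialized. The comparer is illustrative; it assumes two words are duplicates when all four fields match:

// Illustrative value-equality comparer: treats two Word instances as
// duplicates when all four fields match.
public class WordComparer : IEqualityComparer<Word>
{
    public bool Equals(Word x, Word y) =>
        x.B == y.B && x.W == y.W && x.E == y.E && x.P == y.P;

    public int GetHashCode(Word w) =>
        (w.B, w.W, w.E, w.P).GetHashCode();
}

// Usage: collapses ~29,000 Word entries down to the ~8,500 unique ones.
// myObject is a hypothetical deserialized MyObject instance.
var unique = new HashSet<Word>(new WordComparer());
foreach (var inner in myObject.Words)
    foreach (var word in inner)
        unique.Add(word);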
