I have a basic background class in an otherwise empty ASP.NET Core 8 Minimal API project.
App startup is just:
    builder.Services.AddHttpClient();
    builder.Services.AddHostedService<SteamAppListDumpService>();
The background class is for saving snapshots of a Steam API endpoint, all basic stuff:
    public class SteamAppListDumpService : BackgroundService
    {
        static TimeSpan RepeatDelay = TimeSpan.FromMinutes(30);
        private readonly IHttpClientFactory _httpClientFactory;
        private string GetSteamKey() => "...";
        private string GetAppListUrl(int? lastAppId = null)
        {
            return $"https://api.steampowered.com/IStoreService/GetAppList/v1/?key={GetSteamKey()}" +
                (lastAppId.HasValue ? $"&last_appid={lastAppId}" : "");
        }
        public SteamAppListDumpService(IHttpClientFactory httpClientFactory)
        {
            _httpClientFactory = httpClientFactory;
        }
        protected override async Task ExecuteAsync(CancellationToken stoppingToken)
        {
            while (!stoppingToken.IsCancellationRequested)
            {
                await DumpAppList();
                await Task.Delay(RepeatDelay, stoppingToken);
            }
        }
        public record SteamApiGetAppListApp(int appid, string name, int last_modified, int price_change_number);
        public record SteamApiGetAppListResponse(List<SteamApiGetAppListApp> apps, bool have_more_results, int last_appid);
        public record SteamApiGetAppListOuterResponse(SteamApiGetAppListResponse response);
        protected async Task DumpAppList()
        {
            try
            {
                var httpClient = _httpClientFactory.CreateClient();
                var appList = new List<SteamApiGetAppListApp>();
                int? lastAppId = null;
                do
                {
                    using var response = await httpClient.GetAsync(GetAppListUrl(lastAppId));
                    if (!response.IsSuccessStatusCode) throw new Exception($"API Returned Invalid Status Code: {response.StatusCode}");
                    var responseString = await response.Content.ReadAsStringAsync();
                    var responseObject = JsonSerializer.Deserialize<SteamApiGetAppListOuterResponse>(responseString)!.response;
                    appList.AddRange(responseObject.apps);
                    lastAppId = responseObject.have_more_results ? responseObject.last_appid : null;
                } while (lastAppId != null);
                var contentBytes = JsonSerializer.SerializeToUtf8Bytes(appList);
                using var output = File.OpenWrite(Path.Combine(Config.DumpDataPath, DateTime.UtcNow.ToString("yyyy-MM-dd__HH-mm-ss") + ".json.gz"));
                using var gz = new GZipStream(output, CompressionMode.Compress);
                gz.Write(contentBytes, 0, contentBytes.Length);
            }
            catch (Exception ex)
            {
                Trace.TraceError("skipped...");
            }
        }
    }
The API returns approx. 16 MB of data in total, which the service then compresses and saves to a 4 MB file every 30 minutes; it does nothing else. Between runs, once the garbage collector has run, I would expect memory consumption to drop to almost nothing, but instead it increases over time. For example, after running for 2 hours on my PC it is consuming 700 MB of memory, and after 24 hours on my server it is consuming 2.5 GB.
As far as I can tell, all the streams are disposed and HttpClient is created using the recommended IHttpClientFactory. Does anyone know why this basic functionality is consuming so much memory even after garbage collection? I've tried looking at a managed memory dump in VS but couldn't find much that was useful. Does this point to a memory leak in one of the classes (e.g. HttpClient / SerializeToUtf8Bytes), or am I missing something?
The responseString and contentBytes are usually around 2 MB.
Any time you allocate a contiguous block of memory >= 85,000 bytes in size, it goes onto the large object heap (LOH). Unlike the regular heap, the LOH isn't compacted unless you do so manually [1], so it can grow due to fragmentation, giving the appearance of a memory leak. See Why Large Object Heap and why do we care?.
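A quick way to see the threshold in action, as a minimal sketch assuming default GC settings (the threshold is configurable): GC.GetGeneration reports large object heap allocations as generation 2, so a freshly allocated array of 85,000 bytes shows up there, while a slightly smaller one typically starts out in generation 0. The variable names are just for illustration.

```csharp
using System;

// An array whose total size is at or above 85,000 bytes is allocated on the
// large object heap, which GC.GetGeneration reports as GC.MaxGeneration (2).
var small = new byte[80_000];
var large = new byte[85_000];

Console.WriteLine($"small array generation: {GC.GetGeneration(small)}");
Console.WriteLine($"large array generation: {GC.GetGeneration(large)}");
Console.WriteLine($"GC.MaxGeneration:       {GC.MaxGeneration}");
```

A 2 MB string or byte[] is far above this threshold, so every such buffer lands on the large object heap.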
As your responseString and contentBytes are usually around 2 MB, I would recommend rewriting your code to eliminate them. Instead, asynchronously stream directly from the server to your JSON file using the relevant built-in APIs.
Notes:
GZipStream does not buffer its input, so there is a chance that streaming to it incrementally can result in worse compression ratios. However, as discussed by Bradley Grainger in Always wrap GZipStream with BufferedStream, buffering the incremental writes using a buffer that is 8K or larger effectively eliminates the problem.
According to the docs, the useAsync argument to the FileStream constructor can either improve or degrade performance, depending on how the application does its I/O. Thus you may need to test to see whether, in practice, you get better performance with UseAsyncFileStreams (the value passed as useAsync) equal to true or false. You may also need to play around with the buffer sizes to get the best performance and compression ratio, always being sure to keep the buffer smaller than 85,000 bytes.
If you think large object heap fragmentation may be a problem, see the MSFT article The large object heap on Windows systems: A debugger for suggestions on how to investigate further.
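Since your DumpAppList() method only runs every half hour anyway, you might try compacting the large object heap manually after each run (by setting GCSettings.LargeObjectHeapCompactionMode to GCLargeObjectHeapCompactionMode.CompactOnce and then calling GC.Collect()) to see if that helps.
You may want to pass the CancellationToken stoppingToken into DumpAppList().
[1] Do note that, in Memory management and garbage collection (GC) in ASP.NET Core: Large object heap, MSFT writes that in containers using .NET Core 3.0 and later, the LOH is automatically compacted. So my statement about when LOH compaction occurs may be out of date on certain platforms.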
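For illustration, the streaming approach and the optional compaction step described in this answer might be sketched as follows. This is a reconstruction under assumptions, not the answer's original code: the StreamingDump class, its method names, the shortened record names, the getUrl delegate, and the 16 KB buffer sizes are all hypothetical. Only UseAsyncFileStreams is a name taken from the notes above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Net.Http;
using System.Runtime;
using System.Text.Json;
using System.Threading;
using System.Threading.Tasks;

// Same shapes as the records in the question, with shortened (hypothetical) names.
public record SteamApp(int appid, string name, int last_modified, int price_change_number);
public record SteamAppListPage(List<SteamApp> apps, bool have_more_results, int last_appid);
public record SteamAppListEnvelope(SteamAppListPage response);

public static class StreamingDump
{
    // Assumed tuning knobs; both buffers stay well below the 85,000-byte threshold.
    private const bool UseAsyncFileStreams = true;
    private const int FileBufferSize = 16 * 1024;
    private const int GZipBufferSize = 16 * 1024;

    // Deserializes one page directly from the response stream, so no
    // multi-megabyte string is ever materialized.
    public static async Task<SteamAppListPage> FetchPageAsync(HttpClient client, string url, CancellationToken ct)
    {
        using var response = await client.GetAsync(url, HttpCompletionOption.ResponseHeadersRead, ct);
        response.EnsureSuccessStatusCode();
        await using var body = await response.Content.ReadAsStreamAsync(ct);
        var envelope = await JsonSerializer.DeserializeAsync<SteamAppListEnvelope>(body, cancellationToken: ct);
        return envelope?.response ?? throw new InvalidDataException("Empty or malformed response.");
    }

    // Serializes straight into BufferedStream -> GZipStream -> FileStream,
    // so no byte[] holding the whole payload is allocated.
    public static async Task WriteCompressedJsonAsync<T>(T value, string path, CancellationToken ct)
    {
        await using var file = new FileStream(path, FileMode.Create, FileAccess.Write, FileShare.None,
                                              FileBufferSize, UseAsyncFileStreams);
        await using var gzip = new GZipStream(file, CompressionMode.Compress);
        await using var buffered = new BufferedStream(gzip, GZipBufferSize);
        await JsonSerializer.SerializeAsync(buffered, value, cancellationToken: ct);
        // Disposal runs in reverse order: buffered, then gzip, then file.
    }

    // Pages through the API and writes the combined list to outputPath.
    // getUrl is a stand-in for the question's GetAppListUrl method.
    public static async Task DumpAppListAsync(HttpClient client, Func<int?, string> getUrl,
                                              string outputPath, CancellationToken ct)
    {
        var apps = new List<SteamApp>();
        int? lastAppId = null;
        do
        {
            var page = await FetchPageAsync(client, getUrl(lastAppId), ct);
            if (page.apps is not null) apps.AddRange(page.apps);
            lastAppId = page.have_more_results ? page.last_appid : null;
        } while (lastAppId != null);

        await WriteCompressedJsonAsync(apps, outputPath, ct);
    }

    // Optional: request a one-off LOH compaction as part of a blocking collection.
    public static void CompactLargeObjectHeap()
    {
        GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
        GC.Collect();
    }
}
```

From ExecuteAsync, this would be called as something like await StreamingDump.DumpAppListAsync(httpClient, GetAppListUrl, path, stoppingToken), optionally followed by StreamingDump.CompactLargeObjectHeap(). Note that the accumulated list still grows a backing array that can itself exceed 85,000 bytes for a large catalogue, so this reduces large allocations rather than eliminating them.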