I am working on a sample to show how to read a large file and process it in chunks. However, it seems to be misbehaving, returning an invalid number of bytes read.
For example, imagine we have a test file with a length of 8320 bytes, and we use the following code to read it in chunks of 4096 bytes:
using System;
using System.Buffers;
using System.Collections.Generic;
using System.IO;
using System.IO.MemoryMappedFiles;
using System.Runtime.CompilerServices;
using System.Threading;

public static async IAsyncEnumerable<byte[]> ConvertInChunks(string filePath, int chunkSize,
    [EnumeratorCancellation] CancellationToken cancellationToken = default)
{
    byte[]? buffer = null;
    try
    {
        // Rent a reusable buffer; Rent may return a larger array, so slice to chunkSize.
        buffer = ArrayPool<byte>.Shared.Rent(chunkSize);
        var memory = buffer.AsMemory(0, chunkSize);
        var fileLength = new FileInfo(filePath).Length;
        using var mm = MemoryMappedFile.CreateFromFile(filePath);
        await using var accessor = mm.CreateViewStream();
        var bytesRead = 0L;
        while (bytesRead < fileLength)
        {
            // Read the next chunk from the memory-mapped view stream.
            var read = await accessor.ReadAsync(memory, cancellationToken);
            bytesRead += read;
            yield return memory[..read].ToArray();
        }
    }
    finally
    {
        if (buffer is not null)
            ArrayPool<byte>.Shared.Return(buffer);
    }
}
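For reference, a minimal driver like the following reproduces the behavior (the path here is just a placeholder for my 8320-byte test file):

var path = @"C:\temp\test.bin"; // placeholder path for the test file
await foreach (var chunk in ConvertInChunks(path, 4096))
{
    Console.WriteLine($"Received chunk of {chunk.Length} bytes");
}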
What I am finding is that the final iteration still returns a read value of 4096, even though only 128 bytes of the 8320-byte file remain after two full 4096-byte reads. This seems like a bug to me, but maybe I am doing something incorrectly in my setup. I can work around this by explicitly slicing my memory buffer to the minimum of the bytes remaining and 4096, but I really expected ReadAsync to return either 4096 bytes or fewer when less data remains in the stream. Note that I also tried the other ReadAsync overload, which takes a buffer with an explicit offset and count, and it exhibited the same behavior.
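For completeness, the workaround looks roughly like this inside the read loop, a sketch using the same variable names as the sample above that clamps each read to the bytes actually remaining:

while (bytesRead < fileLength)
{
    // Never ask for more than the file has left, so the view stream
    // cannot hand back padding from beyond the end of the file.
    var toRead = (int)Math.Min(chunkSize, fileLength - bytesRead);
    var read = await accessor.ReadAsync(memory[..toRead], cancellationToken);
    bytesRead += read;
    yield return memory[..read].ToArray();
}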
All credit to @MarkGravell for this answer:
The solution is simple: set the view stream size explicitly in the create call:
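// Size the view to the actual file length rather than letting it default
// to the full mapping, which is rounded up to the system page size.
await using var accessor = mm.CreateViewStream(0, fileLength);

With no arguments, CreateViewStream() covers the whole mapping, and memory-mapped views are allocated in whole system pages, so for the 8320-byte test file the stream's length is rounded up to 12288 bytes. The final ReadAsync therefore returns a full 4096 bytes, of which only the first 128 are real data and the rest are zero padding. Passing an explicit offset of 0 and a size of fileLength keeps the stream bounded to the file itself, and the final read returns 128 as expected.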