Mapping files bigger than 2GB with Java


Stated generally: how do you implement a method byte[] get(long offset, int length) for a memory-mapped file that is bigger than 2GB in Java?

For context:

I'm trying to efficiently read files bigger than 2GB with random I/O. The obvious idea is to use Java NIO and its memory-mapping API.

The problem is the 2GB limit on memory mapping: MappedByteBuffer is indexed by an int, so a single FileChannel.map() call cannot cover more than Integer.MAX_VALUE bytes. One solution would be to map multiple regions of up to 2GB each and index into them through the offset, as in the sketch below.
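For illustration, here is a minimal sketch of that multi-region approach. The class name ChunkedMapper and the 1GB chunk size are my own illustrative choices, not anything from the question; the only hard constraint is that each region passed to FileChannel.map() must fit in an int.

    import java.io.IOException;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;
    import java.nio.file.Path;
    import java.nio.file.StandardOpenOption;

    // Hypothetical sketch: map a big file as an array of fixed-size regions.
    public class ChunkedMapper {
        // Each chunk must fit in an int, i.e. be <= Integer.MAX_VALUE bytes.
        static final long CHUNK_SIZE = 1L << 30; // 1GB

        final MappedByteBuffer[] chunks;

        ChunkedMapper(Path file) throws IOException {
            try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
                long size = ch.size();
                int n = (int) ((size + CHUNK_SIZE - 1) / CHUNK_SIZE);
                chunks = new MappedByteBuffer[n];
                for (int i = 0; i < n; i++) {
                    long start = i * CHUNK_SIZE;
                    long len = Math.min(CHUNK_SIZE, size - start);
                    // Mappings remain valid after the channel is closed.
                    chunks[i] = ch.map(FileChannel.MapMode.READ_ONLY, start, len);
                }
            }
        }

        // Single-byte read at an absolute file offset.
        byte get(long offset) {
            int chunk = (int) (offset / CHUNK_SIZE);
            int within = (int) (offset % CHUNK_SIZE);
            return chunks[chunk].get(within);
        }
    }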

There's a similar solution here:

Binary search in a sorted (memory-mapped ?) file in Java

The problem with that solution is that it's designed to read a single byte at a time, while my API is supposed to return a byte[] (so my API would be something like read(offset, length)).

Would it work to simply change that final get() to a get(offset, length)? What happens when the byte[] I'm reading lies across the boundary between two pages?


1 Answer

Answered by Stu Thompson:

No, my answer to Binary search in a sorted (memory-mapped ?) file in Java would not work if you simply changed get() to get(offset, length), because of the boundaries between mapped regions, just as you suspect. I can see two possible solutions:

  1. Overlap the memory-mapped regions. When you do a read, pick the region whose start byte is closest to, but not after, the read's start byte. This approach won't work for reads larger than 50% of the maximum memory-map size.
  2. Write a byte-array read method that copies from two different memory-mapped regions when the requested range straddles a boundary (a sketch follows this list). I'm not keen on this approach, as I think some of the performance gain will be lost because the resulting array is a plain heap array, not memory-mapped.
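As a rough illustration of option 2, here is a sketch of a get(offset, length) that copies bytes out of one or two adjacent mapped regions. It assumes the hypothetical chunks array and CHUNK_SIZE constant from the ChunkedMapper sketch earlier on this page, and, as the answer notes, the returned byte[] is an ordinary heap array, so the copy gives up the zero-copy benefit of the mapping.

    // Hypothetical sketch of option 2, reusing the chunks/CHUNK_SIZE fields
    // from the ChunkedMapper sketch above. The loop copies from one region,
    // then continues into the next if the range straddles a chunk boundary.
    byte[] get(long offset, int length) {
        byte[] dst = new byte[length];
        int copied = 0;
        while (copied < length) {
            long pos = offset + copied;
            int chunk = (int) (pos / CHUNK_SIZE);
            int within = (int) (pos % CHUNK_SIZE);
            // duplicate() gives an independent position/limit, so concurrent
            // readers don't interfere with the shared mapped buffer's state.
            java.nio.ByteBuffer buf = chunks[chunk].duplicate();
            buf.position(within);
            int n = Math.min(length - copied, buf.remaining());
            buf.get(dst, copied, n);
            copied += n;
        }
        return dst;
    }

A read that fits inside one region takes a single pass through the loop; a read that crosses a boundary takes two, with the second pass starting at offset 0 of the next chunk.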