Initial problem:
I have the following issue: I am joining two CSVs using Java. While I can "stream" one of the CSVs (read in, process, write out line by line), the smaller one resides in memory (a HashMap, to be precise), because I need to look up the key of each row of the big CSV while going through it. The problem: if the "small" CSV is too large to keep in memory, I run into OutOfMemoryError. While I know I could avoid this by reading both CSVs into a database and performing the join there, that is infeasible in my application. Is there a Java wrapper (or some other kind of object) that would allow me to keep only the HashMap's keys in memory and put all of its values into a temporary file on disk (in a self-managed fashion)?
Update:
After the comments of ThomasKläger and JacobG, I solved the problem in the following way:

Use a HashMap to store each row's key together with that row's start and end position, obtained via RandomAccessFile's .getFilePointer().

While going through the large CSV, I now use the HashMap to look up each matching row's position, .seek(pos) to it, and read the row.

This is a working solution, thanks a lot.
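The approach from the update can be sketched roughly as below. This is a minimal, self-contained sketch, not the poster's actual code: the class name, file layout, and column choices are hypothetical, and for simplicity it stores only each row's start offset and relies on readLine() to find the end of the row.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CsvOffsetJoin {

    // Scan the small CSV once, keeping only key -> byte offset in memory.
    static Map<String, Long> buildIndex(File smallCsv, int keyColumn) throws IOException {
        Map<String, Long> index = new HashMap<>();
        try (RandomAccessFile raf = new RandomAccessFile(smallCsv, "r")) {
            long pos = raf.getFilePointer(); // offset of the line about to be read
            String line;
            while ((line = raf.readLine()) != null) {
                String key = line.split(",")[keyColumn];
                index.put(key, pos);
                pos = raf.getFilePointer(); // offset of the next line
            }
        }
        return index;
    }

    // Jump to a recorded offset and re-read that row on demand.
    static String readRow(RandomAccessFile raf, long pos) throws IOException {
        raf.seek(pos);
        return raf.readLine();
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample data; real code would stream an existing large CSV.
        Path small = Files.createTempFile("small", ".csv");
        Files.write(small, List.of("a,1", "b,2", "c,3"));

        Map<String, Long> index = buildIndex(small.toFile(), 0);
        try (RandomAccessFile raf = new RandomAccessFile(small.toFile(), "r")) {
            for (String bigRow : List.of("a,x", "c,y", "z,q")) {
                String key = bigRow.split(",")[0];
                Long pos = index.get(key);
                if (pos != null) {
                    // Join: append the small row's value column to the big row.
                    System.out.println(bigRow + "," + readRow(raf, pos).split(",")[1]);
                }
            }
        }
        Files.delete(small);
    }
}
```

One caveat worth noting: RandomAccessFile.readLine() decodes bytes as if they were Latin-1, so for multi-byte encodings such as UTF-8 it is safer to store start and end offsets (as the update does) and decode the byte range with an explicit charset instead.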
Based on what you describe, you need something like off-heap collections, for example the MapDb library, http://www.mapdb.org/. From its description: