perl6 How to give more memory to MoarVM?

I have to run a data analysis on about 2 million lines of data, each line about 250 bytes long, so about 500 megabytes of data in total. I am running the latest Rakudo on VirtualBox Linux with 4 GB of memory.

After about 8 hours, I got a MoarVM panic due to running out of memory. How do I give more memory to MoarVM? Unfortunately, I cannot break up the 2 million lines into chunks and write them to files first, because part of the data analysis requires all 2 million lines.

There are 2 answers

raiph (best answer)

I suggest you tackle your problem in several steps:

  • Prepare two small sample files if you haven't already. Keep them very small. I suggest a 2,000-line file and a 20,000-line one. If you already have sample files of around those lengths, they will do. Run your program on each file, noting how long each run took and how much memory it used.

  • Update your question with your notes about duration and RAM use, plus links to your source code and the sample files if possible.

  • Run the two sample files again but using the profiler as explained here. See what there is to see and update your question.
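The sampling and timing steps above could be sketched roughly as follows. This is only an illustration: write-sample and time-it are hypothetical helper names, and the file names in the commented usage lines are placeholders for your own. Memory use is not measured here; on Linux, running the program under /usr/bin/time -v reports the maximum resident set size.

```raku
# Hypothetical helper: copy the first $count lines of $source into $target.
sub write-sample(IO() $source, IO() $target, Int $count) {
    my $in  = $source.open(:r);
    my $out = $target.open(:w);
    # .lines is lazy, so only the requested lines are read from the source.
    for $in.lines.head($count) -> $line {
        $out.say($line);
    }
    $out.close;
    $in.close;
}

# Hypothetical helper: run a block and return the elapsed wall-clock time.
sub time-it(&code) {
    my $start = now;
    code();
    return now - $start;
}

# Usage (placeholders: full-data.txt and analyse() stand for your own file and code):
# write-sample('full-data.txt', 'sample-2k.txt',  2_000);
# write-sample('full-data.txt', 'sample-20k.txt', 20_000);
# say time-it({ analyse('sample-2k.txt') });
```

Comparing the timings for the 2,000-line and 20,000-line files gives a first hint of whether the run time grows linearly with the input or much faster.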

If you don't know how to do any of these things, ask in the comments.

If all the above is fairly easy, repeat for a 100,000 line file.

Then we should have enough data to give you better guidance.

Jonathan Worthington

MoarVM doesn't have its own upper limit on memory (unlike, for example, the JVM). Rather, it gives an "out of memory" or "memory allocation failed" error only when memory is requested from the operating system and that request is refused. That may be because of configured memory limits, or it may really be that there just isn't that much available RAM/swap space to satisfy the request that was made (likely if you haven't configured limits).
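One way to check whether a configured limit is in play, assuming a Linux system where /proc/self/limits is available, is to print the memory-related limits from inside the process. The sub name memory-limits is a hypothetical helper for this sketch, not part of Rakudo.

```raku
# Linux-only sketch: return the memory-related resource limit lines
# that apply to the current process, or an empty list if the file is absent.
sub memory-limits(IO() $path = '/proc/self/limits'.IO) {
    return () unless $path.e;
    return $path.lines.grep(
        / ^ 'Max ' [ 'address space' | 'resident set' | 'data size' ] /
    ).List;
}

.say for memory-limits();
```

If these lines all say "unlimited", the refusal most likely comes from the machine (here, the virtual machine) simply running out of RAM and swap.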

It's hard to provide specific advice on what to try next given that there are few details of the program in the question, but some things that might help are:

  • If you are processing the data in the file into some other data structure, and it's possible to do so, read the file lazily (for example, for $fh.lines { ... } will only need to keep the Str for the line currently being processed in memory, while my @lines = $fh.lines; for @lines { } will keep all of the Str objects around).
  • Is the data in the file ASCII or Latin-1? If so, pass :enc<ascii> or similar when opening the file. This may lead to a smaller memory representation.
  • If keeping large arrays of integers, numbers, or strings, consider using natively typed arrays. For example, if you have my int8 @a and store a million elements, it takes 1 MB of memory; do that with my @a and each element will be a boxed object inside a Scalar container, which on a 64-bit machine could eat over 70 MB. The same applies if you have an object that you make many instances of and can make some of its attributes native.
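A minimal sketch combining the three suggestions above, assuming a hypothetical ASCII file with one integer per line (the question does not say what the real data looks like). The sub name column-stats and the file name in the usage comment are made up for illustration.

```raku
# Read a file lazily, decode it as ASCII, and keep the values in a native array.
sub column-stats(IO() $path) {
    # Native array: elements are stored unboxed, with no Scalar container each.
    my int @values;

    # :enc<ascii> is only safe if the file really contains nothing but ASCII.
    my $fh = $path.open(:r, :enc<ascii>);

    # Iterating $fh.lines directly keeps only the current line's Str alive.
    for $fh.lines -> $line {
        @values.push: $line.Int;
    }
    $fh.close;

    return @values.elems, @values.sum;
}

# Usage (numbers.txt is a placeholder):
# my ($count, $total) = column-stats('numbers.txt');
# say "$count values, total $total";
```

If the values are known to fit in a smaller range, a narrower element type such as int32 or int8 reduces the array's footprint further.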