I was working with real-world data with a couple hundred million rows. Since Python took ages to process them, I decided to experiment with Codon for faster execution. It worked well on smaller amounts of data, showing a 20x speedup, but as the data size increased, it started slowing down. Moreover, it also consumed exorbitant amounts of memory and finally crashed.
Plain Python completes the run at about 50 GB of memory consumption according to the Mac's Activity Monitor. Codon, on the other hand, uses more than 80 GB, slows down dramatically, and crashes at about the 25% mark. I observed similar memory consumption even when I split the work into smaller batches, which suggests that memory consumed in earlier batches is not being released after those batches finish processing. I also tried using the 'del' keyword and calling gc.collect() as early as possible, but neither made any difference. As it stands, Codon does not seem suitable for applications where scalability matters.
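For context, this is roughly the batching pattern I tried (the function and variable names here are illustrative, not my actual pipeline):

```python
import gc

def process_batch(rows):
    # Placeholder for the real per-batch work.
    return sum(len(r) for r in rows)

def run(all_rows, batch_size=1_000_000):
    results = []
    for start in range(0, len(all_rows), batch_size):
        batch = all_rows[start:start + batch_size]
        results.append(process_batch(batch))
        del batch      # drop the reference as soon as the batch is done
        gc.collect()   # ask the collector to reclaim the memory immediately
    return results
```

Under Codon, memory usage keeps climbing across iterations of this loop even though `batch` is deleted and `gc.collect()` is called each time.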
To anybody who has experience with the Codon compiler: do you know how Codon manages memory? What should I be doing to explicitly release memory held by variables that are no longer needed?
Please let me know if you need any further information from me. Thanks in advance.