Usually, when I need to evaluate a complicated formula, I break it down into two or more lines to make the code more comprehensible. However, while profiling some code that calculates RMSE, I discovered that doing this appears to increase my code's memory use. Here's a simplified example:
import numpy as np
import random
from memory_profiler import profile

@profile
def fun1():
    # very large datasets (~750 MB each)
    predicted = np.random.rand(100000000)
    observed = np.random.rand(100000000)
    # calculate residuals as an intermediate step
    residuals = observed - predicted
    # calculate RMSE
    RMSE = np.mean(residuals ** 2) ** 0.5
    # delete residuals
    del residuals

@profile
def fun2():
    # same sized data
    predicted = np.random.rand(100000000)
    observed = np.random.rand(100000000)
    # same calculation, but with residuals and RMSE computed on one line
    RMSE = np.mean((observed - predicted) ** 2) ** 0.5

if __name__ == "__main__":
    fun1()
    fun2()
Output:
Filename: memtest.py
Line # Mem usage Increment Line Contents
================================================
5 19.9 MiB 0.0 MiB @profile
6 def fun1():
7 782.8 MiB 763.0 MiB predicted = np.random.rand(100000000)
8 1545.8 MiB 762.9 MiB observed = np.random.rand(100000000)
9 2308.8 MiB 763.0 MiB residuals = observed - predicted
10 2308.8 MiB 0.1 MiB RMSE = np.mean(residuals ** 2) ** 0.5
11 1545.9 MiB -762.9 MiB del residuals
Filename: memtest.py
Line # Mem usage Increment Line Contents
================================================
13 20.0 MiB 0.0 MiB @profile
14 def fun2():
15 783.0 MiB 762.9 MiB predicted = np.random.rand(100000000)
16 1545.9 MiB 762.9 MiB observed = np.random.rand(100000000)
17 1545.9 MiB 0.0 MiB RMSE = np.mean((observed - predicted) ** 2) ** 0.5
As you can see, the first function (where the calculation is split) appears to require an additional ~750 MB at peak, presumably the cost of the residuals array. However, both functions have to create that array; the only difference is that the first function assigns it a name. This is contrary to my understanding of how memory management in Python is supposed to work.
So, what's going on here? One thought is that this could be an artifact of the memory_profiler module. Watching the Windows Task Manager during a run shows a similar pattern (though I know that's not a terribly trustworthy validation). If this is a "real" effect, what am I misunderstanding about the way memory is handled? Or is this somehow NumPy-specific?
memory_profiler's "Mem usage" column tells you the memory usage after each line completes, not the peak memory usage during that line. In the version where you don't save residuals, that temporary array is discarded before the line completes, so it never shows up in the profiler output. Both functions hit roughly the same peak; the profiler simply can't see the transient allocation in fun2.
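If you want to observe the transient peak rather than per-line snapshots, you can measure it directly. Below is a minimal sketch using the standard-library tracemalloc module (which NumPy's allocator reports to in recent NumPy versions); the measure_peak helper name is mine, and the smaller array size is just to keep the demo quick:

import numpy as np
import tracemalloc

def fun1(n):
    predicted = np.random.rand(n)
    observed = np.random.rand(n)
    residuals = observed - predicted  # named intermediate
    return np.mean(residuals ** 2) ** 0.5

def fun2(n):
    predicted = np.random.rand(n)
    observed = np.random.rand(n)
    # anonymous temporary, discarded before the line completes
    return np.mean((observed - predicted) ** 2) ** 0.5

def measure_peak(func, n):
    # hypothetical helper: peak traced memory (MiB) during one call
    tracemalloc.start()
    func(n)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak / 2**20

if __name__ == "__main__":
    n = 10_000_000  # smaller than the original 100M for a fast demo
    print("fun1 peak: %.1f MiB" % measure_peak(fun1, n))
    print("fun2 peak: %.1f MiB" % measure_peak(fun2, n))

Both calls should report approximately the same peak, confirming that the temporary array exists either way and that the difference in the per-line table is a reporting artifact, not a real saving.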