The piece of code below gives me a memory error when I run it. It is part of a simple neural net I'm building. Keep in mind that all the math is written out for learning purposes:
    learning_rate = 0.2
    costs = []
    for i in range(50000):
        ri = np.random.randint(len(data))
        point = data[ri]
        ## print (point)
        z = point[0] * w1 + point[1] * w2 + b
        pred = sigmoid(z)
        target = point[2]
        cost = np.square(pred-target)
        dcost_pred = 2* (pred-target)
        dpred_dz = sigmoid_p(z)
        dz_dw1 = point[0]
        dz_dw2 = point[1]
        dz_db = 1
        dcost_dz = dcost_pred*dpred_dz
        dcost_dw1 = dcost_pred*dpred_dz*dz_dw1
        dcost_dw2 = dcost_pred*dpred_dz
        dcost_db = dcost_dz * dz_db
        w1 = w1 - learning_rate*dcost_dw1
        w2 = w2 - learning_rate*dcost_dw2
        b = b - learning_rate*dcost_db
        if i % 100 == 0:
            for j in range(len(data)):
                cost_sum = 0
                point = data[ri]
                z = point[0]*w1+point[1]*w2+b
                pred = sigmoid(z)
                target = point[2]
                cost_sum += np.square(pred-target)
When the program gets to this part it results in the following error:
    Traceback (most recent call last):
      File "D:\First Neual", line 89, in <module>
      File "C:\Program Files (x86)\Python36-32\lib\site-packages\matplotlib\", line 3261, in plot
        ret = ax.plot(*args, **kwargs)
      File "C:\Program Files (x86)\Python36-32\lib\site-packages\matplotlib\", line 1717, in inner
        return func(ax, *args, **kwargs)
      File "C:\Program Files (x86)\Python36-32\lib\site-packages\matplotlib\axes\", line 1373, in plot
      File "C:\Program Files (x86)\Python36-32\lib\site-packages\matplotlib\axes\", line 1779, in add_line
      File "C:\Program Files (x86)\Python36-32\lib\site-packages\matplotlib\axes\", line 1801, in _update_line_limits
        path = line.get_path()
      File "C:\Program Files (x86)\Python36-32\lib\site-packages\matplotlib\", line 957, in get_path
      File "C:\Program Files (x86)\Python36-32\lib\site-packages\matplotlib\", line 667, in recache
        self._xy = np.column_stack(np.broadcast_arrays(x, y)).astype(float)
      File "C:\Program Files (x86)\Python36-32\lib\site-packages\numpy\lib\", line 353, in column_stack
        return _nx.concatenate(arrays, 1)
Are there any ways to make this code more efficient? Maybe using generators?
You probably meant to move cost_sum = 0 out of the loop!
Memory Error
You're attempting to plot 50000 points, and matplotlib certainly doesn't take kindly to that, so you might want to reduce the number of points you plot. It looks like you tried to do this in the loop at the bottom of the code?
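One way to cut down the number of plotted points, as a minimal sketch: keep only every step-th value before handing the list to matplotlib. The helper name downsample, the step of 100 and the stand-in costs list are my own assumptions for illustration, not something from your code.

```python
def downsample(values, step):
    """Return every `step`-th element of `values`, starting with the first."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    return values[::step]


# Stand-in for a list holding one cost per training iteration (assumed data).
costs = [1.0 / (i + 1) for i in range(50000)]

thinned = downsample(costs, 100)
print(len(costs), "->", len(thinned))

# With matplotlib available, the thinned list would be plotted a single time:
# import matplotlib.pyplot as plt
# plt.plot(thinned)
# plt.show()
```

Slicing copies only the values that are kept, so the list passed to the plot call is a hundredth of the original size.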
Let's address the efficiency part of your question. I can't comment on how to make the algorithm itself more efficient, but I thought I'd offer my wisdom on making Python code run faster.
I'll start off by saying this: the Python compiler (that sits behind the interpreter) does next to no optimisation, so the common optimisations that compilers usually apply for you have to be applied by hand here.
Common Subexpression Elimination
In your code you compute dcost_dz, dcost_dw1 and dcost_dw2 each from scratch. This is really inefficient! We are recomputing dcost_pred*dpred_dz 3 times! It would be much better to just make use of the variable we've already assigned to, dcost_dz, when computing the other two. To put this into perspective, this is 4 instructions fewer each iteration of the loop (14 vs 10).
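A minimal sketch of reusing the shared product, with the variable names from your loop. The sigmoid and sigmoid_p definitions are assumptions (your post doesn't show them), and gradients is a wrapper name I made up. I also multiply dcost_dw2 by the second input, which the chain rule calls for but your snippet leaves out.

```python
import math


def sigmoid(x):
    # Assumed definition: the logistic function.
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_p(x):
    # Assumed definition: derivative of the logistic function.
    s = sigmoid(x)
    return s * (1.0 - s)


def gradients(point, w1, w2, b):
    """Return (dcost_dw1, dcost_dw2, dcost_db) for one (x, y, target) point."""
    x, y, target = point
    z = x * w1 + y * w2 + b
    pred = sigmoid(z)

    dcost_pred = 2 * (pred - target)
    dpred_dz = sigmoid_p(z)

    # Compute the shared product once and reuse it for all three gradients.
    dcost_dz = dcost_pred * dpred_dz
    dcost_dw1 = dcost_dz * x   # dz_dw1 is the first input
    dcost_dw2 = dcost_dz * y   # dz_dw2 is the second input
    dcost_db = dcost_dz        # dz_db is 1
    return dcost_dw1, dcost_dw2, dcost_db


print(gradients((3.0, 1.5, 1.0), 0.1, -0.2, 0.05))
```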
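In a similar vein, you index into point twice per iteration for each coordinate. Why not unpack it once with something like (x, y, target) = data[ri] instead (assuming each point holds exactly the two inputs and the target)?
Loop Hoisting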
You also do a bit of recomputation during your iterations of the loop. For instance, every iteration you find the size of the data up to 3 different times, and this doesn't change. So compute len(data) once before the loop and reuse the result.
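As a rough sketch of that hoisting: the data values here are made up, n_points is a name I picked, and I've used the standard library's random.randrange in place of np.random.randint purely to keep the snippet self-contained.

```python
import random

# Made-up stand-in for the training data: (x, y, target) triples.
data = [(3.0, 1.5, 1), (2.0, 1.0, 0), (4.0, 1.5, 1), (3.0, 1.0, 0)]

# Hoisted: len(data) never changes, so compute it once before the loop.
n_points = len(data)

picked = []
for i in range(1000):
    ri = random.randrange(n_points)   # reuses the precomputed length
    picked.append(ri)

print(min(picked), max(picked))
```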
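Also, that np.random.randint is costing you 3 instructions just to look up, before even thinking about the call. If you were being really performance sensitive, you'd move the lookup out of the loop too, by binding it to a local name (for example randint = np.random.randint) before the loop starts.
Strength Reduction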
Sometimes some operators perform faster than others. For instance, w1 = w1 - learning_rate*dcost_dw1 uses a binary subtract instruction, but w1 -= learning_rate*dcost_dw1 uses an in-place subtract instruction. It's likely the in-place instruction is a bit faster, so it's something to consider (but you'd need to test that theory).
Plot Every Iteration
As stated in the comments (nicely spotted @HFBrowning), you are plotting all the points every iteration, which is going to absolutely cripple performance!
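A sketch of what recording the costs in the loop and plotting once afterwards could look like. The helper name mean_cost, the sample data, the starting weights and the sigmoid definition are all assumptions for illustration; the matplotlib calls are left as comments so the snippet runs without a display.

```python
import math
import random


def sigmoid(x):
    # Assumed definition: the logistic function.
    return 1.0 / (1.0 + math.exp(-x))


def mean_cost(data, w1, w2, b):
    """Average squared error over every point in `data`."""
    cost_sum = 0.0                    # initialised once, before the loop
    for x, y, target in data:         # visits each point in turn
        pred = sigmoid(x * w1 + y * w2 + b)
        cost_sum += (pred - target) ** 2
    return cost_sum / len(data)


# Made-up (x, y, target) triples standing in for the real data.
data = [(3.0, 1.5, 1), (2.0, 1.0, 0), (4.0, 1.5, 1), (3.0, 1.0, 0)]
w1, w2, b = random.random(), random.random(), random.random()

costs = []
for i in range(50000):
    # ... the weight update from the question would go here ...
    if i % 100 == 0:
        costs.append(mean_cost(data, w1, w2, b))   # record only, no plotting

# Plot a single time, after training has finished:
# import matplotlib.pyplot as plt
# plt.plot(costs)
# plt.show()
print(len(costs))
```

Recording every 100th iteration also means the final plot has a few hundred points rather than tens of thousands.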
Since you know the size of the list you are inserting into (50500 or something?), you can allocate the list to be that size with something like:
    costs = [0] * 50500
This saves the time spent reallocating the list as it fills up. You'd stop using append and start assigning to an index. Bear in mind, though, that this will cause weirdness in your plot unless you only plot once after the loop has finished!
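A minimal sketch of the preallocation idea, with the sizes scaled down. The names n_iterations, record_every and n_records are made up for the example, and the recorded value is a placeholder rather than a real cost.

```python
n_iterations = 5000
record_every = 100

# Number of values that will be recorded: one per `record_every` iterations.
n_records = (n_iterations + record_every - 1) // record_every

# Allocate the full list up front instead of growing it with append().
costs = [0.0] * n_records

for i in range(n_iterations):
    if i % record_every == 0:
        placeholder_cost = 1.0 / (i + 1)             # stand-in for the real cost
        costs[i // record_every] = placeholder_cost  # assign by index

print(len(costs), costs[0], costs[-1])
```

Whether this is measurably faster than append for a list of this size is worth timing before relying on it.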