Python memory error when trying to build simple neural net


The below piece of code gives me a memory error when I run it. It is part of a simple neural net I'm building. Keep in mind that all the math is written out for learning purposes:

learning_rate = 0.2
costs = []
for i in range(50000):
    ri = np.random.randint(len(data))
    point = data[ri]
##    print (point)
    z = point[0] * w1 + point[1] * w2 + b
    pred = sigmoid(z)
    target = point[2]
    cost = np.square(pred-target)
    costs.append(cost)
    dcost_pred = 2* (pred-target)
    dpred_dz = sigmoid_p(z)

    dz_dw1 = point[0]
    dz_dw2 = point[1]
    dz_db = 1

    dcost_dz = dcost_pred*dpred_dz    
    dcost_dw1 = dcost_pred*dpred_dz*dz_dw1
    dcost_dw2 = dcost_pred*dpred_dz
    dcost_db = dcost_dz * dz_db

    w1 = w1 - learning_rate*dcost_dw1
    w2 = w2 - learning_rate*dcost_dw2
    b = b - learning_rate*dcost_db
    plt.plot(costs)


    if i % 100 == 0:

        for j in range(len(data)):
            cost_sum = 0
            point = data[ri]
            z = point[0]*w1+point[1]*w2+b
            pred = sigmoid(z)
            target = point[2]
            cost_sum += np.square(pred-target)
        costs.append(cost_sum/len(data))

When the program gets to this part it results in the following error:

Traceback (most recent call last):
  File "D:\First Neual Net.py", line 89, in <module>
    plt.plot(costs)
  File "C:\Program Files (x86)\Python36-32\lib\site-packages\matplotlib\pyplot.py", line 3261, in plot
    ret = ax.plot(*args, **kwargs)
  File "C:\Program Files (x86)\Python36-32\lib\site-packages\matplotlib\__init__.py", line 1717, in inner
    return func(ax, *args, **kwargs)
  File "C:\Program Files (x86)\Python36-32\lib\site-packages\matplotlib\axes\_axes.py", line 1373, in plot
    self.add_line(line)
  File "C:\Program Files (x86)\Python36-32\lib\site-packages\matplotlib\axes\_base.py", line 1779, in add_line
    self._update_line_limits(line)
  File "C:\Program Files (x86)\Python36-32\lib\site-packages\matplotlib\axes\_base.py", line 1801, in _update_line_limits
    path = line.get_path()
  File "C:\Program Files (x86)\Python36-32\lib\site-packages\matplotlib\lines.py", line 957, in get_path
    self.recache()
  File "C:\Program Files (x86)\Python36-32\lib\site-packages\matplotlib\lines.py", line 667, in recache
    self._xy = np.column_stack(np.broadcast_arrays(x, y)).astype(float)
  File "C:\Program Files (x86)\Python36-32\lib\site-packages\numpy\lib\shape_base.py", line 353, in column_stack
    return _nx.concatenate(arrays, 1)
MemoryError

Are there any ways to make this code more efficient? Maybe using generators?

1 Answer

Accepted answer, by J_mie6:

Bugs

You probably meant to move the cost_sum = 0 out of the loop! As written it is reset on every pass, so only the last point's cost survives into the average. Two more things worth checking: the inner loop indexes data[ri] where it presumably meant data[j], and dcost_dw2 is missing the *dz_dw2 factor, so w2 is updated with the wrong gradient.
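For reference, a corrected version of that evaluation loop might look like this (a sketch: sigmoid here is a hypothetical stand-in for the one defined earlier in the question, and data is assumed to be a list of (x, y, target) triples):

```python
import math

def sigmoid(z):
    # Hypothetical stand-in for the sigmoid defined elsewhere in the question.
    return 1.0 / (1.0 + math.exp(-z))

def average_cost(data, w1, w2, b):
    # cost_sum is initialised once, *before* the loop, and each point is
    # indexed with the loop variable j (the original re-used data[ri]).
    cost_sum = 0.0
    for j in range(len(data)):
        x, y, target = data[j]
        pred = sigmoid(x * w1 + y * w2 + b)
        cost_sum += (pred - target) ** 2
    return cost_sum / len(data)

# Tiny illustrative dataset of (feature1, feature2, target) triples.
data = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
print(average_cost(data, 0.5, 0.5, 0.0))
```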

Memory Error

You're attempting to plot 50,000 points, and matplotlib certainly doesn't take kindly to that, so you might want to reduce the number of points you plot. It appears you tried to do this with the averaging loop at the bottom of the code?

Efficiency

Let's address the efficiency part of your question. I can't comment on how to make the algorithm itself more efficient, but I thought I'd offer my wisdom on making Python code run faster.

I'll start off by saying this: the Python compiler (that sits behind the interpreter) does next to no optimisation, so the common optimisations that compilers usually apply for you fall to us to apply by hand here.

Common Subexpression Elimination

In your code you make use of stuff like this:

dcost_dz = dcost_pred*dpred_dz    
dcost_dw1 = dcost_pred*dpred_dz*dz_dw1
dcost_dw2 = dcost_pred*dpred_dz

This is really inefficient! We are recomputing dcost_pred*dpred_dz three times! It would be much better to just reuse the variable we've already assigned:

dcost_dz = dcost_pred*dpred_dz    
dcost_dw1 = dcost_dz*dz_dw1
dcost_dw2 = dcost_dz

To put this into perspective, that's four bytecode instructions fewer on each iteration of the loop (14 vs. 10).

In a similar vein, you access point[0] and point[1] twice per iteration; why not unpack with (x, y, target) = data[ri] instead?
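If you want to see the saving yourself, the stdlib dis module can count the bytecode instructions in each version; a quick sketch (exact counts vary between Python versions, so treat the printed numbers as illustrative):

```python
import dis

def repeated(dcost_pred, dpred_dz, dz_dw1):
    # Recomputes dcost_pred * dpred_dz three times, as in the original.
    dcost_dz = dcost_pred * dpred_dz
    dcost_dw1 = dcost_pred * dpred_dz * dz_dw1
    dcost_dw2 = dcost_pred * dpred_dz
    return dcost_dz, dcost_dw1, dcost_dw2

def shared(dcost_pred, dpred_dz, dz_dw1):
    # Reuses the product that has already been computed and stored.
    dcost_dz = dcost_pred * dpred_dz
    dcost_dw1 = dcost_dz * dz_dw1
    dcost_dw2 = dcost_dz
    return dcost_dz, dcost_dw1, dcost_dw2

count = lambda f: sum(1 for _ in dis.get_instructions(f))
print(count(repeated), count(shared))  # the shared version needs fewer ops
```

Both functions compute identical values; only the amount of bytecode differs.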

Loop Hoisting

You also do a bit of recomputation during your iterations of the loop. For instance, every iteration you compute len(data), sometimes more than once, even though it never changes. So compute it before the loop:

for i in range(50000):
    ri = np.random.randint(len(data))

becomes

datasz = len(data)
for i in range(50000):
    ri = np.random.randint(datasz)

Also, that np.random.randint costs you three instructions just to look up (np, then .random, then .randint), before even thinking about the call. If you were being really performance sensitive, you'd move it out of the loop too:

datasz = len(data)
randint = np.random.randint
for i in range(50000):
    ri = randint(datasz)
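A rough way to measure the effect of hoisting, using timeit and the stdlib random.randrange as a stand-in for np.random.randint (the speedup is real but modest, since individual attribute lookups are cheap):

```python
import random
import timeit

data = list(range(1000))

def unhoisted(n=100_000):
    # len(data) and the random.randrange attribute lookup happen
    # on every iteration.
    for _ in range(n):
        ri = random.randrange(len(data))

def hoisted(n=100_000):
    # Both are resolved once, before the loop.
    datasz = len(data)
    randrange = random.randrange
    for _ in range(n):
        ri = randrange(datasz)

print("unhoisted:", timeit.timeit(unhoisted, number=5))
print("hoisted:  ", timeit.timeit(hoisted, number=5))
```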

Strength Reduction

Sometimes, some operations are faster than others. For instance:

b = b - learning_rate*dcost_db

compiles to a BINARY_SUBTRACT instruction, but

b -= learning_rate*dcost_db

uses an INPLACE_SUBTRACT instruction. It's likely the in-place instruction is a bit faster, so something to consider (but you'd need to test out that theory).
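A small sketch for testing that theory: for plain floats both versions compute the same value, so the only difference is the bytecode (note that the opcode names vary; Python 3.11+ folds both forms into a single BINARY_OP opcode with different operands):

```python
import dis

def binary_sub(b, x):
    b = b - x   # BINARY_SUBTRACT on older Pythons
    return b

def inplace_sub(b, x):
    b -= x      # INPLACE_SUBTRACT on older Pythons
    return b

# Print the opcode names so you can compare them on your interpreter.
for f in (binary_sub, inplace_sub):
    print(f.__name__, [i.opname for i in dis.get_instructions(f)])
```

For immutable numbers the result is identical either way; the in-place form matters much more for mutable types such as lists and NumPy arrays.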

Plot Every Iteration

As stated in the comments (nicely spotted @HFBrowning), you are re-plotting the entire costs list on every single iteration, which is going to absolutely cripple performance! Move plt.plot(costs) out of the loop and call it once after training finishes.
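One way to keep the plot manageable is to collect all the costs during training, thin the series, and hand it to matplotlib exactly once afterwards; a sketch (the costs list here is a stand-in for the real training costs, and the matplotlib calls are left commented out):

```python
# Stand-in for the per-iteration costs collected during training.
costs = [1.0 / (i + 1) for i in range(50000)]

# Thin the series to every 100th point before plotting.
thinned = costs[::100]
print(len(thinned))  # 500 points instead of 50,000

# Then, once, outside the training loop:
# import matplotlib.pyplot as plt
# plt.plot(thinned)
# plt.show()
```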

List append

Since you know the final size of the list (50,000 per-sample costs plus 500 averages, so 50,500), you can allocate it at that size up front with something like costs = [0] * 50500; this avoids the repeated reallocations that happen as an appended list grows. You'd stop using append and start assigning to an index. Bear in mind, though, that this will cause weirdness in your plot unless you only plot once after the loop has finished!
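A minimal sketch of that preallocation pattern (the cost values written here are stand-ins):

```python
# The final size is known up front: 50,000 per-sample costs plus
# 500 running averages (one every 100 iterations) = 50,500 entries.
n = 50_500
costs = [0.0] * n  # allocate once

# Assign by index instead of appending:
for i in range(n):
    costs[i] = float(i)  # stand-in for the real cost value

print(len(costs))
```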