What can I do to significantly speed up the execution of the following code, in particular for even higher values of N? I think the main bottleneck is the sum in the get_field function.
from numpy import *
from numpy.linalg import *
from scipy import constants as co
import pylab as plt

eps0 = co.epsilon_0

charges = []

class charge:
    def __init__(self, q, pos):
        self.q = q
        self.pos = pos

a = 10
N = 1001
d = 1e-3
Q = 1e-8
q = Q/N**2

for x in linspace(0, a, N):
    for y in linspace(0, a, N):
        ## First plate
        position = array([x, y, 0])
        NewCharge = charge(q, position)
        charges.append(NewCharge)
        ## Second plate
        position2 = array([x, y, d])
        NewCharge2 = charge(-q, position2)
        charges.append(NewCharge2)

def get_field(position):
    E = array([0., 0., 0.])
    for C in charges:
        q = C.q
        rVector = position - C.pos
        r = norm(rVector)
        r0Vector = rVector/r
        E += 1/(4*pi*eps0)*q/r**2 * r0Vector
    #print(E)
    return E[2]

p = 0.1
x0 = 0.5*a
y0 = 0.5*a
#print(get_field([x0,y0,1]))

Z = linspace(p*d, (1-p)*d, 100)
E = [get_field([x0, y0, z]) for z in Z]

print(Z)
print(E)

plt.plot(Z, E, 'x')
plt.ylim(bottom=0)
plt.show()
The background is a physics problem: I want to find the electric field of a parallel plate capacitor by explicitly summing up the Coulomb fields of point charges distributed evenly over two parallel plates.
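For reference, the sum that get_field evaluates is the z component of the superposed Coulomb fields (with r_i and q_i the position and value of the i-th point charge):

E_z(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0} \sum_i q_i \, \frac{z - z_i}{\lVert \mathbf{r} - \mathbf{r}_i \rVert^{3}}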
Update: @Blckknght's answer improves the performance of my code significantly. For example, for N = 100 my original code takes about 90 seconds on my system and his code 0.6 seconds; for N = 500 my code takes about 2300 seconds and Blckknght's code about 3 seconds.
The problem with this improved code is that its memory consumption is so high that in my case the program just gets killed when my 16 GB of memory are exceeded.
So is there any way to get those optimizations without running into the memory problem at all?
Programming fast numerical code with numpy is quite different than writing normal Python code. You often need to discard some best practices from normal code to get good numerical performance.

In this context, I think the thing to discard is using object oriented programming with your Charge class. Using a single numpy array to contain all the coordinates of all the charges is going to be much more efficient than having a separate array for each point, as the library can use fancy parallel CPU instructions to do all the computation in one go, rather than charge by charge in a slow pure-Python loop.

Here's my crude solution, which builds up a few different arrays: one with the coordinates of the charges on the two plates, one for the charge values (q and -q), and then one for the points that vary over the Z axis where you want to sample the field. I use various numpy broadcasting techniques to combine them.
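Blckknght's actual code is not reproduced in this excerpt, so the following is only a sketch of a vectorized version along the lines described above; the array names (charge_pos, charge_q, test) are mine, and the constants come from the question. With N = 1001 the big intermediate array needs roughly 5 GB, so lower N if you want to try it on a small machine.

import numpy as np
from scipy import constants as co
import pylab as plt

eps0 = co.epsilon_0

a = 10
N = 1001
d = 1e-3
Q = 1e-8
q = Q / N**2

# Charge coordinates on both plates, shape (2*N*N, 3).
# The imaginary "step" N*1j tells np.mgrid to produce N points with the
# endpoint included, like linspace.
x, y = np.mgrid[0:a:N*1j, 0:a:N*1j]
plate = np.column_stack([x.ravel(), y.ravel(), np.zeros(N*N)])
charge_pos = np.concatenate([plate, plate + [0.0, 0.0, d]])      # second plate at z = d
charge_q = np.concatenate([np.full(N*N, q), np.full(N*N, -q)])   # +q and -q

# Test points along the z axis through the centre of the plates.
p = 0.1
z = np.linspace(p*d, (1 - p)*d, 100)
test = np.column_stack([np.full(100, 0.5*a), np.full(100, 0.5*a), z])

# Broadcast (100, 1, 3) against (2*N*N, 3) -> (100, 2*N*N, 3).
# With N = 1001 this intermediate array is about 5 GB.
r_vec = test[:, np.newaxis, :] - charge_pos
r = np.linalg.norm(r_vec, axis=-1)
# q/r**2 times the z component of the unit vector folds into one division by r**3.
Ez = (charge_q * r_vec[..., 2] / r**3).sum(axis=-1) / (4*np.pi*eps0)

plt.plot(z, Ez, 'x')
plt.ylim(bottom=0)
plt.show()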
The funny indexing of np.mgrid with imaginary step values is based on this answer, which I found by googling. You can also use np.meshgrid, it just takes some extra steps. The rest is pretty standard numpy logic; you just need to reshape or slice to add dimensions a few times to get the arrays to broadcast together properly in the final step.

The biggest downside of using big numpy arrays like this is memory usage. Since we're taking the Cartesian product of the 100 test coordinates with the 2-million-ish charge coordinates, we wind up with several gigabytes of memory being used all at once (about 5 GB for the largest array on my system). That could get slow if you run out of real RAM and your computer starts needing to swap memory out to disk. It might be possible to save a little memory here or there by deleting references to arrays we don't need any more (or by not keeping persistent references to temporary intermediate values at all). I avoided making a separate variable for the unit vectors by just combining the division by the magnitude with the existing magnitude squared that the charge was being divided by.
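If the memory usage is the blocker (as in the update to the question), one option not shown in the answer is to keep the vectorized arithmetic but process the charges in chunks, so that only one moderately sized intermediate array exists at a time. A rough sketch, reusing charge_pos, charge_q, test and eps0 from the version above (the chunk size and the helper name field_z_chunked are arbitrary choices of mine):

def field_z_chunked(test_points, charge_pos, charge_q, chunk=200_000):
    # Accumulate the z component of the field, summing over slices of the
    # charge arrays so the (n_test, chunk, 3) intermediate stays small
    # (about 0.5 GB for 100 test points and chunk = 200_000).
    Ez = np.zeros(len(test_points))
    for start in range(0, len(charge_pos), chunk):
        pos = charge_pos[start:start + chunk]
        qs = charge_q[start:start + chunk]
        r_vec = test_points[:, np.newaxis, :] - pos
        r = np.linalg.norm(r_vec, axis=-1)
        Ez += (qs * r_vec[..., 2] / r**3).sum(axis=-1)
    return Ez / (4*np.pi*eps0)

Ez = field_z_chunked(test, charge_pos, charge_q)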