I'm evaluating dask by porting a method from thunder (which runs on Spark). I have an equivalent numpy version, but I'm not sure how to write it using dask/distributed.
In thunder, I can take a stack of images, convert it to a series, and correlate against some signal:
imgs = thunder.images.fromrandom((10, 900, 900))
series = imgs.toseries()
signal = series[5, 5, :]
correlated = series.correlate(signal)
The numpy version looks like this:
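import numpy

series = numpy.random.rand(900, 900, 10)
signal = series[5, 5, :]
# One row per pixel, one column per time point
reshaped = series.reshape(900 * 900, 10)
# Pearson correlation of each pixel's time series with the signal
correlated = numpy.asarray(
    [numpy.corrcoef(x, signal)[0, 1] for x in reshaped]
)
final = correlated.reshape(900, 900)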
I'm looking for some tips on how to convert this into something for distributed in particular.
Perhaps something like the following?
If you wanted to correlate your images against each other
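A minimal sketch of that case, assuming the stack is 10 images of 900x900 pixels held in a dask array with one image per chunk. The random data and the chunk layout are stand-ins I picked for illustration, not anything taken from thunder. It flattens each image into a row and uses `da.corrcoef`, which treats each row as one variable:

```python
import dask.array as da

# Stand-in data (an assumption): 10 random 900x900 images, one image per chunk.
imgs = da.random.random((10, 900, 900), chunks=(1, 900, 900))

# Flatten every image into a single row of pixels.
flat = imgs.reshape((10, 900 * 900))

# Lazy 10x10 matrix of Pearson correlations between the images.
corr = da.corrcoef(flat)

# Nothing runs until compute() is called; the result is a numpy array.
result = corr.compute()
```

If a `dask.distributed.Client` has been created in the same process, `compute()` should pick it up as the scheduler, so the same code can run on a cluster.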
Or against some other signal
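A rough sketch of the per-pixel version, mirroring the numpy code in the question. It assumes the data is laid out as (y, x, time) and that the whole time axis lives in one chunk. The chunk sizes are my own guess, and the Pearson coefficient is written out by hand rather than calling `corrcoef` once per pixel:

```python
import numpy as np
import dask.array as da

# Stand-in data shaped like the question's example: (y, x, time).
data = np.random.rand(900, 900, 10)

# Chunk over the two spatial axes; keep the time axis whole (assumed layout).
series = da.from_array(data, chunks=(300, 300, 10))
signal = series[5, 5, :]

# Center each pixel's time series and the signal around their means.
s = series - series.mean(axis=-1, keepdims=True)
g = signal - signal.mean()

# Pearson r per pixel: covariance over the product of the standard deviations.
num = (s * g).sum(axis=-1)
den = da.sqrt((s ** 2).sum(axis=-1) * (g ** 2).sum())
correlated = num / den

# Runs the graph and returns a (900, 900) numpy array.
final = correlated.compute()
```

Because every step is an elementwise operation or a reduction along the time axis, each spatial chunk can be processed independently, which is what lets the work spread across workers.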
However, I'm not very familiar with your application, so the response above may be flawed.