As a precomputation for an integral function, I need to perform some computation on a large matrix.
```cpp
for (size_t x = 1; x < size().x(); ++x)          // running sum along x
    for (size_t y = 0; y < size().y(); ++y)
        for (size_t z = 0; z < size().z(); ++z)
            field::at(x, y, z) += field::at(x - 1, y, z);

for (size_t x = 0; x < size().x(); ++x)          // running sum along y
    for (size_t y = 1; y < size().y(); ++y)
        for (size_t z = 0; z < size().z(); ++z)
            field::at(x, y, z) += field::at(x, y - 1, z);

for (size_t x = 0; x < size().x(); ++x)          // running sum along z
    for (size_t y = 0; y < size().y(); ++y)
        for (size_t z = 1; z < size().z(); ++z)
            field::at(x, y, z) += field::at(x, y, z - 1);
```
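For context, the three passes together turn the field into a 3D inclusive prefix sum (a summed-volume table): writing orig for the field's contents before the loops,

$$\mathrm{at}(x, y, z) = \sum_{i \le x} \sum_{j \le y} \sum_{k \le z} \mathrm{orig}(i, j, k)$$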
My `field` class inherits from `std::vector<size_t>` (referred to as `container` below), where `at` has been overridden:

```cpp
T& at(size_t x, size_t y, size_t z)
{
    // Row-major layout: x is the fastest-varying index in memory.
    return container::at(x + y * size().x() + z * size().x() * size().y());
}
```
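Given that layout, the innermost z loop in each pass above advances by `size().x() * size().y()` elements per iteration. A quick back-of-the-envelope check of the per-coordinate strides (`nx` and `ny` are illustrative names here; 8-byte `size_t` and the 512³ grid from the timings below are assumed):

```cpp
#include <cstddef>
#include <cstdio>

int main()
{
    const std::size_t nx = 512, ny = 512;         // grid extents, 512^3 case
    const std::size_t elem = sizeof(std::size_t); // 8 bytes on a typical 64-bit target

    // index(x, y, z) = x + y*nx + z*nx*ny, so stepping each coordinate
    // by one moves the following distance in memory:
    std::printf("step in x: %zu bytes (contiguous)\n", elem);  // 8 B
    std::printf("step in y: %zu bytes\n", nx * elem);          // 4 KiB
    std::printf("step in z: %zu bytes\n", nx * ny * elem);     // 2 MiB
}
```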
Here are some execution times on my machine:
- (128x128x128) ~ 250 ms
- (256x256x256) ~ 3 s
- (512x512x512) ~ 53 s
That looks very slow to me, and the scaling is worse than linear: each step multiplies the element count by 8, yet going from 256³ to 512³ multiplies the time by roughly 17.
Question
- Is allocating a `std::vector` of size 512x512x512 (1 GiB) a bad idea? Should I divide it into multiple (512) sub-vectors of size 512x512 (2 MiB each)?
- Is there any other way to do the same simple computation that would be more cache efficient? (I'm guessing cache misses are a reason why this is so slow; see the sketch below.)