I have been trying to build a better awareness of cache locality, so I wrote the two code snippets below to understand the cache-locality characteristics of each.
vector<int> v1(1000); // filled with some random numbers
vector<int> v2(1000); // filled with some random numbers

for (int i = 0; i < v1.size(); ++i)
{
    auto min = INT_MAX;
    for (int j = 0; j < v2.size(); ++j)
    {
        if (v2[j] < min)
        {
            min = v2[j]; // track the smallest value seen so far
            v1[i] = j;   // record its index directly in v1
        }
    }
}
vs
vector<int> v1(1000); // filled with some random numbers
vector<int> v2(1000); // filled with some random numbers

for (int i = 0; i < v1.size(); ++i)
{
    auto min = INT_MAX;
    auto loc = -1;
    for (int j = 0; j < v2.size(); ++j)
    {
        if (v2[j] < min)
        {
            min = v2[j]; // track the smallest value seen so far
            loc = j;     // remember its index in a local variable
        }
    }
    v1[i] = loc; // single write to v1 after the inner loop
}
In the first snippet, v1[i] is updated directly inside the if statement. Does this cause cache-locality issues, because each such update pulls into cache the line holding a contiguous segment of v1, roughly v1[i] through v1[i + cache_line_size/sizeof(int) - 1], potentially evicting a line of v2? If this analysis is correct, it would seem to slow down the loop over j, since each iteration of the j loop would then be likely to take cache misses. The second implementation seems to alleviate this issue by accumulating the result in a local variable and not writing to v1[i] until the j loop is complete.
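As a sanity check on the sizes involved, here is a small sketch of the arithmetic (the 64-byte line size and 32 KB L1d capacity are typical values I am assuming, not guarantees). Note also that the store target v1[i] is the same address for the whole inner loop, so it occupies a single line across all 1000 iterations of j.

#include <cstddef>
#include <cstdio>
#include <vector>

int main()
{
    constexpr std::size_t kCacheLine = 64;        // typical x86 line size (assumption)
    constexpr std::size_t kL1Data    = 32 * 1024; // typical L1d capacity (assumption)

    std::vector<int> v(1000);
    std::size_t bytes = v.size() * sizeof(int);                // 4000 bytes
    std::size_t lines = (bytes + kCacheLine - 1) / kCacheLine; // 63 cache lines

    std::printf("one vector: %zu bytes, %zu cache lines\n", bytes, lines);
    std::printf("both vectors fit in a typical L1d: %s\n",
                2 * bytes <= kL1Data ? "yes" : "no");
    return 0;
}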
In practice, I suspect the compiler might optimize any such cache-locality issue away, for example by keeping the minimum and its index in registers until the inner loop finishes. For discussion, let's assume no compiler optimizations.
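For anyone who wants to measure this, here is a minimal, self-contained timing sketch of both variants; the std::mt19937 fill and the rough std::chrono timing are just scaffolding I added for illustration, and building with g++ -O0 matches the no-optimizations assumption.

#include <chrono>
#include <climits>
#include <cstddef>
#include <cstdio>
#include <random>
#include <vector>

int main()
{
    std::mt19937 rng(42);
    std::uniform_int_distribution<int> dist(0, 1'000'000);

    std::vector<int> v1(1000), v2(1000);
    for (auto& x : v2) x = dist(rng);

    // Variant 1: write v1[i] directly inside the inner if.
    auto t0 = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < v1.size(); ++i) {
        int min = INT_MAX;
        for (std::size_t j = 0; j < v2.size(); ++j) {
            if (v2[j] < min) { min = v2[j]; v1[i] = static_cast<int>(j); }
        }
    }
    auto t1 = std::chrono::steady_clock::now();

    // Variant 2: accumulate in a local, store once after the inner loop.
    for (std::size_t i = 0; i < v1.size(); ++i) {
        int min = INT_MAX;
        int loc = -1;
        for (std::size_t j = 0; j < v2.size(); ++j) {
            if (v2[j] < min) { min = v2[j]; loc = static_cast<int>(j); }
        }
        v1[i] = loc;
    }
    auto t2 = std::chrono::steady_clock::now();

    using ms = std::chrono::duration<double, std::milli>;
    std::printf("direct store:  %.3f ms\n", ms(t1 - t0).count());
    std::printf("local + store: %.3f ms\n", ms(t2 - t1).count());
    return 0;
}

With -O0 the stores happen exactly as written, so any difference between the two numbers should be attributable to where the store is placed.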