I'm implementing the Floyd-Warshal algorithm, as can be found here.
I'm not just interested in the shortest distance between nodes, but also in the path, corresponding with that distance.
In order to do this, I have modified the algorithm as follows:
double[,] dist = new double[V, V]; // existing line
string[,] connections = new string[V, V]; // new line, needed for remembering the path
...
for (i = 0; i < V; i++){
for (j = 0; j < V; j++){
dist[i, j] = graph[i, j];
connections[i, j] = $"({i},{j})";}} // added: initialisation of "connections"
...
if (dist[i, k] + dist[k, j] < dist[i, j])
{
dist[i, j] = dist[i, k] + dist[k, j];
connections[i, j] = connections[i, k] + "-" + connections[k, j]; // Added for remembering shortest path
}
I'm running this algorithm with a snake-like list of locations of one million, all of them simply being added one after the other.
As a result, my connections array looks as follows:
[0, 0] "(0,0)"
[0, 1] "(0,1)"
[0, 2] "(0,1)-(1,2)"
[0, 3] "(0,1)-(1,2)-(2,3)"
[0, 4] "(0,1)-(1,2)-(2,3)-(3,4)"
[0, 5] "(0,1)-(1,2)-(2,3)-(3,4)-(4,5)"
...
[0, 787] "(0,1)-(1,2)-...(786,787)" // length of +7000 characters
...
... at the moment of my OutOfMemoryException (what a surprise) :-)
I would like to avoid that OutOfMemoryException, and start thinking of different techniques:
- Forcing garbage collection once a variable is not needed anymore (in case this is not yet done)
- "Swapping" very large objects between memory and hard disk, in order to get more memory access.
I believe the second option being the most realistic (don't kill me if I'm wrong) :-) Is there a technique in C# which makes that possible?
Oh, if you react like "You're an idiot! There's a far better way to keep the shortest paths in Floyd-Warshal!", don't refrain from telling me how :-)
Edit: taking into account the multiple comments, for which I'm very grateful:
In the meantime, I've replaced my strings with Lists of Lists of Points, and this seems to be working fine:
Instead of:
string[,] l_connections;
I have:
List<List<List<Point>>> l_connections;
The speed has doubled, and when working with huge collections of dictionaries (+1000 entries of ...), I get a System.OutOfMemoryException only at ±800 entries instead of ±650.
That's already a huge improvement, but does anybody know to get even better?
Edit: information about garbage collector and its settings:
There is following GC and GCSettings information:
System.GC.MaxGeneration:[2]
IsServerGC:[False]
LargeObjectHeapCompactionMode:[Default]
LatencyMode:[Interactive]
I have altered the LargeObjectHeapCompactionMode to CompactOnce but this brought the performance back to similar results as when I was working with large strings instead of Lists.
Edit: how to work with List collection:
Hereby the code, when working with List collections:
public void floydWarshall(Dictionary<(int x, int y), double> dictionary, out double[,] dist, out List<List<List<Point>>> connections)
{
int dictionary_size = (int)Math.Ceiling(Math.Sqrt(dictionary.Count));
dist = new double[dictionary_size, dictionary_size];
...
for (k = 0; k < dictionary_size; k++)
{
// Pick all vertices as source one by one
for (i = 0; i < dictionary_size; i++)
{
// Pick all vertices as destination for the above picked source
for (j = 0; j < dictionary_size; j++)
{
// If vertex k is on the shortest path from i to j, then update dist[i][j]
if (dist[i, k] + dist[k, j] < dist[i, j])
{
dist[i, j] = dist[i, k] + dist[k, j];
connections[i][j] = new List<Point>();
connections[i][j].AddRange(connections[i][k]);
connections[i][j].AddRange(connections[k][j]);
}
}
}
}
Thanks in advance
There is indeed.
Every shortest path from
mtonis either a direct stepm-nor a path throughk,m-...-k-...-n, wherem-...-kis the shortest path frommtok, already stored at array index[m,k]andk-...-nis the shortest path fromkton, already stored at array index[k,n].So all you need to store at
[m,n]is the value ofk, which is any single interior vertex along the shortest path. (Thekvariable in the question code is a valid choice)Storage requirement:
O(V^2), down fromO(V^3).A suitable C# data structure would be
int[,], wherepath[m,n] == mmeans a direct linkpath[m,n] == nmeans "I don't know yet" (storingnullinint?[,]is also a reasonable approach, but wastes some memory)path[m,n] == k && k != m && k != nmeans the shortest path passes throughkand the path can be reconstructed by recursing both[m,k]and[k,n]