How prefix sum logic works in graph minimum distance calculation?

160 views Asked by At

Given an integer n that represents the number of vertices on a graph, and each vertex connect to next and previous vertex exactly with single edge, except for the start vertex and end vertex, we can calculate the minimum distance pair of vertices with the following code. Assuming this is unweighted graph, or can be assumed also weighted graph will all edge weight is 1.

function countMinDistance (n) {
  const result = new Array(n).fill(0)

  for (let i = 0; i < n - 1; i++) {
    result[0] += 2
    result[n - i - 1] -= 2
  }

  for (let i = 0; i + 1 < n; i++) {
    // calculate distance to other nodes with prefix sum of the previous nodes
    result[i + 1] += result[i]
  }

  return result
};

Example for following graph:

(1) --- (2) --- (3) --- (4)

The result of min distance would be an array [6, 4, 2, 0]. Where 6 is total of vertices pairs with min distance of 1, 4 is total vertices pairs with min distance of 2, 2 is total vertices pairs with min distance of 3, 0 is total vertices pairs with min distance of 4.

For each minimum distance k, the pairs are:

  • For k == 1, the pairs are (1, 2), (2, 1), (2, 3), (3, 2), (3, 4), and (4, 3).
  • For k == 2, the pairs are (1, 3), (3, 1), (2, 4), and (4, 2).
  • For k == 3, the pairs are (1, 4), and (4, 1).
  • For k == 4, there are no pairs.

I understand that on traversing vertices we need to increase by 2, which most likely the distance between current node (vertex) with the previous one multiplied by 2. I also have some understanding that on traversing we need to update distance between current node to all other nodes on the graph. Somehow I didn't get the efficient way yet to do this in a linear for loop (O(n)) way, until I see the above code. I can do it easily using two nested for loops (or even three nested for loops, with Floyd-Warshall algorithm) though. But because of the execution time constraint, I can't use it.

The problem is I don't understand these code parts result[n - i - 1] -= 2 and result[i + 1] += result[i]. Can someone explain how the above code can work for the min distance calculation, especially those parts that I am not sure about? How come when traversing vertices we calculate with decreasing by 2 (what is logical explanation to that, that can be easily reason about?), then doing the prefix sum magic on the second loop, and finally get correct result?

5

There are 5 answers

0
Romeo Razakandriarimanana On

Initialization: Correctly initialize the counts for distances. For a chain of n vertices, the number of pairs at each distance k (from 1 to n-1) follows a specific pattern that needs to be calculated directly, considering the graph's linear structure.

Accumulation: Use the correct logic to accumulate these distances. For a linear graph, the pattern is simpler and does not typically require decrementing counts as originally coded.

Given the misunderstanding in the original code and explanation, a corrected approach focusing on directly calculating distances based on the graph's linear structure would be more straightforward and accurate. Let's implement a corrected version based on understanding the problem correctly.

The corrected approach calculates the minimum distances between pairs of vertices accurately for a linear graph structure. For the example graph with n = 4 vertices:

The resulting array is [6, 4, 2, 0], which matches the expected outcome. This means:

There are 6 pairs of vertices at a minimum distance of 1 (each adjacent pair, counted in both directions). There are 4 pairs of vertices at a minimum distance of 2. There are 2 pairs of vertices at a minimum distance of 3. There are 0 pairs of vertices at a minimum distance of 4, since the maximum distance in a graph with 4 vertices is 3.

0
emilianoc On

that code simply is not correct so you cannot understand that.

let me explain:

function countMinDistance (n) {
  //here result is initialized as an array of n elements all setted at 0
  //====================================================================
  const result = new Array(n).fill(0);
  //====================================================================

  for (let i = 0; i < n - 1; i++) {
    result[0] += 2
    
    /*
    ===================================================================
    ===================================================================
    ===================================================================
    here on first iteration when i=0 you have result[n+1]-=2
    this is so wrong because n+1 is out of range 
    now the worst part:
      javascript (does not rise an error because it is one of many ways for extend an array ) create 2 other element at the end of your array (2 because even result[n] is out of range)
    and the worst worst part:
      the 2 elements created are not yet initialized so  the "-=2" will  set those element to NaN (Not a Number)
    ===================================================================
    ===================================================================
    ===================================================================
    */
    result[n - i + 1] -= 2;
  }

  for (let i = 0; i + 1 < n; i++) {
    // calculate distance to other nodes with prefix sum of the previous nodes
    result[i + 1] += result[i]
  }

  return result
};
0
Mr Patience On

Firstly, there's a small mistake in your code:
result[n - i + 1] -= 2 should be result[n - i - 1] -= 2, otherwise you have index out of bounds at i = 0 => n - 0 + 1 = n + 1.

I assume that in each step we increment / decrement by 2 because the graph is undirected - we count the path in each direction (from a to b, then from b to a).

Now, let's take your example:

(1) --- (2) --- (3) --- (4)

After the first loop, the array will be [6, -2, -2, -2].
We set the 0 index to (n-1) * 2 and set the others to -2.
The 6 at index 0 is the back-and-forth distance from 1 to 4.

If we have this distance, then if we traverse by a single node, on each step we will skip a single edge, a single back-and-forth, so a path of length 2. This is what -2 is trying to account for.

When you traverse the graph starting from 0 you skip a path of length 0 so you should decrement your path length by 2.

The code is a bit overcomplicated though, it could be simplified to:

function countMinDistance(n) {
  const result = new Array(n).fill(0); 

  for (let i = 1; i < n; i++) {
    // back-and-forth distance from `i - 1` to `n`
    // `n-1` accounts for the edges number = vertices - 1
    const dist = (n - i) * 2;
    result[i - 1] = dist;
  }

  return result;
}
0
Ahmed Amin Shahin On

For the provided example graph, the resulting result array is [6, 4, 2, 0], which corresponds to the total number of minimum distance pairs for distances 1, 2, 3, and 4.

For distance 1, there are 6 pairs: (1, 2), (2, 1), (2, 3), (3, 2), (3, 4), and (4, 3). For distance 2, there are 4 pairs: (1, 3), (3, 1), (2, 4), and (4, 2). For distance 3, there are 2 pairs: (1, 4), and (4, 1). For distance 4, there are 0 pairs.

In short, the code efficiently calculates the minimum distance pairs of vertices in a graph by cleverly updating the result array and then using prefix sum to obtain the actual minimum distances. This approach avoids nested loops and achieves the desired result in linear time complexity (O(n)).

0
rockerbacon On

This looks like a simple case of dynamic programming.

To start off, let's first define a method for calculating the distances between two vertices. Because of our specific graph structure, we can ignore traditional graph algorithms and calculate the distance in O(1) time:

function distance(v1, v2) {
    return Math.abs(v1-v2)
}

Now let's take a look at a naive O(n^2) implementation:

function countMinDistance(n) {
    // result counts the number of pairs with a specific distance
    // eg: result[k] is the count of vertice pairs with distance k
    const result = new Array(n).fill(0)

    for (let v1 = 0; v1 < n-1; v1++) {
        for (let v2 = v1+1; v2 < n; v2++) {
            let k = distance(v1, v2)
            // count pairs (v1, v2) and (v2, v1)
            // k-1 because javascript indexing starts at 0
            // result[0] is the result for k=1
            result[k-1] += 2
        }
    }

    return result
}

This algorithm works well and, with a few adjustments to the allocation of the result array, can even work for more general graphs. It is, however, inefficient for our particular graph structure.

Why is that? Because there are intersections in the paths between vertices. If we want to traverse (1, 4) we know the shortest route is traversing (1, 3) and then traversing (3, 4). Because of that, we can assume following statement:

If the number of vertex pairs with k minimum distance is fn(k), then the number of vertex pairs with k+1 minimum distance is fn(k)-2. fn(k) is 0 for any k greater than or equal the number of vertices n in the graph

fn(n) = 0
fn(k) - 2 = fn(k+1) => fn(k) = fn(k+1) + 2

One can probably use induction to prove the statement above, but I will refrain from doing that as this would probably belong somewhere in Math Exchange.

Let's then start exploiting this property. First I'll write a recursive function:

function countPairs(k, n) {
    if (k == n) {
        return 0
    } else {
        return countPairs(k+1, n) + 2
    }
}

function countMinDistance(n) {
    let result = new Array(n).fill(0)

    for (let k = 1; k < n; k++) {
        // k-1 because javascript indexing starts at 0
        // result[0] is the result for k=1
        result[k-1] = countPairs(k, n)
    }

    return result
}

This function is still no better than our initial naive algorithm, but we can employ dynamic programming to improve it. When running countPairs in the main loop, we end up re-calculating the same thing multiple times, eg:

  1. countPairs(1, 4) calculates: countPairs(2, 4) + 2; countPairs(3, 4) + 2; countPairs(4, 4) + 2;
  2. countPairs(2, 4) calculates: countPairs(3, 4) + 2; countPairs(4, 4) + 2;

If we were to store previous results of countPairs somewhere in memory, then we wouldn't need to repeat calculations:

function countPairs(k, n, cache) {
    if (cache[k] !== undefined) {
        return cache[k]
    } else if (k == n) {
        cache[k] = 0
        return cache[k]
    } else {
        cache[k] = countPairs(k+1, n, cache) + 2
        return cache[k]
    }
}

function countMinDistance(n) {
    let result = new Array(n).fill(0)

    let cache = []
    for (let k = 1; k < n; k++) {
        // k-1 because javascript indexing starts at 0
        // but result[0] is the result for k=1
        result[k-1] = countPairs(k, n, cache)
    }

    return result
}

This is the algorithm from your post, written in a more self-explanatory manner. The algorithm you posted just does some clever things to use a little less code and be a little more machine efficient:

  1. The result array is repurposed as a cache
  2. The algorithm is iterative instead of recursive
  3. A cache initialization is done before the main loop, to serve as a sort of replacement for the recursion base (pre-calculates countPairs(1, n)) and to eliminate the need for a "if is already in cache" statement