Algorithm - Balancing a disconnected bipartite graph

378 views Asked by At

I have a bipartite graph G. I would like to find a fast algorithm (asymptotically) to find an assignment of all the vertices of G into two sets X and Y such that the complete bipartite graph formed by the sets X and Y has as many edges as possible.

Slightly longer explanation:

G is bipartite and consists of a set connected components (each of which are bipartite, obviously). We want to decide on a positioning (for lack of a better word) of each component into X and Y. After deciding upon all the positionings, we complete the bipartite graph (i.e. we connect every vertex of X to every vertex of Y). We then count out how many edges are there totally (including original edges) and we want to maximize this count. Simple math shows that the number of edges would be |X|*|Y|.

My thought process for a solution:

As the number of components increase, the number of choices for G increases exponentially. However, if we take number of connected components of G to be equal to number of nodes in G, then the solution is simple - split so that number of nodes in X and Y are equal (or almost equal in case of odd number of nodes in G). This makes me want to generalize that the problem is the same thing as trying to minimize the difference in cardinalities of X and Y (which can be solved as in this SO question). However, I have been unsuccessful in proving so.

1

There are 1 answers

0
amit On BEST ANSWER

Let's decompose the problem.

  1. Your graph is actually a set of connected components, each connected component is (U_i,V_i,E_i).
  2. To maximize the number of edges, you need to maximize the value of |X|*|Y|
  3. To get the maximal value of |X|*|Y|, you obviously need to use all vertices (otherwise, by adding another vertex, you get a bigger value).
  4. Your freedom of choice is actually to choose for each component i - if you should add U_i to X, and V_i to Y - or vise versa.

So, what you are actually trying to do is:

maximize:
sum { x_i * |V_i| : for each component i} * sum { y_i * |U_i| : for each component i}
subject to constraints:
x_i, y_i in {0,1} for all i
x_i + y_i = 1     for all i

The value we want to maximize behaves similar to the function f(x) = x(k-x), because if we increase |X|, it comes at the expanse of decreasing |Y|, and by the same amount. This function has a single maximum:

f(x) = xk - x^2
f'(x) = k - 2x = 0  ---> x = k/2

Meaning, we should distribute the nodes such that the cardinality (size) of X and Y are closest as possible to each other (and use all the vertices).

This can be reduced to Partition Problem:

Given U_1,V_1,U_2,V_2,...,U_k,V_k
Create an instance of partition problem:
abs(|U_1| - |V_1|), abs(|U_2| - |V_2|), ... , abs(|U_k| - |V_k|)

Now, the optimal solution found to partition problem can be translated directly to which of U_i,V_i to include in which set, and will make sure the difference in sizes is kept to minimum.