I have a bipartite graph G
. I would like to find a fast algorithm (asymptotically) to find an assignment of all the vertices of G
into two sets X
and Y
such that the complete bipartite graph formed by the sets X
and Y
has as many edges as possible.
Slightly longer explanation:
G
is bipartite and consists of a set connected components (each of which are bipartite, obviously). We want to decide on a positioning (for lack of a better word) of each component into X
and Y
. After deciding upon all the positionings, we complete the bipartite graph (i.e. we connect every vertex of X
to every vertex of Y
). We then count out how many edges are there totally (including original edges) and we want to maximize this count. Simple math shows that the number of edges would be |X|*|Y|
.
My thought process for a solution:
As the number of components increase, the number of choices for G
increases exponentially. However, if we take number of connected components of G
to be equal to number of nodes in G
, then the solution is simple - split so that number of nodes in X
and Y
are equal (or almost equal in case of odd number of nodes in G
). This makes me want to generalize that the problem is the same thing as trying to minimize the difference in cardinalities of X
and Y
(which can be solved as in this SO question). However, I have been unsuccessful in proving so.
Let's decompose the problem.
(U_i,V_i,E_i)
.|X|*|Y|
|X|*|Y|
, you obviously need to use all vertices (otherwise, by adding another vertex, you get a bigger value).i
- if you should addU_i
toX
, andV_i
toY
- or vise versa.So, what you are actually trying to do is:
The value we want to maximize behaves similar to the function
f(x) = x(k-x)
, because if we increase|X|
, it comes at the expanse of decreasing|Y|
, and by the same amount. This function has a single maximum:Meaning, we should distribute the nodes such that the cardinality (size) of
X
andY
are closest as possible to each other (and use all the vertices).This can be reduced to Partition Problem:
Now, the optimal solution found to partition problem can be translated directly to which of U_i,V_i to include in which set, and will make sure the difference in sizes is kept to minimum.