I have n points and have to find the maximum united area between k points (k <= n). So, its the sum of those points area minus the common area between them.
]1
Suppose we have n=4, k=2. As illustrated in the image above, the areas are calculated from each point to the origin and, the final area is the sum of the B area with the D are (only counting the area of their intersection once). No point is dominated
I have implemented a bottom-up dynamic programming algorithm, but it has an error somewhere. Here is the code, that prints out the best result:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct point {
double x, y;
} point;
struct point *point_ptr;
int n, k;
point points_array[1201];
point result_points[1201];
void qsort(void *base, size_t nitems, size_t size,
int (*compar)(const void *, const void *));
int cmpfunc(const void *a, const void *b) {
point *order_a = (point *)a;
point *order_b = (point *)b;
if (order_a->x > order_b->x) {
return 1;
}
return -1;
}
double max(double a, double b) {
if (a > b) {
return a;
}
return b;
}
double getSingleArea(point p) {
return p.x * p.y;
}
double getCommonAreaX(point biggest_x, point new_point) {
double new_x;
new_x = new_point.x - biggest_x.x;
return new_x * new_point.y;
}
double algo() {
double T[k][n], value;
int i, j, d;
for (i = 0; i < n; i++) {
T[0][i] = getSingleArea(points_array[i]);
}
for (j = 0; j < k; j++) {
T[j][0] = getSingleArea(points_array[0]);
}
for (i = 1; i < k; i++) {
for (j = 1; j < n; j++) {
for (d = 0; d < j; d++) {
value = getCommonAreaX(points_array[j - 1], points_array[j]);
T[i][j] = max(T[i - 1][j], value + T[i - 1][d]);
}
}
}
return T[k - 1][n - 1];
}
void read_input() {
int i;
fscanf(stdin, "%d %d\n", &n, &k);
for (i = 0; i < n; i++) {
fscanf(stdin, "%lf %lf\n", &points_array[i].x, &points_array[i].y);
}
}
int main() {
read_input();
qsort(points_array, n, sizeof(point), cmpfunc);
printf("%.12lf\n", algo());
return 0;
}
with the input:
5 3
0.376508963445 0.437693410334
0.948798695015 0.352125307881
0.176318878234 0.493630156084
0.029394902328 0.951299438575
0.235041868262 0.438197791997
where the first number equals n, the second k and the following lines the x and y coordinates of every point respectively, the result should be: 0.381410589193,
whereas mine is 0.366431740966. So I am missing a point?
This is a neat little problem, thanks for posting! In the remainder, I'm going to assume no point is dominated, that is, there are no points
csuch that there exists a pointdwithc.x < d.xandc.y < d.y. If there are, then it is never optimal to usec(why?), so we can safely ignore any dominated points. None of your example points are dominated.Your problem exhibits optimal substructure: once we have decided which item is to be included in the first iteration, we have the same problem again with
k - 1, andn - 1(we remove the selected item from the set of allowed points). Of course the pay-off depends on the set we choose - we do not want to count areas twice.I propose we pre-sort all point by their x-value, in increasing order. This ensures the value of a selection of points can be computed as piece-wise areas. I'll illustrate with an example: suppose we have three points,
(x1, y1), ..., (x3, y3)with values(2, 3), (3, 1), (4, .5). Then the total area covered by these points is(4 - 3) * .5 + (3 - 2) * 1 + (2 - 0) * 3. I hope it makes sense in a graph:By our assumption that there are no dominated points, we will always have such a weakly decreasing figure. Thus, pre-sorting solves the entire problem of "counting areas twice"!
Let us turn this into a dynamic programming algorithm. Consider a set of
npoints, labelled{p_1, p_2, ..., p_n}. Letd[k][m]be the maximum area of a subset of sizek + 1where the(k + 1)-th point in the subset is pointp_m. Clearly,mcannot be chosen as the(k + 1)-th point ifm < k + 1, since then we would have a subset of size less thank + 1, which is never optimal. We have the following recursion,The initial cases where
k = 1are the rectangular areas of each point. The initial cases together with the updating equation suffice to solve the problem. I estimate the following code asO(n^2 * k). The term squared inncan probably be lowered as well, as we have an ordered collection and might be able to apply a binary search to find the best subset inlog ntime, reducingn^2ton log n. I leave this to you.In the code, I have re-used my notation above where possible. It is a bit terse, but hopefully clear with the explanation given.
Using the example data you've provided, I find an optimal result of
0.381411, as desired.