A very complex combinations task that has been bugging me for 7 months

87 views Asked by At

Some seven months ago I went to a job interview at a very big company. They gave me this task to solve, and for the past 7 months I can't seem to be able to find a solution.


Here is the task: Some database has A entries. How many combinations (without repetition) with B amount (B < A) elements made out of A are there, that for any given B (contained in those A) different elements always contain at least X% of C entries (C < B) out of B given (C/B)? Include pattern for obtaining all of them. In short, we need:

  1. Formula for calculation of how many combinations satisfy the above conditions
  2. Formula for listing all of them. (any programming language or just detailed descriptive and any format)

Note: Both are mandatory, as those need to be set in a separate table in db.


After 2 hours of being totally clueless I was given a simplified version: Some database has 50 entries. How many combinations (without repetition) with 9 elements made out of those 50 are there, that for any given 9 different elements (contained in those 50) always contain at least 15% of 6 entries out of given 9 (6/9)? Include pattern for obtaining all of them. In short, we need:

  1. Formula for calculation of how many combinations satisfy the above conditions
  2. Formula for listing all of them. (any programming language or just detailed descriptive and any format)

Note: Both are mandatory, as those need to be set in a separate table in db.


Edit: To explain further. Let us say the result of (1.) is D possible subsets (combinations without repetition) with 9 elements from A. And some user of the database (or software using it) enters random 9 elements (from |A| = 50 set). This always needs to result in, that at least 15% of those D subsets has 6 out of 9 that user entered. It doesn't matter how many of those D has 1/9, 2/9, 3/9, 4/9, 5/9, 7/9, 8/9 and 9/9, the only thing that matters is that 15% and above have 6/9, for any 9/50 entered. Oh and D needs to be the minimal possible.

Edit2: Even further. Example: Set of A=50 entries is given. We need minimal amount of possible combinations/subsets without repetition with B=9 elements from those 50, that satisfy the following: When a user enters random 9 entries, 15%+ of resulting subsets must have 6 out of 9 that user entered. And the resulting subset must be uniform for any 9 that user can enter.


And I still failed. And I am still clueless of how to solve something like this.

1

There are 1 answers

1
ash On

I explain the simplified version: Let's name your database A, with |A|=50 elements in it. Now 6 elements of these 50 are special somehow and we want to keep track of them. We call the set of these 6 elements C.

Now to our job: We should count all subsets X of A with exactly 9 elements and at least 15% of their elements should come from C. Since 15% of 9 is 1.35 we need at least 2 elements of C in our sets X.

We know that there are binomial(50,9)=2505433700 subsets of A with 9 elements. Now lets count how many of them violate your criteria: there are 44 elements in A which are not in C, so there are binomial(44,9) subsets of A that contain no elements from C. Next we count how many 9-element-subsets of A contain exactly one element of C: We take a random 8-element-subset from A without C and put exactly one element from C to it, so we get 6*binomial(44,8) possibilities.

Now we can write our result, by taking all 9-element-subsets from A and subtracting those, that violate your criteria: binomial(50,9) - binomial(44,9) - 6*binomial(44,8) = 733107430.

Ok... now we know how much there are. But how do we list them? Let's use some pseude-code to do it:

AminC := setminus(A,C)

for j in 2..6 do
    for X1 in subsets(C, j) do
        for X2 in subsets(AminC, 9-j) do
            print(setadd(X1,X2))

This algorithm yields an alternative way of counting your sets:

binomial(6,2)*binomial(44,7) +...+ binomial(6,6)*binomial(44,3)=733107430.

Hope this helps..