Looking for suggestions on understanding a particular combinatorial optimization problem

479 views Asked by At

Given a set of items (sized anywhere from 1 to 100) and a number of bins (1 to 15.) Each item having a subset of bins the item can be assigned to and a preference ordering of which bin is best, second best, etc., just for it. Items also have a natural order, represented below by naming, e.g., item1 before item2. Each bin has a capacity between 1 and 5 (every item has identical weight, i.e., 1.)

An example input could be three bins and six items (- indicates a bin is not in the item's usable set, i.e., can't be packed with it):

      | bin1  bin2  bin3                        |  bin1  bin2  bin3   
------------------------               ----------------------------
item1 |    1     2     -               capacity |     4     4     5
item2 |    -     1     2               
item3 |    2     1     3
item4 |    1     2     3
item5 |    1     -     2
item6 |    1     2     3

The goals are (in order with each goal completely overriding any lower goal when there's a conflict, e.g., packing five items is always better than four no matter what number of bins are used or preferences ignored):

  1. maximize number of items packed
  2. pack items in their natural order, e.g., if total bin capacity is one and there are two items, item1 will be packed and item2 not
  3. minimize number of bins used
  4. pack each item according to its bin preferences and natural order, i.e, item1 in its first preference and item2 in its second is better than item1 in its second and item2 in its first
  5. in cases where two solutions are indistinguishable by these goals, either solution is acceptable to rank higher, e.g, as a side-effect of implementation or just arbitrary tie-breaking.

So the input above would be packed as:

      | bin1  bin2  bin3
------------------------
item1 |    x            
item2 |          x     
item3 |          x     
item4 |    x          
item5 |    x      
item6 |    x   

The question then is for what to read/review to help me come up with algorithm ideas for solving this problem with the input sizes from the first paragraph and a time constraint of a few seconds, i.e., not brute force (or at least any brute force I've conceived of so far.) I'm using Ruby and C but language isn't overly relevant at this stage of woods stumbling.

I'll be grateful of any reading suggestions, ideas on combinations of algorithms, or just thoughts on clarifying the problem statement...

Update 1

To be less unclear, while there are many algorithms that cover various parts of this my difficulty is in finding (or perhaps recognizing) information handling all the criteria together, especially minimizing the number of bins used when there is excess capacity and conflicting item-to-bin sets and item preferences, which is hopefully more clearly shown in the following example:


      | bin1  bin2  bin3                        |  bin1  bin2  bin3   
------------------------               ----------------------------
item1 |    1     2     3               capacity |     3     2     3
item2 |    1     2     3          
item3 |    -     1     2

While bin1 is the most preferred, item3 can't be placed in it at all, and while bin2 is the next most preferred for all items, it can hold only two of the three items. So the correct set of assignments (x) is actually the least preferred bin:

      | bin1  bin2  bin3
------------------------
item1 |               x  
item2 |               x  
item3 |               x  

Update 2 I reworked the description with information on how the goals relate and removed the variable of bin priority as it only makes finding an answer less likely and can be worked around elsewhere in the system I'm working on.

3

There are 3 answers

2
j_random_hacker On BEST ANSWER

Suppose there are n items and b bins, and each bin has size s. The ordering of constraints you have added actually simplifies the problem a great deal.

They mean specifically that we should always pick items 1, 2, ..., m for the largest m <= n that will fit in the allotted number of bins (since picking a smaller number would necessarily produce a worse solution by rule 1). Items will be packed in bins in this order, possibly with some bins left incompletely filled (since rearranging items within a bin or across bins would produce a worse solution by rule 2). There are 2 cases:

  1. m < n, meaning that we can't fit all the items. In that case, all b bins will be tightly packed with the 1st m items in that order, and we are done.
  2. m = n, in which case we can fit all the items. We now consider subcases of this case.

In this case, it may be possible that packing bins tightly will leave a final block of 0 < e <= b of the bins completely empty. In that case, discard those final e empty bins and proceed (since using more bins would produce a worse solution by rule 3). In any case, call the final number of bins remaining r. (r = b - e.)

We now know exactly which items and which bins we will be using. We also know the order in which the items must be packed. Because of the ordering constraint, we can regard the decisions about which bins are to be left incompletely filled as the problem of how to inject "start-next-bin" instructions into the ordered list 1, 2, ... n of items. We can inject up to r-1 of these instructions.

This problem can be solved in O(nrs) time using dynamic programming. Essentially we compute the function:

f(i, j, k) = the score of the best solution in which the first i items occupy the first j boxes, with exactly k items in the jth box.

The recurrence is:

f(i, j, 0)     = max(f(i, j-1, k)) over all 0 <= k <= s
f(i, j, k > 0) = f(i-1, j, k-1) + q(i, j)

Where q(i, j) is the quality score of assigning item i to box j. (As I mentioned in the comments on your post, you need to decide on some way to assign scores for a placement of any item i into any box j, presumably based on how well i's preferences are met. If it's easier to work with "badness" values than quality values, just change the max() to a min() and the -infinity boundary values below to infinity.)

The first equation says that the best score of a solution for the first i items whose rightmost bin is empty is equal to the best score that can be found by considering every solution for the first i items without that bin. These candidate solutions consist of all the ways that the previous bin can be packed, including leaving it empty too.

The second equation says that the best score for the first i items whose rightmost bin is not empty is found simply by adding the quality score for placing the last item to the best score for placing the first i-1 items in the same number of bins.

The boundary conditions are:

f(0, 0, 0) = 0
f(i, 0, k) = -infinity for all other i and k

After calculating the values of f(i, j, k) for each 0 <= i <= n, 0 <= j <= r and 0 <= k <= s and storing them in a table, f(n, r, s) will give the optimal score of the final solution. Although this only gives the score of the maximum, the actual optimal solution(s) can be found by tracing back through the f(i, j, k) matrix from the end, at each k = 0 step looking for the predecessor state (i.e. the alternative under the max()) that must have led to the current state. (It may happen that several alternatives under the max() give equal scores, in which case multiple optimal solutions exist, and any of these paths can be followed to find just one of them.)

2
AShelly On

This reminds me of the "Match" algorithm used to place medical school graduates in residency programs. What if you treat the items like students, their bin preferences like the rank lists, and the bins like hospitals?

Basically, you go through the list of items, and for each item, find the bin it prefers most. Check with the bin: do you have room for this item, and if not, do you prefer it more than any items you currently have? If no, cross this bin off the item's list, and move to the item's next choice.
If yes, place this item in the bin, and put the displaced item (if any) back in the unmatched pool.

The difference between your problem and the residency match is that you wouldn't fix the bin's preferences up front. Instead you would use a rule that prefers items that bring the bin closest to 100% full.

My only concern is that this modification might make the algorithm unstable. But it's such a simple algorithm, it's probably worth trying.

2
Jules On

This is a bipartite matching problem and can be solved in polynomial time.

http://en.wikipedia.org/wiki/Matching_(graph_theory)#Maximum_matchings_in_bipartite_graphs