Is there a standard algorithm to balance overlapping objects into buckets?

605 views Asked by At

I have a bunch of users, with a given start and end time, e.g.:

{ Name = "Peter", StartTime = "10:30", EndTime = "11:00" },
{ Name = "Dana", StartTime = "11:00", EndTime = "12:30" },
{ Name = "Raymond", StartTime = "10:30", EndTime = "14:00" },
{ Name = "Egon", StartTime = "12:00", EndTime = "13:00" },
{ Name = "Winston", StartTime = "10:00", EndTime = "12:00" }

I want to put them into buckets, based on times that they overlap (based on a configurable threshold, e.g., they need to overlap at least half an hour). I want buckets to be ideally 4 items big, but any range from 2-5 is acceptable.

In the example above, no 4 people match, so I'd have a bucket of 3 (Peter, Raymond, Winston) and one of 2 (Dana, Egon).

I've prototyped an algorithm that seems to rely on chance rather than science:

  1. Order the List by StartTime
  2. Create an empty bucket
  3. Pick the first user from the List
  4. Check that user against all users in the bucket
  5. If that user overlaps with everyone in the bucket, put that person in it and remove it from the list
  6. If the bucket has the ideal size (4) or if I'm looping and checking the same user more than three times, close the bucket and create a new, empty one

This works well for the first few buckets, but leads to buckets with only 2 people that could be combined better.

I could change the algorithm to remove all ideal buckets from the list and reshuffle and try some more, but I feel that this should be a common problem - it's like shift assignments for workers, or maybe the knapsack problem.

Does anyone know a standard algorithm for this type of problem?

(Tagged combinatorics because I think this is the area of math it applies, correct me if wrong)

3

There are 3 answers

0
David Eisenstat On BEST ANSWER

tl;dr: dynamic programming for the win (O(sort(n)) time).

First, note that bucketing contiguously in start-time order is fine.

Proposition (defragging): Let a, b, c, d be distinct users such that StartTime(a) ≤ StartTime(b) ≤ StartTime(c) ≤ StartTime(d). If X and Y are valid buckets such that a, c ∈ X and b, d ∈ Y, then X - {c} ∪ {b} and Y - {a} ∪ {d} are valid buckets as well.

I only know how to prove this by tedious case analysis (omitted).

The upshot is, you can pretend as though you're breaking a paragraph into lines, where the “paragraph“ is the list of users in start-time order, and each “line” is a bucket. There is an algorithm due to Knuth and Plass for line breaking optimally, where the penalty for a given line is a more or less arbitrary function. You could make buckets of 4 cost 0, buckets of 3 cost 1, buckets of 2 cost 2, and buckets of 1 cost 100, for example.

1
Orren Ravid On

Based on your problem I would probably do something like first making a class called "Person" pr something like that. Give this class attributes of "Name", "Start Time", and "End Time".

class Person
{
     public string name;
     public double start_time;
     public double end_time;
}

Then put them in some ordered list of type Person. (Also I'm currently storing the times as doubles. You can convert them back to times simply by multiplying any decimal part of the time that I have by 60/100.

Afterwords, you make a list of Buckets to which you can add new Buckets if need be. You then sort the list based on a threshold which you define and if two objects being compared overlap based on that threshold, then those both go into that Bucket. If they don't overlap, then move to the next Bucket, if there is an overlap there then add it to that Bucket, etc until you reach the last Bucket. If you have gone through all the Buckets and there is still no overlap, then create a new Bucket for that object.

class MainFunc
{    
    static void Main(string[] args)
    {    

         //makes a function to check if 2 values overlap over a threshold
         //st stands for start time and et stands for end time

         public bool IsWithinThreshold(double st1, double st2, double et1, double et2)
         {
             double threshold = .5;
             if(st1 >= et2 || st2 >= et1)
             {
                 return false
             }
             else
             {
                 if(st1+threshold <= et2 && st1+threshold <= et1 || st2+threshold <= et1 && st2+threshold <=et2)
                 {
                     return true;
                 }
                 else
                 {
                     return false;
                 }
             }
         }           
        // makes objects of type Person with attributes of name, start time, and end time

        Person Peter = new Person();
        Peter.name = "Peter"
        Peter.start_time = 10.5
        Peter.end_time = 11.0

        Person Dana = new Person();
        Dana.name = "Dana"
        Peter.start_time = 11.0
        Peter.end_time = 12.5

        Person Raymond = new Person();
        Raymond.name = "Raymond"
        Raymond.start_time = 10.5
        Raymond.end_time = 14.0

        Person Egon = new Person();
        Egon.name = "Egon"
        Egon.start_time = 12.0
        Egon.end_time = 13.0

        Person Winston = new Person();
        Winston.name = "Winston"
        Winston.start_time = 10.0
        Winston.end_time = 12.0

        //puts objects of type Person into an unordered list

        List<Person> people = new List<Person>();
        people.Add(Peter);
        people.Add(Dana);
        people.Add(Raymond);
        people.Add(Egon);
        people.Add(Winston);

        //sets up a list of lists of People (Buckets in our case)

        List<List<Person>> Buckets = new List<List<Person>>;

        //sets up an intial Bucket and adds the first person on the list to it

        List<Person> Bucketinitial = new List<Person>;
        Bucketinitial.add(people[0]);


        for(var i = 1; i < people.Count; i++)
        {
            for(var j = 0; j< Buckets.count; j++)
            {
                //sets a checker to make sure that all objects in a given Bucket overlap with the person we are checking

                bool overlap = true;

                for(var k = 0; k< Buckets[k].count; k++)
                {
                overlap = overlap & IsWithinThreshold(people[i].start_time,Buckets[j][k].start_time,people[i].end_time,Buckets[j][k].end_time)
                }

                if (overlap == true)
                {
                    Buckets[j].add(people[i])
                }

                //if all the objects in a bucket don't overlap with the person...
                //... make a new Bucket with that person
                else
                {
                    List<Person> NewBucket = new List<Person>;
                    NewBucket.add(people[i]);
                    Buckets.add(NewBucket);
                }
            }
        }
    }
}

Then just add a print command to print out the name attributes of each object within each Bucket of the buckets list. Please leave a comment if you have any questions/concerns, cheers.

0
dpmcmlxxvi On

You can modify your algorithm to incorporate an interval tree to speed up the search

  1. Order the List by StartTime
  2. Add items to an interval tree
  3. Create an empty bucket
  4. Pick the first item from the List
  5. Find earliest items within threshold time of first item that fill up a bucket using an interval search of the interval tree
  6. Remove bucketed items from list
  7. If list is empty stop otherwise go to step 4

Basically you're moving from left to right in interval steps (given by your configurable threshold) using the interval tree to quickly query for closest items as you move.