I have a scheduling optimisation problem where I need to optimise Activities to be performed in a given timeframe based on priority and dependencies.
The issue is that when creating dependencies, I have to create (and iterate) an insane amount of constraints with the current implementation, which results in issues when solving for high values of time_horizon.
Both dependent_activity and prerequisite activity are a list of binary variables Activity_A_0 which dictates whether an Activity_A is occurring at time t=0.
The current implementation makes sure that a 'prerequisite_activity' has occurred at least as many times as its 'duration' before the 'dependent_activity' can start by checking that the 'dependent_activity' at any point in time is smaller than or equal to the sum of it's prerequisite activities divided by it's duration (ie, it will only be 1 when the whole activity has completed).
import pulp
# Define activities with durations, resource requirements, priorities, and parallel constraints
# Resources are per timeframe (ie, if every hour you need 1 machine, the value is 1 not 1*hours)
# priority is total
activities = {
"Activity A": {
"duration": 2,
"resources": {"resource1": 1, "resource2": 1},
"priority": 3,
},
"Activity B": {
"duration": 3,
"resources": {"resource1": 2, "resource2": 1},
"priority": 2,
},
"Activity C": {
"duration": 1,
"resources": {"resource1": 1, "resource2": 2},
"priority": 1,
},
"Activity D": {
"duration": 2,
"resources": {"resource1": 1, "resource2": 1},
"priority": 4,
},
}
# Define activity dependencies
activity_dependencies = [
("Activity B", "Activity A"),
("Activity C", "Activity B"),
]
for name, value in activities.items():
activities[name]["priority"] = (
activities[name]["priority"] / activities[name]["duration"]
)
# Define the time horizon
time_horizon = 10
# Create a LP problem
problem = pulp.LpProblem("ScheduleOptimization", pulp.LpMaximize)
# Create binary variables for each activity and time slot
activity_vars = {}
for activity in activities:
for t in range(time_horizon):
activity_vars[(activity, t)] = pulp.LpVariable(
f"{activity}_{t}", 0, 1, pulp.LpInteger
)
# Create a variable to represent the total priority
total_priority = pulp.LpVariable("TotalPriority", cat=pulp.LpContinuous)
# Objective: Maximize the total priority
problem += total_priority
# Constraints
## Activity Dependencies
for (dependent_activity, prerequisite_activity), t in itertools.product(
activity_dependencies, range(time_horizon)
):
problem += (
activity_vars[(dependent_activity, t)]
<= pulp.lpSum(activity_vars[(prerequisite_activity, tt)] for tt in range(0, t))
/ activities[prerequisite_activity]["duration"]
), f"Dependency ({dependent_activity},{prerequisite_activity}) t={t}"
# Update total_priority variable to reflect the actual total priority
problem += total_priority == pulp.lpSum(
activity["priority"] * activity_vars[(activity_name, t)]
for activity_name, activity in activities.items()
for t in range(time_horizon)
)
# Solve the problem
problem.solve(pulp.PULP_CBC_CMD(msg=1))
# Print the schedule
schedule = {}
for activity_name in activities:
for t in range(time_horizon):
print(
activity_vars[(activity_name, t)],
"___",
pulp.value(activity_vars[(activity_name, t)]),
)
if (
pulp.value(activity_vars[(activity_name, t)]) == 1
and schedule.get(activity_name) is None
):
schedule[activity_name] = t
print("Optimal Schedule:")
for activity, start_time in schedule.items():
print(f"{activity} starts at time {start_time}")
print(f"Total Priority: {pulp.value(total_priority)}")
Here I expect Activity A to complete before B which in turn needs to complete before C. So A>B>C
The number of these constraints is len(activity_dependencies) * time_horizon and the last of these constraints have as many as time_horizon variables which causes the optimiser to crash.
I need a solution which, ideally, reduces the number of constraints and the size of these constraints.
You might consider defining the problem variables in terms of start and end times of each activity, instead of creating binary variables that represent each activity status at a given time.
Beware that this solution has a caveat that is detailed in the "Notes" section.
Here's the modified optimization problem code:
Execution Time Comparison
The results shown in the table below represent the execution time for the new implementation versus the original problem formulation, using a
time_horizon
equal to1000
.As you can see, the new implementation is much more efficient, since its optimization problem defines fewer variables and constraints, and the constraints being added are much smaller. Additionally, it scales more efficiently as larger
time_horizon
values are set.Notes
Here's the catch of this modified problem formulation: it assumes that for every existing dependent activity, the prerequisite activity start time plus duration never exceeds the defined
time_horizon
value.For example, say you have the following
activities
dictionary andactivity_dependencies
list:For the above example, the modified optimization problem returns the following activities start and end times:
However, since
"Activity E"
duration plus its start time exceeds thetime_horizon
value, the"Activity E"
start and end times should in fact be both equal to 0 instead of 2 and 10.