Python Gekko Max Equation Length Error for Integer Programming

44 views Asked by At

I've used Gekko to solve an MILP problem, where I am expected to select the optimal price for each product, in order to maximize my objective function (the total profit). However, I have had to slice the dataframe, as if I include the whole data set, I get a Max Equation Length Error. Is there any way to resolve this to input a larger data frame in to the solver? I was originally using the PuLP package, but the execution time of the code was taking unreasonably long for larger datasets. I've attached my code below.

# Preprocess data
price_matrix = []
profit_matrix = []
revenue_matrix = []
grevenue_matrix = []
num_prices_per_product = []

df = df_org[:200]

# Iterate over unique products
for product in df['Product'].unique():
    # Filter data for the current product
    product_data = df[df['Product'] == product]
    
    # Extract unique price points and sort in descending order
    price_points = np.sort(product_data['Sales Price'].unique())[::-1]
    num_prices = len(price_points)
    
    # Preallocate arrays
    product_profit_matrix = np.zeros(num_prices)
    product_revenue_matrix = np.zeros(num_prices)
    product_grevenue_matrix = np.zeros(num_prices)

    # Iterate over price points
    for i, price in enumerate(price_points):
        # Filter data for the current price point
        price_data = product_data[product_data['Sales Price'] == price]
        if not price_data.empty:
            # Assign values to matrices
            product_profit_matrix[i] = price_data['Profit Margin'].iloc[0]
            product_revenue_matrix[i] = price_data['Revenue'].iloc[0]
            product_grevenue_matrix[i] = price_data['Gross revenue'].iloc[0]

    # Append matrices and other information
    price_matrix.append(price_points)
    profit_matrix.append(product_profit_matrix)
    revenue_matrix.append(product_revenue_matrix)
    grevenue_matrix.append(product_grevenue_matrix)
    num_prices_per_product.append(num_prices)

start = time.time()

# Initialize gekko model
m = GEKKO(remote=False)

# Decision variables
x = {}
for i, product_name in enumerate(df['Product'].unique()):
    for j in range(max(num_prices_per_product)):
        x[(product_name, j)] = m.Var(value=0, lb=0, ub=1, integer=True)

# Objective function (maximize total profit)
total_profit = 0
for i, product_name in enumerate(df['Product'].unique()):
    for j in range(num_prices_per_product[i]):
        total_profit += profit_matrix[i][j] * x[(product_name, j)]
m.Maximize(total_profit)

# Constraint: Each product must have exactly one selected price
for i, product_name in enumerate(df['Product'].unique()):
    m.Equation(sum(x[(product_name, j)] for j in range(num_prices_per_product[i])) == 1)

# Discount Constraint
revenue_difference = 0
for i, product_name in enumerate(df['Product'].unique()):
    for j in range(num_prices_per_product[i]):
        revenue_difference += (grevenue_matrix[i][j] - revenue_matrix[i][j]) * x[(product_name, j)]
discount_constraint = 0.13 # Set your desired maximum discount
discount_tolerance = 0.01
total_gross_revenue = 0
for i, product_name in enumerate(df['Product'].unique()):
    for j in range(num_prices_per_product[i]):
        total_gross_revenue += grevenue_matrix[i][j] * x[(product_name, j)]
m.Equation(revenue_difference <= (discount_constraint + discount_tolerance) * total_gross_revenue)
m.Equation(revenue_difference >= (discount_constraint - discount_tolerance) * total_gross_revenue)

# Profit Constraint
profit_constraint = 6000
profit_tolerance = 0.05
m.Equation(total_profit <= profit_constraint + (profit_tolerance * profit_constraint))
m.Equation(total_profit >= profit_constraint - (profit_tolerance * profit_constraint))

# Solve the optimization problem
m.solve(disp=True)

end = time.time()
print("Optimization Time:", end - start)

# Print the results
print("Status:", m.options.SOLVESTATUS)
print("Total Profit ($):", total_profit)

# Print the selected prices
results_df = pd.DataFrame(selected_prices, columns=["Product", "Selected Price Point", "Value (AED)"])
print(tabulate(results_df, headers='keys', tablefmt='fancy_grid', showindex=False))

I tried using Gekko and PuLP to solve a MILP problemn, but ran in to issues with using my algorithm on large datasets.

1

There are 1 answers

0
John Hedengren On

A simple problem demonstrates the issue and a potential solution. There are n randomly generated products, each with a random price and revenue. The objective is to maximize profit by selecting the best 10 products. The solution selects the 10 products with the highest margins.

from gekko import GEKKO
import numpy as np

# Random data
n=500
price = np.random.randn(n)
revenue = np.random.randn(n)

# Create Gekko model
m = GEKKO(remote=False)

# Decision variables
x = [None]*n
for i in range(n):
   x[i] = m.Var(value=0,lb=0,ub=1,integer=True)

# Select 10 products out of n
m.Equation(sum(x)==10)

# Maximize profit
profit = 0
for i in range(n):
   profit+=x[i]*(price[i]-revenue[i])
m.Maximize(profit)   

m.options.SOLVER=1
m.solve()

When n>=413, there is the error that a symbolic expression is over 15,000 characters.

 APM model error: string >       15000  characters
 Consider breaking up the line into multiple equations
 
 The may also be due to only using newline character CR
   instead of CR LF (for Windows) or LF (for MacOS/Linux) 
 To fix this problem, save APM file with appropriate newline characters
 
 STOPPING...

There are two changes that address the long expression problem.

  1. Use m.sum() instead of sum(). This uses the gekko summation that takes longer to compile, but doesn't generate a single large expression that may violate the 15,000 character limit.
# Select 10 products out of n
m.Equation(m.sum(x)==10)
  1. For the objective function, try using multiple objective functions without the summation.
# Maximize profit
for i in range(n):
   m.Maximize(x[i]*(price[i]-revenue[i]))

Now the model can solve with n=500 or many more products.

from gekko import GEKKO
import numpy as np

# Random data
n=5000
price = np.random.randn(n)
revenue = np.random.randn(n)

# Create Gekko model
m = GEKKO(remote=False)

# Decision variables
x = [None]*n
for i in range(n):
   x[i] = m.Var(value=0,lb=0,ub=1,integer=True)

# Select 10 products out of n
m.Equation(m.sum(x)==10)

# Maximize profit
for i in range(n):
   m.Maximize(x[i]*(price[i]-revenue[i]))

m.options.SOLVER=1
m.solve()

Here is the solution with n=5000 products.

 --------- APM Model Size ------------
 Each time step contains
   Objects      :            1
   Constants    :            0
   Variables    :         5001
   Intermediates:            0
   Connections  :         5001
   Equations    :         5001
   Residuals    :         5001
 
 Number of state variables:           5001
 Number of total equations: -            2
 Number of slack variables: -            0
 ---------------------------------------
 Degrees of freedom       :           4999
 
 ----------------------------------------------
 Steady State Optimization with APOPT Solver
 ----------------------------------------------
Iter: 1 I: 0 Tm: 0.02 NLPi: 3 Dpth: 0 Lvs: 0 Obj: -4.47E+01 Gap: 0.00E+00
 Successful solution
 
 ---------------------------------------------------
 Solver         :  APOPT (v1.0)
 Solution time  :   3.630000000703149E-002 sec
 Objective      :   -44.6851835859332     
 Successful solution
 ---------------------------------------------------

You can inspect the symbolic expressions in the temporary run folder m._path or open the folder with m.open_folder(). Open the text file gk0_model.apm to see the variables, equations, and objective function as an archive of what was compiled to byte-code.

Large-scale problems with many binary decision variables may solve slowly. There are APOPT solver options that may help to speed up the solution with the MINLP solver. If the problem is MILP, there are good solvers such as CPLEX and Gurobi that are likely more efficient than using an MINLP solver.