Hi @Stephane,
Thank you for providing a complete example to reproduce the issue. In this model you have parameters and variables indexed over two sets (locations and items). Since you have 3,000 items and 30,000 locations, this means 90,000,000 variables if all items are available in all locations. This may still be possible to solve with Gurobi or CPLEX on a machine with a substantial amount of memory, but it is way bigger than anything that an open-source solver will be able to tackle nowadays.
To get an idea of the model size, you can print the shape of the data being passed to AMPL:
print(cartesian_locations.unstack().shape)
Running the complete reproduction code below displays (21191625,) (i.e., 21,191,625 rows):
import polars as pl

orders = pl.read_csv("./data/orders.csv", separator=";")
locations = pl.read_csv("./data/locations.csv", separator=";")

# Uncomment to restrict the reproduction to a few items:
# orders = orders.filter(
#     (pl.col("ITEM_ID") == 34682)
#     | (pl.col("ITEM_ID") == 34657)
#     | (pl.col("ITEM_ID") == 29840)
# )

# Cross join: every ordered item paired with every location
cartesian_locations = (
    orders.select("ITEM_ID").unique().join(locations, how="cross", suffix="_LOCATIONS")
)

# Combine the carton flags into a single boolean column
# (the combined column keeps the name "CART_STD1")
cartesian_locations = cartesian_locations.select(
    pl.col("ITEM_ID"),
    pl.col("LOCATION_ID"),
    (
        pl.col("CART_STD1")
        | pl.col("CART_STD2")
        | pl.col("CART_DEMI-HAUT")
        | pl.col("CART_VOLUMINEUX")
    ),
)

# Cast the boolean flag to 0/1
cartesian_locations = cartesian_locations.with_columns(
    pl.col("CART_STD1").apply(lambda col: int(col))
)

# Pivot into one row per item and one column per location
cartesian_locations = cartesian_locations.pivot(
    values="CART_STD1",
    index="ITEM_ID",
    columns="LOCATION_ID",
    aggregate_function="first",
)

# Convert to pandas, transpose, and unstack into one entry per item/location pair
cartesian_locations = cartesian_locations.to_pandas()
cartesian_locations = cartesian_locations.set_index("ITEM_ID").transpose()
df = cartesian_locations.unstack()
print(df.shape)
You can filter this data to include only the items available in each location. That should help improve performance and reduce memory usage substantially.
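For example, here is a minimal sketch in polars, applied right after the with_columns step above and before the pivot (the combined flag column keeps the name CART_STD1, as noted in the code):

# Keep only the (ITEM_ID, LOCATION_ID) pairs where the combined flag is set
valid_allocations = cartesian_locations.filter(pl.col("CART_STD1") == 1)
print(valid_allocations.shape)  # should be far smaller than 21,191,625 rows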
Note that instead of having variables indexed over {Items, Locations}, you should index them over the set of valid allocations to reduce the number of variables. This also makes the constraint Valid_Allocation_Constraint unnecessary. Nevertheless, if the number of valid allocations remains in the order of millions, you will only be able to solve the model with commercial solvers.
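As a rough sketch of what this could look like with amplpy (the set and variable names below are just illustrative, not taken from your model):

from amplpy import AMPL

ampl = AMPL()
ampl.eval(
    """
    # Illustrative declarations: index the variables over the valid pairs only,
    # instead of the full {Items, Locations} cross product.
    set VALID_ALLOCATIONS dimen 2;
    var Allocate {VALID_ALLOCATIONS} binary;
    """
)
# Load the filtered (item, location) pairs computed with polars above:
ampl.set["VALID_ALLOCATIONS"] = list(
    valid_allocations.select(["ITEM_ID", "LOCATION_ID"]).iter_rows()
)

Constraints that previously iterated over {Items, Locations} can then iterate directly over VALID_ALLOCATIONS.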