> “In practice they are very efficient. They solve industrial-scale problems all the time,” sometimes with tens of thousands of variables.
Although the number of variables or constraints is not a good measure of the hardness of an ILP, "tens of thousands" is not that unusual. During my PhD, I often solved programs with millions of variables and constraints. These programs were so large that even "building" them in the solver library took several seconds. Impressively, Gurobi was often able to optimize them in under 24 hours. I have two results here where Gurobi solved 1M × 8M problems to optimality in under 2 hours. That was several years ago, so I expect newer versions to be even faster. Even for larger problems (around 5M × 50M) it often found provably near-optimal solutions (within 5% of optimality) very fast. Gurobi is really an impressive piece of software, and free for academic use.
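For anyone who wants to try this at home, here is a minimal gurobipy sketch of the workflow being described, with made-up data and a toy knapsack structure (nothing here is the commenter's actual model, and a Gurobi license is required):

    import random
    import gurobipy as gp
    from gurobipy import GRB

    # Toy stand-in for a "large" model: one knapsack constraint over n binaries.
    n = 100_000
    random.seed(0)
    weight = [random.randint(1, 100) for _ in range(n)]
    profit = [random.randint(1, 100) for _ in range(n)]

    m = gp.Model("toy_large_mip")
    x = m.addVars(n, vtype=GRB.BINARY, name="x")
    m.setObjective(gp.quicksum(profit[i] * x[i] for i in range(n)), GRB.MAXIMIZE)
    m.addConstr(gp.quicksum(weight[i] * x[i] for i in range(n)) <= 0.3 * sum(weight))

    m.Params.MIPGap = 0.05        # accept provably within-5%-of-optimal solutions
    m.Params.TimeLimit = 7200     # or stop after 2 hours, whichever comes first
    m.optimize()

The MIPGap/TimeLimit pattern is how you get the "provably within 5% of optimality" behaviour mentioned above.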
I was on the CPLEX team for a few years. The core developers are all PhDs from Stanford, MIT, etc. So it's very hardcore stuff, no less so than AI research.
Completely agree that size is not a good proxy for estimating MIP difficulty. Internally we collected a bunch of very hard problems from different domains. Some are actually pretty small, say a few thousand variables/constraints. IMHO, what makes hard problems difficult to solve is actually the 'internal structure' of the problem. And modern industrial solvers all have a lot of built-in heuristics to take advantage of that structure, i.e., what kind of cuts and which presolve/diving/branching strategies to apply, and how to get good bounds ASAP.
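To make the "which strategies to apply" point concrete, here is a hedged sketch of the knobs a user can turn in Gurobi (the parameter names are real Gurobi parameters; the instance file is hypothetical, and CPLEX exposes analogous switches):

    import gurobipy as gp

    m = gp.read("hard_instance.mps")   # hypothetical instance file
    m.Params.Presolve = 2              # aggressive presolve
    m.Params.Cuts = 2                  # aggressive cut generation
    m.Params.Heuristics = 0.2          # spend ~20% of effort on primal heuristics
    m.Params.MIPFocus = 1              # focus on finding good incumbents early
    m.optimize()

The defaults already do a lot of structure detection internally; hand-tuning like this usually only pays off on genuinely hard instances.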
Interestingly, at some point some folks even tried using machine learning to predict strategies. It didn't work very well back then (10+ years ago). There was some work using seq2seq models for MIP (pointer networks, I think) a few years ago; it worked OK. So I'm really looking forward to some breakthroughs from LLMs.
It's a shame that after IBM acquired ILOG (which owns CPLEX), most of the people left for Gurobi.
> IMHO, what makes hard problems difficult to solve is actually the 'internal structure' of the problem.
20 years ago I wrote a master's thesis in computer vision. The stereo matching algorithm I developed could be expressed as one big integer linear program. But after pondering it for some time, I realized it could also be expressed as a dynamic programming problem with tiny integer linear programs as subproblems. That reduced the runtime by a factor of 1000 or more.
I feel most of these big search problems could be solved much more easily and quickly with some form of annealing, tree search, or dynamic/greedy algorithms, with results very close to the theoretical linear optimum.
> with results very close to the theoretical linear optimum
In this case I could prove it was the globally optimal solution.
But this was only possible of course due to the internal structure of the problem: it was in effect a simpler problem hiding within a linear integer program. Standard solvers couldn’t find this structure, but it was possible to do by hand.
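Not the author's actual algorithm, but a hypothetical sketch of the general pattern being described: a chain-structured problem (here in the shape of scanline stereo, choosing a disparity per pixel) where each tiny per-stage subproblem is solved exactly and the stages are stitched together by dynamic programming:

    def chain_dp(match_cost, n_pixels, n_disp, lam=1.0):
        """Minimize sum_x match_cost(x, d[x]) + lam * |d[x] - d[x-1]| over d."""
        INF = float("inf")
        dp = [[0.0] * n_disp for _ in range(n_pixels)]
        back = [[0] * n_disp for _ in range(n_pixels)]

        for d in range(n_disp):
            dp[0][d] = match_cost(0, d)

        for x in range(1, n_pixels):
            for d in range(n_disp):
                best, best_prev = INF, 0
                for d_prev in range(n_disp):          # tiny subproblem per pixel
                    c = dp[x - 1][d_prev] + lam * abs(d - d_prev)
                    if c < best:
                        best, best_prev = c, d_prev
                dp[x][d] = match_cost(x, d) + best
                back[x][d] = best_prev

        # Backtrack the globally optimal disparity assignment for this scanline.
        d = min(range(n_disp), key=lambda k: dp[-1][k])
        path = [d]
        for x in range(n_pixels - 1, 0, -1):
            d = back[x][d]
            path.append(d)
        return path[::-1]

Because the chain has no cycles, this finds the exact global optimum of the per-scanline objective in O(pixels · disparities²), which is the kind of speedup over a monolithic ILP the parent describes.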
What's the situation with CPLEX these days? Is IBM still investing any serious resources into it? Gurobi keeps getting massively better (either in performance or expressiveness) every release.
CPLEX has shown no progress in the benchmarks in the past few years, so it is safe to assume that the number of developers they have working on it is either zero or maybe one.
Not the same person but maybe "Mixed-Integer Nonlinear Programming: A Survey of Algorithms and Applications" could be a good start? I haven't read it yet.
I am an academic in the field. A good starting point would be Tobias Achterberg's PhD thesis:
"Constraint Integer Programming"
It details the implementation of SCIP, which is one of the leading open source solvers, and explains some of the most important tricks.
That said, there is a lot of literature out there; if you're interested in a particular aspect such as presolving or cutting planes, feel free to ask me further questions. It is worth noting that the implementation specifics of the top commercial solvers are kept hidden, so those are difficult to find anywhere.
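If it helps to make "cutting planes" concrete for other readers, here is a textbook-style knapsack cover cut (my own toy example, not taken from the thesis):

    % Binary variables with a single knapsack constraint:
    x_1, x_2, x_3 \in \{0, 1\}, \qquad 3x_1 + 4x_2 + 5x_3 \le 6.
    % The set {1, 2} is a "cover" because 3 + 4 > 6, so the cover inequality
    x_1 + x_2 \le 1
    % is satisfied by every 0/1-feasible point, yet it cuts off the
    % LP-feasible fractional point (x_1, x_2, x_3) = (1, 3/4, 0).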
I love how researchers call a piece of code doing actual work "slow" once it takes a few seconds to run, and yet Windows 11 often takes multiple seconds to show me the context menu when I right-click something.
The Windows 11 context menu lag (and overall Windows bloat) is a product of Conway's law. Microsoft creates bloated software because it's a monopoly. So, "optimization" in this case would essentially be antitrust action.
I think the other piece going into it is that hardware keeps getting faster. I would imagine especially so the hardware of the average Windows developer. If something doesn't generally appear to be slow to those building it, they're unlikely to spend a lot of time optimizing.
I'm a fan of including benchmarks in CI pipelines. The Windows developers should have a historical record [1] of the performance degradation and be forced to look at it daily.
To be fair, that machine is pretty high-spec for NT 3.51, given that the minimum requirements are 12 MB RAM and a 386 (which tops out at 40 MHz). It'd certainly be nice to see that kind of performance in newer Windows versions, though.
> I think the other piece going into it is that hardware keeps getting faster.
It's also that you're expected to update Windows whether or not they've actually improved performance. Why would Microsoft prioritize that when the bulk of their sales are driven by other concerns, like enterprise sales and what some might call basic functionality (e.g. Windows Defender, when it was first released)?
Hell, now they're moving to a service model and there's even less reason to court consumers directly.
It depends on what you're using it for. If you use an optimization algorithm after every keystroke to relayout the paragraph you're typing, it had better not take a few seconds to run. (And indeed there are optimization algorithms for that particular problem that are much faster than a few seconds.)
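For instance, per-keystroke paragraph layout can be done with a tiny dynamic program in the spirit of the classic Knuth–Plass algorithm; a hedged toy sketch (simplified quadratic badness, hypothetical word widths), not TeX's actual implementation:

    def break_paragraph(word_widths, line_width, space=1):
        """Return (start, end) word index pairs for each line, minimizing badness."""
        n = len(word_widths)
        INF = float("inf")
        best = [INF] * (n + 1)      # best[j] = min total badness laying out words[:j]
        prev = [0] * (n + 1)
        best[0] = 0.0
        for j in range(1, n + 1):
            width = -space
            for i in range(j, 0, -1):            # words i-1 .. j-1 on the last line
                width += word_widths[i - 1] + space
                if width > line_width:
                    break
                badness = (line_width - width) ** 2
                if best[i - 1] + badness < best[j]:
                    best[j] = best[i - 1] + badness
                    prev[j] = i - 1
        # Recover the chosen line breaks.
        breaks, j = [], n
        while j > 0:
            breaks.append((prev[j], j))
            j = prev[j]
        return breaks[::-1]

Re-running this over a single visible paragraph after each keystroke is far below human-perceptible latency for typical paragraph sizes.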
* Surprisingly, en.wikipedia has more information than ru.wikipedia, but neither even has a hint as to headcounts over the years, let alone more technical details.
I have the book “planning in the user” somewhere packed away. I remember it speaking about the number of variables and size of the problem in it. Sorry I don’t remember the numbers off the top of my head.
It is called duality. For each linear program that is a maximization, you can find a dual linear program which is a minimization. They both have the same optimal objective value, so one serves as a lower bound and the other as an upper bound on that value. The gap between them is used to measure the closeness to the global optimum.
The great thing is that you can write down the dual really easily, and every feasible solution of one problem gives a bound on the optimal value of the other.
LPs can prove optimality of a result, but sometimes this is too expensive and you stop as soon as you reach near-optimality based on known bounds.
For several problems you can prove lower and/or upper bounds for the optimal results. For example you can prove that a route between two points may not be shorter than their distance.
If you use LP in production you often don't care about optimality and can work with suboptimal results very well. Worst case you stop after a timeout and just pick the best result.
Edit: I forgot to mention that while solving an (I)LP you usually get dynamically computed bounds. The branch and bound algorithm works by solving relaxed problems whose optimal values bound the optimum of the original problem.
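For readers following along, this is the standard primal/dual pair and the gap that branch and bound reports, in textbook notation (not specific to any solver):

    \text{(P)}\quad \max_{x \ge 0} \; c^{\top}x \ \text{ s.t. } Ax \le b
    \qquad\qquad
    \text{(D)}\quad \min_{y \ge 0} \; b^{\top}y \ \text{ s.t. } A^{\top}y \ge c

    \text{Weak duality: } c^{\top}x \le b^{\top}y \ \text{ for every feasible pair } (x, y).

    \text{Relative MIP gap: }\ \mathrm{gap}
      = \frac{\lvert \text{best bound} - \text{best incumbent} \rvert}
             {\lvert \text{best incumbent} \rvert}.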
I helped implement production and labor planning software on top of FICO Xpress some years ago. The paradigm of LP/ILP was all new to me, though I was very much into math. Our software was solving hierarchical/nested optimization problems involving millions of variables and constraints every 15 minutes. It felt like magic that this was possible. I would really encourage anyone who has never worked with these tools before to explore them, as they can open up a palette of tools and ideas you may not have thought of before. A great free way to get into this is to read the examples and use PuLP for Python.
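To give a flavour of what that looks like, here is a minimal PuLP model with made-up numbers (a toy production plan, solved with the CBC solver that ships with PuLP); it only shows the shape of the API, not anything resembling the system described above:

    from pulp import LpProblem, LpMaximize, LpVariable, lpSum, value

    products = ["A", "B", "C"]
    profit = {"A": 20, "B": 30, "C": 25}
    hours = {"A": 2, "B": 4, "C": 3}

    prob = LpProblem("toy_production_plan", LpMaximize)
    make = {p: LpVariable(f"make_{p}", lowBound=0, cat="Integer") for p in products}

    prob += lpSum(profit[p] * make[p] for p in products)          # objective
    prob += lpSum(hours[p] * make[p] for p in products) <= 40     # labor budget
    prob += make["A"] + make["B"] >= 5                            # demand commitment

    prob.solve()
    for p in products:
        print(p, value(make[p]))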
The optimization solver landscape changes slowly enough that a 5-year-old overview is still very relevant. However, in recent years the open source solver HiGHS (https://highs.dev/) has emerged and even surpassed the older open source solvers.
Linearity has been a sticking point since the very beginning, with the famous exchange between Hotelling and Von Neumann. MILP solvers are amazing, and what they can solve feels like it should be impossible, but even so you get stuck sometimes banging nonlinear pegs into linear holes.
Note this is an improvement to a theoretical bound and doesn’t present any immediate improvements for most practically solvable Integer Programming problems (which are generally solved with more efficient heuristics).
Yes, and in practice you also often want to solve 'mixed' integer programming problems, i.e. those that have both continuous and integer variables.
That works really well in practice (because most solvers relax your (mixed) integer programming problem to a fully continuous problem anyway, and then try to 'repair' any fractional solutions as necessary), but is apparently really hard to analyse in theory.
Once we know it can be done, there is a much higher chance somebody will look really hard into it and, with a bit of luck, come up with a practical algorithm, or at least a good one for some edge cases.
What struck me when writing integer programs was how quickly the rule and variable count grows with the problem. Expressing simple ideas often required O(n²) rules, as every pair of objects needed constraints relating them. The default solver in PuLP solved all the problems with ease.
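A hedged sketch of that O(n²) pairwise pattern in PuLP (hypothetical "conflict" data; at most one constraint per clashing pair):

    from itertools import combinations
    from pulp import LpProblem, LpMaximize, LpVariable, lpSum

    items = list(range(6))
    score = {i: i + 1 for i in items}
    conflicts = {(0, 1), (2, 3), (1, 4)}          # made-up pairs that clash

    prob = LpProblem("pairwise_demo", LpMaximize)
    pick = {i: LpVariable(f"pick_{i}", cat="Binary") for i in items}
    prob += lpSum(score[i] * pick[i] for i in items)

    for i, j in combinations(items, 2):           # n*(n-1)/2 candidate rules
        if (i, j) in conflicts or (j, i) in conflicts:
            prob += pick[i] + pick[j] <= 1        # at most one of a clashing pair

    prob.solve()

Only the pairs that actually conflict get a constraint here, but in the worst case that is still O(n²) rows.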
> “The proof basically says that you can solve an integer linear program, in theory, almost as well as you can solve one where the integer variable is only zero or one,”
I'm not sure I understand this. It's usually the other way around.
A linear problem is super fast. With integers it goes to branch and bound (a kinda fancy name for a smart exhaustive search). But lots of 0/1 variables still make your problem slow.
If the parameters are allowed to be real numbers, it's very fast: just move "up" to an edge (constraint), then follow the edges upward to your maximum.
If the parameters need to be integers, then it becomes way more difficult. The optimum for the continuous case is probably not valid. So you need to enumerate a lot of points.
If the parameters can just be 0 or 1, it's a special case of the integer case: you still need to enumerate, but way less points.
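A tiny made-up instance of that contrast (not from the article):

    \max\ x + y
    \quad\text{s.t.}\quad 2x + 3y \le 6,\ \ 3x + 2y \le 6,\ \ x, y \ge 0.

    \text{Continuous optimum: } x = y = \tfrac{6}{5},\ \text{value } \tfrac{12}{5}
    \text{ (found by following the edges).}

    \text{Integer optimum: } (1,1),\ (2,0),\ \text{or } (0,2),\ \text{each with value } 2,
    \text{ so the relaxed optimum is infeasible and branch and bound must branch,}
    \text{ e.g. on } x \le 1 \ \text{vs.}\ x \ge 2.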
> If the parameters are allowed to be real numbers, it's very fast: just move "up" to an edge (constraint), then follow the edges upward to your maximum.
Sounds like the simplex method, which is complicated and/or has potentially quite bad worst-case complexity. (You can implement it naively in very little code, and I’ve done this in some programming competitions, but you shouldn’t do this for serious use.) You can achieve better asymptotic behavior with “interior point” methods, where you arrange for all the constraints to repel your optimum, solve a (nice, convex, polynomial-time) problem to find the best point that touches none of the constraints, and then relax the repulsion from the constraints a bit and iterate. All that being said, a good simplex implementation may well run circles around interior point methods for well-behaved inputs.
(I’m a fan of interior point methods because they can also naturally handle fairly large classes of nonlinear constraints while still running in more or less polynomial time. Check out CVXOPT for a nice example of what you can do with great ergonomics.)
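As a concrete taste of that, a minimal CVXOPT example with toy numbers, solved by its interior-point routine (minimize c^T x subject to G x <= h):

    from cvxopt import matrix, solvers

    # minimize  -4x - 5y   (i.e. maximize 4x + 5y)
    # subject to 2x +  y <= 3,  x + 2y <= 3,  x >= 0,  y >= 0
    c = matrix([-4.0, -5.0])
    G = matrix([[2.0, 1.0, -1.0, 0.0],    # each inner list is one variable's column
                [1.0, 2.0, 0.0, -1.0]])
    h = matrix([3.0, 3.0, 0.0, 0.0])

    sol = solvers.lp(c, G, h)             # interior-point cone LP solver
    print(sol["x"])                       # approximately [1.0, 1.0]

The same flavour of API extends to quadratic and more general cone constraints, which is the ergonomics point above.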
The point I was trying to make (it seems I failed) is that in the real case, you have a sense of direction (gradient) while this does not really exist in the discrete case.