I’ve been thinking a bunch recently about how to use Hypothesis to test constrained optimisation problems. I put together this talk for PyData London, and yesterday I sketched out these tests for how you might test a solution to the knapsack problem (both of these are specifically combinatorial optimisation problems, but the principles I’m outlining here work in general).
Essentially there are only two properties to constrained optimisation problems:
- For all input problems, the optimised solution satisfies the constraints
- For all input problems and solutions satisfying the constraints of those problems, the solution does not score better than the optimised solution.
The first property is usable as is – you write a generator for the problems you’re trying to solve, write a test that runs the optimiser on a generated problem and spits out a solution, and check that the solution satisfies the constraints. This is a super useful test for the correctness of your optimiser and you should definitely write it.
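For the knapsack problem, that test might look something like the following sketch. The solve_knapsack(items, capacity) optimiser and the representation of items as (value, weight) pairs are assumptions for illustration, not the actual code I linked above.

```python
# Minimal sketch: generate random knapsack problems and check that the
# optimiser's solution satisfies the only constraint (it fits the capacity).
# solve_knapsack(items, capacity) is a hypothetical optimiser returning the
# chosen items; each item is a (value, weight) pair.
from hypothesis import given, strategies as st

items = st.lists(
    st.tuples(st.integers(min_value=1, max_value=100),   # value
              st.integers(min_value=1, max_value=100)),  # weight
    min_size=1,
)
capacities = st.integers(min_value=1, max_value=200)

@given(items, capacities)
def test_solution_satisfies_the_capacity_constraint(items, capacity):
    solution = solve_knapsack(items, capacity)
    assert sum(weight for _, weight in solution) <= capacity
```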
But really the one we’re interested in is the second property. We want to assert that the solution our optimiser provides us with is actually optimal.
The problem is that fuzz testers are not optimisers: unless your “optimal” solution is so bad that a randomly picked point beats it (in which case basically any sort of test will demonstrate this), you can’t reliably expect Hypothesis or similar to find an improvement just by asking for arbitrary data.
If we have a reference solution (e.g. by restricting to a sufficiently small corner of the space that we can solve the problem through brute force) then that’s one way of doing it: just run the problem through both your optimiser and the reference solution and assert that they produce solutions with the same score (note: not necessarily the same solution – there may be multiple equally good optima). This also works if your reference solution isn’t very good – if you implement, say, a cheap local optimiser and assert that running it on a randomly picked point doesn’t improve on your optimal solution, that’s still a pretty good test.
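Here’s a sketch of the brute-force variant, keeping the problems small enough that enumerating every subset is cheap (again, solve_knapsack is a hypothetical stand-in for your optimiser, not the linked code):

```python
# Sketch of a reference-solution test: keep the problems small enough that
# brute force over all subsets is feasible, then compare scores, not solutions.
from itertools import combinations
from hypothesis import given, strategies as st

small_items = st.lists(
    st.tuples(st.integers(min_value=1, max_value=10),   # value
              st.integers(min_value=1, max_value=10)),  # weight
    min_size=1, max_size=8,
)

def brute_force_best_score(items, capacity):
    best = 0
    for r in range(len(items) + 1):
        for subset in combinations(items, r):
            if sum(weight for _, weight in subset) <= capacity:
                best = max(best, sum(value for value, _ in subset))
    return best

@given(small_items, st.integers(min_value=1, max_value=30))
def test_optimiser_matches_brute_force(items, capacity):
    solution = solve_knapsack(items, capacity)  # hypothetical optimiser
    assert sum(value for value, _ in solution) == brute_force_best_score(items, capacity)
```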
But I’m interested in testing the case where we don’t have a reference solution. How would we do that?
Without further ado, here are some properties I’ve figured out that seem to work well:
- If we take the optimal solution and perturb it slightly, this should never improve on the optimal solution (see the sketch after this list). Bonus: If you have a local optimiser, perturb the solution slightly and then run the local optimiser on that perturbed solution.
- If we make the problem strictly easier (e.g. in the knapsack problem by raising the value of one of the items we’ve already chosen) then this should strictly improve the score if we rerun the optimiser.
- If we make the problem strictly harder (e.g. by removing one of the chosen items in the knapsack) and rerun the optimiser, this should never improve the score.
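Here’s a sketch of the first and third of these as Hypothesis tests, reusing the same hypothetical solve_knapsack and (value, weight) item representation as above. The swap-based perturbation is just one illustrative choice of perturbation.

```python
# Sketches of the perturbation and "make the problem strictly harder" properties,
# assuming the hypothetical solve_knapsack(items, capacity) from the earlier sketches.
from hypothesis import assume, given, strategies as st

items = st.lists(
    st.tuples(st.integers(min_value=1, max_value=100),   # value
              st.integers(min_value=1, max_value=100)),  # weight
    min_size=1,
)
capacities = st.integers(min_value=1, max_value=200)

def score(selection):
    return sum(value for value, _ in selection)

@given(items, capacities, st.data())
def test_perturbing_the_solution_never_improves_it(items, capacity, data):
    solution = solve_knapsack(items, capacity)
    unchosen = [item for item in items if item not in solution]
    assume(solution and unchosen)
    # One simple perturbation: swap a chosen item for an unchosen one.
    out = data.draw(st.sampled_from(solution))
    into = data.draw(st.sampled_from(unchosen))
    perturbed = [item for item in solution if item != out] + [into]
    assume(sum(weight for _, weight in perturbed) <= capacity)  # must stay feasible
    assert score(perturbed) <= score(solution)

@given(items, capacities, st.data())
def test_removing_a_chosen_item_never_improves_the_score(items, capacity, data):
    solution = solve_knapsack(items, capacity)
    assume(solution)  # need at least one chosen item to remove
    removed = data.draw(st.sampled_from(solution))
    remaining = list(items)
    remaining.remove(removed)  # the problem is now strictly harder
    assert score(solve_knapsack(remaining, capacity)) <= score(solution)
```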
The last seems the most promising to me, because it helps create situations where you break out of local optima: you potentially kill off the local optimum you’ve found yourself in and see if you can find a better one. Notably, both of the tests that demonstrated a problem in the knapsack algorithm (increasing the weight of a chosen item, removing a chosen item) are of this form.
The first is also useful, particularly for testing a local optimiser, because it helps you find new steps you can add to the optimisation process! e.g. if you find that reversing the solution and rerunning the optimiser on the reversed solution sometimes improves matters, why not iterate that to a fixed point in your optimisation process?
The middle one is, I suspect, not that useful, because in general the changes that make the problem easier do so by promoting things that were already chosen, so the optimiser will tend to make the same choices. I might be missing cases where this is actually a useful thing to be doing.
I’m going to keep an eye on this, as it seems to be an under-explored area (not completely unexplored – there’s some research on doing more classic fuzz testing on SMT solvers that I need to read up on) and I think it might inform useful heuristics for other places where you can do property based testing (machine learning for example is something that I’m interested in figuring out how to do).