Solving Sudoku in Python Packaging

simonw · 2024-10-23T02:20:27 1729650027

I love this so much. I dug around a bit and figured out how it works - I have an explanation (with an illustrative diagram) here: https://simonwillison.net/2024/Oct/21/sudoku-in-python-packa...

Figuring out how it works is a great way to learn a bit more about how Python packaging works under the hood. I learned that .whl files contain a METADATA file listing dependency constraints as "Requires-Dist" rules.

I ran a speed comparison too. Using the uv pip resolver it took 0.24s - with the older pip-compile tool it took 17s.

TeMPOraL · 2024-10-23T10:38:40 1729679920

Tangent, but I wondered what libuv had to do with speeding up Python packaging, and it turns out nothing. I wonder why someone choose to name a pip replacement in a way that effectively collides with several tools and libraries across many languages...

giancarlostoro · 2024-10-23T14:16:52 1729693012

I agree.. While I think it looks amazing, it's a poor naming choice.

seanw444 · 2024-10-23T04:56:13 1729659373

Wow, uv really is fast.

jebebeebehhe · 2024-10-23T06:30:10 1729665010

As is simonw writing that post in under 60m assuming he first saw the concept here on HN.

simonw · 2024-10-23T12:03:26 1729685006

Nah I wrote this one a couple of days ago when I first saw the Sudoku solving project on Mastodon.

jeanlucas · 2024-10-23T13:04:39 1729688679

Thank you, I was going to point out the post on your blog has date of publication lol

Medox · 2024-10-23T08:58:38 1729673918

Using the simonw resolver is took under 3600s *

zahlman · 2024-10-23T06:43:01 1729665781

People keep trying to sell the speed of such solutions as a killer feature for uv, but I think I must not be anywhere near the target audience. The constraint-solving required for the sorts of projects I would typically work on is not even remotely as complex, while I'm bottlenecked by a slow, unreliable Internet connection (and the lack of a good way to tell Pip not to check PyPI for new versions and only consider what's currently in the wheel cache).

the_mitsuhiko · 2024-10-23T11:19:23 1729682363

> while I'm bottlenecked by a slow, unreliable Internet connection (and the lack of a good way to tell Pip not to check PyPI for new versions and only consider what's currently in the wheel cache).

Which is one of the reasons why uv is so fast. It reduces the total times it needs to go to PyPI! Not only does it cache really well, it also hits PyPI more efficiently and highly parallel. Once you resolved once, future resolutions will likely bypass PyPI for the most part entirely.

zahlman · 2024-10-23T11:57:54 1729684674

Oh, that's good to hear. The performance discourse around uv seems to revolve around "written in Rust! Statically compiled!" all the time, but an algorithmic change like that is something that could conceivably make its way back into Pip. (Or perhaps into future competing tools still written in Python. I happen to have a design of my own.)

kvdveer · 2024-10-23T07:58:38 1729670318

Our CI took 2 minutes to install the requirements. Adding UV dropped that to seconds. Now most time is spent on running tests, instead of installing requirements.

Of course we could've cached the venv, but cache invalidation is hard, and this is a very cheap way to avoid it.

zahlman · 2024-10-23T09:54:58 1729677298

Or you could cache the solve, surely? Simply explicitly including your transitive dependencies and pinning everything should do the trick - but there is work being done actively on a lockfile standard, it's just the sort of topic that attracts ungodly amounts of bikeshedding.

(One of these days I'll have to figure out this "CI" thing. But my main focus is really just solving interesting programming problems, and making simple elegant tools for mostly-small tasks.)

simonw · 2024-10-23T18:41:28 1729708888

More significant than the speed improvement in my opinion is the space saving.

The reason uv is fast is that it creates hard links from each of your virtual environments to a single shared cached copy of the dependencies (using copy-on-write in case you want to edit them).

This means that if you have 100 projects on your machine that all use PyTorch you still only have one copy of PyTorch!

zahlman · 2024-10-23T20:59:22 1729717162

This is definitely a feature I wanted to have in my own attempt (hopefully I can start work on it before the end of the year; I think I'll bump up my mental priority for blogging about the design). I also have considered symlink (not sure if this really works) and .pth file-based approaches.

(TIL `os.link` has been supported on Windows since 3.2.)

itsbjoern · 2024-10-23T07:35:38 1729668938

Personally I’m just a fan of people improving dev tooling, regardless of it ultimately making a huge difference to my workflow. I haven’t used uv yet, but I’m still tangentially following it because despite pip and poetry being great tools I have had my fair share of grievances with them.

yunohn · 2024-10-23T14:38:50 1729694330

Using uv (IME) should be a drop-in replacement for almost all of your python packaging needs. I highly recommend giving it a shot - it’s saved me measurable time in just a few months.

ilyagr · 2024-10-23T18:01:13 1729706473

How does it encode the idea of having all the numbers on each line/square?

visarga · 2024-10-23T04:11:25 1729656685

That's why it feels like installing a ML repo is like sudoku. You install everything and at the last step you realize your neural net uses FlashAttention2 which only works on NVIDIA compute version that is not deployed in your cloud VM and you need to start over from scratch.

hskalin · 2024-10-23T07:07:42 1729667262

Sometimes I just change the version of the package in requirements to fit with others and pray that it works out (a few times it does)

pjc50 · 2024-10-23T08:58:31 1729673911

See the discussion on why sqlite insists on vendoring its build dependencies as far as possible and not using, say, CMake.

austinjp · 2024-10-23T08:55:06 1729673706

This describes the day I wasted on Monday before I gave up and wrote some damn deterministic code instead of using some damn AI.

nicman23 · 2024-10-23T07:16:50 1729667810

honestly if the ml does not have a docker image - not compose no build an image- i do not even bother any more

anthk · 2024-10-23T15:58:26 1729699106

Guix fixes that in the spot.

chatmasta · 2024-10-23T04:13:34 1729656814

Here’s the same thing in Poetry (2022): https://www.splitgraph.com/blog/poetry-dependency-resolver-s...

teschmitt · 2024-10-23T06:17:11 1729664231

Was just about to say: I've seen this before but building it with a universally usable requirements.txt is even cooler.

echoangle · 2024-10-23T04:52:40 1729659160

> Solving the versions of python package from your requirements is NP-complete, in the worst case it runs exponentially slow. Sudokus are also NP-complete, which means we can solve sudokus with python packaging.

Is that actually sufficient? Can every system that’s solving something that’s NP-complete solve every other NP-complete problem?

tzs · 2024-10-23T14:38:22 1729694302

> Can every system that’s solving something that’s NP-complete solve every other NP-complete problem?

Others have given the answer (yes) and provided some links. But it is nice to have an explanation in thread so I'll have a go at it.

The key idea is the idea of transforming one problem to another. Suppose you have some problem X that you do not know how to solve, and you've got some other problem Y that you do know how to solve.

If you can find some transform that you can apply to instances of X that turns them into instances of Y and that can transform solutions of those instances of Y back to solutions of X, then you've got an X solver. It will be slower than your Y solver because of the work to transform the problem and the solution.

Now let's limit ourselves to problems in NP. This includes problems in P which is a subset of NP. (Whether or not it is a proper subset is the famous P=NP open problem).

If X and Y are in NP and you can find a polynomial time transformation that turns X into Y then in a sense we can say that X cannot be harder than Y, because if you know how to solve Y then with that transformation you also know how to solve X albeit slower because of the polynomial time transformations.

In 1971 Stephen Cook proved that a particular NP problem, boolean satisfiability, could serve as problem Y for every other problem X in NP. In a sense then no other NP problem can be harder than boolean satisfiability.

Later other problems were also found that were universal Y problems, and the set of them was called NP-complete.

So if Python packaging is NP-complete then every other NP problem can be turned into an equivalent Python packaging problem. Note that the other problem does not have to also be NP-complete. It just has to be in NP.

Sudoku and Python Packaging both being NP-complete means it goes both ways. You can use a Python package solver to solve your sudoku problems and you can use a sudoku solver to solve your Python packaging problems.

zahlman · 2024-10-23T06:40:12 1729665612

>Can every system that’s solving something that’s NP-complete solve every other NP-complete problem?

Yes, by definition (https://en.wikipedia.org/wiki/NP-completeness , point 4).

rolisz · 2024-10-23T05:00:48 1729659648

Yes, NP complete means that every other NP problem is reducible to it.

arjvik · 2024-10-23T07:32:20 1729668740

I think for Sudoku to be NP-Complete, it needs to be generalized to arbitrary board sizes (at the very least)

tetha · 2024-10-23T09:37:57 1729676277

Correct, if the complexity guys talk about NP-complete sudoku, it's always about solving Sudokus of an arbitrary, but fixed and finite size.

The problem class of "Solve an arbitrary Sudoku of Size 9" might even be constant runtime, since it's a finite set to search through.

SJC_Hacker · 2024-10-23T17:46:53 1729705613

Does the Sudoku size have to be perfect square, or can other sizes exist?

I suppose you could leave some blank and make the squares the next largest perfect square

tetha · 2024-10-23T19:19:39 1729711179

Hm. The biggest problem there are the definition of "diagonal" and "box".

I mean non-square sudokus are still in NP. If a solution can be validated in polynomial time, a problem is in NP.

And to validate a (x, y) sized sudoku, you need to check (x * y) [ number of total squares] * (x [row] + y [column] + max(x, y)ish [diagonal-ish] + ((x/3) + (y/3)) [box]). The boxes might be weird, but boxes are smaller than the overall sudoku, so we are looking at some max(x, y)^4 or so. Same for the diagonals. The input is x*y numbers, so max(x, y)^2. Very much polynomial in the size of the input[1]

And it should also be easy to show that if an (n, n) sized sudoku has a solution, an (n+k, n+k) sized sudoku has a solution. You kinda shove in the new numbers in knights-kinda moves and that's it.

1: this can be a bit weird, because you need to be careful "what you're polynomial in". If your input is a number or two, you might be polynomial in the magnitude of the number, which however is exponential with the input length.

In this case however, we wouldn't have encoding shenanigans, since we're just placing abstract symbols from the turing machine's alphabet onto an imagined grid.

blharr · 2024-10-24T14:18:36 1729779516

I looked it up, and there are 6.6e21 9x9 sudokus, which I wouldn't consider constant time practically

tetha · 2024-10-24T20:28:31 1729801711

That is another very interesting part of complexity theory, yeah.

Like, "Constant time" means, "Runtime independent from input". And, well, solving any sudoku of size 9 or less is a constant times 6.6e21. Maybe a bit more in the exponent, but meh.

Like in graph theory, there are some algorithms for I think maxflow, which solve the thing in O(node_count^4). Theory can push this down to like O(node_count^3) or O(node_count^2.7) or less. That's amazing - you can lose almost 2 orders of magnitude.

Well, implementation of these algorithms and more detailed analysis point out _huge_ precomputations necessary to achieve the speedups. In practice, you'd only see speedups if you had graphs with multiple billions of nodes. In practice, if you deal with a boring subset like "realistically relevant", asymptotically worse algorithms may be the objectively better choice.

Like in this case. Here, some O(n^5) - O(n^9) depending on what the solver does can be better than O(1) for many practical purposes.

In such areas, intuition is little more than a lie.

empath75 · 2024-10-23T13:02:43 1729688563

https://www.youtube.com/watch?v=6OPsH8PK7xM

This video I think makes it obvious why that's true in a pretty intuitive way. I posted it a few days ago as a link and it never got traction.

SAT is the equivalent of being able to find the inverse of _any_ function, because you can describe any function with logic gates (for obvious reasons), and any collection of logic gates that describes a function is equivalent to a SAT problem. All you need to do is codify the function in logic gates, including the output you want, and the ask a SAT solver to find the inputs that produce that output.

yochem · 2024-10-21T14:12:11 1729519931

No way pip actually is a really inefficient SAT solver!

stabbles · 2024-10-23T08:42:23 1729672943

For a long time it was not because there was no backtracking.

Now it is just an exhaustive, recursive search: for the current package try using versions from newest to oldest, enqueue its dependencies, if satisfied return, if conflict continue.

taeric · 2024-10-23T18:22:19 1729707739

If there was no backtracking, that implies it couldn't solve every sudoku? That is rather amusing with the implication that it couldn't solve every dependency, as well?

fernandotakai · 2024-10-23T12:04:43 1729685083

uv actually talks about this in their resolver docs https://docs.astral.sh/uv/reference/resolver-internals/

alentred · 2024-10-23T12:57:59 1729688279

This is BRILLIANT ! I knew of a trend to implement lots of different things at compile-time (in Scala and Haskell communities at least) - definitely fun and quirky, but it never seemed that "special". This one, it has an air of old-school computer magic around it, probably because it is so elegant and simple.

mi_lk · 2024-10-23T08:20:42 1729671642

See also this 2008 post using Debian package system to solve Sudoku:

https://web.archive.org/web/20160326062818/http://algebraict...

ziofill · 2024-10-23T01:27:01 1729646821

but how does it know the constraints?

thangngoc89 · 2024-10-23T02:08:05 1729649285

This is the content of sudoku_0_0-1-py3-none-any.whl. So when the (0,0) cell is 1, none of the cells in the same row, column and subgrid should be 1.

    Requires-Dist: sudoku_0_1 != 1
    Requires-Dist: sudoku_0_2 != 1
    Requires-Dist: sudoku_0_3 != 1
    Requires-Dist: sudoku_0_4 != 1
    Requires-Dist: sudoku_0_5 != 1
    Requires-Dist: sudoku_0_6 != 1
    Requires-Dist: sudoku_0_7 != 1
    Requires-Dist: sudoku_0_8 != 1
    Requires-Dist: sudoku_1_0 != 1
    Requires-Dist: sudoku_2_0 != 1
    Requires-Dist: sudoku_3_0 != 1
    Requires-Dist: sudoku_4_0 != 1
    Requires-Dist: sudoku_5_0 != 1
    Requires-Dist: sudoku_6_0 != 1
    Requires-Dist: sudoku_7_0 != 1
    Requires-Dist: sudoku_8_0 != 1
    Requires-Dist: sudoku_0_1 != 1
    Requires-Dist: sudoku_0_2 != 1
    Requires-Dist: sudoku_1_0 != 1
    Requires-Dist: sudoku_1_1 != 1
    Requires-Dist: sudoku_1_2 != 1
    Requires-Dist: sudoku_2_0 != 1
    Requires-Dist: sudoku_2_1 != 1
    Requires-Dist: sudoku_2_2 != 1

jsnell · 2024-10-23T01:36:36 1729647396

The constraints are going to be static and independent of the puzzle. So I expect they're encoded in the package dependencies. So for example version 1 of the package sudoku_0_0 will conflict with all of: version 1 of sudoku_[0-8]_0; version 1 of sudoku_0_[0-8]; version 1 of [012]_ [012].

roywiggins · 2024-10-23T04:20:34 1729657234

generate_packages makes it moderately clear:

https://github.com/konstin/sudoku-in-python-packaging/blob/m...

IshKebab · 2024-10-23T04:03:10 1729656190

Yeah they missed out the actual interesting bit from the readme...

worewood · 2024-10-23T11:33:12 1729683192

This is type of cool hacking I like to see. Kudos! (Or better, Sukodus :) )

niyonx · 2024-10-21T02:38:36 1729478316

How did you even think of that? Nice!

revskill · 2024-10-23T09:43:51 1729676631

This is a hack.

jessekv · 2024-10-23T14:21:27 1729693287

And why I come here for... er, news.

anthk · 2024-10-23T13:09:45 1729688985

Now, in MicroLisp, Common Lisp and maybe Emacs' Elisp too:

http://www.ulisp.com/show?33J9