I haven't tested that, so I'm not sure whether it would work. The import only inserts rows and never deletes, so I don't think that is the cause of the fragmentation. I suspect this line in the VACUUM docs:
> The VACUUM command may change the ROWIDs of entries in any tables that do not have an explicit INTEGER PRIMARY KEY.
means SQLite reorganizes tables by rowid, and that this reorganization is doing most of the work.
Reddit post/comment IDs are 1:1 with integers, though expressed in a different base that is more friendly to URLs. I map decoded post/comment IDs to INTEGER PRIMARY KEYs on their respective tables. I suspect the vacuum operation sorts the tables by their Reddit post ID, and something about this sorting improves table scans, which in turn helps build indices quickly after standing up the DB.
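A minimal sketch of that mapping, in case it helps anyone following along. It assumes the IDs are bare base-36 strings with no type prefix, and the `posts` table, its `title` column, and the function names are hypothetical stand-ins rather than the real schema. Because `id` is declared as INTEGER PRIMARY KEY, it is an alias for the rowid, so VACUUM leaves the values alone.

```python
import sqlite3


def decode_reddit_id(id36: str) -> int:
    """Decode a base-36 ID string into an integer (assumes no type prefix)."""
    return int(id36, 36)


def build_demo_db(path: str = ":memory:") -> sqlite3.Connection:
    """Create a hypothetical posts table keyed by the decoded ID, then VACUUM."""
    conn = sqlite3.connect(path)
    # INTEGER PRIMARY KEY makes `id` an alias for the rowid.
    conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
    # Made-up sample IDs, deliberately inserted out of numeric order.
    samples = [("zz", "later"), ("10", "earlier"), ("1a", "middle")]
    rows = [(decode_reddit_id(id36), title) for id36, title in samples]
    conn.executemany("INSERT INTO posts (id, title) VALUES (?, ?)", rows)
    # VACUUM cannot run inside an open transaction, so commit first.
    conn.commit()
    conn.execute("VACUUM")
    return conn


if __name__ == "__main__":
    conn = build_demo_db()
    for row in conn.execute("SELECT id, title FROM posts ORDER BY id"):
        print(row)
```

This only shows the ID mapping and that the keys survive a VACUUM. It does not demonstrate the scan or index-build speedup, which I still haven't measured in isolation.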
In Python's case, as the article describes quite clearly, the issue is that the design of "working software" (particularly setup.py) was bad to the point of insane (in much the same way as the NPM characteristics that enabled the recent Shai Hulud supply chain attacks, but even worse). At some point, compatibility with insanity has got to go.
Helpfully, though, uv retains compatibility with newer (but still well-established) standards in the Python community that don't share this insanity!
I mean, you’re on the right track in that they did cut out other insanity. But it's unclear how much of the speedup is necessarily tied to breaking backward compat (are there a lot of “.egg” files in the wild?).
Not as far as I can tell, except perhaps in extended-support legacy environments (for example, ActiveState is still maintaining a Python 2.x distribution).
For web development I would say avoid high-level frameworks as much as you can. Most of them are built to hand-hold developers, which is counterproductive to learning the fundamentals, and they are usually inflexible when demands fall outside the "happy path".
Rust's notorious compile times stick out like a sore thumb, partly because other systems languages can run laps before your Rust build is done, and partly because everyone and their grandma swears Rust is blazing fast.
Until you have to compile the program without prior build cache or start a build in a CI pipeline.