Making a change to SQLite source code

under-Peter · on Oct 26, 2022

What a nice blog post! I really enjoyed how Bruno leads us through the whole process, including his thoughts and strategies for finding the code that needs to be modified and working out how to change it. Luckily the problem - how to get the bytes of a sqlite record - is small enough in scope to solve in a readable blog post. That’s a "genre" I really enjoy.

Interesting that he could do it purely by reading the code, or at least presents it that way. I might have started by hooking a debugger to the application and seeing where an insert takes me. Guess many roads lead to Rome.

throwaway81523 · on Oct 26, 2022

Not mentioned is that the full sqlite test suite is proprietary and you need a super expensive sqlite foundation membership to get access to it. That means (unless you get that membership) your patched/forked version will be less extensively tested than the official version. So sqlite is in reality very difficult to fork. Sqlite is very solid, but bugs do sometimes show up in it despite all that testing, and more relevantly, some bugs in development are presumably caught by the testing, which outsiders don't have access to.

I'm somewhat leery of hacking on sqlite for this reason. It seems to me like a good candidate for RiiR.

garaetjjte · on Oct 26, 2022

Reportedly they didn't earn anything from the test suite, so it is surprising they keep it proprietary.

>The 100% MCD tests, that’s called TH3. That’s proprietary. I had the idea that we would sell those tests to avionics manufacturers and make money that way. We’ve sold exactly zero copies of that so that didn’t really work out.

https://corecursive.com/066-sqlite-with-richard-hipp/#billio...

TrustPatches · on Oct 26, 2022

Really liked this post. One of the hardest things for me as a junior engineer is navigating complicated codebases and understanding them enough to make the change I want. This gives useful insight on how one might approach a similar problem.

Waterluvian · on Oct 26, 2022

Is it just me, or is one of the hardest parts of doing so figuring out whether you’re making the change at the right abstraction layer?

jhurliman · on Oct 26, 2022

It’s not just you; this is a fundamental challenge in programming. I think this paper by Peter Naur lays out why it’s difficult: software is a lossy representation of a theory held in one or more individuals’ heads. The original author had a model for how a problem could be solved by written code, and how that code might be extended or refactored in the future to solve related problems. No amount of API design or naming conventions or documentation can perfectly capture those ideas.

https://pages.cs.wisc.edu/~remzi/Naur.pdf

gwd · on Oct 26, 2022

The next blog post can be about how to submit the change to libSQL [1], so other people can take advantage of it.

[1] https://github.com/libsql/libsql

pokstad · on Oct 26, 2022

This reminds me: I’m really looking forward to SQLite blob I/O being added to the Golang library. There is an open PR to add it. It will enable a lot of interesting use cases revolving around streaming of blobs rather than storing them in memory.

idealmedtech · on Oct 26, 2022

Would be great to see how this change affects SQLite’s absolutely gargantuan test suite!

headPoet · on Oct 26, 2022

[flagged]

dang · on Oct 26, 2022

That aspect of SQLite has had many past discussions on HN and is therefore off topic in this thread.

SQLite Code of Ethics (2020) - https://news.ycombinator.com/item?id=32918332 - Sept 2022 (14 comments)

The SQLite Code of Ethics - https://news.ycombinator.com/item?id=31886687 - June 2022 (379 comments)

SQLite – Code of Ethics - https://news.ycombinator.com/item?id=31057104 - April 2022 (2 comments)

SQLite: Code of Ethics - https://news.ycombinator.com/item?id=26547201 - March 2021 (23 comments)

SQLite Code of Ethics - https://news.ycombinator.com/item?id=18297514 - Oct 2018 (19 comments)

For major ongoing topics (like SQLite) it's best to avoid generic tangents. This is in the HN guidelines: https://news.ycombinator.com/newsguidelines.html.

Generic tangents lead to repetitive, generic discussion which can be quite popular but is much less interesting. In fact they usually are much more popular than the curious, nonrepetitive, obscure sorts of topics that we actually want here. Too much repetition is bad for curiosity.