What a nice blogpost! I really enjoyed how Bruno leads us through the whole process, including his thoughts and strategies for finding the code that needs to be modified and working out how to change it. Luckily the problem - how to get the bytes of a sqlite record - is small enough in scope to solve in a readable blogpost. That’s a „genre“ I really enjoy.
Interesting that he could do it purely by reading the code, or at least presents it that way. I might have started by hooking a debugger to the application and seeing where an insert takes me. Guess many roads lead to Rome.
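For anyone who hasn't looked at what "the bytes of a sqlite record" actually are: here is a minimal sketch of the record encoding as I understand it from the SQLite file format documentation. It is not Bruno's patch and not how the C code is organized. The function names (`encode_varint`, `encode_value`, `encode_record`) are made up for illustration, and it ignores column affinity and other details the real encoder handles.

```python
import struct

def encode_varint(n: int) -> bytes:
    """SQLite-style varint: big-endian, 7 bits per byte, high bit set on
    every byte except the last. This sketch only covers the 1-8 byte forms."""
    if n < 0 or n >= 1 << 56:
        raise ValueError("sketch only handles values below 2**56")
    out = [n & 0x7F]
    n >>= 7
    while n:
        out.append((n & 0x7F) | 0x80)
        n >>= 7
    return bytes(reversed(out))

def encode_value(v):
    """Return (serial_type, body_bytes) for one column value."""
    if v is None:
        return 0, b""
    if isinstance(v, int):
        if v == 0:
            return 8, b""   # constant 0, no body bytes
        if v == 1:
            return 9, b""   # constant 1, no body bytes
        # Serial types 1..6 are big-endian signed ints of 1, 2, 3, 4, 6, 8 bytes.
        for serial_type, size in ((1, 1), (2, 2), (3, 3), (4, 4), (5, 6), (6, 8)):
            try:
                return serial_type, v.to_bytes(size, "big", signed=True)
            except OverflowError:
                continue
        raise ValueError("integer does not fit in 64 bits")
    if isinstance(v, float):
        return 7, struct.pack(">d", v)   # big-endian IEEE 754 double
    if isinstance(v, str):
        data = v.encode("utf-8")
        return 13 + 2 * len(data), data  # odd serial types >= 13 are text
    if isinstance(v, bytes):
        return 12 + 2 * len(v), v        # even serial types >= 12 are blobs
    raise TypeError(f"unsupported type: {type(v).__name__}")

def encode_record(values) -> bytes:
    """Header (its own size, then one serial type per column) followed by the bodies."""
    types, body = b"", b""
    for v in values:
        serial_type, data = encode_value(v)
        types += encode_varint(serial_type)
        body += data
    # The header-size varint counts its own bytes too.
    size = len(types) + 1
    while len(encode_varint(size)) + len(types) != size:
        size = len(encode_varint(size)) + len(types)
    return encode_varint(size) + types + body
```

For example, `encode_record([1, "hi", None]).hex()` gives `040911006869`: a 4-byte header (size 4, then serial types 9, 17 and 0) followed by the two text bytes.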
Not mentioned is that the full sqlite test suite is proprietary, and you need a very expensive SQLite Consortium membership to get access to it. That means (unless you get that membership) your patched/forked version will be less extensively tested than the official version. So sqlite is in reality very difficult to fork. Sqlite is very solid, but bugs do sometimes show up in it despite all that testing, and more relevantly, some bugs in development are presumably caught by the testing, which outsiders don't have access to.
I'm somewhat leery of hacking on sqlite for this reason. It seems to me like a good candidate for RiiR.
Reportedly they didn't earn anything from the test suite, so it is surprising they keep it proprietary.
>The 100% MCDC tests, that’s called TH3. That’s proprietary. I had the idea that we would sell those tests to avionics manufacturers and make money that way. We’ve sold exactly zero copies of that so that didn’t really work out.
Really liked this post. One of the hardest things for me as a junior engineer is navigating complicated codebases and understanding them enough to make the change I want. This gives useful insight on how one might approach a similar problem.
It’s not just you; this is a fundamental challenge in programming. I think this paper by Peter Naur lays out why it’s difficult: software is a lossy representation of a theory held in one or more individuals’ heads. The original author had a model for how a problem could be solved by written code, and how that code might be extended or refactored in the future to solve related problems. No amount of API design or naming conventions or documentation can perfectly capture those ideas.
This reminds me: I’m really looking forward to SQLite blob I/O being added to the Golang library. There is an open PR to add it. It will enable a lot of interesting use cases revolving around streaming of blobs rather than storing them in memory.
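To make the streaming idea concrete, here is a rough sketch of incremental blob I/O. I'm using Python rather than Go, since its standard `sqlite3` module has exposed this since Python 3.11 via `Connection.blobopen`, and I don't want to guess at what the Go PR's API will look like. The helper names (`stream_blob`, `write_blob`), the chunk size, and the `files` table in the usage note are all made up for illustration.

```python
import sqlite3

CHUNK = 4096  # arbitrary chunk size picked for this sketch

def stream_blob(con, table, column, rowid, chunk=CHUNK):
    """Yield a stored blob piece by piece instead of loading it in one go."""
    with con.blobopen(table, column, rowid, readonly=True) as blob:
        while True:
            data = blob.read(chunk)
            if not data:  # empty bytes means we reached the end of the blob
                break
            yield data

def write_blob(con, table, column, rowid, chunks):
    """Write chunks into an existing blob, in order.

    Incremental I/O cannot grow a blob, so the row has to be
    pre-sized first, e.g. with zeroblob(N) in the INSERT.
    """
    with con.blobopen(table, column, rowid) as blob:
        for piece in chunks:
            blob.write(piece)
```

Usage would be something like: `INSERT INTO files(id, data) VALUES (1, zeroblob(?))` with the final size, then `write_blob(con, "files", "data", 1, chunks)`, and later `for piece in stream_blob(con, "files", "data", 1): ...` to read it back without holding the whole thing in memory.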
Generic tangents lead to repetitive, generic discussion which can be quite popular but is much less interesting. In fact they usually are much more popular than the curious, nonrepetitive, obscure sorts of topics that we actually want here. Too much repetition is bad for curiosity.