It's honestly not that hard to build a WAL-style solution exactly the way you wan...

akiselev · on Dec 1, 2021

100% agree. I only used wal2json because my risk tolerance for the project was between "can't be bothered to pay the onboarding/maintenance cost of Kafka for Debezium" and "can't be bothered to implement a reader for the (well documented IIRC) stable binary WAL format", which is a weird spot to be in. This was a PoC meant to demonstrate how we could implement the features we needed from ES using no more than standard Postgres tooling and a tiny service, and it was good enough to roll right into production with minor changes. It took under a week, though ideally I would have taken the time to decode the binary WAL format directly after saving the raw stream for safety's sake. Rust wasn't even an option at the time; nowadays it'd be mostly a bunch of macros and annotations with a sprinkling of hand-written FromBytes implementations and a tiny bit of IO+serde glue code.
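To give a rough idea of how small the "tiny service" side can be, here is a minimal sketch (not the code from that project) that turns one wal2json message into flat audit records. It assumes wal2json's format-version 1 JSON layout (`change`, `kind`, `columnnames`, `columnvalues`, `oldkeys`); the `audit_records` function and the shape of the record it yields are hypothetical names chosen for illustration.

```python
import json
from typing import Any, Dict, Iterator


def audit_records(wal2json_message: str) -> Iterator[Dict[str, Any]]:
    """Yield one flat audit record per row change in a wal2json v1 message.

    Assumes the format-version 1 layout; the output record shape is
    hypothetical and only meant for illustration.
    """
    payload = json.loads(wal2json_message)
    for change in payload.get("change", []):
        # New row values are present for inserts and updates.
        new_row = dict(
            zip(change.get("columnnames", []), change.get("columnvalues", []))
        )
        # Old key values are present for updates and deletes.
        oldkeys = change.get("oldkeys", {})
        old_key = dict(
            zip(oldkeys.get("keynames", []), oldkeys.get("keyvalues", []))
        )
        yield {
            "op": change["kind"],
            "table": f'{change["schema"]}.{change["table"]}',
            "key": old_key or None,
            "row": new_row or None,
        }


if __name__ == "__main__":
    # Hand-written sample message, not captured from a real database.
    sample = json.dumps({
        "change": [{
            "kind": "insert",
            "schema": "public",
            "table": "users",
            "columnnames": ["id", "name"],
            "columntypes": ["integer", "text"],
            "columnvalues": [1, "ann"],
        }]
    })
    for record in audit_records(sample):
        print(record)
```

In a real service the message would come from a logical replication slot rather than a hand-written string, and the raw stream would be saved before any decoding, as mentioned above.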

IIRC it took another data scientist and an engineer under a month to turn the raw WAL logs into an audit log interface with pretty SSO avatars and weekly reporting. Someone from the devops team with DBA experience implemented time-traveling staging DBs with continuous archiving and point-in-time recovery from production in the same amount of time. Someone else later improved it so PITR used full backups created from a filtered WAL log, so devs could select which parts of the production DB they copied over instead of each babying their own staging cluster that took days to rebuild. The whole project ended up giving us all of the benefits of event sourcing using standard, well-tested tooling for a fraction of the cost.

We've thrown around the phrase "not invented here syndrome" so much that we've overcorrected - as humans are wont to do - and now the younger generation thinks architectures like event sourcing or infrastructure like Kafka are a better solution than just consuming replication logs over a TCP connection to one of the most popular open source databases on the planet (not directed at the GP but at my former coworkers :)). I'm starting to wonder if I've reached the age where I sound like the adults in the Peanuts cartoons, except the sound vaguely resembles "Get your resume driven development off my lawn!"

gunnarmorling · on Dec 1, 2021

> "can't be bothered to pay the onboarding/maintenance cost of Kafka for Debezium"

Debezium can also be used without Kafka. One option is Debezium Engine [1], where you embed it as a library into your JVM-based application and it invokes a callback method you registered for every change event it receives. That way, you can react to change events in any way you want within your application itself, no messaging infrastructure required. The other option is Debezium Server [2], which uses the embedded engine to connect Debezium to all sorts of messaging/streaming systems, such as Apache Pulsar, Google Cloud Pub/Sub, Amazon Kinesis, Redis Streams, etc.

[1] https://debezium.io/documentation/reference/stable/developme...

[2] https://debezium.io/documentation/reference/stable/operation...