Data streaming emerged as a new software category. It complements traditional middleware, data warehouses, and data lakes. Apache Kafka became the de facto standard. New players are entering the market because of Kafka’s success. One of those is Redpanda, a lightweight Kafka-compatible C++ implementation. This blog post explores the differences between Apache Kafka and Redpanda, when to choose which framework, and how the Kafka ecosystem, licensing, and community adoption impact a proper evaluation.
Is your thesis available to read? Jepsen and these concepts for databases are certainly very different from those for Kafka, as we heard from Kyle during the Jepsen testing for Redpanda. Someone needs to write about those perceptions!
That would work. The delivery guarantee is then at-least-once, and you handle message reliability through de-duplication in the database. In other words, you need to uniquely identify each message to ensure idempotency, since the same message will be written to the database more than once.
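A minimal sketch of that de-duplication idea, assuming every message carries a unique ID and using SQLite only as a stand-in for the real database. The table name `processed`, the function `handle_message`, and the `message_id` field are made up for illustration; nothing here is Kafka or Redpanda API.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory database with a uniqueness constraint on the message ID."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE processed (message_id TEXT PRIMARY KEY, payload TEXT NOT NULL)"
    )
    return conn


def handle_message(conn: sqlite3.Connection, message_id: str, payload: str) -> bool:
    """Write the message once; return False if this ID was already stored.

    The PRIMARY KEY constraint is what makes a redelivered message harmless:
    the second insert fails and the transaction is rolled back.
    """
    try:
        with conn:  # commits on success, rolls back on exception
            conn.execute(
                "INSERT INTO processed (message_id, payload) VALUES (?, ?)",
                (message_id, payload),
            )
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    store = open_store()
    # Simulate at-least-once delivery: "m-1" arrives twice.
    for mid, body in [("m-1", "hello"), ("m-1", "hello"), ("m-2", "world")]:
        print(mid, "stored" if handle_message(store, mid, body) else "duplicate, skipped")
```

Any side effects of the message would have to be written in the same transaction as the ID, otherwise a crash between the two writes brings the duplicates back.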
Just make sure not to mess up the causal ordering between events because of out-of-order retries, if such things are important for your application.
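One way to guard against that, sketched under the assumption that the producer stamps each event with a per-key sequence number starting at 1 (that numbering scheme and the `InOrderApplier` class are hypothetical, not something the broker gives you): hold back events that arrive early and drop retries of events that were already applied.

```python
from collections import defaultdict


class InOrderApplier:
    """Applies events per key strictly in sequence-number order."""

    def __init__(self) -> None:
        self.next_seq = defaultdict(lambda: 1)  # next expected sequence per key
        self.pending = defaultdict(dict)        # early arrivals, keyed by sequence
        self.applied = []                       # (key, seq, event) in apply order

    def receive(self, key: str, seq: int, event: str) -> None:
        if seq < self.next_seq[key]:
            return  # stale retry of an event that was already applied
        self.pending[key][seq] = event
        # Drain everything that is now contiguous with what was applied so far.
        while self.next_seq[key] in self.pending[key]:
            current = self.next_seq[key]
            self.applied.append((key, current, self.pending[key].pop(current)))
            self.next_seq[key] = current + 1


if __name__ == "__main__":
    applier = InOrderApplier()
    applier.receive("acct-1", 2, "withdraw")  # arrives early, held back
    applier.receive("acct-1", 1, "deposit")   # unblocks both events
    applier.receive("acct-1", 1, "deposit")   # late retry, ignored
    print(applier.applied)
```

This only preserves ordering within one key. Causality across keys needs something stronger, which is where it gets expensive.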
Reading sequentially can be hard here, depending especially on throughput and on how well you can or can't shard (e.g. how wide the radius of possible side effects is).
E.g. what if one event is something like "credit here, debit there"? You need to process it sequentially for both sides!
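A toy illustration of why that is awkward to shard, assuming a simple rule that a transfer is rejected when the debited account would go negative. The `Transfer` type, the account names, and the overdraft rule are all invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transfer:
    debit: str   # account the money leaves
    credit: str  # account the money arrives in
    amount: int


def apply_in_order(balances: dict, transfers: list) -> tuple:
    """Apply transfers one by one; both sides of a transfer change together."""
    result = dict(balances)
    rejected = []
    for t in transfers:
        if result.get(t.debit, 0) < t.amount:
            rejected.append(t)  # would overdraw the debit side
            continue
        result[t.debit] = result.get(t.debit, 0) - t.amount
        result[t.credit] = result.get(t.credit, 0) + t.amount
    return result, rejected


if __name__ == "__main__":
    start = {"A": 0, "B": 100, "C": 0}
    b_to_a = Transfer(debit="B", credit="A", amount=50)
    a_to_c = Transfer(debit="A", credit="C", amount=50)

    print(apply_in_order(start, [b_to_a, a_to_c]))  # both succeed
    print(apply_in_order(start, [a_to_c, b_to_a]))  # second order rejects a_to_c
```

The two transfers share account A, so partitioning by account does not let you process them independently: the outcome depends on the order in which both sides are seen.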