Hey, I'm the author of Jocko. I've been working at Confluent the past four years.
I just finished writing a book that shows how to build similar distributed services from scratch. It walks through building a simple distributed commit log with built-in consensus and service discovery, from nothing to deployment: https://pragprog.com/titles/tjgo/distributed-services-with-g...
I'm not sure, I haven't heard of that happening before. I've asked my editor if she knows what's up. You can email me at tj at my HN username dot com if you want to follow up.
Hi Alexander! I'm a big fan of the work you do at vectorizedio and of your blog posts! I strongly believe there's a need for much simpler event streaming and room for improvement performance-wise.
I took a different path by divorcing from the Kafka protocol and experimenting with an approach to reliable event processing that I believe is simpler to model [1].
That makes total sense; I agree. Though in the future, the Kafka protocol will be an implementation detail, i.e. once you start moving to an HTTP proxy with typed schemas.
Oh hell yeah! That's great news, tons of work went into this -- props to the contributors!
I will take the opportunity to say that Kafka is kind of painful, with or without ZK. Check out NATS! [0]. It doesn't solve all the same problems, but is so much easier to use (during development especially) and can do a lot of the same things.
Correct; but I've seen many uses of Kafka that NATS could totally be used for. For example, load balancing across subscribers (use a NATS queue instead of a Kafka consumer group).
NATS doesn't ever store messages persistently; but this might be fine for your application, and then you don't have to worry about setting 5 different config options to make sure Kafka actually frees up disk space like you expect it to ;)
NATS also enables some unique patterns like request/reply via a "reply to" message header.
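For anyone curious what those patterns look like in code, here's a minimal sketch using the Java client (io.nats:jnats), assuming a local `nats-server`; the subject and queue names are made up:

```java
import io.nats.client.Connection;
import io.nats.client.Dispatcher;
import io.nats.client.Message;
import io.nats.client.Nats;
import java.nio.charset.StandardCharsets;
import java.time.Duration;

public class NatsPatterns {
    public static void main(String[] args) throws Exception {
        Connection nc = Nats.connect("nats://localhost:4222");

        // Queue group: messages published to "orders" are load-balanced
        // across all subscribers sharing the queue name "workers",
        // roughly what you'd reach for a Kafka consumer group to do.
        Dispatcher workers = nc.createDispatcher(msg ->
                System.out.println("worker got: "
                        + new String(msg.getData(), StandardCharsets.UTF_8)));
        workers.subscribe("orders", "workers");

        // Request/reply: the requester gets a private reply-to subject
        // stamped on the message; the responder just publishes back to it.
        Dispatcher service = nc.createDispatcher(msg ->
                nc.publish(msg.getReplyTo(), "pong".getBytes(StandardCharsets.UTF_8)));
        service.subscribe("ping");

        Message reply = nc.request("ping",
                "hello".getBytes(StandardCharsets.UTF_8), Duration.ofSeconds(1));
        if (reply != null) { // request() returns null on timeout
            System.out.println("reply: " + new String(reply.getData(), StandardCharsets.UTF_8));
        }

        nc.close();
    }
}
```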
It depends what you mean by 'persistently'. Normally NATS Streaming will delete messages after they have been delivered to all subscribers successfully and some expiration time has passed.
I'm aware, I was referring to the core binary, `nats-server`. NATS streaming server seems to be still receiving attention, but the client library (for Java) hasn't been committed to since 2019, so I'm not sure I'd build a new project with it. JetStream is out (as of this week, I believe) and is an optional module to `nats-server`, as you said.
To the folks at Synadia -- I love NATS, but the naming and organization of these projects could use some work. What's with the `stan.*` repository names? Where did "jetstream" come from? Why is it baked into `nats-server` but `nats-streaming-server` isn't? Is `nats-streaming-server` on the back burner?
Stan is NATS Streaming. The clients don't have to be updated since they're forward compatible with JetStream. My understanding is that NATS Streaming will be deprecated once JetStream is GA. Is there a bug in the library you found?
So I guess pay-as-you-go is too expensive for your org? I'd be curious to understand why you feel it's too expensive when it completely eliminates Kafka management for you.
I believe NATS Streaming[0] and the upcoming NATS JetStream[1] could be more relevant to those who are looking for a Kafka alternative.
They offer persistent messages and at-least-once delivery, similar to Kafka.
NATS by itself doesn't handle the persistence side at all (by design). It's at-most-once delivery, not at-least-once.
That being said, have you checked out NATS Streaming Server? It’s effectively a first party client for NATS that gives it at least once semantics and persistence, and makes it much more applicable to use cases that are currently on Kafka.
Is there any message ordering guarantee in NATS? With Kafka you can achieve this by using keyed messages; messages in the same partition will always be ordered.
> messages from a given single publisher will be delivered to all eligible subscribers in the order in which they were originally published. There are no guarantees of message delivery order amongst multiple publishers.
I believe that with JetStream, messages in a stream are ordered as they are written. JetStream has a concept of a consumer (in the broker itself, not the client), which can consume a subset of the stream, filtered by message subject.
I've never really understood the appeal of ordered messages. You end up splitting your data across partitions anyways for parallelism, so who cares? What systems out there require strictly ordered data? It seems like any design that requires something like that is going to be extremely brittle.
Right, but that means you're still "unordered" across those partitions?
> TCP/IP ?
But TCP/IP isn't delivered in order; it reassembles out-of-order packets using their sequence numbers. I guess ordered delivery would be nice for that, but I just feel like making your protocol not require ordering is far simpler.
Not to mention that both TCP and Kafka have to handle head of line blocking?
I'm not trying to say that ordering is bad or anything, I just feel like it isn't buying me tons.
> Right, but that means you're still "unordered" across those partitions?
Right, so related messages have an ordering guarantee but unrelated messages may be processed out of order relative to each other, which is usually what you want. (Of course you do have to set the record key correctly).
> I'm not trying to say that ordering is bad or anything, I just feel like it isn't buying me tons.
It's a lot more lightweight than full ACID, but if you get your dataflow right it achieves everything that a traditional database does. Without ordering you wouldn't be able to do anything that requires any kind of consistency.
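To make the keying concrete, here's a minimal Java producer sketch; the topic, key, and values are invented, but the mechanism is just Kafka's default partitioner hashing the record key:

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;
import java.util.Properties;

public class KeyedProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // The default partitioner hashes the key, so every event for
            // "customer-42" lands on the same partition and is consumed in
            // the order it was produced. Events for other customers land on
            // other partitions and may interleave freely, which preserves
            // parallelism across unrelated keys.
            producer.send(new ProducerRecord<>("orders", "customer-42", "created"));
            producer.send(new ProducerRecord<>("orders", "customer-42", "paid"));
            producer.send(new ProducerRecord<>("orders", "customer-42", "shipped"));
        }
    }
}
```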
Hm, ok, yeah. So I guess I can see what you mean. I've never had a use case where I felt comfortable relying on any kind of message ordering, and I always rely on my application-level logic to handle that, or ensure the system is resilient despite ordering (i.e. commutative operations only).
To me, it seemed at odds with the parallelism of a partition, but I suppose in this case you'd be partitioning on some sort of semantic key vs, say, a hash.
Thanks for bearing with me on that, this was just an unfamiliar idea for me.
Maybe if you're an e-business, you'll split everything happening on your website by client id, but still want events belonging to a single client to be received in order, for practicality.
NATS is just ephemeral at-most-once pub/sub. It needs NATS Streaming or the new Jetstream for at-least-once persisted data and still has different semantics.
Apache Pulsar offers the same distributed log offering with a fundamentally better architecture, but Kafka has closed most of the gaps now and has far more integrations and a bigger ecosystem.
I"m not sure I understand your question; NATS streaming server (built on top of NATS) supports persistence to disk, a raft group, a SQL table, etc. and it appears the various storage implementations have mechanisms [0] to delete or compress old data.
That being said, I don't think this is what differentiates the two systems, the guarantees they do/don't make are likely what will make the decision for your project.
Kafka is a pretty cool technology, but on every project I work on it never gets used, because it feels like overkill (costly and operations-heavy). Maybe I should start looking for bigger projects :D
Part of the reason we are removing Kafka's ZooKeeper dependency is to get rid of that "heaviness."
Going forward, you will no longer need to configure and run a separate ZooKeeper service just to run Kafka. For proof-of-concept projects, a single-process Docker image will be available when running in KRaft mode (non-ZK mode).
For bigger projects, you may want to use a managed cloud service. Or if you do choose to manage it yourself, it will be easier running one service than two.
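For reference, a single-node KRaft configuration looks roughly like the sketch below (the node id, ports, and paths are illustrative; the shipped example lives at config/kraft/server.properties in the early-access release). Before first start you also initialize the log directory with the `kafka-storage.sh format` tool.

```
# Sketch of a minimal single-node KRaft server.properties
process.roles=broker,controller              # one process plays both roles
node.id=1
controller.quorum.voters=1@localhost:9093    # the metadata Raft quorum, here one voter
listeners=PLAINTEXT://localhost:9092,CONTROLLER://localhost:9093
controller.listener.names=CONTROLLER
log.dirs=/tmp/kraft-logs
```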
Oh it most certainly simplifies things. I am looking at half the number of boxes needed to run. Which is not insignificant in my cost structure.
What is the migration strategy here? Is it documented yet? I am having flashbacks to a recent migration of follower partitions, which required a decent amount of pre-planning of partition layout.
Also, as it pulls the duties of ZK into Kafka, what sort of CPU/memory changes are you seeing? Is it 'meh', or all the way to 'you may want to add a couple of CPUs and a few more GB'? And is it working OK with stretched clusters?
Also, if you want to hit an interesting market, you may want to look at 'does it run OK on a Raspberry Pi'.
You can tune Kafka down fairly well if you know what you're doing, but it's not optimised for that OOTB. Or just use Confluent Cloud, which is fully managed and scales down as low as you want (costs cents per GB). Disclosure: work for Confluent.
This is great advice IMO: let someone else manage your Kafka at scale. I feel compelled to mention that other Apache Kafka managed services are available, but I agree that it makes sense to offload the complexity if possible! Disclosure: work at Aiven, who offer managed Apache Kafka on whatever cloud you are using.
Confluent Cloud is a truly 'fully managed' service, with a serverless-like experience for Kafka. For example, you have zero infra to deploy, upgrade, or manage. The Kafka service scales in and out automatically during live operations, you have infinite storage if you want it (via transparent tiered storage), etc. As the user, you just create topics and then read/write your data. Similar to a service like AWS S3, pricing is pay-as-you-go, including the ability to scale to zero.
Kafka cloud offerings like AWS MSK are quite different, as you still have to do much of the Kafka management yourself. It's not a fully managed service. This is also reflected in the pricing model, as you pay per instance-hours (= infra), not by usage (= data). Compare to AWS S3—you don't pay for instance-hours of S3 storage servers here, nor do you have to upgrade or scale in/out your S3 servers (you don't even see 'servers' as an S3 user, just like you don't see Kafka brokers as a Confluent Cloud user).
Secondly, Confluent is available on all three major clouds: AWS, GCP, and Azure. And we also support streaming data across clouds with 'cluster linking'. The other Kafka offerings are "their cloud only".
Thirdly, Confluent includes many additional components of the Kafka ecosystem as (again) fully managed services. This includes e.g. managed connectors, managed schema registry, and managed ksqlDB.
There's a more detailed list at https://www.confluent.io/confluent-cloud/ if you are interested. I am somewhat afraid this comment is coming across as too much marketing already. ;-)
One nice thing about Confluent Cloud vs. MSK is that the minimum cost of a Confluent Cloud cluster is far, far lower than the minimum cost of an MSK cluster.
I haven't used it myself, but I've heard it mentioned enough to remember it: Redpanda[1] aims to be a Kafka replacement without having to worry about ZooKeeper or the JVM.
For people who just need a queue, Kafka is a bit like using Kubernetes to run a single Docker container.
We run a number of Kafka clusters, most relatively low-traffic, and the management overhead is pretty low. Earlier versions did require a bit more attention, but mostly it's pretty simple to deal with.
Kafka is awesome, but using it in local envs is a pain in the ass. Even if this never becomes PROD-ready, it is already an immense achievement to be able to run Kafka locally with less complexity and overhead.
Yeah, it really needs a use case that justifies it. I have a particular IoT backend where I made it pluggable between Kafka and RabbitMQ, and ended up just using RabbitMQ as it is simpler to work with and manage, and I'm still not really pushing it in terms of performance with thousands of devices.
All this talk about NATS as an alternative to Kafka, with no mention of Redpanda's ZooKeeper-less alternative (written in modern C++ - https://vectorized.io/).
What about the BSL makes it a deal breaker? You can host it yourself; you just can't compete with their hosted offering to other customers. I think that's fair: they get to monetize what they make. And then after three years, IIRC, all the code that was BSL (for that version, anyway) converts to an even more permissive license.
I avoided language such as "deal breaker" or "problematic" and chose rather to just say "it would require discussion" in ways that I have not once ever encountered with an Apache licensed project.
Doesn’t seem to be one. They took the Kafka API so it’s a drop in replacement and redid the implementation from the ground up. Your mileage may vary though.
I am just worried that things like third parties integrating with your Kafka broker will stop working or experience issues. Also, I wonder if SASL_SSL is implemented exactly the same way, etc.
Finally. I assume there must be good reasons beyond "that's what Hadoop has always used", but philosophically I never understood why you would introduce yet another network dependency to handle elections. It really adds to the operational complexity, from having to manage the ZooKeeper cluster to having to fight against DNS.
Before Raft (2013), if you wanted a reliable, consistent distributed metadata store, you had to implement Paxos, which is notoriously difficult to get right. Every service that needed some type of leader election or highly consistent store let ZooKeeper deal with that problem (Mesos, Spark, Druid, Storm, and a ton of others).
After Raft, it became easier to just implement that layer yourself, and so most projects started after Raft (or, probably more accurately, once people started seeing how stable etcd was, ~2014) just used Raft internally where they would previously have used ZooKeeper.
Raft is a large improvement over Paxos for practical implementations, but it's still tricky to get right. As far as I know, the only widely used, battle-tested Raft implementation is github.com/hashicorp/raft, which is why so many distributed systems have been built on Go over the last few years. I don't know of any Java Raft implementation that has reached that level of maturity yet, but it seems like Confluent is trying with KRaft.
>Some NoSQL databases handle it without Zookeeper.
Most NoSQL databases now use Raft, which didn't exist when Kafka was created. Other NoSQL databases at the time were not as stable as ZooKeeper, or had silent bugs that ate data (see aphyr's Jepsen series[1], which thoroughly tested several NoSQL databases and found many of them failing; ZooKeeper was a notable exception).
> Yeah! I mean, I find a lot of linearizability errors in various databases, but this was also my very first time doing this kind of test, and it varies from system to system. Could have easily slipped through the cracks.
In summary, aphyr thought ZooKeeper was linearizable even though it doesn't provide linearizable ops.
Sure, a basic ensemble isn't that hard. But securing it is a world of pain and misery.
The ZK development API is also pretty much awful. Apache Curator (which wraps around the ZK API and implements a bunch of common "recipes") makes it less painful, but it really ought to be part of ZK proper.
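As an illustration of what those recipes buy you, here's a hedged sketch of leader election with Curator's LeaderLatch (the connection string and znode path are made up); the equivalent raw-ZK version would mean hand-rolling ephemeral sequential znodes and watch handling:

```java
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.leader.LeaderLatch;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class LeaderElectionExample {
    public static void main(String[] args) throws Exception {
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "localhost:2181", new ExponentialBackoffRetry(1000, 3));
        client.start();

        // LeaderLatch is one of Curator's pre-built recipes: it hides the
        // ephemeral sequential znodes and watches you would otherwise have
        // to manage by hand against the raw ZooKeeper API.
        try (LeaderLatch latch = new LeaderLatch(client, "/myapp/leader")) {
            latch.start();
            latch.await(); // blocks until this process becomes the leader
            System.out.println("I am the leader now");
            // ... do leader-only work; leadership is lost if the session dies ...
        }
        client.close();
    }
}
```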
I have a concern: if the Kafka brokers provide both the coordination service and the Kafka service, how do you achieve resource isolation? If some topics colocated with the coordination service have very high throughput, that could destabilize the coordination service. Could this in turn destabilize the entire cluster?
And if some brokers provide only the coordination service, what is the essential difference from running ZooKeeper? Will this cause more problems for expansion and contraction? Will it bring greater risk when users scale down brokers? I am very afraid that the coordination service could be shut down by a careless operation.
Yes, that's my second concern. If it runs in a separate JVM, what is the difference between using ZK and using the controller? Of course, if the new controller is more stable and more efficient than ZK, that will bring benefits to users. But then this is essentially replacing ZK with a better product; I think etcd could also achieve that purpose.
If you're designing a new system, is there any reason to choose Kafka over Pulsar at this point?
Apart from Confluent wanting you to use Kafka so they can keep leeching money off you by hijacking de facto ownership of an open source project, of course.
As someone who evaluated Pulsar to replace Kafka, my thoughts...
More moving parts. Brokers, and Bookies, and ZK, plus proxies etc. Plus an additional ZK for inter-cluster replication.
Immaturity - it's still early days for Pulsar, and there are still a lot of bugs being found (and then rapidly fixed, full credit to them), but yeah, it's not yet as stable. Documentation is often out of date, and I found myself having to read the code to figure out what was actually going on.
More complex workflows - there's only really one model for a developer consuming or producing against Kafka. With Pulsar, there's multiple different subscription modes, and choosing the wrong one could produce problems.
Also, the need to explicitly ack messages is something you'd always have to watch for to avoid duplicate reads. And when I was looking at Pulsar, if you used batch receive you either had to acknowledge the entire batch or none of it, so a failure during batch processing would lead to the whole batch being reprocessed; I think acking within a batch is in development, though.
No Pulsar IO S3 sink yet.
That said, there are a lot of cool things it's doing, like the built-in schema registry, far easier multi-tenancy, and offloading older data into S3 etc. transparently to consumers, so I'm definitely keeping an eye on it.
Lastly, since you're taking aim at Confluent: you realise Pulsar is largely controlled by people employed by StreamNative, yeah?
StreamNative doesn’t lock critical functionality behind enterprise agreements while still advertising the software as open source, and doesn’t openly lie about what the system capabilities are.
Not saying that they won’t turn evil at some point, but so far they’re leagues ahead of Confluent in terms of earning developer trust. At a minimum this developer, but also others that I’ve worked with.
Maybe I'm the minority opinion here, and that's fine, but Confluent has been far too shady for me to ever consider contracting with them.
We used FOSS Kafka for yonks without hitting any limitations - At one point we were looking at Confluent Replicator, but decided it was just easier to go with Mirror Maker 1 (and you know, no massive licensing fees) - and Mirror Maker 2 largely emulates Replicator in terms of functionality.
I'm aware of a few other things like the MQTT KC Connector, but that was never part of Kafka in the first instance, it's something that Confluent built for paying customers.
And I could argue that the StreamNative "critical functionality" that they lock away behind enterprise agreements is "quick bug fixes for the many bugs we're still finding", if I was feeling mean spirited.
But anyway, it seems your preference for Pulsar is due to Confluent, but they're not the only ones offering managed Kafkas - AWS, IBM, RedHat, etc. etc.
I personally think Kafka has the edge in many ways. It will soon be possible to run a single-process Kafka cluster, which will unlock a lot of applications where people previously used older systems simply because that was easier than standing up a full ZK cluster plus a Kafka cluster. The broader Kafka ecosystem has features like exactly-once support, KSQL, Kafka Connect, Cluster Linking, and excellent client support that are very valuable.
The Kafka community is huge and the velocity of development is very high. It's easy to forget now, but in the beginning, Kafka didn't even have replication. That's a good reminder that things that seem like permanent advantages of system X over Kafka (for various values of X) may very well prove to be temporary. For example, in this very thread, I see people talking about how various system X'es have the advantage over Kafka because they can run without ZK. Those discussions are almost out of date.
Finally, I work at Confluent and I think the company has always been a positive force in the open source community. I respect the Pulsar people as well, but I think they have a difficult challenge to overcome.
> The broader Kafka ecosystem has features like exactly-once support
No. No it doesn’t. It has at-least-once delivery with client-side deduplication. That’s not new, it’s what TCP does FFS. Why would you lie to people about supporting something long established at best and demonstrably impossible at worst?
> Finally, I work at Confluent....
Oh, that’s why. Never mind then. Continue selling digital snake oil.
This is one of the differences between Kafka (Confluent) and Pulsar.
Confluent makes big, bold claims ("exactly-once delivery") and has aggressive marketing.
Pulsar, on the other hand, would say they have "effectively-once". Reading the Pulsar docs vs. Kafka's, Pulsar is very modest about functionality and has no commercial marketing at all.
These days I have noticed that Confluent's blog posts do use "effectively once", but the marketing is as aggressive as ever.
Credit where credit is due: Confluent's marketing and big, bold claims are why almost everyone is using Kafka and not Pulsar, and may not even have heard of Pulsar. I do find Pulsar's architecture more interesting, but since Splunk bought them it's remained in the background like it always has, with no huge push to sell it.
> No. No it doesn’t. It has at-least-once delivery with client-side deduplication. That’s not new, it’s what TCP does FFS.
It's not just deduplication: you can atomically commit the consumption of records from one topic together with the production of the records that result from it. That is exactly the same exactly-once guarantee you get from, e.g., an SQL database in linearizable mode (a lot of SQL databases do the same thing internally: optimistically execute transactions and then re-run them in the case of a conflict).
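For the curious, the consume-transform-produce transaction looks roughly like this in the Java client; the topics, offset, and group id below are invented for illustration:

```java
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.TopicPartition;
import java.util.Map;
import java.util.Properties;

public class ExactlyOnceSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        // A stable transactional.id lets the broker fence zombie instances.
        props.put("transactional.id", "transfer-processor-1");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        producer.initTransactions();

        producer.beginTransaction();
        try {
            // ... consume from the input topic and transform ...
            producer.send(new ProducerRecord<>("output-topic", "key", "result"));
            // Commit the consumed offset inside the SAME transaction, so the
            // "I read this" and "I wrote that" marks become visible atomically
            // to read_committed consumers.
            producer.sendOffsetsToTransaction(
                    Map.of(new TopicPartition("input-topic", 0), new OffsetAndMetadata(43L)),
                    "my-consumer-group");
            producer.commitTransaction();
        } catch (Exception e) {
            producer.abortTransaction(); // partial output is never exposed
        }
        producer.close();
    }
}
```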
One advantage of Kafka is the ecosystem effect. There are many systems (Flink, Kafka Streams, Pinot, Druid, Presto, etc) that connect to Kafka. I'm not sure about the extent of Pulsar support here, although I'd love to learn more!
The problem I see with Kafka is that it was built before cloud architectures were commonly adopted (with distributed systems everywhere). Confluent has put a lot of effort into dragging Kafka's architecture into the present, but some major features are missing:
- Auto-scaling: Confluent finally introduced "elastic scaling" a few months ago but it only allows you to scale up and must be triggered by the admin (no threshold-based auto-scaling).
- Multi-tenancy: Planning for a multi-tenant Kafka cluster is not for the faint of heart. Achieving isolation tends toward liberal usage of topics, which starts to become unmanageable in the low thousands. This isn't crazy when you've got a few hundred microservices and several tenants to keep isolated.
- Decoupled brokers and storage: Any broker scaling or failure can lead to downtime while event storage is redistributed.
Confluent's Cloud service reduces operational overhead but isn't always feasible due to cost, resource limits (like service accounts or schemas for instance), data controls, etc.
There's actually no reason to choose Pulsar anymore. Pulsar has even more layers, with ZooKeeper + BookKeeper, and requires something like Kubernetes to run well. It was great five years ago for heavy users who needed better scalability and features than Kafka; however, the development has become a mess.
With the removal of ZooKeeper and with tiered storage (separating storage from compute), Kafka has caught up on scalability while being simpler to deploy. It also has a far bigger ecosystem, with more polished features like ksqlDB.
Apache Pulsar is hardly an ideal comparison in this context, considering that Pulsar requires ZooKeeper and Apache BookKeeper (which also requires ZooKeeper).
One of the benefits of the Kafka rearchitecture effort is to allow Kafka to "scale down" to run without external dependencies. Using Pulsar would add more dependencies.
They are, but I think they have some ways to go still. BookKeeper is still on ZooKeeper, though they've abstracted the API and have added support for using Etcd instead (not sure if this is production-ready).
Pulsar also requires you to run ZooKeeper and BookKeeper, so TFA has at least one reason you might choose Kafka.
(That said, unlike many I consider depending on ZooKeeper to be a positive sign. "We wrote our own consensus protocol" belongs in roughly the same bucket as "we wrote our own crypto." Using ZooKeeper doesn't automatically mean your distributed system will work but at least you'll have a fighting chance.)
> Pulsar also requires you to run ZooKeeper and BookKeeper, so TFA has at least one reason you might choose Kafka.
BookKeeper is a feature, though. It allows you to scale a partition beyond the capacity of a single storage unit: effectively unlimited retention for a partition. The problem with Kafka is that the broker is tied to storage.
Yep, it's available from Confluent if you pay for it, but just as Mirror Maker 2 is awfully similar to Confluent Replicator, I believe that Kafka will (eventually) get tiered storage under the Apache licence (I know there's a KIP for it[1]). It's a hard issue to solve, and I'm not sure how much effort in the community is being directed towards it. But bear in mind that it's not just Confluent who have a stake in Kafka: there's a bunch of big corps selling managed/supported Kafka, and all of them would probably quite like tiered storage in core Kafka as a feature to help them sell their support/management, so I have some faith in their enlightened self-interest.
The fact that MM2 happened, and Confluent didn't try to stop it, despite it being awfully similar to Replicator, makes me think that Confluent are acting in good faith.
Incidentally, I quite like how Pulsar solved tiered storage, and it's a definite tick in the Pulsar box: it's transparent from a consumer's POV. Although there's somewhat of a delay in rehydrating the offloaded block, I don't think anyone expects near-realtime performance when loading historical data.
Thanks for putting things in perspective, EdwardDiego.
> The fact that MM2 happened, and Confluent didn't try to stop it, despite it being awfully similar to Replicator, makes me think that Confluent are acting in good faith.
Let me share an anecdote related to this example. We (Confluent) were actually the ones who contributed the documentation for MirrorMaker v2 to the Apache Kafka docs (https://kafka.apache.org/documentation/#georeplication). The development lead on MM2 was (an engineer at) Cloudera, yet they never spent the time to provide user-facing documentation to the Kafka project. I don't want to speculate about reasons, yet I noticed that MM2 was documented in the Cloudera docs.
If we didn't care for the Kafka community at Confluent, we would not have spent our own resources and time to fill that gap, given that we have a proprietary product similar to MM2 (i.e., Confluent Replicator).
Shit, wait, there's documentation for Mirror Maker 2 now? I spent most of my time implementing it by reading hypothetical examples in a KIP and then diving into the actual code.
Hardly the most straightforward approach, and it was rather a gaping hole. Thanks for the background on how that hole developed.
I really appreciate Confluent putting that time into documenting something vital that could compete with your own product, and IMO that puts a nail in the coffin of the previous commenter's assertions about Confluent's alleged attempts to wall off necessary features of Kafka.
I think you are completely missing the point.
These kinds of systems need to be designed in layers.
You can host all three layers on each of your VM instances if you want, but they should not be mixed in the same process.
One layer, BookKeeper, provides an abstraction similar to HDFS: reliable append-only files that scale horizontally in size and throughput.
Pulsar is a service built on top of BookKeeper, but it could run on top of HDFS or something like Amazon S3.
It is responsible for making sure there is only one writer per BookKeeper file, even if multiple processes send Pulsar requests to write to the same partition.
It also tries to balance requests across all the brokers.
I’ve been following Kafka for 5+ years now. Confluent has done their level best to stack the Kafka PMC with their own employees. Only recently have they added other members. They should not be an Apache project due to how much gaming of ownership they are doing with the project.
Fair enough, not hijacking. Just presenting the software as open source then requiring you to pay substantially for features that it’d be irresponsible to use the software in production without (e.g. geo replication via mirror maker).
Add to that their insistence in claiming “exactly once delivery semantics” from Kafka despite that being provably impossible and I don’t see any reason to trust them as a company or pay for their software.
I've been sticking to Pulsar for all new projects and have yet to hit a single drawback. It scales better, has fewer fiddly knobs needing adjustment, has cluster management built in, and supports traditional pub/sub as well as worker-queue semantics. It even has Kafka-compatible adapters, so it's relatively easy to migrate existing systems.
Kafka played an important role in the history of distributed system design but it’s time to move to something better built and better managed IMO.
> requiring you to pay substantially for features that it’d be irresponsible to use the software in production without (e.g. geo replication via mirror maker).
MM1 and MM2 are free. You might be getting confused with Confluent Replicator.
> Add to that their insistence in claiming “exactly once delivery semantics” from Kafka despite that being provably impossible
Exactly once delivery is impossible. Exactly once processing is possible. TBH, semantically there's very little difference between those two from an end user perspective.
I think Pulsar is a much better design and further investment in Kafka is a mistake at this point.
Kafka has a lot of downsides:
1. The size of a single topic is limited to the size of one machine.
2. A complex, stateful client library that needs to know which machine is currently the master for each partition.
....
For #1: if you need message order to be maintained, you cannot use more than one partition.
For #2: 99% of the production issues we had were caused by bugs in librdkafka not talking to the right brokers.
It also means the broker IPs all need to be public, and write throughput and latency are limited by the capacity of the machine that is the master for the partition you are trying to write to.
If the machine hosting your partition becomes overloaded, you have to switch the master for that partition to another machine. Unlike with Pulsar, this is not done automatically; and if your replication factor is 3, your choices are limited to the two other replicas, unless you copy the whole partition to a new machine, which would take hours.
Yeah, that's true, using librdkafka from C# I hit a few issues where librdkafka was somewhat behind Java in terms of features, I think the one I hit was multi-topic subscriptions.
IIRC Confluent has started putting resources into it - I would hope so, given how .NET Core is going.
That said, the state of Pulsar clients outside of the official Java ones was far worse, I was looking into .NET Core ones and the "official" one (Pulsar-DotPulsar) lacked some key features, whereas a third party one, pulsar-client-dotnet, had far more features, but was still somewhat behind the Java clients.
Caveat: I looked into all of this when Pulsar was at version 2.6; it's now at 2.7.1, so my comments may well be out of date.
It's worth noting that Twitter built their own system (EventBus) that Apache Pulsar largely mimics in design (and the people who started Pulsar at Yahoo had worked on EventBus prior), with brokers decoupled from storage, and then eventually just decided to get rid of it and use Kafka.
> One catch to this is that for extremely bandwidth-heavy workloads (very high fanout-reads), EventBus theoretically might be more efficient since we can scale out the serving layer independently. However, we’ve found in practice that our fanout is not extreme enough to merit separating the serving layer, especially given the bandwidth available on modern hardware.
Some workloads are very CPU intensive and some are not. Being forced to scale CPU & disk together means one of them is going to be overprovisioned - often by a lot.
I'm pretty surprised Twitter didn't see benefit from doing this if they have multiple Kafka clusters with different use cases.
> Okay, I can see that point, but is it worth the additional latency between broker and Bookie?
It might depend on what you're ingesting and how much. Being able to independently scale ingest and storage is a good alternative to have. And it's not only ingest, it's also consumption. As it stands, having a parallel consumer over a large partition spanning several GBs also requires tons of RAM, because a segment must be loaded into memory. In that sense, reprocessing historical data is pretty difficult. There's a lot of complexity hidden in additional Druid or HDFS installations, or shoe-horned object storage with its own indexing, to support semi-fast access to historical data.
> having a parallel consumer over a large partition spanning several GBs also requires tons of RAM because a segment must be loaded into memory.
I don't know too much about Kafka's internals, but that's not my experience of reading several terabytes of data from a Kafka topic. Memory didn't blow out, although we did burn through IOPS credits.
Good remark. That holds when there's one consumer, or multiple consumers hanging on to roughly the same offset (using the same memory-mapped segment). Having many consumers at widely different offsets will cause many segments to sit in RAM.
Edit: Kafka apparently does not store a complete log segment in memory, only parts, but having many consumers may lead to a lot of churn or a lot of memory consumed. Maybe this is getting better.
Latency is actually reduced compared to Kafka, because with 3 replicas you just need to wait for a response from the fastest of the 3, instead of waiting for a response from the slowest (the master).
The benefit is not just independent scaling of resources, but more useful features like archiving and reading from object storage with infinite history, and faster and more reliable data replication across regions.
IIRC, ZK would be more modern and cloud-friendly if it could self-assemble with a preshared passphrase alone. It's good technology otherwise, it's just a PITA to deploy, configure, and support.
I really love a lot of the software under the Hadoop umbrella, but so much of it assumes a static deployment on bare metal hosts, it's a struggle to use it in "modern" setups (HBase, for example; I miss my old friend).
Hadoop launched in 2006, the same year as AWS' cloud portfolio. HBase showed up in 2008.
Many of the hiccups with running Hadoop and friends in containers or on cloud VMs boil down to how hostnames are resolved and advertised, not any significant design issue.
It seems weird to me that cloud providers do not offer distributed coordination primitives "as a service." I understand there are KV stores but not with watches, locks, etc. in the way that etcd and ZK have them.
Azure's Blob service supports the first one. It's strongly consistent and has a lease/lock API. I've seen leader election and coordination built on top of it.
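Something like the following rough sketch against the v12 Java SDK (the container/blob names and lease duration are made up, and the exact calls should be treated as assumptions): only one process can hold the lease at a time, so whoever acquires it is the leader until the lease expires or is released.

```java
import com.azure.core.util.BinaryData;
import com.azure.storage.blob.BlobClient;
import com.azure.storage.blob.BlobServiceClientBuilder;
import com.azure.storage.blob.models.BlobStorageException;
import com.azure.storage.blob.specialized.BlobLeaseClient;
import com.azure.storage.blob.specialized.BlobLeaseClientBuilder;

public class BlobLeaderElection {
    public static void main(String[] args) {
        BlobClient blob = new BlobServiceClientBuilder()
                .connectionString(System.getenv("AZURE_STORAGE_CONNECTION_STRING"))
                .buildClient()
                .getBlobContainerClient("coordination")
                .getBlobClient("leader-lock");

        if (!blob.exists()) {
            blob.upload(BinaryData.fromString("lock")); // the blob is only a lock object
        }

        BlobLeaseClient lease = new BlobLeaseClientBuilder().blobClient(blob).buildClient();
        try {
            // Leases are exclusive: a second acquirer gets 409 Conflict.
            String leaseId = lease.acquireLease(30); // seconds; renew before expiry
            System.out.println("I am the leader, lease id: " + leaseId);
            // ... do leader-only work, calling lease.renewLease() periodically ...
            lease.releaseLease();
        } catch (BlobStorageException e) {
            System.out.println("Someone else holds the lease (HTTP " + e.getStatusCode() + ")");
        }
    }
}
```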
The dynamic reconfiguration API they added is also terrible, IMO. The only thing that made ZK tolerable for me was Exhibitor, but that appears to be long dead.
[1] https://jepsen.io/analyses