Kafka without ZooKeeper

dig1 · on March 31, 2021

Don't forget that ZooKeeper was the only service that passed all Jepsen [1] tests out of the box.

didip · on March 31, 2021

Reminded me of this project: https://github.com/travisjeffery/jocko

Kafka implemented in Go without needing Zookeeper.

travisjeffery · on March 31, 2021

Hey, I'm the author of Jocko. I've been working at Confluent the past four years.

I just finished writing a book that shows how to build similar distributed services from scratch, it walks though building a simple distributed commit log with built-in consensus and service discovery from nothing to deployment: https://pragprog.com/titles/tjgo/distributed-services-with-g...

kif · on March 31, 2021

Sounds just like the kind of book I wanna read. Unfortunately I wasn't able to pay for it due to "merchant configuration".

I imagine there are no region restrictions, right? I've already contacted support though.

travisjeffery · on April 1, 2021

I'm not sure, I haven't heard of that happening before. I've asked my editor if she knows what's up. You can email me at tj at my HN username dot com if you want to follow up.

kif · on April 1, 2021

I was able to get it sorted through support. Thank you :)

chokolad · on March 31, 2021

Heh, I just bought it based on PragProg newsletter :). Literally 5 minutes ago.

travisjeffery · on March 31, 2021

Awesome! Hope you enjoy reading the book and thanks for buying it.

thibauts · on March 31, 2021

I’m trying to take a similar path with this project [1] though there’s no HA story for now and it’s not meant to be protocol-compatible.

The goal is to make it lightweight and simple to operate, yet very fast.

[1] https://github.com/dataptive/styx

agallego · on March 31, 2021

From a feature list perspective on the readme it basically is what we did at Redpanda (dev here)

thibauts · on March 31, 2021

Hi Alexander ! I'm a big fan of the work you do at vectorizedio and of your blog posts ! I strongly believe there's a need for much simpler event streaming and room for improvement performance-wise.

I took a different path by divorcing from the Kafka protocol and experimenting with what I believe is a simpler to model approach to reliable event processing [1].

[1] https://github.com/dataptive/styx/blob/master/docs/howto/rel...

agallego · on March 31, 2021

that makes total sense. I agree. though in the future, the kafka proto will be impl details. i.e.: once you start moving to http proxy with typed schemas.

bruth · on March 31, 2021

The author now works at Confluent.

cmckn · on March 30, 2021

Oh hell yeah! That's great news, tons of work went into this -- props to the contributors!

I will take the opportunity to say that Kafka is kind of painful, with or without ZK. Check out NATS! [0]. It doesn't solve all the same problems, but is so much easier to use (during development especially) and can do a lot of the same things.

[0]: https://docs.nats.io/whats_new_20

vorpalhex · on March 30, 2021

NATS isn't actually using a log structure though, it's a streaming message broker with a different set of consistency/delivery promises.

cmckn · on March 30, 2021

Correct; but I've seen many uses of Kafka that NATS could totally be used for. For example, load balancing across subscribers (use a NATS queue instead of a Kafka consumer group).

NATS doesn't ever store messages persistently; but this might be fine for your application, and then you don't have to worry about setting 5 different config options to make sure Kafka actually frees up disk space like you expect it to ;)

NATS also enables some unique patterns like request/reply via a "reply to" message header.

Anyway, it's been a joy to use!

KptMarchewa · on March 31, 2021

Sounds more like rabbitmq replacement than kafka

shaklee3 · on March 31, 2021

> NATS doesn't ever store messages persistently

Not true. Both Nats streaming and the upcoming jetstream (core nats) do.

tsimionescu · on March 31, 2021

It depends what you mean by 'persistently'. Normally NATS streaming will delete the messages after they have been delivered to all subscribers successsfully and some expiration time has passed.

shaklee3 · on March 31, 2021

The expiration time can be unlimited, just like kafka

cmckn · on March 31, 2021

I'm aware, I was referring to the core binary, `nats-server`. NATS streaming server seems to be still receiving attention, but the client library (for Java) hasn't been committed to since 2019, so I'm not sure I'd build a new project with it. JetStream is out (as of this week, I believe) and is an optional module to `nats-server`, as you said.

To the folks at Synadia -- I love NATS, but the naming and organization of these projects could use some work. What's with the `stan.*` repository names? Where did "jetstream" come from? Why is it baked into `nats-server` but `nats-streaming-server` isn't? Is `nats-streaming-server` on the back burner?

shaklee3 · on March 31, 2021

Stan is nats streaming. The clients don't have to be updated since they're forward compatible since jetstream. NATS streaming will be deprecated after jetstream is GA is my understanding. Is there a bug in the library you found?

tamale · on March 31, 2021

you also don't have to worry about those kinds of configuration gotchas if you use confluent cloud!

jimmyed · on March 31, 2021

Only you have to be a billionaire to use the confluent cloud.

incognito_limey · on March 31, 2021

So I guess pay as you go is too expensive for your org? Be curious to understand why you feel it's too expensive when it completely extrapolates any kafka management.

stalller · on March 31, 2021

relative to more mature Kafka cloud services, it's crazy expensive.

yes_but_no · on March 30, 2021

I believe NATS Streaming[0] and upcoming Nats Jetstream[1] could be more relevant to those who looking for a Kafka alternative. They offer persistent messages, at-least-once delivery similar to Kafka.

[0] https://docs.nats.io/nats-streaming-concepts/intro [1] https://github.com/nats-io/jetstream

syscomet · on March 30, 2021

[nb: I work for the main company behind NATS]

Jetstream is GA with the 2.2.0 release. Folks who believe in waiting for "not .0" won't have to wait too much longer.

simplecto · on March 30, 2021

Using Kafka only for ordered pub-sub is like ordering tap water at French Laundry.

Groxx · on March 30, 2021

In that if you're already there and just want water, it's totally reasonable. It'll get you wet.

If you went there for tap water, yeah, maybe there are better options.

foobiekr · on March 30, 2021

NATS is not a replacement for any use of kafka that I think is actually a good fit.

SEJeff · on March 31, 2021

Have you seen liftbridge, which was built ontop of nats to fit some of the more traditional kafka use cases?

https://liftbridge.io/

taywrobel · on March 30, 2021

NATS by itself doesn’t handle the persistence side at all (by design). It’s at most once delivery, not at least once.

That being said, have you checked out NATS Streaming Server? It’s effectively a first party client for NATS that gives it at least once semantics and persistence, and makes it much more applicable to use cases that are currently on Kafka.

Docs here if you’re curious - https://docs.nats.io/nats-streaming-concepts/intro

bruth · on March 30, 2021

There was a more recent 2.2 release with durable streams [0]. There is also a comparison page including Kafka [1].

[0]: https://docs.nats.io/whats_new_22 [1]: https://docs.nats.io/compare-nats

gantengx · on March 31, 2021

Is there any message ordering guarantee in NATS? With Kafka you can achieve this by using keyed message and messages in the same partition will always be ordered

Disclosure: I work for Confluent

cmckn · on March 31, 2021

Not the same guarantees, no:

> messages from a given single publisher will be delivered to all eligible subscribers in the order in which they were originally published. There are no guarantees of message delivery order amongst multiple publishers.

https://docs.nats.io/faq#does-nats-offer-any-guarantee-of-me...

nuharaf · on March 31, 2021

I believe with jetstream, message in stream is ordered as they are written. Jetstream have a concept of consumer, (in the broker itself, not client), which can consume a subset of the stream, filtered by message subject.

staticassertion · on March 31, 2021

I've never really understood the appeal of ordered messages. You end up splitting your data across partitions anyways for parallelism, so who cares? What systems out there require strictly ordered data? It seems like any design that requires something like that is going to be extremely brittle.

mrkeen · on March 31, 2021

> You end up splitting your data across partitions anyways

Messages are ordered within partitions.

> What systems out there require strictly ordered data? It seems like any design that requires something like that is going to be extremely brittle.

TCP/IP ?

staticassertion · on March 31, 2021

> Messages are ordered within partitions.

Right, but that means you're still "unordered" across those partitions?

> TCP/IP ?

But TCP/IP isn't delivered in order, it rearranges the unordered packages by their ID. I guess ordered delivery would be nice for that, but I just feel like making your protocol not require ordering is far simpler.

Not to mention that both TCP and Kafka have to handle head of line blocking?

I'm not trying to say that ordering is bad or anything, I just feel like it isn't buying me tons.

lmm · on April 1, 2021

> Right, but that means you're still "unordered" across those partitions?

Right, so related messages have an ordering guarantee but unrelated messages may be processed out of order relative to each other, which is usually what you want. (Of course you do have to set the record key correctly).

> I'm not trying to say that ordering is bad or anything, I just feel like it isn't buying me tons.

It's a lot more lightweight than full ACID, but if you get your dataflow right it achieves everything that a traditional database does. Without ordering you wouldn't be able to do anything that requires any kind of consistency.

staticassertion · on April 1, 2021

Hm, ok, yeah. So I guess I can see what you mean. I've never had a use case where I felt comfortable relying on any kind of message ordering, and always rely on my application level logic to handle that, or ensure the system is resilient despite ordering (ie: commutative operations only).

To me, it seemed at odds with the parallelism of a partition, but I suppose in this case you'd be partitioning on some sort of semantic key vs, say, a hash.

Thanks for bearing with me on that, this was just an unfamiliar idea for me.

mrkeen · on April 1, 2021

> But TCP/IP isn't delivered in order.. I guess ordered delivery would be nice

> both TCP and Kafka have to handle head of line blocking

Well which is it?

(If TCP doesn't give you ordered delivery, why would a head block the rest of the line?)

C4stor · on March 31, 2021

You can split with a business case in mind.

Maybe if you're an e-business, you'll split everything happening on your website by client id, but still want events belonging to a single client to be received in order, for practicality.

geodel · on March 31, 2021

Exactly. We ran into same issue with Kafka. If one needs ordered messages for some reason, Kafka is pretty much useless for this.

manigandham · on March 31, 2021

NATS is just ephemeral at-most-once pub/sub. It needs NATS Streaming or the new Jetstream for at-least-once persisted data and still has different semantics.

Apache Pulsar offers the same distributed log offering with a fundamentally better architecture, but Kafka has closed most of the gaps now and has far more integrations and a bigger ecosystem.

andrewmutz · on March 31, 2021

Does NATS support (or plan to support) log compaction? Without that, it's very hard to replace Kafka.

cmckn · on March 31, 2021

I"m not sure I understand your question; NATS streaming server (built on top of NATS) supports persistence to disk, a raft group, a SQL table, etc. and it appears the various storage implementations have mechanisms [0] to delete or compress old data.

That being said, I don't think this is what differentiates the two systems, the guarantees they do/don't make are likely what will make the decision for your project.

[0]: https://github.com/nats-io/nats-streaming-server/blob/master...

shaklee3 · on March 31, 2021

Yes, jetstream allows setting max messages to 1 in a stream.

SEJeff · on March 31, 2021

And if you want closer kafka semantics built on top of nats, check out liftbridge:

https://liftbridge.io/

haolez · on March 30, 2021

Kafka is a pretty cool technology, but for every project that I work on, it's never used because it feels like it's overkill (costly and operation heavy). Maybe I should start looking for bigger projects :D

colin_mccabe · on March 31, 2021

Part of the reason we are removing Kafka's ZooKeeper dependency is to get rid of that "heaviness."

Going forward, you will no longer need to configure and run a separate ZooKeeper service just to run Kafka. For proof-of-concept projects, a single-process Docker image will be available when running in KRaft mode (non-ZK mode).

For bigger projects, you may want to use a managed cloud service. Or if you do choose to manage it yourself, it will be easier running one service than two.

Disclosure: I work for Confluent.

sumtechguy · on March 31, 2021

Oh it most certainly simplifies things. I am looking at half the number of boxes needed to run. Which is not insignificant in my cost structure.

What is the migration strategy here? Is it doc'd up yet? I am having flashbacks to migration for follower partitions recently which required a decent amount of pre planning of partition layout.

Also as it is pulling in the duties of ZK into kafka what sort of CPU/memory changes are you seeing? Is it 'meh' or all the way to 'you may want to add a couple of CPUs and a few more GB'? Also is it working ok with the stretched cluster?

Also if you want to hit an interesting market you may want to look at 'does it run OK on a raspberry PI'.

jwandborg · on March 31, 2021

A nit regarding the disclosure: I prefer it at the top of the message, and I think that's "best practice", but I don't know for sure.

pdimitar · on March 31, 2021

Your clarification made me wonder:

Is the single process deployment only doable via a container? Or will we actually have OS native process as well?

colin_mccabe · on March 31, 2021

Yes, you can run a single OS native process in KRaft mode, without using Docker. Docker just avoids the need to install a JVM, but it is not required.

math · on March 30, 2021

You can tune Kafka down fairly well if you know what you're doing, but it's not optimised for that OOTB. Or just use Confluent Cloud, which is fully managed and scales down as low as you want (costs cents per Gb). Disclosure: work for Confluent.

lornajane · on March 31, 2021

This is great advice IMO, let someone else manage your Kafka at scale. I feel compelled to mention that other Apache Kafka managed services are available, but agree that it makes sense to offload the complexity if possible! Disclosure: work at Aiven, who offer managed Apache Kakfa on whatever cloud you are using.

unixhero · on March 30, 2021

Thank you for disclosing and not disclaiming.

alex_anglin · on March 31, 2021

Why would someone choose Confluent Cloud over the Kafka offerings of Azure/AWS/GCP?

miguno · on March 31, 2021

Confluent Cloud is a truly 'fully managed' service, with a serverless-like experience for Kafka. For example, you have zero infra to deploy, upgrade, or manage. The Kafka service scales in and out automatically during live operations, you have infinite storage if you want to (via transparent tiered storage), etc. As the user, you just create topics and then read/write your data. Similar to a service like AWS S3, pricing is pay-as-you-go, including the ability to scale to zero.

Kafka cloud offerings like AWS MSK are quite different, as you still have to do much of the Kafka management yourself. It's not a fully managed service. This is also reflected in the pricing model, as you pay per instance-hours (= infra), not by usage (= data). Compare to AWS S3—you don't pay for instance-hours of S3 storage servers here, nor do you have to upgrade or scale in/out your S3 servers (you don't even see 'servers' as an S3 user, just like you don't see Kafka brokers as a Confluent Cloud user).

Secondly, Confluent is available on all three major clouds: AWS, GCP, and Azure. And we also support streaming data across clouds with 'cluster linking'. The other Kafka offerings are "their cloud only".

Thirdly, Confluent includes many additional components of the Kafka ecosystem as (again) fully managed services. This includes e.g. managed connectors, managed schema registry, and managed ksqlDB.

There's a more detailed list at https://www.confluent.io/confluent-cloud/ if you are interested. I am somewhat afraid this comment is coming across as too much marketing already. ;-)

Disclaimer: I work at Confluent.

d_t_w · on March 31, 2021

Confluent Cloud has some nice point-and-click UI for creating associated Kafka resources like Schema Registries and Connect Clusters.

My preference is MSK but I'm very comfortable with vanilla Kafka in AWS at a good price with auto-updates.

tamale · on March 31, 2021

One nice thing about confluent cloud vs MSK is the minimum cost of a confluent cloud cluster is far, far cheaper than the minimal cost of an MSK cluster

dividedbyzero · on March 31, 2021

Is there a GCP offering that isn't just Confluent Cloud billed via Google?

jganetsk · on March 31, 2021

You can use Pub/Sub Lite: https://cloud.google.com/pubsub/lite/docs

With a Kafka compatibility shim: https://github.com/googleapis/java-pubsublite-kafka

Disclaimer: I work on GCP.

lornajane · on March 31, 2021

You can get managed Kafka on Aiven (disclaimer: I work there) on GCP, either through the marketplace or directly through Aiven.

dmlittle · on March 30, 2021

Haven't used it personally myself but I've heard it enough to remember it. Redpanda[1] aims to be a Kafka replacement without having to worry about Zookeeper or the JVM

[1] https://vectorized.io/

timdorr · on March 30, 2021

https://vectorized.io/redpanda/ is a more useful link, since the main domain appears to have some JS errors right now.

agallego · on March 31, 2021

oh odd. what setup to repro the js errors. i'll fix.

mrweasel · on March 31, 2021

For people who just need a queue, Kafka is a bit like using Kubernetes to run a single Docker container.

We run a number of Kafka clusters, most are relatively low trafic, and the management overhead is pretty. Earlier version did require a bit more attention, but mostly it’s pretty simple to deal with.

cfontes · on March 31, 2021

This is huge news.

Kafka is awesome, but using it in local envs is a pain in the ass, if this is never becomes PROD ready it is already an immense achievement to be able to run Kafka locally with less complexity and overhead.

toomanybits · on March 30, 2021

I think that's one of the main points. Now you can run it as a single process more like a traditional broker (although it's obviously still a log).

keithnz · on March 30, 2021

yeah, really needs a use case that justifies it, I have a particular IoT backend where I made it pluggable between kafka and rabbitmq, ended up just using rabbitmq as it is simpler to work with / manage, and still not really pushing it in terms of performance with thousands of devices.

cmason · on March 30, 2021

What do you use instead?

haolez · on March 30, 2021

Cheap managed cloud services, like AWS SQS and Azure Storage Queue (I usually want some kind of persistence for my queues).

jganetsk · on March 31, 2021

Pub/Sub Lite is a cheap managed cloud service on GCP: https://cloud.google.com/pubsub/lite/docs

With a Kafka compatibility shim: https://github.com/googleapis/java-pubsublite-kafka

Disclaimer: I work for GCP.

gwenshap · on March 31, 2021

Confluent Cloud Basic/Standard is a cheap managed Kafka. If the objection is to the deployment and not Kafka clients.

KptMarchewa · on March 31, 2021

We might have different definitions of cheap.

menzella-g · on March 31, 2021

GCP offers Pub/Sub Lite, an inexpensive messaging product with Kafka-compatible client libraries.

https://cloud.google.com/pubsub/lite/docs https://github.com/googleapis/java-pubsublite-kafka

Disclaimer: I work on this product.

NicoJuicy · on March 30, 2021

unixhero · on March 30, 2021

What kind if projects so you work om? I am genuinely curious about use cases. Feel free to obfuscate.

NicoJuicy · on March 31, 2021

E-commerce

In the process of splitting up everything in modules.

( Microservices would be Overkill)

gigatexal · on March 31, 2021

All this talk about NATs as an alternative to Kafka with no mention of Redpanda’s Zookeeper-less alternative (written in modern C++ - https://vectorized.io/)

sgt · on March 31, 2021

What's the downside?

mdaniel · on March 31, 2021

The BSL license makes choosing it "a discussion" versus using OSI licensed software: https://github.com/vectorizedio/redpanda/tree/v21.3.7/licens...

I don't have enough experience with these new vanity licenses to know what the contribution story looks like, either

gigatexal · on March 31, 2021

What about the BSL makes it a deal breaker? You can host it yourself you just can’t compete with their hosted offering to other customers. I think that’s fair: they get to monetize what they make. And then after 3 years iirc all the code that was BSL or that version anyway goes even more permissive.

mdaniel · on April 1, 2021

I avoided language such as "deal breaker" or "problematic" and chose rather to just say "it would require discussion" in ways that I have not once ever encountered with an Apache licensed project.

gigatexal · on March 31, 2021

Doesn’t seem to be one. They took the Kafka API so it’s a drop in replacement and redid the implementation from the ground up. Your mileage may vary though.

sgt · on March 31, 2021

I am just worries that things like 3rd parties integrating with your Kafka broker will stop working or experience issues. Also, I wonder if SASL_SSL is implemented exactly the same way etc.

agallego · on March 31, 2021

hi @sgt - yup SASL_SSL due to the client protos have to be compatible, including the hack you are thinking of.

from a user perspective, existing SASL + SSL + SCRAM will be released next wednesday - so no code changes.

doliveira · on March 30, 2021

Finally. I assume there must be good reasons beyond "that's what Hadoop has always used" but philosophically, I never understood why introduce yet another network dependency to handle elections. It really adds up to the operational complexity, from having to manage the Zookeeper cluster to having to fight against DNS.

nemothekid · on March 31, 2021

Before Raft (2013), if you wanted reliable, consistent distributed metadata store you had to implement Paxos which is notoriously difficult to get right. Every service that needed some type of leader election or highly consistent store let Zookeeper deal with that problem (Mesos, Spark, Druid, Storm, and a ton others).

After Raft, it became easier to just implement that layer yourself and so most projects after Raft (or probably more accurately once people started seeing how stable etcd was, ~2014), just used Raft internally where they would have previously used zookeeper.

dikei · on March 31, 2021

To be fair, many project's Raft implementations contained errors that can and had lead to data lost, so it's not all sunshine and roses.

IMHO, it's still easier to delegate the consensus problem to a third party service like Zookeeper or ETCD.

nvarsj · on March 31, 2021

Raft is a large improvement over Paxos for practical implementations. But it's still tricky to get right. As far as I know, the only widely used, battle tested Raft implementation is github.com/hashicorp/raft. Which is why so many distributed systems are being built on golang over the last few years. I don't know if there is any Java raft implementation which has reached that level of maturity yet - but it seems like Confluence is trying with kraft.

mavelikara · on March 31, 2021

What are the notable projects that implement Raft internally for leader election?

Also, do any of those projects publish their Raft implementation as a library for other projects to include?

beberlei · on March 31, 2021

Anything from hashicorp, vault, consul, nomad for example. yes there is a go library for the basic raft setup afaik

anonymousDan · on March 31, 2021

Because historically implementing something like Zookeeper yourself from scratch is notoriously difficult?

doliveira · on March 31, 2021

I guess what I wonder is why they didn't go with an embedded library or something of sorts. Some NoSQL databases handle it without Zookeeper.

nemothekid · on March 31, 2021

>Some NoSQL databases handle it without Zookeeper.

Most NoSQL databases, now, use Raft, which didn't exist at the time when Kafka was created. Other NoSQL databases, at the time, were not as stable as Zookeeper or had silent bugs that ate data (see aphyr's Jepsen series[1], which thourghly tested several NoSQL databases and found many to be failing, except for Zookeeper).

[1] https://aphyr.com/tags/jepsen

tammerk · on March 31, 2021

https://github.com/jepsen-io/jepsen/issues/399

> Yeah! I mean, I find a lot of linearizability errors in various databases, but this was also my very first time doing this kind of test, and it varies from system to system. Could have easily slipped through the cracks.

In summary, aphyr thought Zookeeper is linearizable even though it doesn't provide linearizable ops.

Looks like Zookeeper needs to be tested again.

dbt00 · on March 30, 2021

I wasn't there when they made the call, but "we know it works" seems like it was the key element here.

autophagian · on March 30, 2021

One immediate downside to this: it is now likely that fewer and fewer people will be exposed to the absolutely excellent Zookeeper logo.

jaytaylor · on March 30, 2021

See:

https://en.wikipedia.org/wiki/Apache_ZooKeeper

https://upload.wikimedia.org/wikipedia/en/thumb/8/81/Apache_...

AdmiralAsshat · on March 30, 2021

Those are some impressively sized hands.

temuze · on March 30, 2021

This is the ideal open source logo. You may not like it, but this is what peak performance looks like.

ignoramous · on March 30, 2021

https://m.huffpost.com/us/entry/us_55c36ebce4b0923c12bbb2cd

koolba · on March 30, 2021

Pretty sure he has gout too.

jakebasile · on March 30, 2021

Every once in a while I learn of something on HN that really changes my life. This is one of those times. Thank you.

spicybright · on March 30, 2021

I did a google image search and wish I didn't.

lotyrin · on March 30, 2021

Still better than the sudo logo.

0x008 · on March 30, 2021

Wait.. sudo has a logo? Wow, that logo is ... special.

dfinninger · on March 30, 2021

https://www.sudo.ws

Wow indeed... I assume that’s a play on “sudo make me a sandwich”?

https://xkcd.com/149/

timdorr · on March 30, 2021

Is there some backstory on this? It looks like it's modelled after a real person. Perhaps one of its creators?

_wqdq · on March 30, 2021

My, what big hands you have.

tofflos · on March 31, 2021

Is there a compiled package for 2.8.0-rc0 available? I only see sources on Github https://github.com/apache/kafka/releases/tag/2.8.0-rc0 and couldn't find anything regarding 2.8.0-rc0 on the Kafka downloads page https://kafka.apache.org/downloads.

ewencp · on March 31, 2021

The downloads page is only for official releases, RCs are not distributed the same way and are not permanent unless promoted to an official release.

See the mailing list thread for 2.8.0-RC0 for where to find the bits if you want to test https://lists.apache.org/thread.html/r16894a11aec73abac521ff... and the project site has some "contact" info for mailing lists where these things are announced and advertised (including releases, Kafka Improvement Proposals, and more) https://kafka.apache.org/contact

dikei · on March 31, 2021

I don't understand the hate for Zookeeper.

Setting up a Zookeeper ensemble is not that hard, they're light on resources and and basically zero maintenance.

elric · on March 31, 2021

Sure, a basic ensemble isn't that hard. But securing it is a world of pain and misery.

The ZK development API is also pretty much awful. Apache Curator (which wraps around the ZK API and implements a bunch of common "recipes") makes it less painful, but it really ought to be part of ZK proper.

accuNum · on March 31, 2021

God forbid you hit a bug in ZK too, as there are 5 people in the world who properly understand that system - there is nobody to help.

z9e · on March 31, 2021

Agreed. I’ve been running Kafka clusters at a few companies now, one that was at a massive Fortune 500 and Zookeeper was the least of my problems.

I’m still glad to see it go away, one less operational dependency the better.

silasiba · on March 31, 2021

Cool feature!

I have a concern, if the Kafka broker provide both coordination service and Kafka service, how to achieve the resource isolation? If some of the topic which on the coordination service with very high throughput, this must cause instability of the coordination service, could this further cause the instability of the entire cluster?

If some of the broker only provide the coordination service, what is this essential difference? Will this cause more problems for expansion and contraction? Will this bring greater risks when users scale down the brokers? I am very afraid that the coordination service will be shut down due to careless operation.

mumrah · on April 1, 2021

Part of the new design is that brokers and controllers can be run on separate JVMs.

silasiba · on April 1, 2021

Yes, that my second concern. If run on separate JVM, what is the difference between use ZK and use controller? Of course, If the new controller is more stable than ZK, more efficient than ZK, this will bring benefits to users. But this is essentially replacing zk with a better product, I think etcd also can achieve this purpose.

taywrobel · on March 30, 2021

If designing a new system is there any reason to choose Kafka over Pulsar at this point?

Apart from Confluent wanting you to use Kafka so they can keep leeching money off you by hijacking de facto ownership of an open source project, of course.

EdwardDiego · on March 31, 2021

As someone who evaluated Pulsar to replace Kafka, my thoughts...

More moving parts. Brokers, and Bookies, and ZK, plus proxies etc. Plus an additional ZK for inter-cluster replication.

Immaturity - it's still early days for Pulsar, and there's still a lot of bugs being found - and then rapidly fixed, full credit to them, but yeah, not yet as stable. Documentation is often obsoleted, and I found myself having read the code to figure out what was actually going on.

More complex workflows - there's only really one model for a developer consuming or producing against Kafka. With Pulsar, there's multiple different subscription modes, and choosing the wrong one could produce problems.

Also, the need to explicit ack the messages is something you'd have to always watch for to avoid duplicated reads. Also, if using batch receive, when I was looking at Pulsar, you either had to acknowledge the entire batch, or none of it, so a failure during batch processing would lead to the batch being reprocessed, but I think acking within a batch is in development.

No Pulsar IO S3 sink yet.

That said, there's a lot of cool things it's doing, like the built-in schema registry and far easier multitenancy, and offloading older data into S3 etc. transparently to the consumers, so I'm definitely I'm keeping an eye on it.

Lastly, you're taking aim at Confluent, you realise Pulsar is largely controlled by people employed by StreamNative, yeah?

taywrobel · on March 31, 2021

StreamNative doesn’t lock critical functionality behind enterprise agreements while still advertising the software as open source, and doesn’t openly lie about what the system capabilities are.

Not saying that they won’t turn evil at some point, but so far they’re leagues ahead of Confluent in terms of earning developer trust. At a minimum this developer, but also others that I’ve worked with.

Maybe I’m the minority opinion here and that’s fine, but confluent has been far too shady for me to ever consider contracting with them.

EdwardDiego · on March 31, 2021

What critical functionality are you thinking of?

We used FOSS Kafka for yonks without hitting any limitations - At one point we were looking at Confluent Replicator, but decided it was just easier to go with Mirror Maker 1 (and you know, no massive licensing fees) - and Mirror Maker 2 largely emulates Replicator in terms of functionality.

I'm aware of a few other things like the MQTT KC Connector, but that was never part of Kafka in the first instance, it's something that Confluent built for paying customers.

And I could argue that the StreamNative "critical functionality" that they lock away behind enterprise agreements is "quick bug fixes for the many bugs we're still finding", if I was feeling mean spirited.

But anyway, it seems your preference for Pulsar is due to Confluent, but they're not the only ones offering managed Kafkas - AWS, IBM, RedHat, etc. etc.

colin_mccabe · on March 31, 2021

I personally think Kafka has the edge in many ways. It will soon be possible to run a single-process Kafka cluster, which will unlock a lot of applications that previously people used an older systems for, simply because it was easier than standing up a full ZK cluster + Kafka cluster. The broader Kafka ecosystem has features like exactly-once support, KSQL, Kafka Connect, Cluster Linking, and excellent client support that are very valuable.

The Kafka community is huge and the velocity of development is very high. It's easy to forget now, but in the beginning, Kafka didn't even have replication. That's a good reminder that things that seem like permanent advantages of system X over Kafka (for various values of X) may very well prove to be temporary. For example, in this very thread, I see people talking about how various system X'es have the advantage over Kafka because they can run without ZK. Those discussions are almost out of date.

Finally, I work at Confluent and I think the company has always been a positive force in the open source community. I respect the Pulsar people as well, but I think they have a difficult challenge to overcome.

Hallucinaut · on March 31, 2021

What challenges do you see Pulsar having (and potentially not overcoming)?

tamale · on March 31, 2021

Pulsar is fighting a massive up-hill battle. Kafka is everywhere (and for good reason).

taywrobel · on March 31, 2021

> The broader Kafka ecosystem has features like exactly-once support

No. No it doesn’t. It has at-least-once delivery with client-side deduplication. That’s not new, it’s what TCP does FFS. Why would you lie to people about supporting something long established at best and demonstrably impossible at worst?

> Finally, I work at Confluent....

Oh, that’s why. Never mind then. Continue selling digital snake oil.

tiew9Vii · on March 31, 2021

This is one of the differences between Kafka (Confluent) and Pulsar.

Confluent make big bold claims "Exactly once delivery" and have aggressive marketing.

Pulsar on the other hand would say we have "effectivley-once". Reading Pulsar docs vs Kafka, Pulsar are very modest about functionality and have no commercial marketing at all.

These days I have noticed Confluent in blog posts do use effectively once but marketing is as aggressive as ever.

Credit where credit is due. Confluent, the marketing and big bold claims is why almost everyone is using Kafka and not Pulsar and may not of even heard of Pulsar. I do find Pulsar architecture more interesting, since Splunk has brought them though it's remained in the background like it always has with no huge push to sell it.

lmm · on April 1, 2021

> No. No it doesn’t. It has at-least-once delivery with client-side deduplication. That’s not new, it’s what TCP does FFS.

It's not just deduplication, you can atomically commit a consumer from one topic + produce of records resulting from that. Which is exactly the same exactly-once guarantee that you get from e.g. an SQL database in linearizable mode (a lot of SQL databases will do the same thing internally - optimistically execute transactions and then re-run them in the case of a conflict).

simplecto · on March 30, 2021

This is a very uncharitable take on Confluent.

They contribute heavily to the project and offer enterprise support, professional services, and proprietary tech like Cluster linking.

Source: I am an enterprise customer of Confluent.

ovis · on March 31, 2021

One advantage of Kafka is the ecosystem effect. There are many systems (Flink, Kafka Streams, Pinot, Druid, Presto, etc) that connect to Kafka. I'm not sure about the extent of Pulsar support here, although I'd love to learn more!

dan_quixote · on March 31, 2021

Quoting myself here [1], but Kafka is still very architecturally un-sound for containerized deployments.

-------------------------------------------------------

The problem I see with kafka is that it was built before cloud architectures were commonly adopted (with distributed systems everywhere). Confluent has put a lot of effort dragging kafka's architecture to the present, but some major features are missing:

  - Auto-scaling:  Confluent finally introduced "elastic scaling" a few months ago but it only allows you to scale up and must be triggered by the admin (no threshold-based auto-scaling).

  - Multi-tenancy:  Planning for a multi-tenant kafka cluster is not for the faint of heart.  Achieving isolation tends toward liberal usage of topics of which starts to become unmanageable in the low thousands.  This isn't crazy when you've got a few hundred microservices and several tenants to keep isolated.

  - Decoupled brokers and storage:  Any broker scaling or failure can lead to downtime while event storage is redistributed.

Confluent's Cloud service reduces operational overhead but isn't always feasible due to cost, resource limits (like service accounts or schemas for instance), data controls, etc.

-------------------------------------------------------

[1] https://news.ycombinator.com/item?id=25984560

manigandham · on March 31, 2021

There's actually no reason to choose Pulsar anymore. Pulsar has even more layers with Zookeeper + Bookkeeper that requires something like Kubernetes to run well. It was great 5 years ago for heavy users who need better scalability and features than Kafka, however the development has become a mess.

With the removal of zookeeper and tiered storage (separated from compute), Kafka has caught up on scalability while being simpler to deploy. It also has a far bigger ecosystem with more polished features like ksqldb.

atombender · on March 31, 2021

Apache Pulsar is hardly an ideal comparison in this context, considering that Pulsar requires ZooKeeper and Apache BookKeeper (which also requires ZooKeeper).

One of the benefits of the Kafka rearchitecture effort is to allow Kafka to "scale down" to run without external dependencies. Using Pulsar would add more dependencies.

tuhaihe · on March 31, 2021

As I know, Pulsar community is doing the work of removing ZooKeeper, this is a helpful link: https://github.com/apache/pulsar/projects/10

atombender · on March 31, 2021

They are, but I think they have some ways to go still. BookKeeper is still on ZooKeeper, though they've abstracted the API and have added support for using Etcd instead (not sure if this is production-ready).

tveita · on March 31, 2021

Pulsar also requires you to run ZooKeeper and BookKeeper, so TFA has at least one reason you might choose Kafka.

(That said, unlike many I consider depending on ZooKeeper to be a positive sign. "We wrote our own consensus protocol" belongs in roughly the same bucket as "we wrote our own crypto." Using ZooKeeper doesn't automatically mean your distributed system will work but at least you'll have a fighting chance.)

rad_gruchalski · on March 31, 2021

> Pulsar also requires you to run ZooKeeper and BookKeeper, so TFA has at least one reason you might choose Kafka.

BookKeeper is a feature though. Allows to scale the partition beyond the capacity of a storage unit. Effectively unlimited retention for a partition. The problem with Kafka is that the broker is tied to storage.

EdwardDiego · on March 31, 2021

It can be a feature, yep, although Pulsar's tiered storage (offload old data into S3 and then retrieve it if required) is far more of a feature IMO.

And BK isn't a feature without cost, it's documentation is... somewhat sparse, and there is significant complexity to maintaining it.

rad_gruchalski · on March 31, 2021

> offload old data into S3 and then retrieve it if required

I remember asking for it 5 years ago: https://radek-gruchalski.medium.com/the-case-for-kafka-cold-.... Confluent turned it into a paid feature.

EdwardDiego · on March 31, 2021

Yep, it's available from Confluent if you pay for it, but like how Mirror Maker 2 is awfully similar to Confluent Replicator, I believe that Kafka will (eventually) get tiered storage under the Apache licence (I know there's a KIP for it[1]). It's a hard issue to solve, and not sure how much effort in the community is being directed towards it. But bear in mind that it's not just Confluent who have a stake in Kafka - there's a bunch of big corps selling managed/supported Kafka and all of them would probably quite like tiered storage in core Kafka as a feature to help them sell their support/management, so I have some faith in their enlightened self-interest.

The fact that MM2 happened, and Confluent didn't try to stop it, despite it being awfully similar to Replicator, makes me think that Confluent are acting in good faith.

Incidentally, I quite like how Pulsar solved tiered storage, and it's a definite tick in the Pulsar box - it's transparent from a consumer's POV, although there somewhat of a delay in rehydrating the offloaded block, I don't think anyone's expecting near-realtime performance when loading historical data.

[1]: https://cwiki.apache.org/confluence/display/KAFKA/KIP-405%3A...

miguno · on March 31, 2021

Thanks for putting things in perspective, EdwardDiego.

> The fact that MM2 happened, and Confluent didn't try to stop it, despite it being awfully similar to Replicator, makes me think that Confluent are acting in good faith.

Let me share an anecdote related to this example. We (Confluent) were actually the ones who contributed the documentation for MirrorMaker v2 to the Apache Kafka docs (https://kafka.apache.org/documentation/#georeplication). The development lead on MM2 was (an engineer at) Cloudera, yet they never spent the time to provide user-facing documentation to the Kafka project. I don't want to speculate about reasons, yet I noticed that MM2 was documented in the Cloudera docs.

If we didn't care for the Kafka community at Confluent, we would not have spent our own resources and time to fill that gap, given that we have a proprietary product similar to MM2 (i.e., Confluent Replicator).

https://github.com/apache/kafka/pull/9983

EdwardDiego · on March 31, 2021

Shit, wait, there's documentaton for Mirror Maker 2 now? I spent most of my time implementing it by reading hypothetical examples in a KIP, and then diving into the actual code.

Hardly the most straightforward, and it was rather a gaping hole. Thanks for the background on how that hole developed.

I really appreciate Confluent putting that time into documenting something vital, that could compete with your own product, and IMO that does put a nail in the previous commenter's assertions about Confluent's alleged attempts to wall off necessary features of Kafka.

skyde · on March 31, 2021

I think you are completely missing the point. Those kind of system need to be designed in layer. You can host all 3 layer on each of your VM instance if you want but they should not be mixed in the same process.

One layer BookKeeper provide an abstraction similar to HDFS. That is it provide file that are horizontally scalable in size and throughput and reliable append only files.

Pulsar is a service built on top of BookKeeper but could run on top of HDFS or something like Amazon S3 ...

And is only responsible for making sure there is only one writer per BookKeeper file even if multiple process try sending request to Pulsar to write to the same partition. It also try to balance request across all the brokers.

plank_time · on March 31, 2021

I’ve been following Kafka for 5+ years now. Confluent has done their level best to stack the Kafka PMC with their own employees. Only recently have they added other members. They should not be an Apache project due to how much gaming of ownership they are doing with the project.

tyingq · on March 30, 2021

Hijacking seems like a strong word since Jay Kreps is the CEO.

taywrobel · on March 30, 2021

Fair enough, not hijacking. Just presenting the software as open source then requiring you to pay substantially for features that it’d be irresponsible to use the software in production without (e.g. geo replication via mirror maker).

Add to that their insistence in claiming “exactly once delivery semantics” from Kafka despite that being provably impossible and I don’t see any reason to trust them as a company or pay for their software.

I’ve been sticking to pulsar for all new projects and have yet to hit a single drawback. It scales better, has less fiddly knobs needing adjusting, has cluster management already built in, and supports traditional pub/suv as well as worker queue semantics. It even has Kafka compatible adapters so it’s relatively easy to migrate existing systems.

Kafka played an important role in the history of distributed system design but it’s time to move to something better built and better managed IMO.

EdwardDiego · on March 31, 2021

> requiring you to pay substantially for features that it’d be irresponsible to use the software in production without (e.g. geo replication via mirror maker).

MM1 and MM2 are free. You might be getting confused with Confluent Replicator.

> Kafka compatible adapters

For very limited subsets of the Kafka APIs.

staticassertion · on March 31, 2021

> Add to that their insistence in claiming “exactly once delivery semantics” from Kafka despite that being provably impossible

Exactly once delivery is impossible. Exactly once processing is possible. TBH, semantically there's very little difference between those two from an end user perspective.

skyde · on March 30, 2021

I think Pulsar is a much better design and further investment in Kafka is a mistake at this point. Kafka have a lot of downside

1- size for single topic limited to the size of one machine 2- complex stateful client library that need to know which machine is currently the master for each partition. ....

ovis · on March 31, 2021

1. That's untrue. Partitions are limited to what a machine may handle, but topics may be scaled across many ordered partitions.

2. This is generally handled by the client library transparently. Have you ever needed to manage this state manually?

skyde · on March 31, 2021

for #1 if you need message order to be maintained you cannot use more than 1 partition. for #2 99% of the production issue we had where caused by bugs in librdkafka not talking to the right brokers. Also it mean that the IP for the broker need to all be public IP and the writing throughput and latency is limited by the capacity of the machine that is the master for the partition you are trying to write. If the machine hosting you partition become overloaded you have to switch the master for that partition to another machine unlike pulsar this is not done automatically and also if you replication factor is 3 your choice are limited to 2 other machine if you don't want to copy the whole partition to a new machine which would take hours.

rad_gruchalski · on March 31, 2021

Re 2, have you ever tried using Kafka with a non-JVM client?

EdwardDiego · on March 31, 2021

Yep, I have, no issues with the consumer/producer knowing who the partition leader is.

That said, curious to hear your experiences :)

rad_gruchalski · on March 31, 2021

I don’t have issues per se as long as I stick to librdkafka but even that is constantly playing catch up.

Outside of librdkafka and jvm client, it’s gloves off.

EdwardDiego · on March 31, 2021

Yeah, that's true, using librdkafka from C# I hit a few issues where librdkafka was somewhat behind Java in terms of features, I think the one I hit was multi-topic subscriptions.

IIRC Confluent has started putting resources into it - I would hope so, given how .NET Core is going.

That said, the state of Pulsar clients outside of the official Java ones was far worse, I was looking into .NET Core ones and the "official" one (Pulsar-DotPulsar) lacked some key features, whereas a third party one, pulsar-client-dotnet, had far more features, but was still somewhat behind the Java clients.

Caveat is that I looked into all of this when Pulsar was at version 2.6, it's not at 2.7.1, so my comments may well be out of date.

mrkeen · on March 31, 2021

kafkacat is excellent.

https://github.com/edenhill/kafkacat

rad_gruchalski · on March 31, 2021

Definitely, if all one needs is command line.

knodi · on March 31, 2021

Please separate storage from brokers next

EdwardDiego · on March 31, 2021

Why though?

It's worth noting that Twitter built their own system (EventBus) that Apache Pulsar largely mimics in design (and the people who started Pulsar at Yahoo had worked on EventBus prior), with brokers decoupled from storage, and then eventually just decided to get rid of it and use Kafka.

https://blog.twitter.com/engineering/en_us/topics/insights/2...

> One catch to this is that for extremely bandwidth-heavy workloads (very high fanout-reads), EventBus theoretically might be more efficient since we can scale out the serving layer independently. However, we’ve found in practice that our fanout is not extreme enough to merit separating the serving layer, especially given the bandwidth available on modern hardware.

nick0garvey · on March 31, 2021

Some workloads are very CPU intensive and some are not. Being forced to scale CPU & disk together means one of them is going to be overprovisioned - often by a lot.

I'm pretty surprised Twitter didn't see benefit from doing this if they have multiple Kafka clusters with different use cases.

EdwardDiego · on March 31, 2021

Okay, I can see that point, but is it worth the additional latency between broker and Bookie?

> I'm pretty surprised Twitter didn't see benefit from doing this if they have multiple Kafka clusters with different use cases.

Yeah, I think they were too tbh. I wish I could delve more into what they experienced beyond that single blog post I linked.

rad_gruchalski · on March 31, 2021

> Okay, I can see that point, but is it worth the additional latency between broker and Bookie?

It might depend on what you're ingesting and how much. Being able to independently scale ingest and storage is a good alternative to have. It's not only ingest though. It's also consumption. As it stands, having a parallel consumer over a large partition spanning several GBs also requires tons of RAM because a segment must be loaded into memory. In that sense, reprocessing historical data is pretty difficult. There's a lot of complexity hidden in additional Druid, HDFS installations or shoe-horned object storage with their own indexing to support access to historical data semi-fast.

EdwardDiego · on March 31, 2021

> having a parallel consumer over a large partition spanning several GBs also requires tons of RAM because a segment must be loaded into memory.

I don't know too much about Kafka's internals, but that's my not experience of reading several terabytes of data from a Kafka topic. Memory didn't blow out, although we did burn through IOPs credits.

rad_gruchalski · on March 31, 2021

Good remark. When there's one consumer or multiple consumers hanging on roughly to the same offset (using the same memory mapped segment). Having many consumers hanging on widely different offsets will cause many segments sitting in RAM.

Edit: Kafka apparently does not store a complete log segment in memory, only parts but having many consumers may lead to a lot of churn or a lot of memory consumed. Maybe this is getting better.

skyde · on March 31, 2021

Latency is actually reduced compared to kafka because when you have 3 replica you just need to wait for a response of the fastest of the 3 replica instead of waiting for a response of the slowest (master) of the 3 replica.

manigandham · on March 31, 2021

Confluent is already working on this: https://www.confluent.io/project-metamorphosis/infinite

The benefit is not just independent scaling of resources, but more useful features like archiving and reading from object storage with infinite history, and faster and more reliable data replication across regions.

whycombin8 · on March 31, 2021

one step closer to being able to run only a single broker at the edge.

skyde · on March 31, 2021

what is the point of using Kafka if you are using only one broker (no replication)? Might as well just write your data to /dev/null

rad_gruchalski · on March 31, 2021

Works very well. Had some time to try out the 2.8 build and all existing tooling works just fine. Solid work by everyone involved in this.

surfsvammel · on March 31, 2021

Great news! Props!

I have a use case where this would make sense, I’ll dig in directly!

deveffort · on March 30, 2021

Sweet!! Thank you.

CatDogIsGod · on March 30, 2021

Awesome!

airhead969 · on March 30, 2021

IIRC, ZK would be more modern and cloud-friendly if it could self-assemble with a preshared passphrase alone. It's good technology otherwise, it's just a PITA to deploy, configure, and support.

cmckn · on March 30, 2021

I really love a lot of the software under the Hadoop umbrella, but so much of it assumes a static deployment on bare metal hosts, it's a struggle to use it in "modern" setups (HBase, for example; I miss my old friend).

Godel_unicode · on March 31, 2021

https://azure.microsoft.com/en-us/services/databricks

godisdad · on March 30, 2021

Yeah. They should have written it twenty years ago in a datacenter with public cloud in mind

cmckn · on March 31, 2021

Hadoop launched in 2006, the same year as AWS' cloud portfolio. HBase showed up in 2008.

Many of the hiccups with running Hadoop and friends in containers or on cloud VM's boils down to how hostnames are resolved and advertised; not any significant design issue.

pc86 · on March 31, 2021

> Hadoop launched in 2006, the same year as AWS' cloud portfolio.

Which makes it all the less reasonable to assume it would be in any way cloud native, when "the cloud" was at best a nascent idea at that point.

And how many years did it take AWS to get any serious traction after launch?

cmckn · on March 31, 2021

I haven't assumed anything of the sort, I said I loved the software and wished it was easier to use in what is considered a "modern" environment.

rrdharan · on March 31, 2021

The irony of the name Cloudera has not been lost on keen observers of this space.

airhead969 · on March 31, 2021

It makes you wonder how many people think "clouds" run on something besides servers.

closeparen · on March 31, 2021

It seems weird to me that cloud providers do not offer distributed coordination primitives "as a service." I understand there are KV stores but not with watches, locks, etc. in the way that etcd and ZK have them.

the_arun · on March 31, 2021

Agreed. We need highly available & distributed

1. Locking service

2. Id generator

3. Bloom filter etc.,

voidfunc · on March 31, 2021

Azures blob service supports the first one. Its strongly consistent and has a lease/lock API. Ive seen leader election and coordination stuff built on top of it.

ric2b · on March 31, 2021

Why the id generator? Aren't GUID's enough? The chance of a collision is basically 0.

SEJeff · on March 31, 2021

A few different points of view:

https://tomharrisonjr.com/uuid-or-guid-as-primary-keys-be-ca...

https://news.ycombinator.com/item?id=14523523 (the hacker news comments for the above article are really insightful)

https://www.percona.com/blog/2019/11/22/uuids-are-popular-bu...

https://blog.codinghorror.com/primary-keys-ids-versus-guids/

the_arun · on March 31, 2021

For most of the use cases GUID works great. There are some cases where we need a numerical id which is sequential.

skyde · on March 31, 2021

Are you looking for something like this ? https://azure.microsoft.com/en-us/updates/azure-cosmos-db-ap...

closeparen · on March 31, 2021

Yes! Exactly. Are there analogues in GCP and AWS?

pram · on March 31, 2021

The dynamic reconfigure API they added is also terrible, imo. The only thing that made ZK tolerable to use for me was Exhibitor but that appears to be long dead.