Running Apache Kafka at Scale

robalfonso · on March 20, 2015

This is very timely. We are just starting to work on a new centralized logging system using logstash->kafka->elasticsearch->kibana, and we may also use kibana for some other tasks. Does anyone have experience with that setup? Pros/cons?

We used Graylog for a while but ran into some issues with back pressure on Elasticsearch (hence Kafka).

toomuchtodo · on March 20, 2015

Look into the new Graylog (v1.0). It was recently released, and they use Kafka internally now for buffering. We're sending thousands of msgs per second to Graylog without issue.

waitwaitwhay · on March 21, 2015

According to their documentation, the data storage for log events is Elasticsearch, and if the Elasticsearch data is lost then the logs are gone. Considering that ES is not a database and may lose data, this sounds a bit scary to me.

_up · on March 21, 2015

My understanding is that append-only workloads are fine. So if your use case is logging, Elasticsearch is totally fine. You probably can't compare it to the robustness of MySQL & PostgreSQL, but most NoSQL stores are not known to be that robust either.

toomuchtodo · on March 21, 2015

You can solve this in one of two ways:

1. Whatever mechanism you're using to send data to Graylog, send that data to S3 as well. You can then reload Graylog at any time from the S3 data.

2. Back up your Elasticsearch nodes to S3.

I should've mentioned I run this in AWS. Sorry about that!
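
A minimal sketch of option 1, assuming nothing about your actual shipper: every name here (`make_fanout`, `ArchiveSink`, `IndexSink`) is hypothetical, and both sinks are in-memory stand-ins rather than real S3 or Graylog clients. It only shows the shape of the idea: write each event to a durable archive as well as the searchable store, so the store can be rebuilt by replaying the archive.

```python
import json
from typing import Callable, Iterable, List

Sink = Callable[[dict], None]


def make_fanout(sinks: Iterable[Sink]) -> Sink:
    """Return a sender that delivers each event to every sink, in order."""
    sink_list = list(sinks)

    def send(event: dict) -> None:
        for sink in sink_list:
            sink(event)

    return send


class ArchiveSink:
    """Hypothetical stand-in for an S3 archive: newline-delimited JSON in memory."""

    def __init__(self) -> None:
        self.lines: List[str] = []

    def __call__(self, event: dict) -> None:
        self.lines.append(json.dumps(event, sort_keys=True))

    def replay(self, target: Sink) -> int:
        """Re-send every archived event to target; return how many were sent."""
        count = 0
        for line in self.lines:
            target(json.loads(line))
            count += 1
        return count


class IndexSink:
    """Hypothetical stand-in for Graylog/Elasticsearch: a store that can be lost."""

    def __init__(self) -> None:
        self.events: List[dict] = []

    def __call__(self, event: dict) -> None:
        self.events.append(event)


archive = ArchiveSink()
index = IndexSink()
# Archive first, so an event is stored durably before it is indexed.
send = make_fanout([archive, index])

send({"host": "web-1", "msg": "started"})
send({"host": "web-2", "msg": "timeout"})

# Simulate losing the index, then rebuild it from the archive.
index.events.clear()
replayed = archive.replay(index)
```

In a real setup the archive sink would write batches to an S3 bucket and the replay would read them back, but the ordering choice (archive before index) carries over.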

pswenson · on March 21, 2015

read up here... https://www.elastic.co/blog/resiliency-elasticsearch/

They had a major data corruption bug last year but have taken measures to correct it. I asked at elasticon whether Elasticsearch could be a source of truth, and they didn't say yes, but they indicated that you "could do it" and that it is a goal.

weego · on March 20, 2015

Having started off planning the exact same stack (but for business intelligence event tracking), we ended up dropping Kafka for Kinesis. It felt like until we reached the scale where things tip in favour of self-managed infrastructure (if ever), there was no value and only probable pain in managing that part ourselves.

robalfonso · on March 23, 2015

We already self-manage our own infrastructure, so yeah, we are in exactly opposite positions. Thanks for the insight though!

felipesabino · on March 20, 2015

I have heard comparisons with other technologies before, like RabbitMQ [1] [2], but I would love to see a feature/performance comparison between Kafka and other similar solutions, especially cloud-based services like the new kid on the block, Google Pub/Sub [3].

[1] http://www.quora.com/RabbitMQ-vs-Kafka-which-one-for-durable...

[2] https://youtu.be/MA_3fPBFBtg?t=35m27s

[3] https://cloud.google.com/pubsub