Why do so many companies insist on shipping their logs via Kafka? I can't imagine delivery guarantees are necessary for logs, and if they are, should that data even be in your logs?
Kafka is a big dumb pipe that moves the bytes real fast, which makes it ideal for shipping logs. It accepts huge volumes of tiny writes without breaking a sweat, which is exactly what you want--get the logs off the box ASAP and persist them somewhere durable (i.e. replicated).
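To make that concrete, here's a minimal sketch of the producer side using the confluent-kafka Python client (the broker address, topic name, and tuning values are illustrative placeholders, not recommendations):

    import socket
    from confluent_kafka import Producer

    # Fire-and-forget producer tuned for lots of tiny log writes:
    # linger a few ms so the client can coalesce them into big batches.
    producer = Producer({
        "bootstrap.servers": "kafka-broker:9092",  # placeholder
        "client.id": socket.gethostname(),
        "acks": "1",                # leader ack is plenty for logs
        "linger.ms": 50,            # trade a little latency for batching
        "compression.type": "lz4",  # log lines compress very well
    })

    def ship(line: bytes) -> None:
        # Non-blocking: hands the line to the client's internal buffer,
        # getting it off the hot path; poll(0) services delivery callbacks.
        producer.produce("app-logs", value=line)
        producer.poll(0)

    # e.g. ship(b'{"level":"info","msg":"request handled"}')
    # call producer.flush() on shutdown to drain what's still buffered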
My experience has been a mixture of "when all you have is a hammer..." and the fact that Pointy-Haired Bosses LOVE Kafka, and tend to default to it because it's what all their Pointy-Haired Boss friends are using.
In a more generous take: some kind of buffered ingest does help you avoid choosing between a c500.128xl ingest machine and dropping messages, but I would never advocate standing up Kafka just for log buffering.
At that point you are likely slowing down your applications. I think a basic OpenTelemetry Collector mostly solves this, and if you overflow the buffer available there, dropping is the appropriate choice for application logs.
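You can get the same drop-instead-of-block behavior in-process, too. Here's a minimal sketch using only the Python standard library (the queue size and the downstream handler are arbitrary examples):

    import logging
    import logging.handlers
    import queue

    class DropOnFullQueueHandler(logging.handlers.QueueHandler):
        """Non-blocking log handler: shed load instead of stalling the app."""
        def enqueue(self, record):
            try:
                self.queue.put_nowait(record)
            except queue.Full:
                pass  # buffer is full: drop the record rather than block

    log_queue = queue.Queue(maxsize=10_000)  # bounded buffer (example size)
    # A background thread drains the queue and forwards records to the real
    # (potentially slow) sink; StreamHandler stands in for a network exporter.
    listener = logging.handlers.QueueListener(log_queue, logging.StreamHandler())
    listener.start()
    logging.getLogger().addHandler(DropOnFullQueueHandler(log_queue))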
Dropping may be an unacceptable choice for some applications, though. For example, dropping request logs is really bad, because now you have no idea who is interacting with your service. If a security breach happens and your answer is "like, bro, idk what happened man, we load-shed the logs away", that's not a great look...
In log shipping cases it's good as a buffer so you can batch writes to the underlying SIEM. This avoids making tons of small API calls with a few hundred or a few thousand log lines each. Instead, Kafka absorbs all the small writes, and the SIEM can subscribe and turn them into much larger batches to write to the underlying storage (e.g. S3).
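As a rough sketch of that consumer side (confluent-kafka plus boto3; the topic, consumer group, bucket, and flush threshold are all made-up placeholders):

    import time
    import boto3
    from confluent_kafka import Consumer

    consumer = Consumer({
        "bootstrap.servers": "kafka-broker:9092",  # placeholder
        "group.id": "siem-ingest",                 # placeholder
        "auto.offset.reset": "earliest",
        "enable.auto.commit": False,  # commit only after the S3 write lands
    })
    consumer.subscribe(["app-logs"])
    s3 = boto3.client("s3")

    buf, buf_bytes = [], 0
    FLUSH_BYTES = 64 * 1024 * 1024  # ~64 MiB objects instead of tiny PUTs

    while True:
        for msg in consumer.consume(num_messages=10_000, timeout=1.0):
            if msg.error():
                continue
            buf.append(msg.value())
            buf_bytes += len(msg.value())
        if buf_bytes >= FLUSH_BYTES:
            # one big write instead of thousands of small API calls
            s3.put_object(
                Bucket="siem-raw-logs",  # placeholder
                Key=f"logs/{int(time.time())}.ndjson",
                Body=b"\n".join(buf),
            )
            consumer.commit(asynchronous=False)  # at-least-once handoff
            buf, buf_bytes = [], 0

A real ingester would also flush on a timer so a quiet topic doesn't strand buffered records, but the shape is the same: many tiny produces in, a few large PUTs out.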
Don't forget about all the added cost, either. I never got it, since many shops can tolerate data loss for their MELT (metrics, events, logs, traces) data. So long as it's collected 99.9% of the time, it's good enough.