Hey! I'm the original author of this post. I'm so excited to share our journey with ClickHouse and the open source observability world, and I'll be happy to answer any questions you may have!
Very cool write up. I'm curious about any challenges you had using Grafana? Also, do you think this sort of system would work as an alternative to Splunk as well?
Grafana works really well out of the box for most use cases. Initially the LogHouse UI was built using the out-of-the-box Grafana tools (along with the ClickHouse data source plugin, which we maintain).
You can get a really long way with the zero-code dashboarding tools, especially with the latest plugin release (4.0), which comes with a completely rebuilt query builder that has an opinionated mode specifically for OpenTelemetry data. LogHouse exhausted the limits of the zero-code UI in a few places, and it was necessary to evolve the UI from a dashboard into a Grafana plugin. Doing so gives you much more control to build a full application using the Grafana primitives. I really like that as an SRE I can build a webapp without spending any time building UI components, instead just controlling the layout on the page and declaring "this panel has the following `SELECT…` query", which is generated by some TypeScript function.
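To make the "panel declares a `SELECT…` query generated by a TypeScript function" idea concrete, here is a minimal sketch. It is not the LogHouse code: `LogPanel`, `LogFilters`, `buildLogsQuery`, and the `otel_logs` table are hypothetical names, and the column names assume the default OpenTelemetry logs schema for ClickHouse.

```typescript
// Hypothetical types for illustration only; not taken from LogHouse.
interface LogFilters {
  table: string;      // assumed table name, e.g. "otel_logs"
  namespace: string;  // Kubernetes namespace to filter on
  fromMs: number;     // range start, epoch milliseconds
  toMs: number;       // range end, epoch milliseconds
  limit?: number;     // row cap, defaults to 1000
}

interface LogPanel {
  title: string;
  sql: string;
}

// Escape backslashes and single quotes so a filter value
// cannot terminate the SQL string literal early.
function quote(value: string): string {
  return "'" + value.replace(/\\/g, "\\\\").replace(/'/g, "\\'") + "'";
}

// Build the SELECT statement a panel would run.
// Column names assume the default OpenTelemetry logs schema.
function buildLogsQuery(f: LogFilters): string {
  return [
    "SELECT Timestamp, SeverityText, Body",
    `FROM ${f.table}`,
    `WHERE ResourceAttributes['k8s.namespace.name'] = ${quote(f.namespace)}`,
    `  AND Timestamp >= fromUnixTimestamp64Milli(${Math.floor(f.fromMs)})`,
    `  AND Timestamp < fromUnixTimestamp64Milli(${Math.floor(f.toMs)})`,
    "ORDER BY Timestamp DESC",
    `LIMIT ${f.limit ?? 1000}`,
  ].join("\n");
}

// The "declaration" is then just data: a title plus the generated SQL.
const recentLogs: LogPanel = {
  title: "Recent logs",
  sql: buildLogsQuery({
    table: "otel_logs",
    namespace: "default",
    fromMs: 1700000000000,
    toMs: 1700003600000,
  }),
};
```

Because the panel is plain data, the same generated string can also be shown to the user verbatim for copy-pasting into a manual SQL session.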
Examples of things which we can do on top of Scenes are:
- Pick a different schema based on the query parameters. For instance we have different schemas for different applications (Keeper/Server/Generic K8s app) and the app picks the necessary schema
- Always show the full generated SQL query on the page (We like to use Grafana UI to start off and then jump into fully manual SQL for deeper analysis)
- Take one filter value (for instance, k8s namespace) and look up all of the other filters required (pod names which were live during the time period, region, cell, etc.)
- Some small gadgets like enabling users to import the time range from another application URL like Datadog. Oftentimes we start by looking at metrics in another source and then want to jump into the logs.
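Two of the items above, picking a schema from the query parameters and importing a time range from another application's URL, can be sketched roughly as follows. Everything here is illustrative: the `app` parameter, the schema table and column names, and the `from_ts`/`to_ts` parameter names (assumed to be epoch milliseconds on the external URL) are assumptions, not the actual plugin code.

```typescript
type AppKind = "keeper" | "server" | "generic";

interface LogSchema {
  table: string;
  columns: string[];
}

// Hypothetical schemas; real table and column names will differ.
const SCHEMAS: Record<AppKind, LogSchema> = {
  keeper: { table: "keeper_logs", columns: ["Timestamp", "Body", "KeeperNode"] },
  server: { table: "server_logs", columns: ["Timestamp", "Body", "QueryId"] },
  generic: { table: "k8s_logs", columns: ["Timestamp", "Body"] },
};

// Choose a schema from an assumed `app` query parameter,
// falling back to the generic schema for unknown or missing values.
function pickSchema(params: URLSearchParams): LogSchema {
  switch (params.get("app")) {
    case "keeper":
      return SCHEMAS.keeper;
    case "server":
      return SCHEMAS.server;
    default:
      return SCHEMAS.generic;
  }
}

// Extract a time range from a URL copied out of another tool.
// Assumes epoch-millisecond `from_ts` and `to_ts` parameters.
// Returns null when the URL is invalid or the range is unusable.
function importTimeRange(externalUrl: string): { from: number; to: number } | null {
  let params: URLSearchParams;
  try {
    params = new URL(externalUrl).searchParams;
  } catch {
    return null; // not a parseable URL
  }
  const from = Number(params.get("from_ts"));
  const to = Number(params.get("to_ts"));
  if (!Number.isFinite(from) || !Number.isFinite(to) || from <= 0 || to <= from) {
    return null;
  }
  return { from, to };
}
```

The returned `from` and `to` values can then be fed into whatever time range state the application keeps, so a user can paste a metrics URL and land on the logs for the same window.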
Can you share some details of how you implemented the cross region routing in Grafana? I think the article mentioned that you created your own plugin, is that plugin open sourced?
We would like to do something similar, but not sure where to start.