
Specialized tools are specialized: just remember the limitations!

- Bad with heterogeneous hardware (Cloudflare experience)
- Non-throttled recovery (source replicas flooded with replication load)
- No real delete/update support, and no transactions
- No secondary keys
- Own protocol (no MySQL protocol support)
- Limited SQL support, and the joins implementation is different. If you are migrating from MySQL or Spark, you will probably have to rewrite all queries with joins.
- No window functions



ClickHouse isn't a general-purpose DBMS. It is the best tool for collecting and near-real-time analysis of huge amounts of events with many properties. We successfully used a ClickHouse cluster with 10 shards to collect up to 3M events per second with 50 properties each (properties translate to columns). Each shard ran on an n1-highmem-16 instance in Google Cloud. The cluster was able to scan tens of billions of rows per second for our queries - 100x better scan performance than our previous highly tuned system built on PostgreSQL.
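A quick back-of-the-envelope check of those figures (assuming uniform sharding; the input numbers come straight from the comment above, everything derived from them is just arithmetic):

```python
# Cluster figures from the comment above (assumed uniformly sharded).
events_per_sec = 3_000_000      # total ingest rate across the cluster
shards = 10                     # number of ClickHouse shards
columns_per_event = 50          # properties per event -> columns

# Per-shard ingest rate under uniform sharding.
per_shard_rate = events_per_sec // shards        # 300,000 events/sec/shard

# Individual column values written per second cluster-wide.
values_per_sec = events_per_sec * columns_per_event  # 150,000,000 values/sec

print(per_shard_rate, values_per_sec)
```

So each n1-highmem-16 instance sustains roughly 300K events/sec, i.e. about 15M column values written per second per shard.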

ClickHouse may be used as a time-series backend, but it currently has a few drawbacks compared to specialized solutions:

- It has no efficient inverted index for fast metrics lookup by a set of label matchers.
- It doesn't support delta coding yet - https://github.com/yandex/ClickHouse/issues/838 .
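To illustrate why delta coding matters for time-series data, here is a minimal sketch (a hypothetical toy implementation, not ClickHouse internals): instead of storing raw values, you store the first value plus successive differences, which for regularly scraped timestamps are tiny and compress far better than the raw 64-bit integers.

```python
# Toy delta coding for monotonically increasing timestamps.
# Real codecs then pack these small deltas into few bits;
# this sketch only shows the encode/decode transform itself.

def delta_encode(values):
    """Return [first value, then successive differences]."""
    deltas = [values[0]]
    for prev, cur in zip(values, values[1:]):
        deltas.append(cur - prev)
    return deltas

def delta_decode(deltas):
    """Invert delta_encode by a running sum."""
    values = [deltas[0]]
    for d in deltas[1:]:
        values.append(values[-1] + d)
    return values

# Timestamps scraped every 10 seconds: the deltas are all 10,
# a highly repetitive stream that general compressors love.
timestamps = [1556000000, 1556000010, 1556000020, 1556000030]
encoded = delta_encode(timestamps)   # [1556000000, 10, 10, 10]
assert delta_decode(encoded) == timestamps
```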

Learn how we created a startup - VictoriaMetrics - that builds on performance ideas from ClickHouse and solves the issues mentioned above - https://medium.com/devopslinks/victoriametrics-creating-the-... . It currently has the highest performance/cost ratio among competitors.




