
Biased view of a Citus engineer here :)

Any enhancement to PostgreSQL is also an enhancement to Citus, or rather, to the PostgreSQL ecosystem as a whole. For example, PostgreSQL 10's declarative partitioning feature will help enable sharding+partitioning in Citus, which is one of the most frequently requested features.

PostgreSQL 10 also gives you the possibility of setting up a partitioned table in which the partitions are postgres_fdw tables, which allows a basic form of manual sharding without Citus. However, as we've learned over the years, there's a huge difference between the ability to distribute a table across multiple servers and addressing a use case.
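To make the manual approach concrete, here is a minimal sketch in Python that only generates the DDL for such a setup. The table name, columns, server names, and range width are all hypothetical, and it assumes the foreign servers and user mappings were already created with CREATE SERVER ... FOREIGN DATA WRAPPER postgres_fdw and that a table of the same name exists on each remote server.

```python
def fdw_shard_ddl(table, columns, key, servers, width):
    """Build DDL for a range-partitioned parent table whose partitions
    are postgres_fdw foreign tables, one partition per foreign server.

    All names passed in are illustrative; nothing here talks to a database.
    """
    # Parent table: declarative range partitioning (PostgreSQL 10 syntax).
    stmts = [f"CREATE TABLE {table} ({columns}) PARTITION BY RANGE ({key});"]
    for i, server in enumerate(servers):
        lo, hi = i * width, (i + 1) * width  # upper bound is exclusive
        stmts.append(
            f"CREATE FOREIGN TABLE {table}_{i} PARTITION OF {table} "
            f"FOR VALUES FROM ({lo}) TO ({hi}) "
            f"SERVER {server} OPTIONS (table_name '{table}');"
        )
    return stmts


# Hypothetical two-server layout, 1000 tenant IDs per server.
for stmt in fdw_shard_ddl(
    "events", "tenant_id int, payload text", "tenant_id",
    ["shard0", "shard1"], 1000,
):
    print(stmt)
```

Everything beyond this DDL (placing shards, moving them, keeping related tables together) is left to you, which is the gap described above.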

A sweet spot for Citus is multi-tenant (SaaS) workloads, in which all queries and transactions are specific to a particular tenant. In that case, you can typically distribute most of your tables by tenant ID, and use (replicated) reference tables for data that is shared across tenants. Citus makes sure that data for the same tenant is automatically co-located, so as long as your query filters by a particular tenant and joins by tenant, you get full SQL pushdown (with parallelism in PG10) and ACID transactions. At the same time, you can perform parallel DDL commands across all tenants to enable migrations, bulk load data through COPY, perform parallel rollups or transformations through INSERT..SELECT, and run parallel analytical queries. Overall, the combination of these features and the trade-offs that Citus makes means that if you need to scale out a multi-tenant app, sharding through Citus covers that use case. In many cases, the only changes you need to make in your app are adding a tenant_id column to your tables [1], being explicit about the tenant in your queries or ORM [2], and adding create_distributed_table calls.
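As a rough illustration of why co-location matters, here is a toy model in Python. It is not how Citus is implemented: the hash function, the worker count, and the ToyCluster class are all assumptions made up for this sketch. It only shows that when every table is distributed by tenant_id, a query that filters and joins by tenant can be answered by a single worker.

```python
import zlib
from collections import defaultdict

NUM_WORKERS = 4  # hypothetical cluster size


def worker_for(tenant_id: int) -> int:
    """Toy placement rule: hash the tenant ID onto a worker.

    CRC32 is a stand-in chosen for this sketch; it is not Citus's hash.
    """
    return zlib.crc32(str(tenant_id).encode()) % NUM_WORKERS


class ToyCluster:
    def __init__(self):
        # One dict per worker, mapping table name -> list of rows.
        self.workers = [defaultdict(list) for _ in range(NUM_WORKERS)]

    def insert(self, table, row):
        # Every table is distributed by the same column, so rows of one
        # tenant end up on the same worker regardless of the table.
        self.workers[worker_for(row["tenant_id"])][table].append(row)

    def tenant_join(self, tenant_id, left, right, on):
        """Join two tables for one tenant by looking at a single worker."""
        w = self.workers[worker_for(tenant_id)]
        return [
            {**l, **r}
            for l in w[left] if l["tenant_id"] == tenant_id
            for r in w[right]
            if r["tenant_id"] == tenant_id and l[on] == r[on]
        ]


cluster = ToyCluster()
cluster.insert("orders", {"tenant_id": 7, "order_id": 1})
cluster.insert("line_items", {"tenant_id": 7, "order_id": 1, "sku": "a"})
cluster.insert("line_items", {"tenant_id": 8, "order_id": 1, "sku": "b"})
print(cluster.tenant_join(7, "orders", "line_items", "order_id"))
```

In the toy, the join never has to look at another worker, which is the property that lets the whole query be pushed down when the tenant is explicit.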

So far, none of the Citus features that allow you to scale out a multi-tenant app are available if you do sharding through partitioning+postgres_fdw. I also wouldn't expect core postgres to make aggressive trade-offs to optimise for specific use cases. The PostgreSQL way is to make everything pluggable and let extensions specialise.

[1] https://www.citusdata.com/blog/2016/08/10/sharding-for-a-mul...

[2] https://www.citusdata.com/blog/2017/01/05/easily-scale-out-m...



