
While I appreciate PostgreSQL every day, am I the only one who thinks this is a rather bad idea? The row-based engine of PG is the antithesis of efficient storage and retrieval of timeseries of similar patterns, yielding almost no compression. Columnar storage should naturally be much better at that (BigQuery, Redshift, Citus), culminating in purpose-built stores like Influx, Prometheus or KDB. Prometheus for example manages to compress an average 64-bit FP sample including metadata to just 1.3-3.3 bytes depending on engine. As most DB stuff is I/O bound, that usually translates into at least an order of magnitude faster lookups.



Not all time series data is metrics (which is what Influx & Prometheus are adapted for). Any kind of audit log -- to support a customer service console, for example -- is going to be a time series of fairly complex, multi-field records. Orders (and order modifications), support requests, payouts and transfers... Most of business activity falls under the aegis of time series, nominally immutable, even when it's not numbers. Working with this kind of data is definitely helped by a comprehensive SQL dialect.

To take a contrary position: whose infrastructure is so large that a specialized SQL and storage engine (like Influx) is necessary? Actually not that many companies...so why does infrastructure always end up with these special snowflake versions of things?


Soo much this. I just worked on a project that sacrificed reliability, maintainability, and scalability to use a real-time database to deal with loads that were on the order of 70 values or 7 writes a second.

Granted this was an especially bad case because it was the shittiest real time database ever built.


Don't think we're going to disagree with this!

And because Timescale is an extension in Postgres, it can actually sit inside (and coexist with) your standard Postgres database if you want. Or, it can be deployed as a separate instance that can transparently JOIN data against your main DB via a FDW.
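
For illustration, the FDW path would look roughly like this with postgres_fdw (the server, schema, and table names here are made up):

    -- on the TimescaleDB instance: expose a table from the main DB
    CREATE EXTENSION IF NOT EXISTS postgres_fdw;
    CREATE SERVER main_db FOREIGN DATA WRAPPER postgres_fdw
        OPTIONS (host 'main-db.internal', dbname 'app');
    CREATE USER MAPPING FOR CURRENT_USER SERVER main_db
        OPTIONS (user 'readonly', password 'secret');
    IMPORT FOREIGN SCHEMA public LIMIT TO (devices)
        FROM SERVER main_db INTO public;

    -- now time-series rows can be joined against the remote table
    SELECT d.name, avg(m.value)
    FROM metrics m
    JOIN devices d ON d.id = m.device_id
    WHERE m.time > now() - interval '1 day'
    GROUP BY d.name;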


Companies don't dream of staying small, and they want solutions that scale with them. If they do end up growing, planning ahead with a purpose-built solution saves the time and resources otherwise spent paying down technical debt.


And many companies stay small or disappear because they wasted resources on problems they didn't have.


It's a risk either way.


But an equal risk? For most companies it isn't. There are more failed big dreamers than companies that missed the big time because they couldn't scale.


It's a good question, and I don't know where the quantitative data exists to answer it.

My personal preference is to hire experienced teammates who have solved similar problems in the past, because at most levels of scale (other than, say, Google's or Facebook's), few problems are truly novel. So it's reasonably possible to predict where the business will be WRT technological requirements in, say, 1 to 3 years, sometimes 5 if you're just getting started, and plan accordingly.

In the case of this particular problem (time series data), there are some pretty easy and scalable answers out there -- Datadog, SignalFx, Wavefront, Prometheus, etc. to start with. So it's not like you have to waste a bunch of opportunity cost going out and building a solution.


With reference to infrastructure in particular -- not all aspects of the business -- even very large companies rarely need specialized storage engines for logs or metrics. A thousand servers don't generate enough data for it to matter.

We've seen something of a convergence here with ELK -- people push their syslog into the same system they use to provide general text processing for the application, instead of having a separate/weird storage engine for infrastructure logs.


We do acknowledge that storage is not our strong suit, but we think the benefits of having SQL and being able to integrate easily with non-time-series data is a big win itself. Certainly if storage is an issue for you, TimescaleDB is probably not the right choice. But if it isn't, and you are doing more advanced queries (including JOINs or across multiple metrics), TimescaleDB provides a compelling solution in our view.


It's not just the storage itself (which is really cheap nowadays), but more of the fact that every byte read and written needs to go through the processor, pollutes RAM and poisons caches. Less storage usually translates into direct performance gains as well.

Also, if you find yourself JOINing timeseries on more attributes than the timeline itself, you should question whether you really have a solid use case for a timeseries DB.

That being said, always good to see competition in the market, especially if it's built on such a rock solid product and community.

I really like that Postgres is a "good enough" solution for almost every use case by now beyond relational data, be it document storage, full text search, messaging - or time series now. Nothing wrong with having less stack to worry about, especially for prototypes and small scale projects!


I think it's pretty clear Postgres is getting to a situation where the storage format becomes the limiting factor for a lot of workloads, particularly various types of analytics. Time series are an example of yet another type of data that would benefit from a different type of storage. There already were some early patches to allow custom storage formats, so chances are we might see something in one of the next Postgres versions (not in 10, though).


> It's not just the storage itself (which is really cheap nowadays), but more of the fact that every byte read and written needs to go through the processor, pollutes RAM and poisons caches. Less storage usually translates into direct performance gains as well.

FWIW, I think that's currently a good chunk away from being a significant bottleneck in postgres' query processing. But we're chipping away at the bigger ones, so we'll probably get there at some point.


One more thing to add: Postgres index usage via index-only scans goes a long way to mitigating performance issues of wide rows (although, admittedly, not disk-space issues). This allows good performance on columnar rollups.
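
Rough sketch of what I mean (table and column names made up): with a composite index covering the queried columns, a rollup never has to touch the wide heap rows, provided the visibility map is reasonably up to date.

    CREATE TABLE metrics (
        time    timestamptz NOT NULL,
        device  int,
        value   double precision,
        payload jsonb           -- the "wide" part we want to avoid reading
    );
    CREATE INDEX metrics_time_value_idx ON metrics (time, value);

    -- can be answered with an index-only scan over (time, value)
    SELECT date_trunc('hour', time), avg(value)
    FROM metrics
    WHERE time > now() - interval '7 days'
    GROUP BY 1;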


You are mentioning implementation-specific issues that may have more than one solution.

If one goes to the heavy contenders in this space, e.g. Teradata, you may expect:

- DMA for data retrieval
- A suitable and linearly scalable network layer with Remote DMA
- Row, columnar and hybrid storage options
- Utilization of CPU vector options
- Etc.

The analytical database has become a commodity. I really like Postgres, but I would still do a very careful analysis of my business needs if I were to choose a DBMS when there is such a strong range of commercial options available.


All good points. And especially agree about your last paragraph -- another benefit I didn't highlight is anyone familiar with Postgres does not have to learn a new part of the stack.


I totally agree with all the points here.

Maybe Postgres needs pluggable storage engines [0].

I read Postgres 10 might have this, but looks like it will miss the deadline.

[0] https://www.pgcon.org/2016/schedule/events/920.en.html


Good time series performance is more than just using column-based storage. You also need a query language to take advantage of this and the ordering guarantees it gives you. SQL, while it has tried to reinvent itself, is a very poor language for querying TS databases.


From personal experience, not sure I'd agree with that statement. SQL may be limiting for some time-series use cases, but for others it's quite rich and powerful. I won't pretend that SQL solves everyone's time-series problems, but we've found that it goes pretty far.
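
For example, a typical rollup over a toy metrics table (names made up here) is plain SQL and covers a lot of day-to-day time-series questions:

    SELECT date_trunc('minute', time) AS minute,
           device,
           min(value), max(value), avg(value)
    FROM metrics
    WHERE time > now() - interval '1 hour'
    GROUP BY 1, 2
    ORDER BY 1;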

That said, we may have to get a little creative to support some specific use cases (e.g., bitemporal modeling). Still TBD.

Also, I agree that SQL isn't for everyone (the popularity of PromQL is evidence to that). But a lot of people have been using SQL for a while (personally, since 1999), and there is a rich ecosystem (clients, tools, visualization tools, etc) built around it.


It most definitely is... LEAD and LAG are about all you get, and they are painfully slow. SQL was made to be order agnostic, and attempts to make it more order-aware don't quite work. A good time series database is built on table order and lets you exploit it. SQL is abysmal for any time series work.
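
For instance, even a simple per-device delta (toy schema) has to go through a window function, and the per-partition sort behind it is where things get painful on large tables:

    SELECT time,
           device,
           value - lag(value) OVER (PARTITION BY device ORDER BY time) AS delta
    FROM metrics
    ORDER BY device, time;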

And temporal and bitemporal databases (despite the name) are orthogonal to the aggregation and windowing issues that make time series difficult in SQL or a row-oriented database. They are just a couple of timestamps and WHERE clauses to support point-in-time queries.

Maybe this is why so many time series databases fail. People making them often don't seem to fundamentally understand the issues. Very few, such as Kx and KDB, seem to understand them.


Pluggable storage will definitely miss PostgreSQL 10 since the feature freeze is later this week.


And there's not even a credible proof-of-concept patch.


AFAIK, cstore_fdw (their columnar storage implementation) is not widely adopted among Citus users because it's not real-time. They also rely on memory to be able to query the data efficiently. I believe that the tricky part for time-series databases is not the underlying storage format; if you store the data in a way that allows you to query a specific time period efficiently, that's usually enough for most cases.

If the data for that specific time period is not small, then you will hit some I/O problems. If you create the appropriate indexes it may be efficient at first, but you may eventually need to switch to a columnar storage format if you really need it. If Timescale uses immutable shards which are basically PostgreSQL tables, you can easily re-write the historical shards periodically using a columnar storage format rather than the default row-oriented table format.

That's usually how most of the time-series databases work. The hot data is in row-oriented format and the system uses buffering to write the efficient columnar tables periodically under the hood.
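
A rough sketch of that rotation with cstore_fdw (table, server, and date-range names here are invented; cstore_fdw supports bulk loading via INSERT INTO ... SELECT or COPY):

    CREATE EXTENSION IF NOT EXISTS cstore_fdw;
    CREATE SERVER cstore_server FOREIGN DATA WRAPPER cstore_fdw;

    -- columnar "archive" table for a closed time range
    CREATE FOREIGN TABLE metrics_2017_03 (
        time   timestamptz,
        device int,
        value  double precision
    ) SERVER cstore_server
      OPTIONS (compression 'pglz');

    -- move cold rows out of the hot row-store table
    INSERT INTO metrics_2017_03
        SELECT time, device, value FROM metrics_hot
        WHERE time >= '2017-03-01' AND time < '2017-04-01';
    DELETE FROM metrics_hot
        WHERE time >= '2017-03-01' AND time < '2017-04-01';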


I think these are good points about cstore_fdw and real-time analysis (although we don't have personal experience with this).

The usual thing that prevents indexes from scaling with large data is that inserts slow down as soon as the index BTrees can't fit into memory. TimescaleDB gets around this problem by auto-sizing of tables: we start new tables as old tables (and their indexes) grow so large that they can no longer fit in memory. This allows us to have tables that are not so big that inserts become slow but big enough that queries don't have to touch many tables (and are thus efficient).
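
To make that concrete, a minimal sketch of creating a hypertable with an explicit chunk interval (toy schema; the sizing itself happens under the hood, and the exact create_hypertable signature may differ across TimescaleDB versions):

    CREATE TABLE conditions (
        time        timestamptz NOT NULL,
        device      int,
        temperature double precision
    );
    -- turn it into a hypertable; chunks are created per time interval under the hood
    SELECT create_hypertable('conditions', 'time',
                             chunk_time_interval => interval '1 day');

    -- inserts land in the chunk covering their timestamp, so the "hot" chunk's
    -- indexes stay small enough to remain in memory
    INSERT INTO conditions VALUES (now(), 1, 21.5);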

However, as data sizes grow, you may want to convert data to column-stores to save disk space, as you allude to. We are looking at how best to do this and the best "archive" format to use. Stay tuned.


(Ozgun from Citus Data)

Your observations on cstore_fdw are accurate. I'm adding a few points about our learnings from cstore deployments in production.

From a use-case standpoint, we find that most cstore deployments use it for data archival and to a lesser extent data warehousing workloads on Postgres. Some of these deployments are fairly sizable (Fortune 1000) and they primarily benefit from compression that comes with cstore_fdw (3-10x).

Citus + cstore_fdw deployments exist, but they are fewer in number. This isn't because of an architectural issue related to cstore or Postgres FDW/extension APIs. Rather, making cstore work for real-time workloads requires notable implementation work. In particular:

(1) Real-time ingest & update / delete capabilities: Columnar stores have a harder time providing these features. The most practical way to achieve this is by partitioning your data on time, using a row store for the most recent data, and then rotating that data into a columnar store.

Users who use cstore for real-time workloads manually set up table rotation themselves. In fact, this type of integration between Citus (distributed db) and cstore_fdw (columnar storage) has been one of the most requested features: https://github.com/citusdata/citus/issues/183

(2) High-availability & disaster recovery: This requires that cstore integrates with PG's write-ahead logs. Fortunately, Postgres 9.6 makes this possible / pluggable and it's on cstore_fdw's immediate roadmap.

(3) Indexes for quick look-ups: cstore_fdw comes with built-in min/max indexes. This helps but doesn't provide quick look-ups across any field. We thought about integration with PG indexes -- one challenge there involves PG indexes' storage footprint and potential hit on compression ratios.

(4) Query planning and execution tuned for columnar stores: If your workload is bottlenecked on CPU (rather than disk), these types of improvements can help a lot. For now, we're happy to see the recent and upcoming analytical query performance improvements in Postgres 10 and beyond.

Also, putting cstore_fdw aside, I'm a huge fan of Timescale and the work they are doing! My favorite "feature" is that Timescale is a PostgreSQL extension. :)


So I don't know about the data layout here (and definitely agree with you re: column stores), but if you look at stuff like cstore_fdw[0], I think the data layout is also somewhat configurable under the hood. So "based on postgresql" doesn't necessarily mean "uses the same storage" - it could have its own columnar storage with the postgres query planning, connection management and other infrastructure on top, perhaps?

[0] https://github.com/citusdata/cstore_fdw


Currently we are using the default storage mechanism in Postgres, but have had discussions on alternatives and other ways to compress our storage footprint.


Have you tested running it on a FreeBSD box with ZFS? It has lz4 compression by default and makes such a great storage solution for PG. You get compression, snapshots, replication (not quite realtime but close), self healing, etc. etc. in a battle-hardened and easy to manage filesystem and storage manager. I've found you can't beat ZFS and PG for most applications. Edge cases exist of course everywhere.


Compression works well with column stores for different reasons - since all columns are things of the same data type, there are way more opportunities for very very lightweight and highly effective compression, like run length encoding (a,a,a,a,a,a -> 6a) which helps a lot with sparse or duplicate columns, range compression (1,2,3,4,5 -> 1-5), dictionary compression (hippopotamus,potato,potato,hippopotamus -> 1,2,2,1 + a dictionary like {1:hippo,2:potato}), value scale (203,207,202,201 -> 3,7,2,1 + base:200), bit arrays for booleans and bit vectors for low-cardinality columns, etc.

This saves on space but also often improves query latency, provided your execution engine supports queries on compressed columns - less data means more of it fits into cache, meaning fewer cache misses (i.e. fewer expensive disk reads).

edit: So what I mean is compressing the entire files or disk blocks with LZ might not take advantage of the same properties and might not get you the same deal - here often the in-memory data layout is also compressed, and queries execute efficiently on compressed data, versus with a blindly compressed filesystem you'd ostensibly have to always decompress first.


True. Actually, this execution efficiency is usually a far more important benefit of column stores than just saving disk space.

Sadly, PostgreSQL does not support that now, but hopefully it's not that far in the future. It's becoming a fairly regular topic in discussions both with customers and with devs at conferences.


Thanks for the great suggestions.

We've certainly discussed running on ZFS internally, but haven't evaluated yet. We're a bit uncomfortable dictating file system requirements to users, so looking to ultimately provide many of these things in the database instead.

Would welcome any testing/benchmarks, though! :)


Reminds me of IBM Informix TimeSeries which has been around for a while. This was IBM's attempt to bridge the gap between the relational model and the columnar format that is optimal for time series, except that they store the time:value arrays in blobs. Aside from compression benefits, the columnar storage works really well for irregular time series. If the incoming data is irregularly spaced, a row-based schema will have a certain degree of inefficiency due to the need to store NULLs each time one of the columns is missing a valid value.


That point about irregularly spaced (sparse) data is a very insightful observation. I'd just add that a user can to some extent address that by normalization, i.e., splitting incoming data across multiple TimescaleDB (hyper)tables, like in any SQL database. However, there are clear trade-offs here. The upside is that users can themselves balance these trade-offs.
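
A minimal sketch of that normalization (table and column names invented): one narrow table per irregular signal instead of a single wide, NULL-heavy one, recombined only when a joint view is actually needed.

    CREATE TABLE temperature_readings (time timestamptz NOT NULL, device int, value double precision);
    CREATE TABLE humidity_readings    (time timestamptz NOT NULL, device int, value double precision);

    -- recombine on aligned hourly buckets when both signals are needed together
    SELECT t.hour, t.avg_temp, h.avg_humidity
    FROM (SELECT date_trunc('hour', time) AS hour, avg(value) AS avg_temp
          FROM temperature_readings GROUP BY 1) t
    JOIN (SELECT date_trunc('hour', time) AS hour, avg(value) AS avg_humidity
          FROM humidity_readings GROUP BY 1) h USING (hour);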



