tudorg · 2024-03-21T10:35:30.000000Z

I'm curious about something: I suppose Salvatore still owns the copyright for most of the code? The old license does include his copyright, up to 2020: https://github.com/redis/redis/blob/7.2/COPYING So I think this change couldn't have been done without his explicit consent? Or did he transfer his rights to RedisLabs or a foundation?

weinzierl · 2024-03-21T10:43:35.000000Z

What your link points to is the BSD license, so yes, he owns the copyright but also gave everyone permission to use and modify the code as they see fit.

There is nothing that prevents anyone from using this code in combination with proprietary code and selling the resulting product for money. If he didn't want that, he would have chosen a different license.

tudorg · 2024-03-21T10:51:37.000000Z

Ah, makes sense, thanks! And they do own the trademark, it seems.

tudorg · 2024-03-18T19:38:36.000000Z

Just to set a bit more context, there are two types of clusters in Xata, shared and dedicated. For shared, we indeed bundle the CPU/memory cost into storage. This is because, typically, a small database also won't consume too much CPU/memory, so there is some correlation there.

For dedicated, pricing works the same as in Amazon Aurora.

> At some point your db is munching through costly ram and processor and oft times doing nothing. Serverless is meant to indicate a solution to this concern. It's just hard to see at a glance where the efficiency gains of this approach really are (seamless migrations and branching are cool and all) so it looks like it's too good to be true.

The efficiency gains, for small databases, are in using shared clusters. This breaks down at larger scale due to the noisy neighbor problem. However, moving databases seamlessly from one cluster type to another makes it easy to choose the "right size for you".

tudorg · 2024-03-18T16:53:40.000000Z

This is really cool, I love the idea of combining partitioning with foreign data wrappers to get sharded Postgres. I have tried it before and hit the same issue from the article:

> However, on closer inspection, we can see that we didn’t perform an async foreign scan, but executed each statement serially. Currently (based on my read of the PostgreSQL docs and code), PostgreSQL does not support async foreign scans with a “Merge Append” node (e.g., running multiple operations that require merging sort results, such as a K-NN sort).

It feels like Postgres is quite close to doing that in an optimal way.

tudorg · 2024-03-15T19:02:37.000000Z

Hi, we were planning to announce this next week, because we're having a launch week, but it's nice to see pgzx already slowly gathering votes here :).

We'll have a blog post where we talk more about the motivation behind it. Let me know if you have any questions.

tudorg · 2024-03-15T09:46:48.000000Z

Nice work! I like that it's both a library and a server right from the start. Will it be able to generate INSERT/UPDATE statements to apply the differences?

rexes · 2024-03-15T11:57:32.000000Z

Actually it doesn't dive right into the actual changed lines (even though it can be supported if the `--chunk-size` is set to 1), because it performs an MD5 comparison on the hashed rows.

This would be a great idea, but I am afraid it might be out of the scope of the tool, since we just wanted to keep it on the comparison level only.

We will note your feedback though and see if we can generate such commands in the future :)
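To illustrate the chunked MD5 comparison described above, here is a minimal Python sketch. It is not the tool's actual implementation: the function names, the in-memory row lists, and the `repr()`-based row serialization are all assumptions made for illustration. It only shows why a chunk size of 1 narrows a mismatch down to a single row, while a larger chunk size only tells you which chunk differs.

```python
import hashlib
from itertools import islice


def chunk_digests(rows, chunk_size):
    """Yield one MD5 hex digest per chunk of `chunk_size` rows."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        md5 = hashlib.md5()
        for row in chunk:
            # Assumption: repr() is a good enough serialization for plain tuples.
            md5.update(repr(row).encode("utf-8"))
        yield md5.hexdigest()


def differing_chunks(source_rows, target_rows, chunk_size):
    """Return the indexes of chunks whose digests differ between the two sides."""
    src = list(chunk_digests(source_rows, chunk_size))
    dst = list(chunk_digests(target_rows, chunk_size))
    n = max(len(src), len(dst))
    # A chunk that exists on only one side also counts as a difference.
    return [i for i in range(n) if i >= len(src) or i >= len(dst) or src[i] != dst[i]]


source = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
target = [(1, "a"), (2, "b"), (3, "X"), (4, "d")]

print(differing_chunks(source, target, 2))  # second chunk (index 1) differs
print(differing_chunks(source, target, 1))  # third row (index 2) differs
```

With a chunk size of 1 the chunk index is the row index, which is the information one would need to generate an INSERT/UPDATE for that row.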

tudorg · 2024-03-14T09:37:57.000000Z

> We introduce ExaLogLog (ELL), which is based on a recently proposed generalization of earlier data structures such as HLL, EHLL, and PCSA [17]. However, the geometric distribution of the update values is replaced by a distribution for which it is easier to map a 64-bit hash value to a corresponding random value. When optimally configured, ELL achieves a MVP of 3.67 as theoretically predicted and experimentally confirmed. Compared to HLL with 6-bit registers, ELL supports the same operating range up to the exa-scale, but requires 43% less space.

Sounds promising! I love the HyperLogLog idea and it's good to see improvements to it.

tudorg · 2024-03-11T20:46:31.000000Z

Hey, SQL over HTTP is possible now in Xata, and direct Postgres access will be possible _real soon_.

tudorg · 2024-02-18T17:48:14.000000Z

A good way to think about it is that any operation for which Postgres needs to inspect the existing data can block for a long time if there's a lot of data. For example, adding a unique constraint has to block. There is, however, a workaround with "NOT VALID".

Same with adding a NOT NULL constraint without a default value. If there is a (constant) default value, then Postgres can do that without blocking, which is pretty cool. That works because it only needs to modify metadata.

Same with changing column types: they need to go over the existing data.
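As a rough sketch of the "NOT VALID" pattern, here it is applied to the NOT NULL case (Postgres accepts NOT VALID on CHECK and foreign key constraints). The Python helper below only builds the SQL statements as strings. The helper name, the constraint naming scheme, and the `users`/`email` example are made up for illustration, and identifiers are not quoted or escaped.

```python
def add_not_null_steps(table, column):
    """Return the SQL statements for adding NOT NULL while holding only short locks.

    Illustrative only: identifiers are interpolated without quoting.
    """
    constraint = f"{table}_{column}_not_null"  # hypothetical constraint name
    return [
        # 1. Metadata-only: existing rows are not checked yet.
        f"ALTER TABLE {table} ADD CONSTRAINT {constraint} "
        f"CHECK ({column} IS NOT NULL) NOT VALID;",
        # 2. Scans the table, but under a lock that still allows reads and writes.
        f"ALTER TABLE {table} VALIDATE CONSTRAINT {constraint};",
        # 3. On Postgres 12+, the validated CHECK lets SET NOT NULL skip its own scan.
        f"ALTER TABLE {table} ALTER COLUMN {column} SET NOT NULL;",
        # 4. The CHECK constraint is now redundant.
        f"ALTER TABLE {table} DROP CONSTRAINT {constraint};",
    ]


for statement in add_not_null_steps("users", "email"):
    print(statement)
```

The point of splitting it up is that the long-running part (step 2) does not hold the exclusive lock that a plain `SET NOT NULL` would need while scanning the data.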

tudorg · 2024-02-18T17:41:22.000000Z

Nice overview of pg-osc. I'd like to also mention pgroll, which has some similarities, but does this at column level rather than table level and takes things further: it can expose the old and new schema simultaneously (using views), which means you don't need to maintain backwards compatibility code in your app.

Disclaimer: I work at Xata, we maintain pgroll.

e12e · 2024-02-18T17:47:36.000000Z

Pgroll: zero-downtime, reversible schema migrations for Postgres (xata.io)

328 points by ksec 4 months ago | 149 comments

https://news.ycombinator.com/item?id=37752366

https://github.com/xataio/pgroll

CuriouslyC · 2024-02-18T18:09:56.000000Z

Plug bomb, but pgroll looks pretty good. I do a lot of copy/renames to update fields on very large tables with a lot of indexes. If it could automate those scripts, including dependencies, I would use it for big bespoke migrations in a heartbeat.

On the smaller side, I can see this being useful to avoid migration bugs, but being its own migration tool isn't a great choice, since ecosystem-specific migration tools have a lot of useful options and can be used programmatically. I'd make a pgroll plugin for alembic and other common, feature-rich, ecosystem-specific migration tools that hooks into the DDL emission to transform a "dumb" migration into your juiced-up migrations. That'd make it an instant use for me.

canadiantim · 2024-02-18T18:38:29.000000Z

Even though I'm using django, I'm still considering pgroll because it's got some beautiful features. Thanks for maintaining pgroll!

tudorg · 2023-10-03T16:50:51.000000Z

> Update: I think I see what happened. Xata is a serverless DB service, and this is a tool they wrote that can be used independently of their service (I assume). The subscription options presented to the user look like they are related to this CLI tool, but they are in fact for the broader Xata service.

Yes, that's right. Xata is a Postgres-based service and we're working on exposing the Postgres DB directly and unrestricted to our users. As part of this, we're also open-sourcing parts of the platform. We'll have more such open source projects soon.

This is not exactly the case with pgroll, as its approach is different from what we do today in Xata, but we'll be incorporating pgroll in Xata soon.