Databricks acquires serverless Postgres vendor bit.io

codeflo · on May 30, 2023

I haven't heard of either of those companies. I don't even fully understand what Databricks does. But it's clear that they have no problem shutting down a production database offering with 30 days' notice, and have the gall to title this action "Investing in the Developer Experience". If this doesn't send a message that you shouldn't trust them with anything important, I don't know what would.

qsort · on May 30, 2023

> what Databricks does

It's an ancient African word that means "I am because I can't install Apache Spark".

fsociety · on May 30, 2023

Just install Apache Spark they said. It will be fun they said.

If you have the money, having a managed Spark instance with a bunch of added features can be a big win for some. There is a lot that goes into Spark maintenance.

nerdponx · on May 31, 2023

It also apparently includes some performance optimizations because they control both the hardware and software. And Delta Lake is pretty cool, as is the hosted MLflow integration.

sagarm · on May 31, 2023

Databricks built a proprietary vectorized accelerator for Spark they call Photon. It's not just that they've tuned OSS Spark especially well.

RBerenguel · on May 31, 2023

Back when I was a customer (before Photon was released, and also during) they had very good tuning, on the order of 2x faster for the workloads we had at the time (a very large graph computation and a “simple” filtering job).

rovr138 · on May 30, 2023

Databricks is a company by the people that built Spark.

They've extended it, and their platform does a lot now.

andruby · on May 31, 2023

What is Spark?

I assume that’s Apache Spark, which is described as a “unified analytics engine for large-scale data processing”.

Still not clear to me what to use it for :-/

rovr138 · on May 31, 2023

It is Apache Spark. It's a framework that allows processing large amounts of data in parallel on a cluster of computers.

You can use batch processing, streaming, do machine learning and graph jobs. You usually use Scala, Java, Python or R to write your code. The engine itself is written in Scala and runs on the JVM, so that's where the work ultimately executes. For example, in Python you'd use PySpark, and your DataFrame calls get handed down to the equivalent JVM operations, which are then executed.

I mainly work in Python, so I'm going to talk about some features there. It supports dataframes and exposes the data as Spark DataFrames. You chain operations, and those incrementally build a DAG. It's not until you execute, save, or ask to see the data that Spark optimizes the DAG and actually starts running it.
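To make the "build a DAG, run it later" idea concrete, here is a toy sketch in plain Python. It is not the PySpark API: `LazyFrame`, `plan`, and the sample rows are made up for illustration, and there is no optimizer or cluster involved. It only shows that transformations record steps while the action (`collect`) is what finally reads the data.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, Iterator, List, Tuple


def _project(rows: Iterable[dict], columns: Tuple[str, ...]) -> Iterator[dict]:
    # Keep only the requested columns of each row.
    for row in rows:
        yield {c: row[c] for c in columns}


@dataclass
class LazyFrame:
    """Toy stand-in for a lazily evaluated dataframe (hypothetical, not PySpark)."""
    source: Iterable[dict]
    plan: List[tuple] = field(default_factory=list)  # recorded steps, not yet run

    def filter(self, predicate: Callable[[dict], bool]) -> "LazyFrame":
        # Transformation: extends the plan, touches no data.
        return LazyFrame(self.source, self.plan + [("filter", predicate)])

    def select(self, *columns: str) -> "LazyFrame":
        # Transformation: extends the plan, touches no data.
        return LazyFrame(self.source, self.plan + [("select", columns)])

    def collect(self) -> List[dict]:
        # Action: the first point where rows are actually read.
        rows: Iterable[dict] = self.source
        for op, arg in self.plan:
            if op == "filter":
                rows = filter(arg, rows)
            else:  # "select"
                rows = _project(rows, arg)
        return list(rows)


events = [{"user": "a", "ms": 120}, {"user": "b", "ms": 900}, {"user": "c", "ms": 450}]
slow = LazyFrame(events).filter(lambda r: r["ms"] > 400).select("user")
print(len(slow.plan))   # 2 recorded steps, nothing executed yet
print(slow.collect())   # [{'user': 'b'}, {'user': 'c'}]
```

In real Spark the recorded plan is also rewritten by the optimizer before it runs, which this sketch leaves out.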

If you need something that Spark doesn't support, you can use regular Python, but because it won't get converted to Spark operations, it'll run on only one node and be limited. So you have to rewrite your code, optimizing for Spark.

You can process some data in memory, you can use disk, you can use databases, either as sources or targets.

A typical use case: load the raw data as it comes in, transform it into your intermediate states, then write out different tables based on what each consumer needs.

---

It's a framework that has an engine to manage code running on clusters, a language to interact with the data, abstractions and optimizations of the code, ways to store the data, checkpoints for optimizations, and other things.

kccqzy · on May 30, 2023

Wow you are right. The blog post doesn't even mention it but the home page https://bit.io/ does.

lucideer · on May 30, 2023

Slight oversimplification but Apache Spark is basically the "open core" to Databricks' commercial platform.

debarshri · on May 31, 2023

It probably was an acqui-hire. If the product was growing at a VC-investible rate, they wouldn't have sunsetted the product. Alternatively, maybe they are going to rebrand it into something that aligns with Databricks.

relativ575 · on May 30, 2023

> But it's clear that they have no problem shutting down a production database offering with 30 days notice

Maybe there is no production db left from paying customers?

codeflo · on May 30, 2023

The homepage suggests otherwise, but who knows: https://bit.io/

re-thc · on May 31, 2023

> I don't even fully understand what Databricks does

The naming is really confusing. When I brick my console it's broken. I'm not sure I want to brick my data :(

coolgoose · on May 30, 2023

Going to beat a dead horse, but 30 days to migrate your database over? I hope nobody was seriously using it in production, otherwise it's going to be a fun month for them.

tmountain · on May 31, 2023

Putting business critical data on a mom and pop service called bit.io was the first mistake.

klabb3 · on May 31, 2023

Reluctantly agreeing with you. So.. you can’t trust a small shop because an MBA corp dev team at some enterprise shop is always lurking around the corner. But if you go to the behemoth instead, you can get equally screwed because you don’t mean anything to their bottom line (see exhibit Google). The commercial software “service” industry is really fucked. I don’t want another tech bust, but we sure as hell deserve one.

singron · on May 31, 2023

I feel like we learned 20 years ago that buying proprietary software has a bunch of problems, so we switched to open source software. But in the last 10 years, we started buying software services, and now we have all those problems back again (corporate stability, vendor lock-in, principal agent problems, etc.). Maybe we will learn how to run our own software at some point without fully staffed teams of SREs?

re-thc · on May 31, 2023

> Maybe we will learn how to run our own software at some point without fully staffed teams of SREs?

Call them platform engineers?

pjmlp · on June 1, 2023

Because along the way the people creating open source software also have bills to pay, and found out that living outside Hotel Mama comes with its own set of caveats.

gibolt · on May 31, 2023

The real concern is for those who don't get the memo until day 31

thewataccount · on May 30, 2023

> Then, final database exports will be available for download through July 29.

Hope nobody was using bit.io as a set and forget solution...

Which I thought was the entire point of the cloud hosted databases?

ibejoeb · on May 30, 2023

> Your databases will continue to work through June 29.

This is crazy. 30 days to migrate? Hope nobody is taking a holiday in the next couple of weeks.

thewataccount · on May 30, 2023

I'm surprised Databricks is (effectively) willing to shutter a database service with 1 month's notice.

What does that say about their own products? What if you integrate their products and are locked to their platform without any easy migration options?

If they lose interest in one of their own services, you very well may have 1 month to move, and 2 months to have a chance at keeping your data.
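For anyone checking the "1 month to move, 2 months to keep your data" arithmetic, a quick sketch. The June 29 and July 29 dates come from the announcement quoted elsewhere in this thread; treating May 30, 2023 (the date of this thread) as the announcement date is my assumption.

```python
from datetime import date

announced = date(2023, 5, 30)      # assumption: announcement date = thread date
service_ends = date(2023, 6, 29)   # "Your databases will continue to work through June 29"
exports_end = date(2023, 7, 29)    # "final database exports ... through July 29"

days_to_migrate = (service_ends - announced).days
days_to_export = (exports_end - announced).days

print(days_to_migrate)  # 30
print(days_to_export)   # 60
```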

ibejoeb · on May 30, 2023

Seriously. Well, I guess their customer roster isn't all that impressive. Sounds like they're willing to burn them.

jen20 · on May 31, 2023

The problem is the signal it sends to Databricks' other customers about how they will be treated in future.

glogla · on May 31, 2023

Databricks is very much like Microsoft or Oracle - it is not sold by technical merit but by sales slides for CTOs. It is unfortunate but this will not impact their bottom line at all, because technical people already overwhelmingly don't want Databricks.

neilv · on May 30, 2023

Is it effectively Databricks that is shutting down an infrastructure solution on short notice?

I'm not thinking about whatever legal technicalities could be debated by lawyers, but what real-world truth is.

paulddraper · on May 30, 2023

You thought correctly.

inssein · on May 30, 2023

Damn, 30 days is quick. I found out about https://neon.tech but then quickly ran into a major bug, and then thankfully found out about bit.io, which is what I use for https://dittoed.app.

Looks like I will have to go back to neon (they fixed the bug).

If anyone has other ideas, I'm all ears. Project is hosted on Cloudflare and they have D1 now, but Dittoed uses a little bit of PostGIS.

singpolyma3 · on May 30, 2023

Have you tried supabase?

5Qn8mNbc2FNCiVV · on May 30, 2023

Neon sounds good for you. I'd wager any kind of managed database is fine, so the question is if you enjoy the features/cost savings Neon brings. Otherwise I cannot recommend using a managed DB enough because that's the best 20 bucks you're gonna spend.

ako · on May 31, 2023

AWS RDS, AWS Aurora or Retool database?

LispSporks22 · on May 30, 2023

We’ve been moving our workflows out of Databricks to PostgreSQL to save a ton. Wonder if what they’re going to do with this would have been handy at the time.

soulbadguy · on May 30, 2023

Where is the saving coming from, if I may ask? Are you guys using Databricks offerings or a self-managed Spark cluster?

llama052 · on May 30, 2023

I'm willing to bet they are moving from databricks offerings, considering their pricing is insanity.

glogla · on May 31, 2023

If your data fits into Postgres, using Databricks is just a plain waste of money.

lisasays · on May 31, 2023

Anything you'd be missing about Spark from doing so?

Would like to know more about the cost tradeoffs, also. Please elaborate.

dimfeld · on May 30, 2023

Corresponding statement from bit.io here: https://blog.bit.io/whats-next-for-bit-io-joining-databricks...

itsrobforreal · on May 30, 2023

2 months ago they had a blog post titled "bit.io’s new pricing Always available. Guaranteed performance. No surprises."

Surprise!

beoberha · on May 30, 2023

I wonder how much money is enough to give the middle finger to all your customers? Really disappointing to see.

thenipper · on May 30, 2023

That's a bummer, I really liked using bit.io for little experiments. That being said, I never paid for it so I can't really complain.

stumblers · on May 30, 2023

same, I'm a paying customer and liked it. I don't have 'production' level demands but was doing lots of prototyping and testing of ideas. Very easy to use and reliable enough to count on.

Stinky.

wferrell · on May 30, 2023

Enjoyed using bit.io. Excited to see what Adam, Jmo and crew do at databricks.

Was easy to export my dbs from bit.io -- did so this morning.

running101 · on May 30, 2023

Get ready for your bill to go through the roof.

barefeg · on May 30, 2023

Are there alternatives to the database service with similar API or is it leaving a gap in the market?

tuukkah · on May 30, 2023

Neon is an awesome serverless Postgres: https://neon.tech/

ilrwbwrkhv · on May 30, 2023

Till the time they are also bought and "sunsetted." That's the problem with all these shiny startups.

tuukkah · on May 30, 2023

Not this one though, they are open source so someone else would start to offer new hosted instances: https://github.com/neondatabase/neon

Some Helm charts: https://github.com/neondatabase/helm-charts

It could potentially be one of their partners:

Vercel https://neon.tech/docs/guides/vercel

Hasura https://neon.tech/docs/guides/hasura

edude03 · on May 30, 2023

> Some Helm charts: https://github.com/neondatabase/helm-charts

For the record though, they're not enough to run neon today[0] - this has been a "problem" since neon was announced here[1].

[0]: https://github.com/neondatabase/helm-charts/issues/35#issue-...

[1]: https://news.ycombinator.com/item?id=31540691

5Qn8mNbc2FNCiVV · on May 30, 2023

I found their docker-compose more helpful than their chart: https://github.com/neondatabase/neon/blob/main/docker-compos...

But I also needed to read their Ansible files to understand how they manage their infra better. Those are deleted now, but luckily you can just look at the history (commit that deleted it: https://github.com/neondatabase/neon/commit/0d3d022eb1fe4a42...)

tuukkah · on May 31, 2023

I found installation instructions for Neon including Ansible files here: https://percona.community/labs/serverless-postgresql/docs/in...

5Qn8mNbc2FNCiVV · on May 31, 2023

That is super helpful and also kinda weird that it's on the percona website and not somewhere in the Neon docs

tuukkah · on May 30, 2023

Yes, you'd have to do some work of your own to set up a direct competitor from the provided pieces.

They have published a new piece which is how they vertically autoscale Postgres in Kubernetes: https://github.com/neondatabase/autoscaling

nikita · on May 30, 2023

Neon CEO here.

Autoscaling with live VM migrations is quite cool. Here is a blog post on it: https://neon.tech/blog/scaling-serverless-postgres

And yes, the code is open feel free to use it!

edude03 · on May 31, 2023

Yeah I'm not complaining - I mention it because if you're a bit.io customer that wants to migrate to another serverless Postgres solution you won't be able to do it by "just" running the linked helm charts.

(In my opinion) when someone says "here's the helm chart", I assume running "helm install $THING" would give me a running version of $THING, so I mention it mostly so that no one has the wrong expectations (like I would).

pjmlp · on June 1, 2023

I keep expecting Vercel to be bought by Sitecore, given how they are replacing all the .NET stuff and pushing Vercel everywhere.

nikita · on May 30, 2023

Neon CEO. We are certainly not going to sunset Neon any time soon. We are extremely well funded and also growing super quickly. Expect some exciting announcements soon!

jfbaro · on May 31, 2023

CoW? So devs can have their ephemeral databases in seconds at no cost? Only the reads and writes they use and the “delta” storage of their thin clone?

Bitemporal support?

Smart anonymization (like Tonic.AI)?

Looking forward to hearing the announcements

rcoder · on May 30, 2023

Neon at least has open-sourced their core offering, which provides a migration path for folks who make bigger bets on their platform. So yeah, there's every possibility they'll go away at some point, but unlike a lot of SaaS offerings, it's all Postgres over the wire and under the hood, so you have plenty of migration options (OSS, another managed Postgres vendor, Aurora, Cloud SQL, etc.)
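Because it's plain Postgres over the wire, the generic migration path is the stock `pg_dump` / `pg_restore` pair. A minimal sketch that only builds the two command lines, assuming both tools are installed locally; the connection URLs and the dump file name are placeholders, and `dump_command` / `restore_command` are hypothetical helper names, not part of any vendor tooling.

```python
import shlex
from typing import List


def dump_command(source_url: str, dump_path: str) -> List[str]:
    # Custom-format archive; skip ownership and grants, since role names
    # rarely match between two hosted providers.
    return [
        "pg_dump",
        "--format=custom",
        "--no-owner",
        "--no-acl",
        f"--file={dump_path}",
        f"--dbname={source_url}",
    ]


def restore_command(target_url: str, dump_path: str) -> List[str]:
    # Restore the archive into an already-created, empty target database.
    return [
        "pg_restore",
        "--no-owner",
        "--no-acl",
        f"--dbname={target_url}",
        dump_path,
    ]


# Placeholder URLs, for illustration only.
old_db = "postgresql://user:secret@old-host.example/mydb"
new_db = "postgresql://user:secret@new-host.example/mydb"

print(shlex.join(dump_command(old_db, "mydb.dump")))
print(shlex.join(restore_command(new_db, "mydb.dump")))
```

To actually run them, pass each list to `subprocess.run(cmd, check=True)`. Extensions (PostGIS, for example) still need to be available on the target before restoring.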

nikita · on May 30, 2023

Neon CEO here. Definitely. Of course Neon storage is a distributed system and you need to know how to run it. But a) we can help, and b) Percona is a trusted partner of ours that can support self-hosting for you.

tuukkah · on May 31, 2023

Reassuring to know about the collaboration with Percona and that professional support for self-hosting is available if ever needed.

boomskats · on May 30, 2023

Second Neon, they know what they're doing. It's not their first rodeo.

brightball · on May 30, 2023

I don't know about serverless, but it's hard to beat Crunchydata for PostgreSQL these days. They're my go-to.

boomskats · on May 30, 2023

Just make sure you've got your pricing structure & terms negotiated and agreed with them well in advance of putting it into prod.

eevo · on May 30, 2023

Render's offering is really good. Backups, read replica, many common extensions included. Fairly cheap. https://render.com/docs/databases

pistoriusp · on May 30, 2023

[disclosure: I'm the founder of Snaplet]

I think there are a lot of different reasons why people may want to use a service like bit.io, but if you want a database with data in it to code against, run tests against, reproduce production related data-bugs, and run e2e tests against then check out https://www.snaplet.dev.

ed25519FUUU · on May 30, 2023

Amazon serverless Postgres aurora.

gdubya · on May 30, 2023

This is interesting... possibly a move by Databricks to try and build on their "data lakehouse" concept to counter the recent "Fabric platform" announcements at MS Build.

Databricks coined the "Delta lake" concept and are still (just about) leading the way, but Fabric has the potential from MS to take away that marketshare. Databricks need to improve their "serverless SQL" offering, and add a serious "data warehouse" component alongside the lake.

Scubabear68 · on May 30, 2023

Of all the stupid tech terms in the world, for some reason “data lakehouse” grates horribly in my head every time I hear it.

fshbbdssbbgdd · on May 31, 2023

I hope the marketer who came up with it got the lakehouse they were dreaming of.

vforgione · on May 30, 2023

Fabric may eat some of the descriptive analytics portion of Databricks’ lunch, but for core data engineering workflows there is nothing in the Fabric—or Synapse or Power BI—ecosystem that comes close.

There are other fatal flaws to the Spark implementation in Synapse that I think carried over to Fabric. Worst one is the clunkiness/inability to run multiple notebooks concurrently on a cluster.

itsrobforreal · on May 30, 2023

I'm perusing the Fabric docs and they are using Delta Lake, Spark and Azure Databricks as part of that solution

gdubya · on May 31, 2023

Fabric does not use Databricks, but both Databricks and Fabric rely heavily on Delta. Let's just hope that they remain compatible.

itsrobforreal · on May 31, 2023

Ah, so we're in the "extend" portion of the process

tapsboy · on May 31, 2023

This might be a play against Snowflake Unistore (https://www.snowflake.com/blog/introducing-unistore/)

snapcaster · on May 30, 2023

What benefits does one get from using bit.io or other equivalents compared to the AWS built-in Aurora? Is their offering different and I'm just confused by the jargon?

cccybernetic · on May 30, 2023

It takes < 10 seconds to go from no account to database w/ bit.io

perrygeo · on June 1, 2023

Is the time it takes to spin up a database really a primary concern for anyone but a hobbyist? Hopefully one would take far more than 10 seconds to address the actual concerns of database work: backups, replication, upgrade procedures, access control, settings tuning, required extensions, etc.

If anything, companies are drowning in a proliferation of siloed datastores and most are highly motivated to fix that situation; the exact opposite concern of "quickly spin up a new database".

lionkor · on May 30, 2023

*took, as it's shutting down

nostrebored · on May 30, 2023

Feature or liability depending on the market segment. To a ton of enterprise customers this is a nightmare.

unnouinceput · on May 30, 2023

"Serverless"... this word is thrown around so much nowadays that it has lost its original meaning. The same way the phrase "we're like a family" went from beloved in the '50s, to throwaway and meaningless in the '90s, to a red flag today when you hear it at a hiring interview, the word "serverless" is in its late '90s right now. One more decade and it will become just another red flag.

mborch · on May 30, 2023

The meaning is pretty clear: you don't manage compute, it scales up elastically based on demand, even all the way to zero. Ideally, it reacts quickly enough to changes in demand that you don't need to worry about it. Serverless is basically the original promise of the cloud.

Dunedan · on May 30, 2023

It's not as clear as you think, because companies are watering it down. Just have a look at what "serverless"-branded services AWS has published in the past few years.

Take "OpenSearch Serverless" for example: They claim "you only pay for the resources consumed by the workload", but even if you have an OpenSearch Serverless collection you don't use, you pay at least ~$690/month (and that's not even accounting for stored data)!

https://aws.amazon.com/opensearch-service/pricing/
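A back-of-the-envelope check of that ~$690/month floor. The two inputs are my assumptions about the 2023 list pricing (a minimum of 4 OCUs billed at $0.24 per OCU-hour), not figures stated above, so verify them against the pricing page before relying on this.

```python
OCU_HOURLY_USD = 0.24      # assumed price per OCU-hour
MIN_OCUS = 4               # assumed minimum OCUs billed for an idle collection
HOURS_PER_MONTH = 24 * 30  # 30-day month


def monthly_floor(ocus: int = MIN_OCUS, hourly_usd: float = OCU_HOURLY_USD) -> float:
    # Cost of keeping the minimum capacity running for a month with zero usage.
    return round(ocus * hourly_usd * HOURS_PER_MONTH, 2)


print(monthly_floor())  # 691.2
```

Under those assumptions, the bill never drops below this amount however little you use the service, which is the opposite of scaling to zero.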

thinkharderdev · on May 30, 2023

What was the original meaning? When I hear "serverless" I think basically:

1. I don't have to think about or manage any servers

2. Usage is metered at a very fine-grained level (per X requests to the API/per GB of data/etc)

3. No fixed cost. You only pay for usage.

Was there a different meaning originally?

unnouinceput · on May 30, 2023

Distributed apps. No central server involved. Peer to peer is one. Or each app is a server too and the information propagates in ripple like style. You connect to me, we sync, then another connects to me and this way the info that only you had now he has it too (and me of course). That's the original serverless idea. Not this walled garden crap with "cloud". Cloud is just a computer that is not yours and anything you put in there it's no longer just yours (or in most cases when you lose the account is no longer yours, period! - HN has plenty of horror stories from Google, Amazon, Microsoft that shit on people and call it rain).

lmm · on May 31, 2023

> Distributed apps. No central server involved. Peer to peer is one. Or each app is a server too and the information propagates in ripple like style. You connect to me, we sync, then another connects to me and this way the info that only you had now he has it too (and me of course). That's the original serverless idea.

I don't remember those ever being called "serverless". Certainly "peer to peer" or "distributed" have a lot more traction.

fdasvklaj432 · on May 30, 2023

well i for one am very happy i never found out about bit.io, which looks amazing and is something i would have used instead of fly.io unmanaged postgres.

newjersey · on May 30, 2023

Disclaimer: I was a non paying user and used it just to try out some code in dotnet entity framework and postgresql (at $work I only ever get to touch sql server but for hobby projects I thought it would be nice to do something that doesn't require paying Microsoft).

Bit.io is awesome. It just works. I mean, so does elephant, but bit.io has more storage. I never got very far with my learning and never did advanced db concepts like cross apply, so it was just simple entities and tables, but it worked just fine and, best part, no credit card required on file.

Fly sounds nice but I don't feel so good about having to give them my credit card number...

tracker1 · on May 30, 2023

Been playing with CockroachLabs (CockroachDB Cloud) as a cloud db platform, and relatively happy with my testing so far. It isn't completely pg compatible, and I do wish they'd expose a web-based query interface with better connection pooling characteristics.

That said, mostly PG compatible data types, indexes and queries, horizontally scalable with pay for what you use, free and reserved tiers.

candiddevmike · on May 30, 2023

Is this an acqui-hire?

monero-xmr · on May 30, 2023

Assume any acquisition without public terms done over blog post is an acqui-hire. But given the market that's still an accomplishment!

candiddevmike · on May 30, 2023

Are there good examples where an acqui-hire works out for the acquiring company? Seems like the acquiring company's culture is almost always at odds with the company being acquired and it causes the high performing teams they paid dearly for to leave.

I don't understand what a company hopes to gain doing stuff like this as the (long term) incentives don't seem to align.

monero-xmr · on May 30, 2023

If they truly don't want the tech - meaning this is a straight-up acqui-hire - then the employees of the acquired company continue to have a job, and ideally some sort of bonus or earn out for staying N months or years. It is a nicer landing than bankruptcy.

The executives of the firm being acquired usually don't come, unless they have some skillset the acquiring company needs. But they (hopefully) get a cash bonus for the successful acquisition.

Everything is negotiable of course.

aeyes · on May 30, 2023

The only way I have seen this work out is to give the acqui-hired team a ton of equity in the new company so that they don't jump ship immediately.

candiddevmike · on May 30, 2023

What does the acquiring company get out of this transaction though? What's the return on investment here if folks end up leaving or are completely checked out during their rest-n-vest? You can spend millions with a high end boutique consulting firm that will most likely be more accountable and productive than an acqui-hire.

tlarkworthy · on May 30, 2023

Firebase (which I was part of). Dunno if you count it as an acquihire if the product survives but I am pretty sure we did what the big G hoped we would

toddmatthews · on May 30, 2023

if you have a specific task at hand, it may be easier to buy a team that works well with each other and has expertise in a certain area to accomplish a specific task. not expecting them all to stay on forever

mousetree · on May 30, 2023

It's fairly common to not disclose the terms publicly.

lern_too_spel · on May 30, 2023

bit.io is shutting down its service and telling all its customers to find a new solution, so very likely, yes. https://bit.io/

moneywoes · on May 30, 2023

How was bit.io different from say supabase

rovr138 · on May 30, 2023

It's an actual database.

CharlesW · on May 30, 2023

Supabase is also Postgres-based. https://supabase.com/docs/guides/database/overview

rovr138 · on May 30, 2023

I understand that. But bit.io gave you a postgres database. Supabase gives you a firebase alternative that just happens to be built on top of postgres.

- https://supabase.com/docs/guides/getting-started/features

- https://docs.bit.io/docs/getting-started

kiwicopple · on May 30, 2023

(supabase ceo)

Supabase gives you a full Postgres database, we position ourselves as a Firebase alternative because we offer a few other bells-and-whistles. The database is just postgres[0] and so it has more compatibility than bit.io offered[1]

[0] https://github.com/supabase/postgres

[1] bit.io compatibility: https://docs.bit.io/docs/supported-sql

rovr138 · on May 31, 2023

Just dug more and I see it.

For anyone looking, their documentation on how to connect: https://supabase.com/docs/guides/database/connecting-to-post...

newjersey · on May 30, 2023

Also to piggyback on this, supabase deactivates unused (unpaid) instances just like planetscale does for MySQL.

eatonphil · on May 30, 2023

Congrats to Adam and the bit.io team!

thewataccount · on May 30, 2023

I don't mean to take away from getting hired at databricks.

But my understanding is they essentially got hired at Databricks? Maybe got a paycheck to do it?

Meanwhile they shuttered and abandoned the product and all customers.

Is it really the goal to make an MVP and plan to get noticed and acquired, rather than actually making a product? "Customers can migrate in a month or not, we don't care"?

The product they made was literally meant to be a reliable solution. 1 month for all customers to migrate away? Really? That's assuming they see the announcement today, too; it would be so easy to miss the email/Hacker News post.

geodel · on May 30, 2023

Well, it is going to be an incredible journey for customers now.

newjersey · on May 30, 2023

Context for new readers

https://ourincrediblejourney.tumblr.com/

pedrobtz · on May 31, 2023

Is there a class of problems where Databricks Spark might be the best perf/cost solution?

unixhero · on May 30, 2023

Aha consolidation in the space

I think I've seen this before

xiaodai · on May 30, 2023

i might have been the only person using bit.io and saw the potential there. applied for a job but got rejected. meh

tadhunt · on May 30, 2023

Congrats Adam, Jmo & team!

colesantiago · on May 30, 2023

a great and incredible journey it has been.