Deno KV internals: building a database for the modern web (deno.com)
230 points by avinassh 8 months ago | 94 comments



I think Deno KV is a great tool with a nice API, it's great to see how it works. Really well designed.

I used it a couple of times locally with Sqlite for CLI apps, if you want to do some data manipulation stuff with TS from the CLI and need a db, don't look further.

I also used it in Production with FoundationDB on Deno Deploy.

It does not replace your postgres/mysql database but a different beast entirely for many reasons. One is pricing. You pay per read and write.

An issue I had is that it's hard to migrate away from KV on Deno Deploy. You can migrate to a local SQLite-backed instance, but you will need to develop your own migration tooling, and the migration costs more the larger your database gets because you pay for reads.

I do think it's great, but I would recommend using Deno Deploy only if your reads and writes produce value that offsets the costs; otherwise you can find yourself needing to migrate.

For example, use it for features you offer to authenticated users, but don't use it for things available in the open, else you open up yourself to high fees from DDOS.


> I used it a couple of times locally with Sqlite for CLI apps, if you want to do some data manipulation stuff with TS from the CLI and need a db, don't look further.

Are any of these tools open source? Would love to look at what you're doing


I often build project-specific tooling, to generate code or just transform some data, but maybe someday I'll create a FOSS version I can share.

Stuff like parse some XML or JSON and output Go structs and functions and TypeScript types and functions, HTML, React components, SQL tables, stored functions, PL/pgSQL.

Mostly to avoid writing boilerplate when I use the same data structures in the database, middleware, client. For simple CRUD apps, it works well. I use local KV to track changes so I don't have to rerun things I don't need.
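
A minimal sketch of that kind of change tracking, assuming a fingerprint per input is stored and compared before regenerating. The `Map` stands in for a local KV store, and `fnv1a` and `runIfChanged` are hypothetical helper names, not part of any Deno API:

```typescript
// FNV-1a-style 32-bit hash over UTF-16 code units.
// Good enough for change detection in a sketch; not for security.
function fnv1a(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

// Run `generate` only when the input's fingerprint differs from the stored one.
// Returns true if the generator actually ran.
function runIfChanged(
  seen: Map<string, number>,
  name: string,
  input: string,
  generate: (input: string) => void,
): boolean {
  const fingerprint = fnv1a(input);
  if (seen.get(name) === fingerprint) return false; // unchanged, skip the work
  generate(input);
  seen.set(name, fingerprint);
  return true;
}

const seen = new Map<string, number>(); // stand-in for a persistent local KV
let runs = 0;
const generate = (_input: string): void => { runs += 1; };

const first = runIfChanged(seen, "users.json", '{"id": 1}', generate);
const second = runIfChanged(seen, "users.json", '{"id": 1}', generate);
const third = runIfChanged(seen, "users.json", '{"id": 2}', generate);
console.log(first, second, third, runs);
```

For this to survive between CLI runs, the fingerprints would have to live in a persistent store rather than an in-process `Map`.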

But Deno is great for CLI reporting tools or Scheduled tasks, fetch and aggregate data.

I think of Deno as a little Swiss army knife. It's a tool that's got everything built in.

I use the Repl a lot, just for a specific task, get it done then move on.


You can now back it up to Google Cloud Storage and Amazon S3. KV is still in beta, it's only going to get better.


For reads, maybe in-memory caching would help with DDOS?
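
A rough sketch of what that could look like, assuming a read-through cache with a TTL sits in front of the billed read. `ReadCache` is a hypothetical class, `load` stands in for a KV read, and the clock is injected so the example is deterministic:

```typescript
// Hypothetical read-through cache with a TTL; `load` stands in for a billed KV read.
class ReadCache<T> {
  private entries = new Map<string, { value: T; expiresAt: number }>();
  private load: (key: string) => T;
  private ttlMs: number;
  private now: () => number;

  constructor(load: (key: string) => T, ttlMs: number, now: () => number = Date.now) {
    this.load = load;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): T {
    const hit = this.entries.get(key);
    if (hit && hit.expiresAt > this.now()) return hit.value; // served from memory
    const value = this.load(key); // cache miss: one billed read
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    return value;
  }
}

let billedReads = 0;
let fakeTime = 0;
const cache = new ReadCache(
  (key: string) => { billedReads += 1; return "value-of-" + key; },
  1000,
  () => fakeTime,
);

for (let i = 0; i < 500; i++) cache.get("home"); // 500 requests, 1 billed read
fakeTime = 1500; // TTL has elapsed
cache.get("home"); // triggers a second billed read
console.log(billedReads);
```

This only bounds reads for repeated keys. Requests for distinct keys each trigger a load, and this toy map has no size cap, so it would need eviction before facing hostile traffic.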


Deno charges for "inbound HTTP requests", so the DDOS can just query uuids till your checks start to bounce


As a FoundationDB partisan, it's been great to see more and more people starting to build stuff with it. Someone recently told me that they think of FDB as "assembly language for databases". I liked that. I think it faithfully captures both the good and bad parts of FoundationDB:

1) You are starting with a blank canvas

2) You will be dealing with low-level primitives

3) You have complete freedom in what you build

4) It's fast if you use it right

For each one of these items you could look at it and say "that's awesome" or "yikes." You know which group you are in!

I hope more people in the 'awesome' group realize that using FoundationDB can transform the development of a new distributed data store from a multi-year project into a few weeks of hacking. This is because FoundationDB solves the scalability, fault tolerance, and ACID parts for free--the least creative and most time-consuming aspects.


Do you feel like the tutorials on https://apple.github.io/foundationdb/tutorials.html#fundamen... are a good way for someone to get an impression of what the experience of working with FoundationDB feels like in practice? Are there any go-to walkthroughs that you'd recommend?


That class scheduling one is a good place to grok the basics. Just extending the kind of approach introduced there will get you decently far.

The best resource after the tutorials is: https://github.com/FoundationDB/awesome-foundationdb

Unfortunately, I think a lot of the advanced 'tricks of the trade' (alternatives to use cases for long-running transactions, when exactly to dispatch to a cloud object store, how to migrate whatever schemas you end up creating, etc.) that all the big serious users of FDB are doing are not as well covered.


I for one would be an avid audience if you ever wanted to blog about any of that!


Yes, I’m sad that still nobody has done a rock solid implementation of the Postgres API… I would use it all the time if someone did. I would have the best of both worlds.


"Deno KV" looks uncomfortably shoehorned into Deno. Some people are of course gonna use it if it's built in, but it doesn't look something that a good language/platform would ship with, if it wasn't for the money.


Seems like these companies want to be like Vercel or Supabase, a full backend as a service where you just write code and deploy and the backend handles everything for you.


Doesn't just seem like it, follow Guillermo Rauch's investments.

The same game plan every. single. time. Get into the JS developers' zeitgeist, wrap an existing cloud service with a new frontend, sell it at a markup.

Bonus points if the services can sell for each other.


True, however the DX is outstanding for Vercel, and the price is pretty decent enough, even though I know they all just wrap AWS. If I don't have to deal with AWS, I would (and do) pay extra to not do so.


Vercel is not a problem. Pumping millions of dollars into the JS ecosystem through sponsorships and events to define the development landscape in terms of what helps your bottom line... that's bad.

React has RSC because Vercel wanted to fight Shopify. NextAuth was practically bought out by Clerk to serve as sales funnel. <img> tags are marked as "not good" by a leading React framework because the hosting provider behind it wants you to pay them for image optimization.

What Rauch is doing is the developer equivalent of private equity squeezing, and what's insane is how well it's working.


Another example that particularly riles me is that NextJS forces the "edge runtime" (from Vercel) on middleware even when your server is otherwise running in nodejs.

The implication is that you can't do anything that relies on node APIs that edge doesn't support which can be quite limiting.

There's rumour that they're walking back this decision, but it always struck me as an arbitrary way to "encourage" codebases to make themselves compatible with edge which in turn would make deploying using Vercel easier.

(In general I'm reasonably happy using NextJS, though there are many architectural decisions that I find frustrating)


> React has RSC because Vercel wanted to fight Shopify

Dan Abramov (react core team) already said that the React team wanted to do RSC, Vercel were the followers that were eager to integrate and productionize what the React team wanted to do.

NextAuth is a competitor to Clerk auth. How is it bought out? Because Vercel pays an OSS developer to further develop NextAuth, and Guillermo also invested in Clerk? Someone using NextAuth means they're not using Clerk.


This is the most accurate take I've seen on this space. Instead of actual innovation we are being gaslit into overpaying and metered into every aspect of DX.

As I've worked in PE companies, I know all too well how they operate. It's just a shame that developers are naive and clueless about it.


Why shouldn't a hosting provider create a framework optimized for their infrastructure? How is this a bad thing? They aren't stopping anyone from using or creating other frameworks.


Eh I'll disagree with much of that. RSCs are amazing and basically what I've been looking for for quite a while, using TypeScript and JSX as a templating language for HTML, as I used to use PHP, but in a much safer way. Similarly, the image optimization is a good thing too, not necessarily that it goes through Vercel's servers but good in general. Dealing with srcsets with many images can be a pain and you can always use alternative image optimizer too, even open source local ones like sharp. Finally, there are lots of alternatives to what you're stating, no one forces you to use NextJS over, say, Remix or Vite with SSR. You can even not use React at all, or even JS / TS, there are lots of languages that compile HTML together.


You're free to disagree but you didn't disagree at all.

No one is saying they're not useful at all, the problem is Vercel (or really Rauch's entire cartel of companies) strong arm different technologies and topics to fit a narrative that's disproportionately in favor of "use our thing".

RSCs are not amazing if you're not Vercel and don't have access to their closed build format (especially after they killed the serverless output format)

I use image optimizers, I'm not about to go around telling people that img tags are bad.

> Finally, there are lots of alternatives to what you're stating, no one forces you to use NextJS over, say, Remix or Vite with SSR.

Remix had to reject RSCs to start because as one might expect, having one (1) design partner doesn't make for the most fully baked proposition.

Also the "there's other choices" refrain ignores the fact that no one is developing in a vacuum. Vercel is pumping enough money into the ecosystem that the mindshare is shifting towards Next regardless of technical merit. So it's not enough to say "who cares if Vercel distorts what Next is, just use something else"


Other people are building RSC that are faster and better. Expect some announcements later this year.


Remix is adding RSCs in the future, they've already stated. And Vercel isn't as big in influence as you think, many are also having problems with NextJS that are causing them to go elsewhere such as Vite or Remix as I mentioned. There is no monopoly in the frontend.


The DX used to be outstanding for Vercel, but for any new micro-cloud product some story of github-to-deploy or even drag-folder-here is now table stakes; look at Cloudflare for example. Vercel still has the best website dashboard for the other parts of running a website that isn't just deployment.

One problem I saw with Vercel, and a reason why I steered people away, was that they were very slow to react to some of the challenges of serverless workflows, like building their own in-house services to reduce latency or allowing different kinds of serverless tiers for longer-lived workers. You could hit growing pains almost instantly.


Yes, vendor lock in. That's why you need to be careful what you use it for. If you rely on it too much, you can't get away from it, and it can end up costing you a lot of money with the pay per request pricing.

I don't want to write negatives about it, it's a well thought out product. But it's not free. It is a paid service at the end and that is fine as long as you know what you are getting into.


And most people don’t see through this; they become over the top fans, shoehorning it into every tech stack convo until they notice they are locked in (or change jobs) and then start the entire thing from start again with the tech du jour. Seems fine if you build stuff that is to survive 1-2 years at most, but how teams plan long term on this stuff, I don’t know. We have fast kv, easy to install, not vendor locked in and open source for a long decade+ time. I would like stable, not slightly more convenience that becomes inconvenient once you think about it enough.


As many of the comments here are getting at, KV is best suited for making your app state available at all edge servers at Deno Deploy.

You can also use localStorage in Deno but it won't synchronize the data between edge servers on Deploy, so you have to figure out another strategy.

and of course KV works without Deploy, so while you don't get the data distribution benefit, you can still use KV on other deployment platforms.

The alternative, of course, is to use another database. You don't have to use KV in your Deno apps.

Maybe what people want here is the ability to configure their own data synchronization mirrors? Is that already implemented?


Edge Servers seem to be one of the most misused terms in the JS ecosystem

Anyone know where Deno’s “Edge Servers” are? Is it any more than EC2 instances? Because it doesn’t seem to follow the Akamai, Cloudflare, Fastly type approach of actually deploying their own hardware PoPs at the edge


and don't get me started on "Serverless" which often involves servers to my own befuddlement.


I guess the question is, is the interface to Deno-KV specified well enough that it could be replaced by another back-end?

Is it a "replaceable part"?


make it work, make it good, make it expensive.


For “good” meaning “we do all the sre/sysadmin stuff for you”. I am not sure it would apply for all the design decisions going in these things.

I think it speaks to how out of control many sre groups are that people willingly spend so much on tools to avoid them. This has real echoes of how the cloud got traction as a way to avoid internal IT.


"it doesn't look something that a good language/platform would ship with"

Why do you think that?


They have a huge incentive to not publish an open source backend that can run locally and isn't SQLite, to inhibit you from actually using Deno KV in production (beyond the SQLite limits) outside of their cloud. Check out this quote from the OP:

> So, we set out to build two versions of the Deno KV system, with the same user-facing API. A version of Deno KV for local development and testing, built on SQLite, and a distributed systems version for use in production (especially on Deno Deploy). This post is about the implementation of the production, distributed systems version of Deno KV. If you’re interested in the SQLite implementation of Deno KV, you can read the source code yourself as it is open source.

Intentionally crippled open source software for the sake of selling a cloud subscription really isn't a great feature Deno should ship.

I think this situation is very different than Supabase. Supabase really publish the whole thing as Git repositories, and you can run it on your own servers if you wish.


Welcome to the world of VC-driven tech ecosystems!


Can a language that is not running on top of a VM pull this off? I do not think so? Which is why it seems to me we need to revisit JVM for example.


Why do you say so? It’s not that complex what Deno is doing (though I am not sure the complexity is needed over SQLite with replication).


We actually implemented FoundationDB for our Python focused cloud platform [0] recently too.

We found it to be super easy to set up and scale. Setting up FDB is literally install a package and make sure a file gets to all nodes in a cluster and you’re good. It also integrated really well with our function runtime.

You get a lot of distributed goodies for free in FDB with little effort and they “stack” their primitives very well too. As an example, there’s a built in multi-tenancy layer that is just using key prefixes under the hood, but it’s built in natively and can be accessed by the higher level apis.
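
To illustrate the general idea of tenancy via key prefixes (this is a toy over a plain `Map`, not the FoundationDB tenant API; `PrefixedStore` and the `tenant/...` prefixes are made-up names):

```typescript
// Toy illustration of tenant isolation by key prefix over one shared keyspace.
// PrefixedStore is hypothetical; it is not the FoundationDB tenant API.
class PrefixedStore {
  private backing: Map<string, string>;
  private prefix: string;

  constructor(backing: Map<string, string>, prefix: string) {
    this.backing = backing;
    this.prefix = prefix;
  }

  set(key: string, value: string): void {
    this.backing.set(this.prefix + key, value);
  }

  get(key: string): string | undefined {
    return this.backing.get(this.prefix + key);
  }

  // List only this tenant's keys, with the prefix stripped off again.
  keys(): string[] {
    return Array.from(this.backing.keys())
      .filter((k) => k.startsWith(this.prefix))
      .map((k) => k.slice(this.prefix.length))
      .sort();
  }
}

const shared = new Map<string, string>();
const acme = new PrefixedStore(shared, "tenant/acme/");
const globex = new PrefixedStore(shared, "tenant/globex/");

acme.set("user/1", "alice");
globex.set("user/1", "bob");
globex.set("user/2", "carol");

console.log(acme.get("user/1"), globex.keys(), shared.size);
```

Both tenants write `user/1` without colliding because the underlying keys differ. In an ordered store the same prefix also makes a tenant's data one contiguous key range.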

It’s interesting that Deno went with a full separate transaction layer per region on top of a global cluster instead of doing regional clusters and making one region the primary writer and then doing request coalescing.

[0] https://www.bismuthos.com


I'm used to KV systems being really simple shallow systems. WASI-keyvalue for example is quite simple, stopping at providing an increment capability. https://github.com/WebAssembly/wasi-keyvalue/blob/main/wit/a...

I knew Deno KV was built on FoundationDB, and expected this would be a neat but simple systems architecture rundown. But... Turns out Deno really went super deep in building a KV, by adding atomics!

> To maximize performance and concurrency for atomic operations, we built the Deno KV Transaction Layer that manages the global order of atomic operations. For each transaction that is received by this Transaction Layer:

I thought it was particularly creative how a .get returns not just the value, but some reference to what the get was. So when the atomic change comes, they can check the get. This was a neat way to let some sets use previous data safely, I thought.
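
Roughly, the pattern can be modeled in memory like this. It is a sketch of the idea only, not Deno KV's implementation: `TinyKv`, `get`, and `commit` are hypothetical names, and the versionstamp here is a plain counter.

```typescript
// Toy in-memory model of "get returns a version, commit checks it".
// TinyKv and its methods are hypothetical, not the Deno KV API.
type Entry<T> = { value: T | null; versionstamp: number | null };

class TinyKv<T> {
  private data = new Map<string, { value: T; versionstamp: number }>();
  private clock = 0;

  // Return the value together with the version it was written at.
  get(key: string): Entry<T> {
    const hit = this.data.get(key);
    return hit
      ? { value: hit.value, versionstamp: hit.versionstamp }
      : { value: null, versionstamp: null };
  }

  // Apply `sets` only if every checked key still has the version the caller saw.
  commit(
    checks: { key: string; versionstamp: number | null }[],
    sets: { key: string; value: T }[],
  ): boolean {
    for (const c of checks) {
      const found = this.data.get(c.key);
      const current = found ? found.versionstamp : null;
      if (current !== c.versionstamp) return false; // someone wrote in between
    }
    this.clock += 1;
    for (const s of sets) {
      this.data.set(s.key, { value: s.value, versionstamp: this.clock });
    }
    return true;
  }
}

const kv = new TinyKv<number>();
kv.commit([], [{ key: "balance", value: 100 }]);

// Two writers read the same version of the key.
const seenByA = kv.get("balance");
const seenByB = kv.get("balance");

// A commits first; B's check then fails because the version has moved on.
const okA = kv.commit(
  [{ key: "balance", versionstamp: seenByA.versionstamp }],
  [{ key: "balance", value: (seenByA.value as number) - 30 }],
);
const okB = kv.commit(
  [{ key: "balance", versionstamp: seenByB.versionstamp }],
  [{ key: "balance", value: (seenByB.value as number) - 50 }],
);
console.log(okA, okB, kv.get("balance").value);
```

In this toy model a commit with no checks always succeeds, so a writer that forgets the check would silently overwrite the other writer's update.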

Does anyone else have any other examples of thicker kv APIs? It feels like we're nearing a cap'n'proto level of promise pipelining, as we extend kv this way; I thought Deno's dodge was extremely expertly picked to limit how complex references to existing operations would need to be.


Half of HN posts are about poor implementations of what's already a first class citizen in the Erlang runtime / Elixir ecosystem.


Sort of. I was using Erlang in 2007, well before switching to Node/JS. Erlang/Elixir has its downsides, which is why I switched.

The entire industry has been twisting itself in knots trying to solve ops problems that Erlang/OTP solved in software long ago.

I'm trying Deno Deploy though, because it seems like an attempt to combine those benefits with the JS ecosystem. That has advantages in: Language usability, frontend/isomorphic, libraries, and serverless.

So far it feels like the future. Something like this will be, though I'm almost expecting a new language for it.


For non-Erlang people, what’s the reference?



I don't get how this is comparable to FoundationDB or DenoKV



Deno KV has a maximum record size of 64 KiB [1] which is pretty severe. Compare with 25 MiB for Cloudflare Workers KV [2]. It can be worked around, but it will make your design more complex if you have potentially large textareas, let alone images. Hopefully it will be raised by the time Deno KV gets out of beta.

[1] https://deno.land/api@v1.43.2?s=Deno.Kv

[2] https://developers.cloudflare.com/kv/platform/limits/
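
One possible shape of such a workaround, sketched over a plain `Map` rather than a real KV store: split the value into parts under the limit and record how many parts there are. `putChunked`, `getChunked`, and the key layout are hypothetical.

```typescript
// Split a large value into parts under a size limit, plus a small manifest.
// Hypothetical helpers over a Map; not part of Deno KV.
const LIMIT = 64 * 1024;
type Stored = Uint8Array | number;

function putChunked(
  store: Map<string, Stored>,
  key: string,
  bytes: Uint8Array,
  limit: number = LIMIT,
): number {
  let count = 0;
  for (let offset = 0; offset < bytes.length; offset += limit) {
    store.set(key + "/chunk/" + count, bytes.slice(offset, offset + limit));
    count += 1;
  }
  store.set(key + "/count", count); // manifest: how many chunks to read back
  return count;
}

function getChunked(store: Map<string, Stored>, key: string): Uint8Array | null {
  const count = store.get(key + "/count");
  if (typeof count !== "number") return null;
  const parts: Uint8Array[] = [];
  let total = 0;
  for (let i = 0; i < count; i++) {
    const part = store.get(key + "/chunk/" + i);
    if (!(part instanceof Uint8Array)) return null; // missing or corrupt chunk
    parts.push(part);
    total += part.length;
  }
  // Reassemble the parts in order.
  const out = new Uint8Array(total);
  let offset = 0;
  for (const part of parts) {
    out.set(part, offset);
    offset += part.length;
  }
  return out;
}

const store = new Map<string, Stored>();
const big = new Uint8Array(150 * 1024).fill(7);
const chunks = putChunked(store, "post/42/body", big);
const back = getChunked(store, "post/42/body");
console.log(chunks, back ? back.length : null);
```

This is the extra design complexity in practice: against a real store the parts and the manifest would need to be written atomically so readers never see a half-written value, and whether a store's transaction limits allow that is something to check.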


A big issue with Cloudflare KV (and Workers too) is that values do not persist at the edge. They are copied on demand much like a CDN and then expired after some time. This works great when you have a ton of traffic but sucks when you want consistent good performance with erratic traffic.

I wish CF had a premium tier where both Workers and KV would persist at the edge.


Durable Objects persist at the edge.


* An edge,


so when you make a change it's replicated to all regions?


No. A DO only exists in a single edge location. That's the entire point of Durable Objects.

D1 will replicate soon, but doesn't do so yet. My understanding is that it will replicate to ~5 locations and keep these locations fresh.


It's weird because all their marketing says it's distributed but it's not really


Distributed for thee but not for me


just use bun and sqlite on your own server. got burned by deno deploy and will never use deno again. serverless is useless and slow.


You can also just use sqlite and deno on your own server. Bun will also build their own cloud at one point. They need to monetise somehow.


Could you please elaborate on what happened?


yes. took part in a hackathon and decided to make my own webframework while doing it. It was supposed to be compatible with deno and bun and I chose deno deploy to deploy my stuff. But 17 hours before the deadline this happened: https://twitter.com/spirobel/status/1786665928954085562

deno deploy just showed a completely useless error message and stopped working.

I installed caddy on a vps and got it to work in 5 minutes.

I ripped out the deno compatibility and switched to bun sqlite instead of turso and the website got even faster.

That is because the server has the data in the sqlite db. No need to roundtrip to turso.

btw. the framework is open source now: https://github.com/spirobel/mininext

It is like a non bloated version of nextjs. My intention is to displace php and wordpress. It is possible to get started with just html and css knowledge, but you have the whole javascript ecosystem at your fingertips.

It has no external dependencies, just 3 modestly sized files you can understand in an afternoon and it will reload your browser and rebuild when you save your code.


I've been messing around with Bun, esbuild and TSX templates for generating static sites. Bun feels like the future.


I agree. It is so much fun.

Difference between deno and bun is night and day. They really shouldn't be put in the same bucket.


I read this article when it came out last year, and I still wonder if this could be implemented in a cross-runtime way.



Thanks for the link. They seem to build some cool libraries.


I happened to use foundationdb for the leaderboards for an experimental game recently, for which fdb was the main experiment.

The big difference was I used the “record layer”, however, naively believing this meant only needing to understand the record layer . . . No. In foundationdb it is vital to understand all the layers up to and including the one you are using, partly so you can make sense of the documentation, but also because the problems you run into are so tied to the semantics of the lower layers.

That said, the great thing about fdb is deploying it is so easy. It is like a secret weapon hiding in plain sight.

The game https://luduxia.com/showdown/


beautiful game!! all on the web :clap


Thanks! I also added them to my other game that's been here before: https://www.luduxia.com/whichwayround/

The funny thing is I modeled the service for leaderboards on a commercial product an old boss of mine wrote v1 of in PHP/MySQL over 20 years ago. https://web.archive.org/web/20040806114005/http://www.macros...

Games people end up with things like massively sharded mysql and replication fun. One of the nice potential things with fdb is keeping transactions within user scope, and not having to deal with the sharding/replication yourself, you just have to arrange the keyspace correctly instead. I have worked on games where people sorely underestimated the complexities of db replication and literally burned tens of millions in recovering from the mistake.


What are the defining features of this leaderboard model?


It's dumb, and that's a feature!

One of the main things about it is you don't want to be updating the service for every new game, so you defer as many decisions as possible to the game, and be ready for it to make changes over time. The nasty part is this significantly complicates extracting the data for the leaderboards as scores are added, but you do that once and it's done. On some level it's like mongodb with a special index type.

A trend I see with too many efforts is to be way too overly specific, classically things like claiming something shuffling byte buffers around needs to know about the data format. You get surprising mileage and long term velocity out of embracing the lowest levels first, and allowing them to be built on arbitrarily later.


The atomics system use cases with the check() calls sound error prone, do you just get wrong answers if you omit a field that's involved? Could they be automatically done for involved keys? (Or for preserving explicitness in API usage, error-signaled if missing)


This feels like an unnecessary layer on top? Increments can just be a function that uses an existing txn to read, incr, write. It feels like they implemented transactions on top of transactions, adding complexity?

Maybe I didn't read it right though


(I work on Deno KV)

A read-modify-write retry loop causes a high number of commit conflicts, e.g. for atomic increments on integers. Getting higher than 1/RTT per-key throughput requires the "backend" to understand the semantics of the operations - apply a function to the current value, instead of just checking whether timestamp(value) < timestamp(txn.start) and aborting commit if not.
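
A toy model of that difference, with "rounds" standing in for network round trips. Nothing here is Deno KV code, and the function names and the scheduling (all pending clients read the same snapshot each round) are simplifying assumptions:

```typescript
// Toy model: N clients increment one counter.
// Rounds stand in for network round trips; nothing here is Deno KV code.
type Cell = { value: number; version: number };

// Optimistic read-modify-write: every pending client reads, then tries to
// commit; only the first commit per round still sees an unchanged version.
function roundsWithRetryLoop(clients: number): { rounds: number; value: number } {
  const cell: Cell = { value: 0, version: 0 };
  let pending = clients;
  let rounds = 0;
  while (pending > 0) {
    rounds += 1;
    const snapshots: Cell[] = [];
    for (let i = 0; i < pending; i++) {
      snapshots.push({ value: cell.value, version: cell.version });
    }
    for (const snap of snapshots) {
      if (snap.version === cell.version) {
        cell.value = snap.value + 1;
        cell.version += 1;
        pending -= 1;
      }
      // otherwise: commit conflict, this client retries next round
    }
  }
  return { rounds, value: cell.value };
}

// Backend-applied operation: clients send "add 1" and the store applies each
// one to whatever the current value is, so nothing conflicts.
function roundsWithServerSideAdd(clients: number): { rounds: number; value: number } {
  const cell: Cell = { value: 0, version: 0 };
  for (let i = 0; i < clients; i++) {
    cell.value += 1;
    cell.version += 1;
  }
  return { rounds: 1, value: cell.value };
}

const retry = roundsWithRetryLoop(8);
const serverSide = roundsWithServerSideAdd(8);
console.log(retry, serverSide);
```

Under these assumptions the retry loop lands one increment per round trip on a hot key, which is the 1/RTT ceiling, while the store-applied add lands all of them in one.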


A small tangent on the subject of databases on top of FDB -- mvsqlite is such an insanely cool project and it looks like it's gone quiet. Any plans to pursue that further?

For those that haven't seen it, SQLite on top of FoundationDB: https://github.com/losfair/mvsqlite


Why not funnel all writes to a single region and leverage simpler transaction semantics?


Distributed consensus + fsync() determines the lower bound on commit latency. We have to wait for the data to be durable on a quorum of transaction logs before returning success for a transaction. That's usually 5-10ms, even within a single region.
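
As a simplified model of why the quorum sets the floor: if each transaction log takes some time to report the write as durable, the commit can be acknowledged when the quorum-th fastest log answers. The function name and the latency numbers below are illustrative, not measurements:

```typescript
// Simplified model: a commit is acknowledged once `quorum` logs report durable.
// Ignores everything except per-log (network + fsync) latency.
function commitLatencyMs(replicaLatenciesMs: number[], quorum: number): number {
  if (quorum < 1 || quorum > replicaLatenciesMs.length) {
    throw new Error("quorum must be between 1 and the number of replicas");
  }
  const sorted = replicaLatenciesMs.slice().sort((a, b) => a - b);
  return sorted[quorum - 1] as number; // the quorum-th fastest acknowledgement
}

// Illustrative numbers only: three logs, one of them slow.
const majority = commitLatencyMs([4, 9, 30], 2);
const allLogs = commitLatencyMs([4, 9, 30], 3);
console.log(majority, allLogs);
```

With a majority quorum the slow log does not hold up the commit, but the commit still cannot be faster than the second-fastest durable write.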


5-10ms for consensus is forever!


For user-based keys, that sounds nice. Multiplayer cases are the exception, and for those one might look for an alternative solution rather than KV. I don't remember reading that any other service offers this speed.


I have to wonder if it might have been easier to make something like localStorage, but call it serverStorage and add that into deno, then anyone could use both interchangeably. localStorage is well known and super simple to use.


Yes of course. 40 xhr requests to load a single page. Developers with no clue on the impact on what they are doing. Sorry a bit pessimistic here. Just had to deal with some serious similar issues in a team.


I'd be curious to know if the SQLite version would be suitable in production for a small to medium-sized project. Did anybody try it out?


(2023)


> Web apps very often require some persistent application state. Setting up a database involves numerous configuration steps and the subsequent integration of an ORM or other systems.

So this is a NoSQL argument.

> Atomic transactions: a database that provides atomic operations to ensure data integrity and enable complex application logic

Ok, so let's see if they mention "ACID". Nope. They do mention atomicity and durability, but they don't mention consistency and isolation.

So this is about NoSQL and NoACID, but it's not really stated this way.

K/V stores are great for many things. And not having a query language is great!! right up until you need one, then you get to be sad (I should know since I've had to write a query engine due to use of a K/V store).

[No]ACID is where NoSQL gets interesting, IMO. But NoACID doesn't mean you can't have a query language.

Anyways, if you're going to implement yet another K/V store you really should justify it in relation to other K/V stores that exist, not as "using RDBMSes is ETOOHARD".


paid SaaS features appearing in my language's runtime is... odd.

I remember on launch, Deno specifically did not add backwards compatibility for installing Node.js packages from npm, by design. It was supposed to be a "greenfield" approach to serverside javascript/typescript... buuut then they folded and added npm dependency mgmt to their tooling.

Some of the decisions in Deno feel like the "grow and find ways to monetize" strategy of your average vc-funded tech startup, but instead of being a SaaS it's your whole runtime.


One of the reasons .NET contracts were never widely adopted was that the infrastructure to make them useful was only available on VS Enterprise.

Since most companies go with VS Professional, zero adoption.

Same applies to unit testing code coverage on VS.


Yeah, the loss of Deno's original mission statement was saddening. I was hoping for a scrappy, but minimal and slowly growing runtime. Now that it's being pumped with VC money, the team is spreading themselves thin between KV, Deno Deploy, Fresh Framework, JSR, and NPM/Node compatibility.


KV is open source and can be self hosted


Not the version described in this article.


Fair. I think this article was written before they released the open version. Deno KV is still in beta. If you're running it on Deno Deploy then it's a paid service, otherwise you have the option of hosting it wherever you like, and connecting to it is still pretty straightforward (a URL + environment variable).


With what license?

(Several clicks in it looks like https://github.com/denoland/denokv is the repo and it's an MIT license.)


The built-in Sqlite-based implementation is free and fully functional. It should be useful for local apps.


SQLite3-based implementation of... what?

Oh. Deno KV uses SQLite3 under the covers. That's... funny.


Outsider perspective; wrapping SQLite to provide a simple k/v JavaScript object persistence that supports all the strangeness of their types feels valuable, no? Irrespective of the economics of Deno itself.


SQLite3 seems like overkill for a K/V store, and in particular it seems like a pessimization for a K/V store.

Besides, if you're going to layer a network service on top of SQLite3 then things like rqlite and similar start to look very good, though, admittedly a K/V store is much simpler.


It'd be more useful if Deno exposed a SQLite driver or included one in the standard library like Bun.


this is a big reason why i stay away from Deno

by actively seeking to meter DX, they've actually driven dollars to their competitors.



