Launch HN: Drifting in Space (YC W22) – A server process for every user (driftingin.space)
108 points by paulgb on Feb 28, 2022 | 57 comments
Hi HN, we’re Paul and Taylor, and we’re launching Drifting in Space (https://driftingin.space). We build server software for performance-intensive browser-based applications. We make it easy to give every user of your app a dedicated server-side process, which starts when they open your application and stops when they close the tab.

Many high-end web apps give every user a dedicated connection to a server-side process. That is how they get the low latency that you need for ambitious products like full-fledged video editing tools and IDEs. This is hard for smaller teams to recreate, because it takes a significant ongoing engineering investment. That’s where we come in—we make this architecture available to everyone, so you can focus on your app instead of its infrastructure. You can think of it like Heroku, except that each of your users gets their own server instance.

I realized that something like this was needed while working on data-intensive tools at a hedge fund. I noticed that almost all new application software, whether it was built in-house or third-party SaaS, was delivered as a browser application rather than native. Although browsers are more powerful than ever, I knew from experience that industrial-scale data-heavy apps posed problems, because neither the browser nor a traditional stateless server architecture could provide the compute resources needed for low-latency interaction with large datasets. I began talking about this with my friend Taylor, who had encountered similar limitations while working on data analysis and visualization tools at Datadog and Uber. We decided to team up and build a company around solving it.

We have two products, an open source package and a managed platform. Spawner, the open source part, provides an API for web apps to spawn a session-lived process. It manages the process’s lifecycle, exposing it over HTTPS, tracking inbound connections, and shutting it down when it becomes idle (i.e. when the user closes their tab). It’s open source (MIT) and available at https://github.com/drifting-in-space/spawner.

Jamsocket is our managed platform, which uses Spawner internally. It provides the same API, but frees you from having to deal with any cluster or network configuration to ship code. From an app developer’s point of view, using it is similar to using platforms like Netlify or Render. You stay in the web stack and never have to touch Kubernetes.

Here's an example. Imagine you make an application for investigating fraud in a large transaction database. Users want to interactively filter, aggregate, and visualize gigabytes of transactions as a graph. Instead of sending all of the data down to the browser and doing the work there, you would put your code in a container and upload it to our platform. Then, whenever a fraud analyst opens your application, you hit an API we provide to spin up a dedicated backend for that analyst. Your browser code then opens a WebSocket connection directly to that backend, which it uses to stream data as the analyst applies filters or zooms/pans the visualization.
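As a rough sketch of how that flow might look from the browser side (the "/api/spawn" endpoint, message shapes, and renderGraph function below are illustrative, not a final API):

  // Illustrative sketch only: endpoint and message shapes are made up.
  async function openAnalystSession() {
    // 1. Ask the platform to spin up a dedicated backend for this session.
    const res = await fetch("/api/spawn", { method: "POST" });
    const { hostname } = await res.json(); // e.g. "abc123.yourapp.example.com"

    // 2. Connect directly to that backend and stream queries/results over it.
    const ws = new WebSocket(`wss://${hostname}/stream`);
    ws.onopen = () =>
      ws.send(JSON.stringify({ type: "filter", minAmount: 10_000 }));
    ws.onmessage = (event) => {
      const update = JSON.parse(event.data); // aggregated rows / graph deltas
      renderGraph(update); // hypothetical rendering function in your app
    };
  }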

We're different from most managed platforms because we give each user a dedicated process. That said, there are a few other services that do run long-lived processes for each user. Architecturally, we're most similar to Agones. Agones is targeted at games where the client can speak UDP to an arbitrary IP; we target applications that want to connect directly from browsers to a hostname over HTTPS. In the Erlang world, the OTP stack provides similar functionality, but you have to embrace Erlang/Elixir to get the benefits of it; we are entirely language-agnostic. Cloudflare Durable Objects support a form of long-lived processes, but are focused on use cases around program state synchronization rather than arbitrary high-compute/memory use cases.

We have a usage-based billing model, similar to Heroku. We charge you for the compute you use and take a cut. Usage billing scales to zero, so it’s approachable for weekend experiments. We have not solidified a price plan yet, but we’re aiming to provide an instance capable of running VS Code (as an example) for about 10 cents an hour, fractionally metered. High-memory and high-CPU backends will cost more, and heavy users will get volume discounts. Our target customers are desktop-like SaaS apps and internal data tools.

As mentioned, our core API is open source and available at https://github.com/drifting-in-space/spawner. The managed platform is in beta and we’re currently onboarding users from a waitlist, to make sure that we have the server capacity to scale. If you’re interested, you’re welcome to sign up for it here: https://driftingin.space.

Have you built a similar infrastructure for your application? We’re interested in hearing the approaches people have already taken to this problem and what the pain points are.




I love seeing more options appear on the horizon for doing stateful serverless work. This article[1] provides a little more motivation for the use cases:

> For quite a long time (and especially in the webdev world), there exists a perception that to achieve scalability, all our request handlers need to be as stateless as possible. In the world of the all-popular Docker containers, it means that all the app containers need not only to be immutable, but also should be ephemeral ... keeping our request handlers stateless, does NOT really solve the scalability problem; instead it merely pushes it to the database.

Though the problems and solutions pointed out in that article don't mean you have to go straight to process-per-X. One solution might be, as mentioned in passing in the OP's launch blog, to keep state in a cache like Redis. If the data fits this approach, it would ease load on the database while allowing each request handler to remain stateless.
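For instance, a minimal sketch (using node-redis; the key scheme and queryDatabase helper are made up) of keeping per-session working state in Redis so the handler itself stays stateless:

  import { createClient } from "redis";

  const redis = createClient({ url: process.env.REDIS_URL });
  await redis.connect();

  // The handler stays stateless: the per-session working set lives in Redis,
  // so any replica can serve the next request for this session.
  async function handleFilter(sessionId: string, filter: string) {
    const cacheKey = `session:${sessionId}:${filter}`;
    const cached = await redis.get(cacheKey);
    if (cached) return JSON.parse(cached);

    const rows = await queryDatabase(filter); // hypothetical database call
    await redis.set(cacheKey, JSON.stringify(rows), { EX: 600 }); // cache 10 min
    return rows;
  }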

Durable Objects seem less focused on heavy computation, but I think they're really interesting as points of synchronisation for e.g. collaborative editing. Having all requests go into a _single thread_ seems important.

[1]: http://ithare.com/scaling-stateful-objects/


BEAM's ability to spawn lightweight processes is a lifesaver. A lot of people praise LiveView for letting you write SPAs without JavaScript, but the real killer feature is the ability to track a session for a user from the backend. Love how you guys are making that a first-class consideration for folks using less powerful platforms.

Did you guys build this on top of BEAM? My startup had a similar need for opening a process per user, and we ended up using a combination of Horde + GenServer to accomplish something similar. In our case, we spawn a process that maintains a WebSocket connection to an external service, maintain some state in there, and relay updates to the user over a channel. There is one per client.


We're not using BEAM directly, but I find it pretty neat and spent some time reading up on it when getting started with this. I'm pretty excited by what https://lunatic.solutions are doing as well, as an approach to bringing the ideas behind BEAM to WebAssembly. Ultimately, I explored WebAssembly for a while and realized that there was more of a market if we could run containers instead of just WebAssembly modules. (The result of my work in that direction lives on as Stateroom: https://github.com/drifting-in-space/stateroom)


"For quite a long time (and especially in the webdev world), there exists a perception that to achieve scalability, all our request handlers need to be as stateless as possible."

This is definitely overdue for a re-examination. "Web handlers should be stateless" goes all the way back to the 1990s, when a server system was lucky to have a single gigahertz of CPU, and even server systems might be loaded with the then-obscene quantity of maybe 128MB of RAM.

The solution is obviously not to just flip all the way to the other side. But the landscape has changed a lot since then. I've made a lot of hay out of very selectively stateful web services. It takes some care, but sometimes it honestly takes less care than trying to build completely pristinely stateless servers, because it's not like that's trivial all the time either!


Do you have any recommendations for resources that have been helpful with that? I'm trying to introduce some "tactical statefulness" into a mostly-stateless web backend and would love to find some giants to stand on.


Does the managed service actually require that each user get their own container? For some applications, particularly collaborative ones, it would make much more sense to have a container for each top-level thing that the users are collaborating on, e.g. one per document. I think Sandstorm [1] got this right with its concept of grains, and I've long wanted a tool that brought that model, a stateful container per high-level object, running arbitrary code (unlike Cloudflare Durable Objects), to the world of hosted SaaS. Speaking of Cloudflare, I'm looking forward to seeing what their edge containers can do, when that feature is eventually made public.

[1]: https://sandstorm.io/


> Does the managed service actually require that each user get their own container? For some applications, particularly collaborative ones, it would make much more sense to have a container for each top-level thing that the users are collaborating on, e.g. one per document.

Exactly right. We do not actually require that every user gets their own container; that's a decision that's entirely up to your app. Our API spins up an instance and returns its hostname, and then you can connect to it from as many clients as you like.
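To make that concrete, here's a hypothetical sketch of a per-document model (the spawn endpoint and its key parameter are illustrative, not the exact API):

  // Hypothetical sketch: ask for a backend keyed by document rather than user.
  // Two clients requesting the same key would get the same hostname back.
  async function backendForDocument(docId: string): Promise<string> {
    const res = await fetch("/api/spawn", {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify({ key: `doc-${docId}` }),
    });
    const { hostname } = await res.json();
    return hostname; // every collaborator on docId connects to this host
  }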


Looks like a cool project, but I am not sure I understand the need for one process per user.

Some questions:

- Why do you need one process per user? For low latency, would you just need to make sure you have idle CPU to serve their request, even if that CPU time is multiplexed onto an event loop (one event loop serves many users)?

- Wouldn't this "event loop" actually be more efficient than one process per user, as there would be less context switching cost from the OS?

- Can I just keep a map of (connection, thread_id) on my server, and spawn one thread per user on my own server?

- Could I just load up my server with many cores, and give each user a SQLite database which runs each query in its own thread?

- This way a multi GB database would not be loaded into RAM, the query would filter it down to a result set.


Good questions!

> Why do you need one process per user? / Wouldn't this "event loop" actually be more efficient than one process per user, as there would be less context switching cost from the OS?

We're particularly interested in apps that are often CPU-bound, so a traditional event-loop would be blocked for long periods of time. A typical solution is to put the work into a thread, so there would still be a context switch, albeit a smaller one.

The process-per-user approach makes the most sense when a significant amount of the data used by each user does not overlap with other users. VS Code (in client/server mode) is a good example of this -- the overhead of siloing each process is relatively low compared to the benefits it gives. We think more data-heavy apps will make the same trade-offs.

> Can I just keep a map of (connection, thread_id) on my server, and spawn one thread per user on my own server?

If you don't have to scale beyond one server, this approach works fine, but it makes scaling horizontally complicated because you suddenly can't just use a plain old load balancer. It's not just about routing requests to the right server; deciding which server to run the threads on becomes complicated, because you ideally want to decide based on each server's load. We started going down this path, realized we'd end up re-inventing Kubernetes, and decided to embrace it instead.

> Could I just load up my server with many cores, and give each user a SQLite database which runs each query in its own thread? This way a multi GB database would not be loaded into RAM, the query would filter it down to a result set.

If, for a particular use case, it's economical to keep the data ready in a database that supports the query pattern users will make, it's probably not a good fit for a session-lived backend. In database terms, where our architecture makes sense is when you need to create an index on a dataset (or subset of a dataset) during the runtime of an application. For example, if you have thousands of large parquet files in blob storage and you want a user to be able to load one and run Falcon-type[1] analysis on it.

[1] https://github.com/vega/falcon


> A typical solution is to put the work into a thread, so there would still be a context switch, albeit a smaller one.

The car wasn't fast enough, so we removed the rear-view mirror to lower the weight. You are looking at the sexy, fun-to-solve problem rather than the useful, boring solution of throwing away the stack. Users can already run things like Solidworks in a web browser with near-native performance using VDI.

> deciding which server to run the threads on becomes complicated because you ideally want to decide based on the server load of each

High end load balancers have done this since the 90s. This is now easily done with the nginx API.

Honestly I am sure there is some need somewhere for your stack. But hiring a good server/network operations team instead would have saved you a lot of code.


One way to look at it is that it’s like the architecture GitHub Codespaces uses internally, made available off the shelf. I don’t think using a VDI approach would make Codespaces a better product. In fact, I was partly motivated to build this by frustration with laggy VDI setups I had to deal with (though I don’t think VDI has to be bad).

> High end load balancers have done this since the 90s. This is now easily done with the nginx API.

A load balancer doesn’t (or at least shouldn’t) do everything we need to do, which involves statefully mapping hostnames generated on-the-fly to servers in a cluster. This allows our users to create instances that multiple clients can connect to, as opposed to just using “sticky sessions” or something like that.
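For a sense of the shape of the problem, here's a toy sketch of that hostname-to-backend mapping (not our actual implementation; it ignores WebSocket upgrades and sharing the mapping across proxy replicas):

  import http from "http";
  import httpProxy from "http-proxy";

  // Toy sketch: route requests by the hostname generated at spawn time.
  // In practice this mapping must be stateful and shared across proxy replicas.
  const backends = new Map<string, string>(); // hostname -> "http://10.0.3.7:8080"

  // Called when a new backend is spawned for a session/document.
  function registerBackend(hostname: string, address: string) {
    backends.set(hostname, address);
  }

  const proxy = httpProxy.createProxyServer({});
  http.createServer((req, res) => {
    const target = backends.get(req.headers.host ?? "");
    if (!target) {
      res.statusCode = 404;
      res.end("unknown backend");
      return;
    }
    proxy.web(req, res, { target });
  }).listen(8080);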

Our approach takes less code than you might think: we lean heavily on nginx and Kubernetes where we can, so we only need to fill in the missing pieces.


So it is kind of like server-less, but each instance:

- Persists for the lifetime of the user session.

- Only processes a single user session.

- Has large amounts of CPU/RAM and writable disk to handle large datasets.


Yes, that sounds about right.


I'll admit, I clicked on this "Launch HN" mostly because the name sounded cool, but after reading the description... the name doesn't seem particularly relevant to the business in any way, which can be fine, it's just interesting to note.

I am a little confused about the product purpose and the definition of who your competition is in the market. I think new SaaS hosting providers are interesting, so please don't take any of this the wrong way, just hoping to give you some space to expand on your ideas more.

> Here's an example. Imagine you make an application for investigating fraud in a large transaction database. Users want to interactively filter, aggregate, and visualize gigabytes of transactions as a graph. Instead of sending all of the data down to the browser and doing the work there, you would put your code in a container and upload it to our platform. Then, whenever a fraud analyst opens your application, you hit an API we provide to spin up a dedicated backend for that analyst. Your browser code then opens a WebSocket connection directly to that backend, which it uses to stream data as the analyst applies filters or zooms/pans the visualization.

You say "put your code in a container", but... wouldn't you basically have to put all your gigabytes of data into a container? The bottleneck to the types of analytic applications you're describing seems unlikely to be the custom backend code, and far more likely to be whatever database is powering the application, which means that each interactive instance really needs to spin up a complete copy of the dataset to gain any performance benefit for these on-demand analytic workloads.

I've worked with a number of high-scale applications, and scaling the backend API server has never been even remotely the main challenge... plus, having dedicated instances of the web server process wouldn't make anything faster than just having an appropriate number of instances, it would just make it more expensive. It's almost always a question of scaling the database -- not the API layer. For offline analytic workloads like you describe, you could potentially spin up fresh copies of the database for each user, and that would make things better, but the challenge of scaling (online) OLAP and OLTP comes from the shared-everything nature of the database itself. If you're intending to provide unique database instances to each user, then all the data needs to either be packaged up with the application, or stored somewhere that the application can retrieve it on startup and load the database, which could be a time-consuming process that creates painfully long cold starts.

> Many high-end web apps give every user a dedicated connection to a server-side process. That is how they get the low latency that you need for ambitious products like full-fledged video editing tools and IDEs.

> We have not solidified a price plan yet, but we’re aiming to provide an instance capable of running VS Code (as an example) for about 10 cents an hour, fractionally metered.

Since you bring up the examples of running GUI desktop applications, I'm wondering if your competition isn't actually AWS WorkSpaces. Someone could build an image for a WorkSpace that includes everything the analyst needs, and then AWS will manage the lifecycle of that instance as the analyst connects and disconnects, billing entirely based on usage. That image could even include vast quantities of data pre-populated into a database, along with a web server that offers local dedicated processes to serve requests from the browser in the WorkSpace, if the company prefers to develop their application's GUI using the web as a platform.

Obviously the challenge with WorkSpace is if you want to offer it to parties outside your company, but AWS does address this use case to some extent: https://aws.amazon.com/blogs/security/how-to-secure-your-ama...

A company could definitely address the nuances and automation of offering WorkSpace to third parties, but such a business would likely be extremely vulnerable to AWS just improving WorkSpace to include those features out of the box.


> You say "put your code in a container", but... wouldn't you basically have to put all your gigabytes of data into a container? The bottleneck to the types of analytic applications you're describing seems unlikely to be the custom backend code, and far more likely to be whatever database is powering the application, which means that each interactive instance really needs to spin up a complete copy of the dataset to gain any performance benefit for these on-demand analytic workloads.

You're right that it does depend a lot on the needs of a specific application. If a bunch of users are accessing the same dataset, and can constantly access the subset of data they need with low latency through a global index, and there isn't much need to do computation interactively at runtime, then a standard architecture is probably a better fit.

Where this approach is useful is if every user needs access to a different subset of the data (e.g. if the underlying dataset is petabytes, and each user needs to interactively explore a different gigabytes-big subset of it). Or if there is a lot of derived compute on top of it, for example, a graph visualization that needs to be updated when the user changes the subset of data in focus.

> I'm wondering if your competition isn't actually AWS WorkSpaces

The general approach of "run and render elsewhere and stream the pixels back" is definitely our competition in the sense that it's something companies currently do. What we provide is a way of moving the client/server boundary to wherever makes sense for your app: if it makes sense to render server-side and stream pixels, you can do that (although we don't yet support UDP, which would be useful in this case); if it makes sense to do data aggregation server-side but render through WebGL, that's also an option.


Do you have any resources to point me towards that elaborate on the benefits of a process-per-tenant/user for performance?

I work on a data-intensive app that fits the use case you describe, but I'm confused about the benefits for performance. (I can certainly see how the code would end up nicer/simpler.) Is this mostly applicable to certain stacks?


> Do you have any resources to point me towards that elaborate on the benefits of a process-per-tenant/user for performance?

Not yet, but we're working on some demos of things that are easier with session-lived backends. One way to think about it is that it's good for repeated queries against the same subset of data -- if you have a dataset of petabytes and your typical use case has users (through filters or queries) repeatedly accessing a sample of ~gigabytes of that data throughout a session, you could use a session-lived backend to materialize that subset of data in memory and quickly serve queries off of it without hitting the global index.

Another case where it comes up is when you need to do some stateful computation after loading the data, for example, if you need to generate a graph or embedding layout of some data and refine the layout when users select/deselect data.
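For intuition, a minimal sketch of what such a session-lived backend might do (the data-loading helper and query shape are made up for illustration):

  import { WebSocketServer } from "ws";

  async function main() {
    // Load the user's ~GB working set once when the session backend starts,
    // then serve repeated queries from memory instead of the global index.
    const workingSet = await loadSubsetFromBlobStorage(process.env.DATASET_KEY!); // hypothetical helper

    const wss = new WebSocketServer({ port: 8080 });
    wss.on("connection", (socket) => {
      socket.on("message", (raw) => {
        const query = JSON.parse(raw.toString()); // e.g. { minAmount: 10000 }
        const rows = workingSet.filter((r) => r.amount >= query.minAmount);
        socket.send(JSON.stringify({ count: rows.length, sample: rows.slice(0, 100) }));
      });
    });
  }

  main();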


That's Graphistry's last 7+ years, where we do frontend CPU/GPU visual analytics experiences <> GPU server processes. 100% endorse this design, it opens up a lot :)

Main tweak is that our model grew to "small client browser GPU/CPU session <> server-side multi-node multi-GPU time sharing." Current cloud services (Lambda etc.) fail here: cold starts, mostly CPU, etc., vs. bursty sticky GPU sessions. There's a lot more power when you can scale resource use... so 1 server process is kind of dinky. Good backend abstractions are tricky here though, so starting with 1 process makes sense as they figure out a sustainable revenue model, e.g., powering demanding visual intelligence apps is vastly different from powering commodity Netlify CRUD apps.


Thanks, that means a lot coming from one of the pioneers of the approach!

The fraud analysis use-case is actually semi-based on a real world experience I had building tools for fraud detection in adtech in 2013, where I often found myself taking a time-slice of a graph and loading it up in Gephi to compute a layout. I'd written other browser-based tools to make my work easier, but because I was shoehorning everything into a stateless backend, computing a large graph layout as part of it was tough. So when I saw what you were up to with Graphistry, it immediately resonated with me (though I was no longer working on fraud at the time).


Super cool!

Funny enough... We've stacked up ~4 incarnations of how we back our different kinds of sticky live GPU session workloads, and ironically, a big one we aren't doing but I keep wanting to see solved is user-defined GPU containers (vs our own). So, good luck!


1. How does this compare to MightyApp [0]? Does Mighty work on the UI side automatically, while Drifting in Space requires that the app has some kind of data layer separation to allow acceleration of just the data processing?

2. Is the data processing stream or batch?

3. Could Mighty + DiS work together to completely accelerate a data- and UI-heavy application?

Context: I have been working recently with a reporting-heavy company that is continually using data analysis to understand risk, combat fraud, and identify key patterns in user actions and data.

[0]: Mighty Makes Google Chrome Faster (YC S19) -- https://news.ycombinator.com/item?id=26957215


1. Mighty has some similarities in that they run a (Chrome) process for each user session, but it's quite different in that we are something that SaaS companies can build into their app, whereas Mighty is something that the end-user subscribes to and the SaaS provider doesn't need to know exists. I think Mighty is pretty neat, but it doesn't get around fundamental limitations of browsers, e.g. a WebAssembly process running in Mighty can't address more than 4GB of memory regardless of how beefy the machine Mighty runs it on is.

Since our product is built directly into the SaaS app, it's up to the app's developer to decide at what level they want to split the work between client and server. Doing everything on the server and streaming pixels is one option, but I suspect most applications will want to take a hybrid approach where they do some CPU/memory-intensive work on the server, stream the data to the client, and use the client's GPU (via WebGL/WebGPU) to render it. So that's the approach we're currently optimizing for, but better support for pixel streaming is on our radar too.

2. It's up to the application layer; we just provide a way to run a container and the data layer is up to the app.

3. Yes, an app served by DiS is just a regular web app, so you could use it in Mighty. Our hope is that because we shift some of the heavy computation to the server there are fewer use cases where this makes sense, but there could be cases where you want to do GPU-heavy rendering, which we don't yet provide.


Don't want to spoil the party, but you could get all of this and way more with Elixir, running on a platform that was designed for this. And with LiveViews, you could do without a client. Aren't you kind of reinventing the wheel?


I think the founders have run through an analysis much like this one I gave ten months back: https://news.ycombinator.com/item?id=27195000

Erlang/Elixir is a very neat little ecosystem, but in a lot of ways it's a dead end now. It was alone in its space for so long that it built a lot of ways of doing things that are kinda closed in on its own ecosystem, because there was no other ecosystem to reach out to to speak of, but now there's an abundance of choices and choosing Erlang means choosing something that is built on a lot of assumptions that don't match the world anymore. There may be some "reinvention" in building something on WASM and other communication mechanisms, but it's one with a path forward.

In particular, Erlang/Elixir have a lot of integrated solutions for modern code problems, but being either first or very early, none of them are best-of-breed anymore. You could think of them as the first draft of a lot of modern techs. Between that, and the fact that you can't build a business based on going to your customers and saying "Hey, everyone, I've got a great platform, just allocate the budget to rewrite your entire codebase into this somewhat obscure language and we'll make everything all better!", it just isn't a viable choice for a business, or at least not one that has any plans for growth. (And I don't mean VC-funded hypergrowth... I mean, the regular kind too.)


It’s true that Erlang/Elixir/OTP have this solved in that ecosystem, as I mention in the post. When we talked to dozens of teams that have internally built tech like this already, only one had gone that route.

In general I haven’t seen any really data-heavy apps in Elixir, are there examples I should be looking at? It could be interesting to compare performance.


Very cool! Reminds me a bit of Jupyter and the whole code-notebook world too. Spawner almost seems like a more general-purpose JupyterHub, which IMHO is a good thing (jhub is frighteningly complex to configure and set up these days).


> Spawner almost seems like a more general purpose JupyterHub

That's actually a very good way of putting it to people who understand the reference!

One of the things I've been playing with is actually using Spawner to spin up Jupyter Lab notebooks with their new(ish) collaboration feature. Jupyter and VS Code both work very nicely with Spawner's architecture out-of-the-box, since they can be put into a container and accessed entirely through an HTTPS connection.


Yeah, a 'spawn a VS Code server instance on these files' microservice could be super handy for lots of things. There are fantastic technical doc tools like mkdocs, mdbook, etc., but none of them have an editing interface. You could add an 'edit' button to their generated HTML that opens a spawned VS Code server instance on the files, and now you've got a little wiki / knowledge base that a small team can work from.


If anyone from fly.io is watching, I think it would be smart for fly.io to acquire this new company and integrate the concept, and maybe some of the implementation, into the Fly platform.


Interesting. Will it be possible to control Sweeper via API also?

I'm a solo open-source maintainer and have a popular project that people want to orchestrate many instances of. Each instance (a.k.a. session) is stateful and individually configurable. I'm excited to test out Spawner. Any company that makes it super simple for open-source maintainers to make money by providing a managed service will be a huge success -- from my initial thoughts, this looks to fit the bill.


Is the use-case you have in mind for a Sweeper API being able to shut down a pod based on an external event? We don't have a nice HTTP API for that yet (you could go through the Kubernetes API), but only because I haven't gotten around to implementing it. Would that serve the use case you have in mind?

If I can help with anything as you look into it, do let me know!


I don't know about the GP, but I would actually like to be able to keep the pod alive while it's doing some processing, in case the user wants to run a long process, go away, then come back later when it's done. Yes, I know there are other tools for orchestrating pure batch jobs, but I imagine some applications are a mix of interactivity and long-running computations.


That makes sense. We currently don't support that directly, although we have a “grace period”, which is how long it waits for a service to be idle before shutting it down. You could set it to a very high number and then have the service manage its own termination when it becomes idle. But that's a bit of a hack; first-class support for that use case is something I'll think about.


Here's one way you could implement first-class support for this use case. It's a bit of a hack, but it's simple. IIUC, the proxy is a sidecar, meaning it runs in the same network namespace as the main container. So the proxy could listen on a particular port on localhost, and as long as a connection is open to that port, the sweeper wouldn't touch that pod. Then the main container would just need to open a TCP connection for the period of time that it wants to make sure it stays running.
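In Node terms, the main container's side of that might look something like this (the port is arbitrary and assumes the proxy exposes such a keep-alive listener):

  import net from "net";

  // Hypothetical: the sidecar proxy listens on a localhost "keep-alive" port;
  // as long as this connection is open, the sweeper leaves the pod alone.
  const KEEPALIVE_PORT = 9999; // arbitrary, whatever port the proxy would listen on

  async function runLongJob(job: () => Promise<void>) {
    const sock = net.connect(KEEPALIVE_PORT, "127.0.0.1");
    try {
      await job(); // pod stays alive while the long-running work is in flight
    } finally {
      sock.end(); // closing the connection lets the pod be considered idle again
    }
  }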


If I understand correctly, Sweeper clears up sessions that haven't received a request in a certain amount of time. Essentially the use case would be to leave the session running until I ask Sweeper to clear it via an API request.

Just to illustrate where I'm coming from, what I have so far mimics the pm2 cli as an API with built-in reverse-proxy, with create (similar to init), reload, restart, start, stop and delete.


What are you using for server resource provisioning for your hosted service? Firecracker on KVM? Current services like AWS Fargate/Lightsail containers/Google Cloud Run are not competitive pricing-wise for dynamic container spawning at scale unless you provision ahead of time. For this sort of service, your managed solution needs to be competitive with raw compute providers like DigitalOcean and Hetzner.


We're running on GKE right now, which allows us to iterate quickly, and we'll focus on the unit economics as we scale. As part of our research we've talked to dozens of teams who have already implemented this architecture, and most of them ended up using EKS or GKE (a few did use Firecracker or raw VMs), so they're already subject to those prices and it isn't a problem for them. We know that the unit economics may never make sense for hosting free tools and services, but we're focused on high-value SaaS and internal tools. For our target users, our value proposition is that we replace engineering/devops effort, not just the raw compute we provide.


systemd has some nifty comparatively recent functionality that lets you do some quite similar things, with higher isolation. I mention it because I find it really cool, some of the possibilities are quite eye-opening, and many, perhaps most, services would benefit from switching from static to dynamic user IDs plus StateDirectory et al. to manage /var directories.

https://0pointer.net/blog/dynamic-users-with-systemd.html

The specific example use case that matches here:

> By combining dynamic user IDs with socket activation you may easily implement a system where each incoming connection is served by a process instance running as a different, fresh, newly allocated UID within its own sandbox. Here's an example waldo.socket:

  [Socket]
  ListenStream=2048
  Accept=yes
> With a matching waldo@.service:

  [Service]
  ExecStart=-/usr/bin/myservicebinary
  DynamicUser=yes
> With the two unit files above, systemd will listen on TCP/IP port 2048, and for each incoming connection invoke a fresh instance of waldo@.service, each time utilizing a different, new, dynamically allocated UID, neatly isolated from any other instance.

By allocating a new user ID for every invocation, you definitely limit the number of instances you can run—systemd only has a pool of 4336 dynamic user IDs (61184–65519) to allocate from, beyond which point I presume it’d refuse to accept any more connections. But it’s cool stuff, anyway.

(You could also just go for socket activation without a dynamic user, but I was thinking of this from the dynamic user perspective because that’s the more novel thing; socket activation has been around for much longer.)


This is pretty cool. Have you considered not killing the instance after the user disconnects, but pausing the container instead? This opens up the whole tradeoff space between how long you keep the state vs cost to the infrastructure/price to the user.


In the case of containers it gets tricky because of how it interacts with the scheduler (e.g. if a node is idle but has a bunch of paused containers that could be unpaused at any time, the scheduler has to decide how to proceed), but I love the concept. It's something I've thought a bit about in a world where the server can be compiled to WebAssembly, because it's imaginable to suspend it and serialize the memory state so that it can be sent off to storage somewhere and pulled out when the next request comes in. This was actually part of the motivation behind a library I wrote called Stateroom (https://github.com/drifting-in-space/stateroom), which creates a stateful WebSocket server as a WebAssembly module, but I haven't yet implemented the ability to freeze the state of the module between requests.


Congrats on launch of your excellent idea. This feels like a logical extension to the accelerating movement toward app architectures that put serious compute at the edge (Fly.io and CloudFlare Workers come to mind). Exciting times.


This looks cool, but it is making my "solution looking for a problem" bell ring a bit :) Have people you talked to needed this? Your example seems somewhat contrived tbh.

Good luck!


Always a valid concern :)

I've experienced the need first-hand as well as talked to people who experienced it. The most prominent group of users are development tools, because that world has already embraced this architecture -- software like VS Code and Jupyter already takes the same approach, we just generalized it. One way of looking at it is that our bet is that applications other than dev tools will embrace this architecture too.

The example is only partly contrived; I began my career doing fraud analysis on ad market data and would run jobs overnight that computed an embedding layout. I wished for a way to recompute the embeddings on the fly as I filtered the data.


Ah, the analogy to VSCode and Jupyter actually helps me understand it.


What are your thoughts on using Drifting in Space as a code executor/dev environment in the browser?


That's definitely a use case we're interested in. For example, here's a demo of spinning up a VS Code instance just by hitting an API endpoint: https://www.youtube.com/watch?v=ON-mHFxd04U


Super interesting!

Have you guys tested Drifting in Space with executing user code and opening ports? (like Replit)


Currently we only expose one port per host, and it needs to speak HTTP. I do have a use case in mind that requires exposing arbitrary TCP/UDP ports, as long as they're specified at “spawn time”, but that might not quite match the functionality Replit has if it allows you to map ports dynamically while a service is running.

So I guess the answer is “probably not in the near future, but maybe eventually” :)


How does this compare to Phoenix LiveView? As I understand it, that also does something like this?


LiveView is pretty neat. The last time I used Erlang was before Phoenix and Elixir came on the scene, so I can't talk from personal experience, but my understanding is that LiveView is an easy way to add state synchronization to an app, but that using it for anything high-CPU/memory becomes limiting. If you've tried it, I'm curious to know whether that matches your experience, because I confess it's not something I've tried directly.


If you're using Phoenix for anything high-CPU, you can easily call out to Python or write a native function using Rust. That said, there's also the Nx library that lets you do a lot of complex numerical processing within Elixir (it calls out to Google's linear algebra libraries under the hood).


Rewriting it in Rust doesn't magically make it use no CPU! CPU usage becomes an architectural concern at a certain point, not a language concern. Even the fastest languages can't make e.g. a game server scale the way a CRUD app does.


> Rewriting it in Rust doesn't magically make it use no CPU!

Obviously, but it doesn't mean certain HEAVY algorithms won't benefit from judicious use of mutability and getting closer to the metal. You can't always fan out your computation, and sometimes you need to get the result out to the user fast. Even if you don't NEED to, the user appreciates an experience that feels snappy.

> CPU usage becomes an architectural concern at a certain point,

Depends what you're trying to do and the value of getting it done fast. Otherwise we'd never use lower-level languages. Python calls out to C for certain things for a reason.

> Even the fastest languages can't make e.g. a game server scale the way a CRUD app does.

True, a slow language isn't going to be much worse than a fast language for IO, but a high-performance system might be able to update that game state faster and pass it back to a controller in a slow language that takes care of sending the data out.

> CPU usage becomes an architectural concern at a certain point, not a language concern

That's a blunt assertion. I can think of plenty of use cases where it's just as much a language concern. There's a reason Dropbox and Figma wrote performance-critical parts of their systems in Rust.

For reference: I built my entire startup in Elixir and have managed to get by using just Elixir, judicious use of architecture, and really tight SQL queries. Yes, there are certain performance bottlenecks that can be addressed with architecture, but to say that applies to all cases is foolish.

Luckily, our roadmap won't have any of these high-CPU demands for a while. At some point, we want to use some sort of machine learning. There is just no way you're going to scale up large-scale matrix operations over large datasets in pure Elixir. Elixir is copy-on-write and evaluates by value. Great ergonomics for day-to-day work, but they have an overhead that is unacceptable in certain contexts, one being matrix operations. Luckily we have Nx, which calls out to low-level C code. Otherwise we'd be using Rust or Python calling out to TensorFlow in a separate microservice.


Right, but what would happen if you started to saturate your servers' CPUs spending all your time in highly-optimised Nx routines? You'd start looking to solutions like the OP's in order to scale that CPU-bound work.

I think we might be arguing past each other in more-or-less-agreement so I might bow out at this point.


Why does it become limiting? From what I understand, you can send a message to some other node to do the heavy processing?


Did they really just reinvent CGI and sell it as SaaSS?


Ah, the letters CGI bring back fun memories. But no, this has very little to do with CGI.



