Launch HN: Drifting in Space (YC W22) – A server process for every user (driftingin.space)
108 points by paulgb on Feb 28, 2022 | 57 comments
Hi HN, we’re Paul and Taylor, and we’re launching Drifting in Space (https://driftingin.space). We build server software for performance-intensive browser-based applications. We make it easy to give every user of your app a dedicated server-side process, which starts when they open your application and stops when they close the tab.

Many high-end web apps give every user a dedicated connection to a server-side process. That is how they get the low latency that you need for ambitious products like full-fledged video editing tools and IDEs. This is hard for smaller teams to recreate, because it takes a significant ongoing engineering investment. That’s where we come in—we make this architecture available to everyone, so you can focus on your app instead of its infrastructure. You can think of it like Heroku, except that each of your users gets their own server instance.

I realized that something like this was needed while working on data-intensive tools at a hedge fund. I noticed that almost all new application software, whether it was built in-house or third-party SaaS, was delivered as a browser application rather than native. Although browsers are more powerful than ever, I knew from experience that industrial-scale data-heavy apps posed problems, because neither the browser nor a traditional stateless server architecture could provide the compute resources needed for low-latency interaction with large datasets. I began talking about this with my friend Taylor, who had encountered similar limitations while working on data analysis and visualization tools at Datadog and Uber. We decided to team up and build a company around solving it.

We have two products, an open source package and a managed platform. Spawner, the open source part, provides an API for web apps to spawn a session-lived process. It manages the process’s lifecycle, exposing it over HTTPS, tracking inbound connections, and shutting it down when it becomes idle (i.e. when the user closes their tab). It’s open source (MIT) and available at https://github.com/drifting-in-space/spawner.

Jamsocket is our managed platform, which uses Spawner internally. It provides the same API, but frees you from having to deal with any cluster or network configuration to ship code. From an app developer’s point of view, using it is similar to using platforms like Netlify or Render. You stay in the web stack and never have to touch Kubernetes.

Here's an example. Imagine you make an application for investigating fraud in a large transaction database. Users want to interactively filter, aggregate, and visualize gigabytes of transactions as a graph. Instead of sending all of the data down to the browser and doing the work there, you would put your code in a container and upload it to our platform. Then, whenever a fraud analyst opens your application, you hit an API we provide to spin up a dedicated backend for that analyst. Your browser code then opens a WebSocket connection directly to that backend, which it uses to stream data as the analyst applies filters or zooms/pans the visualization.
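As a rough sketch of how that flow might look from the browser side (the "/api/spawn" endpoint, message shapes, and renderGraph function below are illustrative, not a final API):

  // Illustrative sketch only: endpoint and message shapes are made up.
  async function openAnalystSession() {
    // 1. Ask the platform to spin up a dedicated backend for this session.
    const res = await fetch("/api/spawn", { method: "POST" });
    const { hostname } = await res.json(); // e.g. "abc123.yourapp.example.com"

    // 2. Connect directly to that backend and stream queries/results over it.
    const ws = new WebSocket(`wss://${hostname}/stream`);
    ws.onopen = () =>
      ws.send(JSON.stringify({ type: "filter", minAmount: 10_000 }));
    ws.onmessage = (event) => {
      const update = JSON.parse(event.data); // aggregated rows / graph deltas
      renderGraph(update); // hypothetical rendering function in your app
    };
  }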

We're different from most managed platforms because we give each user a dedicated process. That said, there are a few other services that do run long-lived processes for each user. Architecturally, we're most similar to Agones. Agones is targeted at games where the client can speak UDP to an arbitrary IP; we target applications that want to connect directly from browsers to a hostname over HTTPS. In the Erlang world, the OTP stack provides similar functionality, but you have to embrace Erlang/Elixir to get the benefits of it; we are entirely language-agnostic. Cloudflare Durable Objects support a form of long-lived processes, but are focused on use cases around program state synchronization rather than arbitrary high-compute/memory use cases.

We have a usage-based billing model, similar to Heroku. We charge you for the compute you use and take a cut. Usage billing scales to zero, so it’s approachable for weekend experiments. We have not solidified a price plan yet, but we’re aiming to provide an instance capable of running VS Code (as an example) for about 10 cents an hour, fractionally metered. High-memory and high-CPU backends will cost more, and heavy users will get volume discounts. Our target customers are desktop-like SaaS apps and internal data tools.

As mentioned, our core API is open source and available at https://github.com/drifting-in-space/spawner. The managed platform is in beta and we’re currently onboarding users from a waitlist, to make sure that we have the server capacity to scale. If you’re interested, you’re welcome to sign up for it here: https://driftingin.space.

Have you built a similar infrastructure for your application? We’re interested in hearing the approaches people have already taken to this problem and what the pain points are.




I love seeing more options appear on the horizon for doing stateful serverless work. This article[1] provides a little more motivation for the use cases:

> For quite a long time (and especially in the webdev world), there exists a perception that to achieve scalability, all our request handlers need to be as stateless as possible. In the world of the all-popular Docker containers, it means that all the app containers need not only to be immutable, but also should be ephemeral ... keeping our request handlers stateless, does NOT really solve the scalability problem; instead it merely pushes it to the database.

Though the problems and solutions pointed out in that article don't mean you have to go straight to process-per-X. One solution might be, as mentioned in passing in the OP's launch blog, to keep state in a cache like Redis. If the data fits this approach, it would ease load on the database while allowing each request handler to remain stateless.
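For instance, a minimal sketch (using node-redis; the key scheme and queryDatabase helper are made up) of keeping per-session working state in Redis so the handler itself stays stateless:

  import { createClient } from "redis";

  const redis = createClient({ url: process.env.REDIS_URL });
  await redis.connect();

  // The handler stays stateless: the per-session working set lives in Redis,
  // so any replica can serve the next request for this session.
  async function handleFilter(sessionId: string, filter: string) {
    const cacheKey = `session:${sessionId}:${filter}`;
    const cached = await redis.get(cacheKey);
    if (cached) return JSON.parse(cached);

    const rows = await queryDatabase(filter); // hypothetical database call
    await redis.set(cacheKey, JSON.stringify(rows), { EX: 600 }); // cache 10 min
    return rows;
  }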

Durable Objects seem less focused on heavy computation, but I think they're really interesting as points of synchronisation for e.g. collaborative editing. Having all requests go into a _single thread_ seems important.

[1]: http://ithare.com/scaling-stateful-objects/


BEAM's ability to spawn lightweight processes is a lifesaver. A lot of people praise LiveView for letting you write SPAs without JavaScript, but the real killer feature is the ability to track a session for a user from the backend. Love how you guys are making that a first-class consideration for folks using less powerful platforms.

Did you guys build this on top of BEAM? My startup had a similar need for opening a process per user, and we ended up using a combination of Horde + GenServer to accomplish something similar. In our case, we spawn a process that maintains a WebSocket connection to an external service, maintain some state in there, and relay updates to the user over a channel. There is one per client.


We're not using BEAM directly, but I find it pretty neat and spent some time reading up on it when getting started with this. I'm pretty excited by what https://lunatic.solutions are doing as well, as an approach to bringing the ideas behind BEAM to WebAssembly. Ultimately, I explored WebAssembly for a while and realized that there was more of a market if we could run containers instead of just WebAssembly modules. (The result of my work in that direction lives on as Stateroom: https://github.com/drifting-in-space/stateroom)


"For quite a long time (and especially in the webdev world), there exists a perception that to achieve scalability, all our request handlers need to be as stateless as possible."

This is definitely overdue for a re-examination. "Web handlers should be stateless" goes all the way back to the 1990s, when a server system was lucky to have a single gigahertz of CPU, and even server systems might be loaded with the then-obscene quantity of maybe 128MB of RAM.

The solution is obviously not to just flip all the way to the other side. But the landscape has changed a lot since then. I've made a lot of hay out of very selectively stateful web services. It takes some care, but sometimes it honestly takes less care than trying to build completely pristinely stateless servers, because it's not like that's trivial all the time either!


Do you have any recommendations for resources that have been helpful with that? I'm trying to introduce some "tactical statefulness" into a mostly-stateless web backend and would love to find some giants to stand on.


Does the managed service actually require that each user get their own container? For some applications, particularly collaborative ones, it would make much more sense to have a container for each top-level thing that the users are collaborating on, e.g. one per document. I think Sandstorm [1] got this right with its concept of grains, and I've long wanted a tool that brought that model, a stateful container per high-level object, running arbitrary code (unlike Cloudflare Durable Objects), to the world of hosted SaaS. Speaking of Cloudflare, I'm looking forward to seeing what their edge containers can do, when that feature is eventually made public.

[1]: https://sandstorm.io/


> Does the managed service actually require that each user get their own container? For some applications, particularly collaborative ones, it would make much more sense to have a container for each top-level thing that the users are collaborating on, e.g. one per document.

Exactly right. We do not actually require that every user gets their own container; that's a decision that's entirely up to your app. Our API spins up an instance and returns its hostname, and then you can connect to it from as many clients as you like.
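To make that concrete, here's a hypothetical sketch of a per-document model (the spawn endpoint and its key parameter are illustrative, not the exact API):

  // Hypothetical sketch: ask for a backend keyed by document rather than user.
  // Two clients requesting the same key would get the same hostname back.
  async function backendForDocument(docId: string): Promise<string> {
    const res = await fetch("/api/spawn", {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify({ key: `doc-${docId}` }),
    });
    const { hostname } = await res.json();
    return hostname; // every collaborator on docId connects to this host
  }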


Looks like a cool project, but I am not sure I understand the need for one process per user.

Some questions:

- Why do you need one process per user? For low latency, would you just need to make sure you have idle CPU to serve their request, even if that CPU time is multiplexed onto an event loop (one event loop serves many users)?

- Wouldn't this "event loop" actually be more efficient than one process per user, as there would be less context switching cost from the OS?

- Can I just keep a map of (connection, thread_id) on my server, and spawn one thread per user on my own server?

- Could I just load up my server with many cores, and give each user a SQLite database which runs each query in its own thread?

- This way a multi GB database would not be loaded into RAM, the query would filter it down to a result set.


Good questions!

> Why do you need one process per user? / Wouldn't this "event loop" actually be more efficient than one process per user, as there would be less context switching cost from the OS?

We're particularly interested in apps that are often CPU-bound, so a traditional event-loop would be blocked for long periods of time. A typical solution is to put the work into a thread, so there would still be a context switch, albeit a smaller one.

The process-per-user approach makes the most sense when a significant amount of the data used by each user does not overlap with other users. VS Code (in client/server mode) is a good example of this -- the overhead of siloing each process is relatively low compared to the benefits it gives. We think more data-heavy apps will make the same trade-offs.

> Can I just keep a map of (connection, thread_id) on my server, and spawn one thread per user on my own server?

If you don't have to scale beyond one server, this approach works fine, but it makes scaling horizontally complicated because you suddenly can't just use a plain old load balancer. It's not just about routing requests to the right server; deciding which server to run the threads on becomes complicated, because you ideally want to decide based on each server's load. We started going down this path, realized we'd end up re-inventing Kubernetes, and decided to embrace it instead.

> Could I just load up my server with many cores, and give each user a SQLite database which runs each query in its own thread? This way a multi GB database would not be loaded into RAM, the query would filter it down to a result set.

If, for a particular use case, it's economical to keep the data ready in a database that supports the query pattern users will make, it's probably not a good fit for a session-lived backend. In database terms, where our architecture makes sense is when you need to create an index on a dataset (or subset of a dataset) during the runtime of an application. For example, if you have thousands of large parquet files in blob storage and you want a user to be able to load one and run Falcon-type[1] analysis on it.

[1] https://github.com/vega/falcon


> A typical solution is to put the work into a thread, so there would still be a context switch, albeit a smaller one.

The car wasn't fast enough, so we removed the rear-view mirror to lower the weight. You are looking at the sexy, fun-to-solve problem rather than the useful, boring solution of throwing away the stack. Users can already run things like Solidworks in a web browser with near-native performance using VDI.

> deciding which server to run the threads on becomes complicated because you ideally want to decide based on the server load of each

High end load balancers have done this since the 90s. This is now easily done with the nginx API.

Honestly I am sure there is some need somewhere for your stack. But hiring a good server/network operations team instead would have saved you a lot of code.


One way to look at it is that it’s like the architecture GitHub Codespaces uses internally, made available off the shelf. I don’t think using a VDI approach would make Codespaces a better product. In fact, I was partly motivated to build this by frustration with laggy VDI setups I had to deal with (though I don’t think VDI has to be bad).

> High end load balancers have done this since the 90s. This is now easily done with the nginx API.

A load balancer doesn’t (or at least shouldn’t) do everything we need to do, which involves statefully mapping hostnames generated on-the-fly to servers in a cluster. This allows our users to create instances that multiple clients can connect to, as opposed to just using “sticky sessions” or something like that.
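For a sense of the shape of the problem, here's a toy sketch of that hostname-to-backend mapping (not our actual implementation; it ignores WebSocket upgrades and sharing the mapping across proxy replicas):

  import http from "http";
  import httpProxy from "http-proxy";

  // Toy sketch: route requests by the hostname generated at spawn time.
  // In practice this mapping must be stateful and shared across proxy replicas.
  const backends = new Map<string, string>(); // hostname -> "http://10.0.3.7:8080"

  // Called when a new backend is spawned for a session/document.
  function registerBackend(hostname: string, address: string) {
    backends.set(hostname, address);
  }

  const proxy = httpProxy.createProxyServer({});
  http.createServer((req, res) => {
    const target = backends.get(req.headers.host ?? "");
    if (!target) {
      res.statusCode = 404;
      res.end("unknown backend");
      return;
    }
    proxy.web(req, res, { target });
  }).listen(8080);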

Our approach takes less code than you might think: we lean heavily on nginx and Kubernetes where we can, so we only need to fill in the missing pieces.


So it is kind of like server-less, but each instance:

- Persists for the lifetime of the user session.

- Only processes a single user session.

- Has large amounts of CPU/RAM and writable disk to handle large datasets.


Yes, that sounds about right.


I'll admit, I clicked on this "Launch HN" mostly because the name sounded cool, but after reading the description... the name doesn't seem particularly relevant to the business in any way, which can be fine, it's just interesting to note.

I am a little confused about the product purpose and the definition of who your competition is in the market. I think new SaaS hosting providers are interesting, so please don't take any of this the wrong way, just hoping to give you some space to expand on your ideas more.

> Here's an example. Imagine you make an application for investigating fraud in a large transaction database. Users want to interactively filter, aggregate, and visualize gigabytes of transactions as a graph. Instead of sending all of the data down to the browser and doing the work there, you would put your code in a container and upload it to our platform. Then, whenever a fraud analyst opens your application, you hit an API we provide to spin up a dedicated backend for that analyst. Your browser code then opens a WebSocket connection directly to that backend, which it uses to stream data as the analyst applies filters or zooms/pans the visualization.

You say "put your code in a container", but... wouldn't you basically have to put all your gigabytes of data into a container? The bottleneck to the types of analytic applications you're describing seems unlikely to be the custom backend code, and far more likely to be whatever database is powering the application, which means that each interactive instance really needs to spin up a complete copy of the dataset to gain any performance benefit for these on-demand analytic workloads.

I've worked with a number of high-scale applications, and scaling the backend API server has never been even remotely the main challenge... plus, having dedicated instances of the web server process wouldn't make anything faster than just having an appropriate number of instances, it would just make it more expensive. It's almost always a question of scaling the database -- not the API layer. For offline analytic workloads like you describe, you could potentially spin up fresh copies of the database for each user, and that would make things better, but the challenge of scaling (online) OLAP and OLTP comes from the shared-everything nature of the database itself. If you're intending to provide unique database instances to each user, then all the data needs to either be packaged up with the application, or stored somewhere that the application can retrieve it on startup and load the database, which could be a time-consuming process that creates painfully long cold starts.

> Many high-end web apps give every user a dedicated connection to a server-side process. That is how they get the low latency that you need for ambitious products like full-fledged video editing tools and IDEs.

> We have not solidified a price plan yet, but we’re aiming to provide an instance capable of running VS Code (as an example) for about 10 cents an hour, fractionally metered.

Since you bring up the examples of running GUI desktop applications, I'm wondering if your competition isn't actually AWS WorkSpaces. Someone could build an image for a WorkSpace that includes everything the analyst needs, and then AWS will manage the lifecycle of that instance as the analyst connects and disconnects, billing entirely based on usage. That image could even include vast quantities of data pre-populated into a database, along with a web server that offers local dedicated processes to serve requests from the browser in the WorkSpace, if the company prefers to develop their application's GUI using the web as a platform.

Obviously the challenge with WorkSpace is if you want to offer it to parties outside your company, but AWS does address this use case to some extent: https://aws.amazon.com/blogs/security/how-to-secure-your-ama...

A company could definitely address the nuances and automation of offering WorkSpace to third parties, but such a business would likely be extremely vulnerable to AWS just improving WorkSpace to include those features out of the box.


> You say "put your code in a container", but... wouldn't you basically have to put all your gigabytes of data into a container? The bottleneck to the types of analytic applications you're describing seems unlikely to be the custom backend code, and far more likely to be whatever database is powering the application, which means that each interactive instance really needs to spin up a complete copy of the dataset to gain any performance benefit for these on-demand analytic workloads.

You're right that it does depend a lot on the needs of a specific application. If a bunch of users are accessing the same dataset, and can constantly access the subset of data they need with low latency through a global index, and there isn't much need to do computation interactively at runtime, then a standard architecture is probably a better fit.

Where this approach is useful is if every user needs access to a different subset of the data (e.g. if the underlying dataset is petabytes, and each user needs to interactively explore a different gigabytes-big subset of it). Or if there is a lot of derived compute on top of it, for example, a graph visualization that needs to be updated when the user changes the subset of data in focus.

> I'm wondering if your competition isn't actually AWS WorkSpaces

The general approach of "run and render elsewhere and stream the pixels back" is definitely our competition in the sense that it's something companies currently do. What we provide is a way of moving the client/server boundary to wherever makes sense for your app: if it makes sense to render server-side and stream pixels, you can do that (although we don't yet support UDP, which would be useful in this case); if it makes sense to do data aggregation server-side but render through WebGL, that's also an option.


Do you have any resources to point me towards that elaborate on the benefits of a process-per-tenant/user for performance?

I work on a data-intensive app that fits the use case you describe, but I'm confused about the benefits for performance. (I can certainly see how the code would end up nicer/simpler.) Is this mostly applicable to certain stacks?


> Do you have any resources to point me towards that elaborate on the benefits of a process-per-tenant/user for performance?

Not yet, but we're working on some demos of things that are easier with session-lived backends. One way to think about it is that it's good for repeated queries against the same subset of data -- if you have a dataset of petabytes and your typical use case has users (through filters or queries) repeatedly accessing a sample of ~gigabytes of that data throughout a session, you could use a session-lived backend to materialize that subset of data in memory and quickly serve queries off of it without hitting the global index.

Another case where it comes up is when you need to do some stateful computation after loading the data, for example, if you need to generate a graph or embedding layout of some data and refine the layout when users select/deselect data.
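For intuition, a minimal sketch of what such a session-lived backend might do (the data-loading helper and query shape are made up for illustration):

  import { WebSocketServer } from "ws";

  async function main() {
    // Load the user's ~GB working set once when the session backend starts,
    // then serve repeated queries from memory instead of the global index.
    const workingSet = await loadSubsetFromBlobStorage(process.env.DATASET_KEY!); // hypothetical helper

    const wss = new WebSocketServer({ port: 8080 });
    wss.on("connection", (socket) => {
      socket.on("message", (raw) => {
        const query = JSON.parse(raw.toString()); // e.g. { minAmount: 10000 }
        const rows = workingSet.filter((r) => r.amount >= query.minAmount);
        socket.send(JSON.stringify({ count: rows.length, sample: rows.slice(0, 100) }));
      });
    });
  }

  main();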


That's Graphistry's last 7+ years, where we do frontend CPU/GPU visual analytics experiences <> GPU server processes. 100% endorse this design, it opens up a lot :)

Main tweak is that our model grew to "small client browser GPU/CPU session <> server-side multi-node multi-GPU time sharing." Current cloud services (Lambda etc.) fail here: cold starts, mostly CPU, etc., vs. bursty sticky GPU sessions. There's a lot more power when you can scale resource use... so 1 server process is kind of dinky. Good backend abstractions are tricky here though, so starting with 1 process makes sense as they figure out a sustainable revenue model, e.g., powering demanding visual intelligence apps is vastly different from powering commodity Netlify CRUD apps.


Thanks, that means a lot coming from one of the pioneers of the approach!

The fraud analysis use-case is actually semi-based on a real world experience I had building tools for fraud detection in adtech in 2013, where I often found myself taking a time-slice of a graph and loading it up in Gephi to compute a layout. I'd written other browser-based tools to make my work easier, but because I was shoehorning everything into a stateless backend, computing a large graph layout as part of it was tough. So when I saw what you were up to with Graphistry, it immediately resonated with me (though I was no longer working on fraud at the time).


Super cool!

Funny enough... We've stacked up ~4 incarnations of how we back our different kinds of sticky live GPU session workloads, and ironically, a big one we aren't doing but I keep wanting to see solved is user-defined GPU containers (vs our own). So, good luck!


1. How does this compare to MightyApp [0]? Does Mighty work on the UI side automatically, while Drifting in Space requires that the app has some kind of data layer separation to allow acceleration of just the data processing?

2. Is the data processing stream or batch?

3. Could Mighty + DiS work together to completely accelerate a data- and UI-heavy application?

Context: I have been working recently with a reporting-heavy company that is continually using data analysis to understand risk, combat fraud, and identify key patterns in user actions and data.

[0]: Mighty Makes Google Chrome Faster (YC S19) -- https://news.ycombinator.com/item?id=26957215


1. Mighty has some similarities in that they run a (Chrome) process for each user session, but it's quite different in that we are something that SaaS companies can build into their app, whereas Mighty is something that the end-user subscribes to and the SaaS provider doesn't need to know exists. I think Mighty is pretty neat, but it doesn't get around fundamental limitations of browsers, e.g. a WebAssembly process running in Mighty can't address more than 4GB of memory regardless of how beefy the machine Mighty runs it on is.

Since our product is built directly into the SaaS app, it's up to the app's developer to decide at what level they want to split the work between client and server. Doing everything on the server and streaming pixels is one option, but I suspect most applications will want to take a hybrid approach where they do some CPU/memory-intensive work on the server, stream the data to the client, and use the client's GPU (via WebGL/WebGPU) to render it. So that's the approach we're currently optimizing for, but better support for pixel streaming is on our radar too.

2. It's up to the application layer; we just provide a way to run a container and the data layer is up to the app.

3. Yes, an app served by DiS is just a regular web app, so you could use it in Mighty. Our hope is that because we shift some of the heavy computation to the server there are fewer use cases where this makes sense, but there could be cases where you want to do GPU-heavy rendering, which we don't yet provide.


Don't want to spoil the party, but you could get all of this and way more with Elixir, running on a platform that was designed for this. And with LiveViews, you could do without a client. Aren't you kind of reinventing the wheel?


I think the founders have run through an analysis much like this one I gave ten months back: https://news.ycombinator.com/item?id=27195000

Erlang/Elixir is a very neat little ecosystem, but in a lot of ways it's a dead end now. It was alone in its space for so long that it built a lot of ways of doing things that are kinda closed in on its own ecosystem, because there was no other ecosystem to reach out to to speak of, but now there's an abundance of choices and choosing Erlang means choosing something that is built on a lot of assumptions that don't match the world anymore. There may be some "reinvention" in building something on WASM and other communication mechanisms, but it's one with a path forward.

In particular, Erlang/Elixir have a lot of integrated solutions for modern code problems, but being either first or very early, none of them are best-of-breed anymore. You could think of them as the first draft of a lot of modern techs. Between that, and the fact that you can't build a business based on going to your customers and saying "Hey, everyone, I've got a great platform, just allocate the budget to rewrite your entire codebase into this somewhat obscure language and we'll make everything all better!", it just isn't a viable choice for a business, or at least not one that has any plans for growth. (And I don't mean VC-funded hypergrowth... I mean, the regular kind too.)


It’s true that Erlang/Elixir/OTP have this solved in that ecosystem, as I mention in the post. When we talked to dozens of teams that have internally built tech like this already, only one had gone that route.

In general I haven’t seen any really data-heavy apps in Elixir, are there examples I should be looking at? It could be interesting to compare performance.


Very cool! Reminds me a bit of Jupyter and the whole code-notebook world too. Spawner almost seems like a more general-purpose JupyterHub, which IMHO is a good thing (jhub is frighteningly complex to configure and set up these days).


> Spawner almost seems like a more general purpose JupyterHub

That's actually a very good way of putting it to people who understand the reference!

One of the things I've been playing with is actually using Spawner to spin up Jupyter Lab notebooks with their new(ish) collaboration feature. Jupyter and VS Code both work very nicely with Spawner's architecture out-of-the-box, since they can be put into a container and accessed entirely through an HTTPS connection.


Yeah, a 'spawn a VS Code server instance on these files' microservice could be super handy for lots of things. There are fantastic technical doc tools like mkdocs, mdbook, etc., but none of them have an editing interface. You could add an 'edit' button to their generated HTML that opens a spawned VS Code server instance on the files, and now you've got a little wiki / knowledge base that a small team can work from.


If anyone from fly.io is watching, I think it would be smart for fly.io to acquire this new company and integrate the concept, and maybe some of the implementation, into the Fly platform.


Interesting. Will it be possible to control Sweeper via API also?

I'm a solo open-source maintainer and have a popular project that people want to orchestrate many instances of. Each instance (a.k.a. session) is stateful and individually configurable. I'm excited to test out Spawner. Any company that makes it super simple for open-source maintainers to make money by providing a managed service will be a huge success -- from my initial thoughts, this looks to fit the bill.


Is the use-case you have in mind for a Sweeper API being able to shut down a pod based on an external event? We don't have a nice HTTP API for that yet (you could go through the Kubernetes API), but only because I haven't gotten around to implementing it. Would that serve the use case you have in mind?

If I can help with anything as you look into it, do let me know!


I don't know about the GP, but I would actually like to be able to keep the pod alive while it's doing some processing, in case the user wants to run a long process, go away, then come back later when it's done. Yes, I know there are other tools for orchestrating pure batch jobs, but I imagine some applications are a mix of interactivity and long-running computations.


That makes sense. We currently don't support that directly, although we have a “grace period”, which is how long it waits for a service to be idle before shutting it down. You could set it to a very high number and then have the service manage its own termination when it becomes idle. But that's a bit of a hack; first-class support for that use case is something I'll think about.


Here's one way you could implement first-class support for this use case. It's a bit of a hack, but it's simple. IIUC, the proxy is a sidecar, meaning it runs in the same network namespace as the main container. So the proxy could listen on a particular port on localhost, and as long as a connection is open to that port, the sweeper wouldn't touch that pod. Then the main container would just need to open a TCP connection for the period of time that it wants to make sure it stays running.
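In Node terms, the main container's side of that might look something like this (the port is arbitrary and assumes the proxy exposes such a keep-alive listener):

  import net from "net";

  // Hypothetical: the sidecar proxy listens on a localhost "keep-alive" port;
  // as long as this connection is open, the sweeper leaves the pod alone.
  const KEEPALIVE_PORT = 9999; // arbitrary, whatever port the proxy would listen on

  async function runLongJob(job: () => Promise<void>) {
    const sock = net.connect(KEEPALIVE_PORT, "127.0.0.1");
    try {
      await job(); // pod stays alive while the long-running work is in flight
    } finally {
      sock.end(); // closing the connection lets the pod be considered idle again
    }
  }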


If I understand correctly, Sweeper clears up sessions that haven't received a request in a certain amount of time. Essentially the use case would be to leave the session running until I ask Sweeper to clear it via an API request.

Just to illustrate where I'm coming from, what I have so far mimics the pm2 cli as an API with built-in reverse-proxy, with create (similar to init), reload, restart, start, stop and delete.


What are you using for server resource provisioning for your hosted service? Firecracker on KVM? Current services like AWS Fargate/Lightsail containers/Google Cloud Run are not competitive pricing-wise for dynamic container spawning at scale unless you provision ahead of time. For this sort of service, your managed solution needs to be competitive with raw compute providers like DigitalOcean and Hetzner.


We're running on GKE right now, which allows us to iterate quickly, and we'll focus on the unit economics as we scale. As part of our research we've talked to dozens of teams who have already implemented this architecture, and most of them ended up using EKS or GKE (a few did use Firecracker or raw VMs), so they're already subject to those prices and it isn't a problem for them. We know that the unit economics may never make sense for hosting free tools and services, but we're focused on high-value SaaS and internal tools. For our target users, our value proposition is that we replace engineering/devops effort, not just the raw compute we provide.


systemd has some nifty comparatively recent functionality that lets you do some quite similar things, with higher isolation. I mention it because I find it really cool, some of the possibilities are quite eye-opening, and many, perhaps most, services would benefit from switching from static to dynamic user IDs plus StateDirectory et al. to manage /var directories.

https://0pointer.net/blog/dynamic-users-with-systemd.html

The specific example use case that matches here:

> By combining dynamic user IDs with socket activation you may easily implement a system where each incoming connection is served by a process instance running as a different, fresh, newly allocated UID within its own sandbox. Here's an example waldo.socket:

  [Socket]
  ListenStream=2048
  Accept=yes
> With a matching waldo@.service:

  [Service]
  ExecStart=-/usr/bin/myservicebinary
  DynamicUser=yes
> With the two unit files above, systemd will listen on TCP/IP port 2048, and for each incoming connection invoke a fresh instance of waldo@.service, each time utilizing a different, new, dynamically allocated UID, neatly isolated from any other instance.

By allocating a new user ID for every invocation, you definitely limit the number of instances you can run—systemd only has a pool of 4336 dynamic user IDs (61184–65519) to allocate from, beyond which point I presume it’d refuse to accept any more connections. But it’s cool stuff, anyway.

(You could also just go for socket activation without a dynamic user, but I was thinking of this from the dynamic user perspective because that’s the more novel thing; socket activation has been around for much longer.)


This is pretty cool. Have you considered not killing the instance after the user disconnects, but pausing the container instead? This opens up the whole tradeoff space between how long you keep the state vs cost to the infrastructure/price to the user.


In the case of containers it gets tricky because of how it interacts with the scheduler (e.g. if a node is idle but has a bunch of paused containers that could be unpaused at any time, the scheduler has to decide how to proceed), but I love the concept. It's something I've thought a bit about in a world where the server can be compiled to WebAssembly, because it's imaginable to suspend it and serialize the memory state so that it can be sent off to storage somewhere and pulled out when the next request comes in. This was actually part of the motivation behind a library I wrote called Stateroom (https://github.com/drifting-in-space/stateroom), which creates a stateful WebSocket server as a WebAssembly module, but I haven't yet implemented the ability to freeze the state of the module between requests.


Congrats on launch of your excellent idea. This feels like a logical extension to the accelerating movement toward app architectures that put serious compute at the edge (Fly.io and CloudFlare Workers come to mind). Exciting times.


This looks cool, but it is making my "solution looking for a problem" bell ring a bit :) Have people you talked to needed this? Your example seems somewhat contrived tbh.

Good luck!


Always a valid concern :)

I've experienced the need first-hand as well as talked to people who experienced it. The most prominent group of users are development tools, because that world has already embraced this architecture -- software like VS Code and Jupyter already takes the same approach, we just generalized it. One way of looking at it is that our bet is that applications other than dev tools will embrace this architecture too.

The example is only partly contrived; I began my career doing fraud analysis on ad market data and would run jobs overnight that computed an embedding layout. I wished for a way to recompute the embeddings on the fly as I filtered the data.


Ah, the analogy to VSCode and Jupyter actually helps me understand it.


What are your thoughts on using Drifting in Space as a code executor/dev environment in the browser?


That's definitely a use case we're interested in. For example, here's a demo of spinning up a VS Code instance just by hitting an API endpoint: https://www.youtube.com/watch?v=ON-mHFxd04U


Super interesting!

Have you guys tested Drifting in Space with executing user code and opening ports? (like Replit)


Currently we only expose one port per host, and it needs to speak HTTP. I do have a use case in mind that requires exposing arbitrary TCP/UDP ports, as long as they're specified at “spawn time”, but that might not quite match the functionality Replit has if it allows you to map ports dynamically while a service is running.

So I guess the answer is “probably not in the near future, but maybe eventually” :)


How does this compare to Phoenix LiveView? As I understand it, that also does something like this?


LiveView is pretty neat. The last time I used Erlang was before Phoenix and Elixir came on the scene, so I can't talk from personal experience, but my understanding is that LiveView is an easy way to add state synchronization to an app, but that using it for anything high-CPU/memory becomes limiting. If you've tried it, I'm curious to know whether that matches your experience, because I confess it's not something I've tried directly.


If you're using Phoenix for anything high-CPU, you can easily call out to Python or write a native function using Rust. That said, there's also the Nx library that lets you do a lot of complex numerical processing within Elixir (it calls out to Google's linear algebra libraries under the hood).


Rewriting it in Rust doesn't magically make it use no CPU! CPU usage becomes an architectural concern at a certain point, not a language concern. Even the fastest languages can't make e.g. a game server scale the way a CRUD app does.


> Rewriting it in Rust doesn't magically make it use no CPU!

Obviously, but it doesn't mean certain HEAVY algorithms won't benefit from judicious use of mutability and getting closer to the metal. You can't always fan out your computation, and sometimes you need to get the result out to the user fast. Even if you don't NEED to, the user appreciates an experience that feels snappy.

> CPU usage becomes an architectural concern at a certain point,

Depends what you're trying to do and the value of getting it done fast. Otherwise we'd never use lower-level languages. Python calls out to C for certain things for a reason.

> Even the fastest languages can't make e.g. a game server scale the way a CRUD app does.

True, a slow language isn't going to be much worse than a fast language for IO, but a high-performance system might be able to update that game state faster and pass it back to a controller in a slow language that takes care of sending the data out.

> CPU usage becomes an architectural concern at a certain point, not a language concern

That's a blunt assertion. I can think of plenty of use cases where it's just as much a language concern. There's a reason Dropbox and Figma wrote performance-critical parts of their systems in Rust.

For reference: I built my entire startup in Elixir and have managed to get by using just Elixir, judicious use of architecture, and really tight SQL queries. Yes, there are certain performance bottlenecks that can be addressed with architecture, but to say that applies to all cases is foolish.

Luckily, our roadmap won't have any of these high-CPU demands for a while. At some point, we want to use some sort of machine learning. There is just no way you're going to scale up large-scale matrix operations over large datasets in pure Elixir. Elixir is copy-on-write and evaluates by value. Great ergonomics for day-to-day work, but they have an overhead that is unacceptable in certain contexts, one being matrix operations. Luckily we have Nx, which calls out to low-level C code. Otherwise we'd be using Rust or Python calling out to TensorFlow in a separate microservice.


Right, but what would happen if you started to saturate your servers' CPUs spending all your time in highly-optimised Nx routines? You'd start looking to solutions like the OP's in order to scale that CPU-bound work.

I think we might be arguing past each other in more-or-less-agreement so I might bow out at this point.


Why does it become limiting? From what I understand, you can send a message to some other node to do the heavy processing?


Did they really just reinvent CGI and sell it as SaaSS?


Ah, the letters CGI bring back fun memories. But no, this has very little to do with CGI.



