I'm the author of the original 'refork' feature introduced to Puma a couple year...

byroot · on Oct 13, 2022

Hey Will, Pitchfork author here. As mentioned in the Readme, thanks for your Puma PR, it was indeed quite instrumental in Pitchfork's inception.

As for why it wasn’t used much as a Puma feature, I don’t know, but there are a few challenges with it that prevented me from putting it in production. The most important one is that new workers end up being grandchildren of the original process, so if the middle process dies, you may end up with zombies etc.
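The grandchild concern can be reproduced in plain Ruby. This is only a minimal sketch, not how Puma or Pitchfork fork workers: the method name `orphan_demo` is made up for illustration, and the "middle" process here simply exits right after forking.

```ruby
# Sketch: a grandchild outlives the "middle" process that forked it.
# Returns [middle_pid, parent_pid_seen_at_fork, parent_pid_after_middle_exit].
def orphan_demo
  reader, writer = IO.pipe

  middle_pid = fork do
    reader.close
    middle = Process.pid
    fork do
      # Grandchild: wait (bounded) until the middle process is gone,
      # which shows up as our parent pid changing.
      500.times do
        break if Process.ppid != middle
        sleep 0.01
      end
      writer.write("#{middle} #{Process.ppid}")
      writer.close
      exit!(0)
    end
    writer.close
    exit!(0) # the middle process dies right after forking
  end

  writer.close
  Process.wait(middle_pid) # the original process can only reap its direct child
  seen_at_fork, seen_after_exit = reader.read.split.map(&:to_i)
  reader.close
  [middle_pid, seen_at_fork, seen_after_exit]
end

middle_pid, before, after = orphan_demo
puts "middle process:            #{middle_pid}"
puts "grandchild's parent before: #{before}"
puts "grandchild's parent after:  #{after}"
```

Once the middle process exits, the grandchild is adopted by another process (typically PID 1, or on Linux a process that registered itself as a child subreaper), so the original server process is not the one that gets to `wait` on it.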

It’s also very scary to fork a process that may have live threads currently processing a request. I believe Pitchfork solves most of that.

wgjordan · on Oct 13, 2022

I noticed the double-fork with PR_SET_CHILD_SUBREAPER to reparent the new workers, which is a nice reliability improvement for the edge-case where the middle-parent worker possibly crashes. It adds a Linux dependency (as noted), but that probably still covers most production use-cases where the extra reliability that comes with reparenting is most needed. This enhancement could probably be incorporated into Puma.

As for the concern about forking a process that may have live threads currently processing a request, this should already be solved in the Puma implementation. The worker shuts down and finishes serving all pending requests before reforking. There is also an `on_refork` hook to trigger extra garbage collection to maximize copy-on-write efficiency, or to close any connections to remote servers (database, Redis, ...) that were opened while the server was running.
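For reference, a Puma config using this hook might look roughly like the following. It is a config fragment for Puma's DSL rather than a standalone script. The worker count, the request threshold, and the ActiveRecord line are illustrative assumptions, and hook names may differ between Puma versions, so check the docs for the version you run.

```ruby
# config/puma.rb (sketch; values are illustrative assumptions)
workers 4

# Enable the experimental refork mode: new workers are forked from
# worker 0 instead of the master process. The argument is the number of
# requests after which an automatic refork is triggered.
fork_worker 1000

on_refork do
  # Runs in worker 0 before the other workers are re-forked from it.
  # Extra GC passes so the forked workers share as many pages as possible.
  3.times { GC.start }

  # Assumption: a Rails app using ActiveRecord. Close connections that were
  # opened while serving requests so they are not shared across forks.
  ActiveRecord::Base.connection_pool.disconnect! if defined?(ActiveRecord::Base)
end
```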

bullen · on Oct 13, 2022

Isn't HTTP/1.1 chunked transfer encoding a simpler solution to the same problem?

byroot · on Oct 13, 2022

Sorry, I see absolutely no relation between the two.

bullen · on Oct 13, 2022

When you chunk responses you save memory as the only buffer you need to keep is the chunk?

nirvdrum · on Oct 13, 2022

Neither Puma nor Pitchfork is generally used as a static file server since they're not particularly well-suited to that. They're used as Ruby application servers. The memory consumption and savings being discussed are about the memory required by the Ruby VM to process a dynamic request (e.g., a request to a Ruby on Rails application).

bullen · on Oct 13, 2022

Ok, I was confused by "minimize memory usage by maximizing Copy-on-Write performance"...

That said: Static files never should be chunked.

Chunking is ONLY interesting with dynamic responses.

nirvdrum · on Oct 13, 2022

That's fair. I was trying to guess where the misunderstanding arose from and I guessed wrong. I'm sorry about that.

Rounding out the previous thought, the idea with many forking servers is to boot up to the point before a request is served and then fork off for each request. You do gain CoW benefits, but if you have any lazy data structures that are reified in the call, now each child is faulting. Pitchfork will take a child that has processed a request and promote it as the parent, replacing the original process. Now, this new parent is the new CoW base with the expectation that forks from that will result in even greater memory sharing. For a framework like Rails, there's a lot that happens after a request is received.
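A toy illustration of the lazy-structure point, assuming a made-up `LazyRoutes` class that stands in for framework state built on the first request. It does not measure memory. It only shows which children have to rebuild the structure privately and which inherit it from the parent (and can therefore share its pages).

```ruby
# Hypothetical lazily built lookup table, standing in for state that a
# framework only materializes while serving its first request.
class LazyRoutes
  def initialize
    @table = nil
  end

  def built?
    !@table.nil?
  end

  def lookup(key)
    @table ||= (1..10_000).to_h { |i| ["route_#{i}", i] }
    @table[key]
  end
end

# Fork `count` children from the current process. Each child reports
# whether the table was already built when it started, then "serves"
# a request that needs the table.
def fork_workers(app, count)
  count.times.map do
    reader, writer = IO.pipe
    pid = fork do
      reader.close
      inherited = app.built?
      app.lookup("route_1")
      writer.write(inherited ? "inherited" : "rebuilt")
      writer.close
      exit!(0)
    end
    writer.close
    result = reader.read
    reader.close
    Process.wait(pid)
    result
  end
end

# Fork before any request was served: every child builds its own copy.
cold = fork_workers(LazyRoutes.new, 3)

# Serve one "request" first, then fork from the warmed-up process.
warm_app = LazyRoutes.new
warm_app.lookup("route_1")
warm = fork_workers(warm_app, 3)

puts "forked before first request: #{cold.inspect}"
puts "forked after first request:  #{warm.inspect}"
```

In the first case each child allocates the table in its own private memory; in the second the table already lives in the parent, which is the situation promoting a warmed-up worker is meant to create.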

byroot · on Oct 13, 2022

Ah I see. As Kevin said, the HTTP response is generally nothing compared to the memory required by the VM.

A response might be a couple MB; a medium-sized Rails app will need a couple hundred MB to load its code etc.