To elaborate a little bit on the supervision tree thing then: there's a bunch of different behaviours you can associate with process failure depending on your needs. Let's say you have a Postgres connection pool and for some reason the pool manager process dies. You can set it up so that the death of the manager will:
- kill all of the child processes that the pool was managing
- return an error to every request handler that had an active query in flight, while leaving the handlers that didn't untouched
- restart the pool manager
- once it's running, respawn the managed pool processes
This is all machinery that comes pre-built with OTP, the standard library and framework that ships with Erlang. While that's all happening, your app as a whole can keep trucking along, and everything that doesn't need to make a database query carries on without even noticing that something was amiss.
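As a rough sketch of the restart behaviour described above, assuming a toy pool rather than a real Postgres driver: the module names (`Demo.PoolManager`, `Demo.Conn`, `Demo.ConnSupervisor`, `Demo.PoolSupervisor`) are made up for illustration, and real pool libraries are structured differently. The piece doing the work is the `:rest_for_one` strategy, which tears down and restarts everything started after the child that died.

```elixir
# Hypothetical pool manager; a real one would track checkouts, queues, etc.
defmodule Demo.PoolManager do
  use GenServer

  def start_link(_opts), do: GenServer.start_link(__MODULE__, :ok, name: __MODULE__)

  @impl true
  def init(:ok), do: {:ok, %{}}
end

# Hypothetical stand-in for a single database connection process.
defmodule Demo.Conn do
  use GenServer

  def start_link(id), do: GenServer.start_link(__MODULE__, id)

  @impl true
  def init(id), do: {:ok, id}
end

# Supervises the individual connection processes.
defmodule Demo.ConnSupervisor do
  use Supervisor

  def start_link(_opts), do: Supervisor.start_link(__MODULE__, :ok, name: __MODULE__)

  @impl true
  def init(:ok) do
    children =
      for id <- 1..3 do
        # Each child needs a unique id within its supervisor.
        Supervisor.child_spec({Demo.Conn, id}, id: {:conn, id})
      end

    Supervisor.init(children, strategy: :one_for_one)
  end
end

defmodule Demo.PoolSupervisor do
  use Supervisor

  def start_link(_opts \\ []), do: Supervisor.start_link(__MODULE__, :ok, name: __MODULE__)

  @impl true
  def init(:ok) do
    # Order matters: with :rest_for_one, if the manager dies, the connection
    # supervisor (started after it) is shut down, then both are restarted.
    children = [Demo.PoolManager, Demo.ConnSupervisor]
    Supervisor.init(children, strategy: :rest_for_one)
  end
end

{:ok, _sup} = Demo.PoolSupervisor.start_link()
old_manager = Process.whereis(Demo.PoolManager)
old_conn_sup = Process.whereis(Demo.ConnSupervisor)

# Simulate the manager dying for some reason.
Process.exit(old_manager, :kill)
Process.sleep(100)

IO.inspect(Process.whereis(Demo.PoolManager) != old_manager, label: "manager replaced")
IO.inspect(Process.whereis(Demo.ConnSupervisor) != old_conn_sup, label: "connections replaced")
IO.inspect(Supervisor.count_children(Demo.ConnSupervisor).active, label: "live connections")
```

The script process that started the tree keeps running the whole time, which is the "rest of the app doesn't notice" part.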
The slogan "let it crash" gets tossed around the Elixir/Erlang community quite a bit. It refers to Erlang processes (the lightweight processes inside the VM, not the host OS process with its own PID). Your whole app doesn't die, just the broken parts, and the OTP supervisor subsystem brings them back to life quickly.
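Here's a minimal sketch of what that looks like from the caller's side, again with a made-up module (`Sketch.Worker`) standing in for whatever process might break. The worker has no defensive code at all. When it crashes, the one caller with a request in flight gets an error, the supervisor starts a fresh worker, and later calls go through.

```elixir
defmodule Sketch.Worker do
  use GenServer

  def start_link(_opts), do: GenServer.start_link(__MODULE__, :ok, name: __MODULE__)

  # Client-side wrapper: if the worker dies mid-call, GenServer.call/2 exits
  # in the caller, and we turn that exit into an ordinary error tuple.
  def divide(a, b) do
    try do
      {:ok, GenServer.call(__MODULE__, {:divide, a, b})}
    catch
      :exit, _reason -> {:error, :worker_crashed}
    end
  end

  @impl true
  def init(:ok), do: {:ok, nil}

  @impl true
  def handle_call({:divide, a, b}, _from, state) do
    # No guard against b == 0: div/2 raises ArithmeticError, which crashes
    # this worker process only, not the VM.
    {:reply, div(a, b), state}
  end
end

{:ok, _sup} = Supervisor.start_link([Sketch.Worker], strategy: :one_for_one)

# A normal call works.
{:ok, 5} = Sketch.Worker.divide(10, 2)

# This call crashes the worker; the caller just sees an error.
{:error, :worker_crashed} = Sketch.Worker.divide(1, 0)

# Give the supervisor a moment to restart the worker, then carry on.
Process.sleep(100)
{:ok, 3} = Sketch.Worker.divide(9, 3)
```

You'll see a crash report logged for the failed call, which is expected: the failure is recorded, but nothing else goes down with it.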