I think learning is the more appropriate framing in this context, though. I would suggest multi-armed bandits here, with every arm representing a pilot.
https://en.wikipedia.org/wiki/Multi-armed_bandit
There is a tradeoff between exploration and exploitation that has to be made.
The goal in this case corresponds to a particular type of bandit: to postpone death for as long as possible by pulling the right arm. I haven't actually found this type in the literature yet (a mortal multi-armed bandit is something different: there the arms themselves follow a birth-death process).
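To make the exploration/exploitation tradeoff concrete, here is a minimal sketch of a plain epsilon-greedy bandit in Python. It is the standard textbook strategy, not the "postpone death" variant described above, which I have not seen formalized. The function name `epsilon_greedy`, the per-pilot safe-flight probabilities, and the parameter values are all made up for illustration.

```python
import random

def epsilon_greedy(success_probs, steps=5000, epsilon=0.1, seed=0):
    """Plain epsilon-greedy over Bernoulli arms (one arm per pilot).

    success_probs are hypothetical probabilities of a safe flight per pilot;
    the learner never sees them, only the 0/1 outcome of each pull.
    """
    rng = random.Random(seed)
    n_arms = len(success_probs)
    pulls = [0] * n_arms        # how often each pilot was chosen
    estimates = [0.0] * n_arms  # running mean of observed outcomes

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick a pilot uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the pilot with the best estimate so far.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Simulated outcome: 1.0 for a safe flight, 0.0 otherwise.
        reward = 1.0 if rng.random() < success_probs[arm] else 0.0

        # Incremental update of the running mean for the chosen arm.
        pulls[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]

    return pulls, estimates

if __name__ == "__main__":
    pulls, estimates = epsilon_greedy([0.1, 0.5, 0.95])
    print("pulls per pilot:", pulls)
    print("estimated safety:", [round(e, 2) for e in estimates])
```

With a small `epsilon` the learner mostly sticks with the pilot that looks best so far, but still samples the others occasionally, so an early unlucky outcome does not lock it out of a good arm. In the survival setting a single bad pull ends the process, so exploration would presumably be far more costly than in this sketch.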
Edit: This is only about the learning process on the non-pilot side, as one of the other commenters already pointed out.