How would I be able to see the whole tree of replies, though? Would I need to rely on an exponentially large number of services being available at request time?
Messages are also sent between servers directly, so HN would store the full tree. If you replied to a comment, your AP server would forward the message to the HN one, and it would appear there.
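To make that concrete, here is a rough sketch of the receiving side. It is not any real server's code: `OriginServer`, `receive`, `thread`, and the example URLs are all made up for illustration, and signatures and delivery over HTTP are left out. The only piece taken from ActivityStreams is the `inReplyTo` property on each reply.

```python
class OriginServer:
    """Hypothetical server hosting the original post (the "HN" role above)."""

    def __init__(self):
        self.notes = {}      # note id -> note object
        self.children = {}   # parent id -> list of reply ids, in arrival order

    def receive(self, note):
        """Inbox handler: store a note that a remote server delivered."""
        self.notes[note["id"]] = note
        parent = note.get("inReplyTo")
        if parent is not None:
            self.children.setdefault(parent, []).append(note["id"])

    def thread(self, root_id, depth=0):
        """Return the reply tree under root_id as (depth, id) pairs, depth-first."""
        out = [(depth, root_id)]
        for child in self.children.get(root_id, []):
            out.extend(self.thread(child, depth + 1))
        return out


# Replies arrive from two different (hypothetical) servers.
hn = OriginServer()
hn.receive({"id": "https://hn.example/item/1", "content": "story"})
hn.receive({"id": "https://a.example/notes/7",
            "inReplyTo": "https://hn.example/item/1"})
hn.receive({"id": "https://b.example/notes/3",
            "inReplyTo": "https://a.example/notes/7"})

for depth, note_id in hn.thread("https://hn.example/item/1"):
    print("  " * depth + note_id)
```

Since every reply was delivered to the origin, rendering the tree there needs no other server to be up at request time.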
I think all of that is the viewer's responsibility. When you view a single message on Mastodon, Mastodon is responsible for fetching the rest of the thread to display.
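The viewer's side might look roughly like this. It is a sketch under assumptions, not Mastodon's actual code: `fetch_ancestors`, `REMOTE`, `counting_fetch`, and the URLs are hypothetical, and `fetch` stands in for an HTTP GET of the object's id. Only `inReplyTo` comes from ActivityStreams. It walks upward from one message, so it costs one request per ancestor; replies below a message would be found separately, for example through the object's `replies` collection.

```python
def fetch_ancestors(note_id, fetch, max_hops=50):
    """Follow inReplyTo links upward; return notes ordered from root to note_id."""
    chain = []
    seen = set()  # guards against reply loops in malformed data
    current = note_id
    while current is not None and current not in seen and len(chain) < max_hops:
        seen.add(current)
        note = fetch(current)
        if note is None:  # remote server unavailable or object deleted
            break
        chain.append(note)
        current = note.get("inReplyTo")
    chain.reverse()
    return chain


# In-memory stand-in for objects hosted on three different servers.
REMOTE = {
    "https://hn.example/item/1": {"id": "https://hn.example/item/1"},
    "https://a.example/notes/7": {"id": "https://a.example/notes/7",
                                  "inReplyTo": "https://hn.example/item/1"},
    "https://b.example/notes/3": {"id": "https://b.example/notes/3",
                                  "inReplyTo": "https://a.example/notes/7"},
}

calls = []

def counting_fetch(url):
    """Record each lookup so the number of requests is visible."""
    calls.append(url)
    return REMOTE.get(url)

ancestors = fetch_ancestors("https://b.example/notes/3", counting_fetch)
print([n["id"] for n in ancestors], len(calls))
```

So the number of lookups grows with the depth of the thread, and a missing server only truncates the chain at that point.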
In practice, it looks like Mastodon also stores / caches local copies of messages and profiles, pretty much as they are received. (Including thumbnails!)