To summarise the relevant details: the "AppView" service is responsible for the sorts of queries that aggregate across users, and it has its own database setup. I think that's Postgres, but I'm not 100% sure.
You're right, as usual. AppView runs on a Postgres cluster with read replicas, doing timeline generation (and other things) on demand. We're in the process of moving it toward a beefy ScyllaDB cluster designed around a fanout-on-write system.
The v1 backend system was optimized for rapid development and served us well. The v2 backend will be somewhat less flexible (no joins!) but is designed for much higher scale.
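For anyone unfamiliar with the term: fanout-on-write means the timeline work happens when someone posts, not when someone reads. Here is a toy in-memory sketch of the idea. It is not Bluesky's actual code; the class and method names are made up for illustration, and a real system would keep the per-user timelines in a store like ScyllaDB rather than in Python dicts.

```python
from collections import defaultdict, deque


class FanoutOnWriteTimelines:
    """Toy fanout-on-write store (hypothetical names, in-memory only)."""

    def __init__(self, max_len=1000):
        # author -> set of users who follow that author
        self.followers = defaultdict(set)
        # user -> precomputed timeline, newest first, capped at max_len
        self.timelines = defaultdict(lambda: deque(maxlen=max_len))

    def follow(self, follower, author):
        self.followers[author].add(follower)

    def post(self, author, post_id):
        # The "fanout": one write per follower at post time.
        for follower in self.followers[author]:
            self.timelines[follower].appendleft(post_id)

    def timeline(self, user, limit=50):
        # Reads are a plain lookup: no join across follows and posts.
        return list(self.timelines.get(user, ()))[:limit]
```

The trade-off is visible even at this size: `post` costs one write per follower, while `timeline` is a single keyed read, which is the access pattern a store without joins handles well.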
The BGS (which is an atproto "relay" service) subscribes to all PDS event streams on the entire network, and aggregates and relays them.
This way it's possible to get all network data from a single place (the BGS) rather than having to connect to every PDS, which is simpler for consumers and dramatically reduces the workload of PDS hosts.
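Roughly, the relay's job looks like the sketch below: take many per-PDS event streams and emit one combined stream with its own sequence numbers. This is a simplification under stated assumptions, not the real BGS. `RelayEvent` and `relay` are hypothetical names, the streams are plain Python iterables rather than network subscriptions, and round-robin interleaving stands in for events arriving concurrently.

```python
from dataclasses import dataclass
from itertools import zip_longest
from typing import Iterable, Iterator


@dataclass(frozen=True)
class RelayEvent:
    seq: int      # relay-assigned sequence number (assumption for this sketch)
    pds: str      # which PDS the event came from
    payload: str  # opaque event body


def relay(streams: dict[str, Iterable[str]]) -> Iterator[RelayEvent]:
    """Merge per-PDS streams into one ordered stream (round-robin)."""
    done = object()  # sentinel marking an exhausted stream
    names = list(streams)
    seq = 0
    for batch in zip_longest(*(streams[n] for n in names), fillvalue=done):
        for name, payload in zip(names, batch):
            if payload is done:
                continue
            seq += 1
            yield RelayEvent(seq, name, payload)
```

A consumer then iterates over `relay(...)` once instead of holding a connection to every PDS, which is the workload reduction described above.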
> The BGS handles "big-world" networking. It crawls the network, gathering as much data as it can, and outputs it in one big stream for other services to use. It’s analogous to a firehose provider or a super-powered relay node.
"Big-world" networking by Big Tech-to-be Bluesky with super-powers, I wonder? Is this BGS also going to be federated, or is that the big centralized beating heart of this platform managed exclusively by BS?
There can be multiple BGSes (like there are a few Web search engines), but each one is expensive to run, so there probably won't be many. Alternative designs are either more expensive or don't have the same features.