"One massive monolithic website" is, I think, meant to be read as referring to the WMF sites being a thing that have a shared telecommunications Single Point of Failure—a "choke point" where a given piece of information can only get from a given WMF site, to a user, by travelling through WMF-managed Internet infrastructure.
Remember Napster, back in the day? It was able to be shut down because it had an SPOF: Napster-the-corporation owned and maintained all the "supernodes" that formed the backbone of the network.
Or consider the Great Firewall of China. If the Great Firewall can block your site/content entirely with a single rule, you have an SPOF.
The answer to such problems isn't simple sharding-by-content-type into "communities" like you're talking about; this is still centralized, in the sense of "centralized allocation."
Instead, to answer such problems, you need true distribution. This can take the form of protocols allowing wiki articles to be accessed and edited in a peer-to-peer fashion with no focal point that can be blocked; this can take the form of Wikipedia "apps" that are offline-first, such that you can "bring Wikipedia with you" to places where state actors would rather you don't have it; this can take the form of preloaded "Wikipedia mirror in a box" appliances (plus a syncing logistics solution, a la AWS Snowball) which can be used by local libraries in countries with little internet access to allow people there access to Wikipedia.
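To make "offline-first" concrete, here is a minimal sketch of the idea: every read is served from a local store, and the network is treated as an optional bonus for refilling it. The cache layout and the `get_article` helper are hypothetical names; only the Wikimedia REST summary endpoint is real.

```python
# A minimal sketch of "offline-first", assuming a simple on-disk cache.
# CACHE_DIR and get_article() are hypothetical; the REST summary endpoint
# is Wikimedia's public API.
import json
import os
import urllib.parse
import urllib.request

CACHE_DIR = "article_cache"
REST_SUMMARY = "https://en.wikipedia.org/api/rest_v1/page/summary/"

def _cache_path(title: str) -> str:
    return os.path.join(CACHE_DIR, title.replace("/", "_") + ".json")

def get_article(title: str):
    """Offline-first read: the local copy always wins; the network only refills the cache."""
    path = _cache_path(title)
    if os.path.exists(path):                      # 1. serve from the local store
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    try:                                          # 2. fall back to the network...
        url = REST_SUMMARY + urllib.parse.quote(title)
        with urllib.request.urlopen(url, timeout=5) as resp:
            data = json.loads(resp.read().decode("utf-8"))
    except OSError:
        return None                               # 3. ...and degrade gracefully when offline
    os.makedirs(CACHE_DIR, exist_ok=True)
    with open(path, "w", encoding="utf-8") as f:  # 4. remember it for next time
        json.dump(data, f)
    return data
```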
> WMF sites being a thing that have a shared telecommunications Single Point of Failure
In fact, one of the long-term projects at WMF is making sure the infrastructure is resistant to single-point-of-failure problems, up to a whole data center going down. We are pretty close to that (not sure if we're at 100%, but if not, we're close). Of course, if you consider the existence of the WMF itself to be a point of failure, that's another question; by that logic, the existence of Wikipedia can be treated as a single point of failure too. Anybody is welcome to create a new Wikipedia, but that's certainly not a point of criticism towards the WMF.
> It was able to be shut down because it had an SPOF: Napster-the-corporation owned and maintained all the "supernodes"
WMF does not own the content or the code; both are openly available and extensively mirrored. WMF does own the hardware, and I don't think there's a way to do anything about that, unless somebody wants to donate a data center :)
> If the Great Firewall can block your site/content entirely with a single rule, you have an SPOF.
> Instead, to answer such problems, you need true distribution.
I am skeptical about the possibility of making a community work using "true distribution". Even though we have good means of distributing hardware and software, be it production or development code, we still do not have any way of building a community without gathering points. I won't say it is impossible; I'd just say I have yet to see anybody do it.
But if somebody wants to try, all power to them. You can read more about Wikimedia discussions on the topic here: https://strategy.wikimedia.org/wiki/Proposal:Distributed_Wik...
> this can take the form of preloaded "Wikipedia mirror in a box" appliances
We are pretty close to this - you can install a working MediaWiki setup very quickly (via Vagrant; I think there are some other container options too, but I use Vagrant), and the dumps are there. It won't be a 100% copy of the real site, since there are some complex operational structures that ensure caching, high availability, etc., which are kind of hard to put into a box - they are public (mostly as Puppet recipes), but implementing them is not an out-of-the-box experience. But you can make a simple mirror with relatively low effort (probably several hours, excluding the time to actually load the data, which depends on how beefy your hardware is :)
Most of this, btw, is made possible by the work of WMF Engineers :)
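For a rough idea of what the "simple mirror" path looks like, here's a sketch: download a dump and feed it to MediaWiki's importDump.php maintenance script. The install path and the choice of the enwiki pages-articles dump are assumptions, and the exact maintenance-script invocation can vary between MediaWiki versions:

```python
# A sketch of the "simple mirror" path, assuming MediaWiki is already
# installed at MEDIAWIKI_DIR (a hypothetical path). importDump.php and
# rebuildrecentchanges.php are real maintenance scripts; details vary by version.
import subprocess
import urllib.request

DUMP_URL = "https://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles.xml.bz2"
DUMP_FILE = "enwiki-latest-pages-articles.xml.bz2"
MEDIAWIKI_DIR = "/var/www/mediawiki"  # hypothetical install location

# 1. Fetch the latest pages-articles dump (tens of GB for enwiki; this is the
#    part where beefy hardware and a decent pipe help).
urllib.request.urlretrieve(DUMP_URL, DUMP_FILE)

# 2. Import it; importDump.php accepts compressed XML dumps.
subprocess.run(["php", f"{MEDIAWIKI_DIR}/maintenance/importDump.php", DUMP_FILE], check=True)

# 3. Rebuild the derived tables the importer skips.
subprocess.run(["php", f"{MEDIAWIKI_DIR}/maintenance/rebuildrecentchanges.php"], check=True)
```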
> We are pretty close to this ... [things you'd expect ops staff to do]
That doesn't come close at all, from the perspective of a librarian who wants a "copy of Wikipedia" for their library, no? It assumes a ton of IT knowledge, just from the point where you need to combine software with hardware with database dumps.
The average library staffer who'd want to set this up in some African village would be less at the "knows what to do with a VM image" end of the knowledge spectrum, and more at the "can plug in a NAS/router/streaming box and go through its configuration wizard" end.
Once I can tell such a person to buy some little box with a 4TB hard disk inside it, which you plug in, go to the URL printed on the top, and there Wikipedia is; and once that box can keep itself up to date with a combination of large patches that get mailed on USB sticks (plug in, wait, then drop back into the mail) and critical quick updates to text content for WikiNews et al. that it can manage over a 20kbps line that's only on for two hours per day; then you'll have something.
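As a back-of-the-envelope check on that last constraint: 20kbps for two hours a day is roughly 18 MB per day, which is plenty for text patches and nothing else. A hypothetical sketch of the split (the Update type and the scheduling rule are made up for illustration):

```python
# A back-of-the-envelope sketch of the update split described above: small
# critical text patches go over the slow link, everything else waits for the
# next USB-stick batch. The Update type and the scheduling rule are made up.
from dataclasses import dataclass

LINK_KBPS = 20        # 20 kbit/s line...
HOURS_PER_DAY = 2     # ...that is only up two hours a day
DAILY_BUDGET_BYTES = LINK_KBPS * 1000 // 8 * HOURS_PER_DAY * 3600  # 18,000,000 bytes (~18 MB/day)

@dataclass
class Update:
    name: str
    size_bytes: int
    critical: bool    # e.g. a WikiNews text correction vs. bulk images

def plan_day(updates: list[Update]) -> tuple[list[Update], list[Update]]:
    """Fit critical text updates into today's link budget; defer everything else."""
    over_the_wire, deferred = [], []
    remaining = DAILY_BUDGET_BYTES
    # Critical-and-small first, so the narrow pipe is never wasted on one big item.
    for u in sorted(updates, key=lambda u: (not u.critical, u.size_bytes)):
        if u.critical and u.size_bytes <= remaining:
            over_the_wire.append(u)
            remaining -= u.size_bytes
        else:
            deferred.append(u)   # waits for the USB-stick batch (or a later window)
    return over_the_wire, deferred

print(f"Daily budget over the 20kbps line: about {DAILY_BUDGET_BYTES / 1e6:.0f} MB")
```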
I presume you have tried Kiwix? For less than $100, you can install the full Wikipedia (with reduced-size graphics) on a cheap Android tablet with a 64GB card. The first-time installation is a little clumsy, but the experience once it's local is solid: http://www.kiwix.org/downloads/.
I don't think "critical updates" are really that necessary. Swapping SD cards a couple times a year would solve most of it. I think it's pretty incredible (and useful) to to be able to have access to all that information for such a low cost even if it's a few months (or even years) out of date.
> That doesn't come close at all, from the perspective of a librarian who wants a "copy of Wikipedia" for their library, no?
Depends what you mean by copy. If it's just a static data source, any offline project would do it. If it has to update, it's trickier, but some offline projects do that too. If you want to run a full clone of the Web's fifth most popular website, yes, it requires some effort. Sorry, no magic here :)
> "can plug in and go through the configuration wizard for a NAS/router/streaming box."
There are boxes that are integrated with one or another of the offline projects. There's also Wikipedia Zero, which, in a world where mobile coverage is becoming more and more widespread even in poor regions, may be an even better alternative.