The causal links in this analysis are very heterogeneous.
Some causes of accelerated aging, e.g. air quality, seem relatively direct and plausible, with causal models that have supporting literature.
On the societal level it's much more complicated. For example, there is an immense number of pathways by which education affects aging, some positive and some negative.
I wonder how much of those effects boils down to a few highly influential (unobserved?) covariates, such as physical activity, drug consumption, and crime rate.
[edit] by the way, look at the replication materials - I've rarely seen such clean code. Kudos!
https://github.com/euroladbrainlat/Biobehavioral-age-gaps/bl...
This paper builds on a series of pathways towards harm. Those are plausible in principle, but we still have frustratingly little evidence of the magnitude of such harms in the field.
To settle the question of whether or not these harms can/will actually materialize, we would need causal attribution, which is really hard to do, in particular when all involved actors are actively monitoring society and reacting to new research.
Personally, I think that transparency measures and tools that help civil society (and researchers) better understand what's going on are the most promising approach here.
The counterpoint here is that in practice, humility is only found in the best of frequentists, whereas the rest succumb to hubris (i.e. the cult of irrelevant precision).
Popper requires you to posit null hypotheses to falsify (although there are different schools of thought on what exactly you need to specify in advance [1]).
Bayesianism requires you to assume / formalize your prior belief about the subject under investigation and update it given some data, resulting in a posterior belief distribution. It thus lacks the clear-cut decision rules of frequentism, but that can also be considered an advantage.
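To make the update mechanics concrete, here is a minimal sketch (my illustration, not from the comment above) of a conjugate Beta-Binomial update in Python; the prior parameters and the data are made up:

    # Illustrative only: a Beta prior over a coin's bias p is updated by
    # observed flips into a Beta posterior (conjugate Beta-Binomial model).
    from scipy import stats

    a, b = 2, 2             # prior belief: weakly centered on p = 0.5
    heads, n = 14, 20       # hypothetical data

    posterior = stats.beta(a + heads, b + (n - heads))
    print("posterior mean:", posterior.mean())
    print("95% credible interval:", posterior.interval(0.95))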
My experience is that it is always important to criticize free speech absolutism, especially when people behave as if it were an atemporal concept. In reality, most of the world, for most of history, has struck various compromises between protecting individuals and society on the one hand and free speech on the other.
That said, I think your take is also empirically supported. There is a very interesting study [1] that comes to the same conclusion. It uses the broadcast range of radio towers to do a quantitative analysis of the potential effects and finds few. Interestingly enough, I have seen other studies with similar designs that do show persistent effects of exposure to broadcasts, so I'm inclined to believe that this one really is a valid null finding.
Today's world has people working 8 hours a day (minimum), 5 days a week, while making little to no progress on their overall livelihood. At the same time, people (read: the privileged) have more "consumables", yet no one is ever truly happy anymore, and wealth and power keep concentrating in the hands of a few.
Also, the past isn't just defined by slavery. There are plenty of examples we can learn from in the people who came before us.
Wow, that must be quite expensive! You said the files alone are a few PB, so at least 2 PB / 8 servers ~= 250 TB per server, which would probably put each server at > $20k (unless you're putting it together with duct tape and scraps, but even then the disks will cost a ton).
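For what it's worth, a back-of-the-envelope sketch of that estimate (the $/TB figure below is my assumption, not something stated in the thread):

    # Rough capacity and disks-only cost per server; the price is assumed.
    total_tb = 2 * 1000          # "a few PB", taking 2 PB as the low end
    servers = 8
    tb_per_server = total_tb / servers          # ~250 TB per server
    usd_per_tb = 15                             # assumed street price for large HDDs
    disk_cost = tb_per_server * usd_per_tb      # disks only, no redundancy/chassis
    print(f"{tb_per_server:.0f} TB/server, ~${disk_cost:,.0f} in raw disks")

Raw disks alone already land in the low thousands of dollars per server; redundancy, chassis, CPU, and RAM are what push the per-server figure toward the estimate above.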
Not exactly. Attachments are only fetched from Discord as the user requests them. This means that the vast majority of attachments are never stored on my server. Right now, I only have about 280TB of attachments locally on my own infrastructure. You can see more stats here: https://searchcord.io/about
While I get your point, it doesn't carry too much weight, because you can (and we often read this) claim the opposite:
Linear regression, for all its faults, forces you to be very selective about the parameters you believe to be meaningful, and offers trivial tools to validate the fit (e.g. eyeballing residuals, or posterior predictive simulations if you want to be fancy).
ML and beyond, on the other hand, throws you into a whirl of hyperparameters that you no longer understand, and traps even clever people in overfitting that they don't notice.
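As a minimal sketch of the kind of cheap validation meant above (my illustration, with made-up toy data): fit an OLS line and eyeball the residuals.

    # Illustrative only: ordinary least squares on toy data, then residual checks.
    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.uniform(0, 10, 200)
    y = 3.0 + 0.5 * x + rng.normal(0, 1, 200)     # known ground truth

    X = np.column_stack([np.ones_like(x), x])     # design matrix with intercept
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # OLS fit
    resid = y - X @ beta

    # A healthy fit: residuals centered on zero, uncorrelated with the predictor.
    print("coefficients:", beta)
    print("residual mean:", resid.mean())
    print("corr(resid, x):", np.corrcoef(resid, x)[0, 1])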
So a better critique, in my view, would be something that J. W. Tukey wrote in his famous 1962 paper (paraphrasing from memory):
"Far better an approximate answer to the right question, which is often vague, than an exact answer to the wrong question, which can always be made precise."
So our problem is not the tools, it's that we fool ourselves by applying the tools to the wrong problems because they are easier.
My maxim of statistics is that applied statistics is the art of making decisions under uncertainty, but people treat it like the science of making certainty out of data.
I indeed find the lesson it describes unbearably bitter. Searching and learning, as used by the article, may discover patterns and results (thanks to the infinite scaling of computation) that we humans are physically incapable of discovering -- however, all those learnings will have no meaning; they will not expose any causality. This is what I find unbearable: it implies that the real world must ultimately remain impervious to human cognizance; it implies that our meaning- and causality-based human reasoning ultimately falls short of modeling the world, while general, computation-only methods (given ever-growing computing power) at least "converge" to a faithful (but meaningless) description of the world.
See examples like protein folding, medical research, AI-assisted diagnosis, self-driving cars. We're going to rely on their results, but we'll never know why those results work. We're not going to reject self-driving cars if they save lives per distance driven and/or per time driven; however, we're going to sit in, and drive, those cars blind. To me that's an unbearable thought, even apart from the possibility that at some point the system might break down and cause a huge accident inexplicably. An inexplicable misbehavior of the system is of course catastrophic, but to me even the inexplicable proper behavior of the system is unsettling -- because it is inexplicable.
Edited to add: I think the phrase "how we think we think" is awesome in the essay. We don't even know how our reasoning works, so trying to "machinize" those misconceptions is likely bound to fail.
Arguably, "the way our reasoning works" is probably a normal distribution but with a broad curve (and for some things, possibly a bimodal distribution), so trying to understand "why" is a fool's errand. It's more valuable to understand the input variables and then be able to calculate the likely output behaviors with error bars than to try to reduce the problem to a guaranteed if(this), then(that) equation. I don't particularly care why a person behaves a certain way in many cases, as long as 1) their behavior is generally within an expected range, and 2) doesn't harm themselves or others, and I don't see why I'd care any more about the behavior of an AI-driven system. As with most things, Safety first!