> Moreover, it's not obvious to me how one would go about obtaining "unbiased" corpora without somehow relying on subjective societal values that are different everywhere and continually evolving.
I don't believe that problem will ever be completely solvable. But I think the way forward is to always make these assumptions explicit: when the machine learning system derives a result, have it additionally return a proof of how it arrived at that result. Also give the ML system a way to return a list of all the axioms and derivation rules it has learned so far, so that they can be independently checked for bias and corrected.
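To make that concrete, here's a minimal sketch of what "return the learned rules" could look like, assuming an inherently interpretable model (a scikit-learn decision tree, chosen purely for illustration - real production systems are rarely this transparent). The full learned rule set is dumped as text so it can be audited independently:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(iris.data, iris.target)

# Dump every rule the model has learned, in human-readable form,
# so the "axioms" can be inspected and checked for bias.
print(export_text(clf, feature_names=iris.feature_names))
```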
It's pretty hard to return "rules" for an ML system, especially a non-linear one. Google is currently working on systems that use a trillion features - I can't imagine returning some kind of rule list for that.
> Google is currently working on systems that use a trillion features - I can't imagine returning some kind of rule list for that.
As I wrote: it would already help if, as a first step, the ML system returned the derivation with only the rules that were actually used for one concrete result - that list is much shorter and can thus be checked much more easily.
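For what it's worth, a hedged sketch of that first step (again assuming a toy interpretable model, not a trillion-feature system): extract only the decision path, i.e. the rules that actually fired, for one concrete prediction:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(iris.data, iris.target)

sample = iris.data[[0]]                     # one concrete input
node_indicator = clf.decision_path(sample)  # nodes visited for this prediction only
leaf_id = clf.apply(sample)[0]

feature = clf.tree_.feature
threshold = clf.tree_.threshold

# Print only the rules used for this derivation - a short, checkable list.
for node_id in node_indicator.indices:
    if node_id == leaf_id:
        print(f"leaf {node_id}: predict class {clf.predict(sample)[0]}")
        continue
    name = iris.feature_names[feature[node_id]]
    op = "<=" if sample[0, feature[node_id]] <= threshold[node_id] else ">"
    print(f"node {node_id}: {name} {op} {threshold[node_id]:.2f}")
```

The same idea is what local-explanation tools try to approximate for opaque models; for a trillion-feature system you'd only get an approximation, not a proof.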