It strikes me that the decades of academic literature establishing "in order to get a result correlating X and Y, we needed to control for A, B, C, ..." would be a critical input into any system attempting to work with medical data. In a way, the plaintext of historical medical journals encodes much of this expert knowledge, albeit with more errors and retracted findings the further back you go. But that might shift the OP's conclusion from "doomed to fail" to more of an "incredibly hard problem."