
I was referring mainly to this one (from the same group; it actually surpassed humans on average):

https://stanfordmlgroup.github.io/projects/chexnet/

In their paper they even used the "weaker" DenseNet-121, rather than the DenseNet-169 they used for MURA (bone X-rays). The DenseNet-BC I tried is another refinement of the same approach.
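For anyone curious what that setup looks like in practice, here is a minimal sketch of a CheXNet-style model: an ImageNet-pretrained DenseNet-121 backbone with its classifier swapped for a 14-way multi-label head (the 14 ChestX-ray14 pathology labels) trained with per-label sigmoid/BCE. The torchvision weights enum and the dummy batch are just illustrative assumptions, not the paper's exact training code.

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 14  # ChestX-ray14 pathology labels predicted by CheXNet

# DenseNet-121 backbone pretrained on ImageNet (the paper's chosen backbone,
# as opposed to the larger DenseNet-169 used for MURA).
model = models.densenet121(weights=models.DenseNet121_Weights.IMAGENET1K_V1)

# Replace the 1000-way ImageNet classifier with a 14-way multi-label head.
in_features = model.classifier.in_features
model.classifier = nn.Linear(in_features, NUM_CLASSES)

# Multi-label setup: independent sigmoid per pathology, binary cross-entropy.
criterion = nn.BCEWithLogitsLoss()

# Sanity check on a dummy batch of two 224x224 RGB images.
x = torch.randn(2, 3, 224, 224)
logits = model(x)              # shape: (2, 14)
probs = torch.sigmoid(logits)  # per-label probabilities
```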




Those are some sketchy statistics. The evaluation procedure is questionable (F1 scored against the other four radiologists as ground truth? A mean of means?), and the 95% CIs overlap pretty substantially. Even if their bootstrap sampling says the difference is significant, I don't believe it.

Basically, I see this as "everyone sucks, but the AI maybe sucks a little less than the worst of our radiologists, on average"
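To make the CI-overlap point concrete, here is a paired-bootstrap sketch of the right comparison: resample the same test cases for both raters and look at the confidence interval of the F1 *difference*, rather than eyeballing two separately bootstrapped intervals for overlap. This is a generic illustration on synthetic data (sklearn's f1_score, made-up labels), not a reconstruction of the paper's actual analysis.

```python
import numpy as np
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)

def bootstrap_f1_diff(y_true, pred_a, pred_b, n_boot=10_000):
    """Paired bootstrap 95% CI for F1(A) - F1(B) on the same test set.

    Resampling identical indices for both raters preserves the pairing,
    which is a stronger test than checking whether two marginal CIs overlap.
    """
    n = len(y_true)
    diffs = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)          # resample cases with replacement
        fa = f1_score(y_true[idx], pred_a[idx])
        fb = f1_score(y_true[idx], pred_b[idx])
        diffs.append(fa - fb)
    return tuple(np.percentile(diffs, [2.5, 97.5]))

# Toy example with synthetic labels and two noisy "raters".
y = rng.integers(0, 2, 500)
a = np.where(rng.random(500) < 0.80, y, 1 - y)  # ~80% agreement with truth
b = np.where(rng.random(500) < 0.75, y, 1 - y)  # ~75% agreement with truth
print(bootstrap_f1_diff(y, a, b))  # interval containing 0 => no clear difference
```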





