> We note that transcripts of video-taped conversations were taken from NAPLS for the training and evaluation sets, whereas the Reddit dataset includes entirely typed-in forum messages. People express themselves online in ways they wouldn't in face-to-face encounters, we reckon, so take this comparison of the two with a pinch of salt.
Totally agree, it's a little bit weak. Also, the number of training and validation instances is rather low... BUT it's still interesting that text seems to contain enough features to predict mental illnesses. And it's a great approach for developing quantitative instruments for diagnosing these "soft illnesses".