Hardly a fair race. And of course it's ongoing, in the pursuit of better ad targeting. And it's one of those weird places you find yourself when dealing with something like Quora and other sites. Basically every 'sample' has both data (the sample itself) and meta-data (where it came from, when), and those things add up. Looking at some of the research that de-anonymized datasets, it was pretty clear that who was saying what, especially across a large corpus of such utterances, would yield identifying information: if not the actual person, then at least the 'same' person.