"Once the dictionary is created, we can average all the word vectors for a given headline to get the numeric representation of the headline itself."
Can someone explain to me why this is useful? Aren't we losing a lot of precision from the word2vec results?
And as a general question: is there any useful knowledge we can extract from this visualization apart from "most of the time, news channels write about different things"?
Don't get me wrong, I think the techniques displayed here are really cool, but I have the feeling the conclusions are either absent or trite.
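For anyone unclear on the quoted averaging step: it can be sketched like this. The toy vectors stand in for a real word2vec model (in practice you'd load something like gensim's KeyedVectors), and `headline_vector` is my own name for the helper, not anything from the article:

```python
import numpy as np

# Toy stand-in for a trained word2vec model: word -> vector.
# Real embeddings would be 100-300 dimensional and come from a trained model.
word_vectors = {
    "breaking": np.array([0.9, 0.1, 0.0]),
    "news":     np.array([0.8, 0.2, 0.1]),
    "cats":     np.array([0.1, 0.9, 0.4]),
}

def headline_vector(headline, vectors):
    """Represent a headline as the mean of its known words' vectors."""
    words = [w for w in headline.lower().split() if w in vectors]
    if not words:
        # No known words: fall back to the zero vector.
        return np.zeros(next(iter(vectors.values())).shape)
    return np.mean([vectors[w] for w in words], axis=0)

vec = headline_vector("Breaking news cats", word_vectors)
```

The precision concern is real: averaging throws away word order and lets opposing words cancel out, but it's a cheap baseline that gives every headline a fixed-length vector you can feed to downstream tools.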
All that in-depth explanation, and he forgot the first rule of data visualization: always label your axes!
This should have been at the top of the article, but was buried at the bottom: "The left side of the 2D representation represents the more serious headlines, while the right side represents the more silly headlines."
The axes are intentionally unlabeled. They don't represent anything, just spatial coordinates.
The fact that the X-axis had a parseable interpretation was due to the randomness of the clustering algorithm (and the sliding scale of seriousness is not a hard rule; there are many counterexamples. I noted the scale as a quick observation). It is possible that the chart could end up rotated if the algorithm is run again.
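To make the rotation point concrete: if the 2D layout came from something like t-SNE (the article doesn't name the algorithm, so this is an assumption), reruns with different seeds can flip or rotate the map, which is exactly why the axes carry no fixed meaning. A minimal sketch with scikit-learn, using random data in place of the real headline vectors:

```python
import numpy as np
from sklearn.manifold import TSNE

# 50 random 10-d points standing in for averaged headline vectors.
rng = np.random.default_rng(42)
X = rng.normal(size=(50, 10))

def project(seed):
    """Project X to 2D with t-SNE using the given random seed."""
    return TSNE(n_components=2, perplexity=5, init="random",
                random_state=seed).fit_transform(X)

emb_a = project(0)   # one run
emb_b = project(1)   # same data, different seed: layout can rotate/flip
emb_a2 = project(0)  # same seed: reproducible layout
```

Fixing `random_state` makes a single chart reproducible, but it doesn't give the axes any semantic direction; the "serious vs. silly" gradient is an artifact of one particular run.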
>Coincidentally around the same time Facebook announced their anticlickbait initiative, Facebook open-sourced their fasttext project, which can quickly build models to classify text using some of the above example techniques. Hmmmmmm…
There's an interesting notion, a clickbait filter.