Show HN: Comments by Top HN Posters Analysed by IBM's Watson User Modelling API (kolinko.github.io)
180 points by kolinko on Dec 13, 2014 | 90 comments



This is hilarious!

I'm ranked first in:

  Cheerfulness   (Before or after morning coffee?)
  Orderliness    (Because of my 3rd normal form sock drawer?)
  Gregariousness (Before or after my 3rd beer?)
  Agreeableness  (I disagree! Watson needs debugging.)

I'm ranked dead last in:

  Imagination (No one I know could imagine how this could be.)
  Authority-challenging (My teachers & bosses would disagree.)
  Intellect (Before or after my mother dropped me on my head?)


This could make a good movie plot. The all powerful AI analyses your characteristics, gets it all wrong and assigns you to the wrong line of work (Harry Potter?) :)


Master short story scifi writer Sheckley, R. has this covered:

Bad Medicine

http://www.gutenberg.org/files/9055/9055-h/9055-h.htm


This happens in the pilot episode of Futurama.


I thought the joke was that the AI was completely correct.


Sounds more like Divergent. Gah, can't believe I know that.


Is this funny/good (movie plot)/interesting, or scary?


> Intellect (Before or after my mother dropped me on my head?)

In other words, you are ranked 148th out of like 100k really smart people.

So, maybe quitcherbitchin? :-)


> In other words, you are ranked 148th out of like 100k really smart people.

It's not a top-by-category ranking of all HN posters; there's a merged list of top commenters and leaders, and the ranking is drawn from that. Being at the bottom of that list (as I am for agreeability) doesn't mean you are still ahead of everyone on HN who didn't make the list.


You are being pedantic. I am being sort of humorous.

I think the main gist of my point stands though: Ranking dead last on this list hardly equates to "dropped on your head." That's like saying "I only won one of the less important Nobel Prizes. God, I'm such a loser." Or something.

I like edw. I like a lot of people here. But sometimes folks here really suffer from tunnel vision in a bad way.

Cheers.


Ha, indeed. :-)

Watson didn't even remotely consider me. What do I have to do to get its attention?!


Somebody please make a browser plugin that uses this data.

I know the data isn't perfect, but it would be nice to be able to see who in a thread is a top HNer and which character traits are outliers from the norm. You get insulted by somebody ranked low in Sympathy? No need to worry.

I'm serious. Somebody do this. It would look great on the résumé.


Or, we could learn to do brief history searches and check out past user comments if it really matters.

I'm generally pretty suspicious of any sort of computer-generated personality profiles, and though I understand the appeal to techies of empathy-as-a-service I don't think it's something we should consider relying upon.

Hell, half the fun in life is trying to figure out who other people really are.


Empathy-as-a-service is the killer app for Google Glass. Finally!


Actually, that's way less far-fetched than you think: http://www.autismspeaks.org/news/news-item/google-glass-app-...


Somewhere, an ad executive is thinking up a commercial about humanizing glassholes...


I'm sitting on a sleepy but full Tokyo train and laughed out loud... those glassholes don't have a chance do they.


one man's fun is another man's nightmare.


One of the other fun ideas we had: display a personality rating next to the comment input field.

Imagine how it would affect people commenting if they saw that what they are supposed to post is aggressive, or passive-aggressive ;)


> You get insulted by somebody ranked low in Sympathy? No need to worry.

Because you totally need to worry if someone insults you on the Internet.


As it turns out, words can unexpectedly hurt whether they come from the internet, office, or a bathroom stall.


Hi. I built a Chrome extension for Hacker News that lets people follow others and get notifications when they are replied to or their karma changes. http://hackbook.club

The ext basically matches my original vision now, so I'm thinking hard about what features to add next, but I'm not quite sure I understand what you're asking for. Are you saying that when the extension says "So and so replied to you", you want, say, a sympathy score shown with that user as well?


On one level it could be a simple bookmarklet (a bookmark with JS). When clicked, it goes through the HN discussion page. When it finds a top HNer's username, it (1) colors it and (2) shows the category if that HNer is in the top or bottom 10% in any category.

For instance:

  edw519 33 minutes ago | link
  Top: Cheerfulness, Orderliness, Gregariousness, Agreeableness
  Bottom: Imagination, Authority-challenging, Intellect

(I'd also like something that shows me when a commenter is somebody important, even if not a karma king.)
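
The "top or bottom 10%" check could be sketched roughly like this. It is only a sketch: the `rankings` layout (category name mapped to usernames ordered from highest to lowest score) and the function name are assumptions made for illustration, not the format the site actually publishes.

```python
from typing import Dict, List, Tuple


def outlier_categories(
    rankings: Dict[str, List[str]], username: str, fraction: float = 0.10
) -> Tuple[List[str], List[str]]:
    """Return (top, bottom) categories where `username` is an outlier.

    `rankings` maps a category name to usernames ordered from highest
    score to lowest (a hypothetical layout, assumed for this sketch).
    """
    top: List[str] = []
    bottom: List[str] = []
    for category, ordered_users in rankings.items():
        if username not in ordered_users:
            continue  # user is not on the list for this category
        # Number of users that count as "top" (or "bottom"), at least one.
        cutoff = max(1, int(len(ordered_users) * fraction))
        position = ordered_users.index(username)
        if position < cutoff:
            top.append(category)
        elif position >= len(ordered_users) - cutoff:
            bottom.append(category)
    return top, bottom
```

A bookmarklet would do the same lookup in JS for each username it finds on the page and then append the "Top:" / "Bottom:" labels next to it.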


Ok this actually makes a lot of sense. When someone replies to you on Facebook, you generally know who they are. On Hacker News, you have no idea. Including some basic (albeit imperfect) information along with a notification reply could be really useful.

This could even be true for the follow-feed mechanism. If you're following someone on Twitter or friends with them on Facebook and they write something that appears in your feed, you are already familiar with the author on some level. In the Hackbook extension, it's fairly common to follow people and not know much about them at all. Again, including some basic Watson-generated information along with the "a user you're following wrote a comment" newsfeed item could be helpful.

Let me give it some more thought and maybe I'll have time to work on it this week.


Observing something changes it. It'd probably be useful in the beginning, but as soon as people start replying based on these personality profiles, it won't be useful anymore.


I wonder if this data might be more interesting on the bottom 100 users with a longevity > 2 years. And of course it would be interesting to see the differences between the top 100 HN users and the top 100 Reddit users. That might provide some insight into the echo chamber effect.


This place is just as much an echo chamber as reddit. Often moreso.


I agree with that. One interesting question, though, is whether you can use data like this to characterize the echoes. If so, then one could point it at an arbitrary forum, score the top 100 posters, and pull out a 'flavor' for the forum. That would be kind of neat.
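
One naive way to get such a 'flavor' would be to average each trait over the scored posters. A minimal sketch, assuming the per-user results are available as a `profiles` dict of username to trait scores (that layout and the function name are hypothetical):

```python
from statistics import mean
from typing import Dict


def forum_flavor(profiles: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Average each trait score over all posters that have that trait.

    `profiles` maps username -> {trait name: score}; this layout is an
    assumption for illustration, not the service's output format.
    """
    traits = set()
    for scores in profiles.values():
        traits.update(scores)
    return {
        trait: mean(s[trait] for s in profiles.values() if trait in s)
        for trait in sorted(traits)
    }
```

Comparing the resulting averages for two forums would show which traits one community scores higher on, although a plain mean hides how spread out the individual scores are.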


I don't know if anyone else noticed this, but @Tichy ranks first in Emotional range, Fiery, Prone to worry, Melancholy, Immoderation, Self-consciousness and 2nd in Susceptible to stress. I haven't read any of his/her comments but this makes me wonder how independently each of these measurements is calculated. Given another random set of 100 users, would one user come out on top in all of these categories too?


It makes me think that a few extreme posts (or even one) are weighting the results so heavily that they win in a bunch of categories.


I first read about the "Big 5 personality Traits" in this Economist article last year. Very interesting read on how some researchers were using Twitter writings (specifically some keywords) to gauge a person's personality - http://www.economist.com/blogs/economist-explains/2013/05/ec...


One of the difficulties this kind of output has is that it runs very quickly into issues of semantics and model labeling.

We humans each build models kind of like these when we interact with one another, so when I ask somebody, "do you think <name> is an agreeable person?" and they reply "sure I think he is!" they're consulting that model to provide me an answer. Humans can even do a kind of pairwise sorting on that model and tell you if person1 is more or less agreeable than person2.

However, even if our individual models may differ a bit, and the results of these kinds of questions to each other might differ a bit, there's an inherent "humanness" to the results because people generally have a pretty similar semantic understanding of what "agreeableness" means.

However, what does Watson think agreeableness means? I have no idea; nobody really knows. Watson can't really explain it. All we know is that there's a model that produces a scored (and thus rankable) output when asked to score a corpus, and somebody somewhere labelled that model as the "agreeableness" model, perhaps based on some heuristics or parameters that were intended to define that notion.

It's thus very hard for humans to trust scoring like this because when it doesn't make sense, it doesn't make sense for reasons that no human would have about the matter. For example, I would personally say pg is far more agreeable than I am, yet Watson scores our respective collection of comments exactly the same. I can't explain it, Watson can't explain it, and thus it feels "wrong" and now I can't trust the scores that Watson provides me.


Well, the real question this tool answers is not really "do you think X is Y", but "do you think X's comments show Y".

There's also a lack of documentation from IBM as to what the results mean exactly and how solid they are.


Well, in a sense, most of us only know each other through our comments, and that's all we can ever base an assessment like this on. By proxy we have to assume that when people's inner thoughts leak out into the Internet on a forum like this (and in a sustained enough way to make them a top-100 karma earner), their aggregate corpus of comments will be a reasonable insight into who they are.

So for all purposes that you, I or Watson can demonstrate, "do you think X is Y" and "do you think X's comments show Y" are functionally the same.

edit

I just checked what Watson thinks are my needs. Apparently I don't have many, and everybody on HN has an extreme need for Challenge.

I almost feel like these results require a lot of interpretation, and that interpretation is about as reliable as a horoscope.


> I almost feel like these results require a lot of interpretation, and that interpretation is about as reliable as a horoscope.

I got a similar feeling from this - that's why I'd love to see some hard data behind the algorithm, or at least bits and pieces about the methodology used to arrive upon it.


You may want to consider using a stable sort to list the results. Users with equal scores are shuffled each time I click the topic name.
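
For illustration, a minimal sketch of what a stable ranking looks like. It is in Python rather than the site's JS, and the `(username, score)` pair layout and the function name are assumptions:

```python
from typing import List, Tuple


def rank_users(scores: List[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Sort (username, score) pairs from highest to lowest score.

    Python's sorted() is stable, also with reverse=True, so users with
    equal scores keep the order they had in the input and repeated calls
    return the same ranking.
    """
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

If the input order itself is not fixed, adding the username as a secondary sort key would make the tie order deterministic as well.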


Does not seem correct. The top poster for practicality has some useless thank you posts that provide nothing useful to the reader. StackOverflow even tends to lock questions that just get a lot of useless thank you comments to prevent putting useless information on the page for people coming to find reference material.


There are three users who had a word count below the minimum recommended by Watson: whoishiring, peter123 and the one you mentioned - Libertatea.


I'm so incredibly grateful to have this user model https://pbs.twimg.com/media/B4xNHXtCAAAoFsw.png:large

2144 days into this experiment that is HN, I couldn't ask for a better analysis of myself than through empiricism. My online self may not be my "true self" but it certainly represents a portion of who I want to be.

Now to contextualize, I wonder how we trend together and apart from the median user model as individuals and the community? The distributions seem interesting -- for example at a glance we appear heavy on challenge seekers but light on stability!


I learned two things from this: I'm no longer in the top 100, and I don't recognize a lot of the names that are. I must not be spending as much time here as I used to.


I expected Grellas to be there, and damn, the average score per comment he has is ridiculous.

Rayiner's doesn't make sense, though. An average of 0.75? So he has 57275 comments? Wow.


Rayiner, to his credit, engages controversy without fear. I'm sure it leads to some interesting voting.


The average score is limited to recent comments (for some definition of recent, no idea about the specifics).


The API we used has a limit of 0.5MB of data, so we pushed all the recent comments up to that limit.
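
Roughly, that kind of selection could look like the sketch below. It assumes the comments arrive newest first and reads "0.5MB" as 512 * 1024 bytes of UTF-8 text; both are assumptions, and the names are made up for illustration rather than taken from the actual script.

```python
from typing import Iterable, List

MAX_BYTES = 512 * 1024  # assumed reading of "0.5MB"; the real limit may differ


def take_recent_comments(
    comments: Iterable[str], max_bytes: int = MAX_BYTES
) -> List[str]:
    """Collect comments, newest first, until the size budget is used up.

    `comments` is assumed to be ordered from most recent to oldest.
    Sizes are measured in UTF-8 bytes, not characters.
    """
    selected: List[str] = []
    used = 0
    for text in comments:
        size = len(text.encode("utf-8")) + 1  # +1 for a joining newline
        if used + size > max_bytes:
            break  # stop at the first comment that would exceed the limit
        selected.append(text)
        used += size
    return selected
```

The selected comments can then be joined with newlines into one payload that stays within the limit.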


I had assumed ameister14 was talking about the avg shown on user pages:

https://news.ycombinator.com/user?id=kolinko


These results look pretty spotty... the rankings seem about as random as picking names out of a Bingo machine. User ssciafani is among the top users in Cautiousness, Openness, Adventurousness, Stability, and Practicality. Any "Yeah, user johndoe is totally some-characteristic!" revelations may owe less to a sophisticated algorithm than to the confirmation bias that many have when reading a horoscope.


That's really cool. For some reason, the ranking changes every time I click on the same category. Edit: I just realised it's probably because the position of users who have the same score is randomised.

Also, why is Libertatea (https://news.ycombinator.com/threads?id=Libertatea) ranking so high in many categories and yet he only has 5 posts?


Libertatea's score was computed on a very low number of comments so it may contain large errors; perhaps we should remove him/her or show a warning there...


I had a small giggle at my rankings, not going to lie.

Without a definition for what each of these things are, it's a little unclear what this is actually saying. I'm assuming there's some documentation on this somewhere?


IBM's/Watson's documentation is really vague on what these things are, but there's some explanation of that on Wikipedia:

http://en.wikipedia.org/wiki/Big_Five_personality_traits

One thing that we're wondering is whether a score of 1% means that it believes that the person has little of this trait, or that it has little proof to believe that the person has this trait. If I'm not mistaken it's the former.


Yes, former. It may be useful to translate these into other systems[0] so that you see what each end of the spectrum means.

[0] http://similarminds.com/global5/g5-jung.html


That would be fun - to see if a person is INTF or ENTJ next to his/her nickname on the page :)


on "seeks love" and "values hedonism", there was almost no commenter above .5.

For "challenges authority" all visible commenters were above .95.


I think that IBM Watson expects general text, no? I admit to having done some personality classification on OKC profiles (around 200) and found that, disproportionately to the average population, there were a lot of people who were classified as caring, helpful, social, popular individuals.

It fits the medium. Just like one would expect HN comments to be full of more "head-y" discussions.


I didn't find such an expectation in the docs, and someone else posted this, where it says that researchers from IBM were using their tech to analyse tweets:

http://www.economist.com/blogs/economist-explains/2013/05/ec...

I'd assume that this might be the same algorithm, but I'm not really sure of that.


I'm extremely relieved that sometime in the past few months I've dropped off the list.

In case you're wondering, IBM's BlueMix is a public installation of Cloud Foundry, which is an opensource PaaS. Disclaimer: I work on Cloud Foundry at Pivotal.


This is interesting. On the visualization, though, I'm not sure what the purpose of the inner ring is (it just seems to chart the first entry in the group at the next level out, which doesn't seem to be particularly meaningful.)


Very nice project, thanks for introducing me to IBM Watson. I will also play with it a little bit:) I don't know how it determines Openness and stuff, pasted one of my texts in there, it gave out 'interesting' results:)


Yeah, we couldn't find any proof that the results are meaningful in any way, but I think it's a fun tool nevertheless.

One thing I know for sure is that Watson doesn't recognise irony - I posted this poem and it was marked as "cheerful" and "optimistic": http://bukowski.net/poems/a_smile_to_remember.php


The algorithm finds me fiery and prone to worry. Seems about right.


It also places you #1 for adventure seeking, which is fitting, considering your daily bike commute through New York City :-)


Are there any other services that offer similar functionality to the User Modeling API? I've come across a few Sentiment Analysis services, but nothing with this level of detail.


http://www.uclassify.com/browse

You can run MBTI albeit as 4 separate API calls.


On an iPad I cannot click on anything above extraversion and need to reload to click something else.

Not sure about the results, I'm pretty sure I value liberty more than Watson thinks I do.


It's not so much about what you value, but what values you project.


It would be nice to see what comments it thinks lead to certain beliefs on its part, but I am pretty sure it needs some work.


It does not match current /leaders. Is it old?


We made a mistake and published a list of top 100 commenters, not leaders. But we'll be updating this in a moment.

Sorry for the confusion.


I love how everybody is high in self-expression. I'm not sure, though, how cperciva ended up on top in need for harmony.


I think he plays violin so it kinda makes sense ;)


On a related note, how does one get access to the leaderboard data past user #100?


It's the leaderboard merged with 100 most commenting.


Grellas is the ideal human ^^


He's last on Cheerfulness :)


[Sorted by: Intellect]

31. [0.95] whoishiring (threads) (about)

Is this a Human? Must be well rounded !


I wonder if this could be used to detect shills and astroturfers.


I don't think this kind of analysis would provide any insight there. A simple look at users who have high activity on certain topics would be a good first filter, though.


Any feedback? :)


Thanks for doing this. It is hard to judge the data, but I just did a quick check for agreeableness, and it seems reasonable. The person rated highest for agreeableness does in fact seem way more agreeable than the person at the bottom.

Most agreeable: https://news.ycombinator.com/threads?id=StavrosK

Least agreeable: https://news.ycombinator.com/threads?id=dragonwriter


I'd be interested to learn what some of the headings really mean. For example, the only listing I'm near the top of is "Trust" (I'm #4) but I don't really know what that means in this context. I had a quick look at the IBM docs but couldn't figure it out.

Selfishly, it would also be pretty cool to be able to select a person and see their rankings across the categories :-)

Oh, and your "top 100 users" link on the front page is broken as it's not an absolute link.


- top 100 users link - fixed, thanks

- rankings across all the categories. good idea - shouldn't be hard to do, but don't have the time to implement it today :)

- as for what the headings mean, from what I understand they are related to http://en.wikipedia.org/wiki/Big_Five_personality_traits - but yes, the IBM watson docs are very vague on this subject


You need to load your js files over https; by default your page does not work, and you need to disable mixed content warnings for anything to load.


Thanks - it's fixed now :)


Click Hedonism, ctrl f 'tokenadult', shrug at the sufficiently advanced technology indistinguishable from magic.


Purely in terms of the website, you have no back button from the list of users but also no support for browser history.

Otherwise cool!


Gunna have to try this one :)


whoishiring leads the "Artistic interests" section.


Definitely looks like context is lacking here in some respects. I found it interesting that 'whoishiring' "needs love" more than almost everyone on the list.


Cool app! IBM is in the process of significantly updating their documentation on User Modeling, but meanwhile, here are some basic descriptions of some of the traits, as well as links to some of the research behind the service.

User Modeling analytics are developed based on the psychology of language in combination with data analytics algorithms. User Modeling extracts three types of personal characteristics from the data a person generates in social media or within their written/digital communications:

Big 5 Personality - This is the most used personality model that generally describes how a person engages with the world by the following five dimensions:

– Openness-to-Experience - associated with curiosity, intellect, and an appreciation for art and adventure
– Conscientiousness - associated with organization and industriousness
– Extraversion - associated with positive and outgoing attitudes toward other people
– Agreeableness - associated with compassion and cooperation toward other people
– Neuroticism - associated with a sensitivity to negative emotions

Each of the five top-level dimensions has six sub-facets that further characterize an individual at a finer-grained level.

Basic Human Values - this model describes motivating factors that influence a person's decision-making. Our current model includes five dimensions of human values based on Schwartz's work in psychology:

– Self-Transcendence - motivated by helping others
– Self-Enhancement - motivated by increasing social status
– Hedonism - motivated by pleasurable experiences
– Openness-to-Change - motivated by experiencing new things in the world
– Conservation - motivated by tradition and conformity

Fundamental Human Needs - this model is based on Maslow's hierarchy of needs and Ford's work on Marketing and consumer-related needs modeling. It describes, at a high level, which aspects of a product will resonate most with a person.

– Ideal - the person likes high-end, finely crafted products
– Self-Expression - the person likes products that express their individual identity
– Closeness - the person likes products that help them establish closer relationships with family and friends
– Excitement - the person likes products that provide exciting, adventurous experience
– Practicality - the person likes products that simply get the job done

For more detailed information about the research and technical background behind the User Modeling service, see the following:

- You read what you value: understanding personal values and reading interests. Gary Hsieh, Jilin Chen, Jalal Mahmud, Jeffrey Nichols; CHI 2014. 983-986.
- Understanding individuals' personal values from social media word use. Jilin Chen, Gary Hsieh, Jalal Mahmud, Jeffrey Nichols; CSCW 2014. 405-414.
- Recommending targeted strangers from whom to solicit information on social media. Jalal Mahmud, Michelle X. Zhou, Nimrod Megiddo, Jeffrey Nichols, Clemens Drews; IUI 2013: 37-48.
- Modeling User Attitude toward Controversial Topics in Online Social Media. Huiji Gao, Jalal Mahmud, Jilin Chen, Jeffrey Nichols, Michelle Zhou; ICWSM 2014.
- Who will retweet this?: Automatically Identifying and Engaging Strangers on Twitter to Spread Information. Kyumin Lee, Jalal Mahmud, Jilin Chen, Michelle Zhou, Jeffrey Nichols; IUI 2014. 247-256.
- KnowMe and ShareMe: understanding automatically discovered personality traits from social media and user sharing preferences. Liang Gou, Michelle Zhou, Huahai Yang; CHI 2014.
- Identifying User Needs from Social Media. Huahai Yang, Yanyuo Li; IBM Research Report.
- PersonalityViz: a visualization tool to analyze people's personality with social media. Liang Gou, Jalal Mahmud, Eben M. Haber, Michelle X. Zhou; IUI Companion 2013: 45-46.


This makes me think that the new "open source" fight in the AI future will not be about source code or patents but about the corpus used in the training set.



