If I had the resources Zuckerberg has, and the access to information and data that he does, my posts would be optimized to show asynchronously by region, and to show first in the region most likely to respond positively to my post. That way it sets a tone of positivity for the rest of the commenters. At least in theory.
Nice write-up! I would assume that in a large distributed back-end like Facebook's, notifications are not sent immediately and definitely not sent to everyone at the same time. This probably means that it is impossible to rely on them to get the first comment on a post.
It did seem like some people were able to compose well-thought-out responses in a very short time, most likely because they got a head start thanks to an early notification.
While the analysis was interesting, I can't help but wonder why so many people write pointless comments to what is probably some marketing person impersonating Mark.
It's an ego boost. Nothing more. You associate yourself with a famous person. It feels good. Even if it is totally pointless. Maybe a friend notices. And sometimes your comment gets many likes, but then you're really lucky.
I think a lot of them share the same hopes/sentiment that I did. A lot of them write as if they think there is a high chance that Mark will read their comments. Even though I am pretty certain at this point that he won't.
I would probably go with an auto-reply as soon as he posts something, receive the text message and then edit the reply to something meaningful. Fun read! :)
Well, I know this wasn't your intention, but you could always "sell" that top spot and edit in the content from a top marketer, keeping it subtle enough that it looks legit.
Also, Mark is just one person. Imagine if you opened this script up for use targeting the fans of any celebrity / influencer. Companies would be willing to pay a LOT of money to be able to reach those audiences.
Can't believe I'm advocating comment spam, but sometimes the opportunities jump out at me. :)
This is why I love this industry. He started with a somewhat "bad" idea: even an automated reply would have a really hard time being the first comment. But he was able to experiment and tweak the idea to do something pretty cool with his work while learning. Well done!
There is an amazing amount of Interesting™ stuff you can do with social media. I've been actively looking for and following people in the "social media influencer" game to see just how they pull it off (things like getting apps to the top of the app store, building gigantic and legit Twitter and Instagram accounts, etc.)
On this particular story, doesn't Facebook let you create notifications in any way? I get instant notifications from certain best friends for some reason. Maybe you could create a burner FB account and make Zuckerberg the only friend. I'd also consider trying this on someone like Robert Scoble or a tech journalist - someone who has a gigantic following, but relatively low comment velocity.
Anecdotally, through ~2009/11 you could use keyword density as a significant ranking variable on the Google Play store. I was amazed when I discovered this, as web SEO had long left it behind. I had an app coming up 3rd in searches for things like FB and YT.
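For anyone unfamiliar with the term, keyword density is just the share of words in a piece of text that match a given keyword. A rough sketch of the calculation follows; the function name and the listing text are made up for illustration, and this is not the store's actual ranking formula, which was never public.

```python
import re

def keyword_density(text, keyword):
    """Fraction of the words in `text` that equal `keyword`, ignoring case."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)

# Made-up app description, deliberately stuffed with the keyword "fb".
description = "FB chat for FB friends: share FB photos fast"
density = keyword_density(description, "fb")
print(f"{density:.2%}")
```

A listing stuffed like this one scores far higher than natural prose would, which is presumably why web search engines stopped rewarding it.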
Great point. I was actually thinking to say something along those lines in the conclusion.
While I would say my approach was mostly useless with a celebrity of Mark's scale, it could work with less popular accounts that do not get as much attention.
Love seeing someone using Selenium, and I had no idea you could use anything other than the Chrome driver. What's a headless browser?! I automated whole sections of my old sales rep job with Selenium. It was so awesome.
Basically just a browser without a GUI. You can run a normal browser like Chrome or Firefox with a dummy display server or a real headless browser like PhantomJS or SlimerJS.
I was working a sales development type role, which involved some pretty fierce competition to "claim" inbound leads (contact form submissions basically) before anyone else did. Also there were a series of clicks and repetitive tasks you had to perform on each of these leads after you claimed it.
So I automated getting a list of leads, opening them in multiple tabs and sequentially performing all the repetitive tasks, so when I eyeballed them they were at the right point in the process. I reckon I saved myself about 1.5 hours per day doing this.
I doubt it's the data itself that your browser is struggling with. It's more likely the parsing of the comment content, generation of 20,000 templates, keeping track of probably a quarter million DOM nodes, listening to a hundred thousand events, and making sure all of that is laid out and aligned sensibly in real time that your computer struggles with.
Good point, although 18 MB of text is a lot of text.
If I might digress a bit… most text editors will choke on an 18 MB text file as well (and if it’s an 18 MB text file with just one line (like a base64 string with no line breaks)… good luck getting anything other than a hex editor to open that with any speed). The only text editor I’ve found that doesn’t bog down too much when opening huge text files (even some of the ones with no line breaks!) is the one that comes with OS X: TextEdit¹.
At first I also thought that stalking would mean piecing together several sources to find his physical location or something. So like you, I was initially a bit disappointed by the lack of malice in his approach.
But given his goal of "becoming a friend" by automatically following Mark, I ended up concluding that "stalking" wasn't hyperbole.
I can't tell if saying 'Mark Zuckerberg – the Bill Gates of our time' was a joke or not. I mean, isn't Gates very much still going strong? I get the analogy I suppose, but doesn't that imply Gates is all washed up?
Bill's big asset growth period was the mid 1980s to late 1990s, when MSFT stock doubled every other year. It's only doubled once in the entire 21st century.
Zucky is in his high personal asset growth stage now.
Can you figure out a posting schedule for this Mark Zuckerberg guy? Maybe there are certain times he is more likely to post. Then you could increase your rate of monitoring during those times, without such a high chance of being labelled a spammer by Facebook.
It takes a while for a Facebook post to be processed. When posted, it shows up in the user's news stream. However, once the page is refreshed it might be a few minutes before the post is seen again. I watched a video where some developers were talking about the problems with getting feedback to the posting user and pushing the posts to different server farms around the world. A lot of posts take minutes to propagate.
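A minimal sketch of that idea, assuming you already have timestamps of his past posts: count posts per hour of day, then poll fastest in the busiest hour and back off elsewhere. The function names, the sample history and the 30 to 900 second bounds are all invented for illustration; none of them are known Facebook limits.

```python
from collections import Counter
from datetime import datetime

def hourly_post_share(post_times):
    """Return {hour: fraction of past posts made in that hour} for hours 0-23."""
    counts = Counter(t.hour for t in post_times)
    total = sum(counts.values())
    if total == 0:
        return {hour: 0.0 for hour in range(24)}
    return {hour: counts.get(hour, 0) / total for hour in range(24)}

def poll_interval_seconds(hour, share, fastest=30, slowest=900):
    """Poll faster in hours that historically saw more posts.

    The busiest hour gets `fastest`; hours with no posts get `slowest`.
    Both bounds are arbitrary guesses, not known Facebook limits.
    """
    peak = max(share.values())
    if peak == 0:
        return slowest
    weight = share[hour] / peak  # 0.0 (never posts) up to 1.0 (busiest hour)
    return round(slowest - weight * (slowest - fastest))

# Hypothetical history: three posts around 09:00 and one at 21:30.
history = [
    datetime(2016, 1, 4, 9, 5),
    datetime(2016, 1, 5, 9, 40),
    datetime(2016, 1, 7, 9, 15),
    datetime(2016, 1, 8, 21, 30),
]
share = hourly_post_share(history)
for hour in (3, 9, 21):
    print(hour, poll_interval_seconds(hour, share))
```

With real data you would probably want to bucket by weekday as well, and keep the floor high enough that the quiet hours alone do not look like bot traffic.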
Enjoyed the article. For once, this article showed some flaws in the original process/idea, and showed very nicely how an original seed idea turned into something different, and more involved/interesting.
Clearly a lot of work and sweat went into getting the results you did, and the final output looks very polished.
Does Mark respond to any of the posts? If so, what kind? Do some types of posts always generate more "buzz" among other commenters? Those are some of the questions that would be interesting to answer.
In this case the process was far more interesting than the findings, thanks for citing the book and videos.
I don't remember seeing any responses in my data set. I think I have seen him respond to comments before, which is where I got my inspiration.
I also know that he has people who actually are connected to him (he friended them back) and their comments appear to show up differently from the regular followers.
He does reply to those comments a lot more, intensifying the illusion that if you write to him, he will read it.
Lovely write-up, made me smile. I'll be reading your other posts on related subjects.
Great to read a walk through of a "directed trial and error" approach. Out of curiosity, how did you select NLTK and a graphing approach? Did you consider other techniques for ploughing through the data?
It would just be much easier to use the Facebook Graph API: there is an official Python module, it is well documented, and it would be less likely to hit rate limits or other blocks. Ironically, that was one of the reasons the author used scraping instead of the API.
Incidentally, the Graph API wouldn't work for this use case, since it will not let you get any data from a user unless that user has authorized your app.
Bill Gates was also from an affluent family, which among other things helped him to log a lot of hours on a computer at the time when it was not available to the masses.
I think there is some luck at play for both of them. But I also think they deserve credit for their accomplishments.
I give both of them credit for their successful businesses.
It's the same credit I give to Donald Trump. They are all fine businessmen. In my world, I want moral businessmen.
They all made a lot of money.
There's a part of me that cringes whenever Gates talks about giving away his money. Yes, he made the money legally. We paid for his philanthropy? (To Gates and his wife's non-profit: stop giving third world farmers Monsanto seeds. They are only getting one crop. Listen to Buffett's son.)
As to Mark. Yes he's a fine businessman. I don't think he has had an original thought since he stole the idea from the twins. I've never worshiped, nor liked that guy. I take that back--loved the tee shirts, and jeans. Honestly--I've never understood the whole tie thing.
And to be completely honest, I will love the internet again when/if FB is displaced by the next big, new, whiz-bang app. I am really tired of FB.
Go ahead, call me a troll. Call me a hater. I like a lot of founders, and their companies. I have never liked these two.
To get to the top you need a lot of luck, a lot of ability, a huge drive and every advantage you can get. This doesn't detract from the amazing accomplishment of both men.
I think the article is very tongue in cheek! I'm pretty sure the author was trying to freak Mark out; I'm not certain this work would help to get him a job at Facebook either!