100 million Facebook users (download torrent) (thepiratebay.org)
60 points by pinksoda on July 31, 2010 | 32 comments



Before people get too bent out of shape/excited, the data within the files is just people's names and counts as well as the URL to their profile. All of these profiles were already public.


But if you went and got it from Facebook, I could alter my settings and make it go away. The torrent is forever.

However, to believe what I just said is completely naive. If you screw up and make something public, the likelihood that it is already forever somewhere is close enough to 100% that you should just expect it.


The URL to your profile and your name are always public information on Facebook, regardless of what settings you tweak. All this torrent supplies is your name and your URL.


Any 1337 hackers interested in pulling off the same heist on LinkedIn are invited to peruse the list at this notorious Warez site:

http://www.google.com/search?sourceid=chrome&ie=UTF-8...


Facebook won't allow this link to be posted in the feed. It's considered abusive.


I believe Facebook doesn't allow any links from The Pirate Bay.


Did you try a link shortening service? Maybe one that doesn't use a 301 redirect.


You can also get Google profile URLs for 16,256,271 users from the top post in this thread: http://news.ycombinator.com/item?id=1537968


I don't know if you've ever tried, but Google generally won't actually let you look through millions of results. It also eventually presents you with a CAPTCHA to solve if it detects that you're scraping its SERPs.


This is basically the business model of RapLeaf.


Facebook's fuckup is not having some sort of anti-scraping system running. If they'd limited the number of public-facing profile views to 100 an hour per IP address, then this would have taken about 41,666 days for one IP address (obviously you'd use more, but it illustrates the point).
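A quick sketch to check the arithmetic behind a per-IP cap. It assumes the 100 million profiles from the submission title and a single IP address; the function name and parameters are made up for illustration. Under those assumptions, a cap of 100 views per hour works out to roughly 41,667 days, while a cap of 100 views per day would work out to 1,000,000 days.

```python
import math


def days_to_scrape(total_profiles: int, views_per_window: int,
                   windows_per_day: int = 1) -> int:
    """Days one IP address needs to view every profile under a fixed cap.

    Hypothetical helper: assumes the scraper always uses its full allowance.
    """
    views_per_day = views_per_window * windows_per_day
    return math.ceil(total_profiles / views_per_day)


PROFILES = 100_000_000  # figure taken from the submission title

# Cap of 100 public profile views per day, one IP address.
per_day_cap = days_to_scrape(PROFILES, 100)

# Cap of 100 public profile views per hour (24 windows per day), one IP address.
per_hour_cap = days_to_scrape(PROFILES, 100, windows_per_day=24)

print(per_day_cap, per_hour_cap)
```

Dividing either result by the number of IP addresses a scraper controls gives a rough lower bound for a distributed crawl.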


And many college students would be blocked from Facebook... even if you increased the number to 1000 :)


It wouldn't solve the entire problem, but resolving the IP address to its assigned DNS name could also help. Anything coming from .edu could be exempt, but 100 full hits an hour from *-dhcp.mylocalisp.com could be flagged for abuse.
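The reverse DNS idea could look something like the sketch below. The policy labels ("exempt", "watch", "default", "unknown") and the rule that a "dhcp" first label marks a residential pool are assumptions for illustration, not anything Facebook is known to do. Keep in mind that whoever controls the address block also controls its reverse DNS records, so a real check would also want to confirm that the hostname resolves back to the same IP.

```python
import socket
from typing import Optional


def reverse_dns(ip: str) -> Optional[str]:
    """Return the reverse DNS hostname for an IP, or None if the lookup fails."""
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
        return hostname
    except OSError:  # covers socket.herror and socket.gaierror
        return None


def classify_hostname(hostname: Optional[str]) -> str:
    """Map a reverse DNS name to a hypothetical rate-limit policy label."""
    if hostname is None:
        return "unknown"
    name = hostname.lower().rstrip(".")
    if name.endswith(".edu"):
        return "exempt"   # campus networks may share one address among many users
    first_label = name.split(".", 1)[0]
    if "dhcp" in first_label:
        return "watch"    # looks like a residential DHCP pool; flag heavy use
    return "default"


def policy_for_ip(ip: str) -> str:
    """Resolve the IP and classify it (performs a network lookup)."""
    return classify_hostname(reverse_dns(ip))
```

For example, `classify_hostname("12-34-dhcp.mylocalisp.com")` falls into the "watch" bucket, while any name ending in `.edu` is exempted.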


Facebook could raise the limit for an IP based on the number of users logging in from that location. Plus, I'm only thinking of the public-facing, non-friend profile views.
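A minimal sketch of that idea: a sliding-window limiter for logged-out profile views whose per-IP allowance grows with the number of distinct accounts seen logging in from that IP. The class name, the base limit of 100, the bonus of 50 per account, and the one-day window are all assumptions picked for illustration.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class PublicProfileLimiter:
    """Sliding-window limiter for public (non-friend) profile views per IP."""

    def __init__(self, base_limit: int = 100, per_user_bonus: int = 50,
                 window_seconds: float = 86_400.0) -> None:
        self.base_limit = base_limit
        self.per_user_bonus = per_user_bonus
        self.window_seconds = window_seconds
        self.users_seen = defaultdict(set)   # ip -> set of user ids
        self.hits = defaultdict(deque)       # ip -> timestamps of allowed views

    def record_login(self, ip: str, user_id: str) -> None:
        """Note that a known account logged in from this IP."""
        self.users_seen[ip].add(user_id)

    def limit_for(self, ip: str) -> int:
        """Allowance grows with the number of distinct accounts behind the IP."""
        return self.base_limit + self.per_user_bonus * len(self.users_seen.get(ip, ()))

    def allow_view(self, ip: str, now: Optional[float] = None) -> bool:
        """Return True and record the view if the IP is under its limit."""
        now = time.monotonic() if now is None else now
        hits = self.hits[ip]
        # Drop views that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.limit_for(ip):
            return False
        hits.append(now)
        return True
```

With these numbers, a dorm NAT address with 200 known accounts behind it would get 100 + 50 * 200 = 10,100 public views per day, while an address with no logins stays at 100.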


Do many colleges use NAT? Every college whose network I've been on gives external IPs to each network device.


The University of Alabama in Huntsville uses NAT for the folks who use its wireless network, or the network in the dorms; everyone on those networks gets a 10/8 address. Additionally, UAH is connected to I2, but doesn't provide any IPv6 connectivity for those who use its I1 connection. :/


What is the background on this file? What data is in it, where and how was it leaked, why does it exist, and why do Facebook users need to be concerned, or not?



I haven't looked at what's in the leaked data but is it some guy just incrementing id numbers and getting the JSON result via http://graph.facebook.com/someidnumber ?

http://developers.facebook.com/docs/api


What's the difference between this torrent and this one ...

http://www.google.co.uk/#hl=en&q=site%3Afacebook.com%2Fd...


... and some profiles in HN are searchable...

http://www.google.co.uk/#hl=en&q=site%3Anews.ycombinator...


I would like to offer a bounty to the first capable script author who can take this 'proof of concept' and correlate names with email addresses and/or telephone numbers.


Now for someone that scrapes the friend graph and turns it into a torrent. That might be more useful.


If this included friend information it would be much more interesting.

As it stands, it's just a proof of concept.


You could gather friend information easily. The info is there, this guy just didn't collect it.


I get a malware warning when visiting the site.


What is this useful for?


I guess it's useful for tracking people who might have deleted their Facebook account.


While this is cool, I'd be truly impressed if it was expanded to all 500 million Facebook users, and included all the friend graphs. Basically a fully scraped copy of Facebook's data, kept up to date, and available for free via BitTorrent. I suppose the best shot at doing that would be using a botnet.


This is supposedly everyone with publicly searchable profiles, so I'd be truly impressed if 329 million people adjusted their settings, and not at all surprised if the 500 million figure did something cute like classify visitors as users, e.g. Digg and their "40 million users", of which 0.0047% liked the current most popular story.


How do you "visit" facebook? You get a login page.


Lots of pages and even apps don't require you to log in.



