Tell HN: Crunchbase searches are public
41 points by ig1 on Sept 29, 2010 | 14 comments
I'm assuming most people aren't using crunchbase for confidential searches, but just in case anyone is, it looks like crunchbase saves every search and assigns it a sequential id which can then be used by anyone to look up that search.

For example a search I did:

http://www.crunchbase.com/search/advanced/companies/506276

By incrementing/decrementing the counter on the end you can see the searches other people were doing at the same time.
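
To make the enumeration concrete, here is a minimal sketch that only prints the neighbouring URLs and fetches nothing. The function name `neighbor_urls` and the radius argument are my own inventions for illustration; the base URL is the one from the example above.

```shell
#!/bin/bash
# Print the company-search URLs within +/- radius of a known search id.
# neighbor_urls is a hypothetical helper, not anything crunchbase provides.
neighbor_urls() {
    local id=$1 radius=$2 i
    local base="http://www.crunchbase.com/search/advanced/companies"
    for (( i = id - radius; i <= id + radius; i++ )); do
        echo "$base/$i"
    done
}

# The two searches on either side of the example above, plus the example itself.
neighbor_urls 506276 2
```

Because the ids are sequential, knowing a single valid id is enough to reach everything around it.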




I can picture hundreds of data pack-rats writing the exact same shell script, hoping they're not too late :)


Just realized this only covers company search, although the same issue is there for people search:

http://www.crunchbase.com/search/advanced/people/1250638

Interestingly it looks like people search gets hit much more often than company search which is the opposite of what I would have guessed.

I wonder how long it's going to take for someone to make a high-score table of the most searched founders and startups ;-)


here, have the first step:

curl -s http://www.crunchbase.com/search/advanced/people/1250638 | grep advanced_search_query | cut -d\" -f12


Likely much faster ways of going through this but to expand on your thought:

#!/bin/bash
# Walk every people-search id below 1250646 and append each saved query to people.txt.
for (( i = 1; i < 1250646; i++ )); do
    curl -s "http://www.crunchbase.com/search/advanced/people/$i" \
        | grep advanced_search_query | cut -d\" -f12 >> people.txt
done
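
For the high-score table idea, a rough sketch of the tally step, assuming people.txt ends up with one query per line. The function name `top_searches` and the default of 20 rows are my own choices, and lowercasing the queries before counting is a guess at what makes a sensible ranking.

```shell
#!/bin/bash
# Count how often each query appears and print the n most frequent.
# top_searches is a hypothetical helper; $1 is a file with one query per line.
top_searches() {
    local file=$1 n=${2:-20}
    # Lowercase so "Mark" and "mark" count together, drop blank lines, then tally.
    tr '[:upper:]' '[:lower:]' < "$file" \
        | grep -v '^[[:space:]]*$' \
        | sort | uniq -c | sort -rn | head -n "$n"
}

# Usage, once the loop above has produced people.txt:
#   top_searches people.txt 10
```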


If anyone wants to hack on this together, shoot me an email (email in my profile) I'm thinking of some...interesting uses for the data.


This is the kind of stuff that made Arrington want to sell to AOL.


Looks like they killed it ... all these addresses say "record not found". And, when you run a new search, you get a more standard address, e.g. search for IBM, the address is http://www.crunchbase.com/search?query=IBM


I was just wondering about the quality of the information on crunchbase, so I searched for a man who sold his first company, "XIV", to IBM for $300M and his second company, "Diligent", for $165M (also to IBM), and who recently left IBM and founded a new one named "Axxana".

The only record I found was for Axxana, which secured $9M (Series B).

see at http://www.crunchbase.com/person/moshe-yanai


Appears to be sort of fixed? I think they're expiring them, but I'm not sure based on what.

As of now, my search (#506811) appears to still work, as do a couple before it, including 506800. But search numbers 1, 2, 100000, 500000, 505000, 506000 all fail.

(I'm sorry I didn't do a proper binary search.)

Someone can check on my search later and see if it's time-based. Or do a bunch of searches until it expires and see if they're just storing the last N searches.
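
For anyone who wants to do the proper binary search, a sketch under two assumptions: every id below some cutoff is expired and every id from the cutoff up to a known-good one still works, and a live page still contains the advanced_search_query marker used in the curl one-liner earlier in the thread. The names `search_is_live` and `lowest_live_id` are made up, and I'm guessing these ids belong to the companies search.

```shell
#!/bin/bash
# Hypothetical check: succeeds if the saved search with id $1 still resolves.
# Assumes a live page contains the "advanced_search_query" marker.
search_is_live() {
    curl -s "http://www.crunchbase.com/search/advanced/companies/$1" \
        | grep -q advanced_search_query
}

# Binary search for the lowest live id in [lo, hi].
# Assumes hi is live and that everything below the cutoff is expired.
# $3 names the check function, so it can be swapped for a stub.
lowest_live_id() {
    local lo=$1 hi=$2 check=${3:-search_is_live} mid
    while (( lo < hi )); do
        mid=$(( (lo + hi) / 2 ))
        if "$check" "$mid"; then
            hi=$mid            # mid is live, so the cutoff is at or below mid
        else
            lo=$(( mid + 1 ))  # mid is expired, so the cutoff is above mid
        fi
    done
    echo "$lo"
}

# Usage against the live site (about 19 requests for this range):
#   lowest_live_id 1 506811
```

Running it now and again later, and comparing the cutoff against the newest id each time, would show whether expiry is time-based or a fixed window of recent searches.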



Which begs the question, if you search for a tree in an empty forest and nobody logs it, does it make a sound? (Lumberjack pun not intended)


People search for "mark" and "zynga" not "how do I kill my co-founder and get his equity?"

I didn't see any searches that seemed in any way sensitive. Are there any?


I'm guessing people might search for their "stealth" business ideas or companies might be searching for takeover targets.

Or say a VC posts a link to a crunchbase search on their twitter; someone could look at the previous/next searches and find out what companies that VC might have been looking at.

People assume searches are anonymous and can often search for things without thinking about it. In one case I had a google search referrer in my blog which was "xyz takeover abc" which came from an IP address owned by xyz (who happened to be a YC company I mentioned on my blog) prior to their takeover by abc being announced.


Just advanced searches though?



