Having been through similar decision-making processes myself (with the Google Custom Search API, the Google Translate API, etc.), I'd say this is just as likely an abuse mitigation technique as it is a revenue generation opportunity.
Requiring even a modest at-cost fee for a web service does wonders to discourage all sorts of misuse, from wanton large-scale data mining, to blatant repacking and resale, to worse. (Heck, simply requiring a valid credit card alone helps.)
And sadly no, simply having low quotas for free access doesn't entirely suffice. If there's material value to be extracted from a free service, you'd be amazed at the lengths people will go to in order to create large numbers of low-volume scrapers. Most of these are obvious and easy to detect and defeat, but continually doing so adds up in cost, and it takes engineers away from providing better services to legitimate customers.
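To make the quota point concrete, here is a minimal sketch. The quota value, key names, and request counts are all made up for illustration and are not taken from any real API. It only shows that a per-key cap limits each key, not the total volume someone can pull by registering many keys.

```python
# Hypothetical daily free quota per API key (an assumption for illustration).
FREE_QUOTA_PER_KEY = 100


def allowed_requests(requests_by_key, quota=FREE_QUOTA_PER_KEY):
    """Return how many requests each key actually gets served under a per-key quota."""
    return {key: min(count, quota) for key, count in requests_by_key.items()}


# The same demand of 50,000 requests, sent through one key or split across 500 keys.
single_key = {"key-0": 50_000}
many_keys = {f"key-{i}": 100 for i in range(500)}

served_single = sum(allowed_requests(single_key).values())
served_many = sum(allowed_requests(many_keys).values())

print(served_single)  # the single heavy key is capped at the quota
print(served_many)    # the split workload stays under every per-key cap
```

Under these assumed numbers the single key is cut off at the quota while the split workload is served in full, which is why someone still has to detect and block the key farm.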
In short, most people on the outside don't appreciate just how difficult the handful of bad guys make it for companies to do something good for the other 99%. So I'm sympathetic to Microsoft here, I really am.
This. The best way to fight these types of spammers and scrapers is through economics - provide the content at-cost and it is no longer cost-effective for spray-and-pray models.
Imagine what the world would be like if there were a cost-per-unit to sending email - our inboxes would be a much saner, friendlier place. Fewer marketing emails, virtually no spam (no longer economical).
This got me thinking... is it possible for Microsoft to secretly buy out DDG and leave them the fuck alone except where they need help (i.e., access to the Bing index)?
For all the money the Bing unit keeps pouring in, I feel buying DDG and leaving them the heck alone could be a reasonable long-term bet with relatively little risk.
The challenge would be to keep the founders/team motivated. They could spin it off as a completely different company on an IPO track and give significant equity. But then what if Google wants to buy them out? And of course, the Bing team may have a problem with MSFT creating internal competition, though that may just push 'em to do better.
More likely, if DuckDuckGo gets in an acquisition bidding war, I'd put my money on Gabe passing on acquisition in favor of raising a huge funding round that lets him take some money off the table.
With all the flaws (enough that I don't use it), it remains a rare search engine start-up that has its heart in the right place: to actually serve consumers versus build some technology or team and get acquired (looking at you, Powerset).
I think you're getting ahead of yourself here. DDG has about 0.1% of all search traffic (30m out of 23b queries per month). They get all of their 'relevancy' from Bing and blekko. The reason you like them is that they clean up these results and provide peripheral add-ons like user privacy, no ads, easier syntax for power users, and one-boxes which function more as a knowledge base than a search engine. They don't crawl/index the web, or if they do, we haven't heard anything about it or seen any relevancy rankings that differ from the BOSS API. So they are really a new face on already extant search engines.
You mention a 'bidding war'. Who would buy them? Any 'features' they provide can be copied by MSFT/Google if they feel threatened. They don't have their own backend search technology, so partners who might want to do search with them (ask.com, search.com, etc.) have no reason to work with them as opposed to using the BOSS API themselves, or even Bing. I'm just as excited as you are about innovation in search and competitors to Google, but DDG needs to be re-architected on the back-end before these sorts of pronouncements make sense.
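For what it's worth, the share quoted above can be checked with quick arithmetic. This sketch uses only the two figures given in the comment (30m and 23b queries per month); the variable names are just for illustration.

```python
# Monthly query counts as quoted in the comment above.
ddg_queries_per_month = 30_000_000
all_queries_per_month = 23_000_000_000

# Share of total search traffic, expressed as a percentage.
share_pct = ddg_queries_per_month / all_queries_per_month * 100

print(f"{share_pct:.2f}%")
```

So the "0.1%" figure is a rounding of roughly 0.13%, which doesn't change the argument.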
Yep, I completely agree. DDG is built off Bing's platform, so they're screwed if MSFT ever makes their index private. They also don't have a strong brand outside of hackers who've heard of them.
I must admit they do a good job in the PR area of making themselves seem big and sexy. But they're really not innovating search, or providing any thick value in improving search. At least Blekko was trying to create a better algorithm.
"At least Blekko was trying to create a better algorithm."
Why the past tense? Blekko is still alive and kicking. (And I hope they find success, because they're doing amazing technical work. From a purely technical perspective, they deserve ten times the press that DDG is getting.)
I like your defense of Blekko. I am often puzzled why blekko is talked about so little, compared to DDG, on Hacker News. Nothing against DDG, but blekko is the only new entrant trying to fight Google head-on: it does its own crawling, and Skrenta believed that search could be improved when he started (apparent from his blog posts).
But I think the mistake Blekko made was perhaps being too close to a Google kind of (i.e. traditional) search engine. On this path, it may take them at least another 3 years before people start taking them seriously. It will be a hard and grueling road.
But the thing I like about them is that they did not make the mistake of Cuil, and are being conservative in making promises.
Overall, I suspect they may be feeling a bit out of sorts, as when they started out (2007) the social network thing was still in its infancy. And now people are talking about Facebook coming into search and so on, which, if it happens, may be a totally different approach to search than what blekko did, which was trying to emulate and out-do Google.
Yeah, I was thinking of Cuil, haha.. and you're right, they deserve much more press. It's like people here want to create a self-fulfilling prophecy by saying DDG should be acquired..
Yeah, I'll be pretty excited when DDG gets to a point where they can start hiring the engineers to build a crawler/indexer; that way they can really start to kick ass.
As an aside, I still use blekko every day, and they obliterate Google and Bing in certain intent queries. If you search for some intent query in these categories: travel, jobs, real estate, cars, finance, legal, medical, services, and merchandise sales, AND one of their blekko/user-curated slashtags fires/autofires, blekko destroys (e.g. cure for headaches).
I've used both Blekko and DDG when trying to look for some reviews of laptops, and Blekko was much better than DDG and even Google. I've gone back to using Google all the time now, but I'm surprised that DDG got all this attention and Blekko gets none, when I think Blekko is better than DDG, at least relevancy-wise.
Here is something ironic. I bet you many of DDG's users are hackers who use it for privacy reasons. The same hackers who rely on Google Analytics in their own websites. If DuckDuckGo grows to become more than a niche search engine, the same hackers who use it will have to reinvent Analytics somehow. This is reason #1 why DDG will stay small. Reason #2 is that if necessary, a DuckDuckGoogle can be created in an afternoon's worth of effort in Mountain View.
I don't understand your line of reasoning. Yes, Google Analytics is useful, and we would have to replace it if Google the company went out of business or discontinued it, but why is that going to motivate people to use Google the search engine?
blekko has been running a crawl+index of several billion pages for 2 years now, so perhaps I can talk about this a little.
If you want access to a big crawl to grep through it for interesting data, then Common Crawl is awesome and inexpensive and I don't think you can get anything like it for the price, unless your query is simple enough to run as a blekko webgrep (https://blekko.com/webgrep).
If you want to build a search engine, Common Crawl isn't so useful. Search engines want _directed_ crawling of the pages that they think are good. Crawling is only a small fraction of the total work done in a search engine. Search engines generally aren't on AWS, because the right configuration of machine isn't rented by Amazon -- serving queries needs SSDs or more RAM and less CPU than what Amazon offers. So, what Common Crawl offers a search engine is higher costs and mostly bad data.
I'm sure they still make heavy use of the Bing API, but they have since expanded their range of sources to soften the blow somewhat. They're now getting results from Blekko (who run their own index), and I'm sure they've been building out their own index. There's a full list of sources here - http://help.duckduckgo.com/customer/portal/articles/216399-s....
We also don't have any pricing details for higher volume usage of Bing, and DDG are in a much better position to negotiate a better deal these days.
This seems like an odd move. It's not like Bing has any traction with developers at all. Wouldn't charging them make it even harder to gain traction? I am not familiar with their API; what does it have over Google that would make me pay for it?
Google doesn't have a search API (it was long ago deprecated: https://developers.google.com/web-search/). So the API itself is the advantage over Google in this case.
Did you read the link you provided? They actually mentioned right at the top that they have a newer search API which is recommended (though I'll admit that I missed it too at first).
Yes, I did :). As mentioned below, it's not a comparable product. Custom Search is site-specific. It's an API for the "search this page using Google" forms you sometimes see on blogs and the like. It's not an API for "capital S" Google Search, while Bing's API really is their Search API.
Ahh, gotcha. I thought Custom Search was that old thing they had which let you create a "customised" Google Search which had your colour scheme of choice and optional restriction of the results by topic (i.e. basically standard search), not the equivalent of 'site:x'.
the "custom search API" does not deserve the "search" in its name. it's a different index, it's crap, and you can't access it as a whole, but only whitelisted parts.... google does not have a search API, it has a sad excuse for a search API
In my experience with the Bing API in the last several months, I've found that you get what you pay for. Its performance has been inconsistent at best, to the point that I created the site http://isthebingapiworking.heroku.com/. The web search API frequently returns orders of magnitude fewer results than the actual website, a problem that is a frequent theme in their developer forums: http://www.bing.com/community/developer/f/12254.aspx.
By charging for their search API, I would just say that Microsoft is beginning to take their API seriously. It seems pretty clear that minimal resources, if any, were dedicated to the free version.
I'm bummed, because I've found relative success using their news search api (particularly for the article aggregation component of http://www.congressionalprimaries.org/), and now we'll have to look into alternatives, but if this means actually providing a decent product, I think this is a good move for Microsoft.
No surprise there. I wonder if all the faux search engines will have to start either crawling/indexing, or transition to Knowledge Engines ( I'm looking at you DDG ). Curious to see if this sparks people to start more search companies.
Actually, the search API would be interesting for domain-specific search. You could use the API to create a site that presents results specific to MP3s, for example, formatting the results with the MP3 attributes.