
Okay, perhaps this is easily explained, but I am baffled.

When you search Google for a short phrase, for example, oauth authorization:

search query: oauth authorization About 525,000 results

search query: oauth "authorization" About 1,470,000 results

How does the result not only increase, but increase nearly three-fold?

I've always thought the quotes were meant to make the match more exact. For example, if you search for Web Design without the quotes, you will find results including web design and website design, but if you search for "Web Design" with the quotes, you will not find the entries that say "Website Design", so the quoted search returns fewer results than the unquoted one. So why am I seeing this oddity? Does it make sense to anyone else?




The number of results reported is just a half-assed guess. The server which got the search term calculates it by looking at what fraction of documents matched, what is the total index size, etc.
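To make the extrapolation idea concrete, here is a minimal sketch. It assumes the count is scaled up from a small sample of the index; the function name, the sample sizes, and the index size are all made up for illustration, and nothing here is Google's actual method.

```python
def estimate_total_matches(sample_matches: int, sample_size: int, index_size: int) -> int:
    """Scale the match rate seen in a sample up to the whole index.

    Hypothetical helper: all three inputs are illustrative assumptions.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    # Multiply before dividing so small samples lose less precision.
    return round(sample_matches * index_size / sample_size)


# Two samples of 1,000 documents from an assumed index of 1,000,000.
low = estimate_total_matches(2, 1000, 1_000_000)
high = estimate_total_matches(6, 1000, 1_000_000)
print(low, high)
```

A difference of four matching documents in the sample becomes a three-fold difference in the reported total, which is the kind of swing the question describes.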

Usually, the search phrase is hashed and the phrase sent to a backend server based on the hash. So 'oauth authorization' and 'oauth "authorization"' hash differently and get sent to different backend servers. These two calculate the 'number of results' figure differently, and hence you get the difference.
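A rough sketch of that routing, assuming a plain hash-modulo scheme. The function name, the choice of SHA-256, and the backend count are my own placeholders, not anything a real search engine is known to use.

```python
import hashlib


def pick_backend(query: str, num_backends: int) -> int:
    """Map the raw query string to a backend index (hypothetical scheme)."""
    if num_backends <= 0:
        raise ValueError("num_backends must be positive")
    digest = hashlib.sha256(query.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_backends


plain = pick_backend('oauth authorization', 16)
quoted = pick_backend('oauth "authorization"', 16)
print(plain, quoted)
```

The two strings differ only by the quote characters, but they hash independently, so they may land on different backends, each producing its own estimate.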

That is my guess at how you're seeing these numbers; I don't work for Google (but have some knowledge of another search engine).


The result counts shown by Google are not at all accurate.


Maybe they should "replace" those figures with a search composition figure that denotes how close the results are to the search terms, in the case where Google returns related-type results when it can't find exact matches. Green/yellow/red type stuff.


"replace"

Sorry, double quotes don't mean what they used to. We'll need to come up with something else for that.


+replace?


That is strange. Perhaps when searching "oauth" "authorization" it is increasing the distance allowed between the words available for search.

At least "oauth authorization" seems to behave more normally.


It's not clear that it still works, but there's also the AROUND(n) operator.

  oauth AROUND(3) authorization
would reduce the set to results that include "oauth" and "authorization", in any order, separated at most by three words.
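For anyone unsure what that proximity rule means in practice, here is a small sketch of the matching logic as described above. It is only my reading of the semantics (either order, at most n words in between); the function name and the tokenization are assumptions, and it says nothing about how Google implements the operator.

```python
import re


def within_distance(text: str, a: str, b: str, max_between: int) -> bool:
    """True if a and b both occur with at most max_between words between them."""
    words = re.findall(r"\w+", text.lower())
    pos_a = [i for i, w in enumerate(words) if w == a.lower()]
    pos_b = [i for i, w in enumerate(words) if w == b.lower()]
    # Adjacent words have an index gap of 1, so n words in between means a gap of n + 1.
    return any(0 < abs(i - j) <= max_between + 1 for i in pos_a for j in pos_b)


print(within_distance("OAuth is an authorization framework", "oauth", "authorization", 3))
print(within_distance("authorization of clients is done via the oauth protocol", "oauth", "authorization", 3))
```

The first text has two words between the terms, so it matches; the second has six, so it does not.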


That would be amazing!

Where do you find the fnlist + manpage? Is it possible to search for "foo <at most 3 words> bar"? Aka "around(3), but order matters".


oauth authorization looks for that - oauth authorization - in the page.

oauth "authorization" looks for oauth, with "authorization" nearby, in the page.

So you will get more hits for the latter.



