Hacker News

I don't see this case as a particular problem. "People wanting to learn about goats" might actually be in the minority. Most people googling that word may in fact be looking for some alternative meaning, because someone used it in another sense and they didn't understand it. So they googled it and then clicked Urban Dictionary, voting it up.

However, there's a particularly interesting case:

https://www.google.com/#q=tsla

I don't know about you (Google personalizes results to some degree) but the first result I see is Yahoo Finance. Google Finance is second. Why is Google promoting Yahoo over their own product?




> I don't see this case as a particular problem. "People wanting to learn about goats" might actually be in the minority.

This is the problem. Search isn't a democracy. I look up the results I need, so not giving me the ability to filter makes no sense. An engine that does what Google does is an amazing achievement, but it no longer makes sense as a model, for the exact reason you gave as a defense.

edit: If disagreement could be verbalized it would be super helpful to me. I have been thinking about this issue a lot, and often I see people say the same thing as me:

* a single website controls almost 100% of English-language results

* limiting of images/video from results

* incorrect/wrong results

* limited respect for boolean and quotation operators

* high ranking sites are not authoritative, e.g. w3schools, wordpress automated blogs.

* no way to filter at all

But then they downvote my conclusion:

> searches could be filtered and parameters set by user.

Really would be useful to understand my logic error, or if I have missed something.


Search being a democracy is really just a crude way of creating a better ranking system than just looking at, say, keyword occurrence count. Humans are great at filtering out spammy and useless websites, and the democracy system picks up on that.

As a next step, privacy issues aside, what if they "profiled" you by the types of things you search, and tried to guess what you need based on other people who "think like you"?

For example, I'm a programmer, and if I search "python", I'm probably searching for something different than a biologist who is researching reptiles. This would be fairly obvious to decide based on the other types of things I typically Google for.

I'm sure Google is already researching how to do this. It sounds difficult to me, though, because of the sheer number of models you'd have to train and store, and then figure out how to run a distributed index on. It might be more feasible to create some small set (e.g. ~1000ish) of profiles of "types of people" and then match you into one of those types. This could also mildly alleviate the privacy issue, as the profiling could be done offline on the client.
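A rough sketch of the "match you into one of a small set of profiles" idea, assuming cosine similarity over coarse topic counts. The profile names, topic weights, and term-to-topic table are all made up for illustration, and nothing here reflects how Google actually does it.

```python
from collections import Counter
from math import sqrt

# Hypothetical interest profiles: topic -> weight. A real system would learn
# these from many users; the values here are invented for illustration.
PROFILES = {
    "programmer": {"code": 0.9, "linux": 0.6, "animals": 0.1},
    "biologist": {"animals": 0.9, "genetics": 0.7, "code": 0.2},
}

# Hypothetical mapping from past query terms to coarse topics.
TERM_TOPICS = {
    "segfault": "code", "regex": "code", "ubuntu": "linux",
    "reptile": "animals", "habitat": "animals", "dna": "genetics",
}


def topic_vector(history):
    """Count how often each known topic shows up in the user's past queries."""
    counts = Counter()
    for query in history:
        for term in query.lower().split():
            if term in TERM_TOPICS:
                counts[TERM_TOPICS[term]] += 1
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse topic -> weight mappings."""
    dot = sum(a[k] * b.get(k, 0.0) for k in a)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def match_profile(history):
    """Return the name of the profile closest to the user's search history."""
    vec = topic_vector(history)
    return max(PROFILES, key=lambda name: cosine(vec, PROFILES[name]))
```

Since this needs only the local history and a small, shared profile table, it could in principle run on the client, which is the privacy angle mentioned above. A search for "python" could then be disambiguated by whichever profile `match_profile` returns.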


I made this point below about the inability to provide context. In the other thread link, I think I explained why. I am no machine learning specialist, but I think it is because:

Google can never necessarily know what you want and can never truly know you achieved your goal, so you could not train it properly.

Not only would you need to discover what profession I am in (assuming you had a fully updated linked profile, etc.), you would also need to build a comparable universe of like-minded people and calibrate against it.

Then, you would have to assume what inputs are similar in that they have same/similar parameters and expect similar results.

Then you would have to assume which link I clicked was the answer, for every person who did this same thing.

Then you would have to discount your own bias as an engine: you provide the top results to me, and (for now) people trust the engine, so they typically get a false choice among the first 5-10 things. If those 5-10 things are wrong, the whole model is in error to the extent that they are wrong.

Any one of these would introduce error, and the cascade leads to larger disparity. Google IS SO AMAZINGLY GOOD that it has actually managed to make this not a problem for a very long time.
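To make the "discount your bias as an engine" step concrete, here is a minimal sketch of one common way to think about it: divide clicks by the impressions that were probably looked at, rather than by all impressions. The examination probabilities and the log entries are invented numbers, and estimating them in practice is exactly the hard part described above.

```python
# Assumed probability that a user even looks at each rank (made-up numbers;
# a real engine would have to estimate these somehow).
EXAMINE_PROB = {1: 1.0, 2: 0.6, 3: 0.4, 4: 0.3, 5: 0.2}


def debiased_ctr(clicks, impressions, rank):
    """Clicks per examined impression rather than per raw impression."""
    examined = impressions * EXAMINE_PROB[rank]
    return clicks / examined if examined else 0.0


# Hypothetical log entries: (url, rank shown at, impressions, clicks)
log = [
    ("a.example", 1, 1000, 300),
    ("b.example", 4, 1000, 150),
]

# Raw click-through rate favors whatever the engine already ranked first.
raw = {url: clicks / shown for url, _, shown, clicks in log}

# After correcting for position, the lower-ranked page can come out ahead.
fixed = {url: debiased_ctr(clicks, shown, rank) for url, rank, shown, clicks in log}
```

With these numbers the raw rate prefers `a.example`, while the corrected rate prefers `b.example`, which is the kind of error a model trained on raw clicks would bake in.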


> Google can never necessarily know what you want and can never truly know you achieved your goal, so you could not train it properly.

They do have some amount of confirmation. All of their search results are redirect links, so they're tracking which links you click on. Based on the timing of those clicks, they can tell if you clicked on a result, left that site a few seconds later, and then clicked on another result further down the page, which probably means the first result didn't give you what you want. It's not perfect but it's still potential training data.

If that site has Google Ads or a Google '+1' icon, they can get slightly more information about how you spend your time on that site. I don't know about the legality of this but it's technically feasible.
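The click-timing idea can be sketched like this, assuming a session is just a list of clicks with timestamps. The `Click` type, the 10-second threshold, and the "last click satisfied the user" rule are all assumptions for illustration, not anything Google has documented.

```python
from dataclasses import dataclass


@dataclass
class Click:
    url: str
    timestamp: float  # seconds since the results page was shown


# Assumed threshold: coming back and clicking something else sooner than
# this is treated as "the page did not help". The value is a guess.
SHORT_DWELL_SECONDS = 10.0


def label_clicks(clicks):
    """Label each clicked URL 'bounced' or 'satisfied' by time until the next click.

    The last click in a session has no follow-up click, so it is assumed to
    have satisfied the user (which may be wrong: they may have given up).
    """
    ordered = sorted(clicks, key=lambda c: c.timestamp)
    labels = {}
    for current, following in zip(ordered, ordered[1:]):
        dwell = following.timestamp - current.timestamp
        labels[current.url] = "bounced" if dwell < SHORT_DWELL_SECONDS else "satisfied"
    if ordered:
        labels[ordered[-1].url] = "satisfied"
    return labels
```

For example, `label_clicks([Click("first.example", 2.0), Click("second.example", 7.0)])` marks the first result as bounced and the second as satisfied. As the comment says, it's not perfect, but it's still potential training data.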


But you are given the ability to filter your results! Countless ways actually!

Qualify your searches and modify them if needed. You were looking for information about goats. You should qualify "goats" with some form of "information about" statement.

"goat" + "animal" makes the wikipedia page for goats the first search result.

"goat" + "facts" gives you countless trivia pages, information, videos, etc

Alternatively:

goat -"greatest of all time" -"goatse" will search for "goat" while excluding results that contain "greatest of all time" or "goatse".
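If you build these queries often, the exclusion syntax above is easy to script. A small sketch, assuming the standard `https://www.google.com/search?q=` URL form; the helper names `build_query` and `search_url` are made up here.

```python
from urllib.parse import urlencode


def build_query(terms, exclude=()):
    """Join search terms and append -"phrase" for each excluded phrase."""
    parts = list(terms)
    for phrase in exclude:
        # Quote every phrase so the minus applies to the whole phrase.
        parts.append(f'-"{phrase}"')
    return " ".join(parts)


def search_url(query):
    """Percent-encode the query into a search URL."""
    return "https://www.google.com/search?" + urlencode({"q": query})


query = build_query(["goat"], exclude=["greatest of all time", "goatse"])
url = search_url(query)
```

Here `query` comes out as the same string as in the example above, and `url` is its percent-encoded form.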


Correct. However, the problems (as I see them now) are twofold:

No concept of time. The best-case notion of time seems to be either one provided by the site ("article date" or something to this effect) or t0 = when Google first learned about the page.

So yes, "goat" + "animal" will return your results. Try:

https://www.google.com/#q=%22nodejs%22+%2B+%222016%22++mongo...

Top answer for: "nodejs" + "2016" mongo api

returns top hit: 2015, 2nd hit: 2014.

and that I can't give it context myself:

For example, "I am on a Mac, but my PC is broken, so I am looking for Windows info", or "don't treat Alexa top-1000 links as authoritative". Million Short (I believe) removes the Alexa top 1000 from results, but not their link authority.

Also, something like [neverShowWordpressSite unless traffic > 3 million unique], since some larger news sites, like Bloomberg, are actually built on WordPress. But the point is that I would delist by technology, filter by time, and tweak my authority parameters.

However, if Google let you do this, it would exponentially compound the difficulty, as the algorithm would exist on both sides of the equation.


You can set a time range under "Search Tools". See here: http://i.imgur.com/n5BNkei.png

As for filtering by backend technology - I'm not sure that would always be feasible. While it may be possible to filter default WordPress sites, I'm not so sure about sites whose backbone architecture may not be known or publicly available.

Edit:

As for your computer issue, try searching for "problem/error name" + "solved" rather than "how to fix" + "problem/error name".


Thanks. The "issue" was an illustrative example. However, I am a relatively tech-savvy person, I use Google frequently, and I didn't know about this feature. I would be interested to know what percentage of searches use it, relative to people who type a recent year/date into the box, just as a quick and dirty statistic about its visibility.

I don't actually want to filter by backend technology, but I would like to communicate to the search engine that I do not trust (nor want returned as a result) any WordPress, Blogger or Medium website, and that I want their rank to be negative.

That is another extreme example. However, discovering new things and finding useful information is hard when communities of bad actors have spent years incentivized to rank higher rather than to produce quality. It could be easier to simply delist everything and gradually add websites you trust to have authority.
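A minimal sketch of what user-controlled trust could look like if it were applied on the client, after the engine returns its results. Everything here is hypothetical: the `(url, score)` result shape, the trust weights, and the rule that a weight of zero or less delists a domain. It also matches by domain only, since detecting the platform behind a custom domain is the harder problem raised above.

```python
from urllib.parse import urlparse


def weight_for(host, trust, fallback):
    """Return the trust weight for a host, matching subdomains as well."""
    for domain, weight in trust.items():
        if host == domain or host.endswith("." + domain):
            return weight
    return fallback


def rerank(results, trust, allowlist_only=False):
    """Re-rank (url, engine_score) pairs using per-domain trust weights.

    A weight of 0 or below drops the result. With allowlist_only=True,
    domains missing from trust are dropped too, i.e. delist everything
    and gradually add the sites you trust.
    """
    fallback = 0.0 if allowlist_only else 1.0
    kept = []
    for url, score in results:
        host = urlparse(url).hostname or ""
        weight = weight_for(host, trust, fallback)
        if weight > 0:
            kept.append((url, score * weight))
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical engine output and user preferences.
results = [
    ("https://someblog.wordpress.com/post", 0.9),
    ("https://en.wikipedia.org/wiki/Goat", 0.7),
    ("https://example.org/goats", 0.5),
]
trust = {"wordpress.com": 0.0, "wikipedia.org": 1.5}

ranked = rerank(results, trust)
strict = rerank(results, trust, allowlist_only=True)
```

With these made-up preferences, `ranked` drops the WordPress-hosted page and puts Wikipedia first, while `strict` keeps only the explicitly trusted domain.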


I was going to suggest daterange, but it didn't work for me (I haven't used it in a while). I think this tool replaced it, though, and you can even set custom ranges with it.


It is though. Search is about providing the highest likelihood of the desired result, not matching words, at least in one value space.

If most people searching for "goat" want to know what the word means in slang, that should be the top result. Possibly you could argue that you want a personal search profile that knows you value Wikipedia higher than other links, but for the default case it feels like optimising for highest odds of success makes sense.


The default case is that everyone has a "search profile" but it is made by google and applied to abstracted parameters.

Everyone wants a "search profile", except they would like to control it and how it is applied, since it is, for most people, their most important interaction with a computer, i.e. how they access information.

Currently, in some respects, that comes out of a single silo or a set of balkanized silos.

This will not be true in the future. One place cannot dictate information flow for the world. Plus, Alphabet has better things to do.


>I don't know about you (Google personalizes results to some degree) but the first result I see is Yahoo Finance. Google Finance is second. Why is Google promoting Yahoo over their own product?

Because that's how lawsuits happen. They let the search engine run itself. If more people are using Yahoo Finance over Google Finance, the search results will reflect that.

I genuinely believe they aren't dumb enough to open themselves up to monopolistic behavior lawsuits. The Microsoft lawsuits weren't that long ago and I'm sure Google is wary enough to make some attempt at avoiding a repeat.

I have little reason to suspect Google purposefully toys with their search results to promote their own products. Given the quality of their products - I'm more inclined to believe that when a Google product is the first result for a name/search, it's probably because people actually use/enjoy that product.

As an example, Google Maps vs other "Maps". While Google is certainly trying to join the ranks, I find other "Maps" to be entirely unusable with terrible UI and am not surprised in the slightest when Google Maps is the first result when looking for directions.

It could also be you use Yahoo Finance more often and thus personalized results had it listed first. Google Finance ranks 4th on the page for me for that search result.


Interesting. But Yahoo has their own search engine. If Yahoo listed Yahoo Finance first, and Google listed Google Finance first, is there any legal issue? Both are working search engines and nobody is forcing you to use one over the other. I don't see a "true" monopoly situation. At least not a Comcastish monopoly in which other players are explicitly blocked from entering the market.

For what it's worth, Bing and Yahoo also list Yahoo Finance first when you search for "TSLA".


>But Yahoo has their own search engine.

No, they don't. They're a front-end for Bing search (at least until 2019, and everywhere but Japan, where IIRC it is a front-end for Google), but that's being nitpicky. :)

FWIW I've always had an issue with companies being punished for simply being better than alternatives. That includes Microsoft's advertising IE, even if IE at the time wasn't the best browser. People were free to use IE to download a better browser, so I never saw the issue with providing IE as a default. Linux was free, they were free to buy a computer, set it up themselves, and install Linux on it. The fact that they weren't choosing to do so should not result in Microsoft being punished.

However, whatever my beliefs, the precedent set by previous cases (even if it isn't legal precedent?) is still enough to have Google play it cautiously, especially since there has been the threat of such lawsuits if they were caught playing with their search results to advertise themselves and "kill off" competitors.


I just went there and the first "result" is a giant stock chart widget showing the 1-day chart, with the options of viewing 5-day, month, etc. This chart takes up a full 1/3 of the vertical screen size. At the bottom of the chart there are 3 quick-links to Google Finance, Yahoo Finance, and MSN Money. The next result-ish thing is Google's "In the News" section, which has an article about the broken Gigafactory promises. The first real result is Yahoo Finance, just as it is for you.

I suspect it isn't so much that Google is promoting Yahoo over its own product, since that first chart is clearly a Google product, but that it doesn't bias the real search results in its own favor. It's a legitimate organic search result. Most people probably either just want the stock price in the first/simplest way possible, which the chart provides, or they want the Yahoo Finance page.


Not here, but could it be because of the "s=TSLA" in Yahoo's URL?



