One thing I find annoying is that they still return results from sites that seem to register a load of terms all pointing to the same page. You see this with telephone numbers, song lyrics, etc., where the result looks like "Lyrics for Stairway to Heaven" but you click through and there is no content, just a page that says "Upload some lyrics to this song", etc.
These sites should be heavily penalised for click-baiting and they have been doing it for years.
I've found my result quality has gone up dramatically now that I use extensions allowing me to block domains from google search results. It seems silly that I ought to have to do this, but Google has finally become useful again as a result.
Google used to let you do this itself when you were logged in. I never understood why they removed that; surely the information on what domains were deemed unwanted was valuable!?
So I assume by customer you mean advertisers and by product you mean the users carrying out searches.
For their commercial search results I'll grant you that there's an incentive. But for their non-commercial search results why would they care? That's how it used to work, you couldn't delete a domain from the paid results, but you could make it disappear from the unpaid ones.
Are these client-side only extensions that are manipulating the DOM, or are they somehow feeding parameters to google to exclude domains (by e.g., automatically suffixing a series of "-site:blah.com"s to the query)?
They manipulate the search result DOM. uBlacklist is the one I use; it manipulates the search result page, tells you when something has been blocked, lets you unblock that site, etc.
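Conceptually it's something like this (an untested sketch, not uBlacklist's actual code; the "div.g" selector is a guess, and Google changes its markup all the time):

```typescript
// Untested sketch of the DOM-manipulation approach (not uBlacklist's actual code).
// The "div.g" result selector is a guess; Google changes its markup frequently.
const blockedDomains = ["spam-lyrics.example", "copied-stackoverflow.example"];

function hideBlockedResults(): void {
  document.querySelectorAll<HTMLAnchorElement>("div.g a[href^='http']").forEach((link) => {
    const host = new URL(link.href).hostname;
    if (blockedDomains.some((d) => host === d || host.endsWith("." + d))) {
      const result = link.closest("div.g") as HTMLElement | null;
      if (result) result.style.display = "none"; // or grey it out and add an "unblock" button
    }
  });
}

// Re-run whenever Google injects more results (instant search, infinite scroll).
new MutationObserver(hideBlockedResults).observe(document.body, { childList: true, subtree: true });
hideBlockedResults();
```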
I use the userscript "Google Hit Hider by Domain (Search Filter / Block Sites)" by Jefferson Scher, whom you may know as one of the top support specialists at Mozilla: http://www.jeffersonscher.com/gm/google-hit-hider/
"GHHbD" is a precursor to uBlacklist; and was for many years THE replacement to blocking sites on Google Search after Google removed the built-in function.
It also has a 'Block' button next to the search results (remember, the script existed long before uBlacklist), which lets you grey out or hide results, from specific subdomains down to the base domain.
Don't forget to use ungoogled chromium if possible
The only downside is you will have to load unpacked extensions instead of using the Chrome "store" and you will have to manually install Chromium updates from the same site:
I worked on a search engine at a startup that did exactly this, you could up and downvote each result. The main feature was that we essentially "sharded" the search engine so it could be embedded on different sites and give different results based on each community. So a search for "casting" on a fishing website would give different results than the same search on a metallurgy site, as voted by each community. We could also learn passively by watching which links users clicked on and where they "disappeared" from the search engine - presumably the last clicked link was the result they were looking for.
Google did copy the voting feature on their results page briefly but abandoned it. [0] This was back in 2006-7. We learned the hard way that it's pretty much impossible to compete with Google in search even when you're innovating. They either copy you, or can just blackhole you out of existence.
You can work around that by using javascript (checking for tab losing focus) and checking how quickly the links are being clicked. If you middle-click a bunch of links, then don't click any more links, then it can be inferred that one of the links that you clicked was "good".
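Something along these lines (a rough sketch; the selector, the /feedback endpoint, and the thresholds are all made up for illustration):

```typescript
// Rough sketch of the passive-feedback idea: log result clicks and tab focus changes,
// then infer which click "satisfied" the query. Selector, endpoint and thresholds are made up.
type Click = { url: string; at: number };
const clicks: Click[] = [];

document.querySelectorAll<HTMLAnchorElement>("a.result-link").forEach((a) =>
  a.addEventListener("mousedown", () => clicks.push({ url: a.href, at: Date.now() }))
);

document.addEventListener("visibilitychange", () => {
  if (document.visibilityState !== "visible" || clicks.length === 0) return;
  const last = clicks[clicks.length - 1];
  const awayMs = Date.now() - last.at;
  // A burst of middle-clicks only tells us that *one* of the opened links was good;
  // a single click followed by a long absence suggests that specific link was the answer.
  const burst = clicks.filter((c) => last.at - c.at < 3000);
  if (burst.length === 1 && awayMs > 30_000) {
    navigator.sendBeacon("/feedback", JSON.stringify({ good: last.url }));
  }
});
```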
The lack of an ability to vote on search results seems like a baffling omission. Normally Google loves to crowdsource their work, but for this one area where we would actually want it, they decide that they know better when they clearly don't.
It would be abused into the ground by the same people who set up these scam sites in the first place. For most things on the internet, the crappiness of the average person is the limiting factor.
They don't have to make it a public vote. I'm fine to do this work myself for a week or so, and then enjoy crap-free searches. Or link my votes account to someone who I trust and use their opinion.
Idk why everyone is praising public anything, because local communities and knowledge webs worked fine pre-internet. Most of the bullshit came with globalization.
I wish there was some standardized way to say "I trust you" on the internet, and share part of their information bubble (e.g. reviews, up/downvoted websites, youtube recommendations, etc.), and this bubble could have some transitive properties (if I trust X and X trusts Y, then to some extent I trust Y too).
But it's probably too privacy sensitive (if I see which sites you upvoted, then I have some information about you). Hence, that's probably why this has to be either completely private or completely public.
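To make the transitive part concrete, a toy sketch of trust with decay (the names and the decay factor are made up; this is just the idea, not a real protocol):

```typescript
// Toy sketch of transitive trust with decay: if I trust X at weight w and X trusts Y,
// then I trust Y at w * DECAY. Names and the decay factor are made up for illustration.
const DECAY = 0.5;
const trustEdges = new Map<string, string[]>([
  ["me", ["alice"]],
  ["alice", ["bob"]],
  ["bob", ["carol"]],
]);

function trustScores(root: string, maxDepth = 3): Map<string, number> {
  const scores = new Map<string, number>([[root, 1]]);
  let frontier = [root];
  for (let depth = 1; depth <= maxDepth; depth++) {
    const next: string[] = [];
    for (const user of frontier) {
      for (const friend of trustEdges.get(user) ?? []) {
        const w = (scores.get(user) ?? 0) * DECAY;
        if (w > (scores.get(friend) ?? 0)) {
          scores.set(friend, w);
          next.push(friend);
        }
      }
    }
    frontier = next;
  }
  return scores;
}

// trustScores("me") -> me: 1, alice: 0.5, bob: 0.25, carol: 0.125
console.log(trustScores("me"));
```

Those scores could then weight how much someone else's upvotes and downvotes influence my own results.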
Well, every social media network tries to capture and monetize this, of course.
But yes, it would be nice to have this in a less proprietary way. But I fear it's either going to be a privacy issue (because the person you trust has to publish every page they like and dislike), or it's going to be anonymous and therefore easily gamed by bad actors.
If the group of trusted people in a bubble is large enough (say 1000s of people), then it wouldn't be a problem, I suppose.
For example, I wouldn't mind sharing my upvoted websites/videos/products with everyone on HN, as long as it is anonymized. Bad actors can be distrusted by the community, I suppose (though moderation seems to be largely an unsolved problem still).
But the common notion is that it's probably too hard for "regular" users, so we have the internet of nonsense instead. A few more years and we won't find anything good.
What if they took only votes from people who pay and have had an account for X time? Also, what's the problem with adding per-user reverts? With some event-based architecture it should be trivial. They could even train AI on this data to automatically suggest suspects.
Here's an easy thought experiment: for any proposal of the type "what if they had a system that did X?", imagine that someone tells you they'll give you 10 million dollars if you can figure out a way to game that system that can't be detected by Google.
If you spend 5 minutes on it, can you think of a way? If you can, congratulations, now imagine millions of other people thought about the same tricks and you get the reason why Google can't ever really win against SEO.
Google might not be able to 'win', but they can stay ahead.
Eg if they can get to a place where SEO efforts point in the same direction as making your website genuinely more useful to people, then that's good enough for Google.
It's totally fine if they only use my own votes to modify my results. In fact, if we're going to do that, I'd really like to block some sites from my search results entirely. Though ideally, I would also like to include the votes from people I trust.
And when those accounts suddenly start behaving like bad accounts, their votes get treated as such. There will be a window of confusion, but it might not be that big. And there would be an economic incentive for people to establish trusted accounts with high-quality voting behaviour.
Their entire business (advertising) is clustering. They've already done the hard bit.
Just cluster people by downvotes and whatever other thousand metrics are already being tracked, and allow them to see results given by other clusters by showing which areas are dense on a PCA or something or saying 'I want my results to only be influenced by people who have downvoted github.org' or whatever.
Not really. In these areas there's a general principle that the moment you start using a signal to enforce against bad actors, that signal's value decays. It's sad; I actually think web3 could help curb abuse at scale by making it too expensive to scale horizontally. Imagine paying a "stamp" to send an email on the blockchain; spam email would dramatically drop. Maybe similar mechanics could be in play for other things, but then you lose democratization of information.
The idea is you'd have an email client but all data is stored encrypted on chain and you'd have to pay tx costs to send things. By having this sort of stamp tax, spammers wouldn't be able to scale mass messaging as it would bankrupt them.
in meatspace though, my real mailbox is mostly filled with spam that the sender paid to have delivered.
granted, it would at least act as a limiting factor.
Isn't Google's monetization for Search results the primary reason for not allowing crowdsourced quality metrics? If only good results get to the top how do they make money? Google's Search business model appears to be predicated on poor results getting disproportionately prominent listing.
The actual sponsored links end up at the top anyway. I don't see how bad regular results would contribute to their revenue; it would just drive people away.
>I don't see how bad regular results would contribute to their revenue
Let's say you run a good song-lyrics site that has the correct lyrics for everyone's favorite songs. You happen to be on page 2 of google results for common queries; all of page 1 is taken up by spammers, fake pages, etc.
How can you possibly drive traffic to your site? Maybe you can invest in SEO but no promises there. You'd be competing against people whose whole focus is SEO and nothing else. The only option left is to buy ads.
I agree that it would work as long as Google has an absolute monopoly on search. Google wouldn't care how bad their search results are, because there's nowhere else to go. But if there are alternatives, users should stop using Google and use the alternatives instead, and then Google has an incentive to improve their search results for users again.
Let's say we have three search results, good, average and bad, displayed at one result per page in that order. The user is happy; Google is unhappy, as there is no motivation for the site owner of the best search result to pay Google for their listing. The only way Google gets money is if it enables the result order to be modified, i.e. allows the owners of the average or poor sites to disproportionately affect their listing prominence by appearing on page 1. Google do this by allowing paid adverts. Bad or average search results thus get disproportionately high prominence in the listing, with the knock-on effect that the owner of the best search result now has motivation to pay Google to regain their primary listing. And even further, if the site owner of the best search results now has to pay Google for the prominence of their listing, why waste money on continuing to provide the best results? The user is unhappy, Google is happy.
But you do vote on search results. Google knows if you stay on a page or leave it quickly for another result. That's an implicit vote that they definitely take into account.
Having that explicit button might not really add any additional value.
Which heavily penalizes sites that make information easy to find.
Especially for use cases like reviews where you are looking for multiple opinions.
The hard-to-navigate site full of waffle and SEO duckspeak nonsense gets a positive signal, while the site with clear, concise information the user can absorb in 2s gets penalized.
That's not how I use the results page, though. I open the top x number of results in separate tabs, and then check those pages. I only go back to the tab with the results page after I've checked them all, so that might be a while, even if they all suck.
So if they look at my behaviour the way you say, then the feedback they'd get from me would be that the top x-1 results are always bad, while the xth result is always good. That sounds like a poor algorithm for them, but it might explain why my Google results always suck.
Thanks to things like Google Analytics, Google knows what site you're viewing at any moment in time. It can see you go through the results sequentially and stop at some point.
Luckily for search, it is how the vast majority of users use things. Less than 1% of users doing things differently makes the stats a little messier, but still works quite well.
If I can block the 5 word soup gpt2 fake support sites that make up the top 20 results for obscure debugging messages from my own personal account, then my signal to noise ratio went from 0 to some number above 0. This is an infinite improvement for the first page.
No one ever wants to go to xypdf.com for any reason unless they want to feel like they just had a stroke, so how is (was? it made me stop using google except as a last resort in 2019) it often 3 of the top 5 results?
I wish I could just invert their SEO quality metric (there was a golden window around 2018 where you could just type -best into search engines that still respected subtraction to get only good reviews but sadly quality sites have fallen into line with the duck speak). I feel it's a pretty reliable indicator of garbage.
The signal:noise ratio would certainly improve for anyone voting even if they didn’t use it to adjust other people’s search results.
I’m also skeptical that all of Google’s enormous investment in ML and staffing is completely powerless to identify bad actors with atypical usage patterns. What seems far more plausible is that they’ve decided it isn’t costing them more in ad sales than it brings in. There are individual domains which would improve results by being blocked but they also pay for search ads so … unsolved grand challenge of computer science it is!
> I’m also skeptical that all of Google’s enormous investment in ML and staffing is completely powerless to identify bad actors with atypical usage patterns.
Isn't that exactly what people are complaining about here though? At least part of the problem with search results is that google seems powerless to recognize and remove useless bad actors (stack overflow copies, etc.) from their index.
My position is that they could do far more if they cared - simply blocking spam domains like it’s the previous century would make my experience better – but that they have made a business decision not to devote more resources to the problem.
"These sites should be penalised for click-baiting and they have been doing it for years."
If SEO works and a result appears closer to result #1 in the SERPs, then the "true", non-SEO-assisted result it is displacing would appear further from result #1. Apply this across the board and what we have are many, many non-SEO results that are pushed down in Google's ranking. No one is "penalising" these pages; however, they suffer visibility problems because they have not engaged in SEO. The incentives created by Google's secretive ranking system and online advertising commercial focus are perverse, or at least in conflict with the user's goals. Google discourages and even prevents any user from looking at results that were hits but were not ranked high. Pages that have not succumbed to the influence of such incentives may "disappear".
What if a user understands this and wants to ignore the Google ranking system? What if the user wants to see the true, non-SEO results? Google actively limits the user's ability to see those displaced results. For example, if a user searches for a common term, such as "example", she will not be able to view more than 200-300 results. Elsewhere in this thread someone also noted that even with a paid API, Google limits users to 1000 results. If the user wants to see the full range of pages that have hits for the word "example", she cannot do so. If the user would like to perform a single search for all pages containing the term "example" and then sort by some other objective criterion such as alphabetical by domain name, date, page size, etc., she cannot do so.
Under Google's model of the web, pages that do not acquiesce to an online advertising company's secretive ranking system may become nondiscoverable, despite the fact that they may indeed match the user's query. Computers assist us in searching through data but "relevance" is ultimately decided by the user. That is why we can have HN threads that claim search result quality is declining. Though they may be slower, humans can determine relevance better than any computer. From the disclosures of Matt Cutts and others we know that humans are involved in Google's ranking implementation. Penalties are used. The search process is not 100% math/computer-based. However, in Google's model of the web, filtering results is the exclusive domain of the online advertising company and only the humans on its payroll, not the user performing the search. There is no option to disable the online advertising company's "assistance" in filtering.
Ironically, my work on my own search engine has led me to be a bit more patient with Google's problems. At least I think I understand them better. Search engines fail in weird ways.
I think in part that Google just has gotten a spectacularly confusing failure mode. If it can't find good matching contents, it starts second-guessing your query and producing other results, which makes you think it's not even considering what you entered. It may even be "better" in the sense that it's more likely to return at least something relevant, but in practice it's bad UX because it's so unintuitive what's happening. It's probably one of those unfortunate optimizations that are invisible when they work and frustrating when they don't.
There is so much stuff on the Internet it's easy to start thinking there is guaranteed to be good results for any search, and that just doesn't seem to be the case. Especially with highly specified searches with 6-8 terms, you quickly enter the domain where you're reasonably unlikely to find an exact match.
> I think in part that Google just has gotten a spectacularly confusing failure mode. If it can't find good matching contents, it starts second-guessing your query and producing other results, which makes you think it's not even considering what you entered.
This is probably part of it but not the whole explanation:
Try to search on Google for:
slack ngrok
When I and others did earlier today there were a number of pages that contained both words, including some from the slack.com domain.
The top result however was a page that didn't contain ngrok at all.
I saw a specialist at another search engine comment that it was because it was a very popular result (at least that's what I read into it).
Here's my problem with Google: they are either just really bad at QA or they don't care or they consistently overestimate their dumb AI and underestimate me.
I'm fed up.
Not including pages that don't contain the search terms or anything similar isn't hard when there are multiple good results at the exact same domain / pagerank, is it?
Given the term "ngrok", I can't say I blame them/it. It looks like a typo to me. I imagine for well over 90% of people searching, it would be a typo. Perhaps, it is a common typo?
Well, it is a well known tool for temporary tunneling of http.
Google knows and the rest of the results confirm it.
Now, if your explanation was the entire explanation, it would be kind of ok if they did what Kagi does and what Google themselves did at some point, and asked nicely:
did you mean <something else>?
or
we included results for <something else>. Please use double quotes if you want exact matches.
Of course with Google this would be pointless as, as far as I can see, they ignore double quotes anyway these days...
I've taken this possibility into account after the numerous recent threads about this, but I don't think it holds up. I'm very often able to find exactly what I'm looking for after 15 minutes of massaging search terms on 3-4 different search engines. The reason it feels like search has gone to hell, for me, is that for 20 years I took for granted that I could type the first terms that came into my mind into one search engine (Google) and the result I wanted was the first result.
Outside of programming-related topics, anything I search returns pages of pop psychology listicles or news articles. Since I am literally never looking for pop psych listicles, Google (and, to be fair, the other search engines as well) has become a lot less usable.
I agree that the open web has deteriorated, with crap drowning out real content. But I maintain that Google et al have failed, or been beat. The content is there, they just can't find it and/or rank it anymore.
Yes, the more popular terms you search for, the more muddy the waters get.
If you search for anything related to sexuality and psychology, the results are littered with sites that were squatted for serving ads (i.e. no relevant content, just ads), poor quality articles with very low quality content (e.g. poorly formed Quora questions with no expert answers).
As with anything, you have to know what search terms for a given subject are good. For example, you'll get more objective answers the more academic sounding you are, because the fewer people have tried to occupy those search terms.
I agree with your experience - to me it also seems I used to enter a search phrase and get back a page of results that exactly match it, these days, I often get things that only vaguely thematically relate to my search phrase instead. Sometimes I would change a word in my search phrase or add an extra word to make the intent much more specific and still get essentially the same search results!!! Terrible, just terrible.
If I'm searching for a development issue, and my first results page contains 2-3 sites that 1-1 copy stackoverflow/github issues, then Google has failed. I doubt those can be more authoritative than the original sources.
Even programming related topics now return heavily SEOd sites instead of high quality programming content. Often the SEOd content is not completely awful for these searches, but it’s usually not great & never as good as the best sites.
> I'm very often able to find exactly what I'm looking for after 15 minutes of massaging search terms on 3-4 different search engines.
If you are adjusting your query, then you are going to get different results, possibly including ones that contain the information you want (but not your original search terms).
We've been tweaking search terms to find what we were looking for since day one; what's changed the most is that it's gotten far less likely that you'll realize you need to do this.
I think this is one of the real drawbacks of ML algorithms: their failure modes are completely incomprehensible. Dumb algorithms we can grok, and learn to help along the way when they don't work. There is really no point where it will always work.
I think this is my biggest problem with how Google now works. It's always been disappointing when you didn't find what you were looking for. But you used to be able to examine the results and see how your search terms might not have been optimal, and adjust accordingly. It was the expectation that you'd have to tweak. Now, changing your exact search terms hardly seems to make a difference.
I think the major difference is that the algorithm used to highly weight matching of specific words and phrases from the search terms, so adding a word, re-ordering, and swapping for synonyms would drastically change the results. Now it seems they're using ML and natural language processing to try to actually understand what you're looking for and give it to you. You can change your search terms, but the language embedding doesn't change much, so the system is actually working as intended. I could see that this might actually be desirable for a large segment of the population who wants their search engine to "just work" in response to natural language queries. If the corpus being indexed was high quality, maybe this would be a good experience. But due to the ads, affiliate marketing, and blogspam that make up a large part of modern internet content, it's simply frustrating.
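A toy illustration of that difference (hand-made vectors standing in for a real language model, nothing like Google's actual pipeline): swapping one word for a near-synonym sharply changes exact keyword overlap, but barely moves an embedding-based similarity, so the ranked results barely change.

```typescript
// Toy illustration only: hand-made 3-dimensional "embeddings" standing in for a real
// language model, showing why swapping a synonym barely changes a semantic score
// while it can sharply change exact-keyword overlap.
const toyEmbedding: Record<string, number[]> = {
  cheap: [0.9, 0.1, 0.0],
  inexpensive: [0.85, 0.15, 0.05], // near-synonym of "cheap"
  laptop: [0.1, 0.9, 0.2],
};

function cosine(a: number[], b: number[]): number {
  const dot = a.reduce((s, v, i) => s + v * b[i], 0);
  const norm = (v: number[]) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (norm(a) * norm(b));
}

function queryVector(words: string[]): number[] {
  // Average the word vectors -- a crude but common way to embed a short query.
  const sum = [0, 0, 0];
  for (const w of words) toyEmbedding[w]?.forEach((v, i) => (sum[i] += v));
  return sum.map((v) => v / words.length);
}

const q1 = ["cheap", "laptop"];
const q2 = ["inexpensive", "laptop"];

// Exact keyword overlap (Jaccard) drops to 1/3 after swapping one word...
const overlap = q1.filter((w) => q2.includes(w)).length / new Set([...q1, ...q2]).size;
// ...while the embedding similarity stays close to 1, so the "understood" query is nearly identical.
console.log(overlap, cosine(queryVector(q1), queryVector(q2)));
```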
I wouldn't be surprised if they've done user testing that validates their approach. Programmers tend to be comfortable with the concept that a computer will do what you ask, even if it's not what you meant, but most people want to get the right results on the first try. The natural language/ML approach may be much more intuitive and forgiving in that regard. It's just not an approach that's compatible with the low average quality of the content being indexed, in that it takes away the authority of the user to improve their search results.
I think there's somewhat of a tradeoff in search performance between quality of results on the first try and ability to improve the results on subsequent tries, and google is now optimizing for the former at great cost to the latter. And honestly they're failing at both.
Either google failed, or the pop science listicles have won ... they're not passively being indexed so that you stumble on them by mistake. They actively and aggressively create completely artificial content, tailored not at humans but at google, so that they can push the latest stupid ad for some telephone company on you.
It's like bitcoin if you want - they compete for something so useless the entire concept becomes a huge waste of time. Search engine SEO is like hashrate-dependent token mining: the only people who win are the farmers at the cost of burning their entire ecosystem.
> There is so much stuff on the Internet it's easy to start thinking there is guaranteed to be good results for any search, and that just doesn't seem to be the case.
I'm increasingly of the opinion that Google (the advertising engine) has destroyed Google (the search engine), by the two step process of making it profitable to produce blogspam then forcing search to remove blogspam - and a lot of the useful content has gone out with that bathwater.
Not to mention the rise of unsearchable platforms. Google can't search inside Discord.
> I'm increasingly of the opinion that Google (the advertising engine) has destroyed Google (the search engine), by the two step process of making it profitable to produce blogspam then forcing search to remove blogspam - and a lot of the useful content has gone out with that bathwater.
If you're looking for someone to blame google seems like the wrong party here. Surely the people abusing the system (i.e. blog spam) should at least share a good chunk of the blame.
Sure, but emergent bad actors exploiting features of a system that pay them are kind of inevitable, faceless, and there's an endless supply of them. The internet would be a very different place if it wasn't for this phenomenon of human behavior.
Google are just the most powerful and most visible actor in this, and replacing the early less profit orientated web with a dark forest of advertising and tracking is to a great extent on them.
(I don't think the web3 people have realised how important it is that the cost/benefit ratio of "ham" (good content) needs to be above the cost/benefit ratio of spam, by quite a large margin, or spam drives out ham)
I’ve been on the internet since the 80s and the bad actors were already around back then. USENET was made unusable at one point by people spamming, mass cross posting, and playing cancel wars. This was well before ads or profit existed on the net.
This is all true but Google and later Facebook made it easier to monetize spam and were clearly unconcerned about the impact on communities unless advertisers stopped buying. An unethical immigration law firm still needed to sign up clients but a teenager in Moldova could scrape someone else’s content, SEO it, and make a decent income without knowing who the money came from.
I think pjc’s point is extremely important: micropayments could dramatically change things but it’s quite hard to set values which will deter enough spam to be effective without excluding innocent people or simply increasing the damages when someone is compromised. That was what killed proof-of-work email spam concepts decades ago: even if you could get adoption, it wouldn’t have hurt the people using botnets to spam as much as many legitimate users.
Anything that facilitates commerce makes it easier to monetize spam, just as it makes it easier to monetize porn, or even illegal activity (eg bitcoin). You can't really fault Google or Facebook for this, as soon as the internet moved away from hobbyists, academia, and scientists towards the general public, it was absolutely inevitable.
The only way to have prevented something like that would have been to make the internet a giant closed-garden AOL like system with total control over users, identity, and content.
Fundamentally, if you make something frictionless to join, you end up with parasites. If you impose a cost to join, you end up hurting average users, because it's hard to raise the price high enough to keep out bad actors, especially if the expected value is positive and high. (If I need to spend $5000 in order to get one sucker for a MAKE MONEY FAST scheme, it's still worth it)
Part of Apple's value proposition is a tightly controlled walled garden with a high cost to entry. If you're willing to pay those costs, you're protected from a lot of spam, at the cost of a lot of friction to enter the ecosystem and the loss of full autonomy over your devices.
Can anyone, even in theory? Are there open APIs to all systems of discord? Does Discord have one? Wouldn't that open up all of these systems to systematic classification of all users? Also, is there a web link you can construct that will open up e.g. Discord's desktop app when you click it?
Searching Discord externally is not possible AFAIK. It's an actively user-hostile platform, which is doubly irritating because most community efforts (e.g. modding of games) mandate Discord and allow for nothing else. Frequently they don't even accept communication from another space, such as bug reports via email or GitHub.
It's practically mandatory to "verify" your account with a phone number for many servers. Discord is incredibly anti-privacy, the fact that it is not transparent to people that aren't currently using it is a vendor lock-in measure.
You're probably much less likely to leak PII to non-community members through Discord than a forum.
I also think ephemeral-by-default seems to result in much fewer long-running fights than you would have in public-record-by-default communities like forums.
Like when a thread gets heated in a forum, it keeps getting bumped and none of the people involved can resist picking at the sore.
On Discord it seems like someone steps away, the channel moves on, and the fight actually dies
No - anonymous accounts are good for privacy. If you're contributing information about something, having that information index-able is just common sense.
The world wide web is not the world. If it exists you can search it. Even if you need to open a client, run Discord search, and execute OCR to interpret the results.
In tech, it's never what's possible, it's just what costs are reasonable.
On Amazon I searched in "CDs & Vinyl" for the band "The Birdstones". It showed me results in books.
Criminy. Amazon does this all the time. If I wanted a book, I'd have searched under "Books". I am not confused about the difference between "CDs" and "Books".
What's the good of having categories if they're completely ignored?
I now avoid Amazon for this reason. I'll search for something like "4k HDMI capture card" and half the results are 1080p (and don't say 4k or a synonym anywhere) and a tenth do DisplayPort. I end up having to open every linked product and use my browser search to confirm that my requirements are actually present.
Same with any sort of requirement like "plastic" or a colour...
It just wastes my time so I shop elsewhere.
I think a much better UX would be saying: No results but consider searching for "4k capture card" or "HDMI capture card".
It depends how you view the problem. If one assumes that humans make mistakes - it's probably reasonable to show people things outside their selected search scope. If you're an exacting user, this is rather frustrating.
Realistically neither of the above were the concern. It's an attempt to boost engagement in the hopes that you'll find a product for eventual purchase, nothing more.
You're right, it is there. It is not in headline font, however. It's in a small font, and I did not see it. I just looked at the search results, not seeing all the other noise on the page.
A funny example is "Eclipse": Do you mean Eclipse the IDE? Or the vampire movies? Maybe you mean the astronomical event? Or the Mitsubishi Eclipse car? Or did you misspell ellipse?
I'm fine with Google not reading my mind at the first try, but at least offer me alternatives and exact text matching. For example, for the last couple of years looking for phones or symbols has been completely broken.
And I really admire how Google is able to guess the encoding of the websites, detect what is text, do language detection and drop all the porn. Writing a crawler that actually works is HARD.
When I search for "databricks series b valuation" in Google (from Argentina, using Google.com in English) result #6 is:
"Python get value from database - Büro Jorge Schmidt", which judging by title and preview seems to be a Python + MySQL tutorial. It returns a 403 error and might be a hacked site, since the home page is for a graphic design studio in Munich.
Result #8 is something similar:
"Intellij flatten packages - Músicos de Viaje". This is definitely a hacked site (from Spain, apparently) that redirects me somewhere else.
Result #10:
"How to calculate tax percentage in sql query". Another hacked site, this time for an evangelical church from Brazil.
Now... how can Google think that any of these sites are relevant? Even if it doesn't realize the pages are hacked, and even if its crawler has been fed content that included the keywords:
A - The sites themselves don't match the query at all.
B - No legit site about the subject would link to these sites.
C - The results themselves (title, url, preview), as Google shows them, have nothing to do with the search!
I just tried that search, all of the results look relevant and I definitely don't get any of the results you are getting.
I wonder if you have some malware that is hijacking the results? I once had some malware (chrome extension) that was corrupting my search results. It was surprisingly difficult to remove (given that it was a chrome extension...).
No, I get the same results on Safari on iOS (iPhone), so I think we can rule out malware.
Google results are personalized, based on location, search history, etc. The fact that I'm in Argentina has been adding a lot of noise to results on searches where my location is not relevant at all.
In this case, I suspect that Google thinks these hacked sites with developer target content are relevant to me, because of my regular search history.
It sounds like their location based targeting sucks. Spotify has the same problem for me - it spams my playlists & recommendations with songs that I have never, ever shown the slightest interest in purely based on location.
This seems to be a common theme in the industry. Recommenders heavily overweight location - an incredibly general factor, even in the presence of troves of specific, individual level data. Goes to show how little basic reasoning really goes into how these systems work.
All online tools should have a “pretend I’m in Silicon Valley” toggle so you can get the same results the engineers get.
Localized search is useful at times (restaurants for example, I like getting the local McDonald’s or China Wok rather than the biggest one in New York) but it’s completely useless for many terms. But maybe not the ones Google makes the most money on.
How about spamming me with artists I have taken the time to mark as 'Do not play'? The FIRST recommended album in Album Picks is such an artist. Good reminder that I should get rid of my premium subscription, thanks!
If I use an incognito window I don't get the crappy results either.
My feeling is that since that query is SO unusual for me, based on my search history (I can't even say what it means), it raises the "likelihood" that the hacked spam sites with programming terms that also include those keywords are good results for me.
If I search for something more typical, like "Spider-Man No Way Home" or "Ruby rails tutorial" the results don't include hacked sites.
I see some people are having trouble replicating the results, and I wonder if you somehow got thrown into a really bad user test group--if they do that.
Yes, I did. No trash.
Safari on Mac (or iOS), once I log in, spews the same garbage results.
Copied from another answer:
---
My feeling is that since that query is SO unusual for me, based on my search history (I can't even say what it means), it raises the "likelihood" that the hacked spam sites with programming terms that also include those keywords are good results for me.
If I search for something more typical, like "Spider-Man No Way Home" or "Ruby rails tutorial" the results don't include hacked sites.
Rule #1 of Google usage: turn off results based on search history. Think about it, how could that possibly improve your search results when you search for something you have never searched for before? I (sort of) remember the day that was announced, almost everyone turned it off immediately.
Almost nobody turned it off. And, to answer your question: If your history shows hundreds of queries for Java, PHP, and Ruby, odds are your query about Perl or Crystal or Go isn't about the species or stone or game.
Just tried that search in English. First result is "The data- and AI-focused company has secured a $1.6 billion round at a $38 billion". And if you click the search box you get the first search option as databricks valuation history where the first result is funding every round.
Google has been so much better for search for me than other search engines. At least for what I search for: programming, news, etc.
- I checked the cached (by Google) copies of the hacked pages and they include mentions of a "Databricks SQL Connector". So if I search for "Databricks" Google thinks "it must be a programming thing".
- If I now search for "databricks series a valuation" I don't get the spam results, for some reason. I thought that if I repeated the search Google would produce the exact same results... but internally, since I first searched, it might have realized that those sites were not good.
That being said, I have issued queries in the past for which I just found absolute walls of malicious results. I haven't really invested much attention span in entertaining this sort of thing, so I just move on and modify my query, but I'll keep an eye out for it going forward.
The article mostly talks about IA (instant answers) which are notoriously hard. The recent advances in machine learning have made the technology more approachable, so startups like Kagi Search (disclaimer: founder) can also leverage latest advances in NLP and compete on this ground.
Both engines use the same article for source, but Google completely misses the context.
These examples show that a search startup has a chance to go neck and neck with Google and compete even in technology as sophisticated as instant answers. We invested considerable resources in the Kagi Search AI capabilities, discussed in some detail here https://kagi.ai/last-mile-for-web-search.html
What is mind boggling though from a product management perspective is that Google had nearly a decade head start and a cash purse of hundreds of billions of dollars to get this right.
To be fair, it is likely that the vast majority of queries are answered correctly, but only the outliers get the public attention. Also Kagi is not without its own share of silly mistakes too, but just being able to be considered in the same basket as Google is already a huge thing for us.
My favorite is, given that I have a baby and I am a trained scientist and I live in the US, I find myself converting milliliters to fluid ounces a lot.
Right now the Google Assistant will correctly transcribe the request to a Google search... Only for the search to interpret “ml” as “miles” and, faced with the discrepancy between the length and a volume, cube the miles. So I am expecting an amount that is like 1 oz because I am converting like 30 mL... and I instead get 4 quadrillion ounces (exact number is 1 mi³ = 140,942,994,870,857 + 1/7 oz because of course it's got that extra 7th in there what were you expecting from our ridiculous US system, haha).
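The arithmetic the widget should be doing, versus the "miles cubed" reading it apparently does, is trivial; a quick sketch using the standard US definitions:

```typescript
// The conversion the answer box should be doing: millilitres to US fluid ounces.
const ML_PER_US_FL_OZ = 29.5735295625; // exact by definition

const mlToFlOz = (ml: number): number => ml / ML_PER_US_FL_OZ;
console.log(mlToFlOz(30)); // ~1.01 fl oz -- the number a parent actually wants

// What you get if "ml" is read as "miles" and then cubed to reconcile a length with a volume:
const METERS_PER_MILE = 1609.344;
const flOzPerCubicMile = (METERS_PER_MILE ** 3) * 1e6 / ML_PER_US_FL_OZ; // m^3 -> mL -> fl oz
console.log(30 * flOzPerCubicMile); // ~4.2e15, the "quadrillions of ounces" nonsense
```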
> No one at Google is responsible for these half baked and largely irrelevant widgets or wants to stake their career fixing them.
You're just ... wrong about this. There's an entire team of dozens of people (maybe hundreds now) focusing on this specific web answer feature. I personally worked on the team (not this feature, though).
I don't understand why people say things they know nothing about.
I think it largely stems from this very popular comment from a couple of years ago: https://news.ycombinator.com/item?id=19553840 From an external point of view it certainly seems to explain a lot of Google's issues with the stagnation and death of many of its products.
The fact that there have been plenty of other comments from Googlers to back it up since shows that, in some parts of Google at least, there's a grain of truth there. It might not be the whole story, many teams might be proud of their part, and many Googlers may not be focusing on promotion and shiny new things, but that doesn't really matter to Google's users. What we see has been plenty of Google products being killed off, or left to rot. Now even Search has people complaining about it. Startups are getting traction competing against it. A decade ago that would have been unthinkable.
If you're right and teams in Google do actually care about the older, less shiny things they build then Google has a significant brand and reputation problem. If you're wrong then Google has a massive engineering culture problem. Either way, Google has a problem.
I think people say this because some problems stick out like a sore thumb and stay that way for months if not years. The line of inference is that there must be no incentive inside the company to fix them.
Isn't this just the software engineering equivalent of the fundamental attribution problem? My backlog is long because I have important stuff(TM) to address, whereas their backlog is sign of dysfunction.
> "At the bottom of every question page we provide a link to answers-support@google.com. We encourage you to use this link whenever you see questionable content posted to the site. In your email, please provide information about the question, its ID number, and the reason you find the content questionable."
And yet despite all of those people, the quality is bad and seems to get worse. A startup seems to be outcompeting them.
Sometimes industries or teams have effectively negative value, or their preconceived notions about how something works based on teachings from their field is wrong. This is the case for chiropractors (the whole field is useless / a net negative), vs back doctors.
We see this happening today with Google moving away from traditional and high quality techniques like direct keyword ranking, BM25, and PageRank and moving towards lower quality methods based on hype, such as BERT and other "semantic search" based on LMs, query rewriting, and using these dense vectors directly in ranking (and thus degrading it).
The amazing power of language models in certain domains (text generation) has unfortunately caused a proliferation of them in a place where they are still pretty bad (information retrieval and search).
Google is full of search chiropractors when they need search back doctors.
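For anyone who hasn't seen it, the "traditional" scoring I'm referring to is simple enough to sketch in a few lines; a toy BM25 ranker over a made-up corpus (k1 and b are the usual textbook defaults):

```typescript
// Toy BM25 ranker, just to make the "traditional" keyword signal concrete.
// The corpus and query are made up; k1 and b are the common default parameters.
const docs = [
  "stairway to heaven lyrics full song",
  "upload some lyrics to this song",
  "heaven is a place on earth lyrics",
];
const k1 = 1.2;
const b = 0.75;

const tokenize = (s: string) => s.toLowerCase().split(/\s+/);
const corpus = docs.map(tokenize);
const avgdl = corpus.reduce((sum, d) => sum + d.length, 0) / corpus.length;

// Inverse document frequency: rarer terms count for more.
function idf(term: string): number {
  const n = corpus.filter((d) => d.includes(term)).length;
  return Math.log((corpus.length - n + 0.5) / (n + 0.5) + 1);
}

function bm25(query: string, doc: string[]): number {
  let score = 0;
  for (const t of tokenize(query)) {
    const f = doc.filter((w) => w === t).length; // term frequency in this document
    score += idf(t) * (f * (k1 + 1)) / (f + k1 * (1 - b + (b * doc.length) / avgdl));
  }
  return score;
}

const ranked = corpus
  .map((doc, i) => ({ doc: docs[i], score: bm25("stairway to heaven lyrics", doc) }))
  .sort((x, y) => y.score - x.score);
console.log(ranked); // the exact-match lyrics page scores highest, by construction
```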
> The article mostly talks about IA (instant answers)
Not really. TFA is discussing the search results. If google/bing want to put instant answers then that is what will get judged. If your ML/AI is not good enough yet to provide natural language answers, don't make it the most prominent part of the search results.
I think that Google is optimizing for the "average user" to the detriment of power users such as the HN crowd. Most people treat Google as an internet oracle and send queries like "how do I do X" while power users will search for keywords. One example of this optimization is the automatic answer boxes that show up for certain questions, which are wrong disturbingly often or don't include important details.
The average user absolutely does this. I see this with family members and friends. Most just type in full questions.
From my experience the best way to get good results is to start typing keywords for your question and then creating the query based on the autocomplete results. If I notice I don't get autocompletion for a certain query I'll restructure it until I do. This has proven very useful in providing good results.
For technical stuff, using the quotation marks is almost essential.
I miss Altavista. I could generally either find exactly what I was looking for inside of three iterations of refining the search, or find that it wasn't to be found.
I still just want a blazing fast full text search of the reachable WWW that understands regexes and a basic predicate calculus. Unfortunately the overhead and small potential user base means that under the current regime such a thing will never be made.
Speaking of, if any government actually wants competition, they don't need to break up Google, they just need to force them to offer full access to their cache and compute at some reasonable rate, much like how the ILECs were made to carry the CLECs' traffic.
This. And let's not forget that it is the "average user" who mostly naively clicks on ads, not the power user. And selling ads is still Google's core business.
Looks like Google is slowly turning into a big Nigerian scam.
I think this is overall a good thing. Power users have trained their behaviour to work in the way that simple systems can deal with. While average users ask the question exactly how they would ask another human. Google has now reached a level where it works best when you deal with it in a natural and human level.
There is nothing actually better about the way we originally used search engines, it was just required at the time.
No, I disagree. With keyword search, I am confident that eventually, with enough included and excluded terms, I will find what I’m looking for.
With natural language search, sometimes it works great, but it’s a crapshoot and when you don’t get the results you need you’re stuck.
Several times in recent memory Google has returned results so bad, I completely gave up searching. Most recently was when i was trying to look up a Windows 11 BSOD error code (where even pasting the error code verbatim only brought up pages of garbage sites with no useful technical information).
Google results for Windows error codes have been gamed to hell and back by the likes of Easus and Drivereasy, where their tools are somehow the answer to every single fault Windows could possibly manifest.
For some queries, being able to ask off the top of your head without thinking is good. Think 'what day of the week was June 14th, 2002?' or 'Who is the mayor of Los Angeles?' For quick questions with clear answers, the current system is a huge advantage over what we had before.
For other, more complicated queries, the act of composing your search and considering your keywords, etc. is a step in the process that helps a searcher mentally understand the results they're going to receive along with what they mean. Having to stop and consider makes you aware that you're working within a system and its constraints, which makes it more suitable for questions that are complex or not socially settled.
Instant answers (IA) caused a shift in the way content is written. Content optimized for IA tends to be repetitive and shallow. Viewing content written for IA is a frustrating experience, and it tends to dominate the result page now.
The reason you are seeing a lot of that sort of content is because Google is looking for that sort of content, in part because of Google's peculiarities, but also because of how refined the art of black hat SEO has become.
Meaningful websites still exist. The bulk of the content on the Internet is older than IA, and it's still out there (not that you can find it with Google).
Hey, free advertising for you and your search engine:
Marginalia search, by punishing ad- and tracking-heavy pages and by being strict in how it interprets my queries sometimes surfaces better results than Google and DDG.
In particular I have found great resources about Linux partitioning and git usage after giving up mainstream search engines and trying them in marginalia.
That says quite something about how badly broken the situation is given that marginalia is one person and a tower pc in a living room.
I keep getting reminded about a Linux quote on how they managed to go forward by studying the latest 20 years of OS research and throwing it all away :-)
No idea if you're right or not, but I think it's an interesting take. It might be because of SEO, it might be more information in walled gardens, it might be the death of the personal web page.
Both Google search and the internet are changing, as are our perceptions, so it's really hard to say why search quality is better or worse.
Whether search quality is deteriorating really depends what we expect from a search engine. Many of us on HN use search engine as a way to look for web pages with specific words. Search engines did start out performing word searches, but somewhere along the way, they started trying to answer questions. Many of the complaints such as Google dropping search terms are actually the result of the search engine trying to answer questions instead of performing keyword searches. Search engines these days are doing well for the easy questions, but of course they completely misunderstand questions with more nuances.
A bit off topic: Some of the search queries in the article used Google like a keyword matching tool, while other queries in the article were using Google to answer questions. That's because we only have one search box trying to do everything. Would we be better served if we have a checkbox to tell the search engine that we just want to perform word searches?
It's definitely deteriorating, and the worst part is that it completely ignores quotes if it thinks you meant something else, and shows the results for what it thinks you want. Completely useless in a lot of cases
I have an Intel Realsense camera, which sometimes reports the error "Failed to recconect" (there being a typo in the drivers) [1] - that's a pretty unique error, so in combination with the product name that should be a very easy keyword search, right?
But no, when I search for realsense "failed to recconect" Google returns pages that contain neither realsense nor recconect [2]. They offer me a supreme court opinion, a review of a car dealership, and a facebook church service.
Correcting the spelling of a query is one thing - but also completely ignoring other keywords? I can see why there are so many people posting about the poor quality of Google's search results.
This has only happened to me twice so far, but it has given me the "Did you mean:" even when using quotes for searches that definitely have results (as I found them with duckduckgo).
I can't remember what they were now
If you search for anything political on Google, you'll notice that the results are clearly slanted in one direction, towards the opinion of a handful of pre-approved news outlets. This leads me to seek alternatives whenever I need neutral sources, for instance Yahoo search.
I wonder if they're getting cash kickbacks from established corporate media outlets for pushing their material to the top of the search results. That would actually be less creepy than if its being done as some kind of information manipulation program.
It's high time Google and other search engines were forced to expose the inner workings of their ranking algorithms to the public, particularly now that they have near-monopoly power in the sector. People should also be able to adjust the dials on the algorithm themselves.
In Australia, Google is being blackmailed into boosting Fairfax, Newscorp and Seven West Media content higher up in their index. It's reached a point where most queries are useless if they contain a word or even a synonym for that word that has been recently used in a major media site owned by those three.
I use Google through a VPN to avoid it. That breaks maps integration.
I'm curious, why go through the hassle of a VPN rather than another search engine? Are the other results so poor or do the others do the same as Google?
Try "what countries are using ivermectin" in google.com and then try Yandex.com. For me, the third site on Google (the kitchen sisters) appears broken and the rest are all some variation of "why ivermectin is bad" articles. Yandex actually answers the question.
I think it is related to the fight against "fake news", "hate speech", etc... People don't tolerate a truly neutral search engine, because it will reflect human nature and human nature is not always pretty. I remember the time when Google returned antisemitic websites when searching for "jew"; they refused to do anything about it because "jew" was used mostly by antisemites, and therefore an antisemitic website was what people searching for that term most likely wanted - the search engine did its job. I don't think it would fly today.
So search engines now have to get the "truth", preferably the politically correct one, and since you can't rely on the crowd for that, you have to introduce bias, and "pre-approved news outlets" are the most obvious choice.
I find these responses fascinating as the "clearly slanted" results tend to change direction depending on the political affiliation of the person making the claim! Having said that, I'd love to be proved wrong if you have any evidence to show a particular bias one way or the other?
Search "mass formation psychosis" on Google and DuckDuckGo. This is a trending phrase popularized by a doctor that has been canceled and banned from the mainstream due to his criticism of the world's COVID response.
DDG shows the author's substack as #1 result and is neutral otherwise. The other doesn't even have it on the first SERP, and is overwhelmingly critical.
If you argue that covid response is not politics, I will disagree strongly.
Search politically controversial topics incognito with a VPN on using google vs yandex vs duckduckgo? Ivermectin, January 6th arrests, BLM protest deaths, VAED, mRNA studies before 2020, Robert Malone, Geert Vanden Bossche. I mean there's an endless list of things you can experiment with.
My experience is that it's slanted whichever side of the aisle you're on. The evidence is really clear when you measure results between various search providers, and especially when searching up contentious or controversial topics. So it doesn't really change direction so much as it confirms one particular set of beliefs depending on which search engine you're on. I think this is clearly in Google's disfavour, because people have started to notice and they're actively searching for alternatives to Google in order to avoid it.
As a corollary, search on Google News (as in, browsing to news.google.com and searching there, or !gn via DuckDuckGo) is really bad. The index seems to update really slowly, so breaking events are usually missing entirely, and the grouping of articles into single events is also quite broken.
Non-factual answers are especially interesting, because otherwise reasonable and intelligent people believe in them. Everyone would be better off studying those things and see how they've drawn those conclusions we don't subscribe to, so that we can figure out what we are wrong about.
It's absolutely catastrophic if you are not allowed to draw your own conclusions about things. This is your inalienable prerogative as an adult in a free country. Even at the risk of some people being wrong sometimes, you simply cannot have authorities distributing doctrine and call yourself a democracy.
The most frustrating part of using Google these days (for me anyhow) is Google returning results that don't match terms that I specifically wrap in quotes. If I search for:
"gamakatsu octopus hooks"
I expect to only receive results for that. Instead I get bombarded by results that match a portion, or when Google thinks I tangentially might have meant something else. There was a time when it respected the quote characters, but those days have long since passed.
What's galling is that they've actively gone out of their way to make it worse, instead of just letting it regress through neglect.
For example, a few weeks ago, I image searched for a meme that I created years ago on 4chan. A dozen or so results were returned, none of them relevant. But if you tack on the name of a 4chan archive, for example "4plebs" (not even "site:4plebs..."), all of the sudden it turns up.
Google in general seems to penalize 4chan and its archives, which is ironic since it's one of the few places where actual humans post OC. Meanwhile Pinterest spam, AI-generated blog posts, and reddit threads full of bots and shills abound in its results.
Speaking of 4chan and google's declining result relevancy, a particular instance of the latter was discussed there (one of the few places it could be discussed, given the amount of censorship that prevails everywhere else these days):
This is still the case today, at least in the US (I just checked). Instead of emphasizing the painting of Beethoven we all know, the one that was actually done during his lifetime, the one featured in the infobox of his Wikipedia page (which is also the top link result), it instead emphasizes a much more obscure painting that was done posthumously, for no obvious reason other than it giving him a noticeably darker skin tone. I'm not even offended by it, I just find it ridiculous that Google actually went out of its way (probably for pc reasons) to train their algorithm to return less relevant results.
When I search “tim lee food blogger age” Google actually shows results with “age” struck out (so it shows top results as if age wasn’t part of the queried string).
Trying to think why/how it’d conclude that age wasn’t necessary for good results.
Query rewriting, a useless and user-hostile technique, strikes again! (No, I am not talking about trying to fix spelling mistakes with "did you mean". I mean straight-up removal or rewriting of correct English words.)
I am embarrassed by the whole field of NLP for making such a big deal out of a task that is fundamentally bad.
Google optimizing for the "average user" comes at the expense of the whole world, because the "power users" who are optimized against literally build the internet for the normies. Cater to the normies for too long, and we see the status quo.
Force the normies to get better at writing their queries. The NLP field will unironically have a large number of people with the title "query engineer" soon anyway, due to the proliferation of increasingly large foundation language models. Welcome to the brave new world of botched semantic search!
> Google optimizing for the "average user" comes at the expense of the whole world, because the "power users" who are optimized against literally build the internet for the normies. Cater to the normies for too long, and we see the status quo.
If I were feeling cynical, I would point out that catering to us was a great idea for early Google, but not so much now. The problem with power users is that the more control you give them, the more they learn about how your product works and what decisions you're making. Power users can pick out dark UI and other user-hostile patterns more easily.
Early Google needed the power users, because we were an essential component of building Google's early dominance: It was us gesturing the normies over and saying "as the computer person (TM) in your life, you should be using Google. They're cleaner and their results are better." We were needed to both scale their user base and plug gaps since at the time there weren't dozens of engineers working on issues and search + web indexing was still in its infancy.
Now, though? There's no benefit to Google in engaging with power users. They have the numbers of normies they need for profit, and all engaging power users would do now is lead to more conversations like this in the real world, which is not what they want.
"Oh, you're still using Google? They suck now. Use X, Y, and Z." = conversations Google doesn't want.
We cater to power users. If you absolutely, positively need a word or phrase to be on a page, use quotes. "age" or "tim lee" would tell us that exact word or that exact phrase has to be on the page. Which still doesn't help with this query, given that his age doesn't appear to be known, but you can do it for others.
Even quotes don't actually work. I've witnessed Google rewrite the query even with full quotes. I've also seen posts on HN from others who've had this happen to them.
Quotes do work. They really, really do. Every time I look into a complaint about this, I find that the text really is on the page. It just might not be readily visible. Twitter link is in my bio here. If you ever get an example where you find quotes don't seem to be working, do pass it on.
We show this to indicate to you that we found a page that might be relevant but which doesn't contain that exact word. And why we would conclude that is explained better if you search for ["tim lee" food vlogger "age"].
That's telling us to find only pages that have "tim lee" on them in that order, as well as the word "age"; "food vlogger" is nice to have but not required. And it turns out there aren't a lot of pages. Certainly not many pages, it seems, about the Tim Lee you're looking for that include his age. And that's because not everything we want is actually out there. There might not be a page that has his age.
So back to why we dropped it. By showing you some pages that don't have the word age on it, we're able to show some other pages that are generally about him, which might get you closer to the answer.
BTW, there is a Wikipedia page I've seen suggested as having the right answer for his age. But that's for a different Tim Lee -- not the food vlogger -- and it also only lists his birth year, so knowing the exact age is hard.
It no longer does this because now your comment is the only result for the query. Before, it could not find a single result, said as much, and gave you the wikipedia page that included the age. What's wrong with that, again?
I realized after testing some more that the query might not have had many relevant search results. It's still a strange tactic. Age is contextually important in my query; Google didn't recognize this.
Censoring terms of the search vector to find relevant results is a legitimate approach that works well. In this example, the article author gives a Wikipedia page as the desired top result. That document does not include the term "age" anywhere. Therefore you can see how the term is not necessary to the search.
As a user, I can’t think of a time when censoring a term has given me relevant results. It just serves to frustrate me. If Google doesn’t want to return a blank page, they ought to have a small section at the top that says something like, “No results found for [search terms]” and then another section following that clearly states it is excluding a term.
Facebook Marketplace and Craigslist do something similar when there are no local results. They show results from outside your area but clearly call it out.
While I agree this isn't technically "censorship" from one perspective, I do think it's important to consider this: by forcibly omitting search terms against your wishes, it is actively and deliberately preventing you from searching for what you wanted to search for. This is both frustrating for the user and has a detrimental effect on the quality of the search results.
When most people search, they're generally looking for information that matches a topic, not necessarily every single word in a query. You can imagine the number of misses that would happen if we didn't compensate for misspellings, for example. Or for synonyms or plural forms. But in no way are people prevented from searching for exactly the words indicated, if they feel that's best. Use Tools > All Results > Verbatim from the search toolbar. Or put quotes around the word or words you want exactly sought.
It works like trash when searching for obscure technical problems, e.g. debugging some little used tool in linux. It seems more often than not the most important word in the search is omitted, making the results completely irrelevant.
We wouldn't say the word is censored. It's just an indication that we found a page that might be relevant even though it doesn't contain one or more of the search terms, and we want to help the user understand that. But yes, overall, that's exactly why the approach can be useful, in case a page with the right information uses a slightly different related word. Also, that Wikipedia page is for a different Tim Lee -- a comedian, not the food vlogger.
Google's search results are often wrong because of corporate choices. In Greece there is a completely independent news website which for some reason just isn't registered as a news website with Google. As a result, not only is this website shown less than others (in the Google feed or in search results), but in the past there have been cases where this website was the first to publish a story and Google Search only returned other news sites that reproduced the story, even using the original material!
In my opinion, google has become too big and has lost focus on actual quality/engineering.
I believe the theory that Google is optimizing for ad revenue. Sites without ads get ranked lower. The biggest example I can think of is Wikipedia. When I search a proper noun with a Wikipedia article, I almost always want to go look at that article. Recently, I feel like I really have to dig for it.
Anecdotal evidence - but I feel like this could be completely valid.
One of my biggest side projects for many years was a student tool centered around test scores. It was a niche use case with a huge amount of students using the tool on one day per year (1m+). There was exactly one competitor. I had a better domain but a much worse site in terms of design, speed etc. We were nearly the same in traffic, until I decided to monetize the site with a lot of Google ads. Immediately, Google shot my site up in the rankings, and actually seemed to penalize the competitive site. My traffic went up 10x and the competitive site remained flat. This happened for 3 years, then the niche use of both of our tools was “patched”.
This seems accurate to me too. And it also seems the most likely - a profit maximizing corporation doing profit maximizing things? Not really a huge surprise.
Yup. I've noticed recently that when I search for a public figure (whether a politician, businessperson or celebrity), their Wikipedia page is now at the bottom of page 1, or not there at all. That never used to be the case 5 years ago.
Do you have an adblocker that's accidentally blocking the side-bar? I find it difficult to find a proper-noun search that doesn't include a side-bar link to Wikipedia.
I share the view that Google SERPS have dropped in quality the last 5-7 years. Of great annoyance to me is the amateurish way that a search results page will find relevant Twitter results but then clicking on the results takes you to the root page of the Twitter user and not the result. Since many Twitterers are prolific posters, it can be very time-consuming or even impossible to find the result listed. Thankfully Inoreader takes me to the exact Twitter result.
One of the worst things I've noticed recently about Google Search is how it is very anti-startup because of the concept of the Google Sandbox, an essentially arbitrary length of time they put a huge negative penalty on your site to try to entice you to buy paid ads instead before your funding runs out waiting for organic traffic.
Perhaps that's my biased opinion on their motivations as I've recently launched https://grizzlybulls.com and yet even though Bing has tiny market share, I'm getting 10x more organic traffic from Bing rather than Google...
One reason it might have deteriorated is that Google is constantly battling people 'optimizing' their content for Google, while competitors likely see less than 1/1000 of this.
That would make sense if so much easy-to-detect, low-hanging fruit like blog spam, scraped Stack Overflow pirates, and listicles didn't make it to the top. Those are easy for Google to de-rank, and yet they don't.
What's the source for it being easy? It seems easy, from the perspective of a human looking at results, but I'm not sure how much that's worth given the scale and complexity of the problem.
Just block the domain. At first, you can block manually, but we know Google doesn't like doing things that way. Fortunately, they have a lot of heuristics to find sites like that; usually the content is just copied from another source. And since they scrape the web all the time, they should know which content has appeared first where.
But the issue isn't that they can't; the issue is that they don't want to. Why do the sites with copied content exist? To earn money through ads. What earns Google money? Ads!
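To illustrate what such a heuristic could look like: a minimal sketch, assuming you already have crawl timestamps and page text in hand. The shingling approach, the 0.8 threshold, and the record fields are illustrative, not anything Google is known to use.

    import hashlib

    def shingles(text, k=8):
        # Split text into overlapping k-word shingles and hash each one.
        words = text.lower().split()
        return {
            hashlib.md5(" ".join(words[i:i + k]).encode()).hexdigest()
            for i in range(max(len(words) - k + 1, 1))
        }

    def likely_copy(page, corpus, overlap_threshold=0.8):
        # Flag a page as a probable copy if most of its shingles already
        # appear on a page that was crawled earlier.
        page_shingles = shingles(page["text"])
        for other in corpus:
            if other["url"] == page["url"]:
                continue
            if other["first_crawled"] >= page["first_crawled"]:
                continue  # only pages seen earlier can be the original
            overlap = len(page_shingles & shingles(other["text"]))
            if overlap / max(len(page_shingles), 1) >= overlap_threshold:
                return other["url"]  # probable original
        return None

    # hypothetical crawl records, just to show the call shape
    corpus = [
        {"url": "https://original.example", "text": "ten unique words " * 5, "first_crawled": 1},
        {"url": "https://copy.example", "text": "ten unique words " * 5, "first_crawled": 2},
    ]
    print(likely_copy(corpus[1], corpus))  # -> https://original.example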
Any simple heuristic has false positives, meaning they'll end up taking down legitimate sites that had repeated content for a good reason.
Say, for example, two sites quoting text from the US Constitution. The second one to be crawled would be considered spam copying the first one and removed from web results. Then you'll get comments on Hacker News complaining that Google is censoring it for political reasons.
And any simple heuristic is quickly reverse engineered by SEOs, who will find a way to mask it as legitimate.
They could use the heuristics to build a list of domains to block and then have someone review it. After doing it for a long time, they could build a neural model on top of that, and automate it.
As I have said, the reason they don't do it is not because they don't have the skills and know-how.
Captchas are way harder to solve than it is to detect these sorts of poor results. Google should have absolutely no problem building a classifier that could scale to solve the problem. But as you said, it's not worth it to them, but for the reasons of losing revenue rather than scale and complexity.
There are billions of dollars at stake on both sides, search engines and spammers, in an endless arms race that has been going on for more than 20 years.
Trust me, it's beyond naive to say fighting webspam is a low hanging fruit problem.
Why should I trust you? I trust my own eyes. I regularly see spam sites that get to the top of results and are seen by many people for months. These could be filtered with a one-line change.
That's a good point! A lot of online content is created for search clicks only, not quality. For example, I often search Google in Lithuanian and I keep getting a whole lot of auto-translated blog/article farms as results. Those translations are almost always shoddy, but these pages keep popping up in the first Google results because a) there is not a lot of native-language content to push out these clickbaits and b) those clickbaits are on domains that are also, well, in the first results for other smaller languages... I think these auto-translated pages started to pop up two or three years ago.
I don't know if search is deteriorating as a whole, but certain searches seems to be manipulated for political reasons. The famous example is an Image search of "white couple" - really, try it, it is like only 50% correct. But I don't believe the image search itself would be that bad, rather certain queries are given manipulated results.
I'm not convinced "white couple" showing interracial couples is a political move, or even on purpose.
I think it simply is because very few people would describe in text a white couple as a "white couple" and not just as a "couple". In a majority white culture there is no reason to specify skin colour when both are white. It's just a couple.
On the contrary, I think it would be very normal to write "black and white couple" for an interracial couple, and because "black and white" also contains "white" thus those images would show up when you search for "white couple".
This is easy to verify by looking for the word "white" around the images Google returns for "white couple", and they are definitely there - often in the image title itself.
However, if you just search for "couple" then you'll find what you are looking for. At least on my google 9 out of 10 results are a white couple.
You're overestimating the smarts of google: Image search does not classify image contents. It uses the site text for ranking of the images on that site. It finds lots of "Black And White Couple Stock Photo" images - the image description in fact contains the string "white couple".
I'm fairly certain it actually is classifying image contents since at least a year or two ago (or maybe a little bit even longer – I can't remember when exactly I first noticed this), because I've noticed getting image search results that can't be explained otherwise (the keyword I've been searching for most certainly doesn't appear anywhere on the page containing the image in question).
It can be turned off by doing a verbatim search (i.e. surrounding the search term with quotes), but otherwise it definitively happens, and it's occasionally been quite useful actually.
Likewise, since at least around the same time (if not somewhat earlier), Google has also been running OCR on all images it indexes.
While that's certainly possible, this feels like a mistake AltaVista would make 20 years ago, it seems weirdly out of place for Google doing that in 2022. They used to be a bit more context aware and not just present you results due to some bit of text matching.
If I try the search on flickr.com, "white people" returns black&white pictures of people, while "white couple" returns swans. Meanwhile shutterstock.com returns mostly correct results.
Either your theory is wrong, or the other search engines are also manipulating results for political reasons, or they are copying the results from Google.
One theory is that Google has made a substantial change to a neural network based search, and they are still working out the kinks in getting it to work. How could it not be A/B tested such that we wouldn't notice the bad searches? The answer to that I am not sure. I read their research publications, and the NLP research coming out of Google is far beyond any other company. I can only imagine what they aren't publishing.
Google is not meaningfully ahead of other megacorps like Facebook in NLP, and based on this thread and others is likely starting to backslide.
Google may publish more than others, but only because if you write "we trained our models on 24 specialized TPUs for 1 month" then the reviewers instantly know you work for DeepMind despite "double-blind anonymity", and thus you are much more likely to be accepted.
A/B testing with the decision criteria being search ad revenue will prefer lower quality search results ... since you have to search more (and see more ads) to get what you're looking for. :(
Honestly, it's pure speculation. But I could see scenarios where neural-network-driven search is great 90% of the time and misses the last 10%, but in aggregate improves their revenue and product. Users on HN complain about the results, whereas normal searchers don't. The SEO spam works since it answers the question / fulfills whatever distance metric they are using for the search query.
Basic explanation of how one of these systems works:
(offline) the text of each page is run through a model and represented as word embeddings (e.g. BERT).
(online) a user query comes in, is embedded the same way, and the algorithm computes the distance between the query and the meaning of the text on each page.
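Purely to illustrate the shape of that kind of dense retrieval -- this is not Google's stack; the sentence-transformers library and the all-MiniLM-L6-v2 model here are just convenient stand-ins for a BERT-style embedding model:

    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("all-MiniLM-L6-v2")  # small BERT-family model

    # offline: embed the documents once
    docs = [
        "Databricks closes $33M Series B funding round",
        "Best ceramic cookware of 2022 - top 10 picks",
    ]
    doc_embeddings = model.encode(docs, normalize_embeddings=True)

    # online: embed the query and rank documents by cosine similarity
    query = "databricks series b valuation"
    query_embedding = model.encode(query, normalize_embeddings=True)
    scores = util.cos_sim(query_embedding, doc_embeddings)[0]
    for score, doc in sorted(zip(scores.tolist(), docs), reverse=True):
        print(f"{score:.3f}  {doc}")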
Maybe the SEO spam has a good distance metric there, causing flaws, but the search works very well for a large number of users when trying to actually extract an answer to a search.
I work in NLP, but could easily be 100% wrong. My knowledge of their NLP comes literally just from reading all of Google's most recent research.
Interesting. I wonder how they might be approaching continuous learning, i.e. the UI has mechanisms for user feedback that might be useful in some fine-tuning process.
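Nobody outside Google knows whether or how they fold that feedback into ranking, but a standard way to use click signals is a pairwise learning-to-rank update. A toy sketch with made-up features, just to show the idea:

    import numpy as np

    def pairwise_update(w, clicked_features, skipped_features, lr=0.01):
        # One RankNet-style gradient step: push the score of the clicked
        # result above the score of a result the user skipped over.
        diff = clicked_features - skipped_features
        # probability the model already ranks the clicked result higher
        p = 1.0 / (1.0 + np.exp(-w @ diff))
        # logistic loss gradient for the pair (clicked should beat skipped)
        w += lr * (1.0 - p) * diff
        return w

    # toy example: 3 hand-picked features per result
    w = np.zeros(3)
    clicked = np.array([0.9, 0.2, 0.5])   # result the user clicked
    skipped = np.array([0.4, 0.8, 0.1])   # result shown above it but skipped
    for _ in range(100):
        w = pairwise_update(w, clicked, skipped)
    print(w @ clicked > w @ skipped)      # True: model now prefers the clicked one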
The methodology in this article is terrible. It makes me doubt that the people at Surge HQ understand even the most basic scientific concepts.
This is like doing a taste test between two sodas where one is clearly labeled "Coke" and the other labeled "Pepsi". It will end up measuring branding and public perception instead of anything empirical or even objective.
This isn't a measurement of search quality, it's a public opinion poll with a sample size of 250. In fact the whole thing is a poorly disguised advertisement, and I don't think it serves them well.
I've been gone from G for 4 1/2 years now. When I was there, the weekly meetings often featured "search quality" measurements that were rigorous in their objectivity (I thought). They bent over backwards to be non-self-deluding.
I distinctly remember Udi Manber saying "if the web is slow, it's our fault" (actually, the speech was that everything is "our fault"), meaning, really, "take responsibility for problems and don't throw up your hands."
However, the natural tendency of any organization is to reward the suckups and promote mediocre people who just get along with everyone. It wouldn't surprise me if that's what's happened with Google, too.
For example, my first week at Google/YouTube, I was in a New Hire meeting with our VP. Someone asked about profitability, and he responded that Larry said we didn't have to worry about revenue yet, since the main goal was user growth/happiness, and revenue could come later. Which I thought was fascinating, considering how big YouTube already was at the time (in 2013)!
Though I think this changed a year later, and I find YouTube ads a poor experience compared to Instagram and TikTok -- which aren't merely "better than the rest", but stuff I actually enjoy watching.
In 10+ years of targeted advertising, I still have yet to see a single ad for something that is actually relevant to what I want to buy.
I'm actually baffled that these companies, which have my entire purchasing history and metadata, still cannot do a better job at making me spend money than absolute randos on the internet.
Indeed - it really surprised me that we didn't see big ad companies saying: "we've crunched all the data about you, and here is a list of the products you'll be wanting to buy this week. Buy them all? [Yes] [No]".
The company should be smart enough to know I only buy certain brands of dog food. It should know I need a gift for my mum (and what she likes). It knows I need new trainers but will only buy the cheapest. Etc.
The opportunity for this has pretty much closed with the era of data privacy coming in, but I think it is both surprising and a shame that this didn't happen in the last decade.
I was in Ads (the first time) from 2008 - 2010. At that time, there was a "user-specific" (I forget the name) group that specialized in modifying ads based on the user themself. They were VERY very restricted by the lawyers in what personal data they could use. I imagine it's been considerably loosened by now, but I don't know that.
To someone else's point: yes, they are restricted by who bid for the keywords, but honestly, that's the whole magic of Google Ads. You have a "motivated buyer", someone who actually wants to send flowers or stay in Duluth. How do you know that? They searched for "flowers" or "duluth hotels", duh.
Is an advertiser willing to throw money at people who probably want to send flowers, based on their past behavior? Well, maybe, but not on the Search Results page. There are lots of other web venues where they can place creepy ads like that.
This is a fundamental issue with the ad business - the ads are selected from those who bid, not from those that are relevant. For example - if I'm listening to a podcast with a captivating guest, the relevant links would point to their books or blogs. But since they didn't bid for ads, I'm shown some crypto drivel or date-Thai girls scams
I believe that would have been during the time that Google broke the ability to + words (without initially providing another verbatim alternative) in order to use + for Google Plus somehow.
I can't square that change with the claimed commitment to search engine quality.
For the first two, I think that black people use the phrase “white people” more than white people use the phrase “black people”, so “white people” includes black people in the results due to using quotes associated with people in the search ranking. (Whether things people have said should factor into image search is another question.)
But if you just search for “person” or “couple” you get results showing mostly white people and couples. I don’t think what you’ve observed is saying what you think it is…
Well, for the first ("white people") search, I can see that the first two images of black persons have the words "white people" in them, specifically "Opinion: white people know racism..." (haven't clicked to see the rest) and "Why I'm no longer talking to White..." where the next word is "people"; it's a Guardian article, so that explains the high ranking, I guess. What is your point?
Image search does not classify image contents. It uses the site text for ranking of the images on that site.
Do a google (not image) search for "white people" and you'll see that this phrase is mostly used in pages that are in fact about racism and therefore likely to contain images of black people.
It most certainly does, and has been for at least a year or two, or possibly even a bit longer (can't remember when I first noticed this behaviour).
You can e.g. do a search along the lines of site:<domain of online clothing store> <hair style/hair colour/…>, and at least for the most common and recognisable kinds of hair styles, it will actually return relatively reasonable image results, even though online shops most certainly don't have the habit of annotating the hair styles worn by their models on their product pages.
Along the same lines, Google is now also in the habit of OCRing any text content it can find in images and indexing that for search, too.
It's true that it'll still also take the text surrounding the image into account, but it's no longer true that image search is only based on that.
Nothing that can’t be easily explained, of course; yet anyone coming up with a reasonable explanation is being downvoted by the “critical thinkers” of HN, who can only provide low-effort quips instead.
Not completely sure what this “phenomenon” is. There are a few things I can imagine you are insinuating, but they all had simple explanations, so I’m not sure.
Don’t disagree with the main theory that search quality is deteriorating. I have to use increasingly contrived queries to get anything but bullshit blog spam, and indexing seems really odd at times.
It seems weird to me that Google would take "search quality" seriously for 15+ years (including nearly a decade as one of the biggest companies in the world) and then suddenly stop. Are you able to share any of those objective measures of quality? Because it seems to me that most of the discussions around the declining quality of Google search amount to anecdata backed up by reasoning that doesn't really make a lot of sense to me (e.g., "Google only cares about short-term revenue!")
> It seems weird to me that Google would take "search quality" seriously for 15+ years (including nearly a decade as one of the biggest companies in the world) and then suddenly stop.
Is anybody claiming that it’s sudden? Everybody I’ve seen complain about Google search results has seemed to think that it’s been getting slowly worse over a long period of time.
I don’t think google search has gotten worse, I think the SEO abusers have just started winning. So much content on the web is automatically generated and made to look just like real hand crafted content. So much of it has titles and headings which have little relation to the search.
I was recently searching for a piece of ceramic cookware, specifically looking to avoid non stick coatings. And google search showed me lots of listings with my exact search terms, but when I click the page it shows their generic product range which has nothing to do with ceramic despite the title saying so.
The vast majority of these SEO abusers are monetized through Google ads. For every advertising dollar spent through Google, Google takes 30-50%. And 80% of Google's income is from internet ads.
I would be surprised if google's search quality would not deteriorate - they have very strong incentive for that.
I still get strange results from flaky websites that have strange TLDs and the content is literal garbage. Like just words mangled together. Ads on there are the worst and reach into malware territory. Results like these are on the first page. I have no explanation for it, but 100% of my searches for personal use are DDG or Bing now.
>I don’t think google search has gotten worse, I think the SEO abusers have just started winning. So much content on the web is automatically generated and made to look just like real hand crafted content. So much of it has titles and headings which have little relation to the search.
Right, I mean we have nearly daily posts here about the almost-human-quality text you can generate with various machine learning solutions.
Add to that: sites that serve users' needs must focus on doing that, while sites that just game Google search rankings can focus entirely on that.
I mean maybe the problem is that the SEO abusers are only 35% of the way towards being constructive, a la XKCD https://xkcd.com/810/ - so the thing that wasn't considered is as your spammers get closer and closer to being constructive the more stress is placed on the systems that have to deal with fighting non-constructive spam and the worse it is for the users.
But maybe at some point there will be a breakthrough, with SEO abusers able to be as constructive as the actually useful sites. At that point there will be a short window when the SEO abusers are the only sites returning results, causing all other sites to go out of business, and then the downfall of the SEO sites themselves, which rely on actually useful sites for a great deal of their stolen content / content-seeding corpora.
Not sudden for me.
It has been progressively worse over time, and unfortunately the competition is not much better.
Searching today is not as bad as it was during the Altavista/Netscape/56k modem era, but it is getting dangerously close, from my personal and anecdotal experience.
It seems Google started decaying when they gave up their social platform G+
Nowadays all those tweaks to the algorithms, especially search and Gmail spam filters, are like a desperate attempt to make the ship move faster by removing parts of the hull while keeping the water out. It will just sink at some point.
I mean, there are two ex-Googlers in this thread claiming firsthand that search quality was considered sacrosanct as of a few years ago, so it seems like it'd have to have been sudden if indeed it has happened at all.
Well, it's possible they attempted to keep search quality high, and they were unsuccessful but they thought they were successful.
After all, if me, my team and my boss are rewarded when an imperfect measure of search quality goes up and it's going up, why would I fight to switch to a different measure that wasn't rising?
I didn't notice a rapid decline until early 2019, when the search results stopped containing the keywords I used. Before then there was a blend of synonyms and actual keyword matches.
This was roughly when Google's new AI began "interpreting" the "meaning" of the specific jargon and product serial numbers I was inputting, and then decided what I actually needed was song lyrics.
To me, it seems crazy that a company would actively try to self-sabotage their bread-and-butter product in search of a bit of extra revenue when they're already making record profits. Not to mention that, as the author of the article points out, Google published a well-known quantitative paper on why that's a bad tradeoff.
That happens when companies have monopolies. They can sacrifice quality for profits. Happens frequently. Obviously you can't sacrifice quality to zero, but you can a bit. It would make sense that they would keep pushing it down little by little (returning higher and higher profits) to find the turning point. I don't think they're yet in danger though because there isn't a real uptick in the competition.
It happens all the time though. Look at Amazon: they flood their marketplace with cheap knockoff products that hurt the credibility of their ecommerce offerings, but the money generated is too much to shut off. So in the short run, Amazon makes more money for a worse reputation, and in the long run ??? It's unclear.
My dad frequently says that sooner or later the finance guys get in charge of the company and then they cut the fat and keep cutting into the product. Sooner or later they go too far. Seems fairly inevitable, seems to happen a lot. I assume that what happens is that the product matures, so the obvious thing to do is make things more financially efficient. Problem is, the company sells a product and the finance CEO's core competency is reducing costs, and reducing costs isn't something that customers actually buy.
I don't think that's analogous: you'd have to argue that Amazon spent a ton of time and effort on keeping cheap knockoff products off their marketplace and then did a sudden u-turn and decided not to do that anymore. Even then, I don't think the analogy holds, because the quality of Amazon's marketplace listings isn't anywhere near as synonymous with Amazon as the quality of Google's search results is with Google.
>Even then, I don't think the analogy holds, because the quality of Amazon's marketplace listings isn't anywhere near as synonymous with Amazon as the quality of Google's search results is with Google.
What? Why not? And anyway, with Amazon I'm actually giving them money. I'd think that quality there would be more deeply linked to the brand than anywhere I'm just giving some of my time.
The CEO and upper management might simply not care about the long-term viability of the company. Cashing out in the short-term sounds more like what these executives generally do.
Sundar Pichai is probably the weakest of the FAANG executives - if not one of the weakest Fortune 100 executives. Google's search result quality, reduction in overall prestige and lack of new major products are just a few indicators of that.
But speaking in general, there is limited correlation between executive compensation and company performance. You can DuckDuckGo it if you want to find the evidence for this statement.
For whatever it's worth: I had a talk with a WSJ reporter, who said he's met all the FAANG chief executives, and Sundar was the least impressive of them all.
Having everyone you've worked with love you and find you brilliant is not at all the same thing as being brilliant.
It's rational and commonplace behavior for a monopoly to prioritize profit maximization over product quality; there must be thousands of case studies over the years. If a firm has no viable competitors, why would they care whether their product is good or not?
The cases you mention involve maximizing profits by saving on costs. The theory above is that lower quality leads to an increase in revenue, and it is tenuous at best.
I think just attributing this to the right people not being in charge oversimplifies the problem.
Search is a complexity beast and simply continued to grow in complexity during the several years I worked directly on it. Folks were proud of the fact no one could even enumerate all the features in the system (attempts were made and abandoned).
The tools to change search safely weren't keeping up with the complexity of the system. Understanding impact with evals and experiments became much harder. Gwsdiff and friends grew flakier. Debugging had so many different entry points depending on what you needed to do.
The search stack deserves some really deep cleanups and refactoring, the eval and devtools are similarly in need of a ton of love.
> the weekly meetings often featured "search quality" measurements that were rigorous in their objectivity (I thought).
I wish this part got discussed, but every time I've attempted it, the discussion has been shut down by "lol they're experts at search and you're not and you don't know what they know."
I wouldn't put it past Google to be blindsided thinking their own metrics are objective (perhaps they are objective measurements of something but not of what they actually want to measure). If anything, the battle with SEO just shows how hard it is to do something right and avoid getting gamed. If they can't rank SEO spam off the front page, why would I believe their measurements are any better than the rankings?
There's also always a small possibility that the metrics are worse than wrong; they could actually say everything is fine, and keep serving these long-form SEO spam articles that people click and read for far too long before realizing they don't have the answers they seek.
But slow/fast isn't quality. What were the rigorous measurements? Latency of results to click? User clicked above the fold?
I have wondered about this now for several years, because from my point of view Google search has steadily degraded since around 2010. The specific degradation isn't that it returns nothing, or irrelevant pages, but that it returns mostly _recent_ content, and returns very little _older_ content which is still assuredly on the web.
I can see how Google's approach will work for the average search query - after all, the average query is probably about something happening now - but that doesn't equate to _quality_.
> But slow/fast isn't quality. What were the rigorous measurements? Latency of results to click? User clicked above the fold?
Slow and fast is what we experience. We live in a phenomenological world, not a world of millisecond metrics. While it isn't rigorous, for humans, numbers are pointless. We can't experience them. What matters is whether it feels fast.
That isn’t quality. That’s like saying a crappy textbook full of errors that ships prime same day is better than a good textbook that takes a week to arrive.
Just an anecdote, but I see quite a few people in my LinkedIn network who are now at Google within the last few years. People I wouldn't consider "Google quality".
I tried the query "databricks series b valuation" on Kagi (just a beta user there) and the results were:
1. Databricks Funding Rounds, Valuation and Investors (https://craft.co/databricks/funding-rounds) - not directly to the point but does include information about all rounds.
2. Databricks Raises $1B at $28B Valuation, Plans Massive - not answering the question at all.
3. *Databricks Closes $33M Series B Funding - FinSMEs* (https://www.finsmes.com/2014/07/databricks-closes-33m-series-b-funding.html) - Direct hit! Didn't even have to click into the page.
In my mind this is yet another proof of Google's search quality decline. I remember being so excited when I saw the first structured search result but now I tend to use other engines first.
The article forgot to mention censorship. All kinds of results are censored/banned from Google. This deterioration is just as bad, maybe worse, than the one caused by paid-for results or simply inaccurate results.
Yes, this is the reason I resorted to bing a few times in the past week. Google seems to penalize certain news sources to such an extent that vaguely remembering the content of the article does not suffice any more to find it again, whereas the article I have in mind usually shows up in the top results on bing with the same query.
I actually want this in my search engine. I want it to exclude low quality news sources and sites with cloned content. But I want to control the exclusion list myself, like I used to be able to do.
What if someone wanted to get a large number of results, more than 300, for a common term found on the web. Is Google suitable for such a task.
How could this be proven. Search for the term "example" and see how many results can be accessed. Is it greater than 300.
By limiting the number of results that can be accessed, Google is "hiding" a portion of the web from the user. How do we explain this practice. Perhaps that portion is not deemed useful for user data collection or advertising purposes hence it is excluded. Perhaps Google wishes to prevent users from accessing large chunks of its index. Who knows.
We need a search engine that exposes the full web and does not try to guess what someone is searching for. If the user requests all pages with the term example, then that is what the search engine returns. Google is far too limited.
If one goes to a library and searches an academic database, she is never precluded from viewing all results. Even though she may only access the first few pages, she is always allowed to see all the results. She can view results that were low on the relevance scale to understand why they scored low. She can then subsequently narrow her search. I have seen this implemented in non-academic databases as well. First a broad search is performed. It returns all results, not just a portion. These results are stored. Every single result is accessible by the user. (Not possible with Google. User only sees 200-300 max.) Then the user can narrow her search and search within those results. The user repeats using different searches until she has what she wants. The user controls the number of database items she wishes to search, based on the initial broad search. With Google, the user has no such access to all results from a broad search, nor the ability to search exclusively within that set. Google is extremely limited. Everything is geared toward user data collection and online advertising. It truly detracts from any search functionality they may have to offer. The user is the product, not the database.
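For what it's worth, the iterate-and-narrow workflow described above is trivial when you actually have the full result set in hand; here is a toy sketch with a hypothetical three-document corpus, obviously nothing like web scale:

    def search(corpus, query):
        # Broad search: return every document containing all query terms.
        terms = query.lower().split()
        return [doc for doc in corpus if all(t in doc.lower() for t in terms)]

    corpus = [
        "An example of a Series B term sheet",
        "Worked example: oil vs oil-less air compressors",
        "Example sentences for the word 'ubiquitous'",
    ]

    # 1. broad search -- the user keeps the *entire* result set, however large
    results = search(corpus, "example")

    # 2. narrow by searching only within the saved results
    narrowed = search(results, "compressor")

    print(len(results), len(narrowed))   # 3 1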
Incidentally, I noticed very recently (in the last month or so) that searches in google maps (at least on mobile) are also starting to return completely irrelevant results. Something is seriously wrong at google.
The real question about G is this: who is really in charge there? Answer that, and the explanations for the whys of their behavior will fall into place. I'm not pretending to know the answer, but this question has helped me figure out what's going on with other companies and their strange directions taken.
It started happening to me around November. I can understand if it shows me an establishment/location with a similar name but the results it gives have nothing in common with the location I'm searching for.
This drives me nuts when I'm trying to navigate somewhere, no pun intended. For example, I'll tell it I want to go to a trailhead for a lake but it will switch it to the literal middle of the lake itself.
That also happened to me. I once wanted to drive to a port, and Google decided to direct me to drive straight into the water. (The residents even put up a sign that says 'If Google directed you here, you are going the wrong way'; it seems this had been wrong for many years and never got fixed.) Who on earth thinks listing a port/lake/ocean as a driving destination is a good idea???
Yesterday I was googling for “rub fingers genie magic”. I was looking whether it was a cup, kettle or a vase you “rub fingers” to invoke the magic genie.
But Google would automatically change it to “jenie” and bring up Amazon crap. I tried adding “alladin” and it changed the query to bring up Alibaba crap.
Google had no idea about the famous alladin and genie story.
Google not only misunderstood me, they thought I was an idiot for asking a non e-commerce query.
The top 5 results above the fold were all ads. It was truly frustrating. I called a friend to ask them that question and get an answer.
It might be because it's spelled "Aladdin", it's a lamp, and you don't necessarily rub it with your fingers, you just rub it. Your queries were way off.
If you had just Googled "Aladdin Genie" you would have every single result in the first page be relevant.
There's one context I am aware of that "rub" and "genie" go together, and I would certainly expect Google to suggest something with Aladdin pretty high. There is that song with the lyrics "I'm a genie in a bottle, you have to rub me the right way", but I expect Aladdin to overwhelm the results. Not getting Aladdin seems to be a big failure.
I tried the query in Firefox incognito, logged in, and in Safari on iOS using the cell network (not my wifi), and got identical results:
1: Youtube video from the "Aladdin" movie (didn't watch, so not sure)
2: Quora question about genie not giving three wishes
3: Movie Quotes - Aladdin
4: Etsy lamp product
5: Stock photo of man hand rubbing lamp
6: Amazon product
7: Genies in popular culture - Wikipedia
8: Iconfinder result
9: Genie - Disney Wiki
This seems rather different than the poster. It's got way more commercial stuff than I'd like (but how does Google know I'm not looking to buy a stuffed lamp toy or something), but the answer to their question would be in the Wikipedia entry, I expect. It might be hard to find the original story, though, since the search term "Aladdin" is probably overwhelmed by the movie.
Lately Google has been giving me results that ignore my operators like quotation marks and - signs. I knew it was all downhill from here when Image Search became 50% video results.
Google search is OK for me in English. In my native language (central Europe), it's almost impossible to find content that is not eshop or someone selling something. I'm looking for HTML4 articles, phpbb forum discussion, etc. where people actually share knowledge, but I only get super-SEO optimized crap. The best I get is a blog spam from an eshop.
Does the content you're looking for exist in your language, and if so, can you find it on some other search engine? It seems like a lot of 'knowledge forums' are in English.
Indeed it does. These are mostly queries for non-computer stuff, like when I'm interested in the difference between various variants of pepper sprays, or the merits of oil vs oil-less air compressors, etc. All the queries I can think of that would have a Wikipedia page at the top in the English version give me only eshops and blogspam from sellers in my local language.
For tech stuff Google is usually pretty good; sometimes I have to change my query a little, but it works.
For non-tech stuff it's a disaster. I recently looked for a specific brand and model of car tyre datasheet (I know it may be a niche), but still I got ONLY ads. Not a single result was about data, let alone a datasheet; all were ads.
>For tech stuff Google is usually pretty good; sometimes I have to change my query a little, but it works.
I rarely search programming topics nowadays. However, anything related to electronics/mechanics/disassembly - teardowns/etc. is a rather futile experience.
I just tried to find this article on google by searching "Can cats eat blueberries bad search results" (Can Cats eat blueberries is very relevant to the article) and it showed up on the first page of results, albeit at the end.
I'm very confused by this article. They seem to just assume that breaking up Google would fix the problem, but they don't explain why. To me it seems that a fundamental problem of search on the internet is that it's adversarial, which means Google needs to make these ranking changes to try to make things better in the SEO war. How does breaking Google up fix this? Wouldn't all the new search engines just be worse at fighting SEO spam, leading to even more spam?
Disclaimer: I work at Google, though not on search.
Hah, I love that DDG, Bing and Google all return some article about cats being able to eat blueberries, while Kagi delivers your article at the first position :D
I hate when I search for tech issues and I get doc pages but the deprecated docs get top billing. Ironically, this often happens when looking up Google's documentation for their own products. Also, sometimes I am looking up a term out of curiosity / intellectual interest but the results are dominated by people selling products for that term. I agree that search could use an overhaul; maybe even niche search designed around certain domains could be a target for producing something better.
On a slightly different tangent, has anyone noticed that predictive text while typing on a phone (iPhone for me) seems to be getting progressively worse? Like it predicts the most random words that have little connection to what I typed on the screen.
I don't know if all my mistakes over the years are being remembered and so the prediction algorithm is accumulating more errors and performing worse, or if something else is at play (clumsy fingers as I get older?).
I never understood why they didn't expand the "did you maybe mean" section when they pushed into ML based query interpretation.
Let me search what I want to search the way pagerank trained me to. Keywords over sentences.
But if you want to start pushing in that direction why not just add a "you may find better results when phrased as" section which these models appear to be preferential towards.
From the examples (ngrok, for instance), search didn't even care to include the keyword in all results. Even if the other page is more "popular", the fact that you'd have to (and I have repeatedly had to) quote the term to insist it be included is nuts for a search engine.
Finding obscure bug solutions was already barely possible, but has become impossible when even explicit error messages are "interpreted" to 'why won't <unrelated program> do <unrelated keywords>' on StackOverflow because it noticed that the program uses the same library and has had some searches with the same process name in the error line.
I think there's more quality content on the web now than there's ever been.
It seems like Google wants to prioritize the commercialized web, though, by boosting BuzzFeed-like online publications that post ad-laden articles full of referral links in their search results, along with social media. Even in the development space, Google will prioritize w3schools and content farms that constantly churn out how-to guides. Blogs and forums seem to get deprioritized in search results these days.
I think it's complete BS that when I search for something like Basecamp, they show the paid advertisement before the organic result, even though they're the same.
Feels so slimy. The organic result would have sufficed.
Nothing but a glorified 1997 Yahoo Index at this point.
I never click the advert. I know I'm probably in the minority, but I won't give them the infinitesimal revenue.
While Google results may well be deteriorating, this example strikes me as a poor one. Unless you're trying to demonstrate poor results, why wouldn't you put quotation marks around "series b"? There's no point in making the algorithm guess what you mean when you can be explicit.
Putting "series b" in quotes doesn't actually help! https://imgur.com/a/zNNTK3d And "series b" is a common enough phrase that I wouldn't guess you should have to.
Personally I'm at full scale war with ads on the internet and I make sure I see almost none (nowadays this can be achieved in a lot of ways - Pi-hole, NextDNS, uBlock etc.)
So when recently my wife and I were searching for a holiday destination online on her laptop (with no blocking as she gets very annoyed if blocking makes even one page unusable - she'd rather drown in ads all day) I was pretty shocked that for almost all search queries, on a 13" laptop with 1280x800 resolution and the browser running full screen at 100% zoom, the >>entire visible Google results page is all ads!<< There is literally no organic search result visible "above the fold"!
So, Google could improve their search engine dramatically and the typical searcher would not even see the results of it...
Would you agree that every business needs to find ways to earn money?
If we agree on that, could you agree that using a "free" service might be free only in the sense that you don't pay any money but instead they use your attention to sell ad space which in turn pays for their bills?
Me personally, I don't see anything wrong with that.
For instance, I'm perfectly fine with a website earning a commission on the products it reviews. I go out of my way to make sure the website that convinced me gets their commission.
However, a lot of the content is now written specifically for the commission, with zero effort spent on the content itself. That's how you get "best products of 2022" lists on January 1 2022, which contain zero useful research about the products. They add no value.
I also see more and more websites that just rephrase other websites' content and replace the affiliate links with their own. In the end, you get dozens of websites feeding off the same 2-3 original articles.
My favorite quote on the subject is this one by Banksy:
“People are taking the piss out of you everyday. They butt into your life, take a cheap shot at you and then disappear. They leer at you from tall buildings and make you feel small. They make flippant comments from buses that imply you’re not sexy enough and that all the fun is happening somewhere else. They are on TV making your girlfriend feel inadequate. They have access to the most sophisticated technology the world has ever seen and they bully you with it. They are The Advertisers and they are laughing at you. You, however, are forbidden to touch them. Trademarks, intellectual property rights and copyright law mean advertisers can say what they like wherever they like with total impunity. Fuck that. Any advert in a public space that gives you no choice whether you see it or not is yours. It’s yours to take, re-arrange and re-use. You can do whatever you like with it. Asking for permission is like asking to keep a rock someone just threw at your head. You owe the companies nothing. Less than nothing, you especially don’t owe them any courtesy. They owe you. They have re-arranged the world to put themselves in front of you. They never asked for your permission, don’t even start asking for theirs.”
My personal philosophy is: if Google wants my money they should ask for it. Put their search engine behind a paywall. If they try to coerce me to give them money by selling my personality, needs and wants to advertisers I’ll try to prevent them any way I can.
I’m shocked that the original author didn’t think to put “series b” in quotes instead of as separate words. Doing that returns a table format of every databricks round as the second result.
In general, “term” is like the old +term format which we all used to use back when we thought search was good until google killed the +.
In reality the web has gotten worse, with far more bad spam actors, and our search habits have changed, partially because Google steered us towards more natural-language-style queries and away from the old-style queries many people in this community used to use.
In other words, we got used to the convenience of free-form text without special operators, and when the webspam increased, we feel that having to go back and futz with the query using these operators means search has degraded.
On the other hand, Bing got it right. I’m the sort of person that would happily see AND and OR back as search operators, but most people aren’t, and they shouldn’t have to be trained to put things in quotes just because.
If we could only give feedback on search results, things would improve. I am tired of finding the same articles over and over again that just update the year every year and give you a top list of product "reviews" with a link to Amazon.
If you are looking for an alternative, try Neeva. Being ad free means we aren’t beholden to advertisers, and unlike the other alternatives such as DDG, Neeva isn’t just a wrapper around Bing results. Hope you enjoy it.
Disclaimer: I started you.com, a private search engine with summarization.
I've seen many users complain about Google's quality and wanting to have some control over their sources. That's why we added source preferences as a feature (you have to log in).
You can also change the order for a query and similar queries directly within the search results page so that you eg see StackOverflow or Code Completion higher than web results.
Why only 2022, really? Putting all other subtle things aside, I myself feel it's getting harder and harder to distinguish ads from genuine results without reading the contents carefully.
Google's primary differentiation used to be that they were by far the best search engine. Since then, they have moved upstream:
1. Android
2. Chrome
3. A deal with Apple
Even if a much better search engine were to emerge, without an extremely large delta in search quality, they may struggle to compete against Google because Google controls the entry points.
Apple & Microsoft still control the platforms (OSes) upon which Google reaches the majority of consumers, though. Seems likely that Apple will try to displace Google eventually.
I cannot rate any of the results anymore as being good.
I want my search engine to search for the keywords I typed in, not what the search engine thinks I want to search for.
If Google shows me a different person than the one I'm looking for, I am at fault, not the search engine. I should be more specific.
So yes, Google's search quality sucks big time, but not in the way the author thinks. He thinks Google should know what you are searching for, which to me is a really bad approach.
I've been saying this since 2016! So what can be done about it? There are certainly better search engine alternatives out there. But the market doesn't seem to want it? The average Joe and Jane still go right to Google for search and Chrome for browser. And the truth is that as soon as a solid alternative comes out, the entrenched incumbents will squash it like a bug through acquisition or other means.
I recently changed over to DDG as my primary search engine (for the nth time...) and this time it seems like it sticks. The quality of Google search results has been dwindling; that and a lot of ads just pushed me into switching over again. I have to say that DDG has improved its service as well, so hats off to them.
Yesterday on HN someone claimed that they were more or less happy with code that they wrote 10 years ago. My question: is this a brilliant senior dev that I could learn something from, or a grad student who doesn't know what they don't know? My impression is that 2015 Google could answer this question but 2022 Google cannot.
Worse, when one is polyglot, it will just automatically translate search results or return local results (after automatic translation) when I want the original content I am looking for.
Then there are all the shitty URL redirects, added for analytics, on any kind of document it returns.
There's been a lot of HN discussion [1][2][3] about Google Search recently, and whether it’s gone downhill.
I used to work on Search and Search Measurement at YouTube, Twitter, and Microsoft, so I thought it would be fun to move beyond anecdotes, grab some data, and do a quantitative analysis.
tl;dr I didn't have historical data to see how Google Search has trended over time. But compared to Bing, Google still generally outperforms -- although some of its failures are pretty surprising!
If there are particular areas of Google Search that people are interested in digging into, give a shout -- I love running these kinds of search / human eval analyses.
I think the broader theme that these outrage bait posts miss is that the web itself is rotting / deteriorating. Most interesting content isn't on the web anymore and is definitely not in text.
Content that would have been on an open web forum 10 years ago is now hidden behind various data silos/walled gardens - whether it's Reddit, YouTube, TikTok, Instagram, Twitter or Discord. Each of these walled gardens has a different level of tolerance for the open web, from using it as an SEO channel (e.g. Reddit) to being completely opaque to it (e.g. Discord).
Some car and photography forums I was on ~10 years ago are all barren now as the old users have moved on and new users prefer to communicate in a Facebook group or something like that.
Interesting self-published blogs and recipes aren't on the web anymore; they're on YouTube channels or on someone's Instagram channel (or whatever it's called).
There's no less interesting information on the internet than there was a few years ago - to the contrary there's much more. I agree that the ratio of good to trash content has plummeted but it's the job of Google and its competitors to find the good stuff. If they're failing at their core competency then it's worth discussing.
The experiment I want to see is one where (1) the same set of test queries is run on multiple search engines (2) for each query result, the result links from all engines are combined into a single document (without including information about the search engine from which they came) (3) the links are ranked blindly for fitness in relation to the search query (4) the fitness rank of the results is then reunited to the source search engines.
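A minimal sketch of what steps (2)-(4) could look like, assuming the per-engine result lists from step (1) have already been fetched; the judge function here is a hypothetical stand-in for a human rater who never sees which engine a link came from:

    # Pooled, blinded evaluation of results from several search engines.
    # Step (1), running the queries and collecting per-engine URL lists,
    # is assumed to have happened already.
    import random
    from collections import defaultdict

    def blind_eval(query, results_by_engine, judge, seed=0):
        # (2) Pool the result links from all engines into one de-duplicated set.
        pooled = {url for urls in results_by_engine.values() for url in urls}

        # (3) Present the pooled links in a random order so the rater cannot
        #     infer which engine a link came from, then collect fitness scores.
        pooled = sorted(pooled)
        random.Random(seed).shuffle(pooled)
        scores = {url: judge(query, url) for url in pooled}

        # (4) Reunite the blind fitness scores with the source engines and
        #     report a simple mean fitness per engine.
        per_engine = defaultdict(list)
        for engine, urls in results_by_engine.items():
            for url in urls:
                per_engine[engine].append(scores[url])
        return {engine: sum(s) / len(s) for engine, s in per_engine.items() if s}

    if __name__ == "__main__":
        # Toy data and a stand-in rater, purely for illustration.
        results = {
            "engine_a": ["https://example.org/good", "https://example.com/spam"],
            "engine_b": ["https://example.org/good", "https://example.net/ok"],
        }
        fake_judge = lambda query, url: 0 if "spam" in url else 1
        print(blind_eval("databricks series b", results, fake_judge))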
Good list. To add to it: I'd like to see the same queries run from a) multiple IP addresses (where you search from seems to affect results), and b) what the differences in results are for those with Google accounts and those doing history-/accountless searches.
YouTube now seems to provide me only ~10 relevant results and then random other videos I might like, with no way to delve further into results based on my search term. This makes it almost impossible for me to find certain things, especially videos without a lot of views, unless I know the exact title of the video ahead of time.
What are some of the decisions that go into this? Is it just cost savings?
I'm pretty sure the term of art for submitting a headline in the form of a question when the article does not and cannot answer the question is "clickbait".
To answer this, Google's search results need to be compared before and after. The OP is talking only about the current search quality and a comparison to Bing.
Even on Google Maps, many of the bad reviews are getting automatically filtered. My guess is that it's to protect businesses. Many of the Google services just keep getting worse and worse.
One thing that's recently been driving me crazy with Google: how the heck do GitHub-aggregating websites get better SEO than GitHub itself when I search for a repo?
I don't understand how you people are getting bad results. My first link for clam antivirus is clamav.net, and the remainder on the first page are download links and Wikipedia links about ClamAV.
I'm not sure if you people are trolling or what.
With sincere respect to the author, I stopped reading the article. Even though I found the topic interesting and am definitely inclined to believe that Google's search results are declining in quality, I felt that the search terms and criticisms were distractingly unfair.
If someone were evaluating hammers and handed a bunch of hammers to beginners and judged the hammers by those results, I'd tune out there, too, for the same reason.
A suggestion for improvement would be to have the users RTFM (read the fine manual) first, and then take a reading a week later on their everyday search results. Google is a tool, like any other. Know your tools.
Always search for a multi-word proper name, especially a common name like tim lee, with quotes. Put "tim lee" into quotes, and Wikipedia shows up on the first page:
Is vlogger really a common search term? Personally, I'm okay with Google suggesting blogger because I don't think vlogger is all that common. But maybe? Perhaps a better search term would yield better results?
Anyway, this kind of criticism works best for me when the critic gives the subject the most reasonably charitable chance and then talks about the bad. Expecting great results out of bad search terms isn't reasonable, in my opinion.
I understand your point but I find it reasonable to expect google to do some of this stuff on its own by now. I assume that only a small portion of its users know how to use "advanced" searches such as using quotes or ~ etc. In the same way that google uses search completion, they could try to group important parts together.
But I agree with you that the example search strings seem somewhat fabricated, or at least especially selected to produce bad results. In my experience the best way to get good results is to use as few relevant search terms as possible. "databricks series b" has the right answer at position 1.
> I assume that only a small portion of its users know how to use "advanced" searches such as using quotes or ~ etc
That's true, and I'm not doubling down here when I say:
I tune out when I'm watching a horror movie, and the characters behave in a way that's designed to bring themselves sorrow. As soon as the characters know they're in a horror movie but decide to split up or go down to the basement alone anyway, I stop watching.
The article totally misses the point. Yes, for the digital morons (their turks) Google may work perfectly fine. However, for people who actually look for real content, Google gets worse and worse. All the good material gets ranked far lower than the repetitive, low-quality Medium posts with identical and non-existent content.
Far worse than this: it is destroying the quality of content produced by well-meaning creators.
Every review (and almost every site now) has to have the word 'best' wedged in 1000 times, plus text descriptions of what would be far better communicated by graphs or diagrams. The vocabulary and grammar have to be reduced to duckspeak (which becomes more verbose, less clear, harder to read, and more ambiguous). There need to be a hundred repetitions of whatever keywords are popular. And there needs to be 50MB of JavaScript and anti-responsive nonsense for what would be more legible, more attractive, more accessible, and actually usable on a phone if it were plain HTML with no styling.
Then you need a further 20MB of JS libraries to progressively or dynamically load 100kB images.
It's not just highlighting terrible content, it's actively destroying good content.
Not sure the article misses the point, then. If Google satisfies your "digital morons" by making it slightly harder for the more advanced users to get the results they want, couldn't that be the right thing to do? For example, over time Google has been using synonyms more, so that often the word you searched for doesn't appear in the page. This works very well for most people, but makes the self-proclaimed experts complain.
This reply is based on my own biased experience, but I run a small website to share public trail data, and I've found that Google (in my opinion) artificially suppresses my site's results on really basic searches. Within Google's Search Console you can easily check if a page has been indexed. For example, I've published new trails/hikes in the past where Google's index claims it includes the page, but when I search "myhikes <name of trail>", it sometimes doesn't show up - even after clicking through multiple pages! If I change my search to "site:myhikes.org <name of trail>" it'll show up... weird? I think so.
I understand how keywords can be confused by search engines and "myhikes" is fairly generic as many people might post a blog with the string "my hikes", etc. Now if I search a popular trail that Google likes to serve up regularly (i.e. "myhikes <name of popular-indexed-trail>") it comes up as 1st in the list.
Additionally, what pisses me off even more, is that I've searched for "myhikes <name of trail>" and have been served Google's own map / shitty trail tiles ranked as #1, then my site is ranked #2. Doesn't that last bit feel a bit anti-competitive? It does to me, but maybe I'm biased.
Thanks for the reply! It's actually up and running, but if the response says "forbidden" it's likely because I blanket-block a lot of non-US IPs, AWS IP ranges, etc. because of annoying crawlers (roughly along the lines of the sketch below). This is bad practice, but I do it for several reasons. I've turned off some blanket-blocking for now.
If you see this comment, would you mind sharing if you were making a request from a US-based IP, VPN, or outside the US? Just curious - it'll help me understand things a bit better.
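For what it's worth, here's a rough Python sketch (not my actual setup) of one way the AWS part of that blanket-blocking could work, using Amazon's published ip-ranges.json; blocking non-US IPs would additionally need a GeoIP database, which I've left out:

    # Rough sketch: reject requests whose client IP falls inside one of the
    # AWS IPv4 ranges that Amazon publishes. In practice you'd bake these
    # ranges into your web server or firewall config, not check per request.
    import ipaddress
    import json
    import urllib.request

    AWS_RANGES_URL = "https://ip-ranges.amazonaws.com/ip-ranges.json"

    def load_aws_networks():
        # Amazon keeps its current IPv4 prefixes at this well-known URL.
        with urllib.request.urlopen(AWS_RANGES_URL) as resp:
            data = json.load(resp)
        return [ipaddress.ip_network(p["ip_prefix"]) for p in data["prefixes"]]

    def is_blocked(client_ip, aws_networks):
        ip = ipaddress.ip_address(client_ip)
        return any(ip in net for net in aws_networks)

    if __name__ == "__main__":
        networks = load_aws_networks()
        print(is_blocked("52.95.110.1", networks))   # should fall in an AWS range
        print(is_blocked("192.0.2.10", networks))    # documentation range, not AWS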