I was AltaVista's first corporate webmaster, joining in January 1996 to manage everything except the actual search engine itself.
It was my first real job at a large company and taught me a lot about working in corporate America.
I saw so many mistakes made within the year I worked there, mistakes that were obvious even at the time to a lot of us who worked there. At the same time, there are many similarities to what happens with other very well funded projects trying to make sense of a new technology and a new way of doing business inside a large, very important company with a very different business model.
I have seen virtually the exact same playbook play out in the enterprise blockchain space on multiple occasions over the last 5 years.
It is sad in many ways to see what happened to DEC (probably more so than AltaVista). It was such an innovative company back in the 60s and 70s, but unlike IBM it wasn't able to reinvent itself, first for the new 80s world of PCs and then later for the internet. A classic case of the innovator's dilemma.
AltaVista itself largely died because, in a misguided attempt to manage the innovator's dilemma, they just tried to rebrand everything network-oriented they had as AltaVista.
People only remember the search engine now, and for good reason. But we had AltaVista firewalls, gigabit routers, network cards, mail servers (both SMTP and X.400 (!!!)), and a bunch of other junk without a coherent strategy. Everything that had anything to do with networking got the AltaVista logo on it.
The focus became selling their existing junk under the now-hip AltaVista brand, while AltaVista search itself was not given priority.
I learnt a lot from my experience there, grew to be extremely skeptical, learnt to love Dilbert, and also learnt how cool DEC hardware and Digital Unix were compared to the Sun SPARC and Solaris stuff I had to work on afterwards.
Anyone familiar with Inverted Text Indexes when AltaVista was at its peak appreciated DEC's business model and execution. Google's PageRank pulled the rug out from under standard Inverted Text Indexes and quickly obsoleted the competition. Re-ordering the entire index nightly with potentially multiple passes is non-trivial and requires throwing a lot of hardware at the problem.
In my opinion, marketing wasn't the problem; the problem was fast-following a disruptive technology without understanding how it worked. Once details of PageRank were published it was too late.
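For anyone who never looked under the hood: the core of PageRank is just a power iteration over the link graph. A toy, single-machine sketch (purely illustrative; the function and data here are made up, and the real thing obviously ran distributed over a web-scale graph):

    # Toy PageRank power iteration -- illustrative only, not how Google ran it.
    def pagerank(links, damping=0.85, iterations=20):
        # links: dict mapping page -> list of pages it links to
        pages = list(links)
        rank = {p: 1.0 / len(pages) for p in pages}
        for _ in range(iterations):
            # every page gets a small baseline, plus shares from its inbound links
            new_rank = {p: (1.0 - damping) / len(pages) for p in pages}
            for page, outlinks in links.items():
                if not outlinks:
                    continue
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    if target in new_rank:
                        new_rank[target] += share
            rank = new_rank
        return rank

    print(pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"]}))

Even the toy version hints at the schlep: every page's score depends on every other page's, so each re-rank is another full pass over the whole link graph.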
Why was it too late? They were still number one when PageRank was published; all they needed to do was put more resources into it. Google's algorithms weren't magic. The biggest problem was lack of vision, and willingness to bet on that vision. Investors in Google were swinging for the fences; that mentality would never have been at home at DEC.
Google’s algorithms were definitely “magic” by the standards of the time. Altavista never got much better than literal keyword search even when they introduced PageRank-style weighting. Google immediately had useful results from misspellings, questions, and related keywords, and only got better from there. Plus it was much faster; responses in 1-2 seconds versus 5-10 sometimes for AV. Maybe what you said is true, but they were deep in the hole as soon as Google launched.
I don't recall AltaVista's response time as slow (although I was impressed by Google's fast response time, which they proudly reported back then). It was the page-load time, which was slow for AltaVista compared to Google's, that initially drove me away once that alternative was available.
Yeah, all portal sites were cluttered heavily. Google was only the logo, the search input field, and the submit button. That was one of the biggest advantages for me.
True, and Google is slowly being disrupted by the next generation of Internet search (looks like disruption happens every 20 years or so).
The age where "search" is meant to return pages will eventually end. I know it's early, but you can think of GPT-3 as the next generation of search engines. You consume the entire "Internet knowledge", and you answer questions, by merging information from multiple sources. Not just returning one existing page. That's sort of what GPT-3 does, you just don't think of it as a search engine yet.
Google felt "magic" in 2000. A search-oriented GPT-3 will feel "magic" soon.
GPT-3 will be remembered and used by no one in no time. Google is already answering questions directly in response to queries and also offering alternative queries. It's now in a hybrid state where it tries to answer your question but also provides links. Quite interesting and quite usable already. GPT-3 is a cute experiment, nothing more.
The lists of questions that Google returns as snippets can be seen as possible elements of a paragraph for a process that doesn't quite know what a paragraph is. Clicking on them acts as a survey response, a signal for that element of information's significance in the topic for which you searched.
> Google is slowly being disrupted by the next generation of Internet search
What is the name of the search service (or services?) that are better than Google? You make such big statements and provide zero examples.
Bing is inferior to Google; duckduckgo is even worse - the search is often unreliable (even after years of work); Tineye got hindered by GDPR and it was search via picture anyway. So what else have we got there?
I remember the time when Google came out very well. It was a strictly better product than Altavista. Once you started using Google you would never go back to Altavista - the difference was like night and day. Altavista was dead for you. There were no obstacles to switch either.
I assume that if a new product came today, that was better, even say 25% better, people would start switching via word of mouth.
Back in the day AltaVista lost because their technology was inferior. It wasn't about website design (although Google's clean landing page helped), it was 100% technology: Google was strictly better. In fact, at that time there were other search services too - they were even worse - and often provided nearly random results (e.g. search for William Shakespeare -> get random porn website...).
The PageRank algorithm sounds easy once you know it, but it is only easy once someone tells you about it. It is much more difficult to invent. Back in the day many other people worked on improving search (those were the times of catalogs and webrings) and Google were the first (?) to come up with something like that. PageRank was basically bleeding edge research sponsored by spy agencies. It only sounds easy with the benefit of hindsight.
Also not directly to you, but the opening poster basically writes that:
* they could have made AltaVista work - but the meddling management hindered it (how? did they have their own PageRank equivalent? I doubt it)
* they could have made blockchain work - but the meddling management hindered it
I see a certain Scooby-Doo pattern. And a senior developer/manager/architect (with 20++ years of experience under their belt) who claims that they could make blockchain work, while most people with such experience know that blockchain is an empty buzzword.
Not all of search is question/fact based. I'd love to see the real numbers, but I'd think that less than a third of all searches are suited to a factual question-answer system.
Google search is still largely based on human feedback: what people link to, what terms they choose to use in relation to other terms. It will be difficult to disrupt that.
Does the ad business model work with a GPT-3 answer?
Isn't the point of Google to make things "confusing" enough that I will click on the ad because it's clearer?
How would a GPT-3 search-oriented answer help with that?
Google was inferior to AltaVista for quite some time.
According to Wikipedia: "In 2000, AltaVista was used by 17.7% of Internet users while Google was only used by 7% of Internet users, according to Media Metrix."
So, two years after its official founding, Google was still not ahead of AltaVista.
AltaVista even had a "visual clustering" thing that used Java (and would work great now with JavaScript) that allowed you to refine your searches. I still cry that we don't have the equivalent of that 25 years later.
Then AltaVista got caught in the great DEC "hostile giveaway". Which left Google sitting in the right place with nobody to really compete against them.
One problem was that AltaVista was early--it basically predated common people using the web--and so it wasn't quite so clear how you monetized it. DEC effectively ran it as a goodwill "free service" to the internet. Even in 1999, on-line commerce wasn't very big. Remember, the big AOL/Time-Warner merger was in 2000.
The one thing that Google got right was timing. Google was in the right place when everybody switched from services like AOL to basically just accessing the web directly. And this let them put ads in the search which could be used for monetization.
(Also, as an aside, it's possible for that statistic to be totally true too.)
Early Google users definitely trended towards power users, and if the average early Google user made 3x as many queries as the average Altavista user, Google would have higher actual query volume.
The publishing of the PageRank algorithm was the wrong milestone; my bad. The important milestone was a less well-defined point when search engine users and competitors realized that Google was disruptive.
PageRank was the way forward but the true schlep, in the Paul Graham sense [1], was the brute force continuous hardware scaling required to keep up with the growth of the Internet. DEC needed a technical pivot first and foremost. I've seen no evidence that the AltaVista team understood the technical challenge but I could be easily convinced that bean counters and/or management stifled a promising response; absence of evidence is not evidence of absence.
My recollection from the time (I didn't work at either place but was in the general area) was that everyone assumed Google were delivering their performance by use of massive hardware resources (I heard 70K machines at the time). So perhaps AV folks just couldn't imagine getting the hardware budget to compete effectively?
It's also worthwhile I think to point out that at that time, it wasn't settled that "search engines" as we know them today (like Google) were _the_ way to use the Internet. There were alternatives such as Yahoo (curated directory), and browser-side catalogs (netscape.com home page) that were much more popular. So it is possible also that AV folks weren't exactly thinking "We have to light the afterburners to go after Google in this immensely important internet search space". They might have been thinking "odd, someone spent $xxxM on hardware to run a search engine, that'll never work out".
WRT curated directories: in the late 90s when Google came along (along with others like Northern Light), it was clear that both the directories (Yahoo and Moz in particular) and the established search providers (Yahoo, AltaVista, Metacrawler, etc.) were being overwhelmed.
Yahoo (aka Inktomi) search frequently took multiple pages of results to find anything of quality. Curated directories were missing quite a bit. Furthermore, if I remember correctly there was some sort of payola scandal around Moz/OpenDirectory editors.
The market was primed for a better solution, and PageRank provided it.
Inktomi was pretty good then. In fact they powered both Yahoo and Microsoft web search at the time. When they were acquired by Yahoo, they formed the basis of Yahoo search, that was well respected in the search research community for years until Yahoo started to implode.
Early Google used commodity hardware to the point of being comical. Somewhere there is a "historical" display of commodity motherboards spread out on cork trays in a rack. Maximizing density/minimizing cost.
The ability to scale in this way with commodity hardware was new and important.
When Google published PageRank in 1998, they didn't publish MapReduce, the framework required to coordinate large distributed computations, until 2004. By then, AltaVista was dead.
If AltaVista had thrown resources at the problem using any of the then widely known techniques for scaling, they would have been constantly dead in the water.
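For context, the programming model itself is tiny; a toy word count in the map/reduce shape looks something like this (single process, illustrative only -- the hard part the 2004 paper addressed was running this reliably across thousands of flaky commodity machines):

    from collections import defaultdict

    # map step: emit (key, value) pairs independently per document
    def map_phase(documents):
        for doc in documents:
            for word in doc.split():
                yield word, 1

    # reduce step: combine all values emitted for the same key
    def reduce_phase(pairs):
        counts = defaultdict(int)
        for word, n in pairs:
            counts[word] += n
        return dict(counts)

    print(reduce_phase(map_phase(["the web is big", "the web grew"])))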
What made me switch to Google from AV was the simplicity of the Google home page and the clean results page. AV was getting too cluttered, Google loaded faster. (We were still on dial-up too!)
The funny thing is that I remember AltaVista being much cleaner than Yahoo and the other competitors at the time. I think they only switched to the cluttered page to try keep up with the trends. Then Google came along and rolled it all back to basics. (Of course the better results helped too.)
> It was such an innovative company back in the 60s and 70s, but unlike IBM it wasn't able to reinvent itself, first for the new 80s world of PCs and then later for the internet. A classic case of the innovator's dilemma.
No, this wasn't it. DEC actually had a quite good PC business, which died when everybody lost out to the Asian manufacturers.
DEC also had plenty of technical strengths as well as cash flow to ride it out.
DEC's demise was purely a result of executive-level and board-level malfeasance.
After the board forced the founder Ken Olsen (who wasn't a great CEO but actually did have vision) out, Robert Palmer (who had no vision AND was incompetent) (and not the singer--the singer probably would have been better as CEO) was given marching orders to sell off the company. Which he did--with no vision whatsoever. Lots of people tried to fight against it, but any division which started righting itself immediately got flogged off.
The patent lawsuit allowed DEC to jettison a bunch of the fab to Intel which then made them an attractive target to Compaq.
People love to comment that the merger killed COMPAQ when the reality was the entire US domestic PC industry was completely collapsing.
HOWEVER, to give you an idea as to how badly the "hostile giveaway" was managed, Compaq effectively bought DEC for less than their enterprise service annual revenue--about $2 billion per year. HP later milked this stream for more than a decade.
So, in the middle of a PC industry collapse, no executive could figure out how to convert a $2 billion annual revenue stream for enterprise services plus a whole bunch of leading edge technology into a profitable company.
This shows just how shit-tastic the executive management for both DEC and COMPAQ really were.
But, hey, the DEC board got their stock bump and cashed out.
I read your comment a few times and didn't understand what had happened at all. I figured it all hinged on the term innovator's dilemma, since you used it a few times. So I looked it up.
Wow. Mind blown.
I think we're all familiar with disruption, but Innovator's Dilemma poses a theory or framework as to why it happens. It's rather brilliant and seems to fit every case of market disruption I can think of.
Incumbents serve existing markets and don't care about the small markets served by disruptive startups. They're making small, iterative improvements to their product.
Disruptive tech eventually hits parabolic improvement and by this time the incumbent can't catch up.
To be fair, even Google and early investors in Google, including myself (2005, post-IPO), didn't understand how lucrative and important being the dominant search player would be.
It won because competition was really bad. Altavista was not a good search engine.
When you were looking for something, you had to use multiple tools, like Lycos, Hot Bot, Ask Jeeves and the multitude of the web portals that were available back then.
I remember when I first saw Google: it was an instant decision and I never went back, that's how big the difference in quality was. Unfortunately for us, nobody found a working business model for search in time, and this allowed Google to become too big for any viable competition.
The problem with how the market works is that once a monopoly has been established, it's very hard for a competitor to challenge it, even with a better idea.
Large players have lower costs due to large vertical integration and bulk discounts.
Even if I were to invent a new search algorithm that would be superior to Google's in terms of satisfaction with most searchers, I would be unable to get a wedge in.
This is not so big a problem with, say, a power company, but with a search engine, it becomes the front page of the internet through which everyone goes — the end result is that Google commands a great deal of political influence simply through how it decides its algorithms should work and which pages to prioritize.
Courts ordered Microsoft in the past to provide Windows users with notice of other web browsers, which was the primary way Internet Explorer lost its dominance — perhaps it is time to order Google to provide users notice of other search engines and web browsers too as a matter of antitrust.
Thanks for posting this. I love when HN brings together the principal actors and primary sources. OT: I wish we could use "webmaster" more often, even though that's a much murkier proposition in the days of product managers deploying the new site via CDeployment.
I was in Nassau working between FSG/PM and the ACA/CORBA team. I recall many trips to the WRL; they always had the coolest stuff: network tunnels and Firefly. Back then, who you hated said a great deal, and I recall several other CEs playing the silver bullet game to perfection.
I'm curious about the relation to corporate blockchain?
>> I'm curious about the relation to corporate blockchain?
I think it's the idea of companies seeing new tech and trying to leverage it somehow in their own businesses and failing - badly.
Here's my own anecdotal evidence with blockchain:
I work in a very large health care company. For about four months, there was a huge buzz around the company about how to leverage block chain in health care. The main idea was using block chain to manage patient accounts and PPE.
We had all the execs brought in to do huge presentations. They brought in people from IBM to talk about Hyperledger and other block chain companies. They posted videos and articles about how this was going to transform healthcare; we were all told this was going to be huge. They told people they were going to form a new team, hire developers, and that this was going to be a huge focus in 2020.
Six months later? You couldn't find a single resource on any of it on any of the internal company sites. All the presentations stopped, the execs stopped talking about block chain seemingly overnight, and it was like poof! The idea of block chain, or any mention of it or the "revolution" that was supposed to follow? Completely disappeared into the abyss, never to be heard from again.
I have no idea how much they sank into the notion that block chain could be used for health care, or how many people they hired or the contracts they signed with IBM, but I can only assume they lost a lot of money before they finally realized it wasn't going to work out.
As I understand it, the main reason to use blockchain is having multiple parties interact without any party being trusted, so if you do have some central trusted authority it's pointless. There's also the requirement that the network has enough participants that no single party can gain a majority of the network's compute power, which seems a very iffy assumption with private/permissioned blockchains.
So I don't see how blockchain is useful in a healthcare context. Please correct me if I'm wrong and enlighten me if I'm missing something.
> They should have merged with a PC maker obviously.
DEC was acquired by Compaq [1]:
> In 1998, Compaq acquired Digital Equipment Corporation for a then-industry record of US$9 billion. The merger made Compaq, at the time, the world's second largest computer maker in the world in terms of revenue behind IBM.
I took it to be sarcastic. It was a spectacular fail as within about a year of the acquisition everything good they were doing had completely disappeared from view never to be seen again. That's not to say that DEC could have done any better on their own: with the absolute market dominance of Wintel at the time, they were flaming out as so many other mini and workstation companies did as their markets evaporated.
It's hard to explain just how good Google was back in the day, compared to those early search engines.
I distinctly remember someone suggesting it to me at our CS lab in college. A few of us had never heard of it, and we all started to do some searches to try it out. There was silence for about 5 minutes, and then someone said "this is really good."
Say what you want about what Google has turned into, but it was an incredibly important tool that came around at the right time.
Also, it makes me really happy that they've kept the "I'm Feeling Lucky" button around.
The keyword (no pun intended) being "was" --- I remember going through many pages of relevant results and finding the "needle in a haystack" many times with Google, but now it seems like anything even vaguely obscure gets absolute rubbish results (and it claims there are no more results if you try to go deeper, despite the fact that I know the pages are out there). I'd say it started to decline roughly 10 years ago.
Google has definitely just plain delisted a lot of older sites, even if I do exact quote matches I can’t get results for sites I know exist and used to be indexed and often turn up on other search engines.
On top of that, as you say, anything that isn’t from a huge web property or corporate site is so heavily penalized you’re lucky to ever see it even if it is new.
The utility of google as a web search engine is definitely declining over time.
This is possibly a self-destructive pattern, too. Once nobody can find anything outside of Quora/Wikipedia/Pinterest/Stackoverflow etc. why would anyone bother creating small sites/blogs (this has been happening already for a decade) — But then, once the entire accessible web is just these handful of big sites, what is Google’s utility? They all have internal search and are increasingly accessed primarily through apps
Like grocery stores or many types of restaurants, for example. They usually can't compete against the Walmarts and Paneras, which suck up consumer mindshare. Similar with soft drinks. Coke and Pepsi build as much mindshare as possible. You may invent a great product, but it likely will never get noticed among corporate noise.
Google, Wikipedia, Facebook are becoming the default go-to spots for particular activities, similar to what McDonald's did to fast dining in the 60s and 70s. I guess, in general, markets saturate, oligopolize, and the small spots fade away unless they're niche or very high quality.
It's even delisting a lot of content from existing sites. For example, I've got previous HN comments that I used to reference using `akiselev site:ycombinator.com [some key words]` that are nowhere to be found now. I started noticing this a few years ago.
It was actually how I discovered Google. I was a grad student at the time (English literature) and dreamed of accessing all the primary and secondary sources on a particular topic from the convenience of my own keyboard (still a pipe dream today). I cycled through every search engine I could find looking for one that would actually return results relevant to my search terms. I remember AltaVista being touted as one of the best but still failing to satisfy.
As soon as I read this New Yorker article, I checked out google.com and had that same eureka experience: finally, this is the one.
In grad school, in the early days of the dotcom boom, some of us CS-ish people curious about startups would talk about them. One day, one of the PhD students asked everyone which dotcom's stock would they most like to have. I said Google. (It might not have been a company yet, but we kinda assumed.)
Google worked surprisingly well, obviously they were going to be hugely important (unless another player appeared and did a big next leap), and obviously they were smart. (On the side, I also got warm-fuzzies that the search hits I got seemed to have a strong Linux bias.)
The grad student who asked, coincidentally, ended up deciding not to finish his PhD, and went to Google. :)
I remember being blown away by how good google was around those times.
But I also remember just how blown away I was when I first saw AltaVista do its thing. AltaVista was equally amazing in 1995 as Google was in 2000. Innovation goes in waves.
I donno, for me early google was horrible. I always went back to AV because it gave me the results I was looking for. But over time AV just got slower and bloated and Google just got better.
Glad to see someone else had the same experience. When someone recommended Google to me, I tried it and immediately went back to AltaVista. The only distinctive characteristic I remember noticing was the immense whitespace on the Google home page. That sort of simplicity in page design was different from other search and portal websites at the time.
Using AltaVista for me meant digging through page after page of results. Comprehensiveness was the main concern. It was up to the user to evaluate the relevance of the results.
Early Google made similar claims about comprehensively searching millions of pages, but as we know today, they are intent on inferring meaning and purpose. They actively discourage and prevent users from combing through page after page of results. User evaluation (i.e., intelligence) is not expected. Google attempts to evaluate results for the user based on popularity, originally estimated primarily by counting backlinks. Popularity as a filter is useful sometimes but deeply flawed at others. It's arguable Google has dulled, atrophied or stunted development of web users' analytical skills. When it first appeared on the web, Google had no paid placements and no advertising. What was not to like? They later abandoned their original mission to avoid the influence of paid placement. They became beholden to advertising.
It was not difficult to see when and where the influence of advertising came into AltaVista. However when this started to happen at Google, Google tried to hide the ads by making them text-only. As if the influence was not there.
We need another AltaVista, where user evaluation of results is allowed and encouraged, with a mission statement like the original Page and Brin paper announcing Google: no influence by advertising. Ultimately PageRank was dependent on human discretion: the decision whether or not to link to another page. We soon learned that this discretion, this choice to link or not to link, easily becomes driven by money when people know it affects PageRank. Google quickly got gamed and it has been trying to pretend it can manage this ever since.
I also had this experience. However, I now feel all search engines are returning worse results. Very rarely can I find anything meaningful outside of major content providers. Everywhere my results are filtered through a local government lens. In many ways it feels like search on Facebook, YouTube, Twitter, Instagram, LinkedIn, WeChat, closed appstores, etc. have become the new de facto content aggregation mechanisms, and those are tending toward feeds and curated content rather than requiring active search. The best and most meaningful results on most actual search engine results pages are Wikipedia excerpts or trivial widgets (currency or unit conversion results, weather forecasts, etc.). I don't have any objective measures and am unlikely to be a representative sample but I feel I am searching less, my browser has become a URL-bar based memory system, the traditional 'home page' notion has been replaced by a Firefox auto-curated most-frequent-sites list, and I use traditional search engines more like a command line than a library index. OTOH we now have Library Genesis which is brilliant.
Irrespective of its legal status, LG is proof one does not need 75,000 employees, nor any perceived "brilliance", to provide a world-class, non-commercial information retrieval service. It can hold its own next to many university libraries' remote access facilities and it puts Google Scholar to shame. It is command-line friendly and offers bulk data. Unlike Google, LG does not try to guess what the user is searching for and answer questions. One encourages traditional learning; the other is incessantly trying to surveil its users in the name of selling online ad services to third parties.
IIRC, the draw was "you don't have to use plus signs" and the other options, and you didn't...until Google started trying to read between the lines of our searches and we started wishing Google had plus signs.
Google's aversion to zero-result queries is a problem, and possibly the problem, with their "accuracy," so to speak.
I remember the first time I had heard of Google -- it was ALREADY used as a verb!
I was interning at Xerox PARC (adjacent to Stanford campus) in summer 2000. As far as I remember, I used Alta Vista at the time.
Somebody asked a question, and one of the researchers/intern mentors said to "Google it"! I think there were some puzzled looks, but we tried it, and I started using Google and never went back.
When I returned to college in the fall, I remember my former housemate saying how good Google was too. She had started using it too. Once people started using it, they never stopped!
Agreed. I was a heavy user of Info Seek at the time (I especially liked how they highlighted search terms in the results) but it was hard to deny how much better Google results were. For a long time I just used Info Seek first and would try Google second, but after a while I realized I was just using Google. The results were impressive.
I used to go to the library to find sample CS books and read them for about 1-2 weeks before I actually got to code. After a classmate shared Google with us, the quality of results was just unbelievable, and it was so easy for us to start finding sample code and literature. Yes, probably the best tool in modern human history.
The main thing I remember about Google was that it was fast, really fast compared to AltaVista, which was becoming so slow as to be nearly unusable (sometimes you’d wait 30 seconds for a page load).
Weirdly similar experience. I too discovered Google in my college CS lab, when I noticed the (then) weird looking homepage. I had been swearing by AltaVista at the time. But it was with Google that I was able to, for the first time, get what I wanted without having to go through multiple pages of search results. A search engine giving you what you want within the first few results of your very first search attempt was just not a thing. Crafting search queries was an art, and so was wading through results.
On the other hand, search engines of those times gave you what you asked for, not what it decided was best for you. That option is gone now, and searching for many niche topics is no longer possible. If that's what you're looking for, it might as well be 1990.
I wonder if the "I'm feeling lucky" has stayed around because it's a good training heuristic for Google Home (and maybe other services?) even if very few people actually use it. I'm genuinely curious why they've kept it around.
It's also a pretty ideal feature to keep around even if it is mostly unused.
1) It isn't very complicated -- it doesn't require a lot of code that is going to rot.
2) When it is used, it's cheaper rather than more expensive.
3) It only "clutters" a UI that isn't actually seen that often nowadays -- you have to actually go to google.com, rather than just typing in the browser bar -- and isn't particularly cluttered.
However, if it makes the experience worse then it’s not worth keeping. I’m sure they did some study where they found that most users weren’t happy and had to go back.
What drew me to AltaVista was that it gave me results that didn't exist elsewhere. It took some boolean gymnastics to get precise results, or going a few pages deep, but stuff, if it existed, was findable.
I almost wish there were less precise and more obscure options out there today. A search engine that purposefully didn't index mainstream news, social media, nor shopping/product sites.
I wish there were still a search engine that followed my query strings and operators to the letter instead of attempting to be extra clever. Even Bing seems to be doing that no longer.
This would be my main complaint about DDG: when there are no results for quoted search terms (but not when it's a single quoted term), it will just ignore the quotes and give results that most of the time are completely not what I need, without even saying it is doing that. Often it takes me a while to realize I made a typo and it's now giving me things I cannot use. IIRC it was not always like that, though.
... until you want to search for something verbatim with quotes in it. Google doesn't seem to respect escaped quotes like "hello \"this\" is a test" which is quite surprising to me considering how they invent programming languages and all.
Is there an whole-Web search engine that isn’t a fulltext search engine, but rather works more like grepping a virtual text file that contains the entire Internet, such that one could match on punctuation and so forth?
You think it's practical to run the query on the index, materialize the full result, re-index it, run the full query string, and rank? This suggests you do not understand the scale of the problem.
You don't need to re-match or re-index on the full set of results to display just the first page.
I fully understand the scale of the problem. I also think this is a very approachable problem for a thousand Google engineers who have done a lot harder things.
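Roughly what I have in mind, as a toy sketch (all names and data here are made up, and a real system would need far more than this): use the normal keyword index to get ranked candidates, then run the expensive verbatim/regex check only until the first page of results fills:

    import re

    # toy stand-in for candidates a cheap keyword lookup already narrowed down
    candidates = [
        ("page1", "error: foo() failed at line 3"),
        ("page2", "foo failed, no parentheses here"),
    ]

    def first_page_verbatim(candidates, pattern, page_size=10):
        # the expensive exact match runs only until one page of hits fills,
        # not over the whole result set
        hits = []
        for url, text in candidates:
            if re.search(pattern, text):
                hits.append(url)
                if len(hits) == page_size:
                    break
        return hits

    print(first_page_verbatim(candidates, re.escape("foo()")))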
So much this...
I commented earlier how I miss AltaVista's search language. Stuff like the 'near' operator was awesome. Want all of John Doe's papers? "john" near "doe" got you "John Doe", "Doe, John", "John A. Doe", etc.
oh god yes, that would be nice. I know what i typed stupid search engine, if i put it in quotes...i only want that exact phrase. Don't just cast that aside and search for the words individually.
Yeah, I wish that there was an option in Google (or anything else) to rank my results based on criteria other than "many people near you have searched for the same thing". Sure, this is helpful for current events, but for other things I've been finding that my results are garbage blogs written by an AI, and I'm needing to append "reddit" to the end of my query to get a real answer.
Hmm, read this after I posted my other comment :) So, is that a setting? Does a search for "nonexistin gcrap" return no results for you then? It does for me..
I played with it a bit; it may be limited to one word. So "nonexstin" does come back with sites with that literal word in it. But with 2 words, it does the "what I think you mean" thing. TBH, I have only tried it with one word at a time for my use of "the site must have at least this exact word" type use cases.
I think this can be generalised as "more power to users." You might want to filter Pinterest. OP wants to avoid mainstream news, social media, and shopping/product sites. This requires users to be involved.
Filters, refinement, user preferences, etc. Search is a powerful tool, and the power has been growing exponentially since 1995. But almost all of that power is channeled into instant results, intuitiveness, "sane defaults" and such. All the capabilities are in the background, and the assumed user is the lowest common denominator on all fronts: low effort, low technical capability, low understanding, etc. This is most people (me included) most of the time, but not all of the time. Maybe I want to filter SEO spam (like Pinterest) more aggressively. Etc.
This isn't a slam on Google really. One thing can't do everything, and the thing Google search does is what most people need most of the time. It's definitely the thing that makes the most money. But... none of a search engine's power has been channeled into making search a power tool. It's a tool that everyone uses many times per day, but there is no learning curve. No getting better. You can't really invest effort and get rewarded for that effort.
Pity really, all this power is there. It's right under the surface. Let us be more than the lazy, dumb user sometimes.
Google have reversed the User Interface paradigm. Instead of users learning the software and telling it what to do, the software learns the user. I don't necessarily mean NNs or personalisation. I mean that the paradigm is software-centric. "If the user does X, how should the software respond?" instead of "If the user wants Y, how does she make the software do it?" That's great for intuitiveness, but it also creates a frame where the software gets better over time but the user never does.
Yeah, I also want filters, refinement, user preferences. etc.
And I want it to be social - I want to search using the same filters that Tim Bray uses.
Or I want to use the filters that the people use who in aggregate tend to click through to the same results pages that I visit.
So much amazing stuff could be done, but here we are controlled by algorithms that treat us all as brain dead sheep who only want to know about what the Kardashians are up to.
I suspect that Google knows about these desires, but would prefer the world where everyone is a sheep.
> I think this can be generalised as "more power to users."
I agree, we need more power to control. There should be more than one recipe for ranking and filtering, and we should have more say about what information is being hidden from us. Society is diverse; a single formula automatically discriminates against some groups. Let people make the choice.
> Google have reversed the User Interface paradigm. Instead of users learning the software and telling it what to do, the software learns the user. I don't necessarily mean NNs or personalisation. I mean that the paradigm is software-centric. "If the user does X, how should the software respond?" instead of "If the user wants Y, how does she make the software do it?" That's great for intuitiveness, but it also creates a frame where the software gets better over time but the user never does.
May I quote you on this?
It is a great description of the 3rd premise in a talk proposal I submitted to LibrePlanet 2021, which goes something like:
- Software freedom is predicated on the notion that users ought to be in control of what runs on their computers.
- I claim that tech literacy is necessary to exercise this control.
- "Google Design" presumes that "nontechnical users" are not and will never be capable of tech literacy.
If all of these are true, then "Google Design" is incompatible with the idea that Software Freedom is for everyone. And indeed, we see this borne out in many people's disdain for copyleft licenses, which impose restrictions on developers in order to guarantee freedoms for end users. That's a logical viewpoint, if you believe that end users wouldn't benefit from software freedom, anyway.
"Google Design" does make software more accessible to end users… at the cost of turning your computer into an appliance. Which makes it a self-fulfilling prophecy. People don't magically become tech literate; they intentionally learn the skills so they can achieve something. Remove the power from the tools, and you take away their motivation to become tech literate. I have a great story about my grandma to illustrate this point.
If our goal is to make computing (and, by extension, software freedom) more accessible to everyone, then we need to change our approach. We need to prioritize education and teaching tech literacy. And most of all, we need to write software that is powerful enough that users will see the benefit in learning to use it.
Those are the key points from the first half of the talk. The second half is about how we can create software that is powerful without sacrificing usability. It basically boils down to "make simple things easy and hard things possible" — creating software that Just Works™ without configuration, but allows you to customize its behavior if/when you want to. In particular, I'm a fan of embedded scripting interfaces (e.g. spreadsheets), which allow a vast amount of customization without creating the "settings overload" that is typical of many FLO[1] programs.
I'm with you on this. I would pay far less, $1/month would be the point where I would begin considering it. There are just too many "only $5" services and they would add up quickly if I were to allow myself that amount of money.
The problem with this model, though, is support. A service with 100 million untrained end users each paying a dollar will need to provide quality customer service to 100 million untrained end users. It is much easier to sell ads to 10,000 "affiliates" who feel that they have some special arrangement and thus accept some responsibility of their own, and make $10,000 off each of those.
I agree there are too many $5 services, but a search engine is something that I use every waking hour of my day. I get so much use from it that $5 per month is a bargain.
I would definitely pay Google just to be able to remove some sites permanently from my results. They're just spam to me, even if Google thinks they are legitimate.
Yes, exactly right. If people are paying any amount at all they expect it to work and work perfectly. I'm not saying that's unreasonable, but it does quickly become unfeasible at scale with users who have no training at all.
Then on top of that the skimming the payment processors do makes a large percentage of that $1/month go to transaction fees. Charging yearly helps with that but many people won't want to do that.
Additionally, for those who like to add reddit to the search query, I'd also like to recommend " inurl:forum OR inurl:board " (note also that forum and forums give different results for some reason).
Personally, I like to do broad searches on Google to discover new stuff, but for the past 7 or 8 years, Google often returns ~100 results instead of the usual 1,000,000+. And, contrary to logic, adding various inurl: commands actually gives you more results instead of limiting them, and thus significantly improves searching capabilities.
Google with filters for a dollar a month would be awesome, like a function to "block this domain forever"... it would clean my search results of the annoying offenders. And I would gather user-submitted blocked-domain data to produce better search results for non-subscribing users...
The really annoying part about Pinterest is that I never seem to be able to get back to the original directions/plans/whatever. It's like "Hey, look at this cool stuff I won't tell you how to make, even though I swiped it from a HOWTO site".
This is really the precedent that mattered, I think. Remember one line of a poem? You can probably find the rest. Recall a Douglas Adams piece about rice farming in Bali or Java or something... you can find it. As you say, it took boolean gymnastics. But, that really just means modifying your search terms and scrolling through some results.
Being less powerful than 2020 google kind of put more power/responsibility in users' hands. A user needed to use a search engine like a tool.
I want what you want too... Everything google (mail/search/youtube, etc)... but designed for more user effort. Finding the best-for-most result instantly is great, but sometimes you want the tool to help you find a result on page 112 within a few minutes.
Bring back a little bit of a directory feel even. Let me narrow down, refine and shape results gradually. Assume that I will rummage through a bunch of crap to find what I want.
Same, at one point the size of AltaVista's index was vastly above all the competition and was the best place to go for finding obscure stuff. With the boolean filters it was possible to exhaustively search the web.
Unfortunately that didn't scale with the size of the web; at one point the amount of pages got large enough that doing a smart ranking like Google did became more effective than the "look at all pages of results" usage required for AltaVista. It felt like a monumental transition, like when a program gets too big for main memory and causes the computer to start swapping to disk.
Yes... Enjoyed AltaVista and FAST (AllTheWeb)[1] for precision searching.
That reminded me of Architext, and EWS (Excite for Web Servers)[2], but a good bit of Googling (ahem) later, it seems difficult to find much about most of the mid-to-late-90s standalone local search engines any more, except perhaps Inktomi. Most are not mentioned in Wikipedia's search engine timeline[3].
Excite for Web Servers makes it easy for you to add searching -- Excite, Inc.'s advanced concept-based searching -- to your Web site.
Excite for Web Servers provides a simple Web-browser interface for doing all the things necessary to enable concept-based searching of collections of documents -- administering, indexing, and searching over the collections.
In particular, one can:
- define a document collection -- that is, specify a set of documents to be considered a single collection over which one can search,
- design customized pages for displaying to users who wish to search over that collection,
- index that collection, monitoring the progress, and
- search the collection.
With Excite for Web Servers, it's easy to set up concept-based-searchable Web sites in minutes.
The head librarian at my high school was an older lady in her early 60s, and she suggested that I use Google because it was the best search engine. Back then I used Webcrawler, but there wasn't too much distance between the competition.
I thought it was weird that "googol" was spelled incorrectly and that Google's logo was ugly even by Paint Shop Pro 4 standards. It looked like search for kids. I assumed the librarian didn't know anything about computers and dismissed her advice. Within a few months everyone was using Google.
Pro tip: Most librarians have master's degrees and their field is all about information storage/organization/retrieval. It takes very little time to teach a kid how to put books on a shelf. The reason librarians are at libraries is to manage the collection of knowledge and to help people find information. So I'd give a lot of weight to a librarian's opinion on a search engine. The hardest part is finding out whether or not someone working at a library is a "real" librarian without being insulting.
Probably different in different places, but in many libraries I've been to, you often interface with student part time workers or the stereotypical ancient person who still lives in a previous age and is bitter about people being impolite.
But I'm always reading positive stuff about American public libraries that they are not really just about borrowing books, but free internet, photocopying, showers, some kind of free social program to help poor people with any information related stuff, like job search or government forms.
It really was boring when the floor, the desk, the cubicle, the walls, and the computer were all the same shade of aging beige. The sterile asylum days of technology industrial design.
OT, but... that article has the most misleading graph I have ever seen. It plots two items with the same units, but different origins and scales. It makes it look like we had a huge drop in active leisure time and it was replaced hour-for-hour by screen time (and that screen time was near zero in 2000). Instead it's a modest decline in the former and a slightly larger increase in the latter.
When will somebody replace Google (and Bing) with one that actually works again? Is the cost of entry that high? Is there no business model to permit one to operate at some level of profit?
I'm sure most here have been frustrated by the difficulty of getting "good" results on searches, even with modifiers. But what most troubles me is Google's memory/history has grown smaller and smaller, as if it has Alzheimers - searches that used to return results now bring back none.
The web corpus is huge, which leads to follow-on problems: it's expensive to fetch it, and to process it, and to host the resulting indexes. Fetching is tricky also because you're likely to get blocked from sites if you're too aggressive.
To justify all that expense, you need a lot of users, but it will be hard to get those users because there is 20 years of 'search = google' to compete with. Yahoo search user testing from 10+ years ago found that users would prefer search results displayed with Google branding over search results with Yahoo branding, regardless of the search results. Maybe it wouldn't be so bad if it's Google vs a new hip name, but you have to somehow cultivate that hipness. Bing doesn't have it, Amazon tried doing websearch and quit pretty quick (but maybe it's used for Alexa?).
You'd realistically need to build out an advertising platform too. Using Google's ad platform while trying to compete with their core market seems like a bad idea. Using Microsoft's ad platform is probably not going to be a good experience, but maybe you can start with it.
Which would be easy enough to store in 2020, but then you need to preprocess it in a way that is amenable to both search and result ranking. But let's say your indexing is super good and matches the compressed version: 80TiB. Throw that in EBS, and you're paying $6k/month just to store it. You also need CPU and memory to actually compute from that data, though! If we instead use i3.metal instances, you're looking at about $2700/month each, and you'll need 15 of them for 3x replication. $40k per month isn't bad, if you're a startup with VC funding. But... we also need network egress... All this just to be literally the common denominator search engine with zero users.
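Back-of-the-envelope for those numbers (the unit prices are my assumptions based on 2020-ish AWS list pricing, so treat this as order-of-magnitude only):

    index_gib = 80 * 1024            # ~80 TiB of compressed index
    ebs_monthly = index_gib * 0.08   # assume ~$0.08/GiB-month for EBS -> ~$6.5k/month
    compute_monthly = 15 * 2700      # 15 i3.metal-class boxes for 3x replication -> ~$40k/month
    print(round(ebs_monthly), compute_monthly)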
So, how do your users get to you? In 2020, you have three major sources: browser searches, phone searches and direct traffic. If you want to be in the browsers, you're going to have to pay, and if you want to be default you must pay more than the incumbents who have their business model figured out. And bid at scales of roughly your Series B and C combined. Phone OSes, same deal: you need to be prepared to bid high, and in volume. Direct traffic is basically word of mouth / marketing driven, and for our common crawl search, we can assume is relatively nil. So even search traffic has an acquisition cost, and almost all of the sources run their own search engine that you would need to bid against.
So this point you need to start thinking about revenue, because every query you get literally costs you money. We know that search engine ads work okay, since the user is clearly expressing intent. But those different users have very different values -- someone searching from an iPhone or MacBook Pro is likely more valuable to advertisers than a 10 year old Linux laptop running Firefox with Adblock and a Pi-Hole DNS server. And without traffic nobody's going to bother running campaigns on your platform.
Alternative revenue strategies seem unlikely to work -- Google is free and Bing literally pays users, so subscription seems unworkable. You could try to find a niche, the way DDG has, and perhaps chisel away at market share slowly, but you'd need some content indexed that is unavailable to competitors, and that will come at a price.
I only skimmed the article but had fond memories of how revolutionary AltaVista was when it launched. Suddenly you could browse the web without following links. This started a trend where you provided less and less value by stuffing links on your homepage (yes, this was a thing) as people didn't need the links anymore.
Some years later I remember AltaVista suddenly became full of paid links and ads, to the point of unusability. This is when Google came in, with no ads, no paid links, and actual good search results.
The irony.. now Google fills at least half the first search page with paid for links and unusable results.
Unfortunately, nobody (in the long run) gives away something completely for free. I would pay $1-2/month for a search portal without paid links and no sell-off of my private info.
The estimated average Google search ad cost-per-click (CPC) is $2.69 [0].
Working on the conservative assumptions that an average person will run 10 searches a day and click on 1 search ad a day, Google will make $83.39 a month from the average user. It is likely much more though.
Their global ad revenue is a public figure right? It should be easy to estimate an average. If they have a billion users, do they actually have $100B in monthly revenue?
I highly suspect the distribution is not flat in any way though (some users are much much more valuable than others and it's why advertisers pay FB and Google to target the ads for them).
Bear in mind that there is a vast amount of clicking on "wasted" adverts, i.e. paid adverts which would have been the top result of an organic search anyway, e.g. eBay spending $20M a year on ads targeting the keyword "eBay", which they thought was great because it appeared to give a $245.6m return on investment, until they switched it off for a bit and realised they got pretty much the same traffic and conversion rate without the massive advertising spend[0].
I never click on their ad links so it would be a pure bonus for them if I pay :) But yeah I don't know... sometimes the supply/demand curves never meet.
Still think the business opportunity for a really good paid-for search engine is there. Of course it's not trivial to make a search engine but as I feel Google's usability is in a falling trend, the bar is getting lower..
I remember having AltaVista as my primary bookmark, until they accelerated the ads and monetization and rendered it almost completely useless. It was so obvious they wanted to make quick, obscene money off of it. They cared very little about the negative impact that had on user experience. That's why AltaVista went the way of the dodo. Period.
According to the article, AltaVista switched owners 3 times during the last 3 years of its life. I assume one of these had the bright idea to fill the results with crappy paid-for links. Google has these geniuses to thank for priming the playing field for them by destroying themselves.
I recall there was a low-bandwidth, "text only" version you could use for some years afterwards which was just the search box. Probably suitable for Lynx users. Then they added one banner.
I don't think the UI switchover was uniquely AltaVista-- remember this was the time when everyone wanted to be a portal and had to have a section with sports scores, repackaged news, and stock tracking.
Yahoo was the last man standing on that path, but I seem to recall a lot of hay being made about the Excite/@Home stuff where the ISPs were supposed to push their portal on unsuspecting customers.
I very distinctly remember when AltaVista removed support for "quoted words and phrases" in queries. I was studying CS in college at the time. I think that's when we switched en masse to Google.
I still miss AltaVista's query language. So much better at narrowing stuff down to _exactly_ what you wanted than Google is, even today. Between Google's "let me guess what you want and ignore search terms" and their paid placements, page 1 of the SERP is useless for technical work. Doubly so if you're researching something obscure...
I still remember when I was first introduced to google in the 5th grade. Our computer class taught us about search engines and how to find information online. They showed us lycos, altavista, dogpile, ask jeeves, etc. and everyone in the class had their favorite site they would use when working on projects.
Within a day of being shown Google, every kid in the class used Google exclusively. It was so much better than its competition at the time.
I loved AltaVista. The thinking had to be done by the user, but if you understood the process, it was a precise and accurate tool. Before it got sold around, of course.
I'd construct searches along the lines of:
(Word OR Word) AND (Word NEAR Word)
And get great results. Of course, the Web is way too big and JavaScript-y for that now.
Being a predecessor is not a failure. Au contraire.
One of Dawkins' memorable lines is "Descendents are common. Ancestors are exceptionally rare"
You could say crocodilians succeeded and dinosaurs failed. A croc is still a croc, but the dinosaurs are hummingbirds and seagulls. If you think about it though, both are ancestors... an exceptional success.
> A croc is still a croc, but the dinosaurs are hummingbirds and seagulls.
Slightly off-topic, but I went to Wikipedia to remind myself of the specific category of dinosaur that birds are descended from, and the 'Today's featured article' was about Achelousaurus, a ceratopsid dinosaur! I think this is the first time I've had such a close match to the thing I was interested in. From there, it was just three clicks to the article I needed [0], which, incidentally, stated that "The present scientific consensus is that birds are a group of maniraptoran theropod dinosaurs that originated during the Mesozoic Era".
I was a loyal mamma.com user until Google came along. I always liked the quality of results and the UI. They were swallowed up by Copernic who appear to still exist in the desktop search space.
I owe most of my career to AltaVista, after comp.lang.*, comp.unix.*, and comp.databases.* -- even more so because AltaVista had indexed Internet newsgroups.
Am I the only one who thinks this article is very poorly written? It doesn't really explain the strategic reasons why AltaVista fell into oblivion, and its structure is all over the place.
I also love Paul Graham's framework for imagining the future and working backwards. If we think like that, Google is nowhere near the form of a final solution to information retrieval. An ideal state would be to retrieve the correct information the first time, with everything you need bundled into the page. If that problem is solved, then you have to tackle the question of why the user was asking the query in the first place, and how your product can help solve the underlying problem so that the query never needs to be repeated!
I ran a build cluster in the server room in the basement where Altavista used to be located. The server room was actually pretty small - just a few rows of racks. We still had a sign in our office that said "Altavista Operations". It's pretty mindblowing just thinking how small internet-scale things were back then compared to now.
I think the first search engine I used was Magellan. Until AltaVista happened. Loved AltaVista, until Hotbot/Inktomi, which was the only one I had trouble letting go of, I think only because of their really clean and minimal UI. Even when Google came along I was one of the few that had a finite number of web sites I used to use that made Google virtually useless to me. The only search engine I use now is DDG.
Marginalising Google back then was as insane as buying into so many of the tech fads since: IoT, Bitcoin, XP/agile, netbooks, 3D TVs (remember those?), and so on.
There is an upside to not having used anything Google - to this day I have zero reliance on any single product of theirs.
1) Google was the only search engine at that time that understood the power of search. Yahoo pushed its search box below the fold. Lycos and AV tried to bring more "content" to the search page.
2) It was blazingly fast.
3) It allowed you to test your current search against competing search engines. After doing that a few times, you realized Google was by far the best and stopped bothering.
In the time of dial-up, having a fast-loading search page probably accelerated the transition. Users factor page-load speed into their preference, even if they are not aware of it. Old-school search engine home pages were criminally crowded. Google provided a frictionless experience.
After the dot-com crash, many of the spiders just stopped indexing new content. The AltaVista index became way out of date. If you weren't already in a Yahoo category, you weren't getting in. DMOZ didn't even have the editors to appoint any new editors... It was less a failure than a mass giving up, as if everyone had left but forgotten to turn the servers off on the way out.
I miss the NEAR keyword. I gather that Google has AROUND(n), or did at some point (one never knows with them), but damn, NEAR was pretty helpful when you had a nicely constructed Boolean that wasn't quite getting you what you needed.
Ahh, that day when I was introduced to search engines & the internet -- they took us kids to the library PCs. The librarian instructed us to open AltaVista and said:
If you want to find Mr. Bean and will search for "Bean", you will find... beans. Type "Mr. Bean".
> As of 1998, it used 20 multi-processor machines using DEC's 64-bit Alpha processor. Together, the back-end machines had 130 GB of RAM and 500 GB of hard disk drive space,
I'm typing this on a machine with 20 threads, 64GB of RAM, and a hair over 12TB of disk.
I remember AltaVista even had a free dialup service. IIRC, it was like NetZero, in that it displayed ads in a window that stayed on top. Of course, I would just find the window handle and set it to invisible.
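For the curious, the trick described above is basically a one-liner against the Win32 API. A hypothetical sketch in Python via ctypes, Windows-only of course; the window title here is a guess, and the ad window's real class/title would have had to be hunted down with something like Spy++:

    import ctypes  # Windows-only; calls the Win32 user32 API

    user32 = ctypes.windll.user32
    SW_HIDE = 0

    # "AltaVista Free Access" is an assumed window title, purely illustrative.
    hwnd = user32.FindWindowW(None, "AltaVista Free Access")
    if hwnd:
        user32.ShowWindow(hwnd, SW_HIDE)  # hide the always-on-top ad window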
Results were crap because pages started doing SEO by stuffing white keywords on a white background. If it were relaunched today, it might be effective. But the whole game is staying on top of this.
DDG provides results almost on par with Google's, yet it feels much less effective. I think, for the single-box, single-keyword-search market, Google is the best we can do, but there might still be room for other search engines.
Also, if you need historical, political or medical information, those are three domains where Google is already out of the game.
There is a lot of room today for a search engine which would only return technical results and not politically or racially motivated results like Google does (Google had a project to promote races other than Whites, and thus started not returning some results depending on the race of the scientist).
Is there a search engine which displays 4-6 thumbnails of the top sites? I feel Google results are just giant walls of text from which you randomly pick one. If you could actually see the relevant text on the website, you wouldn't even have to click?
Interesting, didn't know this was just a tech showcase. And it's always puzzling how management fails to monetize something new and successful. It seems most are only able to copy competitors.
Unrelated, but does anybody remember a site called astalavista.com? It had a lot of script kiddie tools, among them sub7, msn messenger flooders etc. I miss the early internet.
I remember paying for a membership for their "Premium Security Portal" back in 1999/2000. Got in a lot of trouble at school because of the things I learnt on that site. Good times :)
Google Search today is repeating the same mistakes I remember from the late 90s.
For the first decade of the web, there were a handful of search engines competing, rising and falling in popularity. The best were AltaVista and FAST.
One thing that was noticeable back then was that bad search engines (and search engines that 'jumped the shark' and became bad) generally did so in similar ways:
a) they included paid results, or devoted too much real-estate to advertising
b) when they failed to find results, they tried to trick the user by showing related results (eg: omitting or substituting terms)
c) they avoided 'logical and' for search terms, in favor of 'logical or', making it difficult for users to search with precision.
The people at Google surely believe their recent changes have nothing to do with all that. Far as I'm concerned, aside from the extra millions of dollars they've spent on AI research, it's the same old story. Nobody needs a somewhat smarter version of AskJeeves.
Google is a victim of its own success and of the increasing global accessibility of networked communications. If 90% of the American search market were split between 5 companies of roughly equal popularity, the ROI of gaming ranking to trick any one of those implementations would be much lower. There would be fewer people who could make a living by faking relevance/quality signals for junk.
Right now the best paying job many people with unspecialized skills can get is "tricking people into clicking things they shouldn't." Google is sorely taxed trying to keep up with the antics of a million people whose career is trying to game Google. Early Google was better because the people really desperate for money couldn't even afford to get online. That was a glaring inequity that doubled as a crude spam filter [1]. I think about this every time a real live person telephones me on behalf of "Windows Support."
[1] This is a large part of what I miss when I'm pining for the early Web. Practically everyone publishing online then had to be either more affluent than average or cleverer than average to get into Club Web. People contributing on the early Web were almost all financially situated independent of what they were contributing, so Web participation was almost all done out of passion rather than financial desperation. Authors didn't worry about how to get paid for what they wrote online and readers didn't worry about how to support their favorite sites either. People in the club were understood to have other means of sustenance. If you didn't, you wouldn't be in the club in the first place!
First: the problem isn't websites made by poor people. My biggest waste of time is ordinary quality but utterly irrelevant results.
Second: very often the problem isn't like before where websites included valuable search terms in white text on white background but rather that Google includes results that never included the search terms at all.
These are entirely Google's fault, I think, for over-optimizing for quantity instead of quality, not the fault of black-hat SEO.
Yeah, being successful has its problems, as everyone wants to game your search engine, but what does that have to do with the changes they made that make their search almost impossible to refine?
Operators are basically useless -- they may or may not be respected -- and I can't filter results to discussions like I could before, etc.
So basically Google decides what to do with my input; I have no say in what I really want, so I have to deal with lots of crap.
I'm really one of those lazy users, and I find myself using other search engines more and more, just because of these reasons, and despite their "normal" results being worse than Google's.
This is especially bad in Spanish. There aren't many alternatives in Spanish, and it sometimes drives me crazy. To the point that I looked into crawling some sites myself and putting an open-source search solution on top (I'm poor and it's a lot of work, so in the end I didn't).
> Nobody needs a somewhat smarter version of AskJeeves.
That's exactly what I need. It's not right for everything, but asking questions is the natural form of information seeking for a human. Being able to do that well is a huge value add.
Yeah the current state of the tech is (in my opinion) just a voice command line. You have to know the right keywords to say, and in the right order, to get it to do what you want. It's largely an exercise in guess, test, and revise.
Interestingly, it didn’t just boil down to a Quora/StackOverflow model; it wasn’t a “wisdom of crowds” thing. Instead, your question really was used as a search query — but instead of searching a pool of documents, it would search a pool of experts, matching you with an expert who knows about similar things†, then facilitating contact with them (and forwarding them your initial query/question to start off the conversation, like a Helpdesk system.)
† Not sure how they did this part — for academic experts, they could “just” fulltext-index their corpus of published journal papers, to build up a “knowledge fingerprint” of the expert. Not sure what they would do for people in industry without a stream of publications, though.
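One plausible way to build that kind of "knowledge fingerprint" -- purely a guess, not how Aardvark actually worked -- is to turn each expert's corpus into a TF-IDF vector and route a question to the expert whose vector is closest. A minimal sketch, assuming scikit-learn and made-up experts and corpora:

    # Speculative sketch of routing a question to an expert via TF-IDF
    # "fingerprints". Expert names and corpora are invented; Aardvark's real
    # matching algorithm isn't public as far as I know.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    experts = {
        "alice": "inverted index ranking information retrieval crawling",
        "bob":   "kernel scheduling filesystems device drivers interrupts",
    }

    vectorizer = TfidfVectorizer()
    fingerprints = vectorizer.fit_transform(experts.values())  # one row per expert

    def route(question):
        """Return the expert whose fingerprint best matches the question."""
        q = vectorizer.transform([question])
        scores = cosine_similarity(q, fingerprints)[0]
        return list(experts)[scores.argmax()]

    print(route("which index structure suits information retrieval?"))  # -> alice

For industry people without publications, you'd presumably have to substitute some other text about them (self-described topics, answered questions, etc.) as the corpus.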
Sadly, Google bought them, shut down the Aardvark product, and probably just put the engineers on regular SRE code-slinging tasks. It almost seems like Google felt threatened. And — hint hint — nothing’s stopping anyone from building something like this again :)
it blew everything else away pretty much immediately.
Right, I forgot people actually believe that.
Um, they were a little better, maybe noticeably so to one out of a thousand people. But wow, that's not why people switch search engines in droves.
The real reason, aside from their gift for self-promotion (I first heard about them in a science glossy, which was rare for a web 'company'), is that they had a cute, zany name, and didn't do the three things I mentioned.
Google became far more popular because of its spartan design than because of its quality, regardless of how people mythologize the company now.
Notably, at the time, a lot of us who were "power users" tended to stick with AltaVista longer, because Google at first wasn't so much better that it could compete with highly precise use of AltaVista's search operators.
Google was much more noticeably better for non-technical users who didn't know how to improve their results and didn't care to learn, but at the time that didn't translate to a massive numerical advantage in terms of users.
Yep, I also stuck with AltaVista for a couple of years (or maybe Fast? Long time ago) because I didn't find Google's results significantly better. I don't think it was the operators. In the late 00's Google's operators were the +traditional +plus +to +and +terms. They didn't introduce them immediately? Boy, I was pissed when Google extracted that into "verbatim mode".
Google's results were significantly better than other search engines, but only for a while. Soon SEO caught up with it using link farms and other forms of link spam. Then it was basically the same as old engines which were already spammed to death using other methods.
That period of better result quality, combined with other factors you mention, was long enough that most of Google's competition went away. So when the quality went down there was no-one to switch back to.
Do you have evidence to support this position? They certainly weren't the only search engine with a cute, zany name. Trying to recall exactly why I switched. I don't entirely recall, but I remember that at the time I used a rotation of multiple engines, usually starting with MetaCrawler and then branching to other engines that didn't index. After switching to Google, that rotation immediately ended. My recollection is that results really were that much better.
I certainly appreciated Google's design minimalism, but it certainly wasn't make-or-break, seeing as I was willing to go through multiple sites on every search.
I wouldn't have cared about what you termed "tricking the user", which feels like a skewed characterization. I just interpreted such things as "no results", the same way I do now.
That said, I can't really support my position either. I'm curious if anyone has tried to measure the quality of search over time... you always see a lot of opinions about it here and elsewhere, but it seems very much non-obvious to me, and hugely multivariate.
Unfortunately for my argument, my claim isn't founded on much beyond my gut feelings at the time. It would be easier to find articles that invalidate it. Certainly the ancient magazine article I mentioned (was it Discover? SciAm?) was a breathless story about boy-genius graduate students and their breakthrough PageRank algorithm, etc.
The people around me (maybe bloggers, too?) seemed as interested, or more so, in Google's simple "noncommercial" design and its use of the word "googol" as in the accuracy of Google's results.
On the plus side for my take, there's likely evidence that search users didn't consciously value search algorithms like they do now. In the 90s, people were mostly after "total number of pages indexed".
Agreed. The first search I made on Google (~1999?), probably for something tech-related, turned up several porn results on the first page. Back to AltaVista, DMOZ, and webrings for at least another year…
No, it was better because it was much faster and the results were much better. I remember those days; we (me, friends and family) were all using AltaVista or Lycos before. Do you think we all switched to Google in a blink just because of its spartan design?
I'm perfectly willing to check Bing or DDG or anything else I can find when Google insists on misunderstanding my query. Google had exact word match, symbol for symbol; it had reliable logical operators; it had reliable string queries. None of that is nearly as reliable as it was 10-12 years ago, and searching for some mildly controversial topics will get Google practically yelling its opinion at you, rather than just searching for text matches.
> Far as I'm concerned, aside from the extra millions of dollars they've spent on AI research, it's the same old story.
It's interesting because it seems like the UX equivalent of "burning the furniture to heat the house" -- how does this kind of thing become so institutionalized at companies?
Is this merely the natural end stage of the corporate life cycle, where after innovation and growth the now engorged and dying corpse must be parted out and sold by the pound? There's something so uncomfortably Darwinian to me about that. But I suppose that's also why it's common -- it works.
> It's interesting because it seems like the UX equivalent of "burning the furniture to heat the house" -- how does this kind of thing become so institutionalized at companies?
You're in charge of revenue for a division. You give an estimate of $X for the current quarter and $Y for the next quarter; your boss changes your estimate to $1.5X and pushes it up the chain. Now there's two weeks left in the quarter and projections say you'll only reach $1.1X, so your boss pushes you to cram in more ads and make them bigger "just for two weeks", but also reminds you that your revenue target for next quarter is $1.5Y, so maybe you should keep the big ads.
Yes, it would be. But if they charged for it, people would not want to be served ads, and they would need to charge a lot of money for that to make sense for them.