I was AltaVista's first corporate webmaster, joining in January 1996 to manage everything except the actual search engine itself.
It was my first real job at a large company and taught me a lot about working in corporate America.
I saw so many mistakes made within the year I worked there, mistakes that were obvious even at the time to a lot of us who worked there. At the same time, there are many similarities to what happens with other very well funded projects trying to make sense of a new technology and a new way of doing business inside a large, very important company with a very different business model.
I have seen virtually the exact same playbook play out in the enterprise blockchain space on multiple occasions over the last 5 years.
It is sad in many ways to see what happened to DEC (probably more so than AltaVista). It was such an innovative company back in the 60s and 70s, but unlike IBM it wasn't able to reinvent itself, first for the new 80s world of PCs and then later for the internet. A classic case of the innovator's dilemma.
AltaVista itself largely died because, in a misguided attempt to manage the innovator's dilemma, they just tried to rebrand everything network-oriented they had as AltaVista.
People only remember the search engine now, and for good reason. But we had AltaVista firewalls, gigabit routers, network cards, mail servers (both SMTP and X.400 (!!!)), and a bunch of other junk without a coherent strategy. Everything that had anything to do with networking got the AltaVista logo on it.
The focus became selling their existing junk under the now-hip AltaVista brand, while AltaVista search itself was not given priority.
I learnt a lot from my experience there, grew to be extremely skeptical, learnt to love Dilbert, and also learnt how cool DEC hardware and Digital Unix were compared to the Sun SPARC and Solaris stuff I had to work on afterwards.
Anyone familiar with Inverted Text Indexes when AltaVista was at its peak appreciated DEC's business model and execution. Google's PageRank pulled the rug out from under standard Inverted Text Indexes and quickly obsoleted the competition. Re-ordering the entire index nightly with potentially multiple passes is non-trivial and requires throwing a lot of hardware at the problem.
In my opinion, marketing wasn't the problem; the problem was fast-following a disruptive technology without understanding how it worked. Once details of PageRank were published it was too late.
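For anyone who never looked under the hood: the core of PageRank is just a power iteration over the link graph. A toy, single-machine sketch (purely illustrative; the function and data here are made up, and the real thing obviously ran distributed over a web-scale graph):

    # Toy PageRank power iteration -- illustrative only, not how Google ran it.
    def pagerank(links, damping=0.85, iterations=20):
        # links: dict mapping page -> list of pages it links to
        pages = list(links)
        rank = {p: 1.0 / len(pages) for p in pages}
        for _ in range(iterations):
            # every page gets a small baseline, plus shares from its inbound links
            new_rank = {p: (1.0 - damping) / len(pages) for p in pages}
            for page, outlinks in links.items():
                if not outlinks:
                    continue
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    if target in new_rank:
                        new_rank[target] += share
            rank = new_rank
        return rank

    print(pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"]}))

Even the toy version hints at the schlep: every page's score depends on every other page's, so each re-rank is another full pass over the whole link graph.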
Why was it too late? They were still number one when PageRank was published; all they needed to do was put more resources into it. Google's algorithms weren't magic. The biggest problem was lack of vision, and willingness to bet on that vision. Investors in Google were swinging for the fences; that mentality would never have been at home at DEC.
Google’s algorithms were definitely “magic” by the standards of the time. Altavista never got much better than literal keyword search even when they introduced PageRank-style weighting. Google immediately had useful results from misspellings, questions, and related keywords, and only got better from there. Plus it was much faster; responses in 1-2 seconds versus 5-10 sometimes for AV. Maybe what you said is true, but they were deep in the hole as soon as Google launched.
I don't recall AltaVista's response time as slow (although I was impressed by Google's fast response time, which they proudly reported back then). It was the page-load time, which was slow for AltaVista compared to Google's, that initially drove me away once that alternative was available.
Yeah, all portal sites were cluttered heavily. Google was only the logo, the search input field, and the submit button. That was one of the biggest advantages for me.
True, and Google is slowly being disrupted by the next generation of Internet search (looks like disruption happens every 20 years or so).
The age where "search" is meant to return pages will eventually end. I know it's early, but you can think of GPT-3 as the next generation of search engines. You consume the entire "Internet knowledge", and you answer questions, by merging information from multiple sources. Not just returning one existing page. That's sort of what GPT-3 does, you just don't think of it as a search engine yet.
Google felt "magic" in 2000. A search-oriented GPT-3 will feel "magic" soon.
GPT-3 will be remembered and used by no one in no time. Google is already answering questions directly in response to queries and also offering alternative queries. It's now in a hybrid state where it tries to answer your question but also provides links. Quite interesting and quite usable already. GPT-3 is a cute experiment, nothing more.
The lists of questions that Google returns as snippets can be seen as possible elements of a paragraph for a process that doesn't quite know what a paragraph is. Clicking on them acts as a survey response, a signal for that element of information's significance in the topic for which you searched.
> Google is slowly being disrupted by the next generation of Internet search
What is the name of the search service (or services?) that are better than Google? You make such big statements and provide zero examples.
Bing is inferior to Google; duckduckgo is even worse - the search is often unreliable (even after years of work); Tineye got hindered by GDPR and it was search via picture anyway. So what else have we got there?
I remember the time when Google came out very well. It was a strictly better product than Altavista. Once you started using Google you would never go back to Altavista - the difference was like night and day. Altavista was dead for you. There were no obstacles to switch either.
I assume that if a new product came today, that was better, even say 25% better, people would start switching via word of mouth.
Back in the day AltaVista lost because their technology was inferior. It wasn't about website design (although Google's clean landing page helped), it was 100% technology: Google was strictly better. In fact, at that time there were other search services too - they were even worse - and often provided nearly random results (e.g. search for William Shakespeare -> get random porn website...).
The PageRank algorithm sounds easy once you know it, but it is only easy once someone tells you about it. It is much more difficult to invent. Back in the day many other people worked on improving search (those were the times of catalogs and webrings) and Google were the first (?) to come up with something like that. PageRank was basically bleeding edge research sponsored by spy agencies. It only sounds easy with the benefit of hindsight.
Also not directly to you, but the opening poster basically writes that:
* they could have made AltaVista work - but the meddling management hindered it (how? did they have their own PageRank equivalent? I doubt it)
* they could have made blockchain work - but the meddling management hindered it
I see a certain Scooby-Doo pattern. And a senior developer/manager/architect (with 20++ years of experience under their belt) who claims that they could make blockchain work, while most people with such experience know that blockchain is an empty buzzword.
Not all of search is question/fact based. I'd love to see the real numbers, but I'd think that less than a third of all searches are suited to a factual question-answer system.
Google search is still largely based on human feedback: what people link to, what terms they choose to use in relation to other terms. It will be difficult to disrupt that.
Does the ad business model work with a GPT-3 answer?
Isn't the point of Google to make things "confusing" enough that I will click on the ad because it's clearer?
How would a GPT-3 search-oriented answer help with that?
Google was inferior to AltaVista for quite some time.
According to Wikipedia: "In 2000, AltaVista was used by 17.7% of Internet users while Google was only used by 7% of Internet users, according to Media Metrix."
So, two years after its official founding, Google was still not ahead of AltaVista.
AltaVista even had a "visual clustering" thing that used Java (and would work great now with JavaScript) that allowed you to refine your searches. I still cry that we don't have the equivalent of that 25 years later.
Then AltaVista got caught in the great DEC "hostile giveaway". Which left Google sitting in the right place with nobody to really compete against them.
One problem was that AltaVista was early--it basically predated common people using the web--and so it wasn't quite so clear how you monetized it. DEC effectively ran it as a goodwill "free service" to the internet. Even in 1999, on-line commerce wasn't very big. Remember, the big AOL/Time-Warner merger was in 2000.
The one thing that Google got right was timing. Google was in the right place when everybody switched from services like AOL to basically just accessing the web directly. And this let them put ads in the search which could be used for monetization.
(Also, as an aside, it's possible for that statistic to be totally true too.)
Early Google users definitely trended towards power users, and if the average early Google user made 3x as many queries as the average Altavista user, Google would have higher actual query volume.
The publishing of the PageRank algorithm was the wrong milestone; my bad. The important milestone was a less well-defined point when search engine users and competitors realized that Google was disruptive.
PageRank was the way forward but the true schlep, in the Paul Graham sense [1], was the brute force continuous hardware scaling required to keep up with the growth of the Internet. DEC needed a technical pivot first and foremost. I've seen no evidence that the AltaVista team understood the technical challenge but I could be easily convinced that bean counters and/or management stifled a promising response; absence of evidence is not evidence of absence.
My recollection from the time (I didn't work at either place but was in the general area) was that everyone assumed Google were delivering their performance by use of massive hardware resources (I heard 70K machines at the time). So perhaps AV folks just couldn't imagine getting the hardware budget to compete effectively?
It's also worthwhile I think to point out that at that time, it wasn't settled that "search engines" as we know them today (like Google) were _the_ way to use the Internet. There were alternatives such as Yahoo (curated directory), and browser-side catalogs (netscape.com home page) that were much more popular. So it is possible also that AV folks weren't exactly thinking "We have to light the afterburners to go after Google in this immensely important internet search space". They might have been thinking "odd, someone spent $xxxM on hardware to run a search engine, that'll never work out".
WRT curated directories: in the late 90s when Google came along (along with others like Northern Light), it was clear that both the directories (Yahoo and Moz in particular) and the established search providers (Yahoo, AltaVista, Metacrawler, etc.) were being overwhelmed.
Yahoo (aka Inktomi) search frequently took multiple pages of results to find anything of quality. Curated directories were missing quite a bit. Furthermore, if I remember correctly there was some sort of payola scandal around Moz/OpenDirectory editors.
The market was primed for a better solution, and PageRank provided it.
Inktomi was pretty good then. In fact they powered both Yahoo and Microsoft web search at the time. When they were acquired by Yahoo, they formed the basis of Yahoo search, that was well respected in the search research community for years until Yahoo started to implode.
Early Google used commodity hardware to the point of being comical. Somewhere there is a "historical" display of commodity motherboards spread out on cork trays in a rack. Maximizing density/minimizing cost.
The ability to scale in this way with commodity hardware was new and important.
When Google published PageRank in 1998, they didn't publish MapReduce, the framework required to coordinate large distributed computations, until 2004. By then, AltaVista was dead.
If AltaVista had thrown resources at the problem using any of the then widely known techniques for scaling, they would have been constantly dead in the water.
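For context, the programming model itself is tiny; a toy word count in the map/reduce shape looks something like this (single process, illustrative only -- the hard part the 2004 paper addressed was running this reliably across thousands of flaky commodity machines):

    from collections import defaultdict

    # map step: emit (key, value) pairs independently per document
    def map_phase(documents):
        for doc in documents:
            for word in doc.split():
                yield word, 1

    # reduce step: combine all values emitted for the same key
    def reduce_phase(pairs):
        counts = defaultdict(int)
        for word, n in pairs:
            counts[word] += n
        return dict(counts)

    print(reduce_phase(map_phase(["the web is big", "the web grew"])))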
What made me switch to Google from AV was the simplicity of the Google home page and the clean results page. AV was getting too cluttered, Google loaded faster. (We were still on dial-up too!)
The funny thing is that I remember AltaVista being much cleaner than Yahoo and the other competitors at the time. I think they only switched to the cluttered page to try keep up with the trends. Then Google came along and rolled it all back to basics. (Of course the better results helped too.)
> It was such an innovative company back in the 60s and 70s, but unlike IBM it wasn't able to reinvent itself, first for the new 80s world of PCs and then later for the internet. A classic case of the innovator's dilemma.
No, this wasn't it. DEC actually had a quite good PC business, which died when everybody lost out to the Asian manufacturers.
DEC also had plenty of technical strengths as well as cash flow to ride it out.
DEC's demise was purely a result of executive-level and board-level malfeasance.
After the board forced the founder Ken Olsen (who wasn't a great CEO but actually did have vision) out, Robert Palmer (who had no vision AND was incompetent) (and not the singer--the singer probably would have been better as CEO) was given marching orders to sell off the company. Which he did--with no vision whatsoever. Lots of people tried to fight against it, but any division which started righting itself immediately got flogged off.
The patent lawsuit allowed DEC to jettison a bunch of the fab to Intel which then made them an attractive target to Compaq.
People love to comment that the merger killed COMPAQ when the reality was the entire US domestic PC industry was completely collapsing.
HOWEVER, to give you an idea as to how badly the "hostile giveaway" was managed, Compaq effectively bought DEC for less than their enterprise service annual revenue--about $2 billion per year. HP later milked this stream for more than a decade.
So, in the middle of a PC industry collapse, no executive could figure out how to convert a $2 billion annual revenue stream for enterprise services plus a whole bunch of leading edge technology into a profitable company.
This shows just how shit-tastic the executive management for both DEC and COMPAQ really were.
But, hey, the DEC board got their stock bump and cashed out.
I read your comment a few times and didn't understand what had happened at all. I figured it all hinged on the term innovator's dilemma, since you used it a few times. So I looked it up.
Wow. Mind blown.
I think we're all familiar with disruption, but Innovator's Dilemma poses a theory or framework as to why it happens. It's rather brilliant and seems to fit every case of market disruption I can think of.
Incumbents serve existing markets and don't care about the small markets served by disruptive startups. They're making small, iterative improvements to their product.
Disruptive tech eventually hits parabolic improvement and by this time the incumbent can't catch up.
To be fair, even Google and early investors in Google, including myself (2005, post-IPO), didn't understand how lucrative and important being the dominant search player would be.
It won because competition was really bad. Altavista was not a good search engine.
When you were looking for something, you had to use multiple tools, like Lycos, Hot Bot, Ask Jeeves and the multitude of the web portals that were available back then.
I remember when I first saw Google: it was an instant decision and I never went back, that's how big the difference in quality was. Unfortunately for us, nobody found a working business model for search in time, and this allowed Google to become too big for any viable competition.
The problem with how the market works is that once a monopoly has been established, it's very hard for a competitor to challenge it, even with a better idea.
Large players have lower costs due to large vertical integration and bulk discounts.
Even if I were to invent a new search algorithm that would be superior to Google's in terms of satisfaction with most searchers, I would be unable to get a wedge in.
This is not so big a problem with, say, a power company, but with a search engine, it becomes the front page of the internet through which everyone goes — the end result is that Google commands a great deal of political influence simply through how it decides its algorithms should work and which pages to prioritize.
Courts ordered Microsoft in the past to provide Windows users with notice of other web browsers, which was the primary way Internet Explorer lost its dominance — perhaps it is time to order Google to provide users notice of other search engines and web browsers too as a matter of antitrust.
Thanks for posting this. I love when HN brings together the principal actors and primary sources. OT: I wish we could use "webmaster" more often, even though that's a much murkier proposition in the days of product managers deploying the new site via CDeployment.
I was in Nassau working between FSG/PM and the ACA/CORBA team. I recall many trips to the WRL; they always had the coolest stuff: network tunnels and Firefly. Back then, who you hated said a great deal, and I recall several other CEs playing the silver bullet game to perfection.
I'm curious about the relation to corporate blockchain?
>> I'm curious about the relation to corporate blockchain?
I think it's the idea of companies seeing new tech and trying to leverage it somehow in their own businesses and failing - badly.
Here's my own anecdotal evidence with blockchain:
I work in a very large health care company. For about four months, there was a huge buzz around the company about how to leverage block chain in health care. The main idea was using block chain to manage patient accounts and PPE.
We had all the execs brought in to do huge presentations. They brought in people from IBM to talk about Hyperledger and other block chain companies. They posted videos and articles about how this was going to transform healthcare; we were all told this was going to be huge. They told people they were going to form a new team, hire developers, and that this was going to be a huge focus in 2020.
Six months later? You couldn't find a single resource on any of it on any of the internal company sites. All the presentations stopped, the execs stopped talking about block chain seemingly overnight, and it was like poof! The idea of block chain, or any mention of it or the "revolution" that was supposed to follow? Completely disappeared into the abyss, never to be heard from again.
I have no idea how much they sank into the notion that block chain could be used for health care, or how many people they hired or the contracts they signed with IBM, but I can only assume they lost a lot of money before they finally realized it wasn't going to work out.
As I understand it, the main reason to use blockchain is having multiple parties interact without any party being trusted, so if you do have some central trusted authority it's pointless. There's also the requirement that the network has enough participants that no single party can gain a majority of the network's compute power, which seems a very iffy assumption with private/permissioned blockchains.
So I don't see how blockchain is useful in a healthcare context. Please correct me if I'm wrong and enlighten me if I'm missing something.
> They should have merged with a PC maker obviously.
DEC was acquired by Compaq [1]:
> In 1998, Compaq acquired Digital Equipment Corporation for a then-industry record of US$9 billion. The merger made Compaq, at the time, the world's second largest computer maker in the world in terms of revenue behind IBM.
I took it to be sarcastic. It was a spectacular fail as within about a year of the acquisition everything good they were doing had completely disappeared from view never to be seen again. That's not to say that DEC could have done any better on their own: with the absolute market dominance of Wintel at the time, they were flaming out as so many other mini and workstation companies did as their markets evaporated.
It's hard to explain just how good Google was back in the day, compared to those early search engines.
I distinctly remember someone suggesting it to me at our CS lab in college. A few of us had never heard of it, and we all started to do some searches to try it out. There was silence for about 5 minutes, and then someone said "this is really good."
Say what you want about what Google has turned into, but it was an incredibly important tool that came around at the right time.
Also, it makes me really happy that they've kept the "I'm Feeling Lucky" button around.
The keyword (no pun intended) being "was" --- I remember going through many pages of relevant results and finding the "needle in a haystack" many times with Google, but now it seems like anything even vaguely obscure gets absolute rubbish results (and it claims there are no more results if you try to go deeper, despite the fact that I know the pages are out there). I'd say it started to decline roughly 10 years ago.
Google has definitely just plain delisted a lot of older sites, even if I do exact quote matches I can’t get results for sites I know exist and used to be indexed and often turn up on other search engines.
On top of that, as you say, anything that isn’t from a huge web property or corporate site is so heavily penalized you’re lucky to ever see it even if it is new.
The utility of google as a web search engine is definitely declining over time.
This is possibly a self-destructive pattern, too. Once nobody can find anything outside of Quora/Wikipedia/Pinterest/Stackoverflow etc. why would anyone bother creating small sites/blogs (this has been happening already for a decade) — But then, once the entire accessible web is just these handful of big sites, what is Google’s utility? They all have internal search and are increasingly accessed primarily through apps
Like grocery stores or many types of restaurants, for example. They usually can't compete against the Walmarts and Paneras, which suck up consumer mindshare. Similar with soft drinks. Coke and Pepsi build as much mindshare as possible. You may invent a great product, but it likely will never get noticed among corporate noise.
Google, Wikipedia, Facebook are becoming the default go-to spots for particular activities, similar to what McDonald's did to fast dining in the 60s and 70s. I guess, in general, markets saturate, oligopolize, and the small spots fade away unless they're niche or very high quality.
It's even delisting a lot of content from existing sites. For example, I've got previous HN comments that I used to reference using `akiselev site:ycombinator.com [some key words]` that are nowhere to be found now. I started noticing this a few years ago.
It was actually how I discovered Google. I was a grad student at the time (English literature) and dreamed of accessing all the primary and secondary sources on a particular topic from the convenience of my own keyboard (still a pipe dream today). I cycled through every search engine I could find looking for one that would actually return results relevant to my search terms. I remember AltaVista being touted as one of the best but still failing to satisfy.
As soon as I read this New Yorker article, I checked out google.com and had that same eureka experience: finally, this is the one.
In grad school, in the early days of the dotcom boom, some of us CS-ish people curious about startups would talk about them. One day, one of the PhD students asked everyone which dotcom's stock would they most like to have. I said Google. (It might not have been a company yet, but we kinda assumed.)
Google worked surprisingly well, obviously they were going to be hugely important (unless another player appeared and did a big next leap), and obviously they were smart. (On the side, I also got warm-fuzzies that the search hits I got seemed to have a strong Linux bias.)
The grad student who asked, coincidentally, ended up deciding not to finish his PhD, and went to Google. :)
I remember being blown away by how good google was around those times.
But I also remember just how blown away I was when I first saw AltaVista do its thing. AltaVista was equally amazing in 1995 as Google was in 2000. Innovation goes in waves.
I donno, for me early google was horrible. I always went back to AV because it gave me the results I was looking for. But over time AV just got slower and bloated and Google just got better.
Glad to see someone else had the same experience. When someone recommended Google to me, I tried it and immediately went back to AltaVista. The only distinctive characteristic I remember noticing was the immense whitespace on the Google home page. That sort of simplicity in page design was different from other search and portal websites at the time.
Using AltaVista for me meant digging through page after page of results. Comprehensiveness was the main concern. It was up to the user to evaluate the relevance of the results.
Early Google made similar claims about comprehensively searching millions of pages, but as we know today, they are intent on inferring meaning and purpose. They actively discourage and prevent users from combing through page after page of results. User evaluation (i.e., intelligence) is not expected. Google attempts to evaluate results for the user based on popularity, originally estimated primarily by counting backlinks. Popularity as a filter is useful sometimes but deeply flawed at others. It's arguable Google has dulled, atrophied or stunted development of web users' analytical skills. When it first appeared on the web, Google had no paid placements and no advertising. What was not to like? They later abandoned their original mission to avoid the influence of paid placement. They became beholden to advertising.
It was not difficult to see when and where the influence of advertising came into AltaVista. However when this started to happen at Google, Google tried to hide the ads by making them text-only. As if the influence was not there.
We need another AltaVista, where user evaluation of results is allowed and encouraged, with a mission statement like the original Page and Brin paper announcing Google: no influence by advertising. Ultimately PageRank was dependent on human discretion: the decision whether or not to link to another page. We soon learned that this discretion, this choice to link or not to link, easily becomes driven by money when people know it affects PageRank. Google quickly got gamed and it has been trying to pretend it can manage this ever since.
I also had this experience. However, I now feel all search engines are returning worse results. Very rarely can I find anything meaningful outside of major content providers. Everywhere my results are filtered through a local government lens. In many ways it feels like search on Facebook, YouTube, Twitter, Instagram, LinkedIn, WeChat, closed appstores, etc. have become the new de facto content aggregation mechanisms, and those are tending toward feeds and curated content rather than requiring active search. The best and most meaningful results on most actual search engine results pages are Wikipedia excerpts or trivial widgets (currency or unit conversion results, weather forecasts, etc.). I don't have any objective measures and am unlikely to be a representative sample but I feel I am searching less, my browser has become a URL-bar based memory system, the traditional 'home page' notion has been replaced by a Firefox auto-curated most-frequent-sites list, and I use traditional search engines more like a command line than a library index. OTOH we now have Library Genesis which is brilliant.
Irrespective of its legal status, LG is proof one does not need 75,000 employees, nor any perceived "brilliance", to provide a world-class, non-commercial information retrieval service. It can hold its own next to many university libraries' remote access facilities and it puts Google Scholar to shame. It is command-line friendly and offers bulk data. Unlike Google, LG does not try to guess what the user is searching for and answer questions. One encourages traditional learning; the other is incessantly trying to surveil its users in the name of selling online ad services to third parties.
IIRC, the draw was "you don't have to use plus signs" and the other options, and you didn't...until Google started trying to read between the lines of our searches and we started wishing Google had plus signs.
Google's aversion to zero-result queries is a problem, and possibly the problem, with their "accuracy," so to speak.
I remember the first time I had heard of Google -- it was ALREADY used as a verb!
I was interning at Xerox PARC (adjacent to Stanford campus) in summer 2000. As far as I remember, I used Alta Vista at the time.
Somebody asked a question, and one of the researchers/intern mentors said to "Google it"! I think there were some puzzled looks, but we tried it, and I started using Google and never went back.
When I returned to college in the fall, I remember my former housemate saying how good Google was too. She had started using it too. Once people started using it, they never stopped!
Agreed. I was a heavy user of Info Seek at the time (I especially liked how they highlighted search terms in the results) but it was hard to deny how much better Google results were. For a long time I just used Info Seek first and would try Google second, but after a while I realized I was just using Google. The results were impressive.
I used to go to the library to find sample CS books and read them for about 1-2 weeks before I actually got to code. After a classmate shared Google with us, the quality of results was just unbelievable, and it was so easy for us to start finding sample code and literature. Yes, probably the best tool in modern human history.
The main thing I remember about Google was that it was fast, really fast compared to AltaVista, which was becoming so slow as to be nearly unusable (sometimes you’d wait 30 seconds for a page load).
Weirdly similar experience. I too discovered Google in my college CS lab, when I noticed the (then) weird looking homepage. I had been swearing by AltaVista at the time. But it was with Google that I was able to, for the first time, get what I wanted without having to go through multiple pages of search results. A search engine giving you what you want within the first few results of your very first search attempt was just not a thing. Crafting search queries was an art, and so was wading through results.
On the other hand, search engines of those times gave you what you asked for, not what it decided was best for you. That option is gone now, and searching for many niche topics is no longer possible. If that's what you're looking for, it might as well be 1990.
I wonder if the "I'm feeling lucky" has stayed around because it's a good training heuristic for Google Home (and maybe other services?) even if very few people actually use it. I'm genuinely curious why they've kept it around.
It's also a pretty ideal feature to keep around even if it is mostly unused.
1) It isn't very complicated -- it doesn't require a lot of code that is going to rot.
2) When it is used, it's cheaper rather than more expensive.
3) It only "clutters" a UI that isn't actually seen that often nowadays -- you have to actually go to google.com, rather than just typing in the browser bar -- and isn't particularly cluttered.
However, if it makes the experience worse then it’s not worth keeping. I’m sure they did some study where they found that most users weren’t happy and had to go back.
What drew me to AltaVista was that it gave me results that didn't exist elsewhere. It took some boolean gymnastics to get precise results, or going a few pages deep, but stuff, if it existed, was findable.
I almost wish there were less precise and more obscure options out there today. A search engine that purposefully didn't index mainstream news, social media, nor shopping/product sites.
I wish there were still a search engine that followed my query strings and operators to the letter instead of attempting to be extra clever. Even Bing seems to be doing that no longer.
This would be my main complaint about DDG: when there are no results for quoted search terms (but not when it's a single quoted term), it will just ignore the quotes and give results that most of the time are completely not what I need, without even saying it is doing that. Often it takes me a while to realize I made a typo and it's now giving me things I cannot use. IIRC it was not always like that, though.
... until you want to search for something verbatim with quotes in it. Google doesn't seem to respect escaped quotes like "hello \"this\" is a test" which is quite surprising to me considering how they invent programming languages and all.
Is there an whole-Web search engine that isn’t a fulltext search engine, but rather works more like grepping a virtual text file that contains the entire Internet, such that one could match on punctuation and so forth?
You think it's practical to run the query on the index, materialize the full result, re-index it, run the full query string, and rank? This suggests you do not understand the scale of the problem.
You don't need to re-match or re-index on the full set of results to display just the first page.
I fully understand the scale of the problem. I also think this is a very approachable problem for a thousand Google engineers who have done a lot harder things.
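Roughly what I have in mind, as a toy sketch (all names and data here are made up, and a real system would need far more than this): use the normal keyword index to get ranked candidates, then run the expensive verbatim/regex check only until the first page of results fills:

    import re

    # toy stand-in for candidates a cheap keyword lookup already narrowed down
    candidates = [
        ("page1", "error: foo() failed at line 3"),
        ("page2", "foo failed, no parentheses here"),
    ]

    def first_page_verbatim(candidates, pattern, page_size=10):
        # the expensive exact match runs only until one page of hits fills,
        # not over the whole result set
        hits = []
        for url, text in candidates:
            if re.search(pattern, text):
                hits.append(url)
                if len(hits) == page_size:
                    break
        return hits

    print(first_page_verbatim(candidates, re.escape("foo()")))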
So much this...
I commented earlier how I miss AltaVista's search language. Stuff like the 'near' operator was awesome. Want all of John Doe's papers? "john" near "doe" got you "John Doe", "Doe, John", "John A. Doe", etc.
oh god yes, that would be nice. I know what i typed stupid search engine, if i put it in quotes...i only want that exact phrase. Don't just cast that aside and search for the words individually.
Yeah, I wish that there was an option in Google (or anything else) to rank my results based on criteria other than "many people near you have searched for the same thing". Sure, this is helpful for current events, but for other things I've been finding that my results are garbage blogs written by an AI, and I'm needing to append "reddit" to the end of my query to get a real answer.
Hmm, read this after I posted my other comment :) So, is that a setting? Does a search for "nonexistin gcrap" return no results for you then? It does for me..
I played with it a bit; it may be limited to one word. So "nonexstin" does come back with sites with that literal word in it. But with 2 words, it does the "what I think you mean" thing. TBH, I have only tried it with one word at a time for my use of "the site must have at least this exact word" type use cases.
I think this can be generalised as "more power to users." You might want to filter Pinterest. OP wants to avoid mainstream news, social media, and shopping/product sites. This requires users to be involved.
Filters, refinement, user preferences, etc. Search is a powerful tool, and the power has been growing exponentially since 1995. But almost all of that power is channeled into instant results, intuitiveness, "sane defaults" and such. All the capabilities are in the background, and the assumed user is the lowest common denominator on all fronts: low effort, low technical capability, low understanding, etc. This is most people (me included) most of the time, but not all of the time. Maybe I want to filter SEO spam (like Pinterest) more aggressively. Etc.
This isn't a slam on Google really. One thing can't do everything, and the thing Google search does is what most people need most of the time. It's definitely the thing that makes the most money. But... none of a search engine's power has been channeled into making search a power tool. It's a tool that everyone uses many times per day, but there is no learning curve. No getting better. You can't really invest effort and get rewarded for that effort.
Pity really, all this power is there. It's right under the surface. Let us be more than the lazy, dumb user sometimes.
Google have reversed the User Interface paradigm. Instead of users learning the software and telling it what to do, the software learns the user. I don't necessarily mean NNs or personalisation. I mean that the paradigm is software-centric. "If the user does X, how should the software respond?" instead of "If the user wants Y, how does she make the software do it?" That's great for intuitiveness, but it also creates a frame where the software gets better over time but the user never does.
Yeah, I also want filters, refinement, user preferences. etc.
And I want it to be social - I want to search using the same filters that Tim Bray uses.
Or I want to use the filters that the people use who in aggregate tend to click through to the same results pages that I visit.
So much amazing stuff could be done, but here we are controlled by algorithms that treat us all as brain dead sheep who only want to know about what the Kardashians are up to.
I suspect that Google knows about these desires, but would prefer the world where everyone is a sheep.
> I think this can be generalised as "more power to users."
I agree, we need more power to control. There should be more than one recipe for ranking and filtering, and we should have more say about what information is being hidden from us. Society is diverse; a single formula automatically discriminates against some groups. Let people make the choice.
> Google have reversed the User Interface paradigm. Instead of users learning the software and telling it what to do, the software learns the user. I don't necessarily mean NNs or personalisation. I mean that the paradigm is software-centric. "If the user does X, how should the software respond?" instead of "If the user wants Y, how does she make the software do it?" That's great for intuitiveness, but it also creates a frame where the software gets better over time but the user never does.
May I quote you on this?
It is a great description of the 3rd premise in a talk proposal I submitted to LibrePlanet 2021, which goes something like:
- Software freedom is predicated on the notion that users ought to be in control of what runs on their computers.
- I claim that tech literacy is necessary to exercise this control.
- "Google Design" presumes that "nontechnical users" are not and will never be capable of tech literacy.
If all of these are true, then "Google Design" is incompatible with the idea that Software Freedom is for everyone. And indeed, we see this borne out in many people's disdain for copyleft licenses, which impose restrictions on developers in order to guarantee freedoms for end users. That's a logical viewpoint, if you believe that end users wouldn't benefit from software freedom, anyway.
"Google Design" does make software more accessible to end users… at the cost of turning your computer into an appliance. Which makes it a self-fulfilling prophecy. People don't magically become tech literate; they intentionally learn the skills so they can achieve something. Remove the power from the tools, and you take away their motivation to become tech literate. I have a great story about my grandma to illustrate this point.
If our goal is to make computing (and, by extension, software freedom) more accessible to everyone, then we need to change our approach. We need to prioritize education and teaching tech literacy. And most of all, we need to write software that is powerful enough that users will see the benefit in learning to use it.
Those are the key points from the first half of the talk. The second half is about how we can create software that is powerful without sacrificing usability. It basically boils down to "make simple things easy and hard things possible" — creating software that Just Works™ without configuration, but allows you to customize its behavior if/when you want to. In particular, I'm a fan of embedded scripting interfaces (e.g. spreadsheets), which allow a vast amount of customization without creating the "settings overload" that is typical of many FLO[1] programs.
I'm with you on this. I would pay far less, $1/month would be the point where I would begin considering it. There are just too many "only $5" services and they would add up quickly if I were to allow myself that amount of money.
The problem with this model, though, is support. A service with 100 million untrained end users each paying a dollar will need to provide quality customer service to 100 million untrained end users. It is much easier to sell ads to 10,000 "affiliates" who feel that they have some special arrangement and thus accept some responsibility of their own, and make $10,000 off each of those.
I agree there are too many $5 services, but a search engine is something that I use every waking hour of my day. I get so much use from it that $5 per month is a bargain.
I would definitely pay Google just to be able to remove some sites permanently from my results. They're just spam to me, even if Google thinks they are legitimate.
Yes, exactly right. If people are paying any amount at all they expect it to work and work perfectly. I'm not saying that's unreasonable, but it does quickly become unfeasible at scale with users who have no training at all.
Then on top of that the skimming the payment processors do makes a large percentage of that $1/month go to transaction fees. Charging yearly helps with that but many people won't want to do that.
Additionally, for those who like to add reddit to the search query, I'd also like to recommend " inurl:forum OR inurl:board " (note also that forum and forums give different results for some reason).
Personally, I like to do broad searches on Google to discover new stuff, but for the past 7 or 8 years, Google often returns ~100 results instead of the usual 1,000,000+. And, contrary to logic, adding various inurl: commands actually gives you more results instead of limiting them, and thus significantly improves searching capabilities.
Google with filters for a dollar a month would be awesome, like a function to "block this domain forever"... it would clean my search results of the annoying offenders. And I would gather user-submitted blocked-domain data to produce better search results for non-subscribing users...
The really annoying part about Pinterest is that I never seem to be able to get back to the original directions/plans/whatever. It's like "Hey, look at this cool stuff I won't tell you how to make, even though I swiped it from a HOWTO site".
This is really the precedent that mattered, I think. Remember one line of a poem? You can probably find the rest. Recall a Douglas Adams piece about rice farming in Bali or Java or something... you can find it. As you say, it took boolean gymnastics. But, that really just means modifying your search terms and scrolling through some results.
Being less powerful than 2020 google kind of put more power/responsibility in users' hands. A user needed to use a search engine like a tool.
I want what you want too... Everything google (mail/search/youtube, etc)... but designed for more user effort. Finding the best-for-most result instantly is great, but sometimes you want the tool to help you find a result on page 112 within a few minutes.
Bring back a little bit of a directory feel even. Let me narrow down, refine and shape results gradually. Assume that I will rummage through a bunch of crap to find what I want.
Same, at one point the size of AltaVista's index was vastly above all the competition and was the best place to go for finding obscure stuff. With the boolean filters it was possible to exhaustively search the web.
Unfortunately that didn't scale with the size of the web; at one point the amount of pages got large enough that doing a smart ranking like Google did became more effective than the "look at all pages of results" usage required for AltaVista. It felt like a monumental transition, like when a program gets too big for main memory and causes the computer to start swapping to disk.
Yes... Enjoyed AltaVista and FAST (AllTheWeb)[1] for precision searching.
That reminded me of Architext, and EWS (Excite for Web Servers)[2], but a good bit of Googling (ahem) later, it seems difficult to find much about most of the mid-to-late-90s standalone local search engines any more, except perhaps Inktomi. Most are not mentioned in Wikipedia's search engine timeline[3].
Excite for Web Servers makes it easy for you to add searching -- Excite, Inc.'s advanced concept-based searching -- to your Web site.
Excite for Web Servers provides a simple Web-browser interface for doing all the things necessary to enable concept-based searching of collections of documents -- administering, indexing, and searching over the collections.
In particular, one can:
- define a document collection -- that is, specify a set of documents to be considered a single collection over which one can search,
- design customized pages for displaying to users who wish to search over that collection,
- index that collection, monitoring the progress, and
- search the collection.
With Excite for Web Servers, it's easy to set up concept-based-searchable Web sites in minutes.
The head librarian at my high school was an older lady in her early 60s, and she suggested that I use Google because it was the best search engine. Back then I used Webcrawler, but there wasn't too much distance between the competition.
I thought it was weird that "googol" was spelled incorrectly and that Google's logo was ugly even by Paint Shop Pro 4 standards. It looked like search for kids. I assumed the librarian didn't know anything about computers and dismissed her advice. Within a few months everyone was using Google.
Pro tip: Most librarians have master's degrees and their field is all about information storage/organization/retrieval. It takes very little time to teach a kid how to put books on a shelf. The reason librarians are at libraries is to manage the collection of knowledge and to help people find information. So I'd give a lot of weight to a librarian's opinion on a search engine. The hardest part is finding out whether or not someone working at a library is a "real" librarian without being insulting.
Probably different in different places, but in many libraries I've been to, you often interface with student part time workers or the stereotypical ancient person who still lives in a previous age and is bitter about people being impolite.
But I'm always reading positive stuff about American public libraries that they are not really just about borrowing books, but free internet, photocopying, showers, some kind of free social program to help poor people with any information related stuff, like job search or government forms.
It really was boring when the floor, the desk, the cubicle, the walls, and the computer were all the same shade of aging beige. The sterile asylum days of technology industrial design.
OT, but... that article has the most misleading graph I have ever seen. It plots two items with the same units, but different origins and scales. It makes it look like we had a huge drop in active leisure time and it was replaced hour-for-hour by screen time (and that screen time was near zero in 2000). Instead it's a modest decline in the former and a slightly larger increase in the latter.
When will somebody replace Google (and Bing) with one that actually works again? Is the cost of entry that high? Is there no business model to permit one to operate at some level of profit?
I'm sure most here have been frustrated by the difficulty of getting "good" results on searches, even with modifiers. But what most troubles me is Google's memory/history has grown smaller and smaller, as if it has Alzheimers - searches that used to return results now bring back none.
The web corpus is huge, which leads to follow-on problems: it's expensive to fetch it, and to process it, and to host the resulting indexes. Fetching is tricky also because you're likely to get blocked from sites if you're too aggressive.
To justify all that expense, you need a lot of users, but it will be hard to get those users because there is 20 years of 'search = google' to compete with. Yahoo search user testing from 10+ years ago found that users would prefer search results displayed with Google branding over search results with Yahoo branding, regardless of the search results. Maybe it wouldn't be so bad if it's Google vs a new hip name, but you have to somehow cultivate that hipness. Bing doesn't have it, Amazon tried doing websearch and quit pretty quick (but maybe it's used for Alexa?).
You'd realistically need to build out an advertising platform too. Using Google's ad platform while trying to compete with their core market seems like a bad idea. Using Microsoft's ad platform is probably not going to be a good experience, but maybe you can start with it.
Which would be easy enough to store in 2020, but then you need to preprocess it in a way that is amenable to both search and result ranking. But let's say your indexing is super good and matches the compressed version: 80TiB. Throw that in EBS, and you're paying $6k/month just to store it. You also need CPU and memory to actually compute from that data, though! If we instead use i3.metal instances, you're looking at about $2700/month each, and you'll need 15 of them for 3x replication. $40k per month isn't bad, if you're a startup with VC funding. But... we also need network egress... All this just to be literally the common denominator search engine with zero users.
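Back-of-the-envelope for those numbers (the unit prices are my assumptions based on 2020-ish AWS list pricing, so treat this as order-of-magnitude only):

    index_gib = 80 * 1024            # ~80 TiB of compressed index
    ebs_monthly = index_gib * 0.08   # assume ~$0.08/GiB-month for EBS -> ~$6.5k/month
    compute_monthly = 15 * 2700      # 15 i3.metal-class boxes for 3x replication -> ~$40k/month
    print(round(ebs_monthly), compute_monthly)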
So, how do your users get to you? In 2020, you have three major sources: browser searches, phone searches and direct traffic. If you want to be in the browsers, you're going to have to pay, and if you want to be default you must pay more than the incumbents who have their business model figured out. And bid at scales of roughly your Series B and C combined. Phone OSes, same deal: you need to be prepared to bid high, and in volume. Direct traffic is basically word of mouth / marketing driven, and for our common crawl search, we can assume is relatively nil. So even search traffic has an acquisition cost, and almost all of the sources run their own search engine that you would need to bid against.
So this point you need to start thinking about revenue, because every query you get literally costs you money. We know that search engine ads work okay, since the user is clearly expressing intent. But those different users have very different values -- someone searching from an iPhone or MacBook Pro is likely more valuable to advertisers than a 10 year old Linux laptop running Firefox with Adblock and a Pi-Hole DNS server. And without traffic nobody's going to bother running campaigns on your platform.
Alternative revenue strategies seem unlikely to work -- Google is free and Bing literally pays users, so subscription seems unworkable. You could try to find a niche, the way DDG has, and perhaps chisel away at market share slowly, but you'd need some content indexed that is unavailable to competitors, and that will come at a price.
I only skimmed the article but had fond memories of how revolutionary AltaVista was when it launched. Suddenly you could browse the web without following links. This started a trend where you provided less and less value by stuffing links on your homepage (yes, this was a thing) as people didn't need the links anymore.
Some years later I remember AltaVista suddenly became full of paid links and ads, to the point of unusability. This is when Google came in, with no ads, no paid links, and actual good search results.
The irony.. now Google fills at least half the first search page with paid for links and unusable results.
Unfortunately, nobody (in the long run) gives away something completely for free. I would pay $1-2/month for a search portal without paid links and no sell-off of my private info.
The estimated average Google search ad cost-per-click (CPC) is $2.69 [0].
Working on the conservative assumptions that an average person will run 10 searches a day and click on 1 search ad a day, Google will make $83.39 a month from the average user. It is likely much more though.
Their global ad revenue is a public figure right? It should be easy to estimate an average. If they have a billion users, do they actually have $100B in monthly revenue?
I highly suspect the distribution is not flat in any way though (some users are much much more valuable than others and it's why advertisers pay FB and Google to target the ads for them).
Bear in mind that there is a vast amount of clicking on "wasted" adverts, i.e. paid adverts which would have been the top result of an organic search anyway, e.g. eBay spending $20M a year on ads targeting the keyword "eBay", which they thought was great because it appeared to give a $245.6m return on investment, until they switched it off for a bit and realised they got pretty much the same traffic and conversion rate without the massive advertising spend[0].
I never click on their ad links so it would be a pure bonus for them if I pay :) But yeah I don't know... sometimes the supply/demand curves never meet.
Still think the business opportunity for a really good paid-for search engine is there. Of course it's not trivial to make a search engine but as I feel Google's usability is in a falling trend, the bar is getting lower..
I remember having AltaVista as my primary bookmark, until they accelerated the ads and monetization and rendered it almost completely useless. It was so obvious they wanted to make quick, obscene money off of it. They cared very little about the negative impact that had on user experience. That's why AltaVista went the way of the dodo. Period.
According to the article, AltaVista switched owners 3 times during the last 3 years of its life. I assume one of these had the bright idea to fill the results with crappy paid-for links. Google has these geniuses to thank for priming the playing field for them by destroying themselves.
I recall there was a low-bandwidth, "text only" version you could use for some years afterwards which was just the search box. Probably suitable for Lynx users. Then they added one banner.
I don't think the UI switchover was uniquely AltaVista-- remember this was the time when everyone wanted to be a portal and had to have a section with sports scores, repackaged news, and stock tracking.
Yahoo was the last man standing on that path, but I seem to recall a lot of hay being made about the Excite/@Home stuff where the ISPs were supposed to push their portal on unsuspecting customers.
I very distinctly remember when AltaVista removed support for "quoted words and phrases" in queries. I was studying CS in college at the time. I think that's when we switched en masse to Google.
I still miss AltaVista's query language. So much better at narrowing stuff down to _exactly_ what you wanted than Google is, even today. Between Google's "let me guess what you want and ignore search terms" and their paid placements, page 1 of the SERP is useless for technical work. Doubly so if you're researching something obscure...
I still remember when I was first introduced to google in the 5th grade. Our computer class taught us about search engines and how to find information online. They showed us lycos, altavista, dogpile, ask jeeves, etc. and everyone in the class had their favorite site they would use when working on projects.
Within a day of being shown Google, every kid in the class used Google exclusively. It was so much better than its competition at the time.
I loved AltaVista. The thinking had to be done by the user, but if you understood the process, it was a precise and accurate tool. Before it got sold around, of course.
I'd construct searches along the lines of:
(Word OR Word) AND (Word NEAR Word)
And get great results. Of course, the Web is way too big and JavaScript-y for that now.
Being a predecessor is not a failure. Au contraire.
One of Dawkins' memorable lines is "Descendents are common. Ancestors are exceptionally rare"
You could say crocodilians succeeded and dinosaurs failed. A croc is still a croc, but the dinosaurs are hummingbirds and seagulls. If you think about it though, both are ancestors... an exceptional success.
> A croc is still a croc, but the dinosaurs are hummingbirds and seagulls.
Slightly off-topic, but I went to Wikipedia to remind myself of the specific category of dinosaur that birds are descended from, and the 'Today's featured article' was about Achelousaurus, a ceratopsid dinosaur! I think this is the first time I've had such a close match to the thing I was interested in. From there, it was just three clicks to the article I needed [0], which, incidentally, stated that "The present scientific consensus is that birds are a group of maniraptoran theropod dinosaurs that originated during the Mesozoic Era".
I was a loyal mamma.com user until Google came along. I always liked the quality of results and the UI. They were swallowed up by Copernic who appear to still exist in the desktop search space.
I owe most of my career to AltaVista, after comp.lang.*, comp.unix.*, and comp.databases.* -- even more so because AltaVista had indexed Internet newsgroups.
Am I the only one who thinks this article is very poorly written? It doesn't really explain the strategic reasons why AltaVista fell into oblivion, and its structure is all over the place.
I also love Paul Graham's framework for imagining the future and working backwards. If we think like that, Google is nowhere near the form of a final solution to information retrieval. An ideal state would be to retrieve the correct information the first time, with everything you need bundled into the page. If that problem is solved, then you have to tackle the question of why the user was asking the query in the first place, and how your product can help solve the underlying problem so that the query never needs to be repeated!
I ran a build cluster in the server room in the basement where Altavista used to be located. The server room was actually pretty small - just a few rows of racks. We still had a sign in our office that said "Altavista Operations". It's pretty mindblowing just thinking how small internet-scale things were back then compared to now.
I think the first search engine I used was Magellan. Until AltaVista happened. Loved AltaVista, until Hotbot/Inktomi, which was the only one I had trouble letting go of, I think only because of their really clean and minimal UI. Even when Google came along I was one of the few that had a finite number of web sites I used to use that made Google virtually useless to me. The only search engine I use now is DDG.
Marginalising Google back then was as insane as buying into so many of the tech fads since: IoT, Bitcoin, XP/agile, netbooks, 3D TVs (remember those?), and so on.
There is an upside to not having used anything Google - to this day I have zero reliance on any single product of theirs.
1) Google was the only search engine at that time that understood the power of search. Yahoo pushed its search box below the fold. Lycos and AV tried to bring more "content" to the search page.
2) It was blazingly fast.
3) It allowed you to test your current search against competing search engines. After doing that a few times, you realized Google was by far the best and stopped bothering.
In the time of dial-up, having a fast-loading search page probably accelerated the transition. Users factor page-load speed into their preference, even if they are not aware of it. Old-school search engine home pages were criminally crowded. Google provided a frictionless experience.
After the dot-com crash, many of the spiders just stopped indexing new content. The AltaVista index became way out of date. If you weren't already in a Yahoo category, you weren't getting in. DMOZ didn't even have the editors to appoint any new editors... It was less a failure than a mass giving up, as if everyone had left but forgotten to turn the servers off on the way out.
I miss the NEAR keyword. I gather that Google has AROUND(n), or did at some point (one never knows with them), but damn, NEAR was pretty helpful when you had a nicely constructed Boolean that wasn't quite getting you what you needed.
Ahh, that day when I was introduced to search engines & the internet -- they took us kids to the library PCs. The librarian instructed us to open AltaVista and said:
If you want to find Mr. Bean and will search for "Bean", you will find... beans. Type "Mr. Bean".
> As of 1998, it used 20 multi-processor machines using DEC's 64-bit Alpha processor. Together, the back-end machines had 130 GB of RAM and 500 GB of hard disk drive space,
I'm typing this on a machine with 20 threads, 64GB of RAM, and a hair over 12TB of disk.
I remember AltaVista even had a free dialup service. IIRC, it was like NetZero, in that it displayed ads in a window that stayed on top. Of course, I would just find the window handle and set it to invisible.
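For the curious, the trick described above is basically a one-liner against the Win32 API. A hypothetical sketch in Python via ctypes, Windows-only of course; the window title here is a guess, and the ad window's real class/title would have had to be hunted down with something like Spy++:

    import ctypes  # Windows-only; calls the Win32 user32 API

    user32 = ctypes.windll.user32
    SW_HIDE = 0

    # "AltaVista Free Access" is an assumed window title, purely illustrative.
    hwnd = user32.FindWindowW(None, "AltaVista Free Access")
    if hwnd:
        user32.ShowWindow(hwnd, SW_HIDE)  # hide the always-on-top ad window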
Results were crap because pages started doing SEO by stuffing white keywords on a white background. If it were relaunched today, it might be effective. But the whole game is staying on top of this.
DDG provides results almost on par with Google's, yet it feels much less effective. I think, for the single-box, single-keyword-search market, Google is the best we can do, but there might still be room for other search engines.
Also, if you need historical, political or medical information, those are three domains where Google is already out of the game.
There is a lot of room today for a search engine which would only return technical results and not politically or racially motivated results like Google does (Google had a project to promote races other than Whites, and thus started not returning some results depending on the race of the scientist).
Is there a search engine which displays 4-6 thumbnails of the top sites? I feel Google results are just giant walls of text from which you randomly pick one. If you could actually see the relevant text on the website, you wouldn't even have to click?
Interesting, didn't know this was just a tech showcase. And it's always puzzling how management fails to monetize something new and successful. It seems most are only able to copy competitors.
Unrelated, but does anybody remember a site called astalavista.com? It had a lot of script kiddie tools, among them sub7, msn messenger flooders etc. I miss the early internet.
I remember paying for a membership for their "Premium Security Portal" back in 1999/2000. Got in a lot of trouble at school because of the things I learnt on that site. Good times :)
Google Search today is repeating the same mistakes I remember from the late 90s.
For the first decade of the web, there were a handful of search engines competing, rising and falling in popularity. The best were AltaVista and FAST.
One thing that was noticeable back then was that bad search engines (and search engines that 'jumped the shark' and became bad) generally did so in similar ways:
a) they included paid results, or devoted too much real-estate to advertising
b) when they failed to find results, they tried to trick the user by showing related results (eg: omitting or substituting terms)
c) they avoided 'logical and' for search terms, in favor of 'logical or', making it difficult for users to search with precision.
The people at Google surely believe their recent changes have nothing to do with all that. Far as I'm concerned, aside from the extra millions of dollars they've spent on AI research, it's the same old story. Nobody needs a somewhat smarter version of AskJeeves.
Google is a victim of its own success and of the increasing global accessibility of networked communications. If 90% of the American search market were split between 5 companies of roughly equal popularity, the ROI of gaming ranking to trick any one of those implementations would be much lower. There would be fewer people who could make a living by faking relevance/quality signals for junk.
Right now the best paying job many people with unspecialized skills can get is "tricking people into clicking things they shouldn't." Google is sorely taxed trying to keep up with the antics of a million people whose career is trying to game Google. Early Google was better because the people really desperate for money couldn't even afford to get online. That was a glaring inequity that doubled as a crude spam filter [1]. I think about this every time a real live person telephones me on behalf of "Windows Support."
[1] This is a large part of what I miss when I'm pining for the early Web. Practically everyone publishing online then had to be either more affluent than average or cleverer than average to get into Club Web. People contributing on the early Web were almost all financially situated independent of what they were contributing, so Web participation was almost all done out of passion rather than financial desperation. Authors didn't worry about how to get paid for what they wrote online and readers didn't worry about how to support their favorite sites either. People in the club were understood to have other means of sustenance. If you didn't, you wouldn't be in the club in the first place!
First: the problem isn't websites made by poor people. My biggest waste of time is ordinary quality but utterly irrelevant results.
Second: very often the problem isn't like before where websites included valuable search terms in white text on white background but rather that Google includes results that never included the search terms at all.
These are entirely Google's fault, I think, for over-optimizing for quantity instead of quality, not the fault of black-hat SEO.
Yeah, being successful has its problems, as everyone wants to game your search engine, but what does that have to do with the changes they made that make their search almost impossible to refine?
Operators are basically useless -- they may or may not be respected -- and I can't filter results to discussions like I could before, etc.
So basically Google decides what to do with my input; I have no say in what I really want, so I have to deal with lots of crap.
I'm really one of those lazy users, and I find myself using other search engines more and more, just because of these reasons, and despite their "normal" results being worse than Google's.
This is especially bad in Spanish. There aren't many alternatives in Spanish, and it sometimes drives me crazy. To the point that I looked into crawling some sites myself and putting an open-source search solution on top (I'm poor and it's a lot of work, so in the end I didn't).
> Nobody needs a somewhat smarter version of AskJeeves.
That's exactly what I need. It's not right for everything, but asking questions is the natural form of information seeking for a human. Being able to do that well is a huge value add.
Yeah the current state of the tech is (in my opinion) just a voice command line. You have to know the right keywords to say, and in the right order, to get it to do what you want. It's largely an exercise in guess, test, and revise.
Interestingly, it didn’t just boil down to a Quora/StackOverflow model; it wasn’t a “wisdom of crowds” thing. Instead, your question really was used as a search query — but instead of searching a pool of documents, it would search a pool of experts, matching you with an expert who knows about similar things†, then facilitating contact with them (and forwarding them your initial query/question to start off the conversation, like a Helpdesk system.)
† Not sure how they did this part — for academic experts, they could “just” fulltext-index their corpus of published journal papers, to build up a “knowledge fingerprint” of the expert. Not sure what they would do for people in industry without a stream of publications, though.
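One plausible way to build that kind of "knowledge fingerprint" -- purely a guess, not how Aardvark actually worked -- is to turn each expert's corpus into a TF-IDF vector and route a question to the expert whose vector is closest. A minimal sketch, assuming scikit-learn and made-up experts and corpora:

    # Speculative sketch of routing a question to an expert via TF-IDF
    # "fingerprints". Expert names and corpora are invented; Aardvark's real
    # matching algorithm isn't public as far as I know.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    experts = {
        "alice": "inverted index ranking information retrieval crawling",
        "bob":   "kernel scheduling filesystems device drivers interrupts",
    }

    vectorizer = TfidfVectorizer()
    fingerprints = vectorizer.fit_transform(experts.values())  # one row per expert

    def route(question):
        """Return the expert whose fingerprint best matches the question."""
        q = vectorizer.transform([question])
        scores = cosine_similarity(q, fingerprints)[0]
        return list(experts)[scores.argmax()]

    print(route("which index structure suits information retrieval?"))  # -> alice

For industry people without publications, you'd presumably have to substitute some other text about them (self-described topics, answered questions, etc.) as the corpus.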
Sadly, Google bought them, shut down the Aardvark product, and probably just put the engineers on regular SRE code-slinging tasks. It almost seems like Google felt threatened. And — hint hint — nothing’s stopping anyone from building something like this again :)
it blew everything else away pretty much immediately.
Right, I forgot people actually believe that.
Um, they were a little better, maybe noticeably so to one out of a thousand people. But wow, that's not why people switch search engines in droves.
The real reason, aside from their gift for self-promotion (I first heard about them in a science glossy, which was rare for a web 'company'), is that they had a cute, zany name, and didn't do the three things I mentioned.
Google became far more popular because of its spartan design than because of its quality, regardless of how people mythologize the company now.
Notably, at the time, a lot of us who were "power users" tended to stick with AltaVista longer, because Google at first wasn't so much better that it could compete with highly precise use of AltaVista's search operators.
Google was much more noticeably better for non-technical users who didn't know how to improve their results and didn't care to learn, but at the time that didn't translate to a massive numerical advantage in terms of users.
Yep, I also stuck with AltaVista for a couple of years (or maybe Fast? Long time ago) because I didn't find Google's results significantly better. I don't think it was the operators. In the late 00's Google's operators were the +traditional +plus +to +and +terms. They didn't introduce them immediately? Boy, I was pissed when Google extracted that into "verbatim mode".
Google's results were significantly better than other search engines, but only for a while. Soon SEO caught up with it using link farms and other forms of link spam. Then it was basically the same as old engines which were already spammed to death using other methods.
That period of better result quality, combined with other factors you mention, was long enough that most of Google's competition went away. So when the quality went down there was no-one to switch back to.
Do you have evidence to support this position? They certainly weren't the only search engine with a cute, zany name. Trying to recall exactly why I switched. I don't entirely recall, but I remember that at the time I used a rotation of multiple engines, usually starting with MetaCrawler and then branching to other engines that didn't index. After switching to Google, that rotation immediately ended. My recollection is that results really were that much better.
I certainly appreciated Google's design minimalism, but it certainly wasn't make-or-break, seeing as I was willing to go through multiple sites on every search.
I wouldn't have cared about what you termed "tricking the user", which feels like a skewed characterization. I just interpreted such things as "no results", the same way I do now.
That said, I can't really support my position either. I'm curious if anyone has tried to measure the quality of search over time... you always see a lot of opinions about it here and elsewhere, but it seems very much non-obvious to me, and hugely multivariate.
Unfortunately for my argument, my claim isn't founded on much beyond my gut feelings at the time. It would be easier to find articles that invalidate it. Certainly the ancient magazine article I mentioned (was it Discover? SciAm?) was a breathless story about boy-genius graduate students and their breakthrough PageRank algorithm, etc.
The people around me (maybe bloggers, too?) seemed as interested, or more so, in Google's simple "noncommercial" design and its use of the word "googol" as in the accuracy of Google's results.
On the plus side for my take, there's likely evidence that search users didn't consciously value search algorithms like they do now. In the 90s, people were mostly after "total number of pages indexed".
Agreed. The first search I made on Google (~1999?), probably for something tech-related, turned up several porn results on the first page. Back to AltaVista, DMOZ, and webrings for at least another year…
No, it was better because it was much faster and the results were much better. I remember those days; we (me, friends and family) were all using AltaVista or Lycos before. Do you think we all switched to Google in a blink just because of its spartan design?
I'm perfectly willing to check Bing or DDG or anything else I can find when Google insists on misunderstanding my query. Google had exact word match, symbol for symbol; it had reliable logical operators; it had reliable string queries. None of that is nearly as reliable as it was 10-12 years ago, and searching for some mildly controversial topics will get Google practically yelling its opinion at you, rather than just searching for text matches.
> Far as I'm concerned, aside from the extra millions of dollars they've spent on AI research, it's the same old story.
It's interesting because it seems like the UX equivalent of "burning the furniture to heat the house" -- how does this kind of thing become so institutionalized at companies?
Is this merely the natural end stage of the corporate life cycle, where after innovation and growth the now engorged and dying corpse must be parted out and sold by the pound? There's something so uncomfortably Darwinian to me about that. But I suppose that's also why it's common -- it works.
> It's interesting because it seems like the UX equivalent of "burning the furniture to heat the house" -- how does this kind of thing become so institutionalized at companies?
You're in charge of revenue for a division. You give an estimate of $X for the current quarter and $Y for the next quarter; your boss changes your estimate to $1.5X and pushes it up the chain. Now there's two weeks left in the quarter and projections say you'll only reach $1.1X, so your boss pushes you to cram in more ads and make them bigger "just for two weeks", but also reminds you that your revenue target for next quarter is $1.5Y, so maybe you should keep the big ads.
Yes, it would be. But if they charged for it, people would not want to be served ads, and they would need to charge a lot of money for that to make sense for them.