> ...zero-click results have cost Wikipedia’s English language subdomain tens of millions of organic visits.
> ...Google was able to steal over 550 million clicks from Wikipedia in six months...
"Cost"? "Steal"?!
This would make sense if Wikipedia were ad-supported. But Google saves Wikipedia money by requiring fewer servers to support traffic. And Wikipedia is open content; you literally can't steal from it -- being open content was part of its original mission statement!
I personally love it when my search results just give me the answer I'm looking for, so I don't have to click through to Wikipedia (or any site) and wade through a page to try to find it, and maybe it's there or maybe it's not.
The idea that Wikipedia's success ought to be measured in pageviews is deeply misguided. The more its content spreads and is reused across the world, online and offline, the better it is for humanity.
And to be clear, this certainly isn't any kind of "embrace, extend, extinguish" strategy on Google's part. Wikipedia isn't declining or going away. Every time you need to read an actual whole article, you still go there. This is solely about convenience in getting quick facts.
As the article points out, the problem isn't specifically that this harms Wikipedia; it's that this practice harms pretty much every for-profit business that Google does this to.
Pretty much every company in an established space that Google has entered is losing traffic and thus revenue due to this practice: Yelp, any flight/hotel/tourism-based company, video hosting sites, weather data sites; the list goes on and on.
Google is staking its claim to content and data. It's legal, and it provides short-term benefits to users. But in the long term it decreases competition and entrenches Google further, which is bad for users.
Google has also been pushing sites that do on-page SEO right heavily in the SERPs. Which basically means that SERPs are now filled with copycat content that reads like crap but has all the right headings and keyword densities.
The search experience is genuinely bad now. Nothing organic shows up anymore. It's either heavily optimized cookie cutter content, or some big brand name cutting through the clutter on the strength of its domain authority alone.
I've taken to appending "reddit" to my queries just to know what actual people think about an issue
> Google has also been pushing sites that do on-page SEO right heavily in the SERPs.
I can't echo this loud enough - I made great profit by out-ranking heavily established retail outlets for years with super spammy and terrible sites purely with correct on-page SEO and garbage 'keyword optimized' content.
I, as well, now have to append things to searches in order to get real answers. (Or start with a twitter search, depending on the topic)
I often have to specifically search for a Wikipedia article in Google to get the result for it. For example, if I search for a pharmaceutical molecule, I will get pages of "health line" type pages listing side effects and uses.
Five to ten years ago, the Wikipedia article for the molecule was always in the top three.
If it's tech-related or even music, history, or philosophy-related, I'll append site:ycombinator.com to my DDG search queries. Unfortunately, it's not just Google search that's full of SEO-optimised crap.
DDG's actually worse in many ways. I've just switched back from it to Google because most times I was prepending my searches with !g or rerunning them due to frustration at the crappy results.
Example: in the midst of my house renovation my Hisense TV remote has gone walkabout. After several months of having no idea where it is (I figured it would just turn up - I mean, it has to be in here somewhere, right?), I decided to order a replacement. Amazon is full of knock-offs, but turns out Hisense sell remotes directly.
So I type "Hisense" into DDG and it's the top result, but it's the Chinese site, and I can't figure out how to get to the UK site. The UK site isn't even on the first page of results either. I rerun the search with Google and the Hisense UK site is the top result, and I'm able to quickly find replacement remotes.
The point is not that in this case with DDG I needed to execute a more specific search (or, by extension, execute two searches to get what I need). The point is that in almost every case where I start with DDG I have to execute more than one search to find what I want. That second search is often with !g.
That's a shame because, as bad as Google has become, it's still better than other options. As an example of how bad Google really is, last night I tried searching for info on how to mount twin slot shelving uprights on an uneven wall - specifically a wall where the otherwise smooth plaster is less than perfectly vertical (there's an undulation reflecting imperfections in the underlying brickwork). The result? Just pages and pages of SEO spam/"content marketing" on the very basics of fitting twin slot shelving, all of which assumes that your walls are perfectly vertical across their entire surface, and none of which has any kind of troubleshooting hints and tips. I probably tried 10 different search query variants before giving up in disgust.
Google is outright terrible, and its much touted AI is laughably poor[1]. But DuckDuckGo is worse. I'm simply choosing the least bad option, which unfortunately isn't saying much.
[1] Or enragingly poor depending on your mood and perspective.
I had an idea last night for a browser extension that hides SEO chaff from search engine results. I don't have much programming experience so who knows if I'll work on this or not though.
I also append forum (or site:forumname.com) if I'm looking for more in-depth review than reddit gives.
Yep. Whenever google autocompletes to "... reddit" I end up choosing that option, because it foretells that the results are going to be crap if I don't.
At the moment, it's good for users, as with most increases in competition. Google's hotel search and flight search user interfaces are superior to those provided by many of the task-specific businesses.
There is an alternative and IMO more sustainable model for a search engine, specifically for non-organic results (i.e. instant type answers). It is to create connections with each underlying data provider, and based on the search query provide results from their sites in accordance with their display/usage terms.
It also provides the advantage of being able to curate the sources to provide higher quality for niche search topics. [0]
I have been working on a search engine that works this exact way. The challenge is identifying and integrating the data sources for all the potential search intents.
This is a step in a good direction but is hard to scale.
A better approach would be to still extract data from websites like Google does (this works), but automatically attribute a share (50%?) of the search engine's profits proportional to each website's share of appearances in these results. This money would await the webmaster when they eventually verify domain ownership. This is fair to everyone and creates positive feedback loops.
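Here's a minimal sketch of how that proportional split could be computed (the domains, counts, and 50% share are purely illustrative assumptions, not anything any search engine actually does):

    # Sketch: split a fixed share of search profit among source sites,
    # proportional to how often each site's content appeared in answer boxes.
    def attribute_revenue(appearances, profit, publisher_share=0.5):
        total = sum(appearances.values())
        pool = profit * publisher_share  # portion of profit set aside for publishers
        if total == 0:
            return {domain: 0.0 for domain in appearances}
        return {domain: pool * count / total for domain, count in appearances.items()}

    # Hypothetical appearance counts for one month:
    payouts = attribute_revenue(
        {"wikipedia.org": 600_000, "weather.example": 300_000, "lyrics.example": 100_000},
        profit=1_000_000,
    )
    # wikipedia.org -> 300000.0 (60% of the 500k publisher pool)

The hard part, of course, is counting "appearances" honestly and verifying domain ownership before paying anything out.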
Lots of companies reached monopoly status because they were simply better than the competition.
The point of breaking up monopolies is not to make a moral statement about the company, but because we've found that monopolies are generally bad for society.
Is it really though? Bing has been consistently worse than Google over the years. Maybe our definition of vendor lock-in needs to change.
Honestly at this point it doesn't even matter how good Bing is - we've been unconsciously trained to work with Google's algorithm in particular and they just have a de facto monopoly on the mental process a person goes through to formulate a search. Everyone's workflow everywhere will be worse and take more time if they voluntarily stop using Google, that's not what I consider a fair competitive landscape.
I don't think Google claims any copyright on the Wikipedia snippets. You can share them just like you can share the original. They also give a link back to the source and clearly mark them as Wikipedia content. What more do you think that they should do to satisfy the license terms?
Arguably the entire generated page is a derived work, in which case the whole page should be CC-BY-SA not merely the snippet.
On the other hand, arguably incorporation of the snippet (as the incorporation of context for other pages) is permitted in some way other than licensing (fair use?), in which case the terms of the license wouldn't apply.
I would have to think harder to have a well-formed personal opinion.
Maybe you should familiarize yourself with the creative commons licenses. They allow all kinds of use, as long as (in the case of BY-SA) you mention the original author and share derivative works under the same license.
Fair Use is not a concept in most jurisdictions btw.
Exactly, take "Fastsecret" for example, if you go to the web site and look up calories for something you often get the result in a one box from this site. However if you go to the site you see that they are really trying to get app conversions for their app. As conversions are typically a percentage of page visits, all of those page visits are now "toast" so to speak.
If the problem is not that Google is (or isn't) stealing Wikipedia's clicks, but instead that it is stealing the clicks, and more accurately the content, of other, for-profit sites, then why go with that headline? Because it's good clickbait.
I’m pretty sympathetic to the plights of Weather.com, Genius.com, Zagats (oops, Google bought them and then resold them), Yelp, and whomever else Google is appropriating non-public domain and non-Creative Commons content from. But that’s a story that’s already been told, so the site decided to find a heart-string tugging yet intellectually-dishonest angle.
In the long term, there won't be free content for Google to scrape and display in search result knowledge graphs, as ad-supported / free content businesses shut down. Of course, Google can create its own content and does so to feed the knowledge graphs, but it's way more expensive to do so. And the math might not be worth it for Google to manufacture all this content from scratch.
So then folks won't use google, but will they go direct to Yelp 2030? Will there be Yelp 2030?
Google's ad network provides a lot of support to these "free" content options. Google is not bad at creating its own content either; the amount of data they can slurp up from phones running Google Maps, for example, is pretty darn high.
Our local govt has traffic sensors for the highways - huge bucks to install / maintain etc. I can get more granular, more current traffic data from Google for free. If you are going to compete with Google in a data management / collection business, you are going to be playing against a firm with a pretty deep ability to ingest and process data.
How about if you want the information in a search engine, but you would prefer that the "search engine" didn't just nick the contents and display it on their site, rather than sending searchers to yours?
I was specifically asking because I’d never heard of a robots.txt solution for blocking snippets. None of the responses have included that information.
Why don't you try being a little kinder, especially when you don't have all the facts?
It's not as though web design like knowledge of meta tags and robots.txt is the sum total of everything to know about computers. Maybe they've built their own 8-bitter, or are a motherboard PCB designer? I'd argue that'd put them firmly in the 'computer guru' class. But we don't know, so it's probably better to say nothing at all.
Hey, thanks. I've done all those things but tbh the name predates all that. I was an op in a tech support irc channel in the 90s/00s and that was the name I'd picked at the time (I was not comfortable with tying my ID to myself IRL but perhaps even moreso, I was also very sensitive about being pretty much the only minority on all the user groups and mailing lists I frequented. The "guru" half of the moniker was somewhat an homage to being a minority, although I'm not Indian).
I messaged the HN mods many years ago (either pg or dang, can't remember) when things had changed and I had decided to just use my real name online but was told that it would not be possible to change it for posterity purposes.
In a sense, yes, the engine is too good. If you continue this trend, eventually nobody will have a reason to leave Google because anytime you search for anything Google will have the answer to everything.
Sure, it's convenient but it's also a bit frightening for one company to have that much power and I think that's what people are a little hesitant about.
Google should help the users choose whether to visit the site in question or not, but not flat-out remove the need as everything is directly presented in Google. Often the sentences Google shows in the card excerpts are lacking in context, but you wouldn't know without visiting the site.
This would be comparable to writing a scientific paper while only reading the headlines of each source you're using.
I'm not sure if you're being dense on purpose. They are not just excerpts. They are taking a full paragraph of content. We aren't talking about the meta description here. It's taking so much content that you don't need to click anymore. Google is now the encyclopedia, with content it did not create.
Then allow it only to search engines with which you have an agreement, or that have at least made a public commitment, as to how they will present your information in a way that aligns with your preferences. You don't have to use “User-agent: *” for your rules.
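For example, a rough sketch of what such a robots.txt could look like ("FriendlyBot" is a made-up crawler standing in for a search engine you actually have an agreement with):

    # Block Google's crawler from article pages
    User-agent: Googlebot
    Disallow: /wiki/

    # Allow the partner crawler everywhere
    User-agent: FriendlyBot
    Allow: /

    # Everyone else: nothing
    User-agent: *
    Disallow: /

Whether that's a good trade-off is another question, since for most sites being invisible to Googlebot costs far more than the snippets do.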
Reducing traffic to Wikipedia reduces donations, contributions, brand awareness.
Grabbing Wikipedia's added value is like stealing candies from a child.
Wikipedia does not need any donations. If the Wikimedia Foundation puts its cash reserves into safe low-yield investments, it'll be able to run Wikipedia for decades. Of course, that won't pay the salaries of the current WMF employees, but Wikipedia only needs a few of them.
I have to object to this. Wikimedia as of their audited 2019 financial statements (https://upload.wikimedia.org/wikipedia/foundation/3/31/Wikim...) already has ~$53m in short term investments, leaving ~$101m in "cash and cash equivalents", which notably often includes Treasury bonds, CDs, etc, anyways. So they are likely already investing all of their cash into safe, low-yield investments already.
Even if they weren't, the ~$154m they have free for investments, would only gross ~$1.2m in 10 year Treasuries, or at most $5-6m in relatively higher-yield investment grade corporate bonds.
Looking at their expenses in 2019, even if you cut donation processing expenses to $0, cut awards and grants to $0, cut travel and conferences to $0, strip out depreciation and amortization, and cut their $46m payroll in half to $23m (which at a fully loaded cost of $200k/employee, is only 115 employees for one of the most widely used websites in the world), you would still be looking at annual operating expenses of ~$45m.
There is absolutely no way that Wikipedia/the Wikimedia Foundation could survive without any donations - they have $154m between cash and investments, and would need to support $45m in annual operating expenses, even with these fantastically high budget cuts. That means they'd need to net, after taxes, almost 30% return on their assets every year, just to tread water, and only after cutting their budget essentially in half.
Wikipedia has 51.1 million articles in 291 languages, is something like the 5th most visited website in the US, 10th in the world, and they manage to do this without running any ads. Can't we be thankful for the incredible human library of knowledge they've built, and chip in a few dollars if we are able, instead of complaining and telling them they should not accept donations?
On the flip side, the website hosting only costs $2.3 million annually [0] and the content creators work for free. Even if the Wikimedia Foundation could not survive like that, a lean Wikipedia operation clearly could.
I disagree. There is no such thing as a "lean" site that is as popular as Wikipedia (top 5-10 US depending on the day). I believe they are likely the leanest website in the Alexa top 30. The only site that I see that is leaner in the top rankings is Craigslist, currently at #38 in America, and Craigslist is a famously, fantastically, uniquely lean site (in some people's opinions, even to a fault).
Hosting costs for Wikipedia are "only" $2.3m, but imagine the legal expense they have, fighting likely millions of takedown requests, malicious lawsuits, complying with local laws and regulations in China, Russia, etc, paying a team of top software engineers to handle running a top-visited site (that by the way, has almost no downtime), a security threat model that includes nation-state actors, and the responsibility that if they fail (by being hacked, or sued, or DDOSd, or whatever), they will have let humanity's library be harmed.
I just can't understand how any casual bystander can complain about Wikipedia or their funding/organizational model. If you don't like it, just don't donate money.
> but imagine the legal expense they have, fighting likely millions of takedown requests, malicious lawsuits, complying with local laws and regulations in China, Russia
Wikimedia Foundation is not fighting any takedown requests. Wikimedia Foundation had exactly ten DMCA requests last year[1]. That’s a bit short of “millions” you believe it does. It does not comply with any local regulations in China or Russia, or elsewhere, because it does not operate in China or Russia (or really anywhere outside US, except caching servers in NL and Singapore). Wikipedia is actually completely blocked in China, and the fact that you did not know that signals your utter unfamiliarity with the realities of Wikipedia and Wikimedia Foundation.
> paying a team of top software engineers to handle running a top-visited site (that by the way, has almost no downtime),
It was already a top visited site when its development team consisted of a single person, Brion Vibber. There has been very little significant development since then, and if there had been zero development since then, you'd probably not even notice.
Now, to be sure, at this scale it needs some full-time, round-the-clock reliability engineers, but you can easily see that their headcount keeps growing while their site and infrastructure remain mostly unchanged.
> It was already a top visited site when its development team consisted of a single person, Brion Vibber. There has been very little significant development since then […].
The second member of the development team was hired in 2006 [0].
In these ~14 years, quite a few things happened.
Three projects joined the Wikimedia galaxy: Wikiversity in 2006, Wikivoyage adopted in 2012, and frickin’ Wikidata [1] in 2012 − which has deeply reshaped many aspects of the other projects, particularly Wikipedias and Commons.
On the multimedia side of things, we got InstantCommons in 2008 [3], thumbnailing infrastructure changes in 2013, various file format support (TIFF in 2010, FLAC and WAV in 2013, WebM, 3D formats in 2018 [4]) new upload wizard in 2011 [5]. The Graph extension [6] and Wikimedia Maps [7] in 2015. Structured Data on Commons in 2019 [8]. New default skin (Vector) in 2010 [9]. Unified login in 2008 [10]. 2013 brought OAuth [11], Echo notifications [12], Lua scripting [13], VisualEditor [14]. iOS and Android apps [15]. The Wikimedia Cloud Services starting 2012 [2].
(And in terms of size: article count went from ~5M to ~50M [16] ; Commons went from 1M files to 50M files [17])
And that’s just what I’m putting together in a few minutes (Besides my own memory, I’m indebted to [18], a curated timeline up until 2013).
Of course, these may or may not justify the staff size in your book; but I'd say discounting all of that (and the rest) as “very little significant development” is pushing it a bit. :-)
(And I'm fairly sure that “you'd probably notice” if Wikipedia was still using good old Monobook skin ;-þ).
>Wikipedia has 51.1 million articles in 291 languages, is something like the 5th most visited website in the US, 10th in the world, and they manage to do this without running any ads. Can't we be thankful for the incredible human library of knowledge they've built, and chip in a few dollars if we are able, instead of complaining and telling them they should not accept donations?
But the donations don't go to the people who wrote those articles!
I'm personally very thankful for the library of knowledge that is Wikipedia. For a lot of searches on technical matters I usually bypass Google and go directly to Wikipedia. I find that you can often use the references and links to bypass the Ad infested web to some extent.
The parent poster is correct. WMF spends about US$2m on hosting annually (out of $100m+ in annual donations), and burns almost $90m per year, as if it were some kind of VC-funded startup.[1][2] Spending $46m on salaries is ridiculous when almost all of the work is done by unpaid volunteers.[2]
"Volunteers also contribute in several ways to the Foundation’s wiki software: volunteer software developers add new functionality to the code base, and volunteer language specialists add to the code base by translating the wiki interface into different languages. During the year ended June30, 2019, there were 48,361 commits merged, through the efforts of approximately 447 authors/contributors, of which 9,158 commits were through the efforts of approximately 258 volunteers."
Only if Google somehow disproportionately captured would-be donors. Otherwise, assuming linear scaling, this changes nothing - and since things on the web tend to grow costs sublinearly, it technically puts Wikipedia ahead in donations/costs.
I personally don't like the short Wikipedia excerpts.
I almost always want more information, and Google tries to make the Wikipedia link as unintuitive as possible in order to keep the user on Google property.
What's more, showing the info box means that the real article link is removed from the results list.
Edit: to clarify, I do find the blue Wikipedia link decidedly small and I often have to actively look for it instead of it being an intuitive click and clear on first sight, especially on mobile.
WRT the results, they do indeed seem to appear now, but often not in the first or second position. (I do remember that not being the case previously, but I admittedly might be mistaken.)
> Google tries to make the Wikipedia link as undiscoverable as possible
Huh?
If I search for "who founded the new york times", then immediately below the three-line snippet answer, is a big (not small) blue link to "The New York Times - Wikipedia". In fact, it's literally the easiest, most prominent thing on the page for me to click on. Google is helping to guide me on to visit Wikipedia for more info.
I think you're confusing the short excerpts (which have a clear, obvious Wikipedia link) with Google's Knowledge Graph results, which are a different thing and are based on many different sources.
I think they're talking about the Knowledge Graph, assuming that's the content in the right column. If so, they're right. The discoverability of Wikipedia in that thing is poor, and it's aggravating when they bury the one link I want inside text content so it's a moving target on the page in a small font.
Fascinating, quoting the query changes results. Pipes represent the search field:
__________________________
|who founded the new york times|
__________________________
> Search Results
> The New York Times/Founder
> Henry Jarvis Raymond [link to google query |Henry Jarvis Raymond|]
> The New York Times was founded as the New-York Daily Times on September 18, 1851. Founded by journalist and politician Henry Jarvis Raymond and former banker George Jones, the Times was initially published by Raymond, Jones & Company.
> I almost always want more information, and Google tries to make the Wikipedia link as unintuitive as possible in order to keep the user on Google property.
I have exactly the same problem, so I hate those Wikipedia boxes. Half of the time I go to a Wiki page for some term and then switch to another language (either because I searched an unknown word or because I'm looking for a translation), so the box doesn't solve my need. The other half of the time I want more info than what is displayed anyway. And I also find the position of the link super confusing; I have to search for it each and every time.
That’s just false, showing the info box does not remove the link from itemized search results. If a Wikipedia link is nowhere to be found, that’s indication that the info box isn’t info from Wikipedia in the first place (e.g. movies, certain persons). Same goes for answer snippets.
I wonder if there are multiple UIs in play. I just went and searched for some examples and the box I'm seeing today is fine, but I swear I've had a box in the past where the Wikipedia link was small, buried, and hard to find. When I say hard, I mean it takes a few seconds, but the idea is I have to go hunting for it while Google tries to entice me to click something else which isn't what I want.
> I personally love it when my search results just give me the answer I'm looking for
This is one of the things I love about google as a user. Googling "what is my ip" or "how many inches in a meter" used to require you to visit some random website filled to the brim with ads.
This was the social contract of the web. You give Google free content, they give you back money through AdSense. Google broke it and got overly greedy.
If enough websites band together (incl. Wikipedia), they could boycott Google and force it to pay them per search click or something. It's only fair. Google is exploiting a system to its own advantage and to the detriment of others. This is no longer win-win.
Awesome. Now try it at a client's off-site location with hardware and terminal locked down by someone else's IT department while you're trying to troubleshoot their connection.
Sure, it's a good command to know, but completely apropos of nothing. A command line alias is not at all an alternative to typing "what's my IP" into google, something that even my mom could do.
You don't have to see ads to not want to visit places that inundate others with them. Also, having an ad blocker should not give you the false impression that you aren't seeing ads on the web.
Yes, I do because I prefer compensating content creators, because otherwise we will not have many of those. If some website has shitty ads (blinking banners, auto-playing videos, malware etc), I just avoid those sites altogether.
Since you seem to be someone who doesn't see ads on the web, I am curious to know how you pay for the content you consume? I have come across very few non-ads options which work well. For me, subscription fatigue sets in across various video-on-demand sites and various high-quality newspapers. But for a lot of my family / friends across the globe, subscription charges are high enough that they would rather see ads.
I try my best to contribute to things I care about. I'll send the occasional donation to the Guardian, ETF, a Patreon for a youtuber, Wikipedia, or whatever. I do have a couple of subscriptions, like Spotify or the Economist, but otherwise I try to stick to donations whenever I feel like I have money to burn.
That being said, I live with the fact that certain content creators get nothing from me, I don't feel great about it but I would never consume their content if I had to go through ads, either.
It's funny to note that many people here complain about Google preventing page views, but have no restraint in doing the same by blocking ads. Pot, meet kettle...
Google is a search monopoly. Republishing search results to keep users on Google (rather than the publisher) is literally stealing content and the pageviews the content generated.
As a user, blocking programmatic ads (such as those served by ad marketplaces) is the only way to browse the web safely.
Websites are free to use affiliate links, subscriptions, donations, non-marketplace ads, or any other safe monetization strategy. I will not apologize for blocking the unsafe monetization strategies.
At some point in an encyclopedia's lifetime, it's "complete enough" that you'll see an excess of authors. "Compiling the world's information" and, past that, "compiling current events as they happen" (with a side of "occasionally improving/updating old articles") need very different numbers of authors.
What I'm trying to communicate is that Wikimedia has challenges, possibly exacerbated by Google's behaviour, but the challenges are not money. Wikimedia is not a for-profit corporation, it's a non-profit with a mission to collect and distribute knowledge. Google distributes pieces of that knowledge, but does not contribute to the collection of new knowledge.¹ Wikimedia has all the money it needs, and is getting more money from more people all the time. It does not have all the contributions it needs in the form of new knowledge, and by many measures the new knowledge it gets is decreasing.
tl;dr
The monetary cost does not really matter
1. It does contribute money, which Wikimedia does not need.
The decreasing number of edits likely has more to do with the obnoxious edit policies and power users who will stamp down on any new users trying to edit than with slightly fewer pageviews directed from Google search results.
I'm pretty confident only a small fraction of people edit anyway, and those who would edit only need a few encounters to become editors.
This is based off the fact that I frequently see:
- "why does X not have a Wikipedia article?"
- "this Wikipedia article is so poorly written"
And very rarely actually see someone edit it. On occasion thousands of people have agreed and hundreds of people have written comments agreeing and only two people edited.
So maybe it does some damage at the margin but overall probably doesn't hurt that much.
The Wikimedia Foundation has a mission of, in part, disseminating information. Its mission is NOT "get as much donation money as possible." If the loss in donations allows their mission to be carried out more efficiently, then that's a net gain.
This is just a smokescreen. If we assume, very conservatively, that Wikipedia content outright satisfies 1% of Google searches (through info boxes, the Wikipedia widget, etc.), that would be worth 1% of the $100bn+ of revenue Google generates from search = $1bn. Even if the fair thing to do were to split that revenue, Wikipedia should still get $500mn in that arrangement. So a $2mn donation is just a smokescreen. Google is extracting extraordinary value from Wikipedia, and just because the content is 'free' doesn't mean Google shouldn't give back fairly.
Up until now, an edit was one click away. Now, it's not only one more click away, but Wikipedia has no way to communicate that editing the result is possible at all.
Do you refer to the markup jungle, or the VisualEditor? Because from the outside, the new editor looks fantastic, and honestly makes me want to self-host MediaWiki again.
Not to justify wasting money, but the questionable UX features are a relatively small part of the massive cash fire that is the WMF.
Also, while there have been some large and costly failures (Visual Editor), UX features are one of the few areas where I think that the WMF could realistically advance their purpose: increasing engagement with the encyclopedia. Features like Page Previews can be very helpful. There are a number of possible UX tweaks to improve reading and editing the encyclopedia, and these could all be helpful to the long term health of the encyclopedia (assuming they don't cut into the budget for servers).
I'm honestly glad to hear that. The main point of the Visual Editor was to make editing Wikipedia easier and more inclusive. The research suggests that it did not really succeed in achieving these goals. This result is a little surprising to me, but there are probably bigger barriers to editing wikipedia than a markup language.
For some people the visual editor is great, and I think that as a long term investment it was a good idea (the benefit could have been large) even though it did not pan out. The problem is that the cost and time spent creating it was very high, and the WMF foundation has a lousy track record executing these projects effectively.
VisualEditor is not worth it. It's being touted as a massive improvement, but which editors is it really catering to? Most valuable content in Wikipedia is written by a minority! By spending too much money on VisualEditor, you're optimizing for people who are barely adding value anyway! The money could have been spent on an interactive markup tutorial instead. That would be way cheaper, because it has a linear flow. It would also have been easier because it isn't as performance-critical, as it doesn't need to load millions of times per day, only as part of the onboarding process.
If you compare Wikimedia's spending 5 years ago to what it is now, it has ballooned in an excessive way when you put it in context. Wikipedia isn't providing double the value of what it was 5 years ago.
Surely it's "worth it" if you want the site to be usable by novice users. This is about optimizing for reach and intellectual diversity, not just "people who are currently adding value".
> Wikipedia isn't providing double the value of what it was 5 years ago.
Wikimedia supports other projects besides Wikipedia itself, and the value it provides has not just doubled but plausibly grown by an order of magnitude compared to its early days. Wikimedia Commons and Wikidata are hugely beneficial to the Internet community, and Wikivoyage is not far behind.
> Most valuable content in Wikipedia is written by a minority! By spending too much money on VisualEditor, you're optimizing for people who are barely adding value anyway!
Maybe this wouldn't be the case if pages were less complicated to edit?
I agree it's not stealing (the mission is to collect and disseminate knowledge, not to drive page hits), but the main workflow for getting new users is somewhat disrupted by more queries being answered on Google.
> I personally love it when my search results just give me the answer I'm looking for, so I don't have to click through to Wikipedia (or any site)
Except for the very frequent times when Google’s summarization engine conveys the opposite point than the text was making.
I’ve had lots of cases where google reassures me that something is possible (like an iOS feature) because it takes a Yes from paragraph 2 and then procedure from paragraph 10. Click the text and it starts with “You can’t do this but there is this other thing you can do”
It's both Google's fault and the fault of SEO-spam answer stuffing, where sites put multiple answers on one page.
I have one from yesterday where I tried to figure out if I could connect Files on my iOS to a webdav server.
According to Google that seemed possible.
While I don't use Google much anymore (except for the last few days, to figure out if the improved quality I've seen and heard about is real), I have enough experience from the last decade to not immediately believe them.
In reality, Roku uses a similar and half-compatible implementation of the Chromecast standard, but it isn't supported in all the same places you can Chromecast, and you'll frequently experience hiccups like the Roku only streaming in SD or 720p instead of 1080p or 4k.
This is partially the fault of the site sharing this information, as they're not presenting this in the most clear way either. However, when Google chooses to try to automatically answer my question, they become responsible for that answer, and when that answer is misleading, it annoys me.
This line of thinking only works in the case of a non profit like Wikipedia. What about the countless other for profit sites that are taking a hit because of AMP?
This is really quite short-sighted. Sure, customers like this in the short term. But this hurts the website owner.
e.g. 1: You are a restaurant or retailer and your potential customers search Google for your hours of operation or phone number. Google shows them the answer so they never come to your site. So they don't see that you have a free delivery offer during the Covid crisis. And you can't market to them via any campaign that retargets visitors to your site.
e.g. 2: You are a publisher that makes money from ads, like say CelebrityNetWorth [1]. Google steals your content and you never get the traffic you'd like to monetize. This is HN so there will be scorn for the ad-fueled business model. But they serve a need. It's one thing for the market to punish ad-monetized sites, it's quite another for Google to steal from them.
e.g. 3: You are Wikipedia. While your content is free, you rely on community contributions to grow. If someone never visits your site, they don't learn about your mission, don't learn that they can contribute. Your corpus stagnates.
The only reason Google gets away with this is because they are stealing pennies from millions of people rather than millions from a single entity. Indie content creators do not have the resources to fight Google on this, because G offers them an all-or-nothing option[2]: you can either be in Google search results or not. And no one can afford not to be.
Google has been so emboldened that they now change the content and user experience _on the website_.[3] Consumers click through to your site and Google will scroll them to a specific section, and highlight that in yellow. Not only does Google control who gets to your site via the search monopoly, but they steal your content, and control how people experience _your_ site.
Do you think that Wikipedia doesn't want traffic? When people land on Wikipedia it creates brand recognition and the opportunity for them to ask for support from those visitors, which is what keeps Wikipedia going. Some of those users will edit pages while they are there.
What Google is doing is terrible, especially for smaller sites. Those sites depend on traffic to survive.
AMP makes things even worse, because now the visitors never actually go to the website's own independent servers, even if "the content" loads, and Google dictates how the sites have to be built. Web publishers are in the process of losing control of their websites and independence.
Wikipedia is probably not the best site to use for that argument, for the reasons you listed. But the article also mentions weather.com, which is ad-supported, and zero-click answers can come from other ad-supported sites as well.
> But Google saves Wikipedia money by requiring fewer servers to support traffic.
It would be nice if this were a two-party agreement. Google is forcibly changing, or dictating, any potential business-model changes Wikipedia might want to make.
Wikipedia is also losing editors and fact-checkers. It also degrades its image: why would I contribute my time so that Google can exploit it to become richer?
Do you remember when Amazon had its own Wikipedia redirector? It was something like www.amazon.com/wikipedia/<rest-of-url>, and any book that appeared as an ISBN would be clickable through to the book page on Amazon for purchase.
Google is selling ads against free content created in Wikipedia, and Wikipedia sees none of that. That isn't fair to Wikipedia and it would be catastrophic for many for-profit websites.
I've definitely noticed that I've had to add "wiki" to my search queries to see Wikipedia articles for certain subjects I'm searching for, whereas in previous years Wikipedia was almost always the number one result.
I feel like I'm back in the 90s when Yahoo went from a pretty good search engine to mediocre with ads and stuff, and I marveled at the clean simplicity of Google. Now, I'm finding Google shows a bunch of articles from dubious sources, whereas DDG will pull Wikipedia articles closer to the top.
I've been noticing it for a while, the search quality gets worse and worse every year. For me, the red line was the Panda update in 2011. That update completely destroyed the search quality by bringing dubious quality news articles and ecommerce websites on the top of search, it never really recovered since then.
That's exactly what I've been doing almost unconsciously, I'm adding "wiki" or "reddit" at the end of every query except for programming ones.
The only time I'm using the search as intended is for programming error queries, where it's way too niche for Google to surface low-quality news farms / ecommerce websites. That's the only type of query where I still get good results.
Even programming questions. If it's something considered popular, you get awful results. Take javascript, I always append "mdn" when I'm trying to look up a language/api detail. Otherwise I'd be sifting through the top ten garbage Q&A or tutorial sites. In the glory days, Page Rank would return reference material first since it was so heavily referenced. But clearly that no longer works in the real world of SEO optimization.
Finding good Angular content is damn near impossible on Google, with the immense amount of regurgitated blog spam trying to sell templates via blog posts disguised as guides/tutorials.
I loved how Google used to find high quality discussions on all sorts of niche topics (tech, gardening, science, diy...) on sites like MetaFilter and Reddit.
Now those queries seem to lead to mostly videos, low quality sites, and shopping sites.
I love how people totally ignore the millions of daily users based on their personal experience. If it was essentially unusable, there wouldn't be millions of people on it.
True. But you can still use old.reddit.com. The content is still there, it's just not as accessible. So my process has been to search for 'keyword + reddit', click the best search result, then manually type 'old' in place of 'www' in the url.
That sounds tedious. You should at least be able to skip that last step with one of the various browser extensions to redirect to old.reddit.com automatically, e.g. this thing for Chrome:
It is kind of tedious. Thanks for the link to the extension. I have used that extension at times, but in my case there are gaps where I'm either at work, on a different browser or device, where this has sometimes come up. But I agree, an extension is best.
Pinterest is a horror, there are Google searches where half the results visible at first are Pinterest links with various TLDs (pinterest.com, .fr, .de etc.)
If you have Firefox I would suggest that you go on Wikipedia.org, right-click the search box, click "Add keyword to this search", and pick a keyword. This makes it faster to find information that you know is on Wikipedia.
Chrome has had custom search engine keywords since day one (yes, I actually went to locate a day-one article on this feature[1]). Not sure why every time this topic comes up someone has to mention it as if it’s some sort of Firefox secret sauce, complete with weird replies like if you are “stuck on Chromium” you can emulate this with a pointless round trip to DuckDuckGo.
> Not sure why every time this topic comes up someone has to mention it as if it’s some sort of Firefox secret sauce
If you're not sure, I can clarify that for you. I was responding to someone talking about DDG, and I was showing a better way of doing it.
I didn't know that it's possible with Chrome because I don't use it.
In this case I can totally believe it’s innocuous. But as I said, it happens a lot. It happens on Firefox evangelism threads, Chrome bashing threads, DDG evangelism threads, Google bashing threads, etc. where a direct comparison is made. Imagine reading “hey C++ programmers, check out Rust, we have this amazing feature called zero cost abstractions!” over and over. Just really tiring.
Yeah, thank god some people still learn their tools. When I was younger I thought one of the defining characteristics of "techies" is that we open settings/preferences (followed by the advanced version) as soon as we get our hands on a piece of software. Unfortunately it seems the overall techie population is just getting lazier and lazier (not lazy in a good way), and now the norm is, or is sliding towards open the box -> whine about things not working out-of-box when the feature is right there -> find suboptimal workarounds on StackOverflow or random blogs posted by lazy people.
The fact that I like configuration is the reason that I ditched Chrome at home and moved from it to Firefox to Waterfox to Palemoon, where I can use Pentadactyl to customize far beyond the limits of any browser I'd used before.
At work I have to use Chrome (and worse, Chrome OS), and I used the settings to disable syncing between the devices I switch between (because I don't want Google to have any extra excuses to handle my data). Some of them I only use for an hour, so I optimize my customizations to be made quickly. I switch the default search engine to get thousands of keywords in seconds, then load AdNauseum for ads, then depending on how long and what I'll use the browser for, I'll get Surfingkeys or a similar extension for general vim-bindings (I've switched between more of these than I have fingers on one hand because they're all inferior to Pentadactyl in somewhat different ways) and wasavi for more extensive vi-like bindings in text fields.
> weird replies like if you are “stuck on Chromium”
My bad. I saw the parent and forgot I'd seen keywords on Chromium. To be fair, Chrome buries this deeper than Firefox, whose payment for pushing Google is presumably smaller (and IIRC has a few built in). It's enough of a hassle to set up that I personally used bangs (there are thousands of them already set up for you) when I was stuck on Chrome on a variety of unsynced devices daily at work before the nCov.
This, but also, when you're stuck on Chromium, change the default search to DuckDuckGo or Searx and use !w or !wp to search Wikipedia (and !s !sp if you want Google results for some reason, and !g or !go if you want Google results with a filter bubble).
yah, keywords are such a great but probably underutilized feature of firefox. i set up 'wiki' and a bunch of others when i started using firefox over a decade ago. i still use most today, although i've pointedly replaced google with duckduckgo ('g') long ago as well.
The default is the whole domain (and it also works with other sites, like "amazon.com query"), but you can also change the keyword for it. For several years I've had "wp" as the keyword for wikipedia searches, and it works in both Firefox and Chrome. I just type "wp whatever my query is", no tab needed, and it will take me directly to the result.
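Under the hood it's just a URL template with a %s placeholder that gets replaced by whatever you type after the keyword; a Wikipedia entry might look something like this (the exact menus differ between browsers):

    Keyword: wp
    URL:     https://en.wikipedia.org/wiki/Special:Search?search=%s

Typing "wp faux pas" then jumps straight to Wikipedia's own search, skipping the general-purpose engine entirely.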
> Now, I'm finding Google shows a bunch of articles from dubious sources, whereas DDG will pull Wikipedia articles closer to the top.
In my (limited) experience, I dislike using DDG because it gets confused by less important words in the search query. For example, for "who coined the term faux pas" Google simply gives a bunch of links to webpages that define and elucidate the term "faux pas". DDG, however, gives a wide variety of results, many being totally irrelevant. The first article is on parapraxis, the second is the wiki article for microaggression (?).
http://archive.is/jv0qd (notice how DDG bolds the phrase "coined the term" in the first link, thinking that this is the relevant part of the query).
It's stuff like this that will prevent DDG from catching on with the general population who have been spoiled by Google.
Have any HNers used google without ad blocking lately? It’s sort of insane.
Even with it, I have a huge variety of workarounds I use to find anything remotely valuable. Usually adding reddit, wiki, HN, SO, examine, and all sorts of other specificity filters.
If you’re shopping, looking at health issues, comparing things, it’s worthless.
If you’re looking for anything scientific it’s worse than worthless, it often links to a full page of pop sci articles that are just... wrong. Google scholar of course works well.
If you’re searching for news it’s basically entirely mainstream, entirely based on the last news cycle, and entirely homogenous.
And of course the Wikipedia links have gotten harder to click. Keyboard nav still purposely is weird. AMP pages break UX.
It’s funny because if I didn’t know so many tips and tricks I’d basically not “know” anything. I’d buy poor products at high prices, I’d believe the latest pop science, I’d only know one or maybe two mainstream opinions on news, etc.
That the world's number one information-finding service seems to have rolled over to a variety of bad incentives is a bit horrifying.
Yep, Google is a dumpster fire. DDG is my main search engine on mobile and I have not missed Google. However, it's still far from optimal. We need a user-driven search engine! No more of this bullshit centralized search and censorship; we need to flip the script on who gets to decide what kind of filters we want to see. Fuck Google.
Absolutely; native filtering support to exclude the domains of my choosing would be the killer feature for me that would make me switch to basically any search engine. I'll even sign up for an account there, and all the possible privacy/query logging that entails, to maintain the blacklist state. Browser userscript addons are just too clunky, screw up the number of results per page and don't work on image searches.
Let me block tabloids, hate sites and click farms natively and I'm sold.
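In the meantime, the closest thing I've found is stacking exclusion operators onto the query itself, which at least Google understands but which you have to repeat every single time:

    best cast iron skillet -site:pinterest.com -site:pinterest.co.uk -site:pinterest.de

A native, account-level version of that (a persistent domain blacklist) is exactly the feature I'd switch for.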
> Searx is a free internet metasearch engine which aggregates results from more than 70 search services. Users are neither tracked nor profiled.
> Additionally, searx can be used over Tor for online anonymity.
> Get started with searx by using one of the Searx-instances. If you don’t trust anyone, you can set up your own, see Installation.
Google News has also been pretty much totally broken for a while. You have to look very hard for a way to sort by date - by default it somehow thinks articles from 2018/2019 are what I'm after - and it often doesn't return articles that exist, come from normal newspapers, and can be found with the same query in regular Google search. There are also no useful facets left, and sometimes there is some sort of amnesia for everything older than 2017. Not sure why they are doing it (maybe it's only broken in Germany due to the even more braindead "Leistungsschutzrecht" law), but something really feels off.
Shouldn't Wikimedia Foundation be grateful for that? Their goal is met — people learn stuff even faster, and also they incur less server costs, because Google eats them.
For a site without ad revenue, it looks like a total win-win for them!
People perceive that they can get the information for free from Google, so they're a lot less likely to donate to Wikipedia even though that's where Google gets the information from.
Imagine a future where Wikipedia finally runs out of money after trying bigger and bigger donation banners and has to stop providing the service. As users expect to see the sidebar in Google's search results pages Google would hoover up all the data and bring control of it in-house, and we would only see whichever facts Google chooses for us. I'm not sure that would be a good thing.
Google, as well as many other companies, has long relied on Wikipedia for its content. Now, Google and Google.org are giving back.
Google.org President Jacquelline Fuller today announced a $2 million contribution to the Wikimedia Endowment. An additional $1.1 million donation went to the Wikimedia Foundation, courtesy of a campaign where Google employees decided where to direct Google’s donation dollars.
Wikipedia spends more than that on "Donation processing expenses" alone. Of course it is a nice contribution but, in my opinion, very small compared to the huge value that Google is able to get for free.
https://wikimediafoundation.org/about/2018-annual-report/fin...
I was going to be even more cynical and say that 3 million sounds absolutely pathetic relative to the scale of both entities, and given how important it is to the quality of Google’s search results.
To add to my own comment, this [1] was the submission that I originally had in mind. It was in the 2007-2008 operating year that Wikipedia's expenses crossed the $3 million threshold. In 2015-2016 the expenses were >$65 million. In 2019, they were $91 million. Remember that the content creators work for free.
The reports are available here [2]. Looking at the most recent report here for FY18-19, the amount spent on hosting is $2.3 million. That's less than half of the "donation processing expenses"!
I remember when Wikipedia used to warn against donations from Google and the like and running ads because even if it didn't undermine their independence it would at the very least undermine the perception of their independence.
Looks like they're ok with it now though.
I suspect it won't be too long before they get used to this largesse and won't want to do anything that might jeopardize it.
The difference is that google's donation is a no-strings-attached donation, whereas if they were running ads, Google could decide to hold their payout or serve hostile and invasive ads. Running ads on that scale equals leverage.
It isn't "largesse" because Google's donation is a small fraction of what they are collecting. It would be a concern if a large fraction of their income came from Google.
AIUI, Google does give proper credit to Wikipedia whenever it's reasonable to do so. Technically they don't even need to do this for a lot of basic machine-readable information that Wikidata is making available via CC0.
I imagine a future where a project like wikipedia is replicated endlessly by willing users on an IPFS-like network, where people can donate CPU, storage and bandwidth instead of money to pay for a centralized server, simply by running something like ipfs pin wikipedia (or a subset)
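A rough sketch of what that might look like with today's IPFS CLI (the CID below is a placeholder, not a real snapshot; whoever publishes the snapshot would announce the real one):

    # fetch a published snapshot and keep a copy on your own node
    ipfs pin add QmWikipediaSnapshotPlaceholderCID
    # or copy it out locally to browse offline
    ipfs get QmWikipediaSnapshotPlaceholderCID -o wikipedia-snapshot

Pinning tells your node to keep those blocks around (and serve them to peers) instead of garbage-collecting them, which is what turns a casual user into another host.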
Sounds kind of like a newsgroup, with every client also being a server. I think torrents can be dynamically updated now.[1][2] Maybe we already have the tools we need, we just haven’t put all the pieces together yet.
That's right. We need to start thinking decentralized. Instead of using DNS to find a server and HTTP to get the content from it, we should switch to a decentralized lookup of the data itself, sourced from wherever the network can serve it. I don't mind getting one wiki page from Peter and another from Paul.
I think IPFS and Filecoin could solve these problems and keep attribution and data integrity intact. Would be nice to have the privacy of monero or another more private coin to facilitate subscription functionality with built in micropayments.
Nice. Pointing 2 torrents at the same folder already works. If they share files, at worst they will be overwritten with the same content. Even if there is a lack of clients supporting it, you can still force a re-check and avoid downloading everything again.
Actually, the organization behind Wikipedia has way more than "a few" staff members. There's been a perceived problem with excessive spending growth for many years now, although to be fair, the increased spending has also allowed for compelling new features, projects, and avenues of editor support. (Stuff like the Visual editor, newer successful projects like Wikivoyage and Wikidata, initiatives to support article-editing in educational and scholarly settings, large-scale contests to support the editing community, etc. just wouldn't exist absent that revenue growth.)
That literally makes no sense. You'd rather a rectangle telling you to consume sugar than a rectangle asking if you can support an organisation that provides a service you use?
The point is, of course, that you don’t really want to pay for it but feel sort of guilty if reminded that keeping it running costs money. Textbook example of cognitive dissonance.
I agree with the faster part, but these snippets Google shows often lack context and other miscellaneous information, along with deep links to many other great articles.
I see a lot of misinformation in the comments here across multiple threads. Here are a couple sourced rebuttals.
> Featured snippets means no one clicks through to the source and thus underlying sites lose money.
Fact: Featured snippets are optional for site creators and can lead to dramatically increased engagement in terms of sessions and CTR. [1][2]
> Weather.com in particular is hurt because it is ad supported and no one leaves the Google page for weather.
Fact: The Weather Company happily partners with Google for this functionality. “The Weather Company, alongside governments, partner with Google to provide the world’s best weather solutions. We are happy to see Google continue to join with us and others in helping citizens stay informed.”[3]
It's a shame that a lot of the antitrust criticisms of Google don't focus on YouTube. The fact that YouTube is one of the main rivals to Wikipedia in results is highly suspicious to me, and should be to antitrust regulators.
Of course, at this point it's a bit of a self-reinforcing cycle: because YouTube ranks, it gets more and more content, becomes more popular, and so Google might be ranking it more and more legitimately.
But I find it impossible to believe that YouTube would have done as well, and would be doing as well in SERPs, if it weren't a Google property. They've clearly built another site and brand with its own monopoly, similar to Internet Explorer from Microsoft.
The main rivals to YT are sites like Vimeo and Twitch, not Wikipedia. And I don't think either of those is anywhere near YT-scale. It's really hard to run that kind of service in a profitable way, and even harder to let creators monetize their content directly the way YT does.
Yes, but in SERPs I think the main rivals to YT are independent blogs and content sites. Search for "how-to" type queries on Google, and it will often return videos from its own property (YouTube) ranked above text-based content. Google would prefer you to stay within its ecosystem. This is anticompetitive.
How is YouTube not a competitor to Wikipedia? Both are community powered content. They would be directly competing on many major search terms. The fact that the editorial standards and media formats are different is beside the point.
People are saying this is a good thing because Wikipedia saves on server costs. You are right about that part. But the problem is that the search snippet with wiki-powered data is a Google product.
When users consume this data, they become Google customers, not Wikipedia users. Even my little website has seen a 30% drop in traffic, while appearing in many more snippets. Those users get their information and never visit my blog at all. This creates loyal Google users [1], not loyal <insert blog/business name here> followers.
"Out of nearly 890,000 monthly searches worldwide, only 30,000 actually become search visits to a website"
Wow, I never thought about it, but I bet my searches-to-clicks ratio is about the same for many searches. Google must be doing this on purpose, and it must be hurting many sites. I'm sure I've read about this before, but I'm not sure I've seen those numbers before.
I skimmed the article. People search google; it gives a paragraph from Wikipedia. That answers their question, and they go on their way; this has reduced the number of people that click on the link to see Wikipedia.
I fail to see the problem. Wikipedia doesn't show ads, and isn't run for profit. If people get the tidbit of info they needed, they've been served. If they didn't click on Wikipedia to get it, that means Wikipedia saves money. This seems like a good thing to me.
And when fewer and fewer people are exposed to the donation banners/buttons on Wikipedia because they stop at the Google results page, how will Wikipedia keep running?
It might be a non-profit, but donation banners _are_ ads, and denying Wikipedia its page views denies them donations in the same way it would deny other sites ad revenue.
>The mission of the Wikimedia Foundation is to empower and engage people around the world to collect and develop educational content under a free license or in the public domain, and to disseminate it effectively and globally.
The mission is NOT "get as much donation money as possible"; donations exist to support the mission, not vice versa. Google seems to be helping with that mission of disseminating information.
> The mission is NOT "get as much donation money as possible"
No, it's not, but it should go without saying that accomplishing the mission includes keeping Wikipedia alive and functioning. It would hardly be accomplished if Wikimedia goes under, leaving Google, Bing, DDG, and other search engines to either use out-of-date information or, worse, update that information through questionable practices.
I understand that Wikipedia is a non-profit organization. But that doesn't mean that they don't have costs that need to be paid, nor does it mean that they will celebrate shutting down and handing responsibility for their mission over to a for-profit company that has no concern for that mission.
Wikipedia has costs and needs to raise money to cover those costs. Caching results on Google search pages may reduce some hosting expenses, but that's only a gain so long as the money saved in server costs is more than the money lost from disappearing donations.
And an organization without consistent revenue (such as from selling a product or service) needs much more runway than an organization that can depend on sales to regularly replenish the bank account. Because when Google and other search engines eliminate the last of Wikipedia's donations, the only factor in Wikipedia's lifespan is how much cash they have in their coffers. And the more donations they collect now, the longer that runway will be.
Wikimedia raises an order of magnitude more funding than it needs to cover basic hosting and operating costs. Losing some of those funds in exchange for wider dissemination seems very much a net gain for its mission.
>Wikipedia has costs and needs to raise money to cover those costs.
Hosting cost Wikimedia $2 million last year. It raised $120 million. It spent more on fundraising than it did on hosting.
>Because when Google and other search engines eliminate the last of Wikipedia's donations
People, amazingly, go to Wikipedia independent of search engines and only a fraction of Wikimedia donations go to hosting costs. Wikipedia can likely survive indefinitely on organic views and their donations.
I feel like you might have missed the point of the article. It goes into the example of weather.com, a for-profit website, which is also losing out on clicks. The general idea seems to be that Google goes to the site on behalf of the user and then doesn't serve the user ads (or, in the case of Wikipedia, a donation banner).
Google is surely just using the weather.com API like everyone else. They license it for Android, so it would seem insane not to do so for the web.
Also every other search engine shows some form of zero click weather. Really bad example.
I do the same but with "wi", because I find it's nicer to type, but yeah, this sort of thing lets me skip search pages regularly. I also have shortcuts for word definitions in English and Dutch, for skipping the YouTube search results page and going straight to the top hit, and for Dutch and German Wikipedia... I may like shortcuts a little too much.
Alternate headline:
"Google Saves Millions of Hours of Time by Providing Answers Directly"
Is this a valid analogy?:
Imagine you had a brilliant friend who read all the books in a library and answered any question you asked her. Would you say this friend is stealing profit from book publishers?
If my friend was a billion-dollar company that sells ads, then yes I would. Google is taking the intellectual property of other content producers and monetizing it directly.
This impoverishes the rest of the internet and redistributes money from a diverse field of competitors to a single quasi-monopolistic company. Which is bad for the internet as an ecosystem long term.
Luckily, countries like France, and the European Union with last year's Copyright Directive, strengthened the property rights of news organisations, and Google had to either strip snippets out of its results or strike revenue-sharing agreements with the companies in question.
It is utterly absurd to me that a search engine is supposed to be able to capitalise on the original content of others for free. Under fair use doctrine, it is generally considered that a service crosses the line between fair use and piracy if it functions as a substitute. This is exactly what Google is doing.
"In 2015, Google was able to steal over 550 million clicks from Wikipedia in six months"
As someone who has contributed content to Wikipedia and makes a (small) monthly donation this seems like a good thing. I support Wikipedia so that knowledge can be more easily and freely distributed and Wikipedia content is generally licensed under CC-BY-SA. Google following the license to make information sharing from Wikipedia more seamless (while reducing the load on Wikipedia servers) seems like a win for everyone.
Putting aside the criticisms of Google for its near monopoly on search, I don't think it's necessarily fair to say that Wikipedia "loses" a visit just because Google happens to be able to deliver the desired information on the search results page itself.
In many cases, visitors aren't using Google search because they want a webpage. They're using Google search because they want an answer to a question. Google is answering that question without them having to click through to another site, and visitors are fine with that.
Likewise, I wouldn't say that all of the online calculator websites are "losing" visits just because I plug 3^3 into my search bar and get 27.
Note that in some cases, Google has license agreements with the websites from which they gather that information, so while the visitor may never land on the source website, that source website still gets remuneration.
Besides, if you think it's bad that Google is providing content from other sites so that visitors never have to land on the source site, just wait until you hear about AMP ...
I think this is extremely shortsighted. If this were the whole story, then why would Google be going through the effort of integrating Wikipedia data into zero-click results? Out of charity?
No. Because, big picture, those people who go to Google and leave without clicking anything have just had it reinforced that Google easily finds everything they need. It makes them more likely to come back.
So Wikipedia does lose clicks to Google in the sense that it loses an opportunity to impress its brand on those users.
Side note. If you have Firefox I would suggest that you go on Wikipedia.org, right-click the search box, click "Add keyword to this search", and pick a keyword. This makes it faster to find information that you know is on Wikipedia.
If you're using DuckDuckGo, you can use the !w bang to search directly on Wikipedia. Bangs are really handy when you know where you'll find the answer, and you can fall back to !s (Startpage) if you feel Google will have better answers for the current query. This feature has been very useful for me, especially !mdn, !cpp, etc. when coding.
I think the correct place to have this is in the browser, not in the search engine. I also have a cpp keyword in Firefox :-) You get the same result, except mine is faster and more private.
I don't believe it's the case that Google is returning the correct info, though, and I believe attributing this primarily to zero-click is wrong.
As the latter part of the article says, for a while now Google has filled the top of the page with tangentially related videos, places to buy music, and other media, when all I wanted was the Wikipedia article. They seem to have demoted Wikipedia off the first page for many search phrases. I have to search Wikipedia directly or add "wikipedia" as a search term to bring it back.
Google is not saving clicks, they are making me search twice.
Right. This is not "Google stealing clicks", it's users saving clicks because they were only interested in looking up that kind of basic information. It optimizes time and effort.
This is the "blogs kill newspapers" argument again. You're not wrong - and arguably they _did_ - but this is a perennial problem for web content. Aggregators are more profitable than content creators, but they kill their suppliers.
It seems that it disappears (rolls off the top of the screen) when you scroll down, and reappears when you scroll up.
There's little or no "inertia" in that behavior though, so depending on your mousewheel/track-pad/touch scroll behavior that can seem kinda flaky. A tiny scroll-up action brings the sticky header back. I have noticed other sites that do that can be really annoying on mobile, but on a MacOS laptop at least this site seems to do it pretty well (IMO).
It might benefit from not re-displaying the header until you've scrolled up a little more than it does now (again, IMO). I.e., maybe require a threshold of a certain number of pixels (>1), or a certain upward scrolling velocity, before the header re-appears.
UPDATE: Playing with it a little more I do think it is a little flakier than that. There's something weird about the marquee image that makes the header re-appear for reasons I don't understand. If you use the arrow keys for scrolling (so there's no chance of an accidental "bounce" when you stop moving your fingers) the behavior is a little more obvious and consistent, but you can see that when scrolling from the top of the page the header disappears then comes back briefly for reasons that aren't clear to me.
Clever or subtle UI/UX behaviors on the web are hard.
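If anyone wants to experiment with that threshold idea, here's a rough TypeScript sketch of the behaviour being described; the header selector, CSS class name, and 80px value are all assumptions for illustration, not whatever this site actually does:

    // Hide the sticky header on scroll-down; only reveal it again after a
    // deliberate amount of upward scrolling, so a tiny wheel "bounce" doesn't
    // bring it back. Selector, class name, and threshold are assumptions.
    const header = document.querySelector<HTMLElement>("header");
    const REVEAL_THRESHOLD_PX = 80; // upward travel required before re-showing

    let lastY = window.scrollY;
    let upwardTravel = 0;

    window.addEventListener("scroll", () => {
      if (!header) return;
      const y = window.scrollY;
      if (y > lastY) {
        header.classList.add("header-hidden"); // scrolling down: hide
        upwardTravel = 0;                      // reset the accumulator
      } else {
        upwardTravel += lastY - y;             // scrolling up: accumulate travel
        if (upwardTravel >= REVEAL_THRESHOLD_PX) {
          header.classList.remove("header-hidden");
        }
      }
      lastY = y;
    });

A velocity-based variant would just compare (lastY - y) for a single scroll event against a threshold instead of accumulating travel.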
I visit Wikipedia dozens of times per day. My most-frequent Google search is "wiki foo". The top link is almost always Wikipedia, and it's almost always exactly what I want. I appreciate the quick facts box at the top, but I still end up on Wikipedia. I also have a Chrome search shortcut so that I can type e.g. "w the west wing" to go directly to the Wikipedia article, but to use that, I need to know the exact spelling and wording of the article title, because Wikipedia's own search engine is horrible. I would venture that Google is the single largest driver of traffic to Wikipedia itself.
As a user, this is mostly great; I like getting information faster. (Although Google has an alarming habit of showing information that's wrong or outdated.)
As a fan of Wikipedia and other web sites, this is disastrous. Their content is being used by Google and they get little to no benefit for it. It's a similar situation to what the European news agencies were saying about Google News a few years ago. I wasn't so sympathetic to them, but I am more sympathetic to Wikipedia.
It's not a huge problem for Wikipedia since their pages are not ad-supported. But this kind of siphoning of user views is devastating for commercial sites.
Conscious of this issue, I always take the time to find the Wikipedia link (it's not always straightforward) and click through to the Wikipedia page, so as not to endorse Google's behavior of scraping data it does not own and exposing everything in its cards. (Obviously what I do is pretty much useless, since I still use Google Search in the first place; it's just a kind of placebo effect to make my brain feel good.)
I hate the fact that Google tries to change itself every couple of years just to make more money. The initial company proposition of giving me search results quickly and getting out of the way was the best ever. Over time, they instead want to keep people for as long as possible so they can present more advertisement links.
I think the bigger issue is just that Google becomes even closer to a monopoly on information and the information being presented here is by design surface level. Maybe Wikipedia shouldn't care about surface level answers. They are designed for more detailed information.
On the other hand, wikipedia's use of nofollow makes it pretty clear they don't want a level playing field. What makes wikipedia great is all the references it builds on, yet those same references never get any "link juice".
Wikipedia uses nofollow to discourage spam edits. Google and other search engines are free to ignore nofollow and try to assess link quality in other ways.
At the current level of spam link posting, sure. But if getting spam links into Wikipedia were more valuable -- say, if they stopped using rel=nofollow -- I wonder how much the amount of incoming spam would increase.
I would be happy to know they tried, or at least considered it. Maybe they have and I'm just not aware of it. It's just something that always bugged me a little.
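To make the mechanism concrete: nofollow is just a value in a link's rel attribute, and an indexer that chooses to honour it simply skips those links when assigning ranking signal. A purely illustrative sketch, with no real engine's logic implied:

    // Illustrative only: collect the links on a page that are *not* marked
    // rel="nofollow" and so could pass ranking signal ("link juice").
    // Wikipedia marks its external reference links with rel="nofollow".
    function linksThatMayPassSignal(doc: Document): string[] {
      const passing: string[] = [];
      for (const a of Array.from(doc.querySelectorAll<HTMLAnchorElement>("a[href]"))) {
        const rel = (a.getAttribute("rel") ?? "").toLowerCase().split(/\s+/);
        if (rel.includes("nofollow")) continue; // honoured: skip for ranking
        passing.push(a.href);
      }
      return passing;
    }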
In the case of Wikipedia: If a single search result answers someone's question, it probably wasn't all that profound.
30-volume encyclopedias are great for learning in depth, but sometimes you just need to know something fairly trivial... more digits of pi, say.
"71% of [Freddy Mercury] searches end there, without a click to a specific site."
I'd like to know about a lot of things in more depth, but there's only so much time. If I'm focused on reading something that merely mentions a name (and assumes I know it), I might just search for it to fill that gap.
The article fails to differentiate between these two types of search, and so leaves that important question hanging.
Rings a bell from when the EU decided to slap a fine on GOOG for steering consumer traffic toward its own shopping platform.[1] But those were for-profit claims and the authorities were swift to act, whereas this case would be extremely interesting to follow from an online free-speech advocacy POV.
Banning advertising is the answer. Stew on that for a minute before you downvote.
Monetary flow is the basis of our financial system, and making the user not the customer undermines the basis of why anyone does anything, in a more fundamental way than the move from the barter system to paper money did.
Wikipedia was a text gold mine waiting to be mined. They even publish data dumps themselves that anyone can download. I doubt they had any problem with people "stealing" their data.
Just recently I made a website that extracts information from Wikipedia and presents it in a different way [0].
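Not that commenter's implementation, obviously, but as a sketch of how low the barrier is: Wikipedia's public REST API will hand you the lead text of any article as JSON. The article title and User-Agent string below are just example values:

    // Sketch: fetch the lead "extract" for an article from Wikipedia's public
    // REST summary endpoint. Title and User-Agent are example values only.
    async function wikipediaSummary(title: string): Promise<string> {
      const url =
        "https://en.wikipedia.org/api/rest_v1/page/summary/" +
        encodeURIComponent(title);
      const res = await fetch(url, {
        headers: { "User-Agent": "example-reuse-demo/0.1 (example@example.org)" },
      });
      if (!res.ok) throw new Error(`HTTP ${res.status} for "${title}"`);
      const data = await res.json();
      return data.extract as string; // roughly the text a knowledge panel shows
    }

    wikipediaSummary("Freddie Mercury").then(console.log);

The full dumps mentioned above are the better route for anything at scale, but for one-off reuse this is all it takes.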
Just thinking beyond zero-click traffic: I remember there was a time when searching for a <term> would automatically have Wikipedia as the top search result, but now it is often not even on the first page. And modifying my search to <term wiki> would still only get me Wikipedia as the second result. It may not seem like a problem, but it is.
Has Wikipedia itself as a foundation ever commented on what it thinks of this? From an outsider's perspective, it seems like they're largely benefiting from this. Nobody owns the data that is encapsulated within Wikipedia, so there shouldn't be any issue of infringement here either.
Although it is handy from a user's perspective to quickly get the answer they want, scraping the information off Wikipedia and packaging it slightly different has always felt like cheating to me. Seems like this behavior will be part of the inevitable anti-trust case against Google.
If you want to specifically have Google find you a wiki article, why not just go to Wikipedia directly, bookmark it on your phone as an app, or use any of the many apps that go to Wikipedia directly?
I don't like Google, but this practice seems 100% legit: someone looking for a broad answer finds an answer immediately outside of the wiki, and they're happy with it.
I personally don't want -- ever -- the wiki to be #1 in my search results. If I want a wiki answer, I'll go there directly; I use search engines to find a variety of answers.