It's clear to me Google wouldn't intentionally do something identical to what Bing's done.
But given what Googlers can't say, and don't even know, about what other groups within Google are doing, it's not 'very clear' to me that you aren't already doing closely analogous things with regard to every other site on the internet.
You've got the data; you're allowed to use it by your privacy policy; you've got the rationalizations handy. ("Sites didn't block us; fully-informed users opted-in; this is a crucial way to fight the manipulators; it's only helping us weight things we already found by other means; etc.")
Amit hasn't made anything clear to me with his finessed "put any results" wording. Danny Sullivan picked up on this too, as he remarks in the headlined article:
Google’s initial denial that it has never used toolbar data “to put any results on Google’s results pages” immediately took a blow given that site speed measurements done by the toolbar DO play a role in this. So what else might the toolbar do?
There's wiggle room in the definitions of 'copy' and 'competitor' in your 'never' promise, too. Is it OK if Google Toolbar data hoovers up implied editorial-quality signals from user navigation on every site that isn't a 'competitor'? (And given Google's size, what site isn't a competitor in some respects for audience share?) Is it 'copying' if your use of clicktrails makes a preexisting result move from #11 to #9 after you observe it satisfying people in other browsing sessions? Move from #99 to #2?
(Has any of Google's competitive analysis ever resulted in a single result moving closer to the position, higher or lower, that it held at a studied competitor? Some people could call that 'copying'.)
Maybe none of the clickstream sources Google uses sticks out as a dominating factor because Google has so effectively "commoditized its complements," and no other entity (except maybe Facebook) has access to as much clickstream data as Google does, simply from its own sites.
Given that, it seems a little convenient that Google's standard is "every aggressively creative use of behavioral trails that led up to our 70%-90% share dominance was OK, but from now on let's be really rigorous about letting others observe our info-effluents."
For more official policy you'd have to ask either Amit or Matt. I can't speak for the company here.
Speaking only for myself and only on the ethics, I generally feel that any site that allows itself to be indexed is pretty happy with Google (or Bing) doing whatever they can to rank it better. Even with the link data that sites provide, you can add rel=nofollow to the links if you don't want search engines to use them but still want your pages indexed (Yelp has done this, for instance).
For me that's the ethical boundary. Sites have various ways of indicating their wishes, and that ought to be respected in spirit beyond the technical details.
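The nofollow opt-out mentioned above is simple mechanically. A minimal sketch, using only the Python standard library, of how an indexer might honor it when harvesting links (real crawlers are of course far more involved):

```python
# Sketch: collecting followable links while honoring rel=nofollow.
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collects hrefs from <a> tags, skipping any marked rel=nofollow."""
    def __init__(self):
        super().__init__()
        self.followable = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        # rel is a space-separated token list; nofollow may appear among others
        rels = (attrs.get("rel") or "").lower().split()
        if "nofollow" in rels:
            return  # the site asked engines not to use this link as a signal
        if "href" in attrs:
            self.followable.append(attrs["href"])

page = '''<a href="/about">About</a>
<a rel="nofollow" href="http://example.com/untrusted">User link</a>'''
parser = LinkExtractor()
parser.feed(page)
print(parser.followable)  # only /about survives
```

The point is that the signal is per-link and machine-readable, so a site can stay indexed while withholding specific endorsements.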
Legally, the technologies that make the internet work all rely on the idea of fair use, so it is very important whether something is "fair."
I've seen no statement that Google throws out Toolbar (and other) clickstream data for sites/pages that Googlebot can't visit (which includes not just robots-precluded but also login-required pages). Not that I think you should throw such data out; that's not what robots.txt was meant for, and the user arguably has more claim to that interaction trail than the site. But that seems the standard you're suggesting.
If Google doesn't want IE features or the Bing Toolbar observing its site interactions, it can disallow such visitors. A steep price to pay, at too coarse a level of control? Yes, just like a site deciding to bar Googlebot.
I would agree that a 'fair use'-like analysis makes sense.
I would further agree that any site solely, or predominantly, powered by indirect observations of Google users would be an unfair taking. You'd crush such a site in court.
Meanwhile, a site that tallies Google referrer inclicks for itself, or for a network of participating sites (as with analytics inserts), even republishing summaries of Google source URLs and search terms as public data, is almost certainly fair use. It's taking data you're dropping freely onto third-party site logs, and making a transformative report of it.
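The referrer-tallying described above is trivial for any site to do from its own logs. A sketch, assuming the classic Google referrer format where the search term rides in the `q` parameter:

```python
# Hypothetical log-analysis helper: pull the search term out of a Google
# referrer URL, the way a site or analytics network might tally inclicks.
from urllib.parse import urlparse, parse_qs

def search_term(referrer):
    """Return the query term from a Google search referrer, else None."""
    parsed = urlparse(referrer)
    if "google." not in parsed.netloc:
        return None  # not a Google referral
    terms = parse_qs(parsed.query).get("q")
    return terms[0] if terms else None

ref = "http://www.google.com/search?q=bing+sting&hl=en"
print(search_term(ref))  # bing sting
```

That's all "data you're dropping freely onto third-party site logs" amounts to: a URL the browser sends with each click, which the receiving site can parse and aggregate however it likes.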
What Bing is doing seems to me somewhere in-between. The mechanism avoids literal copying of specific artifacts but the net effect in some cases approaches the same result. As with other 'fair use' analysis, it's rarely black-and-white. The magnitude of the information used, its effects on the market, and the value-added transformation afterward are all important. I don't know how a court would rule in such a suit but the discovery process would surely be fun for spectators like myself!