hi OP here, i didn't expect this to hit the front page, just thought it was a funny meta bug.
and no, it's not clickbait and i'm not affiliated with hostgator or any of that other crap.
a few strange things i'd like to point out:
the indexed result pages are http:// not https:// - but to my knowledge google forces https:// everywhere.
the double slash issue is probably the reason why googlebot does indeed index this. robots.txt is a shitty protocol; i once tried to understand it in detail and wrote https://www.npmjs.org/package/robotstxt, and yes, there are a shitload of cases you just can't cover with a sane robots.txt file.
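for example, here's a tiny typescript sketch (not the actual logic of my package or of googlebot, just an illustration of the naive prefix matching the original spec describes):

    // naive robots.txt matching: a Disallow rule blocks a URL if the rule
    // is a literal prefix of the request path (per the original spec)
    const disallowRules: string[] = ["/search"];

    function isBlocked(path: string): boolean {
      return disallowRules.some((rule) => path.startsWith(rule));
    }

    console.log(isBlocked("/search?q=test"));  // true  -> blocked
    console.log(isBlocked("//search?q=test")); // false -> "/search" is not a prefix of "//search"

so unless the crawler normalizes "//" down to "/" before matching, a doubled slash walks right past the Disallow rule.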
as there are no https://www.google.com/search (with the "s" for secure) URLs indexed, google(bot) probably has some failsafe to avoid indexing itself, but the old http:// URLs somehow slipped through.
but now let's go meta: consider the implications! the day google indexes itself is the day google becomes self-aware. google is a big machine trying to understand the internet. now it's indexing itself, trying to understand itself - and it will succeed. the "build more data centers" algorithms will kick in, as google - which has basically indexed the whole internet - is now indexing itself recursively! the "hire more engineers to figure out how to deal with all this data" algorithm will kick in too (yeah, recursively every developer will become a google dev, free vegan food!).
i think it's awesome.
by the way, a few years ago somebody wrote a similar story http://www.wattpad.com/3697657-google-ai-what-if-google-beca... funnily enough, the date for self-awareness is "December 7, 2014, at 05:47 a.m" [update: oops, sorry, that seems to be the wrong story, but i'm sure the "google indexes itself, becomes self-aware" short story is out there, i just can't find it right now ... strange coincidence?]
> the indexed result pages are http:// not https:// - but to my knowledge google forces https:// everywhere.
Google only forces HTTPS for certain User-Agent strings. I just tried fetching http://www.google.com with the Googlebot User-Agent string and Google did not redirect to HTTPS.
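If you want to reproduce it, something like this works (a sketch using Node 18+ fetch, run as an ES module; the Googlebot User-Agent string below is the published one, and the results obviously depend on whatever Google's servers do at the time):

    // compare google's http:// -> https:// redirect behaviour per User-Agent
    async function checkRedirect(userAgent: string): Promise<void> {
      const res = await fetch("http://www.google.com/", {
        headers: { "User-Agent": userAgent },
        redirect: "manual", // report the 3xx instead of following it
      });
      console.log(`${res.status} ${res.headers.get("location")} <- ${userAgent}`);
    }

    await checkRedirect("Mozilla/5.0"); // generic browser UA
    await checkRedirect("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)");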