CPP: A Standardized Alternative to AMP (timkadlec.com)
306 points by Yennick on Oct 25, 2016 | 93 comments



This was first published last February, and the top comment was from a Googler who's involved with the AMP project:

> Love this. Not sure by coincidence, but the AMP team has been playing around with the same thing under literally the same name. We should meet up some time and discuss details. Not sure it would be an alternative, but rather a complementary thing.

I wonder what ever came of that.


The spec has been moving along: http://wicg.github.io/ContentPerformancePolicy/


The spec has been split into smaller pieces. The first one that made the most progress is https://wicg.github.io/feature-policy/


Correct me if I'm wrong, but doesn't the content maker get to decide whether they actually want all the AMP benefits or not? If I remember correctly, you can choose to disable the whole Google AMP cache link and only use the AMP optimizations.

Secondly, he goes on for a few paragraphs at the start about how anyone can do this, but that's not so true, is it? The whole point of the AMP redirect is that it's served from Google's cache servers, and unless you have a lot of money, that ain't gonna be cheap.

So at the end of the day, the last part is optional, and you're basically paying for the cache by letting them use their domain (with all ads and traffic stats still sent to you).


omitting the amp cache google link basically makes it not AMP though - you can still follow the rules but it's just called "writing clean and minimal html" at that point.


I can't take AMP seriously as long as it forces web sites to include resources from Google.

That basically turns all AMP web pages into something that Google can track. Google can track enough of the web already, thankyouverymuch.

If your "standard" proposal starts with telling me I have to include content from a specific URL, you already lost me as a potential proponent.

I also happen to disagree with "all CSS needs to be inline".

Unless I missed something, CPP appears to not require me to include some 3rd party content, so I'm on board already :-)


I don't think it forces you to include resources from Google. I think the two primary reasons Google asks you to include a specific JavaScript file [1] are:

1 - Lazy & Prioritized asset loading logic

2 - Loading assets from cache

Regarding the lazy & prioritized asset loading, the script [2] shouldn't be relying on anything Google-specific. Since the project is open source, anyone can take a look and get an answer.
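
For context, the lazy/prioritized loading comes from using AMP's components instead of plain tags, so the runtime decides when to actually fetch each asset. Roughly, instead of an <img> you write something like:

    <amp-img src="/images/hero.jpg"
        width="800" height="450"
        layout="responsive"></amp-img>

and the runtime loads it as it approaches the viewport.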

Regarding loading from Google's cache: I haven't dug that much into it, but it's supposedly possible to write your own cache service (instead of relying on the currently free Google CDN). Google provides the API and the URL format, so it should be doable [3]. My assumption is that you could then build your own script, with the cache server URL changed from Google's to your own in a simple config file [4].

The question then is: by doing so (and effectively cutting all ties with Google's servers), will your page still be recognized as AMP by Google's search engine? It would still reap the actual perf benefits (assuming your CDN doesn't suck), but if Google's crawler really wants [1] to be present (and it shouldn't), you wouldn't get the SEO benefits.

[1]

    <script async src="https://cdn.ampproject.org/v0.js"></script>
[2] https://github.com/ampproject/amphtml

[3] https://developers.google.com/amp/cache/

[4] https://github.com/ampproject/amphtml/blob/c44a48fbb1dbd0de0...
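
If memory serves, the cache URL rewriting documented in [3] turns a page URL into something roughly like the following, with /c/ for content and /c/s/ for HTTPS origins (double-check the docs before relying on it):

    https://cdn.ampproject.org/c/s/www.example.com/article.amp.html

A self-hosted cache would presumably swap its own domain in front of the same path scheme.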


Aren't AMP pages delivered to users from Google's servers anyway? Then Google can track them regardless?

I'm just a casual observer and that's just how I understood the idea.


AMP pages are served from the publisher's server as usual, but Google can easily direct their users to their cache instead of the official site. Every AMP page also includes JavaScript hosted by Google, so Google gets pinged every time someone visits an AMP page, even if it wasn't through their cache.


https://developers.google.com/amp/cache/

"Google products, including Google Search, serve valid AMP documents and their resources from the cache to provide a fast user experience across the mobile web."

So it is Google that delivers from the cache. Moreover, on that same page they encourage others to do the same.


Yes, Google can direct its own users to the cache, and others can also link to the cache. But if you or someone else link directly to the page, it is not served from Google's cache. Google would not be able to track those visits except for the fact that the page also loads content (at the very least, the AMP JavaScript) from Google's servers. AFAIK, it is not considered "valid" if you serve the JavaScript yourself.


> That basically turns all AMP web pages into something that Google can track. Google can track enough of the web already, thankyouverymuch.

The point is to be able to track requests with javascript (or just Google Analytics, really) blocked.


Isn't there a better alternative to an abbreviation that's already widely used in the software engineering context?


PPC is pay per click, which is more likely to create confusion in a web dev context. PCP is less widely used in a software engineering and web developer context (one hopes anyway), but probably not something you want to be searching for from a work computer.

If we could get away from policies though, Performant Content Contracts doesn't overlap with much that is CS related.


PCP is also Performance Co-Pilot, a relatively widely-used systems performance tool. :)


The CPP abbreviation is used to be consistent with CSP.


What has Communicating Sequential Processes got to do with it??


Content Security Policy in this instance


Just correct him/her if you don't think it's a joke; why the downvote?


Best part is AMP is a C++ library.


Search engines already deal just fine with context. Just type in “cpp language” or “cpp html,” respectively.

I think everyone keeps making name collisions to be a much bigger deal than they are in reality.


If it was a minor collision, sure. But CPP is one of the most used programming languages on the planet. I suppose you think we should create standards for C++ and call them JS or RB or py, for short?


If you can always find what you’re looking for for any non-trivial (2 keywords or more) search terms, how is it not a minor collision? “CPP” is not even the name of the language—it’s the file extension.

EDIT: Mostly I’m just annoyed that those useless off-topics about naming keep crowding out actual discussion.


But CPP is the name of the language!

https://en.wikipedia.org/wiki/C_preprocessor


You are contradicted by literally the first sentence in that article.


cpp html: http://imgur.com/gV3dMI9

It's going to take significant momentum to displace "CPP" in the appropriate computing contexts.


Yeah but if you search for “cpp amp” it’s already mid-page, despite there being a C++ project with the same name. The page has hardly any backlinks yet.

Anyway if discoverability is an issue they can always give it a catchy marketing name (Project Swift Gazelle) later. CPP is the appropriate name for the proposal, given the relevant context.


Alternative name: CWP for Content Weight Policy :)


This is very interesting and I don't want to detract from that.

However, I'll point out that for a (large?) portion of us, CPP means "C Plus Plus," so I was confused for a few seconds.


Welcome to the world of "$lang-lang" queries. Go, Rust, please make room.


You would then need to disambiguate between "clang" the compiler and "c-lang" the language.


Or C Pre-Processor.


Or Socialist Republic of Romania :P


Or Canadian Pension Plan


AMP is "asynchronous message passing" to me.


…and Apache + MySQL + PHP for me.


From back when LAMP (Linux being the L) was the buzz.


oh LAMP .. and if you were enterprise, it was SOA (Service-Oriented Architecture), and today it's Microservices.

So buzzy.


LAMP and SOA don't really have much to do with each other.


We really are damn good at reinventing the wheel over and over and over, no?


The name makes perfect sense in context:

> CPP could borrow from the concept and approach of the already existing Content Security Policies (CSP). This means that there would likely be a reporting-only mode that would allow sites to see the impact the policy would have on their pages before applying it live.
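
For comparison, CSP's report-only mode is delivered as a response header, and presumably a CPP report-only mode would look similar. The first header below is a real CSP example; the second is purely made up to illustrate the idea (the actual names and directives are whatever the spec ends up defining):

    Content-Security-Policy-Report-Only: script-src 'self'; report-uri /csp-reports
    Content-Performance-Policy-Report-Only: no-blocking-external-css; report-uri /cpp-reports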


I'm much happier to see this. My concern with AMP has always been that it removes incentives for browsers to get faster, because if most of the content is using AMP there's no point in optimizing important things. Having a multi-vendor solution gives us non-Chrome browser vendors a voice at the table.

(To give a concrete example, why bother with optimizing layout-affecting animations to run off the main thread if AMP just forbids them? Such animations are useful, precisely because they affect layout; we aren't doing Web authors any favors by forbidding them instead of just making them fast.)


AMP is for news pages, stuff that is one page away from Search. Browsers still work to optimize the overall experience for web apps and full sites.


1. Animations have their place on news sites.

2. I'm not just talking about animations: things like parallel layout are also useful for static sites.



Just use pure HTML/CSS and the web sites will fly by comparison.


For a single page load, sure. For subsequent page loads you're loading a lot more than necessary (a JS app can fetch just the content that's changed, and without a blank page in between), so a pure HTML and CSS solution is a great deal slower.

Plus, if your users have unreliable internet connections, then a JS app can use a service worker to cache the entire app to work offline and only load in new content when possible. An HTML page doesn't work at all in those circumstances.

Sometimes JS does actually make a site better. It's not always unnecessary bloat.
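
A minimal sketch of the offline part, assuming a worker file at /sw.js and whatever asset names your app actually uses:

    // in the page
    if ('serviceWorker' in navigator) {
      navigator.serviceWorker.register('/sw.js');
    }

    // in /sw.js
    var CACHE = 'app-shell-v1';
    self.addEventListener('install', function (e) {
      // pre-cache the app shell during install
      e.waitUntil(caches.open(CACHE).then(function (cache) {
        return cache.addAll(['/', '/app.js', '/app.css']);
      }));
    });
    self.addEventListener('fetch', function (e) {
      // serve from cache first, fall back to the network
      e.respondWith(caches.match(e.request).then(function (hit) {
        return hit || fetch(e.request);
      }));
    });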


There is a thing in HTML5 called Subresource Integrity (https://developer.mozilla.org/en-US/docs/Web/Security/Subres...).

It looks like this:

  <script src="https://example.com/example-framework.js"
  integrity="sha384-oqVuAfXRKap...."
  crossorigin="anonymous"></script>
I wonder if browsers could keep a cache keyed by those hashes, so whenever an integrity hash matches, the JS could be taken from the cache. That would save huge amounts of bandwidth and pages would load so much faster.

We're probably fetching the same version of jQuery hundreds of times a day from 20 different domains.
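
For what it's worth, the integrity value is just a base64-encoded digest, so you can generate it yourself; the usual OpenSSL one-liner (swap the algorithm for whichever one you declare) is:

    cat example-framework.js | openssl dgst -sha384 -binary | openssl base64 -A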


Currently, SRI is not enough for browsers to implement content-addressable storage as you describe here, because it is subject to cache poisoning. See https://news.ycombinator.com/item?id=10311020 - basically, the browser can't know if a script can actually be loaded from the claimed domain without requesting it. This can be used to violate CSP.

Though it would be nice for the browser to cache it for domains that have delivered the script previously. It wouldn't be that different from a normal cache except the timestamp doesn't matter.


Thank you for the info! I imagined there must be some technical issue, as having hashes for content would make caching so easy. Anyway, at least using that hash for the same domain would save some requests, since browsers already make revalidation requests to the server to check whether the file has changed before downloading the whole thing again.


It's not really bandwidth that causes the issue. Javascript is just really slow, both in parsing and execution. According to Chrome dev tools, parsing jquery takes 20ms on my 4.4 GHz desktop CPU. Now imagine how long that takes on a mid-range smartphone. Then add in a dozen other javascript libraries and shims and polyfills and the site is barely usable.


jQuery is a large library. If it were modular and developers used only the code they need it would be faster to parse.

But I doubt the bottleneck is JS code. The problem is that web sites are not optimized (some frontend developers think that writing a CSS stylesheet for narrow screens is enough) and they include a lot of resources (including trackers, advertisements, and spying social network buttons I never click). Some of the widgets create an iframe (which is like a separate tab in your browser) and load a separate copy of jQuery there, make AJAX requests, etc. Even worse, some advertising networks create nested iframes two or three levels deep (for example, when a network doesn't have its own ads, it can embed a Google AdWords block). So a page with 10 iframes loads the CPU like 10 separate tabs.

Decoding images isn't free either, especially a thousand-pixel-wide, heavily compressed JPEG or PNG.

The real optimization would be cutting away (or making lazily loadable on user request) everything except the content. As website developers are not going to do that, it is better to do the optimization on the client side. I wish standard mobile browsers allowed disabling JS, disabling web fonts (which are just a waste of bandwidth), and loading images on request. Mobile networks usually have high latency, so reducing the number of requests needed to display a page could help a lot.
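
If a site did want to offer load-on-request images itself, it doesn't take much. A rough sketch, with data-src and the "deferred" class being my own naming:

    <img data-src="/photos/large.jpg" alt="Tap to load image" class="deferred">
    <script>
      // swap data-src into src only when the user asks for the image
      document.addEventListener('click', function (e) {
        var img = e.target;
        if (img.matches && img.matches('img.deferred')) {
          img.src = img.getAttribute('data-src');
          img.classList.remove('deferred');
        }
      });
    </script>

But as you say, site authors mostly won't bother, so a browser-level switch would be nicer.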


You can build a custom bundle of jQuery from source, but then it won't be shared with other websites. The solution is for browsers to cache common libraries rather than loading those resources per page.


> There is a thing in HTML5 called Subresource Integrity

It piqued my interest, but I was disappointed to discover that it's only supported by Gecko & Blink[1] - not supported by Safari or IE/Edge. Javascript is currently unavoidable for offline apps.

1. http://caniuse.com/#search=integrity


It’s a progressive enhancement. Browsers that don’t understand the integrity attribute will just load the JS regardless, but at least Firefox and Chrome will get a safer experience.


JS + offline apps? thanks for making me feel old.


There are ways to cache responses that have been defined in the HTTP standard since its first version. No fancy HTML5 features are required for that.

(And if you meant using hashes to share cached resources across different domains, there would probably be many misses, because every website can use different library versions, and they can compress or bundle libraries, etc.)
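
A sketch of the classic HTTP recipe (header values are illustrative): far-future caching for fingerprinted assets, plus revalidation for the documents themselves:

    # fingerprinted assets, e.g. app.3f9c2b.js
    Cache-Control: public, max-age=31536000, immutable

    # HTML documents
    Cache-Control: no-cache
    ETag: "33a64df5"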


At least on HTTP/1 sites, most people are bundling their libraries together, so subresource integrity can't save you there.

I want to say there's a security concern with re-using these libraries, but I guess the possibility of a hash collision would be extremely small.


Unfortunately, they won't implement that caching scheme, because it leaks the sites you have visited to attackers.


> if you users have unreliable internet connections then a JS app can use a service worker to cache the entire app to work offline,

It looks like an over-engineered system, and you have to preload the content while on WiFi. And by the way, do you know a reliable way to detect whether the device is really online, or whether there is a link but no packets are going through?

And every website is supposed to write its own service worker code.

I think it would be easier to implement a browser feature where the user can explicitly save some pages for offline reading, or view pages from the cache.

> Sometimes JS does actually make a site better.

For most sites it just adds unnecessary widgets (like spying share buttons) and advertisements. Especially on newspapers' sites - most of them work better without JS.


All that stuff sounds really great if you have a lot of engineering resources and you're writing Gmail, but for 99% of the sites out there, the JS hacks that load just the deltas and whatnot get confused by packet loss, and I end up having to reload the entire page, including the gigantic JS hairball. Or, even worse, the thing gets so confused that I have to clear my cache and cookies to ever make it work again.

Simplicity has so many things in its favor.


Actually, for the first part, there is a protocol called SDCH (https://en.wikipedia.org/wiki/SDCH) that allows a site owner to define a site-global compression dictionary; each resource is then delta-compressed against that dictionary. It's hard to deploy, but it works: LinkedIn saw an average of 24% additional compression.
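
Roughly how the exchange goes, from memory of the SDCH draft, so treat the header names and values as a sketch rather than gospel:

    # first visit
    Accept-Encoding: sdch, gzip        (request)
    Get-Dictionary: /dicts/search_v1   (response: a dictionary the client may fetch)

    # later requests, once the dictionary is cached
    Avail-Dictionary: h2Fh9xKq         (request: hash of the dictionary the client holds)
    Content-Encoding: sdch, gzip       (response: body is a delta against that dictionary)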

For the second part, I wonder if browsers could display stale data with some warning saying so; that would solve many problems that happen all the time (refreshing a page after the website went down, ...).


Given my web development scars, I have become an advocate that for anything other than dynamic documents, the way to go is native.


pjax (whether using the old familiar jquery-pjax or some more up-to-date implementation) is great for decorating simple HTML pages, replacing full page loads with just a main-content load. And since the fallback is just a full page load, it degrades really gracefully.
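
With jquery-pjax the decoration is basically one line (the container selector is whatever your markup uses):

    // intercept clicks on links and replace only #main-content,
    // falling back to a normal full page load where pjax can't run
    $(document).pjax('a', '#main-content');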


You're lucky if you've got a good enough site so that you can get people to stick around for the second page load! Subsequent CSS calls will be cached for the majority.


Client-side XSLT and HTTP caching address both those issues. Yes, JS is another way to solve those issues, but not the only one.
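
For anyone who hasn't seen client-side XSLT: it's just a processing instruction at the top of the XML document, and the browser does the transform locally:

    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="/article.xsl"?>
    <article>...</article>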


HTML/CSS are slow. The true way is ".txt"

Edit: I mean this as a sort of lazy way of making a reductio. I don't think .txt is better than HTML/CSS for pages (and I hope that's obvious). I also don't think having no JS is a good idea.

I believe in progressive enhancement, and to a first approximation, I think that all websites have at least one feature that they could implement in JS that would be "a good thing".


I agree 90% with this; .txt files get most of the job done! ;-) The other 10% where I don't fully agree comprises hyperlinks; I need me some clickable hyperlinks. :-)


Browser option to make hyperlinks clickable when rendering text files? Or client-side browser rendering of text-based markdown.


Having browsers intelligently render `text/markdown` sounds like a great idea. And while we're waiting on the browser implementation, maybe we can find some sort of temporary workaround to send a markdown parser to the client?


Maybe we could just send the raw markdown anyway - it might not be pretty on all clients, but it should be _legible_ on all clients.

Or maybe we could send markdown if the user agent included text/markdown in the request's Accept header, and pipe it through a markdown->HTML filter otherwise.

I would love to see some kind of native markdown support on the web.
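
A sketch of that negotiation with Express (loadMarkdown and renderMarkdownToHtml are stand-ins for however you store and convert posts):

    var express = require('express');
    var app = express();

    app.get('/posts/:slug', function (req, res) {
      var md = loadMarkdown(req.params.slug); // hypothetical: fetch the raw markdown
      // prefer markdown when the client explicitly accepts it
      if (req.accepts(['text/html', 'text/markdown']) === 'text/markdown') {
        res.type('text/markdown').send(md);
      } else {
        res.type('html').send(renderMarkdownToHtml(md)); // hypothetical converter
      }
    });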


A web search turned up this Firefox plugin, https://addons.mozilla.org/en-US/firefox/addon/markdown-view..., are there other good ones?


As long as I can make text blink, I'm happy.


No, hyperlinks failed us. It's all about single-page apps. Put all of your content and all of the content you would have linked to in your .txt. We need someone to build Reactxt.


Whoa whoa whoa, that's some fat cat solution right there! Just use an HTTP header for your content and you're set.


The URL should contain all the information so we don't have to make an HTTP request.


I think most news sites implemented this years ago.


Someone posted this in a reply the other day and I feel it should go here as well:

https://vimeo.com/147806338

It's a pretty interesting talk. Just thinking about this kind of stuff can lead to way skinnier web pages than AMP. I mean really, if we designed pages for 56k modems, the web would be much, much faster on mobile.


I would just about give my left arm for a consistent 56k mobile connection.


Mildly interesting: CPP AMP is already a thing: https://msdn.microsoft.com/en-us/library/hh265137.aspx


I think the real concern about AMP is not that it's nonstandard, but that Google's caching mechanism reduces publishers' control over how their content is presented and keeps users on Google's domain. This is a problem for publishers, but I assume it increases performance (perhaps because Google prefetches the page while the user is looking at search results?).


I agree with the premise that the AMP framework (or any such framework) should be decoupled from being forced to work with only a specific set of tools, etc. However, it seems to me that what the author is proposing is really just more/better adherence to HTML/web specs... no? Or perhaps a few performance-related tweaks to existing specs. I mean, if we all (web producers, website managers, content authors, etc.) simply produced sites that adhered more closely to already established web standards (a la HTML5, XHTML, etc.), AND browser makers were stricter in their interpretation of those specs, then we'd be almost all the way there... no? I'm by no means saying this is easy, just that the author might be reinventing a wheel that simply needs some optimization.


Right now there are a lot of things the specs let you do that will make your page slow, like running a lot of JS in the scroll event handler, or just including too much JS overall. If you read the current proposal [1], the idea is that the site could make promises not to do various kinds of slow things, and the browser could enforce that.

[1] http://wicg.github.io/ContentPerformancePolicy/
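
To make the scroll-handler point concrete: the usual mitigation today is to defer the real work to requestAnimationFrame, which is exactly the kind of discipline a policy could enforce by fiat. A sketch (updateStickyHeader is a hypothetical piece of expensive work):

    var scheduled = false;
    window.addEventListener('scroll', function () {
      if (scheduled) return;            // coalesce bursts of scroll events
      scheduled = true;
      requestAnimationFrame(function () {
        scheduled = false;
        updateStickyHeader(window.scrollY); // do the heavy lifting once per frame
      });
    });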


> However, it seems to me that what the author is proposing is really just more/better adherence to html/web specs...no?

Pretty much, but buzzwords/PR work: "CPP Compliant" vs. "We build stuff properly (for a given value of properly)".


No, there are terrible antipatterns that browsers enable so as not to break compatibility. A strict, performance-sensitive mode for the web platform (HTML/CSS/JS) is long overdue. AMP is a pretty great solution, and the only issue with it has been the proprietary feel. The browser support is the cherry on top.

The argument that people will just use best practices out of sheer goodwill or even just basic competence has been thoroughly debunked. No one optimizes for performance if they are not penalized.


COP? Content Optimization Policy?

That's the quickest alternative I can think of to CPP, though I'd be fine with CPP in any case.


No alternative can take the place of AMP now that it has become the standard, and there are hundreds of scripts and plugins available for it. I am using WordPress plugins (https://talktopoint.com/wordpress-amp-and-instant-articles/) and they work just fine. What is the need for this new alternative, and why should I use it?


We already have a standardized alternative to AMP; it's called HTML.

Just cut the crap.


Shouldn't this be a whitelist? If we restrict the features allowed, we won't run into all the same security and performance issues we have with the web.

(also, we could call it HTML Light, or htmll)


> No synchronous external scripts nor blocking external stylesheets

I know we have:

<script type="text/javascript" src="script.js" async></script>

but for stylesheets, why can't we have something similar? Instead of a JS workaround, can't we have:

<link rel="stylesheet" href="sheet.css" async/>

I hate hate HATE the idea of having css dependent on some JS (even if enabled) that might or might not run depending on what feels like working today.


In the future (i.e. currently only Chrome supports this) you will be able to do this:

    <link rel="preload" href="/assets/stylesheet.css" as="style" onload="this.rel='stylesheet';">
which will more or less be async CSS. You can see that it will download in the background and morph into a stylesheet when it's ready, while the document continues to be parsed below it.

Then it will just be a matter of including this for people with JS switched off:

    <noscript>
      <link rel="stylesheet" href="/assets/stylesheet.css">    
    </noscript>
And then for browsers which have JS enabled but don't support the resource hint 'preload', you could do something like this as a fallback:

    window.addEventListener('load', function sweepUnloadedPreloads() {
      window.removeEventListener('load', sweepUnloadedPreloads, false);

      // only touch style preloads, not font/script preloads
      [].slice.call(document.querySelectorAll('link[rel=preload][as=style]'))
        .forEach(function (item) {
          // simply doing this might work:
          item.rel = 'stylesheet';

          /* OR, if that doesn't work (I haven't tested it), create a fresh
             stylesheet link instead of mutating the preload one:

          var new_link = document.createElement('link');
          new_link.rel = 'stylesheet';
          new_link.href = item.href;
          document.head.appendChild(new_link);
          */
        });
    }, false);
The sketchy hypothetical fallback technique above, or any JavaScript CSS loader, could be augmented by using prefetch to attempt to get the tyres warm and start a low-priority download of the stylesheets in question.

    <link rel="prefetch" href="/assets/stylesheet.css">
And obviously there's Service Worker, which is also slim on support, but promises to turn your website into a near-native experience by providing the mother of all caches for resources and offline pages/resources.

Preload spec: https://www.w3.org/TR/preload/

Preload support: http://caniuse.com/#feat=link-rel-preload

You could also just put the link element(s) specifying your stylesheet(s) in the body to 'async' it — Stripe does this on stripe.com. It's not valid HTML but very few browsers seem to give a damn.


> It’s also the only JavaScript allowed: author-written scripts, as well as third party scripts, are not valid.

Minor nitpick: they're allowable in iframes. I.e. sandboxed.


http://xkcd.com/927 comes to mind



