> instant.page’s script is hosted serverless with Cloudflare Workers so that there’s no server to hack into.
There are a lot of reasons that this software is "safe", but this absolutely isn't one of them, and it suggests to me, as a potential user, that you haven't thought through your threat model. The subresource integrity line should absolutely be the headline here.
Haha this is really good :) I was never fond of this "serverless" naming, precisely because it could lead people to believe that there were actually no servers, but I had never seen such a good example of it :)
Oh, and guess what: not only are there servers, but other people can run code on the same server your code is running on. Doesn't feel as safe suddenly ;)
Alternative phrasing: "This is hosted on Cloudflare Workers, and Cloudflare engineers are probably doing a better job running my code securely than I would"
It's useful for the ~0.4% of browsers that will execute a module script without checking its integrity. But I do indeed intend to put more emphasis on SRI in the future.
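For reference, this is roughly what an SRI-protected module script tag looks like; the hash below is a placeholder, not the real digest of any release:

    <!-- Module script with subresource integrity; the hash is a placeholder,
         not the actual digest of the 5.1.0 release. -->
    <script src="//instant.page/5.1.0" type="module"
            integrity="sha384-PLACEHOLDER_BASE64_DIGEST"></script>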
I encourage you to think through some ways an attacker could compromise my site, and how your service is a gateway to that compromise. Notably, "serverless" doesn't mean anything to the consumer of your service--it has always just meant "somebody else's server". So why not talk on this page about how your Cloudflare account has a very strong password and two-factor auth enabled? Because that doesn't cover the threat model that matters to me either, which is that somebody--you, Cloudflare, Saudi hackers--compromises the server.
One thing you may have meant by "serverless" is that you don't use a database / don't collect data on users. This is another angle you might want to use to highlight the safety of using your service.
Of course. The topic at hand, though, is browsers executing untrusted scripts. A wise attacker will leave the rest of the site intact to make the defacement less noticeable.
The ability to deface the site can also be used to change the integrity hash, as it appears to users (other developers) in the copy-paste snippets, so that it matches a maliciously modified script, making that script trusted.

End developers using this library would thus, of their own accord, include the URL of the altered script and its hash in their pages by copy-pasting a snippet from the defaced landing page, satisfying the browser's integrity checks.
A subtle change like that is unlikely to be noticed by end developers, who would have to count on library site maintainers to have mechanisms in place to notice such an attack promptly and (perhaps more importantly) to not suppress the news about the incident.
I think this library has gotten steadily worse over time. After 1.2.2, each new release just introduces more configurability that >99.99% of sites won’t want, more complications, more bugs and more misfeatures. Most notably version 5 introduced triggering load on mouse down, which I call catastrophically wrong, 100% dealbreaker with prejudice. 5.1.0 made it opt-in, but it’s still functionality that just shouldn’t exist.
I golfed 1.2.2 down a long way, removing other configurability and functionality that I didn’t want; see https://news.ycombinator.com/item?id=23204741, part of the discussion a few months ago at https://news.ycombinator.com/item?id=23203658 (where there was much wringing of hands about the whole load-on-mousedown mess). 5.1.0 is 2,842 bytes, 1,169 gzipped; my version got to 981 bytes, 532 gzipped. At that point it’s almost certainly cheaper to inline it than to have it as an external script (though if your CSP has script-src excluding unsafe-inline, you’ll need to give it a nonce or sha hash if you want to do it inline).
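For anyone trying the inline route under such a CSP, it looks roughly like this; the nonce value here is illustrative and has to be regenerated for every response:

    <!-- Assuming a header like:
         Content-Security-Policy: script-src 'self' 'nonce-R4nd0m'
         (generate a fresh nonce per response). -->
    <script type="module" nonce="R4nd0m">
      // paste the golfed preloading code here
    </script>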
No. I golfed it because I enjoy doing that, and thought I might want to use it at some point; I now doubt I ever will, which is also a part of the reason it’s entirely untested. I consider it a complete work of art needing no maintenance, although it is possible that some years down the track some form of alteration may be reasonable. If you want to use it, copy and paste it! :)
I think rather than waste bandwidth and fight for those last 80ms, we should fix sites that take multiple seconds to load. Especially after the first load — subsequent pages should load instantly.
In general, it's 2020, and if every click on your site takes 5 seconds, something is very wrong.
It makes a big difference and doesn't only help websites that take seconds to load.
Just look at its hover->click speed demo on the landing page. I'm getting 500ms+ if I click it casually like I do most links. All of that time could be spent loading the next page.
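If you want to measure that gap on your own links, a rough sketch (not the landing page's actual demo code):

    // Rough sketch for measuring hover-to-click delay; not the demo's real code.
    let hoverStart = 0;
    const link = document.querySelector('a[href]'); // assumes the page has at least one link
    link.addEventListener('mouseover', () => { hoverStart = performance.now(); });
    link.addEventListener('click', (e) => {
      e.preventDefault();
      console.log(`hover to click: ${Math.round(performance.now() - hoverStart)} ms`);
    });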
https://dev.to/ (blog platform) does prefetching to great effect.
The onus is on you to decide if this would create a bunch of false positive prefetches for your desktop users, like if your website was a bunch of densely packed links. You could also scope this to prefetch only a subset of links.
But your users deserve more deliberation over their UX than a kneejerk yay/nay.
Every page should load fast, obviously, but instantly means the data has to be prefetched and that's potentially a lot of data to download that the user might never actually need. That's a waste of everyone's bandwidth, especially on mobile. Pages that a user will view (eg first page behind a login), or that they're very likely to view (eg top result in a search), should be prefetched/prerendered but whether you should prefetch more than that requires thought and measurement.
A great stimulus for "thought and measurement" would be user feedback. If users could see who wastes their data plan as easily as they can see who eats their battery, this would change the picture.
Browser vendors should implement per-site traffic breakdown.
As far as I know this only preloads HTML. For the majority of sites, that's not going to cause a huge increase in data use. Most pages' HTML is only a couple of KB.
From my understanding that's not how this tool works (admittedly it's a while since I examined it deeply). It waits for a mouse hover over a link and then preloads that link only. Unless it's been terribly abused, there's no way it should preload all the links on a page.
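The mechanism is roughly this (a simplified sketch of the approach, not the library's actual source):

    // Simplified sketch of hover-triggered preloading (not instant.page's source):
    // after a short hover, inject a prefetch hint for that one link only.
    const HOVER_DELAY_MS = 65;
    let timer = null;

    document.addEventListener('mouseover', (event) => {
      const link = event.target.closest('a[href]');
      if (!link) return;
      timer = setTimeout(() => {
        const hint = document.createElement('link');
        hint.rel = 'prefetch';
        hint.href = link.href;
        document.head.appendChild(hint);
      }, HOVER_DELAY_MS);
    });

    document.addEventListener('mouseout', () => clearTimeout(timer));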
Open the developer tools and prepare to be surprised how long the average site takes to get to the "everything loaded" state.
P.S. I'm only slightly exaggerating. There's an awful lot of bullshit being loaded behind the scenes on the average site. Good thing I'm not using a weak Internet connection.
Hmmm... I actually stopped worrying about bandwidth completely in 2020. My home and mobile connections have virtually unlimited data; I couldn't use it all up if I tried. It's amazing to see tech like this that leverages the excess data I have and converts it into speed.
Lucky for you, but that's not really the case for everybody. For instance, I have a 35 GB mobile cap and a 100 GB home cap. While I'd love to see a faster experience while clicking links, it would waste a considerable amount of bandwidth if all sites implemented this. And that's just the user side: I'm not even getting into the extra load on the server side.
It's already a static Jekyll-compiled site served from a CDN, but combined with other techniques, instant.page makes the site quite snappy. (We had the EasyList blocking problem described below, so we started bundling it into our own post-load JS bundle.)
Honestly, unrelated, but the biggest performance improvement for us has been to do server-side rendering of the LaTeX equations, rather than using MathJax client-side.
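For anyone curious, a minimal sketch of that kind of build-time rendering, using KaTeX here rather than the MathJax they mention only because its Node API is a one-liner:

    // Render a LaTeX equation to HTML at build time (KaTeX used instead of
    // MathJax purely to keep the sketch short).
    const katex = require('katex');

    const html = katex.renderToString('E = mc^2', {
      displayMode: true,
      throwOnError: false,
    });
    // Embed `html` in the generated page and ship katex.css; no client-side JS needed.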
2 decades ago there was a product whose name escapes me. It was a tool to speed up web browsing, but instead of preloading which link it thought you were about to click next, it simply started preloading all the hyperlinks on the webpage.
And didn’t that get people into trouble in corporate networks because some of the links it followed were links the user would never have clicked on. I vaguely recall that problem arising in the late 90s / early 00s.
It was a recurring problem in the Rails world because the reaction from DHH was to tell people the preloaders were "evil" instead of telling people to follow the HTTP specification.
It was a problem everywhere and in no way specific to RoR. If anything, RoR was one of the first more or less mainstream frameworks to solve a lot of the common problems of the time (CSRF, the HTML spec only allowing "GET" and "POST" for the form method attribute, etc.).
It wasn’t a problem everywhere. It was only a problem where people used GET in an unsafe way, which was a minority of sites.
What makes Rails prominent in this was the creator of the framework blamed Google for their own bugs and told people to try to detect GWA to hide the links instead of telling people to follow the HTTP specification. People who followed his advice suffered the bug a second time, and people who ignored him and followed the HTTP specification avoided the problem.
It's not just a problem with unsafe GET requests; it's a problem for anyone unfortunate enough to work at a company that logs web traffic (which was common practice back in the 90s / early 00s) and who happened to stumble on a site that might have hyperlinks to warez or porn (which, basically, could be any site with user-generated content).
I remember hearing several stories about employees getting reprimanded for accessing sites they never intended to visit.
I also know of a similar tale where a national football team had their DNS hijacked and visitors were served porn instead of sports news (that also happened around 2000 or so).
> It wasn’t a problem everywhere. It was only a problem
> where people used GET in an unsafe way,
this is technically correct.
> which was a minority of sites.
I don't think those were a minority. A minority of sites got hit by GWA, but before that not many cared about which method was used either. PHP ruled the web back then, and I doubt security practices were much better. Heck, even Google's own Blogger was hit by it, IIRC.

What made Rails prominent in this was that 37signals were very prominent at the time, so everything Jason F. or DHH said was heard far and wide. And while you are right that following the spec is the proper way, implementing a quick fix to block GWA may have been quicker to deploy than rewriting the app to use proper methods.

These admin interfaces should have been behind HTTPS anyway, and GWA did not follow HTTPS links. All those mistakes on the web-app side don't mean that GWA was a good idea, though, because it caused other problems that had little or nothing to do with idempotency: inflated traffic stats, messed-up caches, etc.
There were quite a few of these in the late 90s, but I believe the most recent attempt at making a product out of this was the Google Web Accelerator about 15 years ago.
Back then one of the common mistakes used to be taking a huge image and "resizing" it with "width" and "height" attributes on the img tag. It still is, but it used to be, too.
I vaguely remember an add-on (For Mozilla browser??) that you could configure to preload links, that I used to preload pages of webcomics on my subpar childhood dial-up. I would open up the archive page, or first page, go do something else for a while, then start reading, and I would never catch up to the cache.
RES for Reddit also has an option that preloads every post on the page for instant loading, at the cost of bandwidth (my usage increased by ~100 GB in one month).
>> Before a user clicks on a link, they hover their mouse over that link. When a user has hovered for 65 ms there is one chance out of two that they will click on that link, so instant.page starts preloading at this moment, leaving on average over 300 ms for the page to preload.
This sounds great! I immediately wonder how many additional HTTPS requests will hit your web server fleet. Has anyone using this calculated a percentage of additional requests vs additional clicks from their http logs?
A neat hack would be to have it prefetch only pages already cached on the server (though those are the least in need of speeding up). Or maybe the prefetch requests go into a separate queue at a lower priority.
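One coarse way to answer the "how many extra requests" question: some browsers mark prefetch requests with a Purpose (or newer Sec-Purpose) header, so you can count, or deprioritize, them separately server-side. A sketch with Express-style middleware; the header isn't guaranteed in every browser, and all names are illustrative:

    // Count prefetch requests separately from real page views, assuming the
    // browser marks them with a Purpose/Sec-Purpose header (Chromium does for
    // <link rel=prefetch>; not guaranteed everywhere).
    const express = require('express');
    const app = express();

    let prefetches = 0;
    let pageViews = 0;

    app.use((req, res, next) => {
      const purpose = (req.get('Purpose') || req.get('Sec-Purpose') || '').toLowerCase();
      if (purpose.includes('prefetch')) prefetches++; else pageViews++;
      next();
    });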
So my browser starts loading a page from a link that I haven't clicked. Does anyone else see the potential security nightmare on this scenario?
Drive-by downloads, malware, etc. I totally get the benefits of this mechanism, until it starts being abused for tracking and delivering malware, and then add-ons will (hopefully) appear to block it.
What, how is this a security nightmare? What's your threat model? You think I, the website operator, am the enemy? Then I can just window.location you. You think the website operator is hosting compromising links because they've been suckered? But then once you click them you're going to be compromised.
I swear to God, so many HN comments just say "security nightmare" for everything.
The preloaded page won't get rendered until you click on it. Assuming that malware activates only by rendering, not simply by downloading, this seems safe.
Except it’s not bullshit. EasyList does exactly what they promise: privacy.
Prefetching is always a privacy issue because you don't know what's running on the server side. They will log what I'm looking at and which pages I'm likely to view next. That's almost Amazon-level on-page tracking.
I think I could probably build an invisible grid and use that script to track users more precisely, but I haven’t looked into it.
Prefetching can be misused and therefore should not be allowed. And yes, it’s theoretical. So are XSS, CSRF, and others.
Then why doesn't easylist add google.com and facebook.com to their list? I'm sure G/FB track every single click you do on their pages. Huge privacy violation. Not even theoretical. Block youtube also, they are tracking your viewing habits, how much you watched, where you skipped, etc. Just block all this.
I wish there was a way to easily override individual entries in easylist/ubo.
Because those are services that people use and consent to (implicitly at least) violating their privacy. There's a huge difference between YouTube tracking my habits to improve my YouTube experience (as well as for marketing reasons) and a third party tracking them solely for marketing reasons.
I'm sorry, but I'll gladly take 100ms of wait over a script fetching every link I hover over. There's a reason I disable stuff like DNS preloading in my browser and this is no different.
I can see how some websites benefit from preloading everything, but they should self-host the script (which won't get blocked by EasyList) and leverage HTTP/2 Push to load it, instead of relying on an external domain that has full access to my IP and the URL I was visiting (through the Referer header).

To be clear, I have nothing against your project; it's just the centralized hosting that I don't trust. Unfortunately, you hosted the JS on the main domain, so the blocklist now also blocks your homepage; had you hosted the script on a separate domain (cdn.instant.page) you wouldn't have had to deal with this. Because of the way links are generated, I don't disagree with the decision by EasyList, though. Even if you did it by accident, you've created a tracker, and that means you end up on tracker-blocking lists.
If you self-host the JS, the script won't get blocked by EasyList. You can still use the project, you just need to disable your ad blocker to download the script from the main website.
The mobile site features a similar button, which measures the time between screen touch and release: tapping it casually like I would any other link netted ~120 ms; the fastest I could physically tap was 20 ms!
That demonstration alone was enough to pique my interest and give this a try.
Interesting theory. To throw out some 'anecdata' (maybe we can turn this into a sort of study?): my three-device test (iPhone 6S and Pixel 2 XL at 60 Hz, 12.9" iPad Pro at 120 Hz) would seem to indicate that the display refresh rate does not affect the touch timing. However, this could be the result of the large performance differences between the devices used, as the average and maximum times trend down with the newer/faster devices in testing.
I'm somewhat concerned about this kind of preloading. Given that network traffic is a huge part of the total energy consumption of IT, if not the major one (I've seen estimates of about 40% of the total), this adds significant overhead (according to the webpage, only 50% of the triggered preloads are actually followed by a page view, which amounts to a 100% increase in initial traffic) for a gain of a few milliseconds.

Mind that trimming the (initial) page load itself by a fraction will do the same for you, and more.
Even if this is a bandaid solution attacking the wrong problem, it's great to see people caring for speed for a change.
But isn't it dangerous? How does it know you're not preloading the "please delete everything" link?

It would be nice to have a standard HTML attribute like preloading="yes" so that one day browsers could offer this as a built-in feature. I'd much rather have this than AMP.
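The closest existing thing is the declarative prefetch hint, though it sits in the head per target URL rather than as an attribute on the link itself (URL illustrative):

    <!-- Declarative hint: ask the browser to fetch a likely next page at idle
         priority. The URL is illustrative. -->
    <link rel="prefetch" href="/next-article">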
> But isn't it dangerous? How does it know you're not preloading the "please delete everything" link?
In a perfect world no-one would be using links for actions which modify state - they should always be a POST behind a button, not a GET behind a link.
But in the real world this is probably a legitimate concern. They do have methods to mark links as non-preloadable (https://instant.page/blacklist), but anyone using it should definitely do a thorough test, not just copy and paste and hope for the best
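To make the GET/POST point concrete (the URL is obviously made up):

    <!-- Risky: a state-changing action behind a plain GET link. Prefetchers,
         crawlers or an <img src> trick can all trigger it. -->
    <a href="/account/delete-everything">Delete everything</a>

    <!-- Safer: the same action behind a POST form (plus a CSRF token). -->
    <form method="post" action="/account/delete-everything">
      <button type="submit">Delete everything</button>
    </form>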
> But isn't it dangerous? How does it know you're not preloading the "please delete everything" link?
As a sibling comment points out, if it's possible for someone to delete everything via a link, your users have already lost. Someone could, e.g., trick your users into loading an IMG tag or an IFRAME pointing to that page, or give them a shortened URL that redirects to your "delete everything" page.
Hosting the script yourself is also possible. Download the latest version at https://instant.page/5.1.0 then add a module script tag just before </body>:
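Something along these lines; the local path is whatever you saved the file as:

    <!-- Self-hosted copy; the path/filename is whatever you saved it as. -->
    <script src="/js/instantpage-5.1.0.js" type="module"></script>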
A little bit off topic: the axios website on mobile 4G loads almost instantly on my phone, even better than most FANG websites. I'm not sure it's only the CDN doing this. Any clue?
For as-fast-as-possible first time load, use TLS 1.3 + inline CSS + server close to user (with a CDN for global audiences) + not much JS (absolutely no 30kB framework).
For as-fast-as-possible subsequent page loads, use preloading (à la instant.page) + a light SPA. My previous library, InstantClick, did just that but it's alas mostly just a proof of concept (it lacks good docs); I intend to "reboot" it this year and announce it on HN.
FANG websites aren’t a very good site speed standard. :)
It rings in at 414ms, 1.9mb uncompressed (~1mb compressed), with a rather obnoxious 90 requests.
They're loading 961kb of script and 197kb of font content. A whole 41kb of actual HTML content in that obese vat of bytes.
On a small Quora page with no major images, they come in at 2mb of junk, 1500ms to load, with 79 requests.
A typical small Wikipedia page with one image will come in at 400kb-500kb and load in 400-500ms, with 26 requests.
GTMetrix lists the average load page size for their performance tests, compressed (!), at 3mb (with 89 requests). Framework bloat is like living on a sugar diet.
They mention mobile on the page: they use mousedown to preload, which (they claim) gives on average a 90 ms improvement over waiting for the click event.
Would it save time to load pages without making a normal new page request? I mean using AJAX/jQuery/… and updating the URL with window.history.pushState.
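A very simplified sketch of that idea (real implementations also need to handle head diffs, scripts, scroll restoration and popstate properly):

    // Very simplified pjax-style navigation: fetch the next page, swap the body,
    // and update the URL without a full reload.
    async function navigate(url) {
      const response = await fetch(url);
      const html = await response.text();
      const next = new DOMParser().parseFromString(html, 'text/html');
      document.title = next.title;
      document.body.replaceWith(next.body);
      history.pushState({}, '', url);
    }

    // Crude back/forward handling: just do a full reload.
    window.addEventListener('popstate', () => location.reload());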
I use instant.page and it doesn't appear to work in Safari (at least in desktop version 13.1.2): I can't see any transfers in the Network tab on hover.
It’s an old idea that keeps resurfacing and frankly I think it misses the real problem of modern websites: they’re too bloated with code that the user is expected to run and yet isn’t there to serve the user.
I’d love to see a world where cloud computing / time sharing was tipped on its head and users and businesses could charge back their compute time spent running JavaScript trackers, analytics, etc from the sites their users visit. Of course this fantasy would never be technically possible but one can dream.
> It’s an old idea that keeps resurfacing and frankly I think it misses the real problem of modern websites: they’re too bloated with code that the user is expected to run and yet isn’t there to serve the user.
I agree that the latter is a problem with many modern websites. That, however, does not mean that this approach does not have a place; many people (especially people among the HN and HN-adjacent crowd, who tend towards simple layouts and static site generators) optimise their sites as much as is reasonably possible, and specifically, don't pessimise their sites with megabytes of tracker & ad scripts. For those people, instant.page might well be a worthwhile optimisation.
>It’s an old idea that keeps resurfacing and frankly I think it misses the real problem of modern websites: they’re too bloated with code that the user is expected to run and yet isn’t there to serve the user.
I can't find it now, but I thought there was an article recently that looked at page rendering times and how they have changed over the years (as bandwidth has also increased). In general, page rendering time hasn't decreased as much over the last few decades as you would expect, and a part of that is the amount of js (often for tracking - not to serve the user) that is pervasive nowadays.
Ok, I'll just rewrite my clients' entire website using a trendy, brand new JS framework that is built on top of an ever-changing trendy JS library, requiring the entire site to be compiled from scratch when my client wants to update a title.
Different tools for different projects. Gatsby and Gridsome, or any other static site generator, are not comparable to a simple script like this.
Throwing a random script onto a website I was responsible for, which claims to magically preload things and make things faster is not my idea of simple. Easy, perhaps. Risky, yes. Simple, no.