Getting Started with Headless Chrome (developers.google.com)
657 points by uptown on May 1, 2017 | 121 comments



I wrote a little ipython notebook with a stupid simple example of getting headless chrome up and talking to it via a websocket: https://gist.github.com/llimllib/7f6143a1a6955d243161b2fec23...

Then I started on a python library that would handle communication with chrome in a real way using asyncio: https://twitter.com/llimllib/status/855433309375082496

If that's a thing that you're interested in, let me know. https://github.com/llimllib/chrome-control/tree/wstest
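For anyone who just wants to see the shape of it without opening the notebook, here's a minimal sketch (not the notebook itself), assuming Chrome was started with --remote-debugging-port=9222 and you have the requests and websocket-client packages:

    # Minimal sketch: talk to headless Chrome's DevTools websocket.
    # Assumes: google-chrome --headless --disable-gpu --remote-debugging-port=9222
    import json

    import requests
    import websocket  # pip install websocket-client

    # /json lists the open targets; each one exposes a websocket URL.
    tabs = requests.get("http://localhost:9222/json").json()
    ws = websocket.create_connection(tabs[0]["webSocketDebuggerUrl"])

    # Every DevTools protocol message is JSON with an id, method, and params.
    ws.send(json.dumps({"id": 1, "method": "Page.navigate",
                        "params": {"url": "https://example.com"}}))
    print(ws.recv())
    ws.close()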


FWIW, I wrote a wrapper that exposes the remote debugger interface in python functions here: https://github.com/fake-name/ChromeController

It works by reading the interface description JSON files, and dynamically generating the API.

It doesn't handle the higher-level stuff (async, etc...), but for synchronous things, it's quite nice.
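In case the "dynamically generating the API" part is unclear, the rough shape of the idea is below (heavily simplified, not ChromeController's actual code; `transport.send` is a stand-in for the websocket round trip):

    # Rough sketch of generating call wrappers from the protocol JSON.
    # Illustrative only; `transport` is assumed to handle the websocket I/O.
    import json

    def build_api(protocol_path, transport):
        """Return a dict of callables, one per protocol command."""
        with open(protocol_path) as fh:
            protocol = json.load(fh)

        api = {}
        for domain in protocol["domains"]:
            for command in domain.get("commands", []):
                method = "{}.{}".format(domain["domain"], command["name"])

                def call(method=method, **params):
                    return transport.send(method, params)

                api[method] = call
        return api

    # api = build_api("browser_protocol.json", transport)
    # api["Page.navigate"](url="https://example.com")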

--------

As an aside, if you're interested in this sort of thing, the chrome devtools protocol has an issue tracker here: https://github.com/ChromeDevTools/devtools-protocol/issues

Yes, it's on GitHub; no idea why, when they already have their Monorail issue tracker.


FWIW, we created the Github issue tracker b/c it's a bit more approachable for everyone. Github is also a much better place for discussions.


Ah, that makes sense.

Monorail is kind of poor. The inability to edit comments is particularly annoying.


I saw that! And it was helpful to me getting started, thanks.

I wanted to generate types from the protocol file rather than generate them at run-time, so that was the first thing I did. I think it helps debuggability/autocompletability/readability to have the types reified; check it out if you're interested, all the capital-letter-named files are generated.


It's a few-line change to output the AST to code via astor (it's actually part of how I got everything working in the first place). You get full executable code with docstrings and everything, though there's no support for generating inline comments (I think that changed in 3.6, IIRC).

Would you be interested in a patch that does something like that? It should be a heck of a lot more robust than what you're currently doing (which is manual code generation?).
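For anyone who hasn't used astor, the round trip is roughly this (a toy example, not the actual generator):

    # Toy illustration of turning an AST back into source text with astor.
    import ast

    import astor  # pip install astor

    tree = ast.parse("def navigate(url):\n    return url")
    # ...generate or mutate nodes here...
    print(astor.to_source(tree))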

------

Also, if you're interested in doing things with the debug protocol, https://github.com/ChromeDevTools/devtools-protocol/issues/6 might be relevant.

It's an issue about how there's no good way to determine the remote debug interface version, since the version information that's present never changes, and they keep moving functions.


Definitely!

And yeah the ad hoc protocol thing is... suboptimal


I put together a caching system so the generated wrapper is made available for overview/manual-patching, etc.... It currently validates the generated file against a generated-from-the-json file at import, and falls back to blindly importing the json file if the tooling to generate the class definition is not available.

Ideally, any patching for the class file should really be done as a subclass, as well as any more complex logic. That way, if the protocol json file is changed, it can be just updated without breaking anything.

Anyways, generated class is here: https://github.com/fake-name/ChromeController/blob/master/Ch...

Oh, as an aside, you're actually missing a significant chunk of the protocol json file, mainly because there are actually two of them. There's a `browser_protocol.json` AND a `js_protocol.json` file. They're both produced as build artifacts, so I can't point to them in the source, but the total json size should be > 800KB (mirror in chromecontroller: https://github.com/fake-name/ChromeController/tree/master/Ch... ).

As it is, this should kind of just splat into your source tree, but I don't want to move everything around without at least asking you how you want things structured.


ATM I haven't dealt with structure yet, just dumped it into a dir. I'll get a chance to take a look this weekend, many thanks!


I recently wrote a guide to using headless Chrome with Selenium and Chrome WebDriver [1]. I thought that some people in this thread might find it useful considering that the submitted article doesn't mention Selenium at all.

[1] - https://intoli.com/blog/running-selenium-with-headless-chrom...
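The core of it looks roughly like this (a minimal sketch along the lines of the guide, assuming Chrome 59+ and a matching chromedriver on your PATH; window size is set via a ChromeOptions flag, as in the guide):

    # Minimal sketch of Selenium driving headless Chrome (Chrome 59+ and a
    # matching chromedriver on PATH assumed; details are in the guide).
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless")
    options.add_argument("--disable-gpu")            # still needed on some platforms
    options.add_argument("--window-size=1280,1024")  # set the size via a flag

    driver = webdriver.Chrome(chrome_options=options)
    try:
        driver.get("https://example.com")
        driver.save_screenshot("example.png")
        print(driver.title)
    finally:
        driver.quit()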


Your guide mentions setting window size and taking screenshots, which I thought were not currently working via chromedriver. Do you know if that was fixed or is there something else going on?


Setting window size via the ChromeDriver API didn't work for me but I was able to set it using a ChromeOptions command-line argument as I do in the guide. Screenshots seem to work fine on Linux but I haven't tried on other platforms.


Thanks for sharing the link.

Just thought you'd like to know that there's a small typo in the title: "Runnng" instead of "Running".


Off-topic, but I'm just going to start bringing this up in every thread like this ...

We need a recipe / toolchain for running a browser instance in a chroot jail. With a head. GUI. On the desktop.

I want to fire up a "banking browser" (or "danger browser" or "social media browser") that runs on its own IP, in its own chroot, and I can revert to virgin snapshot state anytime I want ... and I don't want to fire up a full-blown VM for each of these ...

What is standing in the way of making this happen?

Who is working on things related to this that I can donate funds to or establish a bounty ?


Sounds like Qubes OS https://www.qubes-os.org/


The idea here is that I set up a chroot jail for firefox or chrome and configure it with things like local filesystem for cookies and cache and certs, etc.

It would also get its own unique IP, this jail.

Then I fire up firefox inside that chroot jail and use it to visit some websites ... and then I can wipe the whole thing out and redeploy again later, starting from scratch.

I don't need to trust incognito mode, I don't need to trust wiping cache or tabs talking to each other (or not) and I can worry a lot less about browser level exploits.

I can even put in firewall rules so that my "banking" instance can only talk to boa.com and scotttrade.com (or whatever).

It's totally workable (and I have done it) with vmware. Make a "bank browsing" VM and revert to pristine snapshot every day. The problem is that this is terribly heavyweight and overkill when I don't need a full blown VM.

It's not even really a browser issue - the real issue is, how do you jail a GUI application in X such that that window is in a chroot jail, distinct from the rest of your desktop?


>> this is terribly heavyweight and overkill when I don't need a full blown VM

The entire concept you're aiming to set up is terribly heavyweight and overkill. If you're knowledgeable enough to be discussing VMs and chroots, you must realize that what you are proposing is being careful to the point of paranoia à la tinfoil hat. Those of us who know how to stay as safe as possible via "basic" methods of security should be sleeping soundly knowing we're already in the top 5-10% of consumers. Install OS security updates, use a virus scanner and firewall, don't install pirated software (more likely to contain malware), and you're better off than most people by a significant margin.

You're talking about barely making a dent in the chances of your credentials or sessions being compromised. Private browsing, a separate browser instance, a VM, or chroot makes no difference if you have malware with a keylogger on the host system. Give yourself a break, realize that there is no such thing as "perfect security", and stop worrying so much. The amount of energy you're pouring into "banking safely" is not a sane endeavor. It serves no useful purpose. You could be investing this time and energy into something far more likely to improve your quality of life (eg: family, friends, health, etc.).


I've seen so many people stress about getting their credit card stolen or bank accounts hacked. It's rather ridiculous considering you don't bear the liability of a hack. If you didn't access or approve a usage of your accounts, the banks just give it back. I have had money stolen more than once from skimmers and I have never had any trouble getting it all back.


tl;dr relax, go for a walk


Did this a few months ago, somewhat straightforward. A "generic" recipe for any UNIX-based OS would be:

1 - Create a container (Docker, Jails, maybe even chroot)

2 - Assign an ip to the container, NAT to it

3 - Install firefox on that container

4 - Run a SSHD server, enable X11 forwarding

5 - Mount the relevant container's folders to your root fs (eg: map $CONTAINER/.mozilla and $CONTAINER/Downloads to $HOME/jailed-browser/)

6 - Add an ssh config for quick alias

7 - Run `ssh container firefox` and profit
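The ssh side of steps 4-7 ends up looking roughly like this (host name and paths are made up):

    # Hypothetical ~/.ssh/config entry for the container:
    #
    #   Host browserjail
    #       HostName 10.0.0.50
    #       User browser
    #       ForwardX11 yes
    #
    # Launching the jailed browser is then just:
    ssh -X browserjail firefox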

Here's a nice example using FreeBSD jails (I remember following this tutorial, everything worked out fine): https://forums.freebsd.org/threads/53362/

My experience with it, though, wasn't great. X11 forwarding through SSH was quite laggy (even after performing some optimizations on the connection). Good luck if you want to set up audio/mic support. It's a nice solution for a one-time banking login, not for day-to-day use.


"Did this a few months ago, somewhat straightforward. A "generic" recipe for any UNIX-based OS would be:"

Thank you very much! I will give this a try immediately.


Or you can have a banking computer, danger computer and social media computer with everything truly separate. These days, it's quite cheap, as you can get an ARM board with HDMI output for $18 incl shipping and no management engine blobs.

You can create arbitrary local networks with it, and isolate concerns in separate hardware.

If the "social desktop" get's compromised or even rooted, the attacker would still need to find a way through the physical router/firewall, etc. It would not be just a question of finding/using VM/container/chroot escape vulnerabilities.

You can also air gap certain endpoints. Physical security? Use FDE and pop the sd card out of the SBC and take it with you when you're out of home.

Security against rubber hose cryptanalysis or 'contempt of court' cryptanalysis? Devise some way to quickly destroy the sd card when necessary. Then there's nothing to rubber hose you for.

/s? Maybe. But it's all possible today, rather easy to do, and cheap. :)


I don't know much about this, but I swore I read that Chrome uses chroot for its sandboxing. Would that not already accomplish what you wanted?

https://chromium.googlesource.com/chromium/src/+/master/docs...


Paging HN user christux of http://tux.io

Ask HN | https://news.ycombinator.com/item?id=14245428

> public access to non-root desktops in LXC containers

Show HN | https://news.ycombinator.com/item?id=14245447

> a Linux desktop in your browser


I always thought http://www.zerovm.org/ had a lot of potential. I recall an HN story where someone used this to spin up a ZeroVM instance for each request that came in over an API and tear it down once the request was finished. That sort of speed and isolation would work really well for this use case.


Maybe you already saw this:

https://blog.jessfraz.com/post/docker-containers-on-the-desk...

It's pretty dated, but the state of containers has only improved since 2015, and this was usable for Google Hangouts (video) and YouTube all the way back then.

https://hub.docker.com/r/jess/chrome/

^ This got a push less than 24 hours ago

https://github.com/jessfraz/dockerfiles/blob/master/chrome/s...

^ This is the Dockerfile, in a repo full of Dockerfiles for other things you might have had trouble putting into a container.

I tried this myself and had problems because (I think) I was on a very old LTS/stable distro with a necessity to use a pretty old Docker release. This person is a maintainer on the Docker core team, so is most definitely tracking the newest releases.

I use Chrome in a headless setup with Kubernetes and Docker (Jenkins with kubernetes-plugin) but it's not Headless Chrome, it's Xvfb that makes my setup headless. Chrome itself works great in a container. It's one excellent way to keep it from eating up gobs of RAM like it does: just give it less RAM (cgroups).

If you said "chroot jail" on purpose because you don't want Docker, I don't know why exactly, maybe because you don't think you can run graphical apps in a Docker container, but it's not true. You can!

You could also cobble something together and do it without Docker, but I'm not recommending that, I just saw this and thought it would be exactly what you're looking for. Hope this helps!


It might sound unpopular, but I've always thought these things are a bit of shenanigans, honestly. It sounds like it increases security a lot, but it seems to just be a little extra paper on top. It's good for resource control (stopping Chrome from eating all RAM) or installing a browser quickly in a container/docker/jail/whatever, but security-wise I think it's not the right solution.

The thing is, a chroot jail doesn't really protect my browser in the way I want it to (if I'm speaking personally, I guess). It's not the right level of granularity.

If an exploit compromises my browser, it would, essentially, have vast amounts of access to my personal information already, simply due to the nature of a browser being stateful and trusted by the user. Getting root or whatever beyond that is nice I guess, but that's game over for most people. This is true for the vast majority of most computer users. I don't clear my browser history literally every day and 'wipe the slate clean', I like having search history and trained autocomplete, and it's kind of weird to expect people suddenly not to. It seems like it's a move laterally, in a direction that only really satiates some very technically aware users. Even then, I'd say this approach is still fundamentally dangerous for competent users -- a simple mistake or a flaw in the application you can't anticipate, or your own mistake, could expose you easily.

A more instructive example is an email client. If I use thunderbird and put it in a chroot jail, sync part of my mail spool, and then someone sends me an HTML email that exploits a Gecko rendering flaw and owns me 'in the jail' -- then they now have access to my spool! And my credentials. They can just disguise access to the container and do a password reset, for example, and I'm screwed. Depending on the interaction method of the application, things like Stagefright were auto-triggered, for example, just solely by sending SMS. It's a very dangerous game to play at that point, when applications are trying to be so in-our-face today (still waiting for a browser exploit that can be triggered even out-of-focus, through desktop notifications...)

The attack surface for a browser, and most stateful, trusted apps -- basically starts and ends at there, really. For all intents and purposes, an individual's browser or email client is just as relatively valuable as any company's SQL database. Think: if you put a PostgreSQL instance inside a jail, and someone exploits your SQL database... is your data safe? Or do they just exfiltrate your DB dump and walk away? Does a company wipe their database every week to keep hackers from taking it?

Meaningful mitigation has to come, I think, in the way Chrome does it: by doing extensive application level sandboxing. Making it part of the design, in a real, coherent way. That requires a lot of work. It's why Firefox has taken years to do it -- and is pissing so many people off to get there by breaking the extension model, so they can meaningfully sandbox.

Aside from just attack surface concerns though, jails and things like containers still have some real technical limitations that stand in the way of users. Even things like drag-and-drop from the desktop into a container are a bit convoluted (maybe Wayland makes this easier? I don't know how Qubes does it), and I use 1Password, so the interaction needed with my key database means we're back at square one: browser compromise 'in the sandbox' still means you get owned in all the ways that matter.

Other meaningful mitigations exist beyond 'total redesign' but they're more technical in nature... Things like more robust anti-exploit mechanisms, for example, in our toolchains and processes. That's also very hard work, but I think it's also a lot more fruitful than "containerize/jail it all, and hope for the best".


I have a feeling you misunderstood the parent's idea. The jail there is not to prevent someone from breaking out from the browser into the system. It's to contain simple attacks on your data, exactly because the browser is a stateful system with lots of stored secrets.

If you have a full sandbox breakout exploit, both cases are broken. But if you have just a stupid JS issue that breaks same-origin, or causes a trivial arbitrary file read, jails will protect you from them just fine. It's pretty much to stop a post you open from Facebook from being able to get your PayPal session cookie. Not many exploits in the wild are advanced.


Couldn't this be achieved in Chrome by creating different user profiles and switching between profiles depending on the site being visited?

I already break my social media away from shopping from banking using different Chrome user profiles.


> If you have a full sandbox breakout exploit, both cases are broken. But if you have just a stupid JS issue that breaks same-origin, or causes a trivial arbitrary file read, jails will protect you from them just fine.

If you can read an arbitrary file, what is stopping you from reading the browser's e.g. password database files, inside the container, or any of the potentially sensitive cached files, for example? Those files are there -- the browser writes them, whether it is in a sandboxed directory or not.

Or do you assume that there is no password database that the user stores in any 'sandboxed' browser instance, ever, and they copy/paste or retype passwords every time or something? This is basically treating every single domain and browser instance as stateless. This is what I mean -- users are never going to behave this way, only people on places like Hacker News will. They aren't going to use 14 different instances of a browser, each one perfectly isolated without shared search, or having to re-log into each browser instance to have consistent search results and autocomplete. It's just an awful user experience.

Of course, maybe you don't map files in there, inside the container. That's too dangerous, because if any part of the browser can just read a file, it's game over. Perhaps you could have multiple processes communicate over RPC, each one in its own container, with crafted policies that would e.g. only allow processes for certain SOP domains to request certain passwords or sensitive information from a process that manages the database. Essentially, you add policy and authorization. There is only one process that can read exactly one file, the database file. The process for rendering and dealing with the logic of a particular domain does not even have filesystem access, ever, to any on disk file, it is forbidden. It must instead ask the broker process for access to the sensitive information for a particular domain. You could even do this so that each tab is transparently its own process, as well as enforcing process-level SOP separation...

The thing is... That's basically exactly what Chrome does, by design. As of recent versions, Chrome can actually separate and sandbox processes based on SOP. But it can only do that through its design. It cannot be tacked on.

Think about it. Firefox does not have true sandboxing or process isolation. Simply wrapping it in a container is not sufficient, and simply having 40,000 separate Firefox containers, each with its own little "island" of state, each for a single domain, is simply unusable from a user POV for any average human being. It is also incredibly dangerous (oops, I accidentally opened my bank website inside my gmail container, now they're contaminated. Now if my bank website serves me bad JS, it can possibly get some content related to my gmail, if it can bypass browser policies. In Chrome's new architecture, this can't happen, from what I understand, even if you don't run two separate, isolated instances of Chrome. SOP is now process level, and it is truly baked into the design.)

How do you make this not garbage from a user POV? By rearchitecting Firefox around multiple processes, where each domain is properly sandboxed and requires specific access and authorization to request certain data from another process. And where processes that need access are literally denied filesystem access. That requires taking control of the containers itself, the same way Chrome does. Chrome goes to extreme lengths for this.

The only way to truly enforce these things is at the application level. Just taking Firefox, slapping it inside Docker or a jail, and doing that 40,000 times for each domain isn't even close to the same thing, if that's what you're suggesting.


You're right about a lot of things, but there are still missing pieces. Whatever sandboxing is used in Chrome (and you're right, Chrome is the gold standard now), a simple issue can still bring it all down. There are RCEs in Chrome published almost every month. Some will be limited by the sandbox and that's great. But I disagree with:

> It cannot be tacked on.

Security as in prevention of the exploit cannot be tacked on. But separation of data can be. And there's a whole big scale of how it works, starting from another profile, to containers and data brokers, to VMs like qubes, to separate physical machines.

Chrome still uses a single file for cookies of different domains. And because you may have elements of different domains rendered at the same time, it needs that access. But that's exactly where either profiles or a stronger separation like containers can enforce more separation.

Yes, it does involve some interaction from the user, but it's not that bad. The UI can help as well. "This looks like a bank website. Did you mean to open it in a Private profile?", "You're trying to access Facebook, would you like to use your Social profile instead?" Realistically, people only need 3-4 of them (social, shopping, secure/banking, work)

We practically solved spam classification already, and that's in a hostile environment. Detecting social sites should be simple in comparison.



Windows: https://www.sandboxie.com/ $35 home | $50/yr commercial

Find Linux options here: http://alternativeto.net/software/sandboxie/?platform=linux - maybe someone can pick up https://github.com/tsgates/mbox

At least 1 VM is strongly recommended but you could containerize within that.

Edit: toolchain for running a browser instance in a chroot jail. With a head. GUI. On the desktop - it's here today (with commercial support), just run Windows+Sandboxie in a VM. Yowch!


Clarity just moved to headless Chrome from PhantomJS today: https://github.com/vmware/clarity/pull/803


> Clarity Design System

> UX guidelines, HTML/CSS framework, and Angular components working together to craft exceptional experiences

Crashes on Firefox, causes the "Script not responding" dialogue to appear. What a truly exceptional experience.


We do have a bug on the website, it doesn't really have to do with the design system itself but with the way the website itself is built. Should be fixed soon. Thanks for the feedback :)


In case anyone checks this out, an update to the Clarity site has already gone out that improves the performance of the website drastically. Let me know if you're still seeing any issues and thanks again for the feedback :)


Clarity moved to chrome webdriver, but they are not using the headless features. It's standard selenium webdriver, and xvfb

https://github.com/vmware/clarity/blob/master/.travis.yml#L4...


hi @arwineap, you are right in saying that Clarity is using chrome webdriver + xvfb. That's for the css regression testing, which we haven't looked into moving to use headless Chrome. We've switched our unit tests to use the headless Chrome: https://github.com/vmware/clarity/blob/master/build/karma.co...


Hrm, but those also run in travis right? I noticed at the bottom of the karma.conf it checks to see if it's in travis.

The travis image is installing chrome stable, which is not 59; AFAIK the headless feature is only available in beta right now which versions out to 59.

I admit I'm not sure if your unit tests are running in travis, but if they are, I'd wonder if the --headless flag is just getting dropped as an unknown flag.

Curious either way actually, so let me know; I'm looking at starting to move our testing from phantomjs to headless, which is why I was sleuthing out Clarity's move.


> AFAIK the headless feature is only available in beta right now which versions out to 59.

Headless is pretty stable on Linux in 58. 59 is really when headless became usable on Mac. Still in beta on Windows.


I contributed a small patch to get this working on Windows. We were a PhantomJS shop, but it was just so unstable, thought we'd give this a shot. Have been running on it for over 2 months now and it's dropped test failures due to intermittent issues to near 0.


Capybara with headless Chrome..

http://blog.faraday.io/headless-chromium/


Related recent news: "PhantomJS: Stepping down as maintainer"

https://news.ycombinator.com/item?id=14105489

Also relevant: "Headless Chrome is Coming Soon" from PhantomJS dev mailing list

https://groups.google.com/forum/#!msg/phantomjs-dev/S-mEBwuS...


No mention of WebDriver, only Chrome's devtools protocol :( There's probably a proxy or something, but would be nice to see completely integrated native support.


That proxy already exists. It is called ChromeDriver[1][2] and was developed in collaboration between the Chromium and Selenium teams. It is the same way that Selenium/Webdriver controls regular Chrome now.

It would have been nice if they at least mentioned it in the article since Selenium is such a popular browser automation tool. They do in fact mention it on the README page for headless chromium.

[1] https://github.com/SeleniumHQ/selenium/wiki/ChromeDriver

[2] https://sites.google.com/a/chromium.org/chromedriver/home

[3] https://chromium.googlesource.com/chromium/src/+/lkgr/headle...


The problem with ChromeDriver is that it implements Selenium's idea of what the remote control interface should look like, and frankly speaking, Selenium's idea has some enormous and idiotic oversights[1].

[1] http://stackoverflow.com/questions/3492541/how-do-i-get-the-...


ChromeDriver also has a particularly annoying race condition when triggering clicks https://bugs.chromium.org/p/chromedriver/issues/detail?id=28 which often only appears on slow machines such as multitenant CI services like CircleCI (https://circleci.com/docs/1.0/chromedriver-moving-elements/)


correct me if I'm wrong, but doesn't webdriver communicate with Chrome via the devtools protocol?


WebDriver is a protocol: https://www.w3.org/TR/webdriver/


yeah yeah yeah I'm sorry I mean Chrome webdriver, I think that was clear from context?

As in selenium -> webdriver protocol -> chrome webdriver -> devtools protocol -> chrome instance


it's called ChromeDriver and I kinda mentioned it originally — "There's probably a proxy or something" — and said that I want native fully integrated support :)


ah now I follow, my bad :)


What are the pros and cons of running headless Chrome compared to running Chrome in a virtual display driver like Xvfb?


I haven't looked at Headless closely enough yet, but the biggest pros I can see are:

- Potentially less overhead (system resources)

- Much simpler setup (compared to something like Xvfb)

- Better support for actual automation tasks, e.g. screenshots, separate sessions, etc.

The last point is especially relevant if you run a tool that is visiting many sites in parallel. If you run multiple tabs per process to keep memory usage and Xvfb instances limited then you won't be able to have separate browsing sessions, e.g. two concurrent navigations to the same origin could interfere with each other (cookies, local storage, etc). Another obstacle I have discovered is that you can only take screenshots for the tab that is currently active. For my site (https://urlscan.io) I work around that by manually activating the tab when the load event fires to take the screenshot. Works reasonably well, but can sometimes fail under load.
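In protocol terms the workaround is basically: navigate, wait for the load event, activate the target, then capture. A rough sketch over the raw protocol (not urlscan.io's actual code; assumes Chrome with --remote-debugging-port=9222 plus the requests and websocket-client packages):

    # Sketch of the activate-then-screenshot workaround over the raw DevTools
    # protocol (error handling and event waiting simplified).
    import base64
    import json

    import requests
    import websocket

    tab = requests.get("http://localhost:9222/json").json()[0]
    ws = websocket.create_connection(tab["webSocketDebuggerUrl"])

    def cmd(i, method, **params):
        ws.send(json.dumps({"id": i, "method": method, "params": params}))
        while True:
            msg = json.loads(ws.recv())
            if msg.get("id") == i:
                return msg.get("result", {})

    cmd(1, "Page.enable")
    cmd(2, "Page.navigate", url="https://example.com")
    # (a real tool would wait for the Page.loadEventFired event here)
    cmd(3, "Target.activateTarget", targetId=tab["id"])  # bring the tab to front
    shot = cmd(4, "Page.captureScreenshot")
    with open("shot.png", "wb") as fh:
        fh.write(base64.b64decode(shot["data"]))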


>If you run multiple tabs per process to keep memory usage and Xvfb instances limited then you won't be able to have separate browsing sessions, e.g. two concurrent navigations to the same origin could interfere with each other (cookies, local storage, etc). Another obstacle I have discovered is that you can only take screenshots for the tab that is currently active.

This is a huge flaw in NightmareJS, which is disappointing because of how beautifully simple its API is. A Nightmare fork rebuilt over headless Chrome would be the best of all worlds for browser automation.


> If you run multiple tabs per process to keep memory usage and Xvfb instances limited then you won't be able to have separate browsing sessions, e.g. two concurrent navigations to the same origin could interfere with each other (cookies, local storage, etc).

Assuming you're running Selenium, this is handled. If you need, you can call each session with its own profile.
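e.g. something like this (a sketch; the profile paths are arbitrary):

    # Give each driver its own profile dir so cookies/local storage don't
    # bleed between concurrent sessions (sketch; paths are arbitrary).
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    def make_driver(profile_dir):
        options = Options()
        options.add_argument("--headless")
        options.add_argument("--user-data-dir={}".format(profile_dir))
        return webdriver.Chrome(chrome_options=options)

    a = make_driver("/tmp/session-a")
    b = make_driver("/tmp/session-b")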


Xvfb is way harder to set up. Also, I assume it uses fewer resources, so it scales better.


I use xvfb-run on Debian and it's a breeze:

    xvfb-run -s '-screen 0 1280x1024x8' ./cucumber
Or are you using more advanced Xvfb features, or perhaps running inside containers?



Yes, I assume that the drawing process may be different. However, it's probably still possible to take a screenshot.

I would like to know what sort of performance differential I can expect.


Screenshot functionality is, in fact, advertised in the linked article.


xvfb can be a bit tough to set up, but Selenium's Docker images make this a breeze, and they even have a debug mode with automatic VNC.


It would be difficult to run this set up on Windows for example.


Will the "--disable-gpu" flag no longer be needed in the future because headless mode will automatically disable the GPU, or because GPU support will be added to headless mode? I really hope it's the latter.



Huh. So, if you can express your PNG / PDF rendering in terms of HTML, SVG, Canvas, and WebGL... How much easier, faster, more reliable would it be to use Headless Chrome and --screenshot, rather than other means?


HTML & CSS are not really suitable for "pixel perfect" text. Even text in SVG works just like text in an HTML document in the sense there is not a way to control precisely where text will wrap. You'd have to disable the browser's word wrapping & come up with your own logic for where to insert <br /> tags, even then you have issues like browsers ignoring CSS rules because of 3rd party browser extensions, missing fonts & such.

Look at this jQuery plugin for evidence - http://simplefocus.com/flowtype/

As you resize your browser, you see the browser re-layout the page & the text jumps around. The text wraps at different positions as you resize your browser around. You'd have to essentially render text to an image server-side, then scale the resulting image in the browser. And if you already have an image, there's no need for tools like wkhtmltoX or phantom for creating an image (you already have an image)... So headless browsers are not suitable for rasterizing documents that contain text [in my experience].

Converting HTML to PNG/PDF is like comparing apples & oranges. The conversion will be imperfect.


> HTML & CSS are not really suitable for "pixel perfect" text.

...and a lot of people don't care.

> You'd have to disable the browser's word wrapping & come up with your own logic for where to insert <br /> tags

...unless you don't care.

> even then you have issues like browsers

"BrowserS"? Plural? No. This is one implementation - headless Chrome. Which you could use to do your own rendering. There's no plural. You would use one browser to do your headless rendering.

> You'd have to essentially render text to an image server-side

That is. What this is.

> then scale the resulting image in the browser

What? It depends entirely on what your use case is. If you want to email a PDF, then you could use headless Chrome to turn HTML into a PDF, and then email it.

> So headless browsers are not suitable for rasterizing documents that contain text

I feel like you're off in this weird space, very different from what I've experienced. And I've experienced it again, and again, at different companies.

> Converting HTML to PNG/PDF is like comparing apples & oranges. The conversion will be imperfect.

...but potentially faster, lower effort, and "good enough" for many use cases.


That's a small trade-off I'm willing to take. It beats having to have a completely separate engine for the PDFs (compared to just using what you're already displaying on the web) and it's way more complete than using something like DOMPDF which doesn't even have support for some of the most basic CSS features.


Tell that to PrinceXML:

https://www.princexml.com/samples/

Print CSS is pretty complete if only the major browsers would support it.


Did someone try to render an SVG using it? I'm using QtWebKit for now, but it's just a huge pain.


Yup. Renders SVGs just as Chrome does. Really straightforward.


Can you provide a minimal example?


If you plan to use it for automated tests, note that headless Chrome is only available on Mac & Linux (in Chrome 59), not on Windows yet.


Windows support is coming. You can follow along at https://bugs.chromium.org/p/chromium/issues/detail?id=686608


I've been using headless chrome over the last couple of months to implement a P2P network between browsers. Worked great!


That sounds interesting. Do you have any more details or a blog post you can share?


https://bitbucket.org/lindenlab/dullahan/src/default/README....

I imagine Headless Chrome will make this obsolete shortly but just in case anyone wants to play with it, here is Dullahan - A headless browser SDK that uses the Chromium Embedded Framework (CEF). It is designed to make it easier to write applications that render modern web content directly to a memory buffer, inject synthesized mouse and keyboard events as well as interact with features like JavaScript or cookies.


Has anyone tried to use headless chrome to detect insecure content on what you hope is a secure page? I've only looked into it a bit, and so far failed. I'd like to know if a visitor to https://example.com/foo.html in chrome will get a warning about insecure content. I'm looking for a browser because in some cases the insecure content is loaded by javascript from a 3rd party. I don't know a way to do that besides fire it up in a browser and let it go.

Is there a way to get the secure/insecure status from headless chrome?


The remote debugging feature can get to that.

See: https://chromedevtools.github.io/debugger-protocol-viewer/to...

Scroll down to InsecureContentStatus
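Concretely, that means enabling the Security domain and watching securityStateChanged events. A sketch over the raw protocol (assumes Chrome with --remote-debugging-port=9222 plus the requests and websocket-client packages; not production code):

    # Sketch: check a page's mixed-content status via the Security domain.
    import json

    import requests
    import websocket

    tab = requests.get("http://localhost:9222/json").json()[0]
    ws = websocket.create_connection(tab["webSocketDebuggerUrl"])

    ws.send(json.dumps({"id": 1, "method": "Security.enable"}))
    ws.send(json.dumps({"id": 2, "method": "Page.navigate",
                        "params": {"url": "https://example.com/foo.html"}}))

    # securityStateChanged fires as the page loads; insecureContentStatus
    # reports whether mixed content was displayed or executed.
    while True:
        msg = json.loads(ws.recv())
        if msg.get("method") == "Security.securityStateChanged":
            print(msg["params"]["securityState"])
            print(msg["params"].get("insecureContentStatus"))
            break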


Was on the frontpage yesterday, with 605 points and 117 comments: https://news.ycombinator.com/item?id=14239194


Do any of the major CI providers make this trivial to use yet? Travis CI, CircleCI, Buildkite, etc.?

Anybody have any positive or negative experiences to report with actually using headless Chrome?


I lead the DevRel team for Web and Chrome so please take anything I say with a hint of bias....

I've found it pretty easy to use so far - https://paul.kinlan.me/chrome-on-home/ and got it deployed in a number of environments including AppEngine. We are planning on getting this deployed in our [WebFundamentals](https://github.com/google/WebFundamentals/blob/master/.travi...) Travis script to run automated testing on our dev docs site.

Some things that I found hard: there's no documentation on how to get it running in a CI yet - some devs (Justin Ribeiro - https://hub.docker.com/r/justinribeiro/chrome-headless/) got a docker file all set up, which helped a lot; and the devtools protocol, whilst it has docs, wasn't quite as simple as I had hoped, and I had to guess at how to call the node library.


If you're using Dotnetcore I've created a protocol generator here

https://github.com/baristalabs/chrome-dev-tools

With a sample on use here

https://github.com/baristalabs/chrome-dev-tools-sample


The upcoming NW.js v0.23.0 beta version will support this usage: https://dl.nwjs.io/live-build/04-29-2017/nw23-3bd4af6-4d7f95...


So much for that:

    [0501/120318.494691:ERROR:gl_implementation.cc(245)] Failed to load /opt/google/chrome/libosmesa.so: /opt/google/chrome/libosmesa.so: cannot open shared object file: No such file or directory


As noted, you'll need the --disable-gpu flag for now. Hopefully that will go away around ~Chrome 60.

See

- https://bugs.chromium.org/p/chromium/issues/detail?id=546953...

- https://bugs.chromium.org/p/chromium/issues/detail?id=695212


It'd be convenient if all the examples had it, for those of us idly pasting stuff in while code compiles :-)


How's that different from running Selenium with Chrome?


Selenium opens a real Chrome window, so it's automated, but not headless. With the headless support in the new Chrome, you'll also be able to run it without a window from Selenium.


Could this be used to create a screenshot service that'll automatically take screenshots of sites? Or is that not permissible under Chrome's license?


The Chromium license shouldn't place any restrictions on using the software like that. Go wild.


This is great progress.

I use electron in node.js (via nightmare) because it can go from headless to visible on the fly

I wonder if the Chrome team is ever going to support something similar


Does anybody know how the screenshot is triggered? Is it on DOMContentLoaded, or some other event? Can we trigger this somehow post page load?


I needed a delay before creating a PDF; maybe this is helpful? https://github.com/tim-field/urlToPdf/blob/master/url-to-pdf... (using it via node)
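The linked script is Node; the same idea sketched in Python over the raw protocol (Chrome 59+ headless with --remote-debugging-port=9222, requests and websocket-client assumed) is just a navigate, a crude delay, then Page.printToPDF:

    # Sketch: navigate, wait for late-rendering content, then print to PDF.
    import base64
    import json
    import time

    import requests
    import websocket

    tab = requests.get("http://localhost:9222/json").json()[0]
    ws = websocket.create_connection(tab["webSocketDebuggerUrl"])

    def cmd(i, method, **params):
        ws.send(json.dumps({"id": i, "method": method, "params": params}))
        while True:
            msg = json.loads(ws.recv())
            if msg.get("id") == i:
                return msg.get("result", {})

    cmd(1, "Page.navigate", url="https://example.com")
    time.sleep(3)  # crude delay for JS-rendered content; an event would be nicer
    pdf = cmd(2, "Page.printToPDF")
    with open("out.pdf", "wb") as fh:
        fh.write(base64.b64decode(pdf["data"]))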


Several years ago I tested the remote debug protocol and it was insanely slow. Is it faster now, like evaluating scripts at the console?


I don't know that we've heard of any performance/latency issues before that concern issuing script evaluation over the protocol. Were you sending hundreds of scripts to evaluate? Or payloads far over 1MB?

Regardless, if you're having problems.. definitely file a bug and get our attention. new.crbug.com or https://github.com/ChromeDevTools/devtools-protocol will do the trick.

[I'm on the DevTools team]


Does it work without X-server?


It's headless so most likely yes.


Who would ever have thought Google would release a headless browser. Stunning.


Why is Google doing this? They're helping everyone else build a spider that could easily serve as the front end to a new search engine. Is this their way of fighting back against the internet closing off? Give everyone a spider so perfect that walling in your content becomes impractical? I like the idea of Google lighting the match that burns down the walled gardens, but if not this, then why?

It could be anything, but I feel like there's definitely another motive to Google unleashing something that could bite their core business.


You're looking for strategy everywhere. People at big companies do a lot of things and not every feature is high-level strategic.

I don't know their motivations, but the first use that comes to mind is to make automatically testing your own web app using Chrome easier. Most teams working on web apps could use better testing, whether they work for Google or not.


Spidering is a tiny part of building a search engine, and this piece is a tiny piece of a spider.

Headless Chrome helps someone compete with Google as much as Basecamp's work on Rails helps someone compete with Basecamp.


lol no. Nobody building a search engine would use this, and very similar alternatives (e.g. PhantomJS) have been available for years.

This is just a useful tool. They probably use it internally for automated testing and research.


If I were trying to control scraping and get a view into it, offering the best scraping tool might be a good start.

I'm pretty sure headless mode exposes something you could fingerprint it with, server side.
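One obvious (and trivially spoofable) tell today is the User-Agent string, which includes "HeadlessChrome", e.g.:

    # Trivial server-side check: headless Chrome currently reports
    # "HeadlessChrome" in its User-Agent string (easily spoofed, of course).
    def looks_headless(user_agent):
        return "HeadlessChrome" in (user_agent or "")

    print(looks_headless(
        "Mozilla/5.0 (X11; Linux x86_64) HeadlessChrome/59.0.3071.86 Safari/537.36"))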


Fantastic news. Is that in any way useful for server-side rendering? I can imagine it working well to pre-render things here and there, though.


Why it needs a whole launcher app instead of taking a script to run on the webpage with a control API baffles me.


If only there was an option to render a PDF and save it to disk in headless mode.

Still a great step.


scroll down a bit and look for `--print-to-pdf`.


Does not work for me.

    $ google-chrome --disable-gpu --headless --print-to-pdf https://www.chromestatus.com/ && ls -la .pdf
    ls: cannot access '.pdf': No such file or directory


Make sure you're using Chrome 59. If you don't have that yet, try Chrome Canary.


The Google Chrome website installs v58 when I press download, and when I try to install Canary:

"Chrome Canary is currently not available on the Linux platform."


If your system is already configured to use Google Chrome's apt repo (e.g. after installing a stable Chrome), you can do:

    apt-get install google-chrome-unstable


Get a build of Chromium that is more recent than this feature


Any ideas on how to print landscape with no margin?


Thanks, I was mostly thinking of the interactive mode (read: Selenium). Currently this seems to be available in the devtools protocol.


Finally. PhantomJS has a competitor now.


[flagged]


I hate to be that person, but why is it so exciting to exchange our privacy and freedom for a shiny piece of software? What's so wrong with Firefox + PhantomJS?


I agree with the sentiment. However, an issue I had using PhantomJS is that it doesn't support modern JS-rendered sites (it doesn't support much ES5 at all, it seems). I needed to create a PDF from such a site and couldn't get PhantomJS to do it. Headless Chrome saved the day.


Now, this is handy!



