
I've used Selenium a bunch over the last couple of months to automate daily/hourly jobs that pull data from 3rd-party UIs that don't offer an API. I couldn't imagine not having a tool like Selenium at my disposal!
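For reference, one of those jobs looks roughly like this (a rough sketch using the JavaScript selenium-webdriver bindings; the URL and selectors are just placeholders):

    // Drive a real Chrome instance against the third-party UI and dump a table.
    const { Builder, By, until } = require('selenium-webdriver');

    (async function () {
      const driver = await new Builder().forBrowser('chrome').build();
      try {
        await driver.get('https://example.com/reports');            // placeholder URL
        await driver.wait(until.elementLocated(By.css('table.report tr')), 10000);
        const rows = await driver.findElements(By.css('table.report tr'));
        for (const row of rows) {
          console.log(await row.getText());
        }
      } finally {
        await driver.quit();
      }
    })();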



Pro tip: create a Chrome extension with permissions on all http:// and https:// sites, or run it through Node-Webkit/nw.io; then you can use the generic DOMParser and querySelectorAll with a pretty fluent interface on any site you can imagine.

example here: https://github.com/SchizoDuckie/DuckieTV/blob/angular/js/uti...
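If you go the extension route, the bit that unlocks cross-origin requests is the host permissions in the manifest. Roughly something like this (manifest v2 style; the names are placeholders):

    {
      "name": "my-scraper",
      "version": "0.1",
      "manifest_version": 2,
      "permissions": [
        "http://*/*",
        "https://*/*"
      ],
      "background": {
        "scripts": ["background.js"]
      }
    }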


> or run it through Node-Webkit/nw.io

That's quite interesting! I thought node-webkit wasn't suitable (yet) for such a purpose. Could you go into more detail on how to do parsing/automation of external sites with it?


It's very suitable! (I'm using it in DuckieTV in production, works like a charm!)

Basically, you can use XMLHttpRequest to fetch any webpage because of the relaxed cross-origin restrictions, then use DOMParser (a built-in browser component that you can even shim) to create a virtual DOM from the XMLHttpRequest result, and execute regular querySelector and querySelectorAll queries on that :)
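Something like this (untested sketch; the URL and selector are made up):

    // Fetch a page cross-origin (allowed in an extension / NW.js context),
    // parse the HTML into a detached document, then query it as usual.
    var xhr = new XMLHttpRequest();
    xhr.open('GET', 'https://example.com/schedule', true);  // placeholder URL
    xhr.onload = function () {
      var doc = new DOMParser().parseFromString(xhr.responseText, 'text/html');
      var titles = doc.querySelectorAll('.episode-title');  // placeholder selector
      for (var i = 0; i < titles.length; i++) {
        console.log(titles[i].textContent.trim());
      }
    };
    xhr.send();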



