
Wow, this made my day. I've been using FireQuark (a custom firebug that provides CSS selectors - http://www.quarkruby.com/2007/9/5/firequark-quick-html-scree...) and while it was very helpful, SelectorGadget will remove sooo much more of the pain of scraping. Major kudos.

As an aside, I'll mention the nokogiri gem (almost identical to hpricot, but under much more active development). That's what I've been using, and it's excellent. http://github.com/tenderlove/nokogiri/tree/master




Just so you know, there was, and may still be, an off-by-one error in Hpricot's implementation of nth-child that may confuse you when using the selectors generated by SelectorGadget.
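For anyone unsure what an off-by-one in nth-child would look like: CSS counts children from 1, so :nth-child(1) is the first child. Here is a minimal plain-Ruby sketch of the difference. Both helper names are made up for illustration, this is not Hpricot's actual code, and I'm not claiming which direction its bug goes.

```ruby
# CSS :nth-child(n) counts from 1, so the first child is n = 1.
# Both helpers are hypothetical and exist only to illustrate the idea.

# Expected behaviour: convert the 1-based CSS index to a 0-based array index.
def nth_child(children, n)
  return nil if n < 1
  children[n - 1]
end

# One way an off-by-one can arise: the 1-based index is used as if it were 0-based.
def nth_child_off_by_one(children, n)
  return nil if n < 1
  children[n]
end

rows = %w[first second third]
puts nth_child(rows, 2)            # second
puts nth_child_off_by_one(rows, 2) # third
```

If a generated selector seems to match the neighbouring element instead of the one you clicked, this kind of shift is the first thing to check.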



Great point. I haven't looked, but I think nokogiri fixes that bug. I think that was one of tenderlove's motivations for creating nokogiri. He talks about it more on his blog:

http://tenderlovemaking.com/2008/10/30/nokogiri-is-released/

"I just want to tell you that you shouldn't worry about that old legacy code that uses Hpricot. Nokogiri can be used as a drop in replacement! Really! Nokogiri doesn't reproduce the bugs that are in Hpricot, but should work in most cases. Just use "Nokogiri::Hpricot()" to parse your HTML. Of course, I've tried to keep the syntax of Hpricot that I like."



