If anyone has any ideas as to how best to handle cookie caching, I'm all ears.



You might want to look into using something like twill (instead of requests) and BeautifulSoup (instead of pyquery) -- twill in particular will allow more control over cookies, etc.


Thought about that, but using Requests + PyQuery was what made this project a joy to work with in the first place :)

I'm planning on just writing cookie data to a file and using that if available.
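
For what it's worth, a minimal sketch of that file-based idea, assuming the cookies are flattened to a plain name -> value dict first. The helper names `save_cookies` and `load_cookies` and the JSON format are illustrative choices, not anything from the project itself.

```python
import json
import os


def save_cookies(cookies, path):
    """Write a plain name -> value cookie dict to `path` as JSON."""
    # Write to a temp file first so a crash never leaves a half-written cache.
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as fh:
        json.dump(cookies, fh)
    os.replace(tmp_path, path)


def load_cookies(path):
    """Return the cached cookie dict, or {} if the file is missing or unreadable."""
    try:
        with open(path, encoding="utf-8") as fh:
            data = json.load(fh)
    except (OSError, ValueError):
        # No cache yet, or the file is corrupt: fall back to a fresh login.
        return {}
    return data if isinstance(data, dict) else {}
```

With Requests, the dict could come from `requests.utils.dict_from_cookiejar(session.cookies)` before saving, and go back in with `session.cookies.update(load_cookies(path))` at startup. Note that a flat dict drops each cookie's domain, path, and expiry, so this probably only suits a single-site client.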


I prefer lxml to BeautifulSoup.


I can’t find the link now, but I remember something about BeautifulSoup being deprecated; it didn’t support HTML5 last time I checked. lxml is great. http://code.google.com/p/html5lib/ is also a nice parser for HTML5 documents.

Edit: pyquery wraps the aforementioned lxml. Seems like a good fit for jQuery-style selection…


I tried PyQuery ~1 year ago, and immediately found issues with it (IIRC it was having trouble selecting an element that had two classes, when the selector was only specifying one of those classes). I may have to revisit that if people are recommending it with positive reviews.


Give it a shot again. I've used it with several projects over the last 6 months and have had no issues. One of the projects also involved some fairly heinous malformed HTML, and PyQuery performed well.



