Hacker News
mercurial on Jan 20, 2015 | on: Command-line tools can be faster than your Hadoop ...
You could just use Scrapy [1]. It's easy to set up, and it has plenty of options you can activate if needed. It's likely more robust than shell scripts as well. No Hadoop involved.
1: http://doc.scrapy.org/en/0.24/intro/tutorial.html