Looking at the code for the visualization, the author independently took a similar approach (with the same tools) that turned out slightly different, which is what makes things interesting.
It's worth noting that back in August, only the 2014 and 2015 datasets had been released by the NYC TLC. I'm not entirely sure why they decided to release 2009-2012 now.
If you're just looking to play with the data, I recommend using the BigQuery approach as noted in my article, since downloading and processing ~300GB might take a while. However, the shapefile approach used in the original article is the next logical step after that, and one that is put to very good use in the article.