
At least in my case, although I do like some of the tidyverse (ggplot is tidyverse after all, and probably my favorite plotting system in any language), I find the "tibble" data structure and its disdain for row names really annoying. Many, many packages that I use expect a data frame with row names and likewise output one. So I am constantly converting back and forth between tibbles and data frames.
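
For readers who have not hit this, the back-and-forth described above can be sketched roughly as follows. This is a minimal illustration, not the commenter's actual code: the data frame and the column name `model` are made up, and it assumes the tibble package is installed.

```r
library(tibble)

# A base data frame keyed by row names, the shape many packages expect.
# (Hypothetical example data.)
df <- data.frame(
  mpg = c(21.0, 22.8),
  cyl = c(6, 4),
  row.names = c("Mazda RX4", "Datsun 710")
)

# Data frame -> tibble: move the row names into an ordinary column first,
# since tibbles do not keep row names.
tb <- as_tibble(rownames_to_column(df, var = "model"))

# Tibble -> data frame: turn that column back into row names.
df2 <- column_to_rownames(as.data.frame(tb), var = "model")
```

`as_tibble(df, rownames = "model")` does the first step in one call; either way the names have to be carried in a column for as long as the data is a tibble.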



The claim that ggplot2 is tidyverse can be disputed. It existed before and works fine outside of it.


You are of course free to dispute what is in the tidyverse, but I pretty strongly believe that it is part of the tidyverse.


Seeing as the tidyverse is pretty much your invention, it's well within your purview to define what is in and what is out. I do think, respectfully, that both the OP and the general sentiment of the article have the right of it.

Data.table and data.table-esque notation represent such an improvement over tibble/dplyr that, within my company, we're making a concerted effort to purge all tidyverse packages from general use (except ggplot). When new developers come on, if they are coming from tidyverse, their first task will be something involving pipes and data.table. Tidyverse was fine in school. It doesn't pass muster in production, at least not in our work.

Data.table syntax is simpler, easier to read, easier to teach, and orders of magnitude faster. It plays nicer with other packages than the tidyverse (if it fits into a DF, it almost always fits into a DT, and I've never met a tibble that I didn't wish was a data.table), and since almost all of our datasets are tens to thousands of millions of lines long, the decision was really made for us.


"Tidyverse was fine in school. It doesn't pass muster in production...."

This is a bit disrespectful.

"Data.table syntax is simpler, easier to read, easier to teach..."

This is rather arbitrary, and I don't think it's the majority view of the community, whatever the advantages of data.table.

"...orders of magnitude faster"

This is an exaggeration in most real cases, even according to the benchmarks pointed to by data.table.[1]

[1] https://h2oai.github.io/db-benchmark/


If you find data.table more useful, you should by all means use it.

My greatest regret about coining the word tidyverse is that for some reason people seem to think it’s a monolith. It’s not; you’re totally free to pick and choose whatever parts of it you find useful.

It doesn’t hurt my feelings if packages that I have helped write aren’t the perfect fit for your problems. Use whatever makes you happy :)


Sure, with your amount of data you need data.table. But that is your specific use case. That has nothing to do with dplyr not being production ready, just that it is not the right tool for you. Separate things but somehow programmers love to consider them the same.


Thank you for your reply and all your work.

I probably should have phrased my comment better. My point was that it’s perfectly possible to use ggplot without explicit knowledge of tidy principles or tibbles. This is great! And for example, I’m reading through your ggplot book and it doesn’t make reference to them.[1] In my work, we use data.tables (we believe we need the raw performance) with ggplot for visualization and are not using the rest of the tidyverse, and it works fine.

[1]: In recent versions this seems to be slowly changing.
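
As a rough illustration of that workflow (data.table for the data handling, ggplot2 only for the plot), something like the following should work, because a data.table is also a data.frame. It uses the built-in `mtcars` data, and the names `dt`, `avg`, and `mean_mpg` are made up for the example. It is a sketch, not the commenter's actual setup.

```r
library(data.table)
library(ggplot2)

# Convert the built-in mtcars data frame, keeping row names as a column.
dt <- as.data.table(mtcars, keep.rownames = "model")

# Aggregate with data.table syntax: mean mpg per cylinder count.
avg <- dt[, .(mean_mpg = mean(mpg)), by = cyl]

# Pass the data.table straight to ggplot; no tibble or dplyr involved.
p <- ggplot(avg, aes(x = factor(cyl), y = mean_mpg)) +
  geom_col()
```

Printing `p` draws the bar chart; nothing else from the tidyverse needs to be loaded.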


Pretty much all of the individual components of the tidyverse were created before the tidyverse since it’s only 3 years old.

But there’s no reason to use only the tidyverse. That’s not something I’ve ever recommended and it would be extremely hard. I just object to people claiming that some of the most important parts of the tidyverse aren’t actually parts of it.


ggplot2 is under the tidyverse umbrella, and will be installed/loaded with library(tidyverse).

The original article is more about dplyr though.



