
> R is a highly optimized, well-oiled machine if you're using it for its highly-optimized, well-oiled purposes.

This hits home for me. We are just starting to use R for risk modeling where I work. R, more than any language I've ever used, makes me appreciate "worse is better". From a theoretical "aesthetic" perspective R is a mess. Yet for data processing all those theoretical concerns don't matter. It just works.

It's honestly kind of humbling that something so theoretically messy can be so practically coherent. It makes me question my assumptions about simplicity.
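To make "it just works" concrete, here's the flavor of thing we do constantly. This is only an illustrative sketch (made-up column names, toy data, assumes dplyr is installed and R 4.1+ for the native pipe), not our actual code:

    library(dplyr)

    # toy portfolio: one row per loan (illustrative data only)
    loans <- data.frame(
      segment   = c("retail", "retail", "sme", "sme", "sme"),
      exposure  = c(100, 250, 400, 150, 300),
      defaulted = c(FALSE, TRUE, FALSE, FALSE, TRUE)
    )

    # default rate and total exposure per segment
    loans |>
      group_by(segment) |>
      summarise(
        n            = n(),
        total_expo   = sum(exposure),
        default_rate = mean(defaulted)
      )

None of the theoretical mess surfaces in code like this; it reads like the analysis it describes.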




R "just works" now because a huge amount of effort has gone into improving the language over the last 10 or so years, in part spurred by the tidyverse movement, although not restricted in scope to tidyverse. When I was starting grad school around 2010, if someone sent you some R code, the chances that you would be able to "just run" it were basically zero: there would be weird version mismatches in how functions worked, file paths would be specified in inconsistent ways in different parts of the script, all kinds of crazy impenetrable errors were the norm. Now there are several R code snippets posted in these HN comments that will run without trouble. If I could have gone back in time and told myself that this is how R would develop, I would have been shocked (and happy).


Part of it has to do with CRAN's strict testing as well. Packages have to pass checks and be confirmed to compile. This adds reliability to package management across platforms.
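For anyone who hasn't gone through a submission: you can run essentially the same battery of checks locally before sending a package in. A minimal sketch, assuming devtools is installed and you're in the package's source directory (the tarball name below is just an example):

    # from R: runs R CMD check with the extra checks CRAN applies
    devtools::check(cran = TRUE)

    # or from a shell, against a built source tarball:
    #   R CMD build .
    #   R CMD check --as-cran mypkg_0.1.0.tar.gz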

That said, I still run into trouble with deprecated packages. I was trying to install the optmatch package (deprecated but still used by causal inference packages) and had a really tough time getting it to compile on macOS.
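The usual route for a package that has dropped off CRAN is building it from the CRAN archive, which still leaves you needing a working macOS compiler toolchain. A sketch, assuming the remotes package is installed; the version number is only an example, so check the package's archive listing for the one you actually need:

    # pull a specific (possibly archived) version and build it from source
    remotes::install_version("optmatch", version = "0.9-17")

    # equivalent two-step route without remotes
    url <- "https://cran.r-project.org/src/contrib/Archive/optmatch/optmatch_0.9-17.tar.gz"
    download.file(url, destfile = "optmatch_0.9-17.tar.gz")
    install.packages("optmatch_0.9-17.tar.gz", repos = NULL, type = "source")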


I dunno man, Python has always seemed a little bit worse on this stuff to me. At least with R, if you had a consistent version, everything off CRAN worked together.

I think R 3.0 introduced namespaces, which fixed a lot of the really crazy stuff.

Also, I was writing Sweave in 2010 for my thesis, and I definitely wasn't alone.


Namespaces showed up around 2004, so somewhere around 2.0.0. I don't think they were mandatory until much later.
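Either way, the practical payoff is that two packages can export the same name without breaking each other, and you can say exactly which one you mean at the call site. A small example, assuming dplyr is installed:

    library(dplyr)  # dplyr exports its own filter(), masking stats::filter()

    # the :: operator picks the package explicitly
    stats::filter(1:10, rep(1/3, 3))   # moving-average filter from the stats package
    dplyr::filter(mtcars, cyl == 6)    # row subsetting from dplyr

Package authors declare what they export and import in a NAMESPACE file, which is what keeps one package's internals from leaking into another's.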


I suspect that has as much to do with the maturation of the data science community as it does with the language environment. There were pockets of R users putting a lot of effort into reproducibility well before the era you cite, such as the Bioconductor community.

When I think back to the era you're describing, what I recall is that people slinging around hacky scripts was the norm regardless of their environment. While what I see now still isn't what I'd call software engineering best practice, it's a lot less Wild West.


I hadn't thought about R as a "worse is better" language, but that's a good way to think about it. Makes sense, too, since it came from the place that inspired worse is better.


R comes from New Zealand, no?


R is an implementation of S, which John Chambers and colleagues developed at Bell Labs starting in 1976.

https://en.wikipedia.org/wiki/S_%28programming_language%29


I’m trying to figure out whether you were actually asking a question, or whether it was rhetorical and you were calling your shot.


Is there any reason you chose R over Python? Is it just because that’s the go-to language?


We asked the people who are going to be using it what they'd prefer. Many of them are recent graduates, and they told us they mostly used R during their university courses. It's just a pure familiarity play. The alternative was building a huge system on a mainframe (we're a legacy bank).

Really, there wasn't a lot of thought put into the language choice. We figure that if it ends up being a total failure, we can just pivot.


Bravo! This is exactly right.




