Rsuite – R development and data science platform

amirmasoudabdol · on Dec 14, 2019

I’ve been using R for a while in my current position, alongside some other programming languages, Python and C++. R is by far the hardest to predict and read. RStudio is terrible. It’s a wrapper around a “web app”, and that simply doesn’t work well for something as complicated as an IDE. To give an example, RStudio does only one thing at a time: if you are running code, you cannot open a data frame even to look at it. RStudio doesn’t behave at all like any other IDE that you’ve seen either. Try to increase the font size and the whole IDE scales up!

R by itself is a mess, and I don’t think I have to say much about that. The R community is big, and that’s good and bad. It’s good because amazing people are developing amazing packages for it. It’s bad because there are a lot of bad packages. It’s a lot like the JavaScript community. I have a feeling the community has started to reward “having a package”, and everyone has a package.

Besides the quality of R packages and R being a strange programming language, R gets the job done. However, if your job is anything beyond some statistics and data processing, then good luck. I’m not saying that you cannot achieve what you want to achieve using R, but good luck reading R code. I find R code extremely hard to read, and so far 90% of the code that I’ve encountered has little to no comments.

epistasis · on Dec 15, 2019

> R being a strange programming language

I'm probably an outlier, but I have to say that the language itself is one of my favorite things about R.

Vector based, super powerful indexing of vectors, functional programming basics, lazy parameter evaluation, super convenient parameter matching and defaults, all these things make it super productive for me and let me deal with data far better than other languages. Matlab is similar in its ability to deal with data, but that's a language that feels far clunkier to me. Python has caught up with some of its packages, but it definitely feels bolted on instead of native to the language.
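A small illustration of the features listed above, as a sketch rather than anything from the original comment. The vector `scores` and the helper `largest` are made-up names.

```r
# Hypothetical data: a named numeric vector.
scores <- c(ann = 5, bob = 12, cy = 8)

# Indexing by logical condition, by name, and by negative position.
high     <- scores[scores > 6]       # keeps bob and cy
by_name  <- scores[c("ann", "cy")]   # selects by name
no_first <- scores[-1]               # drops the first element

# Vectorised arithmetic: no explicit loop needed.
scaled <- scores / max(scores)

# Defaults are evaluated lazily, so `n` can refer to another argument.
# `largest` is an illustrative helper, not a base R function.
largest <- function(x, n = length(x), decreasing = TRUE) {
  head(sort(x, decreasing = decreasing), n)
}

largest(scores)                                 # all three, largest first
largest(scores, 2)                              # positional matching
largest(decreasing = FALSE, x = scores, n = 1)  # matching by name, in any order
```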

treypitt · on Dec 15, 2019

I actually love RStudio for what it does. In general I'm a huge CL advocate; RStudio is basically the only time I'd rather use an IDE than the terminal. Looking at data, code, and plots simultaneously is easy - I haven't found anything as elegant for Python, including PyCharm.

Yes, the language itself has problems. It's 1-indexed for God's sake! But if you stick to what it's good at (dplyr, ggplot2), you can get a lot of mileage. What Linus Torvalds said about C++ programmers probably applies doubly to R programmers - so yeah, comments are going to be sparse. And if you venture far from its core competence of data, you're gonna have a bad time. But overall Hadley Wickham and the tidyverse are driving the ecosystem forward, and R has found a great niche between Python scripting and Excel/Matlab.

amirmasoudabdol · on Dec 15, 2019

Now that you mentioned tidyverse, let me discuss the ugly side of R. You are right, the R community and ecosystem are being pushed forward by RStudio, the company and the people behind it. Tidyverse is truly great, and R without it would not be where it is now. However, RStudio is a company and they want to earn money. Sure, they are contributing heavily to open-source R, but that might change, or they might use their influence to steer the development. I’m not saying that they would, but they could. I see more and more R code relying on APIs that RStudio exposes. What if RStudio decides to keep more of its product locked? You can see a part of this already: RStudio Team (the company's product) sells essential security as a feature. What if this extends to R and Tidyverse? Sure, there is a license and whatnot to protect it, but that could change. Looking back at the history, this is how the whole Matlab thing started!

Again, I’m not saying that it will happen, but it could. R's dependence on third-party packages and a company to push it forward is not necessarily a good thing.

P.S. Python IDEs are crappy too!

p10_user · on Dec 16, 2019

Tidyverse is currently GPL3 licensed, so that code will remain open even if they were to change the license for future releases. If they made it onerous and restrictive, then presumably fewer people would use it, to the detriment of the community. People may just move over to another language like Python.

iamcreasy · on Dec 15, 2019

R language has many quirks. Here was an effort to list them: https://www.burns-stat.com/pages/Tutor/R_inferno.pdf

Also, the author of R language said the performance of R is sub-optimal. In his own words: https://www.stat.auckland.ac.nz/~ihaka/downloads/JSM-2010.pd...

Currently I am using PyCharm with the R plugin that JetBrains released very recently. RStudio is very slow and buggy.

epistasis · on Dec 15, 2019

That’s a very colorful document, reminiscent of the excellent Unix Haters Handbook, but the first chapter is standard floating point stuff, common to every language. It doesn’t inspire confidence that the writer bothered to learn anything about the domain before deciding to write and complain.

Chapters two and three are the lesson to not use procedural language fundamentals if you want performance, and to instead use functional equivalents. Not exactly a language quirk.

If somebody has only ever used 2000s-era Java and C# and those types of languages, functional style programming will be a strange beast. But Python has enough functional style things such as list comprehensions, and I hear Java and C++ have gained functional style programming too, so in this day and age I’m not sure that functional style programming should be considered quirky.

The runtime performance is all due to the current implementation, not the language itself. I think JavaScript has far more quirks, but then it was also invented in an insanely short amount of time, so that’s not too surprising.

Cosi1125 · on Dec 15, 2019

> Also, the author of R language said the performance of R is sub-optimal.

That's from 2010, even before JIT compilation in R became a thing. And since then, much has been done in terms of the computational performance of R.

listenallyall · on Dec 15, 2019

Do any people still use Tinn-R? CTRL-F found not a single mention. That's what I learned on and still go to on the occasions I need to script some R.

xvilka · on Dec 15, 2019

> Python has caught up with some of its packages, but it definitely feels bolted on instead of native to the language.

This is what Julia aims to solve.

psv1 · on Dec 14, 2019

> To give an example, RStudio does only one thing at a time: if you are running code, you cannot open a data frame even to look at it.

This isn't specific to R or RStudio. Start running a slow process in your Python IDE of choice, and while it's running try to execute df.head() to view some data frame - you won't be able to see it regardless of the language or IDE (and for a good reason).

amirmasoudabdol · on Dec 14, 2019

I understand that good reason, it’s because scripting languages run on sessions. So, Rstudio couldn’t execute any new command while doing something else. That’s fine. What’s annoying and not ok is the fact that sometimes the entire interface freezes. UI has to be separated from the session and logic of the program. Rstudio doesn’t do this well.

Gatsky · on Dec 15, 2019

Background jobs in Rstudio are coming:

https://blog.rstudio.com/2019/03/14/rstudio-1-2-jobs/

Also not sure you can criticize a whole language because some random code you read doesn’t have comments.

thereby__ · on Dec 16, 2019

They're actually here. They're really great for developing Shiny apps. Set the shiny.autoreload option to TRUE. Run the app in the background and point your viewer to the URL, et voilà.

CreRecombinase · on Dec 14, 2019

Even if your job is beyond statistics and data processing, you're probably using R because the core of what your team is doing is statistics and/or data processing. If that's the case, and if you're the computationally sophisticated member of the team, then shouldn't the onus be on you to understand/adapt to what your less "sophisticated" peers are using?

amirmasoudabdol · on Dec 14, 2019

I’m not sure why you brought this up. I didn’t talk about who should do what and whether people are doing things wrong. I said R is strange and hard to read and Rstudio is bad and should be more capable.

Now, if I want to address your comment. I don’t think “this was the way things have been done here, let’s do it like that from now on” is a good approach. If there is a better way, even if it’s more complicated, at least it has to be tried and tested. If a more “sophisticated” tool proves to be useful, let it be. Not everyone in a team has to use the same set of tools, and if they see the benefit in something more “sophisticated”, they might want to try it, even if it was not explored before. People can learn, there is always a better way/tool and one tool cannot do it all.

ttz · on Dec 15, 2019

You're entitled to your opinion. But perhaps it would be more constructive to email the R dev mailing list and suggest your ideas for improvement.

RA_Fisher · on Dec 14, 2019

There are a lot of really fantastic packages, too. Many are the only implementation of a certain stats tool in the world.

ekianjo · on Dec 15, 2019

> It’s a lot like the JavaScript community.

Erm, not at all. You don't need a hundred packages to run a trivial application. Even base R is reasonably powerful, and it goes to a completely different level once you use tidyverse as a layer to code everything for R.

sxv · on Dec 15, 2019

Felt the same way for years coming from the python world. The R for Data Science book[0] was a game changer in making R enjoyable for me.

[0] https://r4ds.had.co.nz/

mslip · on Dec 14, 2019

If you want to look at data frames as you run sections of your code, you will have to use R Markdown chunks.

laichzeit0 · on Dec 15, 2019

How do you guys get R predictive models into production? Last I used Plumber to put a REST API in front of it then discovered R is a single threaded runtime so effectively you can only go 1 request at a time. I guess the only option is to containerize and run many instances with a load balancer in front? I develop on a Mac so I can’t go the Microsoft R server route and I don’t want to embed myself into some commercial solution, e.g. Rsuite. You can trivially do this with the Python ecosystem.

My feeling is that R is great for anything that doesn’t need to be operationalized into production (monitoring, security, logging, scaling, performance, etc). There are so many good ML/stats libraries in R and most books seem to use R (when written by academics) but it feels like these people have never had to put anything into production.

CapmCrackaWaka · on Dec 15, 2019

It depends on what you mean by 'production'. I've had great success setting up my data collection, engineering and predictions in batch processes. I agree though, I would never try to use R with a REST API, but I don't think it was ever designed for that.

As a general rule of thumb, if something needs real time predictions or I need deep learning libraries, I use Python. R is for anything else.

wjak · on Dec 15, 2019

Exactly, the production and deployment processes are very different. In enterprise it is very rigid, with production that has no internet connection, and it is best if you do not install packages there (supported by Rsuite). But I had a customer who treated dev as prod. :)

wjak · on Dec 15, 2019

We use R for REST using plumber. It is very similar to Flask. What you need is to add a load balancer.
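For readers who have not seen plumber, here is a minimal sketch of what such an endpoint might look like. The model (a linear regression on the built-in `mtcars` data), the `/predict` route, the file name `plumber.R`, and the helper `predict_mpg` are all assumptions for illustration, not taken from any commenter's setup.

```r
# plumber.R (hypothetical file name)
# Toy model fitted once at startup; a real service would load a saved model.
model <- lm(mpg ~ wt, data = mtcars)

# Plain R function holding the scoring logic, so it can be tested without a server.
predict_mpg <- function(wt) {
  newdata <- data.frame(wt = as.numeric(wt))  # query parameters arrive as strings
  unname(predict(model, newdata = newdata))
}

#* Predict miles per gallon from car weight (in 1000 lbs)
#* @param wt Car weight
#* @get /predict
function(wt) {
  list(mpg = predict_mpg(wt))
}

# To serve it (requires the plumber package), from another R session:
#   plumber::plumb("plumber.R")$run(port = 8000)
# Each R process handles one request at a time, which is why several
# instances behind a load balancer are suggested above.
```

Keeping the scoring logic in an ordinary function means it can be unit tested and reused for batch scoring, with the annotated wrapper as the only web-specific part.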

meztez · on Dec 16, 2019

R is like any other language. We have a few REST APIs in production for live prediction. We use the rocker Docker image with xgboost and plumber, plus data.table for pre-prediction data wrangling. Hosted on GCP Kubernetes, using 0.25 CPU and 250 mem, the API is able to do around 40 requests per second per pod. Multiple models; both have more than 1000 trees.

demirev · on Dec 16, 2019

I can highly recommend RestRserve [0] for bringing R models into production (it forks every request so scaling up is easier than with Plumber). I use it regularly for various projects and I have had minimal issues with it.

[0] https://restrserve.org/

wjak · on Dec 15, 2019

Check this example. It is both quite complex and simplified. A real implementation is more automated. https://github.com/WLOGSolutions/RSuite-examples/tree/master...

laichzeit0 · on Dec 15, 2019

Maybe I’m missing it, but does this example work for online predictions? My use case is that I have a trained model, and I want to put a REST API in front of it that clients can call.

wjak · on Dec 17, 2019

Check this example - https://github.com/WLOGSolutions/RSuite-examples/tree/master...

wjak · on Dec 15, 2019

No, it is not an example for a REST API. Sorry, I misunderstood you. I will add an example for plumber with Rsuite. Nevertheless, the example presents a workflow where only the scoring would need to be changed from batch to online.

wjak · on Dec 15, 2019

R is single threaded. The same is true of Python. We use Kubernetes for scaling. But it is not for all applications, of course. R can be put into production. Rsuite is one of the solutions that helps with that.

proverbialbunny · on Dec 15, 2019

YMMV, but many of the libraries R uses run on multiple languages, so you can take the models built in R and run them in another language (usually Java).

Python is single threaded as well. Like Python, R can be made multi threaded, and like Python, R can be productionized without having to convert it into another language.

One possible implementation is a pool of R workers. Each request calls an R worker. So if your pool is 100 and you get 20 requests from 20 different users at once, all 20 will be run simultaneously. Likewise, many tasks can and should be cached. Consider Memcached or similar.
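A rough sketch of the worker-pool idea using the `parallel` package that ships with R. The `score` function, the request shape, and the in-process cache are stand-ins invented for illustration; a real deployment would more likely put separate R processes behind a request router and use a shared cache.

```r
library(parallel)

# Hypothetical scoring function standing in for a real model call.
score <- function(request) {
  Sys.sleep(0.1)   # pretend the model takes a while
  request$x * 2
}

# Simple in-process cache keyed by request input (an environment used as a hash map).
cache <- new.env()

handle_batch <- function(cl, requests) {
  keys <- vapply(requests, function(r) as.character(r$x), character(1))
  is_new <- !vapply(keys, exists, logical(1), envir = cache, inherits = FALSE)
  if (any(is_new)) {
    # Uncached requests are spread across the worker processes.
    fresh <- parLapply(cl, requests[is_new], score)
    new_keys <- keys[is_new]
    for (i in seq_along(fresh)) assign(new_keys[i], fresh[[i]], envir = cache)
  }
  lapply(unname(keys), get, envir = cache)
}

cl <- makeCluster(2)   # pool of two R worker processes
requests <- list(list(x = 1), list(x = 2), list(x = 1))
results <- handle_batch(cl, requests)
stopCluster(cl)
```

A second call with the same inputs would be answered from the cache without touching the workers.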

kusmi · on Dec 15, 2019

I always used NiFi.

glofish · on Dec 14, 2019

R, unfortunately, is also one of the most ill-designed yet popular programming languages in existence. I would strongly recommend people to steer away from it. If you cherish your sanity stay away from using R!

Moreover, after seeing what my colleagues publish as scientific R programs, I came to believe that science itself is bottlenecked by the large-scale adoption of R and the sloppy, inconsistent and bug-infested programming practice that it encourages.

R does a few things well - cross-platform, plotting works on all platforms, packaging works well. But for actually programming it is atrocious.

CreRecombinase · on Dec 14, 2019

What is "actually programming"? Is fighting with your package manager (I'm looking at you, Python) "actually programming"? Is re-implementing functionality that exists elsewhere in the hipster language du jour (e.g. Rust, Julia) "actually programming"? I totally concede that R itself is a fairly unremarkable lispy mostly functional programming language. What makes it stand out is its emphasis on immutable, in-memory, array-based data structures. This means that 1) it's very straightforward to wrap highly performant C/C++/Fortran libraries, 2) despite being a dynamically typed language, it's usually quite straightforward to reason about the type/shape of the inputs and outputs of a function, and 3) individual functions from one package can often be easily combined with functions from another package. I totally get it if any of this isn't your thing, but to write off a whole ecosystem as "ill-designed" (without literally any argument besides "my co-workers, and scientists in general, are stupid") is pretty lazy.

29athrowaway · on Dec 14, 2019

Rust was developed out of necessity by C++ users, it is not a hipster language.

I prefer using Rust rather than paying hundreds of thousands of dollars on a static analysis tool for C++.

curiousgal · on Dec 14, 2019

What are you on about? R has a specific use, which is statistics and data science. For those purposes it reigns supreme. Even for developing dashboards, R-Shiny is a breeze compared to Python's Dash. R is awesome.

Also, to quote a comment[0] of yours from 2 days ago:

What you are saying is that since you prefer something everyone should do the same.

And what you prefer is the correct choice for everyone ...

[0] https://news.ycombinator.com/item?id=21774917

psv1 · on Dec 14, 2019

There are many great things about R that you can point out - Shiny is not one of them.

Fomite · on Dec 14, 2019

In terms of integration of a first-rate statistics language with anything resembling a web interface, not only is Shiny great, but it pretty much reigns supreme.

proverbialbunny · on Dec 15, 2019

What's wrong with Shiny? It's reactive. It's efficient. It's straightforward. It works well. It works with tons of great plotting libraries.

It's not perfect, but I'm struggling at finding bad things to say about it.

glofish · on Dec 15, 2019

You quote a reply to a post that advocated for all education to be segregated. That means making a choice for others.

In this post, I make a recommendation of staying away from R. It is not even a remotely similar context.

Mikeb85 · on Dec 14, 2019

R is a scripting language, most of the underlying infrastructure is written in Fortran, C and C++. R is also designed for stats, not writing software. Of course you're going to have a hard time if you treat it like a real programming language. That's why R provides easy interop with other languages.

But R also makes a lot of the tasks you do in data science far easier than it would be in a 'real language'.

glofish · on Dec 14, 2019

That "easy" has a huge price - the language is chock-full of unexpected behaviors, inconsistencies and gotchas - all in the name of making it "easy".

ekianjo · on Dec 15, 2019

Care to share any language that does things better without a steep learning curve? R is popular for very, very good reasons. You can pick it up and be productive with it in no time, even with the "gotchas"...

proverbialbunny · on Dec 15, 2019

Raku has less of a learning curve than R, does things better, and is quicker and easier to pick up and be productive in, with barely any gotchas if any at all.

Though, to be fair, I like R. It has good plotting libraries, and as much as it gets bashed I like RStudio too. In comparison, Raku's ecosystem is brand new.

epistasis · on Dec 15, 2019

R is a Lisp variant with a ton of syntactic sugar and vectorization of the basic data types. Calling it a scripting language would indicate that you haven't looked at it much as a language.

Sure, most of the matrix operations and tight loops are implemented in FORTRAN or C++, but that's for performance reasons. The same would happen with Python, and I don't think it would be fair to call Python a scripting language either; it also has a ton of Lisp like qualities.

sargram01 · on Dec 14, 2019

I find I spend most of my time debugging R, which is astoundingly difficult since it doesn’t report line numbers on errors. Most of my code is in C++, so that helps, albeit it’s still overly complicated to start up R in gdb. Amazingly Julia isn’t much better when it comes to error reporting either.

kgwgk · on Dec 14, 2019

  # cat /tmp/test.R
  x <- 1:10
  y <- 1:20
  plot(x, y)

  # R --quiet
  > source("/tmp/test.R")
  Error in xy.coords(x, y, xlabel, ylabel, log) :
    'x' and 'y' lengths differ
  > traceback()
  8: stop("'x' and 'y' lengths differ")
  7: xy.coords(x, y, xlabel, ylabel, log)
  6: plot.default(x, y)
  5: plot(x, y) at test.R#3
  4: eval(ei, envir)
  3: eval(ei, envir)
  2: withVisible(eval(ei, envir))
  1: source("/tmp/test.R")

If you mean the lines in the packages you used, I think you'll see the lines if you build them with keep.source=TRUE

CreRecombinase · on Dec 14, 2019

It's so crazy easy to start R in gdb:

  R -d gdb

that's it!

ksevastyanenko · on Dec 14, 2019

You might find this package useful https://github.com/robertzk/bettertrace

Zelazny7 · on Dec 14, 2019

  options(error=recover)

LegitShady · on Dec 15, 2019

All the design brief said was "get started"; anything beyond that is a change request.

Cosi1125 · on Dec 15, 2019

What are you talking about? Of course it was written in lower-level, compiled languages. Is it any different than Python? JavaScript? Perl?

As for the "easy interop with other languages": [1]

So, how is R different from Python?

Also, it's not true that R wasn't designed for writing software. Even a critical "pamphlet" by Pat Burns [2] states otherwise. For writing statistical software, yes.

--

[1] https://wiki.python.org/moin/IntegratingPythonWithOtherLangu...

[2] https://www.burns-stat.com/pages/Present/infernoishR_annotat...

whyhow · on Dec 14, 2019

This is silly. I bet your colleagues are making poor R programs because they are not well versed in programming, not because R is inherently worse than anything else.

My experience is that people who make bad R programs also make bad python programs. I don't think you should blame a tool for issues caused by the programmer.

glofish · on Dec 15, 2019

Try loading up any scientific package and you'll see how it in turn loads up other packages, which load yet more. In bioinformatics you can easily add up to dozens if not a hundred dependencies, each written in R by people with questionable skills.

there is no escape.

When I said "colleagues" I really meant the entire scientific field runs on untold lines of buggy R code, so obtuse, so cryptic, that the task of debugging or even tracing what is going on is practically impossible. And you can't debug it because it is this awful R code everywhere! And when the code breaks it does not break like normal programming languages do, with an error or exception or even a stack dump. No! Most of the time your R code will just start silently doing the wrong thing.

proverbialbunny · on Dec 15, 2019

glofish, do you like functional programming paradigm languages like Lisp and Haskell? It's entirely possible that R is weird and unconventional giving a negative impression, due to it being FPP.

>Try loading up any scientific package and you'll see how it in turn loads up other packages, which load yet more. In bioinformatics you can easily add up to dozens if not a hundred dependencies

That's not necessarily a bad thing unless you're trying to run R on embedded or some other constrained environment.

>When I said "colleagues" I really meant the entire scientific field runs on untold lines of buggy R code, so obtuse, so cryptic

I can't recall the last time I've bumped into a bug in an R library. I'm sure they exist but thankfully the ecosystem is quite stable.

>that the task of debugging or even tracing what is going on is practically impossible.

Debugging in R is easier than most languages. I'm unsure where you're getting your facts from.

>And when the code breaks it does not break like normal programming languages do, with an error or exception or even a stack dump. No! Most of the time your R code will just start silently doing the wrong thing.

It's no worse than Python in this regard. R isn't particularly bad in this area, but it's certainly no C++.

I'm going back to guessing it's because R is FPP. That's R's dirtiest and most offensive part to the uninitiated.

wodenokoto · on Dec 14, 2019

R is a great language with a powerful, but very dated standard library.

But don’t worry about R bringing science down - the scientific community can also write terrible python code.

xvilka · on Dec 15, 2019

This is why static analyzers, various linters, etc should become accessible and enforceable. Python or R, Julia or Fortran. It is time to force scientific developers to live in the same world the rest of development lives.

closed · on Dec 14, 2019

I strongly disagree. And in general, it seems like poor quality code in science is often not because of the language, but because scientists rarely lose their jobs when code breaks.

There are many books that cover how to develop in R in detail, and they are no less thorough than treatments of the subject in other languages (e.g. Hadley's books are as good as any I've read for python).

Many issues around inconsistency, etc, in language design (mostly how base functions / data types behave) have very clean, consistent implementations in libraries like rlang.

The main differences I see when comparing R vs python package code, that affect style are...

1. Most R operations are immutable.

2. R often uses single dispatch, rather than putting methods on a class object.

3. In R, vectorised behavior is often the norm.

4. R functions can choose to use lazy evaluation (it is usually very clear when this happens in e.g. tidyverse packages).

These issues are covered in detail in books like Hadley's Advanced R.
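The four points above can be seen in a few lines of base R. This is an illustrative sketch, not code from the books mentioned; the names `bump_first`, `area`, and `first_arg` are invented for the example.

```r
# 1. Copy-on-modify: changing an argument inside a function leaves the caller's object alone.
bump_first <- function(v) {
  v[1] <- 99
  v
}
x <- c(1, 2, 3)
y <- bump_first(x)   # x is still c(1, 2, 3); y is c(99, 2, 3)

# 2. S3 single dispatch: the generic picks a method from the class of its first argument.
area <- function(shape) UseMethod("area")
area.circle <- function(shape) pi * shape$r^2
area.square <- function(shape) shape$side^2

circle <- structure(list(r = 1), class = "circle")
square <- structure(list(side = 2), class = "square")
areas <- c(area(circle), area(square))

# 3. Vectorised behaviour: arithmetic applies element-wise without a loop.
doubled <- x * 2

# 4. Lazy evaluation: an argument that is never used is never evaluated.
first_arg <- function(a, b) a
lazy_result <- first_arg(1, stop("this is never run"))
```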

glofish · on Dec 15, 2019

Hadley Wickham is a hero of the language who single-handedly tries to wrestle the slippery monster into sanity. I think in the long term it is a losing battle, because in the end the language is still borked.

My hat off to him though!

As for the language ... consider this: lapply(), sapply(), tapply(), vapply() each does something different. The language allows two kinds of assignment operators even: a=1 or a <- 1 that are "almost" identical ... good luck, here is a language there are two ways to even assign a value to a name.

proverbialbunny · on Dec 15, 2019

>The language allows two kinds of assignment operators even: a=1 or a <- 1 that are "almost" identical

The <- assignment is normal for functional programming languages. F#, OCaml, S, and more use this operator. This is because the arrow key used to be a physical key on keyboards back in the 70s when FPP was popular and brand new.

Inside a function call, the = sign binds a named argument, so the name exists only within that call, while <- (the assignment operator) creates the variable in the calling scope.

eg:

    median(x = 1:10)
    x  ## Error: object 'x' not found

    median(y <- 1:10)
    y  ## [1]  1  2  3  4  5  6  7  8  9 10

So therefore,

    x <- 1:10
    median(x)

is equivalent to

    median(x <- 1:10)

It's a convenient feature the language supports. The alternative is how Python does while loops. If anything, R comes out above in this regard.

edit: Python has the := operator which functions the same way <- does in R. I guess Python is catching up on this one.

eg (Python):

    env_base = os.environ.get("PYTHONUSERBASE", None)
    if env_base:
        return env_base

vs

    if env_base := os.environ.get("PYTHONUSERBASE", None):
        return env_base

Cosi1125 · on Dec 15, 2019

Note that `median((x = 10)); x` works fine :-)

ekianjo · on Dec 15, 2019

> here is a language there are two ways to even assign a value to a name.

three ways if you count a <<- 1 for writing in global variables from inside functions (of course, not a recommended practice...)

I don't see what your point is though. So there are several ways to do the same thing in a language, so it's bad? Bad in what way?

As for lapply, sapply, tapply, and mapply, it's very well documented when and where you should use each of them. Sapply applies only on a single vector, and for generalization on larger data structures you use the other "applys". Nothing very hard to comprehend, and this is well explained in the official docs.
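For readers unfamiliar with the apply family, a short base R sketch of how the functions differ. The data is made up; the comments describe behaviour documented in `?lapply`, `?tapply`, and `?mapply`.

```r
x <- list(a = 1:3, b = 4:6)

as_list   <- lapply(x, mean)              # always returns a list
as_vector <- sapply(x, mean)              # simplifies to a named vector when it can
checked   <- vapply(x, mean, numeric(1))  # like sapply, but the result type is declared

# The difference between sapply and vapply shows up on empty input:
empty_s <- sapply(list(), mean)               # an empty list
empty_v <- vapply(list(), mean, numeric(1))   # an empty numeric vector

# tapply: apply a function to groups of one vector defined by another.
sales  <- c(10, 20, 30, 40)
region <- c("north", "south", "north", "south")
by_region <- tapply(sales, region, sum)

# mapply: walk several vectors in parallel.
pair_sums <- mapply(function(a, b) a + b, 1:3, 4:6)
```

Because `vapply` fails loudly when a result has the wrong type or length, it is often the safer choice inside functions, while `sapply` is convenient interactively.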

closed · on Dec 16, 2019

> single-handedly

There are many people in the R community working on this together (e.g. Jenny Bryan, Charlotte, etc).

> lapply(), sapply(), tapply(), vapply() each does something different.

The apply situation has been standardized through the purrr lib and dplyr for a long time. They are base library functions that aren't mandatory.

> two kinds of assignment operators even

Consider the custom of using <-. It reduces the kinds of assignment operators to 1. Similar to avoiding from lib import * in python. You can do it, but there are community standards against it.

ineedasername · on Dec 14, 2019

This comes across as just another comment on why one language is "bad" when, as always, it all comes down to trade offs & preference when choosing a tool for a job.

The thing is, it's easy to write bug-ridden sloppy code in any language. Bemoaning R as a language because of these flaws, occurring due to rapid adoption, ignores the reasons why R has seen wide-scale adoption.

R has had an extreme democratizing effect on access to tools that facilitate data science. Previously, tools for data science came with a prohibitively high price tag.

This means that many non-programmers are coming to R, and I maintain that the problems the parent post sees with R stem from that fact. As a result, any language that achieved that sort of layman (to programming) appeal would have the exact same buggy or sloppy code fallout. You cannot have the momentum that led to such an accessible tool without having the same consequences. Rather than demonize the tool for this, we should recognize the positive dynamic at play and simply help guide users to better practices or improvements that would fix the issues.

clircle · on Dec 14, 2019

I'm on the opposite side of the fence, but I'd love to hear some specifics on how R is ill-designed and encourages buggy programs.

rcthompson · on Dec 14, 2019

I'm a heavy user of R, and I like using it a lot. But the language has lots of traps for beginners: code idioms that look correct but are subtly wrong. For example, if you need to iterate over the indices of a vector X, the obvious thing to do is 1:length(X), which looks fine and works fine until you happen to pass a 0-length vector, and then it explodes. Similarly, the obvious way to select a subset of rows i and a subset of columns j from a matrix is X[i,j]. But that's wrong too, because if either i or j has length 1, you get a vector instead of a matrix. And I don't even remember off the top of my head what happens if either or both of i and j has length 0. The R Inferno[1] is essentially a big collection of cases like this.
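The zero-length case is easy to reproduce. The sketch below uses an invented helper, `count_iterations`, to show how many times a loop body would run; `seq_along()` is the base R function usually recommended instead of `1:length(X)`.

```r
x <- numeric(0)            # an empty vector

naive <- 1:length(x)       # 1:0 counts DOWN, giving c(1, 0)
safe  <- seq_along(x)      # integer(0), so nothing to iterate over

# Illustrative helper: counts how often a loop over `idx` would execute.
count_iterations <- function(idx) {
  n <- 0
  for (i in idx) n <- n + 1
  n
}

naive_runs <- count_iterations(naive)   # the body runs twice, with i = 1 and i = 0
safe_runs  <- count_iterations(safe)    # the body never runs
```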

None of this makes R a bad language, in my opinion. R is far from the only language with surprising edge cases like this. People say that R is designed for statistical analysis more than general programming, but I don't think that's exactly true either. Certainly it excels in writing code for statistical analysis, but I've used R a lot more than that, and I plan to continue. It's a perfectly fine general-use scripting language.

I think the real reason R gets such a bad reputation is that a lot of people writing and publishing R code aren't programmers by trade. And you know what? That's fine. Because I'd much rather work in a community that values and celebrates the publishing of code than one that shames people for releasing their code because it's "not good enough".

[1]: https://www.burns-stat.com/pages/Tutor/R_inferno.pdf

lbeltrame · on Dec 15, 2019

IMO the worst is accessing non-existent items in lists or when using the $ or [[ notation in data.frames: the fact that you get back NULL instead of an error breaks code in unexpected ways, and given that R's debug facilities are basically useless, makes it hard to debug complex code.
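A small sketch of the behaviour described, with made-up names (`cfg`, `df`, and the helper `get_strict`). The helper is only one possible way to turn the silent NULL into an error; it is not a base R function.

```r
cfg <- list(alpha = 0.05)
df  <- data.frame(a = 1:3)

typo_value  <- cfg$alhpa      # misspelled name: NULL, no error
missing_el  <- cfg[["beta"]]  # NULL as well
missing_col <- df$b           # NULL for a column that does not exist

# Illustrative strict accessor: fail loudly instead of returning NULL.
get_strict <- function(x, name) {
  if (!name %in% names(x)) {
    stop("no element named '", name, "'")
  }
  x[[name]]
}

ok     <- get_strict(cfg, "alpha")
caught <- tryCatch(get_strict(cfg, "alhpa"), error = function(e) "error raised")
```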

Zelazny7 · on Dec 14, 2019

When indexing you can always pass `drop=FALSE` to prevent returning a vector. It will always return a matrix or data.frame.
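To make the suggestion concrete, a minimal sketch with a made-up matrix and data frame:

```r
m <- matrix(1:6, nrow = 2)           # 2 rows, 3 columns

row_default <- m[1, ]                # dimensions dropped: a plain vector
row_kept    <- m[1, , drop = FALSE]  # still a matrix, 1 x 3

df <- data.frame(a = 1:3, b = 4:6)

col_default <- df[, "a"]                 # a plain integer vector
col_kept    <- df[, "a", drop = FALSE]   # still a data.frame with one column
```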

zmmmmm · on Dec 15, 2019

Still - that's an excellent example of something that's broken by default and outright dangerous for production use and at the same time very convenient when using interactively. There are probably half a dozen similar other features.

The vast majority of packages are written by somebody taking their interactive session and tidying it up with some functions and tests and then publishing it. But going through and weeding out all these "broken by default for edge case" aspects is a nightmare.

glofish · on Dec 15, 2019

Here are some fun links:

https://www.talyarkoni.org/blog/2012/06/08/r-the-master-trol...

https://www.burns-stat.com/pages/Tutor/R_inferno.pdf

and many others.

I have come to believe that the only people that think R is ok are those that are either:

- beginners that just passed the newbie state, learned a few tricks and feel empowered

- actual experts - that fully understand the minute details of the implementation and data models

I have been using R on and off for a decade. As soon as I stop using it for a few months, getting back is like a tar pit where I am continuously caught off guard by the myriad of ridiculous problems. Paradoxically, as you get better with R your errors start becoming more dangerous: your code starts silently doing the wrong things.

R is unlike any other programming language that I have used before (also on and off), from Perl and C to Python and Java. None of these programming languages has such an incredibly obtuse, illogical and trippy design.

ekianjo · on Dec 15, 2019

> I have come to believe that the only people that think R is ok are those that are either:

You can virtually say the same thing for every programming language that is made to be easy to learn by hiding complexity, like Python or Ruby.

kgwgk · on Dec 14, 2019

> one of the most ill-designed yet popular programming languages in existence

What would the others be? Python is one, I guess.

FridgeSeal · on Dec 15, 2019

Baseline JavaScript probably; that would certainly be my vote.

Also potentially PHP?

kgwgk · on Dec 15, 2019

I had forgotten about PHP!

I have no opinion on JavaScript.

I guess some people may include Perl. Others won't, either because they don't think it's ill-designed or they don't think it's popular :-)

naringas · on Dec 15, 2019

Python is not ill-designed at all

minimaxir · on Dec 14, 2019

Base R is bad; R augmented by other packages (e.g. tidyverse and data.table) is just as performant and easy to use as other data science tools, if not more so.

rcafdm · on Dec 15, 2019

I'm not a language zealot (most languages/frameworks come with pros and cons) and I use R and the tidyverse quite regularly, but "performant" is not a word I'd associate with the tidyverse. I'd be surprised if it wasn't slower than most alternatives (can't claim to have systematically benchmarked it, still....), even if it's often easier to use and usually "good enough."

Fomite · on Dec 14, 2019

Ironically, I find students who rely on the tidyverse to be the most vulnerable to "This isn't working and I'll never be able to figure out why."

Zelazny7 · on Dec 14, 2019

Base R is fine. I would much rather not include the tidyverse and all of its transitive dependencies.

CapmCrackaWaka · on Dec 15, 2019

> But for actually programming it is atrocious.

I really hope you are not using R for anything outside data science, physics, or other analysis. It was developed to do these things, not 'actual programming', which I imagine you define as creating some framework or application.

Most of the people that don't like R seem to want to use it outside of its use cases, and get frustrated when they fail.

ekianjo · on Dec 15, 2019

> R does a few things well - cross-platform, plotting works on all platforms, packaging works well. But for actually programming it is atrocious.

Yet R is a lot more expressive than Python + Pandas for data related applications. It was never made as a universal language to develop any kind of applications, but it's pretty good at what it does with data manipulation.

stewbrew · on Dec 14, 2019

It depends. R excels at backward compatibility and at interactive data analysis, which is what it's made for. But you're right insofar as you probably shouldn't use (much) R code in production.

glofish · on Dec 14, 2019

I agree! That is what R was designed for. Puttering around in the R shell, slicing, dicing data live, doing some interactive plot this, plot that - alas that is not how R is used anymore

LeftHandPath · on Dec 14, 2019

... that's exactly what I use R for. At work, I use it to filter data from the FAA database of registered aircraft. Or to poke around whatever CSV data I need some specific details from that day.

I thought that was what everyone was using it for. What are people using it for?

glofish · on Dec 15, 2019

heh, try installing a single advanced package, you'll see immediately how hundreds of interdependent libraries are also loaded and compiled, each full of bugs and problems

stewbrew · on Dec 15, 2019

"Hundreds" is an exaggeration. You make it sound as if libraries in other languages were bug free. In my experience, most libraries work well and most authors respond rather quickly to requests.

Anyway, I think it's better to make use of a small, commonly used, and well tested library instead of reinventing the wheel again and again. Libraries are not one of the concerns I have with R.

glofish · on Dec 15, 2019

load up a bioinformatics R library in bioconductor, see what happens, a few dozen would be the low estimate. And make no mistake, each does fairly complex tasks.

now what if I told you that the majority (perhaps all) of these libraries you loaded, which are needed to run the complex analyses in life sciences, were developed by people who are oblivious to proper software engineering. These were never meant to be used the way they are used - they expose a myriad of global variable names, methods etc.

You say you are using a small well-tested library with R, sure - but that is not what happens in science and for those that see what is going on, we know we're completely FKDd

The tragedy is that we cannot cure cancer as long as we try to do it with R - and R is not going anywhere ...

stewbrew · on Dec 15, 2019

I get your point but there are few alternatives. Python and Julia aren't there yet. (And if they were, they would end up in the same place.) Anything else lacks the variety of packages and isn't really useful for interactive use. You must not forget, R is a tool for statisticians (and one of their presumably modern incarnations) written mostly by statisticians. Useless, error-prone packages will be weeded out sooner or later. The useful ones will improve as time goes by.

I don't know a single package that exposes global variables. I only know packages that expose certain functions and provide some configuration via options. In R, there are no really private variables though. If you want, you can access anything you want. Since everything is packaged in its own environment, I don't see that as a problem, because the global namespace doesn't get polluted. It won't make people with a background in one of the more modern OOP languages happy but that's probably not that important.
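As a rough illustration of the namespace point, here is a sketch that assumes a stock R installation, where `t.test.default` is an internal, non-exported function of the `stats` package (the variable names are only for illustration):

```r
# Every package keeps its objects in its own namespace environment
ns <- asNamespace("stats")

# The internal function lives inside the stats namespace ...
in_namespace <- exists("t.test.default", envir = ns, inherits = FALSE)

# ... but it is not among the names the package exports ...
is_exported <- "t.test.default" %in% getNamespaceExports("stats")

# ... and nothing with that name is created in the global environment
in_global <- exists("t.test.default", envir = globalenv(), inherits = FALSE)

# Nothing is truly private: ':::' still reaches the internal function
f <- stats:::t.test.default

# Package configuration typically goes through options() rather than globals
digits <- getOption("digits")
```

So "private" in R is a convention enforced by export lists, not by the language, which matches the description above.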

syrahshiraz · on Dec 14, 2019

Disclosure: I work at RStudio

Took a quick look at the docs. If you're looking for dependency management there's renv[1] and you can (obviously) use git for source control. If you actually have enterprise use cases for library curation or air-gapped deployments, you can check out RStudio Package Manager[2]. Among other things, it provides precompiled binaries for packages, which Rsuite doesn't improve on, per docs[3]:

> Now you are ready to install dependencies. Beware that it will take a lot of time because of compilation. You install dependencies with the following command:

[1]: https://github.com/rstudio/renv

[2]: https://rstudio.com/products/package-manager/

[3]: https://rsuite.io/RSuite_Tutorial.php?article=rsuite_binary_...

wjak · on Dec 15, 2019

Rsuite supported binary pkgs about a year before RStudio did. You have not read the docs to the end. Rsuite has been used in enterprise settings. It works great. And it is open-source. Moreover, it brings a proper definition of an R project, which RStudio is still missing.

psv1 · on Dec 14, 2019

After a couple of minutes on their website I still can't figure out what advantage this offers over using RStudio as an IDE and/or running scripts with the default CRAN R installation.

wjak · on Dec 14, 2019

Hi, I am one of the creators. From the GitHub page: R Suite is an R package which, together with the R Suite CLI tool, enables you to design a deployment workflow that fits you and makes R your primary data science platform. It has been developed by WLOG Solutions to make their data science development and deployment process robust.

R Suite gives answers to the following challenges for any R based software and data science solution:

- Isolated and reproducible projects with controlled dependencies and configuration.

- Separation of business, infrastructural and domain logic.

- Package based solution development.

- Management of custom CRAN-alike repositories.

- Automation of deployment package preparation.

- Flawless integration with Docker.

- Development process integrated with version control system (currently git and svn).

- Working in internetless environments.

psv1 · on Dec 14, 2019

My job is pretty much only writing R code and managing R models running in production. I still don't understand what your product offers that I don't already have, or how it achieves what it claims to achieve. Copying and pasting the sales pitch from your website didn't clear things up for me.

ekianjo · on Dec 15, 2019

A more helpful pitch would indeed be to compare RSuite vs RStudio in terms of features and what it does best vs the alternatives. I agree that the explanation about the project is pretty bad and it doesn't make the "here are the kind of problems you have and how we solve them" very tangible.

wjak · on Dec 14, 2019

To make this discussion better, you should tell us more about your development and deployment process. This includes the definition of the project you use.

scottlocklin · on Dec 14, 2019

If you really solved:

>isolated and reproducible projects with controlled dependencies and configuration.

... that would be huge. Sticking it in docker containers is also a decent idea.

Thanks for writing it, and pay no mind to ding dongs on here who can't be bothered to learn the language and its tooling, but sure do have an opinion on the topic.

wjak · on Dec 14, 2019

Check for yourself if our solution works for you. We use it on a daily basis. But reproducibility is not the only thing. The most important was to have a project for R.

wodenokoto · on Dec 14, 2019

Only read the headlines. My understanding is that they have version control of packages (something to the effect of a virtual environment, but maybe with a completely different approach)

wjak · on Dec 14, 2019

It goes deeper. We were lacking a definition of an R project. Pkg management is part of project management.

ksevastyanenko · on Dec 14, 2019

You might find this package useful for the pkg management problem: https://github.com/robertzk/lockbox

williamstein · on Dec 14, 2019

After a few minutes I can't even tell what problems this is supposed to solve or if it is even related to solving an IDE problem... the actual site starts with bullet points that describe the product, but not what problems it solves:

" - Open source with Enteprise [sic] support.

- Designed to separate..."

The very first bullet point has a typo so maybe this isn't very mature yet?

wjak · on Dec 14, 2019

Thanks for finding the typo. It is mature. We have used it to deploy R to big players. And we use it for our consulting services every day.

Check the docs (https://rsuite.io/RSuite_Tutorial.php) and examples (e.g. https://github.com/WLOGSolutions/RSuite-examples)

ngcc_hk · on Dec 14, 2019

Not yet tried but open source ... free ...

arminiusreturns · on Dec 15, 2019

I see a lot of people hating on R or on R-studio. For those people, I'm curious what you would posit as an alternative?

I have liked R because I use it simply, inside emacs org-mode code source blocks which use either R to generate plots or gnuplot. Based on comments, now I am afraid I will reach some ceiling in R. What else is there? Octave? Sage? Julia?

malshe · on Dec 15, 2019

If you read the comments, it’s just one guy giving his opinion on every comment without any supporting evidence.

xvilka · on Dec 15, 2019

Not hating on R, but from what you listed Julia comes the closest and is the best designed of all. Octave is bound to be MATLAB compatible, which prevents language innovation. Sage is bound by being "a middle ground" for all the third-party languages and frameworks it incorporates. The Julia language is cleaner.

anthony_doan · on Dec 15, 2019

I'm going to buck the trend and state that I love using R for modeling and statistics.

The R packages for these domains are among the best I've seen.

As for R in production, I would wrap it using https://www.rplumber.io/.

vhhn · on Dec 14, 2019

Hi Wit, you guys do a great job to make R ready for deployment in production.

What do you think of the new renv package?

wjak · on Dec 14, 2019

Its goal is different.

We started with a reproducible project definition. Then we implemented rsuite to help manage the project. It includes dependency management, which is what renv solves. The biggest difference is that our project consists of possibly many pkgs that are local to it. This allows you to create complex solutions. Moreover, the deployment pkg is a zip file and to use it you only need R. No pkg installation on prod.

xvilka · on Dec 15, 2019

There was a request to add R to the GitHub Semantic library and tool, but a prerequisite of that work is creating [1] a tree-sitter [2] parser. So if anyone is willing to help - welcome.

[1] https://github.com/github/semantic/issues/382#issuecomment-5...

[2] http://tree-sitter.github.io/tree-sitter/

ngcc_hk · on Dec 15, 2019

I heard of R when I was fond of XLispStat. The older language is good, but it is Lisp. Hence, people moved on to the mess. I just use R in a pragmatic manner. It is very hard if you take the language too seriously. Just use it. And, if you can, compare your results a bit with another tool you are familiar with, like old SPSS, as the result is quite programmer dependent.