The R standard library is kind of weird and hard to use for "general-purpose" programming. It's very clearly a domain-specific language.
Also, it's unbelievably slow for basic operations like looping, function calls, and variable assignment -- literally orders of magnitude slower than Python (I've tested it). Unlike its fellow C-flavored Lisp, JavaScript, R retains an extreme level of homoiconicity, which apparently makes the language very difficult to compile or optimize. I don't know how e.g. SBCL does it, but over the lifetime of the R language no one has managed to implement a non-trivial subset of R that performs better than GNU R (FastR was never finished, to my knowledge). So you are basically relegated to writing code in C, Fortran, or C++ if you need even decent performance. Otherwise you are stuck using "vectorized" operations like lapply(), which are fine but make for a jarring experience if you're coming from other languages.
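To make the "vectorized" point concrete, here's a toy sketch (timings are illustrative, not a benchmark): the same sum written as an interpreted loop and as a call to a primitive that loops in C.

```r
# Toy comparison: an interpreted loop vs. a vectorized primitive.
x <- runif(1e6)

slow_sum <- function(v) {
  total <- 0
  for (i in seq_along(v)) {   # every iteration pays interpreter overhead
    total <- total + v[i]
  }
  total
}

system.time(slow_sum(x))  # typically orders of magnitude slower...
system.time(sum(x))       # ...than the primitive, which loops in C
```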
So of course it's a "real" language in the sense that Bash is a real language. But it's ultimately and fundamentally a domain-specific scripting language and I don't know if there is a way around that.
I can give you some sense of why one can compile Lisp (as SBCL does) and not R. SBCL and other Lisps these days maintain some kind of conceptual barrier between macroexpansion time and run time, so that you can _stage_ your activity appropriately. Read the code, gather the macro definitions, expand the code (repeat as necessary) until you have some base language without metaprogramming in it.
Then you can optimize and compile. This is particularly true of Scheme, which forgoes much of the dynamic quality of Common Lisp in favor of a much more static view of the world. But it's still basically true in CL.
R doesn't have macros of that kind. It has a sort of weird laziness based on arguments remaining unevaluated until needed. Metaprogramming in R is typically done by intercepting those "quoted" arguments and then evaluating them in modified contexts (this is exposed to the user more or less by letting them insert scopes into the "stack").
Thus, there is no distinction between macroexpansion and execution time. Hence, it's tough to write a compiler, which is basically just a program that identifies invariants ahead of time and pre-computes them (e.g., lexical scope is so good for compilers because you can pre-compute variable access). Because all sorts of shenanigans can be got up to by walking quoted expressions and evaluating them in modified contexts, R is hard to compile.
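A minimal sketch of that pattern, for the curious: the argument is captured unevaluated, and the scope it's evaluated in is invented at run time, so nothing about the lookup can be pre-computed.

```r
f <- function(expr) {
  q <- substitute(expr)           # capture the argument, unevaluated
  env <- list2env(list(x = 10))   # invent a scope at run time
  eval(q, env)                    # evaluate the expression in it
}

x <- 1
f(x + 1)  # 11, not 2: which `x` gets used couldn't be known ahead of time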
This is, by the way, why the `with` keyword was removed from JavaScript. It provided exactly the ability to insert a scope into the stack used to look up variable bindings.
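Fittingly, base R ships exactly this ability as an ordinary function; a one-liner shows the scope insertion described above:

```r
x <- 1
with(list(x = 100), x + 1)  # 101: the list is inserted as a scope,
                            # shadowing the outer binding of x
```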
I didn't think the with statement was removed, as that would break backwards compatibility on the web with any older code that used it. Instead, it's just not allowed under strict mode, but it is still part of the ECMAScript specification. I just tried it in the Chrome console, and it still works.
> This is particularly true of Scheme which foregoes much of the dynamic quality of Common Lisp
The idea of even standard Common Lisp is that both are possible: a static Common Lisp and a dynamic Common Lisp, even within the same application, in different sections of the program. Common Lisp allows hints to the compiler to remove various features (like fully generic code being reduced to type-specific code), and it allows a compiler to do various optimizations (inlining, getting rid of late binding, ...) -- while at the same time other parts of the code might be interpreted at the s-expression level.
It's definitely the idea. I know it works well enough for a lot of people, but for me, I'm happy to just give up on the dynamic behavior in favor of a simpler universe.
The last thing I want to be thinking about is whether my compiler needs a "hint" that something can be stack allocated, for instance. That is their business, not mine.
Common Lisp as a language standard makes no commitment to the smartness of a compiler. If a certain compiler can figure out things, great, but some compilers might be dumb by design (for example to be fast for interactive use). That's left to implementations.
> it's ultimately and fundamentally a domain-specific scripting language
A domain-specific rather than general-purpose programming language? Most definitely. A scripting language? I would dispute that. Scripting languages are for automating the execution of sequences of tasks that you'd otherwise run independently.
Well, it may not be keeping up with the new versions but at least it does cover a non-trivial subset of the standard implementation. :-)
I have not tried it in many years and I don't know how much faster or how compatible it is. Nevertheless, I think it's a good thing that an "outsider" is looking at performance issues.
As far as I know it's essentially a one-man effort. One could imagine that RStudio or Microsoft would be able to improve R performance substantially if they wanted to.
I’d say one of the major problems I have with R is when people try to use it as a general purpose language. Just because you can doesn’t mean you should. It just wasn’t designed that way. I’ve seen a lot of R code that would have been trivial to write and execute significantly faster in another language.
I don't think there's any shame in R effectively being a good, productive DSL for a lot of stats and viz work, built around fast stuff written in C++. I also think this is largely true of a lot of Python as well, tbh. In the future my hope is that projects like Arrow end up with even more of the workload being taken off R's shoulders.
I'm a data scientist and, frankly, I really strongly prefer R over Python. It's mostly the whitespace in Python, but the language is also slow and crufty as heck. R, if you use it responsibly, feels a lot more like your standard, lexically scoped, dynamically typed language.
Absolutely. I usually respond to the "R-is-not-a-real-language" line with something like "R is syntactic sugar on a Lisp where the atoms are APL arrays". It's plenty interesting from a computer science perspective, if you bother to look.
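You can see the Lisp through the sugar in a couple of lines: every expression is a form with the operator at its head, and operators are just functions.

```r
q <- quote(x + y * 2)
q[[1]]                       # `+` -- the head of the form
q[[3]]                       # y * 2 -- a nested call, i.e. a subtree
`+`(1, 2)                    # 3: infix operators are ordinary functions
eval(q, list(x = 1, y = 2))  # 5: evaluate the form in an ad hoc scope
```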
For instance, why does R use <- instead of = for assignment? Because the initial versions (of S) predate the dominance of C -- developed down the hall -- the language that popularized = for assignment.
The "not a real language" critique often comes from those without any knowledge of Scheme; "real" languages are like C# or Java, and maybe Python, even though the lisp-like properties of Python are not so much appreciated. As C# and Java have gotten more functional aspects over the years, I've seen less and less of the "not a real language" criticism.
There's really not much Lisp/Scheme left in R these days.
Function arguments/environments are still pairlists: a type of cons cell.
You can manipulate the parameters to a function as symbols before you evaluate them. This is useful -- many of the best R libraries make heavy use of it -- and certainly very interesting. Lispers probably know it as quote/antiquote (a short sketch follows this list).
I suppose we could also point out that, much like most Lisps, R gives you plenty of object systems to choose from...
And that's about it for R-as-a-Lisp.
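A short sketch of both points, for anyone who hasn't seen them: formals really are pairlists, and bquote() is a quasiquote with .() as the antiquote (unquote) marker.

```r
f <- function(a, b = 2) a + b
is.pairlist(formals(f))  # TRUE: argument lists are chains of cons cells

v <- 10
q <- bquote(x + .(v))    # quasiquote; .(v) is antiquoted, yielding x + 10
eval(q, list(x = 1))     # 11
```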
It's also a really bad language for working with tree-structured data, something Lisps normally excel at.
I agree, which is why I generally get confused when I see that criticism. This is only an assumption, though; maybe it's just a vocal minority that doesn't like R, or I only notice the negative comments. Who knows.
For what it’s worth, I am a JavaScript programmer and I feel like HN shits on JS All. The. Time. Some people just like to shit on languages.
I think it’s just insecurity. They worry a lot about whether they’re a good enough developer, so they try hard to be on the “most advanced” platform they can figure out, and then need to talk down to the other platforms.
Secure, talented, older developers, in my experience, tend to be open to whatever technology is most appropriate. They have confidence in their ability to learn new things. And they see the good traits of any tech, as well as the bad ones -- a necessary skill if you're going to be the kind of developer who can make traction in any situation.
The platform fetishists can only really work from inside their special tower.
The last time I tried putting a REST API in front of my predictive model, I used plumber. I also then learned that the R runtime is single-threaded, so only one request could be processed at a time. I don't know how you can overcome this, and it makes it significantly harder to put things into a real production environment.
I'm sure someone will chime in that there are different runtimes, like Microsoft's (does this run on macOS? If not, how can I use it to develop and test on my laptop?) or some expensive RStudio solution.
The fact is you have none of those problems in, e.g., Python. Although I would prefer to use caret for many predictive modeling tasks, it is non-trivial to take the R runtime to production. That was what I walked away with.
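For context, the sort of thing being described looks roughly like this (the model file and endpoint here are hypothetical); the single R process behind it serves one request at a time:

```r
# plumber.R -- a minimal sketch; model.rds and /predict are made up.
model <- readRDS("model.rds")

#* Predict from a single numeric feature
#* @param x a numeric value
#* @get /predict
function(x) {
  predict(model, newdata = data.frame(x = as.numeric(x)))
}

# Launch with: plumber::plumb("plumber.R")$run(port = 8000)
```

The usual workaround is running several R processes behind a load balancer, which is exactly the kind of operational overhead being complained about here.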
- is based on Scheme, an HN favourite;
- integrates very well with C++ through Rcpp, allowing you to do whatever you want. 9/10 times someone already went through the trouble for you (a minimal sketch below).
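As a taste of how little ceremony that takes, here's a minimal inline example (the function itself is made up for illustration):

```r
library(Rcpp)

# Compile a C++ function inline and call it like any other R function.
cppFunction('
double sum_sq(NumericVector v) {
  double total = 0;
  for (int i = 0; i < v.size(); ++i) total += v[i] * v[i];
  return total;
}')

sum_sq(c(1, 2, 3))  # 14
```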