How do you guys get R predictive models into production? Last I used Plumber to ...

CapmCrackaWaka · on Dec 15, 2019

It depends on what you mean by 'production'. I've had great success setting up my data collection, engineering and predictions in batch processes. I agree though, I would never try to use R with a REST API, but I don't think it was ever designed for that.

As a general rule of thumb, if something needs real time predictions or I need deep learning libraries, I use Python. R is for anything else.

wjak · on Dec 15, 2019

Exactly, production and deployment processes differ a lot. In enterprise settings it is very rigid: production has no internet connection, and it is best not to install packages there at all (rsuite supports this). But I had a customer who treated dev as prod. :)

wjak · on Dec 15, 2019

We use R for REST using plumber. It is very similar to Flask. What you need to add is a load balancer.
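
A minimal sketch of what the load balancer does, assuming several identical API instances behind it. Round-robin is only one possible policy, and the backend addresses below are hypothetical placeholders, not anything from the setup described above.

```python
from itertools import cycle
from typing import Iterator, List


def round_robin(backends: List[str]) -> Iterator[str]:
    """Yield backend addresses in turn, forever, so requests spread evenly."""
    if not backends:
        raise ValueError("at least one backend is required")
    return cycle(backends)


# Hypothetical addresses of three identical API instances.
picker = round_robin(["http://api-1:8000", "http://api-2:8000", "http://api-3:8000"])

# The fourth request wraps around to the first instance again.
first_four = [next(picker) for _ in range(4)]
```

In practice this job is usually handed to an existing load balancer or to the orchestrator rather than written by hand.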

meztez · on Dec 16, 2019

R is like any other language; we have a few REST APIs in production for live prediction. We use the rocker Docker image with xgboost and plumber, plus data.table for pre-prediction data wrangling. Hosted on GCP Kubernetes, using 0.25 cpu and 250 mem, the API handles around 40 requests per second per pod. There are multiple models, both with more than 1,000 trees.
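
A back-of-envelope sketch of what roughly 40 requests per second per pod implies for sizing. Only the per-pod figure comes from the comment above; the target load and the headroom factor are hypothetical, and the function name is made up for illustration.

```python
import math


def pods_needed(target_rps: float, rps_per_pod: float = 40.0, headroom: float = 0.7) -> int:
    """Pods required to serve target_rps while keeping each pod at or below
    `headroom` (a fraction of its measured capacity)."""
    if target_rps < 0 or rps_per_pod <= 0 or not 0 < headroom <= 1:
        raise ValueError("invalid sizing parameters")
    usable_rps = rps_per_pod * headroom  # capacity we are willing to use per pod
    return max(1, math.ceil(target_rps / usable_rps))


# Hypothetical peak load of 500 requests per second.
estimate = pods_needed(500)
```

The headroom factor is a judgment call: running pods at their measured maximum leaves no slack for traffic spikes or slower requests.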

demirev · on Dec 16, 2019

I can highly recommend RestRserve [0] for bringing R models into production (it forks every request so scaling up is easier than with Plumber). I use it regularly for various projects and I have had minimal issues with it.

[0] https://restrserve.org/

wjak · on Dec 15, 2019

Check this example. It is fairly complex and yet simplified; a real implementation is more automated. https://github.com/WLOGSolutions/RSuite-examples/tree/master...

laichzeit0 · on Dec 15, 2019

Maybe I’m missing it, but does this example work for online predictions? My use case is that I have a trained model, and I want to put a REST API in front of it that clients can call.

wjak · on Dec 17, 2019

Check this example - https://github.com/WLOGSolutions/RSuite-examples/tree/master...

wjak · on Dec 15, 2019

No, it is not an example of a REST API. Sorry, I misunderstood you. I will add an example for plumber with rsuite. Nevertheless, the example presents a workflow where only the scoring step would need to change from batch to online.
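
A minimal sketch of that idea: the same scoring function sits behind both a batch entry point and an online one, so only the wrapper changes. The linear formula is a hypothetical stand-in for a trained model, and all names here are made up for illustration; none of this is taken from the linked example.

```python
from typing import Dict, Iterable, List

Row = Dict[str, float]


def score_row(row: Row) -> float:
    """Hypothetical stand-in for the trained model's predict step."""
    return 0.5 * row["x1"] + 2.0 * row["x2"]


def score_batch(rows: Iterable[Row]) -> List[float]:
    """Batch mode: score every row of a table in one run."""
    return [score_row(row) for row in rows]


def score_online(payload: Row) -> Dict[str, float]:
    """Online mode: score one request body and return a response body."""
    return {"prediction": score_row(payload)}
```

An HTTP framework would call `score_online` from its request handler, while a scheduled job would call `score_batch`; data preparation and the model itself stay shared.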

wjak · on Dec 15, 2019

R is single-threaded. The same is true of Python. We use Kubernetes for scaling. But it is not for all applications of course. R can be put into production. Rsuite is one of the solutions that helps with that.

proverbialbunny · on Dec 15, 2019

ymmv, but many of the libraries R uses run on multiple languages, so you can take the models built in R and run them in another language (usually Java).

Python is single-threaded as well. Like Python, R can be made multithreaded, and like Python, R can be productionized without having to convert it into another language.

One possible implementation is a pool of R workers. Each request calls an R worker. So if your pool is 100 and you get 20 requests from 20 different users at once, all 20 will be run simultaneously. Likewise, many tasks can and should be cached. Consider Memcached or similar.
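
A rough sketch of the pool-plus-cache pattern, with loud assumptions: `r_worker_predict` is a hypothetical stand-in for handing a request to an R worker, a thread pool stands in for the worker pool, and an in-process `lru_cache` stands in for Memcached.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache
from typing import List, Sequence, Tuple

Features = Tuple[float, ...]


def r_worker_predict(features: Features) -> float:
    """Hypothetical stand-in for sending one request to an R worker."""
    if not features:
        raise ValueError("features must not be empty")
    return sum(features) / len(features)


@lru_cache(maxsize=1024)
def cached_predict(features: Features) -> float:
    """Repeat inputs are answered from the cache instead of a worker."""
    return r_worker_predict(features)


def handle_requests(requests: Sequence[Features], pool_size: int = 100) -> List[float]:
    """Serve all requests concurrently, up to pool_size at a time.
    Results come back in the same order as the requests."""
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        return list(pool.map(cached_predict, requests))
```

With a pool of 100 and 20 simultaneous requests, none of them waits for a free worker. A shared cache such as Memcached plays the same role as `lru_cache` here, but across processes and machines.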

kusmi · on Dec 15, 2019

I always used NiFi.