New Machine Learning Gems for Ruby (ankane.org)
176 points by thunderbong on June 19, 2021 | hide | past | favorite | 51 comments



One of my greatest disappointments in life is that programmers don't give a shit about the actual experience of using a particular language and its ecosystem, and instead go full sheep mode and follow the crowd for whatever reasons they think are valid.

I don't hate python but its existence is the main brake slowing ruby down.

I enjoy programming in Ruby. As the years roll by and I get more grey hairs the enjoyment I get from other languages descends to just toiling away and wasting one's life.

Thank fuck for Ruby


I enjoy several languages but Ruby definitely ruined python for me. I used to think python seemed elegant, but every time I've had to work with it lately I just wish it was Ruby.


A lot of people care and Rails isn't going anywhere. But we need to be realistic about winning the data science battle... that ship has sailed.


> I don't hate python but its existence is the main brake slowing ruby down.

Flipside, I love Python and hate Ruby. It's easy to use Python idiomatically, and this design deters bad behavior from "clever geniuses". Ruby code I've had to maintain is full of mixins, dynamic dispatch, and magical behavior from a distance. This is why I love Rust so much despite the fact that it feels like Ruby at a superficial level - it doesn't want you to be clever.

But I feel the same way you do about PHP.

> "its existence is the main brake slowing [...] down."

Maybe Ruby could focus on winning over the PHP community instead of trying for the data science community.


I've had exactly the same experience.

I had to do some hacking on discourse (the OSS forum package), and the amount of magic and difficult-to-follow logic is super high.

Basically, ruby is the new perl, as far as I can tell.


>ruby is the new perl

What's new about ruby? It's been around since 1995, and perl was an explicit influence on its design.


... and Clojure. These two stand out as the most elegantly designed creativity-enhancing programming languages. The rest are mainly a set of constraints. Python is the VHS of programming languages with its pathetic lambdas.


This may surprise you, but using lambda is considered a sign of bad programming practice in Python.

In general, this whole thread completely misses the point Python makes. Do not invite “clever” language (ab)use. Do the straightforward thing.


> This may surprise you, but using lambda is considered a sign of bad programming practice in Python

Only by its BDFL. Most languages consider lambdas to be first class so why not let the user decide?


Because the user makes poor choices, that’s the point - don’t do the clever thing, do the straightforward thing. Lambdas are discouraged in Python, and that’s not just something Guido van Rossum thinks (who, by the way, is not BDFL - there is none anymore and there hasn’t been for a long time now.)


If lambdas are clever then so is most Node.js code. Guido may have stepped down but he was responsible for this "Pythonic" attitude to lambdas. It's like the Python community is parodying Matz with "Guido hates FP therefore we hate FP".


Lambda syntax is indeed horrible in Python. I hope some day they do something to fix it… Otherwise I really enjoy the language and the ecosystem.


What do you think about Elixir in this context? Is it not "creativity-enhancing" enough?


Elixir doesn't have an array/vector implementation. Instead it has to borrow Erlang's ugly offering or hack Map with numeric keys as in PHP.


There is some exciting development on that front by building on XLA.

https://github.com/elixir-nx/nx/tree/main/nx


So I can get a cutting-edge PyTorch/TF multi-dimensional array with Nx but Elixir itself still has no basic built-in vector? I don't get it.


People don’t go full sheep mode. They just invest in an ecosystem, and don’t want to waste time looking for a new hammer.

Part of what has kept Ruby lagging is also cultural. It’s not a language that was ever adopted by engineers but rather artists, and it took a while for these artist types to realize performance actually matters.

Python, on the other hand, always had an avid base of CS oriented folk that helped drive the language forward technically. Ruby instead only had Rails, which was a productivity marvel rather than a technical one.


> Part of what has kept Ruby lagging is also cultural. It’s not a language that was ever adopted by engineers but rather artists, and it took a while for these artist types to realize performance actually matters.

… are none of the engineers that work at GitHub (Microsoft) CS oriented folks?

> Ruby instead only had Rails, which was a productivity marvel rather than a technical one.

… are productivity marvels and technical marvels mutually exclusive?


Not necessarily, but Rails is much more a productivity marvel than a technical marvel tbh. This is as it should be for a productivity oriented tool, not surprising users with technically advanced stuff is a feature in that case.

And sure, some of the engineers at Github/Shopify/etc are super qualified but from my personal experience in the Ruby ecosystem (5+ years and counting) there are WAY more autodidact freelancers who can throw up new websites in minutes but still struggle with computing the algorithmic complexity of linked lists than there are CS graduates who have a solid grounding in data structures and algorithms. In such an environment, it is not surprising that many of the available libraries are very business-oriented but not very optimized.


The CS/tech industry is full of toxic elitists, who get off on seemingly complex or over-architected things.

Ruby and its ecosystem make things look easy. So easy that anybody can create software.

The elitists fail to see that creating such an ecosystem is where true skill and thinking are required. Funny part is most of them ARE sheep, and make so many basic and architectural mistakes. They simply like to feel superior to others.


But Ruby is just as fast as Python in most of the comparisons I’ve seen.

It is true IME that usability and compatibility, not performance, are Matz’s first concern.


I love Ruby more than Python. But I’ve got to admit it can be much more confusing and the learning curve is steep. I mean, getting to the point where you love and appreciate Ruby is a lot of work.


Really, compared to Python? I'm genuinely curious why you think so, it's the exact opposite for me.

Can't tell you the number of times I've been able to just put pseudocode down in Ruby and have it work somehow. The language is beautiful.


Maybe I’m wrong but things like 3.times { ... } are more prevalent in ruby than in other languages. There’s the optional return statement (do I put it or not?), strings vs symbols, hash conventions ({ hello: :world } vs { "hello" => "world" }, people still use both), optional parentheses in method calls, etc. I think ruby gives more ways to do the same thing, but that’s confusing for a newcomer. That being said, it’s my favorite language.


That's true, but I think for a beginner having more ways to do things is also nice, since you have the freedom of doing any one thing in multiple ways. Once you're more comfortable with the language, you don't even really think about it, you just pick what you prefer and what makes sense in the given context.

I'll address the points you made since I love talking about Ruby lol, hope you don't mind!

I'm not sure what you mean with the 3.times {} point. I personally prefer the 3.times do ... end syntax, because it's basically English, it makes it very easy to comprehend what's happening as a beginner. You're doing a thing 3 times, it might as well be pseudocode. IMO it's much more elegant than the whole for (i = 0; i < 3; i++) syntax that's more common in other languages.

The optional return thing I'd agree with, it can lead to some confusing situations. Even after years of Rubying, I sometimes get caught by an unexpected return, especially if I'm working in other langs for a bit. Though if you just disable RuboCop's warnings, you can put a return in, no problem! I even encourage beginners to explicitly put in returns, so that the methods they write do what they intend them to do. Didn't intend to return that specific thing there? Now it's clear!

What specifically about strings and symbols? The only caveat that I can think of is that symbols are immutable whereas strings are not, but I don't know how often beginners come across situations where they're confronted with symbols outside of a Hash anyway.

The hash conventions are simply a matter of legacy code. I forget when the rocket (=>) syntax was 'replaced', for lack of a better word, but that's just because before a certain version of ruby you couldn't do the foo: :bar syntax. For legacy purposes, the rocket one remains, but in my experience most people writing modern ruby stick with the foo: :bar convention.

I'd also agree with the optional parentheses bit, though again for beginners I just tell them to always put parens since it makes it easier to reason about methods.
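To make the points above concrete, here is a minimal sketch of the equivalent spellings being discussed. The method and variable names are made up for illustration, and the printed values assume a reasonably modern Ruby (the foo: :bar shorthand has been available since Ruby 1.9).

```ruby
# Two block syntaxes for the same loop: braces and do ... end.
count = 0
3.times { count += 1 }
3.times do
  count += 1
end

# Implicit vs. explicit return: both methods return the same value.
def implicit_double(n)
  n * 2
end

def explicit_double(n)
  return n * 2
end

# Shorthand and rocket syntax build the same hash when the keys are symbols,
# but a string key is a different key from a symbol key.
shorthand_hash = { hello: :world }
rocket_hash    = { :hello => :world }
string_hash    = { "hello" => "world" }

# Parentheses around method arguments are optional.
puts count                                        # prints 6
puts(implicit_double(21) == explicit_double(21))  # prints true
puts shorthand_hash == rocket_hash                # prints true
puts shorthand_hash.key?("hello")                 # prints false
puts string_hash.key?("hello")                    # prints true
puts :hello.frozen?                               # prints true, symbols are immutable
```

None of these choices changes behavior except the string-vs-symbol key distinction, which is probably the one most worth pointing out to a newcomer.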


There’s a chance component to why Python is so popular in data analysis. But I’ve seen many non-programmers get into computational science. Like biologists doing bioinformatics. And my gut feel is it’d be harder to get them into Ruby compared to Python. I have experience explaining tidyverse vs vanilla in R. Tidyverse libraries have a lot of syntactic sugar as well. But it’s challenging to get newcomers to appreciate it, because you have to always go an extra mile to explain what it means. Whereas vanilla (and Python) is closer to what they studied in CS101. So the whole machine learning for ruby is going to be very niche.


"Matz is nice and so we are nice"


Instead of being disappointed you could try and analyze why Ruby is lagging so far behind Python. Perhaps it's Ruby itself that is the greatest brake and not Python?

After all, people did flock to Python for integration with scientific, math and machine learning libraries, not to Ruby.

There are some musings, for example, here: https://twitter.com/zverok/status/1365014133180141578?s=20


Hm, it seems that the Twitter thread mostly focuses on the legacy in terms of language design and then blames much of the demise on there never having been a real killer application that showcased it (as the author pointed out, and I agree, Rails worked because of Ruby but didn't showcase it well). I didn't see anything about actually faulty language design.


I never said the fault was with the language design


A lot of things in life are coincidental and I think python adoption is one of them. It's not a bad language but neither is Ruby.


Awesome work. I tried something similar a while back, but I gave up once I realized the memory requirements for Ruby numeric primitives make it near impossible to massage large data sets in Ruby.

Pandas and numpy skirt this with custom numeric types. I've seen some efforts to build pandas clones in Ruby, but none have come close to the performance needed to handle a few gigs of data.
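As a rough illustration of the general idea of fixed-width numeric storage (not how NumPy, pandas, or any Ruby gem actually implements it), core Ruby can pack an Array of Floats into a binary String with Array#pack. The variable names below are made up for the sketch.

```ruby
# A plain Ruby Array of Floats.
values = Array.new(1_000) { |i| i * 0.5 }

# "e" packs each element as a 4-byte little-endian single-precision float.
packed32 = values.pack("e*")
# "E" packs each element as an 8-byte little-endian double-precision float.
packed64 = values.pack("E*")

puts packed32.bytesize  # prints 4000
puts packed64.bytesize  # prints 8000

# Doubles round-trip exactly, because Ruby Floats are double precision.
restored = packed64.unpack("E*")
puts restored == values # prints true
```

A packed String only covers storage; the vectorized arithmetic on top of it is the part a numeric library has to supply.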


I think they were adding something like python’s buffer protocol to ruby3, which should pave the way for something like numpy if there is enough demand. I’m referencing https://bugs.ruby-lang.org/issues/14722


Numo::NArray has proven useful to me for working with audio in Ruby, though I haven't tested it with GBs of data in a single array.


There is also Dask in the Python world for when Pandas can’t handle the data size.


I see the discussion turns out to be general about programming languages. Here is my take:

I am doing Ops (aka DevOps, aka system administration) and I was using bash and Python. I never clicked with Ruby. I was suffering. Constant thought "why is it so shitty?"

Outdated bash doesn't meet any modern expectation from a programming language.

Python, like other general-purpose languages, was not created for Ops specifically. Running an external process is a pain, data manipulation is done with list comprehensions instead of the more straightforward map() and filter() (or you write non-idiomatic Python), and quite a few other features I would expect when writing small scripts are missing.

I think Ops people deserve better. Result - my own programming language - https://github.com/ngs-lang/ngs . Is it a "better" language? Probably not. I do think it sucks less than others for Ops though.


It sounds like we do very similar jobs and feel the same way about Python. I'm 97% sure you're going to stick with your own language, but for others seriously give Ruby a look.

I absolutely adore Ruby as a scripting language. It's so easy and elegant to shell out to bash and do something with the output. When brevity or familiarity with bash is desired, variables like $? are there also (although there are generally less esoteric ways to do things). Running a simple shell command in Ruby is as easy as using backticks and having the output from the command returned to you as a String (and I always add a .chomp to strip the trailing \n).[1] Example: datestr = `date '+%Y-%m-%d-%H-%M-%S'`.chomp
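A minimal sketch of the above, assuming a POSIX-like system where echo is available (variable names are illustrative). It also shows Open3 from the standard library as one example of a less esoteric alternative to $?:

```ruby
require "open3"

# Backticks run the command in a subshell and return its stdout as a String.
greeting = `echo hello`.chomp  # .chomp strips the trailing newline

# $? holds the Process::Status of the most recently finished child process.
status = $?

puts greeting         # prints hello
puts status.success?  # prints true

# Open3.capture3 returns stdout, stderr, and the exit status explicitly,
# so there is no need to consult the $? global.
stdout, stderr, wait_status = Open3.capture3("echo", "world")

puts stdout.chomp            # prints world
puts stderr.empty?           # prints true
puts wait_status.exitstatus  # prints 0
```

Passing the command and its arguments as separate strings to capture3 also avoids shell interpolation, which is handy when arguments come from user input.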

Worth noting too is that Ruby also has a fantastic one-liner command line syntax like perl and awk for example. Check out https://robm.me.uk/2013/11/ruby-enp/ and https://benoithamelin.tumblr.com/ruby1line/

Also the community is amazing. There are gems for almost everything, and really neat/fun projects like Ruby Warrior: https://github.com/ryanb/ruby-warrior

I'll stop now, but suffice it to say Ruby is one of the best tools in my belt, and I haven't even mentioned the number of times I've thrown down a simple single-file HTTP service based on Sinatra in record time ;-)

[1] https://stackoverflow.com/a/2400/2062384


> I'm 97% sure you're going to stick with your own language

Agree with the estimation :)

> but for others seriously give Ruby a look

Of course. Python is not out of the question either. I think it's mostly about alignment of how you think and what is possible to express in the language in a straightforward manner.

> It's so easy and elegant to shell out

That's the area where I think NGS has better facility than pretty much anything else I've seen out there. In NGS, it's the "domain". Off the top of my head:

  * Short syntax for run-the-command-and-parse-output (currently auto detects JSON but customizable) - like backticks in bash but double-backtick on each side.
  * backticks syntax like in bash (but don't strip the last newline)
  * Argv facility for constructing command line arguments (if the command is complex)
  * Automatic handling of exit codes. Errors throw exceptions and no, not any non-zero is an error.
  * ok: option to specify non-error exit codes for particular run of a program
  * log: option to log the command being run
More at https://github.com/ngs-lang/ngs/wiki/Use-Cases#use-case-ops-...

> Worth noting too is that Ruby also has a fantastic one-liner command line syntax like perl and awk for example.

NGS is somewhat there and somewhat getting there


> Published January 22, 2020

Has any additional development happened since then? I was interested to see this, but less so if it hasn't got any traction over the past year and a half


Andrew Kane seems to be one of the most productive Ruby developers out there - he has a ton of open-source gems that solve real-world problems: https://github.com/ankane

He's done a lot of the hard-work for creating the tooling in the first place - the community can step in from here.


Indeed he is. Blazer is my favourite reporting/dashboards tool by quite a distance.


Blazer, groupdate, lockbox, the list is endless. Grateful for folks like him in the community.


But will the community step in? I doubt it since Ruby gem output has dropped dramatically in the last couple of years. See modulecounts.com where Python output is about 10 times that of Ruby. That's the difference with the exponential growth of Python where you have the Python Foundation backing everything.


It's definitely a good point, I myself have noticed production software in Ruby losing favour over the past 5 years. I'm not sure of the reasons for it as I no longer build greenfield apps for companies so don't have those direct conversations, but I still use Rails in my own personal projects and for our current SaaS.


After the Ruby devops wave lost its momentum to Python and Go, Ruby was left looking like a one-trick pony. The rise of microservices and API gateways also fits the lightweight Node.js async model better than Ruby's memory-heavy monolithic approach.


Update - daily output of modules: Python/190, Ruby/12.


What community? The web development space has moved on to JavaScript and friends. Ruby was also big in the security community for a while, but almost everything new is in Python. I actually checked the comments on this post to see if there was some new big push for Ruby I was missing.


Hotwire/Stimulus hasn't reversed Ruby's decline, as far as I can tell. Nor have the promised performance improvements in Ruby 3, so Ruby's downward slide looks likely to continue. Revivals are a rare thing in programming language adoption.


Yea. He has been a prolific contributor to the Ruby ecosystem. Inspiring work!


This follow up was published in September, 2020: https://ankane.org/more-ml-gems


With a total of 31 gems published within a year, and this person having a lot more published libraries outside the ML space, how much time is available per gem for things like continued development, regular updates to new ruby versions and/or bug fixing? I can't help but think that having so many gems becomes unsustainable rather soon.



