Didn't realize Square was interested in Ruby type checking, just like their competitors over at Stripe. Lots of money riding on Ruby, I guess :)
It does seem useful to have a _standard_ for type definitions - RBS as the equivalent to a .d.ts file - as that allows different type-checking implementations to use the same system under the hood. This was a big problem for Flow, and why it lost the fight as soon as TypeScript's DefinitelyTyped repository started gaining momentum - users wanted to use the type checker that they knew had definitions for the libraries they used.
On the other hand, RBS as hand-written seems rather dangerous, to me. Nothing wrong with using them to define previously-untyped external code, as long as you know the caveats, but I think you really want to have definitions generated from your code. Sorbet cleverly (and unsurprisingly, given it's Ruby) used a DSL for definitions in code, which had the (excellent) additional boost of runtime checking, so you actually could know whether your types were accurate - by far the biggest pain-point of erased-type systems like TypeScript.
Given that Ruby 3 was supposed to "support type checking," I'm surprised that it does not seem to have syntax for type definitions in code, and instead will focus on external type checking. I might be missing a piece of the full puzzle not covered in the blog post, however.
> I'm surprised that it does not seem to have syntax for type definitions in code
This is a big disappointment to me, one of the main advantages of static typing is that it can make code much easier to understand when types are added to non-obvious method parameters.
It also leaves a big question-mark over how this fits into the REPL, and how we might create type definitions dynamically, since many classes in a running Ruby application are conjured by frameworks.
At the moment, I’m disappointed in Sorbet’s capabilities‡, and it’s definitely not usable to me for the libraries that I maintain. I will (theoretically) be able to use Steep with _zero_ negative impact on deliverability.
> On the other hand, RBS as hand-written seems rather dangerous, to me. Nothing wrong with using them to define previously-untyped external code, as long as you know the caveats, but I think you really want to have definitions generated from your code.
Isn't the point that you run the type checker on your own code and it checks that it implements the signature correctly? Having a mismatch between the code and the signature will give a type error. How is this different from how Sorbet works?
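For illustration (class and member names invented here), the workflow the parent describes pairs a signature file with ordinary Ruby, and a checker like Steep reports a type error when the two disagree:

```
# merchant.rbs — the hand-written (or generated) signature
class Merchant
  attr_reader name: String
  def initialize: (name: String) -> void
end

# merchant.rb — the implementation the checker verifies against it
class Merchant
  attr_reader :name
  def initialize(name:)
    @name = name
  end
end
```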
Flow supports importing type definitions for third party untyped libs. And in fact did a better job at being integrated in current projects.
Flow lost because the compiler was in really bad shape, slow and frequently crashing. Also their equivalent repository to DefinitelyTyped would ignore PRs for months and years and afaik still does.
It's like it was somebody's toy project and its author eventually lost interest.
It's a pity because TypeScript still has unsound generics. But Microsoft knows how to make dev tools and maintain them.
> On the other hand, RBS as hand-written seems rather dangerous, to me. Nothing wrong with using them to define previously-untyped external code, as long as you know the caveats, but I think you really want to have definitions generated from your code.
That sounds like what type-profiler, mentioned in the article, is for; it's an experimental project which, if successful, seems destined to become part of Ruby’s bundled command-line tooling for generating type signatures from code.
If you mean you want type signatures embedded in source files rather than in separate files, they seem to have taken a documentation-annotation approach, with the YARD documentation format expressly called out as a mechanism to embed typing in source files. That's probably cleaner than further cluttering Ruby’s syntax with annotations.
> Given that Ruby 3 was supposed to "support type checking," I'm surprised that it does not seem to have syntax for type definitions in code
The support seems to be that, at a minimum, it will have a standard for type definitions, provide them for Core and Stdlib, and have command-line tooling for working with type definitions. Which is, I would say, significant support.
I also believe that, at some point, Ruby _will_ support RBS/steep format in the source code, but the advantage to something like .rbs files is that projects that need to support 2.x and 3.x don’t have to maintain two different versions of the code.
It must be a very easy next step to allow type declaration inline with the code, for example as comments of special format, or maybe some meta-fields / annotations (I'm not a rubyist so don't know whether the language allows associating custom meta information with program elements).
> It must be a very easy next step to allow type declaration inline with the code
Updating Ruby’s already notoriously complex syntax to support type annotations while keeping existing Ruby code valid with its existing semantics is...not a very easy step, I suspect.
Annotations in documentation is a more viable way of integrating type definitions into program source files.
This is the first I have ever heard of ruby syntax as notoriously complex. If anything it’s usually the opposite. I would love to read why people say that about ruby.
Ruby syntax is designed to be easy to use and flexible for humans; the decisions taken in pursuit of that make the actual syntax itself quite complex and difficult to parse and, more to the current point, difficult to modify without breaking things that are currently valid Ruby.
Yeah, there are so many alternate syntaxes, shortcuts, and ambiguous statements in Ruby; just reading through Matz’s reference book on Ruby was a trip for me.
Ruby is human-friendly at the expense of a machine-friendly syntax. Writing a machine parser for Ruby is awful.
I guess a language on the opposite side of the spectrum would be Lisp-likes, which are brain-dead simple to come up with a generative grammar for, but a little hard on the eyes.
That is a fallacy in language design. Humans do not have an algorithmic shortcut for parsing; if it's hard for the machine, it's hard for the human.
For short chunks of program text, we can probably rely on our natural language abilities to some extent. Those capabilities allow us to deal with transformational syntax, and ambiguities. So that is to say, we have a kind of general parsing algorithm that is actually way too powerful for programming language syntax, but which only works over small peepholes. Most speakers will not understand (let alone be able to produce) a correctly formed sentence that is too long or too nested. It's as if the brain has a fixed-size pattern space where a sentence has to fit; and if it fits, then a powerful pattern matching network sorts it out. Whereas a programming language parser is unfazed by a single construct spanning thousands of lines, going into hundreds of levels of nesting; it's just a matter of resources: enough stack depth and so on. As long as the grammar rules are followed, and there are resources, size makes no difference to comprehension.
When reading code, people rely on clues like indentation, and trust in adherence to conventions, particularly for larger structures. Even relatively uncomplicated constructs have to be broken into multiple lines and indented; the level of syntactic complexity that the brain can handle in a single line of code is quite tiny.
We also rely on trust in the code being mostly right: we look toward understanding or intuiting the intent of the code and then trust that it's implementing that intent, or mostly so. If something looks ambiguous, so that it has a correct interpretation matching what we think we understand to be the apparent intent, and also has one or more other interpretations, we tend to brush that aside because, "Surely the code must have been tested to be doing the right thing, right? Furthermore, if that wrong interpretation is not actually right, the program would misbehave in certain ways (I guess), and in my experience with the program, it does no such thing. And anyway, this particularly code isn't even remotely near the problem I'm looking for ..."
The idea that human and machine language parsing have any underlying similarity is amusing but pretty absurd. It depends upon the idea that we're somehow doing the "same essential thing", which we are not. Humans do not translate text to serial machine instructions for a processor. They do many things with text, but that is (very seldom) one of them.
I meant literally that Ruby is easier for a human to read for comprehension than say, x86 assembly, which it is. Ruby requires (however) substantially more complex parsing logic to machine parse (translate to machine instructions), because Ruby syntax tolerates an almost absurd amount of ambiguity. This distinction holds when you compare Ruby to many common programming languages. Lisp is an excellent example of a high-level language that can be parsed with minimal complexity. I can teach an undergraduate to build a Lisp parser in a day, but it would take weeks to get someone up to speed on a Ruby parser.
This was not posited as an essential tradeoff in programming languages (if I came off that way, my apologies). Ease of human readability is probably orthogonal to ease of machine parsing.
If you think that you have an algorithmic shortcut when parsing code, try cramming even a moderate amount of code into a single line with no indentation, and go by the token syntax alone. You will find yourself doing ad hoc parsing: scanning the code for matching tokens to extract what goes with what to reconstruct the tree structure.
Humans don't have a magic algorithmic shortcut. If I give you scrambled word decks of various sizes to sort manually, the best time performance you will be able to show will appear as an N log N curve. Maybe you can instantly sort seven objects just by looking at them, but not 17.
That would only be parsing Lisp s-expressions, which is a simple data syntax. But it's far from the complete syntax, which, by the way, is basically not statically parseable, since Lisp syntax can be reprogrammed on the fly by macros.
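To make the "simple data syntax" point concrete, here's a minimal s-expression reader sketched in Ruby (assumes well-formed input; a real Lisp reader also handles strings, quote, and reader macros, which is exactly where the static-parseability caveat above kicks in):

```ruby
# Recursively build nested arrays from a token stream.
def parse_sexp(tokens)
  token = tokens.shift
  if token == "("
    list = []
    list << parse_sexp(tokens) while tokens.first != ")"
    tokens.shift # discard the closing ")"
    list
  else
    token # an atom
  end
end

# Tokenize by padding parens with spaces, then parse.
def read_sexp(src)
  parse_sexp(src.gsub("(", " ( ").gsub(")", " ) ").split)
end

read_sexp("(+ 1 (* 2 3))")  # => ["+", "1", ["*", "2", "3"]]
```

The entire grammar fits in two methods, which is the contrast being drawn with Ruby's parser.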
Back in the 1.8.x era, the Ruby parser was already 6k lines, even using a parser generator.
The grammar is notoriously complex in ways that most users of the language thankfully do not have to worry about. But it does make extending the syntax quite hard.
I mean, the most obvious solution would just be to unify Ruby's basic syntax with the RBS syntax shown in the OP. This format already looks like a Ruby class definition with the method bodies omitted and some simple "-> type" and ": type" syntax added. I think that's why people find the separation confusing.
> This format already looks like a Ruby class definition with the method bodies omitted and some simple "-> type" and ": type" syntax added
The thing is that much of it is perfectly valid Ruby code with wildly different semantics already, so, no, without breaking a lot, you can't unify it with Ruby syntax.
It is, and checkers like Steep and Sorbet can infer these types. We're currently playing with the idea of deriving from documentation like YARDoc as well.
I'm really puzzled by the decision to use a separate file for this. The stated justification ("it doesn't require changing Ruby code") doesn't make sense, and my personal experience with languages with external type specifications is strongly negative. It's an unbelievable pain to keep multiple interface files in sync over time.
`.h` files are not something to emulate! External interfaces should be generated by tools where needed.
FWIW, you can use inline syntax with Sorbet[0], one of the two typecheckers that will work with the RBS format (the other being Steep, which does not have inline syntax).
Here's a full example, complete with a typo, based on the example in the blog post: https://bit.ly/3hMEMSp
Here's a truncated excerpt to get the basic idea across:
  # typed: true
  class Merchant
    extend T::Sig

    sig {returns(String)}
    attr_reader :name

    sig {returns(T::Array[Employee])}
    attr_reader :employees

    sig {params(token: String, name: String).void}
    def initialize(token, name)
      @token = token
      @name = name
    end
  end
Disclaimer, I used Sorbet while I was an employee at Stripe. I found it to be a terrific typechecker. It's also just absurdly fast (most of the time).
OK, but if we're going to have .rbs, why not just modify the Ruby syntax to allow .rbs-style types inline? Especially because .rbs already looks like class and method definitions without the bodies. So... just add the bodies.
  class Merchant
    attr_reader token: String
    attr_reader name: String
    attr_reader employees: Array[Employee]

    def initialize(token: String, name: String) -> void
      # actual method body
    end

    def each_employee: () { (Employee) -> void } -> void
                     | () -> Enumerator[Employee, void]
      # actual implementation body
    end
  end
It seems like they are trying to support existing competing work... but I'm not sure any Ruby users actually want that. I prefer this .rbs to Sorbet all around, and would prefer it inline.
The Ruby syntax is too complicated to allow for changes like this to be backwards-compatible.
For example, `attr_reader token: String` is valid ruby today – that's the same as `attr_reader(:token => String)` which somebody might be doing in the wild, since you can override `def self.attr_reader`.
Similarly, `def initialize(token: String` clashes with the definition of keyword arguments.
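To make the collision concrete, here's a hypothetical `Demo` class (names invented) showing that `attr_reader token: String` already parses as an ordinary method call receiving a hash argument today:

```ruby
class Demo
  # Somebody in the wild could have overridden attr_reader like this.
  def self.attr_reader(*args)
    args
  end

  # Valid Ruby today: parses as attr_reader({ token: String }),
  # so the same spelling can't be repurposed as a type annotation
  # without changing the meaning of existing code.
  RESULT = attr_reader token: String
end

Demo::RESULT  # a one-element array holding a plain Hash, not a type
```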
I am not able to spin that into "And besides it's better to force it to be in two files anyway!", I don't think it is, but I guess it's not so easy to do different.
If we could write tests in .rbs files it would more naturally fit into existing 2 file workflows.
Mind you, if we could write tests in .rbs then I guess .rbs could form the basis of a new ruby syntax without breaking compatibility with old code in .rb files.
Sorbet was written in C++ and is a great piece of work, Stripe did a great job with it. It does have some issues as soon as someone gets into the magic weeds with metaprogramming like Rails does.
Disclaimer: Working at Square, have friends at Stripe, enjoy both type checkers.
I believe one of their guiding principles was that they wanted all the syntax to be valid Ruby, because they did not want it to become a separate Ruby interpreter. So they were pretty limited in the syntax available to them.
I believe they don't want to just strip out the annotations because Sorbet also does run time type checking. So to get all the features they wanted, they had to either write a new interpreter or use valid Ruby.
One thing I never really figured out with Sorbet is how it would work if I wanted to distribute a gem with type checked code. A typed gem would necessarily have to depend on the sorbet gem. Wouldn't this mean library users have no choice but to opt into type checks always being run in this library? (Is this why sorbet-runtime exists?)
Yeah, the gem would depend on sorbet-runtime, and the library author could configure sorbet to not run any checks in production if desired (or to have any errors log instead of throw).
You can configure things like this globally and/or for each method call.
E.g.:

  # turn off all runtime checks
  T::Configuration.default_checked_level = :never

  # turn off runtime checks for one method
  sig {returns(String).checked(:never)}
  def foo; :wont_raise; end
Personally if I were authoring a gem I'd leave the runtime checks on except in hot paths, so my users get quick feedback when they pass the wrong thing.
In any case, the library author can get the benefits of static and runtime typing, and their users will get nice static typing if they use sorbet. Users also get nice runtime typing for the library if the author chooses to leave it on for them. The overhead is usually small.
You can sort that out easily by doing something like:
  module T
    module Sig
      def sig(*args)
      end
    end
    # You'd need to stub out a few more things here.
  end

  begin
    require 'sorbet-runtime'
  rescue LoadError
  end
Basically as far as what I can tell from just having briefly looked at Sorbet, you could quite easily stub out the bare minimum to allow people to choose whether to pull in the full thing or not. It'd be nice if they provided a gem that did that.
Yeah I agree with this. They cite the TypeScript compiler, which in addition to supporting .d.ts files also supports compiling regular JS alongside separate TS files in the same project. I think this would have been a better approach for backward compat as well, so that users could upgrade to versions supporting static typing and incrementally change projects one file at a time (leaving existing code intact).
I'm not sure how tools that use RBS without inline syntax will handle these situations, but to be honest I expect the community to adopt Sorbet in practice anyway. It's very fast and battle-hardened in production at Stripe and several other large companies.
Edit: Like, seriously. Either the local var is populated by something coming in externally (which is then typable) or, unless your code is too complex / large, it should be easy to see everywhere it's used, and then why would you need that additional typing info?
One big use-case of types is the sanity-check that the value is what you think it is.
A classic example of where I might have an inline type annotation in Rust is when I'm doing a non-trivial chain of Future/Result combinators in the middle of a function. It doesn't take much code for your understanding to desync from reality. Annotating "Result<String, IOError>" inline both documents to others what this intermediate value is but also creates better, local errors as the chain is modified.
Complex stuff does generally get factored out into functions, but at the same time, it's nice when you're the one who decides when it makes sense to extract code rather than a limitation of the typing syntax. Those things don't always line up.
Because when you see the benefit of type annotations (I’m not saying that’s objective, just if you do go that route) you want to add type information to as much as possible. Leaving them off because you want to is one thing. Not being able to is an unnecessary limitation.
While that’s true, that’s not what I’m talking about. I’m talking about the communicative benefit of type annotations. If you get the benefit from seeing the types, you don’t want them to be inferred. You use them as a reading tool.
If something is untyped in Sorbet, you can give it a type with `T.let`. So if the return value of function `foo` is untyped, but you have a high degree of confidence that it will return a `String`, you can do `ret = T.let(foo, String)`
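This is not Sorbet's implementation, but the runtime behavior of a `T.let`-style assertion can be sketched in plain Ruby (`assert_type` is an invented name for illustration):

```ruby
# Sketch only: what a T.let-style runtime check boils down to.
def assert_type(value, type)
  unless value.is_a?(type)
    raise TypeError, "expected #{type}, got #{value.class}"
  end
  value # pass the value through, now "blessed" with the asserted type
end

ret = assert_type("merchant-42", String)  # passes and returns the value
# assert_type(42, String) would raise TypeError
```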
Right. TypeScript also doesn't require changing files and everything is opt in but you can add them inline.
If the author thinks that's the biggest benefit, I'm inclined to think the ruby community doesn't seem to have enough eyes these days in the core development.
Yeah, I was wondering why this was being announced on Square's website. Seems it's because Square happens to employ Soutaro Matsumoto, who wrote the post and is also the creator of Steep[0] (an implementation of a typechecker for RBS files).
It's not clear to me whether Soutaro is a member of the Ruby core team, so it feels a bit odd that the post is written like an announcement from the Ruby maintainers.
Soutaro is indeed a core member of the Ruby team; he also happens to work at Square. Soutaro is also one of the main contributors to RBS and helped define that standard.
He was going to keynote on this at RubyKaigi this year until it was cancelled, and had a talk at RubyConf as well on this.
This. RBS is the underlying language for defining type checkers. Sorbet and Steep both utilize it, and this allows future type checkers to evolve from a known-base instead of having to reinvent everything.
Can someone explain why the types cannot live in Ruby code itself (after an appropriate version bump)?
Python 3 incorporated types into the language itself, in a similar way (though non-reified) to PHP. This seems much easier to deal with than requiring two files (.rb and .rbs) to describe a single data structure.
I can well imagine that it might be because ruby's formal syntax is already utterly bonkers, and the thought of adding types to it in any usable fashion gave someone a seizure.
Haven't used Ruby in years, for the typical reasons people move away from it (performance, strong types, GVL, etc.), but syntax is the #1 reason I like programming in Ruby. I did mostly Ruby for about 5 years and really grew to love it! It may seem bonkers at first but is quite enjoyable once you understand it. Now, nearly 4 years later of mostly JavaScript, Golang, Python, and Haskell, I still regularly stop and think to myself how much I miss Ruby!
Yeah no, you can say this against literally every criticism and be right unless someone comes up with time spans and user experiences of other great things to disprove it. Ruby was great on windows since 1.8, don't know how it was before that. I recently scripted AIX with it, so not really a contender on the same lane.
Then for instance most languages get away with inline optional typing by using “:” , for instance “ping_user(name: String)“. In ruby it’s of course already taken, in no small part because there are 3 or 4 different ways to declare hash parameters.
I’d imagine most decent syntax candidates had similar issues, due to ruby’s syntax versatility.
The worst part of ruby imo is the fact that a hash can have both string and symbol keys. Countless times I have encountered issues where a function takes an options hash and the callers use both string and symbols for the same key depending on which caller it is. I end up calling the function to convert to symbols all the time.
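The pitfall in miniature, with the usual normalization via `transform_keys` (in the stdlib since Ruby 2.5):

```ruby
# Two callers passing "the same" option with different key types:
opts_from_caller_a = { "retries" => 3 }
opts_from_caller_b = { retries: 3 }

opts_from_caller_a[:retries]  # => nil — string and symbol keys never match
opts_from_caller_b[:retries]  # => 3

# The conversion dance the comment describes:
normalized = opts_from_caller_a.transform_keys(&:to_sym)
normalized[:retries]          # => 3
```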
Actually, if my memory serves me, a Ruby hash can use any object as a key! And considering everything in Ruby is an object (even the class `Object`), it’s really quite elegant.
I'm trying to think of any typical chars that don't already mean something, and I think at best you'd have to use a pair, and even then it would potentially break older code. Very roughly, offhand, something like:

  attr_accessor ~:String :name

and `def sing(~:Song song):` seems pretty ugly but borderline feasible, on the premise that while ~ and : have meaning in Ruby, it's not super likely that bitwise-inverting symbols is common. (I'm sure there are more reasons that wouldn't work or isn't great.)
I don't like the separate file thing, but it does seem more challenging than I'd have thought to avoid.
I guess, on a tangent, Ruby code historically cares a lot more about duck typing, so strong typing will be a headache for a lot of stuff.
I genuinely think that adding `:` as a hash separator was a mistake. Apart from anything else, you get this weird effect where the type of the key in `{"foo": bar}` isn't what you think it is.
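That weird effect, concretely:

```ruby
bar = 1
h = { "foo": bar }  # the quoted key is STILL a Symbol, not a String

h.keys.first.class  # => Symbol
h[:foo]             # => 1
h["foo"]            # => nil — the String key you might have expected
```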
I wonder what percentage of TypeScript users write their types inline, and what percentage of users choose to write separate .d.ts files for each of their source files.
My guess is the latter is vanishingly small – that it's pretty much only done for libraries that were written before TS was a thing – so I wonder how things will go in Ruby.
Maybe everybody will just standardize on third-party tools like Sorbet which allow inline typedefs, or use types a lot less, or hook up a "regenerate inferred .rbs on save" workflow in their editor, or just switch between files a lot.
I don't use ruby, I am genuinely interested - why is it great? I'm assuming if it were ever allowed, it would be a use-at-will feature and wouldn't affect anyone who didn't use it. Typescript has probably doubled if not more my speed and accuracy since I've adopted it - yet I still do plenty of things in normal javascript. These days I'm usually unhappy when something does not have typings because it can make it terribly difficult to discover things.
It's great because Ruby is an Object-Oriented Programming language. Just saying that is an understatement; Ruby lives and breathes Object Oriented philosophies. It was made for them.
The conflict here is that object oriented philosophies aren't actually about objects. They're about communication between objects. The messaging between objects. As per Alan Kay himself:
> I'm sorry that I long ago coined the term "objects" for this topic because it gets many people to focus on the lesser idea. The big idea is "messaging".
The goal of object oriented design is to focus on the communication between objects, not the objects themselves. Part of that is that the type of object receiving the message doesn't matter so long as it understands the message and knows how to respond. If the object looks like a duck, swims like a duck, and quacks like a duck, that's good enough--even if the duck turns out to be a chicken with an identity crisis. It understood the message and responded, and that's all we want in object oriented programming: objects that can communicate with each other.
Adding type checking flies in the face of this philosophy. Instead of type being irrelevant as long as the receiver of a message can understand that message, suddenly it's front and center. The code will accept or reject objects based on their type even if they're fully capable of upholding their end of the conversation.
Type-less-ness is core to Ruby. But some people may still prefer to include typing. We all want to use the tools and practices that best enable us to deliver, so that's a fair want. But since Ruby as a philosophy doesn't care about type, it's important to maintain type checking as an accessory to the language, not a feature of it. Something that can be layered on top of the Ruby language for those that want it, but that can be ignored by those don't.
Bravo. Let dynamic languages be dynamic. Why does every damn language have to approximate Java in the long run? PHP is nothing more than pseudo-Java, and JavaScript is heading in the same direction now that classes have become firmly established. At least there's still Clojure.
The philosophical argument in the Ruby community is basically that Ruby is not a statically typed language, period. And a strong contingent, myself included, do not want a hybrid world where type annotations are optional, spattering redundancies all over our syntax. Mostly because I see that as a step in the direction of some kind of "strict" mode that will ultimately enforce type annotations and type-checking and destroy most of what I love about Ruby.
That's why the approaches being used keep the type annotations out of the source files themselves.
> Typescript has probably doubled if not more my speed and accuracy since I've adopted it
TypeScript hasn't ever done anything for me than give me 3rd party dependency integration headaches. I love strongly typed languages and compile time checking, but TypeScript has never seemed worth the trade off due to its broken interoperability with normal JavaScript and the terrible state of crowd sourced typedefs. I'm either fighting some badly defined third party typedef, spending a lot of time creating typedefs myself or dealing with a version issue because the typedef isn't compatible with the version of the library I'm using.
When I use JavaScript I hardly ever run into issues that static typing would have prevented and I have zero TypeScript issues.
Honestly how has it improved the speed at which you get things done? Were you just constantly running into JavaScript bugs due to the lack of typing?
This was my experience with typescript. Nothing I actually wanted to use had first-class support for typescript. Everything I settled for had endless compiler errors that had more to do with the tsconfig than my actual types.
Then at the end of the day, it was still JavaScript (an interesting word for "not ruby"), but with types slapped on top.
I ended up switching to crystal, which is basically ruby + types (infered when possible, but I actually wanted the types) with the performance of golang.
Most of the improvement is from the typings that other libraries come with if, like you said, they are complete. Now I can just ctrl-click into an object to view its methods, and from there view the interfaces the methods accept and the interfaces those interfaces accept, and so on and so on.
Honestly, I rarely refer to documentation for these things because every project is a snowflake and the documentation gradient goes from no documentation to perfect documentation. By that, I don't just mean the words; I mean the website or the framework used to document, as well as the style of documentation (more like flavor?). TypeScript is the great equalizer that makes a project with no documentation (but decent comments or method/var names) just as documented as one that does.
I can also ctrl-space easy to get a list of methods in case I forgot which method I needed, or if I want to discover what's available. That's enormous in my style of programming. Sure beats going to someone else's documentation page, trying to read it.
Some of the improvement is not necessarily that I have javascript bugs due to lack of typing but rather that with typescript I don't get those bugs which means I don't have to reason about avoiding those bugs anymore like I did with javascript. Sort of a reduced cognitive load.
Also, I have a few coworkers who are not javascript/typescript-savvy that I was able to get up to speed with typescript fairly easily due to the ease of using the types. There are, of course, hard things such as partials or understanding tsconfig.json or even generating types that I don't cover with them; I just have them come and get me when they're ready.
For most things without types I just do the declare module in a d.ts - however, I will first try to find another package that does the same thing with types. Most popular packages these days do include types, some better than others.
After I re-read above, I realized that a lot of it depends on the IDE. If I were still using vim or kate/gedit, it probably wouldn't be a huge timesaver. Fortunately, I settled on one of the intellij editors.
I'm having a really hard time understanding this "I need types forced down my throat" and "I like typing 3x as much as I would otherwise need to" and "yes, I want half my screen obscured by the types of everything I'm doing, not the actual code" and the "adding types now means bugs are impossible" mass cult hysteria that's running so rampant. Typing very occasionally prevents bugs that are generally easy to catch/fix or show up straight away when running an app. It's mostly a documentation system. And it slows development down.
Especially in Ruby which is such an elegant "programmer's language" I think it would just be silly.
If your type definitions are 3x longer than the functions implementing them, something is wrong. In languages with complete type inference, you actually don't have to write types at all if you don't want to, though in practice you end up doing so to clarify your intentions.
Static types do make certain classes of bugs impossible, like missing method bugs, typos, and the like. You can eliminate a large group of defensive programming techniques and trivial unit tests that you would need in a dynamic language to ensure a similar level of confidence in a program. Obviously they don't make all bugs impossible, there will be bugs as long as there are programs, because we write programs without perfect knowledge of the requirements, and this is an unavoidable pitfall of software.
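A hypothetical `Merchant` example (names invented) of the "missing method" class of bug: in plain Ruby the typo only surfaces when the line actually executes, which is exactly what a static checker flags before runtime.

```ruby
class Merchant
  def name
    "Acme"
  end
end

m = Merchant.new
m.name    # => "Acme"

begin
  m.nmae  # typo: only raises NoMethodError when this line runs
rescue NoMethodError => e
  e.name  # => :nmae — the misspelled method the checker would have caught
end
```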
This can depend really heavily on what you mean by "development." If it's just getting the first version banged out, sure. If it includes coming back to code a couple years later in order to incorporate a new business requirement, having that documentation present can be a really big deal. 2 seconds spent typing out a type hint now might, down the line, save several minutes on average. Even in a recent Python project I did over the course of just a couple weeks, when I got to the "clean the code up and get it ready to put on the shelf for now" phase of the project, I ended up wishing that I had bothered to use type hints just a wee bit more when I was banging it out in the first place. It would have been a net time saver.
I don't like static typing super a lot in all cases because it makes it hard to do data-level programming. Which I find to be the true productivity booster in dynamic languages. But optional typing seems to hit the sweet spot for a great many purposes.
For example, JSON describes a logical structure of nested lists and dictionaries. If you were doing data-level programming, you would just map the JSON into actual nested lists of dictionaries and get on about your business.
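A Ruby sketch of that data-level style (the payload shape here is invented for illustration):

```ruby
require 'json'

raw = '{"id":"1","value":"999","tags":["myapp"],"extra":{"ignored":true}}'

# Map the JSON straight into nested Hashes/Arrays -- no domain classes,
# no up-front validation of fields this code never looks at.
metric = JSON.parse(raw)

puts metric["id"]               # read only what the job at hand needs
puts metric.fetch("tags", [])   # tolerate the field being absent
```

The "extra" field rides along untouched; nothing forces you to model or validate it just to read the two fields you care about.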
The alternative, which is more common in static languages like Java, is to transform it all into some set of domain model objects, and probably validate it up-front, too. Even the bits you don't actually need to look at in order to accomplish the job at hand. IMO, that approach tends to mean creating a lot of unnecessary work for oneself. It also makes it harder to obey Postel's law.
(The corollary to that last bit is that it is also possible for static typing to create bugs.)
I'm skeptical of Postel's law, if you deviate from the spec how can the meaning be clear? It seems to me like you would have to go out of your way to implement a buggy version of the spec?
A personal example of this: httpd used to accept standard headers with spaces instead of dashes, which leads to strange behavior if you accidentally include both. So they decided to stop doing that in a major version. That major version was opaquely included by ops, accidentally, into our base images. This led to a very long day of debugging on our end.
Point is, by being liberal with what you accept you create ambiguity, which you may not totally understand at the time. By putting that out into the wild you basically are forced to keep this ambiguous, undocumented spec alive or you no doubt will end up breaking some client.
That's definitely a concern, but it's also way outside of what I was talking about. I would also expect any JSON parser, even one in a dynamic language, to fail on JSON that is straight-up malformed. And ambiguous formats are always bad news.
I'm talking about situations where the JSON is formatted fine, it's just that some field wasn't specified, so then the entire input gets rejected. Even though there was zero need to read the contents of that field in the first place. It just happened to be included in some domain object that gets re-used everywhere, including some other places where the field's contents do matter.
Keep in mind that, when we're dealing with anything that might be transmitted in JSON, thinking that there might be a published spec, and that it manages to accurately cover all these details, is really optimistic. I've honestly never seen it happen in the wild. Oftentimes, any validation rules you might try to impose are guesswork as much as they are anything else. So complaining that a piece of data didn't conform to the spec might not even be a valid thing to do. All you can say for sure is that the data didn't meet the needs of some piece of business logic.
It's not perfect, but it's life. This tension, for example, is at the heart of why proto2 got replaced with proto3, and why using proto3 is strongly encouraged if you're looking to build a robust infrastructure.
There are huge debates at Google internally over required vs optional in proto2 and proto3.
Beyond that I think you’re operating from a misconception about JSON parsing in static languages. There’s no requirement to convert to domain objects and reject data that doesn’t fit on a triviality, you’re just required to specify explicitly what happens when you encounter unexpected structure or data.
Sorry if I wasn't being clear. I'm not saying that's the only way it can work in static languages. I'm saying that that's the way it tends to work out in practice, because the ergonomics of most popular static languages tend to discourage a less brittle approach.
Whereas the ergonomics of popular dynamic languages tend to favor an approach that I find, for this specific purpose, to be both less verbose and more robust.
> For example, suppose we have JSON that represents a set of metric data (this isn't our real JSON, this is just a thought experiment) that should look like this, with "tags" being optional attribute:
{ "id": "1", "timestamp":"12:30pm", "value":"999", "tags": [ "myapp" ] }
> Suppose a Python client sends tags but calls the attribute "tag" rather than "tags" (it's missing the "s"). It's an optional attribute, so the server won't consider it an error if the "tags" attribute is missing. But it also won't fail due to this unknown attribute called "tag" - it will just silently ignore it. The Python developer is wondering why his tags aren't being stored - he is getting no errors, they are just being silently ignored. He would need to figure out that he is sending the wrong attribute name, with no error messages to help him out.
> That's the use-case I'm asking about - the "silent error" that will occur due to malformed JSON messages.
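One dynamic-language mitigation for that silent error is to fail loudly on keys you don't recognize instead of dropping them. A sketch, with the allowed key list assumed from the example above:

```ruby
require 'json'

ALLOWED_KEYS = %w[id timestamp value tags].freeze

def parse_metric(raw)
  data = JSON.parse(raw)
  unknown = data.keys - ALLOWED_KEYS
  # Surface the misspelled "tag" instead of silently ignoring it.
  raise ArgumentError, "unknown keys: #{unknown.join(', ')}" unless unknown.empty?
  data
end
```

This is still a runtime check, of course, and it only catches the mistake on the server side - but it turns a silent failure into an immediate error message.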
What is the difference in approach between these? I've programmed extensively in dynamic and static languages, and don't understand what you're talking about. Less verbose, I might concede. More robust though, I need some more evidence.
Reminds me of Rich Hickey’s “Maybe Not” speech, which I understand him suggesting that programming with “sets” is better than programming with “records” that may contain optional values.
Yes, I know it and he seems to mostly ignore the fact that you can still fall back to manual typechecking in a statically typed language. That’s the part I don’t get. There’s nothing stopping you from manipulating JSON structurally in a static language.
You can definitely still do this kind of programming in a statically typed language. There are a few ways to go about it.
One way is to treat the JSON as a generic JSON structure, and traverse it manually. Of course, you will have to be explicit about what should happen when children are of different types from what you expect, though this explicitness could just be throwing an exception or ignoring it. Haskell's Aeson and Rust's serde_json both support this, as does .NET's JsonElement type.
Unfortunately, this means you're passing around a lot of objects called something like "JSON" without any information about what they contain at the type level, and as an alternative between that approach and creating domain objects, there are row polymorphic records, which allow you to write functions that accept any record that has certain fields, and also specify that they may also contain other fields which you do not handle. This allows you to program to what you know about the types you're ingesting without having to write a lot of new types.
Clojure is strongly typed. I think you mean statically typed.
They're orthogonal concerns. C is statically and weakly typed. Clojure is dynamically and strongly typed. PHP is dynamically and weakly typed. Haskell is statically and strongly typed. Java, as the most design-by-committee language ever, manages to be a mix of all four.
Weak typing is when types get automatically converted, e.g. `2 + "3" == 5` or `"2" + 3 == "23"`.
Strong typing doesn't do these automatic conversions; it throws an exception or generates a compiler error instead.
Static typing — types checked at compile time.
Dynamic typing — types checked at runtime.
"Strong" typing doesn't mean much of anything and I generally try to avoid using it but slipped up here. When I do use it, I use it as a synonym for static languages with expressive type systems. I prefer statically typed languages.
Strong typing generally does not mean much and everyone seems to be using a different definition. Would you consider Javascript weakly typed? What about Python?
I'd consider JavaScript to be more toward the weak typing end of things, because it does lots of automatic conversions with surprising results. (see, for example, Gary Bernhardt's "Wat?" lightning talk.) I don't think I'd consider it as weak as C, which has things like unions and pointers that let you just sort of fall out of the type system entirely.
I'd consider Python to be more strongly typed than JavaScript. It doesn't do quite so many automatic conversions. For example, in Python, `1 + "foo"` is a TypeError. In JavaScript, it's "1foo". Sadly, `1 == True` in Python, so it certainly doesn't get full marks.
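For what it's worth, Ruby itself sits toward the strong end of that scale: mixed-type arithmetic raises rather than silently converting. A quick sketch:

```ruby
# Ruby is dynamically but strongly typed: no implicit coercion
# between Integer and String in either direction.
def attempt
  yield
rescue TypeError
  :type_error
end

p attempt { 2 + "3" }       # :type_error -- unlike JavaScript's "23"
p attempt { "2" + 3 }       # :type_error here too
p attempt { "2" + 3.to_s }  # "23" -- only with an explicit conversion
```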
{-# LANGUAGE MultiParamTypeClasses, TypeSynonymInstances, FlexibleInstances #-}

import Prelude (String, (++), show, Int, (==))
import qualified Prelude

class Add x y where
  (+) :: x -> y -> y

instance Add Int String where
  (+) x y = show x ++ y

instance Add Int Int where
  (+) x y = x Prelude.+ y

instance Add String String where
  (+) x y = x ++ y

a = ((1 :: Int) + (1 :: Int)) == 2
b = ((1 :: Int) + "aa") == "1aa"
c = ("a" + "aa") == "aaa"
Examples like the last one about Python are why I think it’s approximately meaningless as a descriptor. I don’t see why dynamic languages should have any implicit conversions at all.
Where you store the type information and when you do the type check is a separate question from whether you do the type conversions automatically or not.
I think a more interesting question is typecasts, like those in languages such as Java and C#. These languages are nominally statically typed, but they retain some type information at run-time, so that you can perform run-time type conversions, which requires run-time type checking. Which is the defining feature of dynamic typing.
C# is a little bit more straightforward about being a hybrid static/dynamic language, with its reified generics and dynamic references. But teasing out the details of where, how, and the extent to which Java is statically or dynamically typed would make a decent topic for a master's thesis.
It also hints at a deeper thing that one must be mindful of: static/dynamic and strong/weak are not binary categories. They're not even the extremes of two binary scales. They are somewhat vague descriptions that are meant to serve as useful shorthands for certain sets of choices that one must make when designing a language's type discipline.
But the fact that they're not cut-and-dry terms does not mean that they're meaningless. It just means that one must disabuse oneself of the notion that they're cut-and-dry before one can have a conversation about type discipline that goes beyond a certain level of detail.
You’re muddying the waters. Static and dynamic have a much clearer distinction between them than “strong” and “weak” typing do. These things aren’t binary but that doesn’t mean they are equally descriptive terms.
Java is a statically typed language with late binding implemented through subtype polymorphism and its type system has been explored pretty extensively in the literature.
> Typing very occasionally prevents bugs that are generally easy to catch/fix or show up straight away when running an app.
This is not true. You could paint almost every language feature aimed at producing correct software in this way: "writing tests makes me type more, and they catch very few bugs that would have been shown when running my app anyway". (Or, as an ex coworker once told me, "I don't need to write tests because I never have any bugs").
And what are types if not a kind of test/proof that the computer writes for you?
> And it slows development down.
There's a software development adage that goes like this: "I don't like writing tests, because they make me waste time I need to fix bugs on production that weren't caught because I don't write tests."
> It's mostly a documentation system. And it slows development down.
Well, I guess this is also a matter of perspective.
From where I'm standing, I'd rather you slow down and "document" your code. Code written at the speed of thought makes for an awesome MVP and for an awful legacy for your co-workers.
In the course of my job I write Swift for iOS and Ruby for server APIs and our web-based UIs.
Type issues are about 0% of my Ruby bugs, but dealing with all the damn type requirements in Swift regularly takes dozens of minutes to track down when some weird esoteric error message pops up. And God help you if you try to use generics.
If you want strong typing, then good for you. Just pick a language that fits that mold.
So much of what I love about Ruby is what it doesn't make me do.
Type issues are 0% of your Ruby bugs because you're not using a typechecker. I guarantee you have type errors somewhere if your codebase is large enough.
My point is that imposing a big ass type system on developers as a "solution" to a trivial number of actual problems is overkill.
I'm sure there are developer/projects that both enjoy and benefit from static typing and strict type systems of various kinds. I just want Ruby to remain a place for those of us who aren't in those positions.
I'm not sure what a "big ass type system" is, and I disagree that the number of actual problems is trivial. However, I'm in no more position to say what Ruby should be than you are, and I'm sorry you're so opposed to static types that even attempting to support them is a minus in your book.
However, even with TypeScript ascendant, the vast majority of people programming JavaScript write vanilla dynamic JS. I don't think dynamically typed Ruby is ever going to die. Whether large enterprise codebases will standardize on requiring type signatures is a different matter, because the benefits always outweigh what downsides you see in static typing once you surpass a certain scale.
>Swift's type system is what I have in mind: strict, complex, required, and in my experience, often petty.
I do hear a lot of complaints about Swift's type system. I wonder what the specific problems are, because I do not hear similar complaints about Rust. I wonder if it's the combination of subtyping with a lot of type inference and also a full-on trait system with protocols and extensions and such.
My biggest complaints all center around the intersection of custom types with protocols and extensions, especially when trying to get a generic approach to something working.
In my experience at least 70% of bugs are ones that you'd catch by using types - things like x instead of y, possibly-empty list instead of known-nonempty list, user ID instead of group ID. Logic errors that couldn't be caught by typing do exist, but they're very much the minority.
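In Ruby the user-ID-vs-group-ID distinction can only be enforced at runtime, but the wrapper-type idea sketches like this (names invented for illustration):

```ruby
# Distinct wrapper types so a UserId can't silently stand in for a GroupId.
# In a static language these checks would happen at compile time.
UserId  = Struct.new(:value)
GroupId = Struct.new(:value)

def remove_from_group(user_id, group_id)
  raise TypeError, "expected UserId"  unless user_id.is_a?(UserId)
  raise TypeError, "expected GroupId" unless group_id.is_a?(GroupId)
  "removed user #{user_id.value} from group #{group_id.value}"
end
```

Swapping the two arguments now fails immediately with a clear error instead of quietly operating on the wrong record.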
Maybe we just work on different kinds of problems.
70%+ of bugs I deal with are business logic issues that no type system could solve.
Sure, as I code I run into an occasional nil object or NoMethod error, but those last as long in Ruby as they do in Swift (about 2-5 minutes while working on that specific part of the code).
I've worked across a wide range of industries over several years, and it's always been pretty similar. You should be building the business constraints into your types so that errors in the business logic become errors in the types - in my experience if you actually work with the type system then most errors become type errors. If you've got examples of the kind of errors you're talking about then I could try to be more specific.
Not the GP, but here is a scenario that I am interested in understanding from the perspective of types.
A calculation that involves 21 parameters (in a particular insurance industry underwriting) yields a number. A threshold is read from the database. This threshold could change every month.
Suppose that the current value of the threshold is 0.78. The calculation above can yield an `x` with the following cases:
(i) x <= 0.78,
(ii) x > 0.78.
We have hundreds of test cases for the combinations of the 21 parameters, leading to hundreds of values for `x`. It is a bug for `x` to be > 0.78 when it should be the other way.
Is there a way this can be encoded in types? That would be very interesting.
This description doesn't quite make sense. If the threshold is regularly changing, the calculation can output the same result number for the same 21 parameters and have it be a bug or not a bug from month to month, depending on the threshold. How can you write a test for that without locking in the threshold? Indeed, without hard-coding the threshold in the calculation itself?
Sure. Create a type that represents x being <= that threshold, with a private constructor. Only allow constructing it via a factory method that requires it to be an x that should be <= the threshold. Then whenever you have a value of that type, you know that it's legitimately <= the threshold, and the bug becomes impossible.
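In Ruby this becomes a runtime check rather than a compile-time guarantee, but the smart-constructor shape looks like this (threshold hard-coded here; in practice it would be read from the database):

```ruby
class BelowThreshold
  THRESHOLD = 0.78 # assumed current value; really a monthly DB read

  attr_reader :value

  # The only way to build an instance is through this factory, which
  # enforces the invariant. Holding a BelowThreshold therefore proves
  # the value was <= the threshold when it was constructed.
  def self.from(value)
    raise ArgumentError, "#{value} exceeds #{THRESHOLD}" if value > THRESHOLD
    new(value)
  end

  private_class_method :new

  def initialize(value)
    @value = value
    freeze
  end
end
```

Downstream code that only accepts a `BelowThreshold` can then skip re-checking the invariant; the hundreds of parameter-combination tests exercise the factory instead.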
Don't you see the irony in your own comment? If you never create type related bugs in ruby then you shouldn't encounter them in a typed language either because you are infallible. The truth is probably that you see all the type errors at runtime instead and don't see them as such.
Sure, and I get compile time errors in Swift. Each last about 2-5 minutes.
The actual bugs I have to fix are nearly always business logic issues. Edge cases around 3rd party integrations, incomplete implementations, unintended side effects, etc.
Types are great for tooling, which is a much bigger drive for me to use them than soundness guarantees. I can’t stand opening up API docs in a separate tab (or god-forbid browser window) once I got used to having literally everything I could want to know about how I can use a value available with a simple Cmd+Space.
> I like typing 3x as much as I would otherwise need to
3x? Even on languages that do not support type inference I would say that this is at most 1.1x. Even then, type inference exists.
> adding types now means bugs are impossible
I usually see that as a mis-representation of what type advocates say. Rather, it seems that people just support that types reduce the amount of bugs.
> or show up straight away when running an app
Or that show up after you had said app running for a while, and then you get a run-time type error which appears only after doing certain actions. This is the main reason that I am avoiding languages like lua and python.
(In addition languages with more advanced type-systems allow you to catch bugs such as buffer overflows or division by 0 at compile time)
Based on the relative smoothness of Ruby version transitions versus Python, I trust Matz’s preference on this implicitly. One good thing about it being external is that you can optionally and experimentally annotate existing code without munging up your source files. At least so long as this is a bleeding edge feature, that separation makes a lot of sense to me. It’ll be a while before anyone can be confident in a particular model for how this should work, until it’s been in use for a good long while.
RBS and type files on the side were really hotly debated for a while and the core team settled on this as a way to not break the existing parser among other reasons.
While I don't 100% agree with them I have faith that Matz and the team make the decisions they do based on impact and what they see in the community.
Ruby 1.8 to 1.9 was very painful. I am not sure why it was more successful than Python 2->3; I'm not sure it "deserved" to be, or was any less painful on its face. It easily could have been just as disastrous. So that also informs things: Ruby hasn't done anything nearly as painful since.
But that applies to making it so old code does not work in the new version of the language. Nobody expects all new code to work in the old version of the language. Ruby adds new features including syntax that won't properly parse in old interpreters all the time. It's not clear to me why inline type definitions couldn't be such.
Matz often cited the "carrot" of much better performance on 1.9 as a reason for the successful transition.
Python3 didn't offer much over python2, so people just saw the downsides, while ruby pushed people to upgrade with the promise that their efforts would gain them better performance and/or save money.
I expect most Ruby projects have only Rails as their core dependency, with all other gems being small utility libraries that can easily be updated or replaced. Python gets used for a much wider variety of things.
It was, and I was around during one of his discussions on that at RubyConf last year. It's a very valid concern and Matz is very sensitive to it. There are a lot of things he's joked about removing or changing but won't because of those reasons.
If you take a look at his keynote video he says quite a bit on this too.
> It’ll be a while before anyone can be confident in a particular model for how this should work, until it’s been in use for a good long while.
Check out the OCaml community: interface files have been in use there since basically day one, and are generally well-liked for how clean they allow the implementations to be.
I'm not thrilled about the separate files with the type information but I completely understand why they did it, and if it were my choice I might make the same one.
I don't like the comparison with TypeScript `.d.ts` files however, because TS still lets you do types inline in the code. I haven't seen it mentioned anywhere that this won't be supported by Ruby 3.
Does anybody know if Ruby 3 will also support inline type information or will the header RBS files be required?
If you completely understand why they did it, can you explain it to me?
> Does anybody know if Ruby 3 will also support inline type information or will the header RBS files be required?
Wait, what are we talking about? I thought this was the decision you said you completely understood, that the type information is in separate .rbs files. Isn't ruby 3 what we're talking about?
I don't think Ruby 3 itself will provide a typechecker, just a standard for type definition file formats. You have to use a third-party tool, like Steep or Sorbet, to do the type-checking – and Sorbet at least does support inline type information. See more at my comment here: https://news.ycombinator.com/item?id=23991258
You won't need to use the header RBS files at all (types are optional in any case) but you'll likely want to use Sorbet or Steep to generate them if you're sharing your code more widely, since community tooling like YARD will probably use those for code navigation.
The intention right now is for the StdLib to provide known types to build off of written in RBS. There's no requirement to use them necessarily.
Steep and Sorbet are second-level, they build off of RBS. Matz has mentioned offhandedly in conversations I'd had with him in the past that there's a ton more in store with RBS beyond just type checking, so we'll see where they go with it.
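For a sense of what the format looks like, here is a hypothetical RBS signature for a made-up class (illustrative only, not taken from the actual stdlib signatures):

```rbs
# sig/metric.rbs (hypothetical)
class Metric
  attr_reader id: String
  attr_reader tags: Array[String]

  def initialize: (id: String, ?tags: Array[String]) -> void
  def tagged?: (String) -> bool
end
```

The implementation stays in a plain `.rb` file; a checker like Steep would then match the two up.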
As far as YARDoc I've been eyeing that one for a while now since I first heard about Steep at a Braintree Ruby meetup before Soutaro was at Square. We're still talking about what and how as far as that one.
I much prefer separate files for type declarations. Or at least the ability to define them separately. Type annotation takes away from readability. I like keeping the types and code separate.
The upside of external files is pure incremental implementation that touches no other tooling and requires no buy-in.
I don't see how having to switch files to know that `input` is a `User` increases readability, though. It seems like straight-forward impl-simplicity trade-off, not one of user ergonomics.
That can be covered by the editor, which can hint at the type by referencing the external file, but for the user, having to keep adding it in a separate file seems pretty annoying, as you need to keep declarations synced across two files.
Also how do you type something in an inline function?
Separating type definitions from code can be considered as contributing to readability of idiomatic Ruby on one hand, and type definitions on the other, taken separately on their own—by not imposing constraints on either syntax.
IDEs will likely be able to seamlessly peek/go to RBS type definition on any Ruby identifier in any case.
Do you mean for Ruby specifically or in general? I've found that it's much easier to (safely, accurately) read, use, and extend e.g. a TypeScript file than its JavaScript counterpart, even when provided with a .d.ts file.
I don't disagree, but I think it's a very minor issue given that it's trivial to use color to highlight code these days. By comparison having to switch between two files (and keep them in sync!) when making changes is a far bigger usability concern.
Type annotations aren't inline in all languages. If you're writing Haskell or Elm, as a few examples, then you get static types without having to write them out and if you do write them out they sit above the function that uses them.
Better IDE integration: Parsing RBS files gives IDEs better understanding of the Ruby code. Method name completions run faster. On-the-fly error reporting detects more problems. Refactoring can be more reliable!
IDE support (autocomplete, refactoring and quick documentation) is the most important reason to annotate argument and return types.
I've been using typescript for a few years now and to be honest I almost never rely on the compilation errors. I just use the built in Jetbrains IDE completion, autosuggestion, and navigation to make it work.
Yep. A good IDE to a first approximation doesn't allow compilation errors to occur because you are auto-completing everything, including symbol completion based on the type at cursor, etc.
Since the advent of LSP, I think the value proposition of a full featured IDE has been greatly diminished.
For example, I used to use Intellij for Scala but recently switched to Emacs+Metals and haven't really missed anything. In fact, it's probably an even better editing experience.
Intellij still has better refactoring (though I don't use it much), and the integrated debugger and database viewer are really nice. I've found myself using Emacs and only switching to Intellij for the aforementioned specialized tasks.
5 years ago you would have been crazy not using an IDE for JVM work but this is no longer the case. LSP is such a wonderful technology and has empowered the creation of new programming languages like never before. It's truly remarkable.
If I was Intellij, I would be a little worried about my future market share. They simply can't provide the same value as before, and I'm not sure how they intend to change that.
I wish I had had your luck with language servers. They are fantastic when they work but in my experience configuring them is finicky and difficult, particularly with Emacs. I have also run into problems where the server crashes and does not restart itself, so the IDE functionality in my editor will just silently break and I have to go fix it. JetBrains still dominates the market for IDEs that work out of the box, and I still don't know of any LSPs that can even remotely compete with the sophistication of their static analysis tools and such.
However, my experience with rls/rust-analyzer with VSCode hasn't been great either. Red squiggly lines that persist even if the syntax is correct - I have to erase and type the same thing to trigger refreshed checking which is very annoying and counter-productive. Though it is possible that this is more of an integration issue considering I haven't had similar problems with Typescript on VSCode - the checking was flawless (I think typescript checking also uses a language server).
Does Metals have a feature to add parameter names at method callsites? I use that IntelliJ feature of the Scala plugin all the time, it's such a lifesaver.
I haven't used Ruby in ages but this seems like a really odd way to incorporate type hints in the language.
I much prefer the Python 3+ approach of type annotations in source code.
I can't imagine having to look at a separate file just to figure out what the type of something is. You may say "tooling will fix this" but it's just far less overhead for everyone at the end of the day to just make annotations in source.
My more existential question is, is there really an advantage to doing static type checking in Ruby?
When I was doing Ruby, the way you can change objects on the fly, add methods on the fly, the vast amounts of metaprogramming, are types at "compile" (I know, not really) time really the same as types at runtime?
Like, it might be nice to get some autocomplete, but AFAIK tools already do that (RubyMine, others).
> I can't imagine having to look at a separate file just to figure out what the type of something is. You may say "tooling will fix this" but it's just far less overhead for everyone at the end of the day to just make annotations in source.
TypeScript has this functionality (in addition to letting you write actual TypeScript files with inline annotations). The big advantage is being able to provide 3rd-party type definitions for libraries that don't provide them and aren't interested in doing so. This allowed TypeScript to bootstrap decent library support well before it was popular enough for the mainstream to consider adopting it, and this in turn enabled widespread adoption.
> My more existential question is, is there really an advantage to doing static type checking in Ruby? When I was doing Ruby, the way you can change objects on the fly, add methods on the fly, the vast amounts of metaprogramming, are types at "compile" (I know, not really) time really the same as types at runtime?
Again, I think TypeScript shows that there is. Sure, there are times when you want to do super-dynamic stuff, and you can opt out of type checking using the "any" type in those cases. But a lot of the time you're not doing anything complicated, and you just want a compile-time check that ensures you're passing the correct type to the function you're calling.
There have been attempts to keep types outside the main source, or in comments, for many dynamically typed languages. They seem to fail due to bad programming ergonomics, as maintaining separate "header" files is cumbersome (hello C, my old friend).
This is why I do not like mypy, or types in Python other than dataclasses. If I'm going to type the damn thing, I'd better be getting performance a la Cython. Why on earth use a dynamic language like Ruby or Python and then try to bolt types on top? Ruby would do far better to fix the bloody `and` vs `&&` issue (it should just be `and`, and it should work like `&&`), and strings should be immutable by default, with a special syntax or method to make them mutable.
But you're absolutely right about the downsides of stuffing types into a different file. I get why Matz did it (he wants to keep Ruby beautiful and types are crufty) but I don't like them in the first place.
> Why on earth use a dynamic language like Ruby or Python and then try to bolt types on top.
To answer this (as someone who basically only ever writes in Python):
There are a few cases where it's really nice to be able to add type annotations to methods or functions. The most obvious example is API calls; it's nice to be able to say "this needs to be a list, give me a list", and not have to do
    if not isinstance(var, list):
        var = list(var)

or

    if not isinstance(var, list):
        raise ValueError("I know I didn't tell you I needed specifically a list, but I need specifically a list in this case")
Over and over and over again all over your module. Look, give me a list, I need a list. I need the APIs that list has, I need the interface it uses. I don't want a generator that I'm going to be iterating over forever, I don't want a string that's going to get split into individual characters.
Duck typing is all well and good, but just because strings, lists, sets, and os.walk are iterable doesn't mean I'm able or willing to handle those.
It can also help a lot in IDEs; for example, if I type-annotate a method to accept "name" as a `str`, then my editor can assume that "name" is a string, even without any other evidence that that is the case. Likewise for things like warnings about return types.
Lastly, it lets you do automated testing as well. Hey, you're passing a FooNode to this function, but that function accepts a list. I know this because NodeCollection.find() returns a FooNode. Makes it easy for the dev to look at the report and think "Oh, I meant to use NodeCollection.findall(), oops!"
I certainly don't want a statically typed language, but there are a lot of cases where my internal logic is fixed and I don't want my method to have to know how to deal with int, str, none, bytes, etc. Type annotations can solve this problem for me and for other people using my code.
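The same defensive pattern shows up in Ruby. A minimal sketch (the `process` method here is hypothetical, just to illustrate the repeated guard):

```ruby
# Hypothetical example: without annotations, every method that truly needs
# an Array ends up repeating a runtime guard like this one.
def process(items)
  unless items.is_a?(Array)
    raise ArgumentError, "expected an Array, got #{items.class}"
  end
  items.map(&:to_s)
end

process([1, 2, 3])  # => ["1", "2", "3"]
```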
I just worry it's going to be abused, though. E.g. I've worked with more than one Ruby code base where someone did a kind_of? check and threw exceptions if it didn't get what it wanted even though the actual type required was anything that implemented a given method in a reasonable way for no good reason.
I hope people keep the type annotations sparse and allow the tools to infer them, unless they're prepared to think long and hard about the minimal restrictions that are reasonable.
I think your own example proves that your concern is moot. If people are going to do stupid stuff, they're going to do it with whatever tools are available to them. Your kind_of misbehavior already happens without type annotations, but now it can be clear to you beforehand what's going to happen.
I hope you're right. I just fear that making it external to the code will make it easier for people to ship overly restrictive type signatures without thinking. Though hopefully the tools like sorbet will make it easy to override, in which case it might well improve things (if I "only" need to override the type signatures instead of having to monkey patch or fork code)
> Why on earth use a dynamic language like Ruby or Python and then try to bolt types on top
Probably using the languages for the ecosystem (e.g. Python for scientific computing or ML and ruby for ruby on rails) but still wanting to benefit from type checking
I think it's moderately clear that they're not intending this to be a complete solution to type checking in Ruby, but rather a starting point that the community can build on top of.
For instance, I can imagine adding something like comment blocks to Ruby code that RBS tooling can find and treat like the RBS files.
Treating it as a starting point is, I think, the best justification for putting them in separate files. I don't quite like the idea of having them in separate files, but at least that way replacing or evolving it without breaking people's codebases will be easier.
Adding type checking sort of clips the wings of a codebase, but makes it far less magic, and when you're a company the size of Square or Stripe you want as little magic as possible.
I'm still trying to make sense of this announcement. With a lack of type annotation in the Ruby core, I chose to build off YARD to make gradual type safety work. Now I don't know if there will be a standard that supports type safety or if I should continue down the path I'm already following. Help me, Ruby core developers. You're my only hope.
Personally, I think you should keep using YARD, because people are bound to be using Ruby <3.0 for a while. As an aside, thanks for solargraph! I can't recommend it enough.
I hear you. It occurred to me that Ruby could have chosen to innovate with something like a Semantic TomDoc. To choose a separate file based approach seems like a step backward. At the very least it could have been module based. But Matz is a C coder -- not a Ruby coder. So it doesn't necessarily surprise me.
It's sad though. Between the poor design of Refinements, the C transpiling for the 3x3 project, and now this, I am less and less inclined to continue using Ruby. I miss some of the dynamism, but I find myself using Crystal instead.
(Honestly, if any one figured out a way to supplement Crystal with dynamic behavior for those features that a static language can't offer, Ruby would be done.)
As soon as I feel comfortable maintaining a Crystal server in production I think I'll switch to it. Last I tried it, things broke and shards required some effort to maintain every version update. I'm eagerly looking forward to their 1.0 and hoping they stabilize a lot more.
Please don’t use YARD. As a documentation format, I find it noisy. As a documentation generator, it doesn’t support standard RDoc syntax (_intentionally_ so), which makes it completely useless.
I say this as someone who has written Ruby for almost twenty years. I will _never_ use a tool that depends on YARD document formatting, because I will never use YARD document formatting.
Indeed. To expand on your point: yep, it's incorrect. An untyped language is a language in which there is no concept of type. Assembly languages tend to be untyped. Forth is untyped.
The Ruby and Python languages do have the concept of type, it's just that they're dynamically typed, not statically typed. They check types at runtime.
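Ruby behaves exactly this way. A quick illustration of a type being checked at runtime:

```ruby
# The runtime type check in action: String#+ refuses an Integer operand
# and raises TypeError instead of silently coercing.
begin
  "Hello" + 42
rescue TypeError => e
  puts e.message  # e.g. "no implicit conversion of Integer into String"
end
```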
But when you have things like "duck typing", don't you think "they check types at runtime" becomes less meaningful? The majority of functions written in Python, even the ones that have type annotations, do not effectively have "assert isinstance(...)" in the program text below their signature, which is what I'd expect after reading "check types at runtime".
Also, Python now has (in its stdlib!) things like typing.Protocol, which is almost exclusively checked at type checking time. So if such a thing exists, and you still say "types are checked at runtime", isn't that confusing?
I don't really know what you're trying to say here.
Why would it be less meaningful to say types are checked at runtime with ducktyping? The nature of ducktyping is that the specific class of an object does not matter relative to behaviour, but a class is not entirely equivalent to a type.
If I need an object that implements method `foo`, and don't care about class, then "objects that implements foo" is in itself a type, that can potentially be inferred and checked be it at runtime or before.
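In Ruby terms, that implicit "objects that implement foo" type can be sketched like this (the class names are hypothetical):

```ruby
# Two unrelated classes that both satisfy the implicit type
# "objects that implement #foo" — no shared superclass required.
class Duck
  def foo
    "quack"
  end
end

class Robot
  def foo
    "beep"
  end
end

def call_foo(obj)
  obj.foo  # the runtime method lookup is where the "type check" happens
end

call_foo(Duck.new)   # => "quack"
call_foo(Robot.new)  # => "beep"
```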
> The majority of functions written in Python, even the ones that have type annotations, do not effectively have "assert isinstance(...)" in the program text below their signature, which is what I'd expect after reading "check types at runtime".
You're interpreting "checks types" too narrowly. Every time I try to call a method on an object in a strongly typed language, typing is involved. It doesn't so much "check" the type as look up the method to see whether it applies to this specific object at this point in time, and decide whether or not to throw an exception.
But the point remains that it is typed, and strongly so: in both Ruby and Python, objects have a type associated with the object itself, unlike e.g. C or C++, which are weakly typed because it is the variables that are typed, not the values.
Quibbles with your final paragraph: that's not what strongly typed means. It refers to when a language has strict rules restricting implicit type conversions. [0] Also, C++ has RTTI.
It's not that simple. There's no uniform definition of "strong". You're right that strong typing is often used to refer to the absence of (or restrictions on) implicit type conversions, but since the beginning it has also been used to describe languages that do not allow obscuring the identity of the type of an object.
E.g. Liskov and Zilles [1] defined it this way for example:
"whenever an object is passed from a calling function to a called function, its type must be compatible with the type declared in the called function."
Under this definition C and C++ fails hard, since you can statically cast a pointer to an object and pass it to a function expecting a totally incompatible type.
Note that the system described relied at least partly on dynamic/runtime type checks (in case it reads as if the quote above suggests they used "strong" to refer to static typing):
"It is desirable to do compile-time type checking, since type errors are detected as early as possible. Because of the freedom with which types can be used in the language, however, it is not clear how complete the compile time type checking can be. Therefore, the design of the language is based on a runtime type checking mechanism which is augmented by as much compile-time checking as is possible. "
If C++ fails when you do casts, why doesn't Haskell fail since it allows infinite recursion to build terms of whatever type you like? You can make the same argument: "don't do that". C++ compilers can warn when you cast. Haskell compilers can warn on incomplete pattern matches.
Is the definition you use not so broad as to admit all invariants being described as types?
If a method requires a dictionary argument with a certain key, is that a type to you? If you extend the term "type" to cover all invariants, I don't think that is the way the term is commonly used, even though many invariants can be proven in e.g. dependently typed languages.
All these terms are so wonky because a C++ type is not equal to a Haskell type. So I feel we can't ever have solid definitions of terms like "strong", since it depends what you compare it to, and it also depends what you compare from. So while X is strong in comparison to Y, that doesn't say much about X's relation to Z.
> The majority of functions written in Python, even the ones that have type annotations, do not effectively have "assert isinstance(...)" in the program text below their signature
It's true that types aren't checked in the act of passing a value as an argument, but at bottom, Python still has a concept of types, and they are still checked at runtime. Try the following and you'll see Python check your types and determine that there's an error:
"Hello" + 42
This never happens in, say, Haskell (statically typed and never performs runtime type checks) or in Forth (untyped, no concept of type at all).
> Python now has (in its stdlib!) things like typing.Protocol, which is almost exclusively checked at type checking time
Yes, Python is now adding optional static typing, and of course, JavaScript has TypeScript. I'm afraid I don't know a lot about these new systems but presumably the end result is that type errors can occur either at compile-time (or static type-checking time, or whatever we call it) or at runtime. This isn't exactly anything new, it's always been possible in Java for instance, which has always permitted downcasts, and has always had covariant array types, checking both at runtime. [0] This is despite being a statically typed language where all variables must have a fixed type, the type they are declared with. (Remember that a variable's type is distinct from the precise class of a pointed-to object.)
We can draw a distinction between statically typed languages like Java where there's still a need for runtime type checks, and statically typed languages like Haskell where there's no need for runtime type checks. Type theorists use the term soundness for this property: in a language with a sound type system, a program that is successfully validated by the static type system can never have a type-related error at runtime. In engineering terms then, a sound type system means you don't need to check types at runtime, as 100% of type errors are caught by the static type system and there's no way for type errors to ever arise at runtime.
I used Haskell as an example, rather than C, because although C doesn't give us runtime type-checks, C programs can still go haywire if you make a mistake in your program (termed undefined behaviour). C has an unsound type system, and it lacks runtime checks. This is one reason C is so famously 'unsafe'.
So anyway, we have the situation where Python code can encounter type errors at compile time or at runtime, and Java can encounter type errors at compile time or at runtime, but we call Python dynamically typed and we call Java statically typed. The difference is that in Java, a variable must be declared with a fixed type, unlike in Python.
Things can get messy in the middle-ground: in C#, the dynamic keyword allows for true dynamic typing, where a variable's type is determined at runtime. [1] So you could write a C# program in traditional Python style, where there's very little compile-time type-checking. And you could write a modern Python3 program using lots of static type assertions, minimising the opportunity for runtime type errors. We'll still call the C# language 'statically typed' as it's typically true of C# code, and we'll probably still call Python 'dynamically typed', as that will probably remain typically true of Python code.
Disclaimer: I'm not a type theorist or a programming language researcher, corrections welcome if I've got anything wrong.
I don't understand the benefits of forcing it to be in a separate file. I'd rather at least optionally you could include the types in the source file where you defining the methods.
Being able to define an interface instead of pure un-specified "duck type" is great.
Separate ruby header files with type information? Seriously? What's the rationale behind that? Is it just to make clear that the ruby interpreter doesn't really care about the type information and doesn't use it to improve the code's performance?
With all due respect, but IMHO this is too little and much too late.
Isn't this similar to how TypeScript allows annotations for JavaScript to live in separate files? My (possibly naive) assumption is that the goal is to make it easier for developers to write type annotations for projects they use without necessarily having to convince the maintainers to add them to the project itself. I've heard of people using TypeScript annotations from https://definitelytyped.org/ for dependencies which don't have their own annotations.
.d.ts files are mostly compilation output; you only write the definitions by hand if the original source is JavaScript. TypeScript developers normally work with types inside their ordinary .ts (JS+types) module files, not in a separate source file.
"The benefit of having different files is it doesn't require changing Ruby code to start type checking. You can opt-in type checking safely without changing any part of your workflow."
How does this compare to the sorbet project? Is it two different implementations for the same goal or is it adding support in ruby so sorbet can also benefit?
That's correct. Sorbet currently uses RBI but after meeting with core they standardized on RBS and are migrating to present a more unified front and enable easier development of type checking libraries on top of it.
Very little of this will need to be written by hand. The underlying tech is pretty decent at guessing types, the idea is that if it's not quite specific enough you adjust it, but it should otherwise be transparent.
Agreed. But don't read too much into the RBS details yet. RBS is currently very early and will need to change substantially, learning from the experience of actually typechecking real codebases. Stripe and Shopify are helping with this.
RBS has better syntax, but it has features that don't have clear semantics or a feasible implementation. And it doesn't support the inline annotations that are necessary in practice.
RBI is limited by Ruby syntax and thus isn't as nice, but it has good semantics and supports inline annotations. And it has been tried on hundreds of real codebases, including those with dozens of millions of lines of code.
We'll need to gather the benefits of both on our way forward.
Agreed. Currently at Square so watching some of this.
The best path forward will be a lot of discussion, especially around learnings with production codebases where possible.
Personally I prefer inline annotations, but want to explore the possibilities of both and see what comes of it. I've written on Sorbet in the past and have used it on a few toy projects.
Also working on some analysis documentation comparing the two if you'd be interested in chatting later. Feel free to DM @keystonelemur on Twitter.
I likely am missing some context, but the comparison to TypeScript's ".d.ts" files seems misplaced. This is a type signature language, but it does not seem to be a type checker.
The comparison to .d.ts files then seems bizarre, because those are helpful for language servers¹ to consume types, but there's no proof that, say, the implementation matches the type specification.
TypeScript declaration files declare what the types of module exports are. For the most part, a .d.ts file informs the typechecker "the type of module Foo export bar is the interface named Quux". This is not checked, this is simply an assertion. The language server for TypeScript will pick these definitions up, assume they are correct, and provide code completions for those types as if they were correct.
On the other hand, a .ts file, combining types and code, enables type checking. If the type declarations are incompatible with the code, an error is thrown by TypeScript. While .d.ts files declare types, .ts files verify that the code and the declared types are compatible.
Since .rbs files simply describe the external interface of types and modules, and cannot describe internal variables, I'm not sure how it's doing any type checking.
For example, if I have this code:
module Foo
class Bar
def trivial: () -> String
end
end
What prevents me from writing this:
class Bar
def trivial
return 42
end
end
Or alternatively, this:
class Bar
def trivial
x = some_function_that_might_not_be_declared_in_an_rbs_file()
return x
end
end
Does x have a type? Can I gradually type "class Bar" via type inference, or do I have to declare and keep in sync all my rbs files with my rb files? What happens when the rbs file is out of sync with the rb file?
¹ Language servers are implementations of an IDE protocol for code completion. The trend in programming language tooling is to use Microsoft's Language Server Protocol (https://microsoft.github.io/language-server-protocol/) to provide code completion, semantic navigation, refactoring, etc.
The type checker implementations (eg; Steep, Sorbet) should throw an error at `return 42`. If `some_function_that_might_not_be_declared_in_an_rbs_file` is not known, the behavior may be configurable – it could be assumed to be something like `any` or assumed to be unsafe, and error until you declare that function. I think Sorbet has this configurability today, and I'm not sure about Steep's behavior.
The comparison to ".d.ts" seems apt since both are languages/grammars focused solely on declaring types. How they are interpreted is a different matter of course. TypeScript assumes they are strict declarations of the JavaScript code, while Steep/Sorbet will use the declarations to type check your code.
In your example you would get a type checker error.
It’s a separate RBS file for now, but if/when this gets integrated into Ruby itself how will it interact with the new Ruby 2.7/3.0 keyword argument syntax that also uses the colon? https://bugs.ruby-lang.org/issues/14183
Note that that syntax isn't actually new in ruby 2.7 or 3.0 at all. It's been around since ruby 2.0, in fact (Feb 2013). (Keyword arguments had to be declared with default values until ruby 2.1, Dec 2013.)
What ruby 2.7/3.0 do is rationalize some really weird, counter-intuitive, and ambiguous edge cases related to passing a Hash arg and expecting it to be treated as if it were keyword args. But it's a change in semantics, not syntax. The keyword argument syntax, keyword arguments with colons, in both method definitions and invocations, has been around for years, unchanged.
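For reference, that long-standing syntax (the `greet` method here is a made-up example):

```ruby
# Keyword arguments with colons — valid since Ruby 2.0/2.1, not new in 2.7/3.0.
# `name:` is a required keyword arg; `greeting:` has a default value.
def greet(name:, greeting: "Hello")
  "#{greeting}, #{name}!"
end

greet(name: "Matz")                          # => "Hello, Matz!"
greet(name: "Matz", greeting: "Konnichiwa")  # => "Konnichiwa, Matz!"
```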
I'm really loving how this is intended as a starting point, so that the community can continue to build and explore on top of it. Feels very Ruby Is Nice, So We Are Nice.
> How about checking that a given string is non-blank?
Because when you try to do this with some object that doesn't have a "length" or "empty?" method, your application crashes.
irb(main):013:0> a = 1
=> 1
irb(main):014:0> b = "1"
=> "1"
irb(main):015:0> b.length
=> 1
irb(main):016:0> a.length
Traceback (most recent call last):
4: from /usr/bin/irb:23:in `<main>'
3: from /usr/bin/irb:23:in `load'
2: from /Library/Ruby/Gems/2.6.0/gems/irb-1.0.0/exe/irb:11:in `<top (required)>'
1: from (irb):16
NoMethodError (undefined method `length' for 1:Integer)
irb(main):017:0> b.empty?
=> false
irb(main):018:0> a.empty?
Traceback (most recent call last):
4: from /usr/bin/irb:23:in `<main>'
3: from /usr/bin/irb:23:in `load'
2: from /Library/Ruby/Gems/2.6.0/gems/irb-1.0.0/exe/irb:11:in `<top (required)>'
1: from (irb):18
NoMethodError (undefined method `empty?' for 1:Integer)
This is why people want a way to know if something they think is a String is actually a String, without risking data loss and outages at runtime.
I think it's rude to dismiss people asking for this as "fixated," and furthermore it could be no less fairly used against people who show up to these debates beating their own drum against it.
Honestly, I highly suspect that many, many "typing" solutions out there are plainly ignorant of the whole spectrum of choices one can make, and tend to lean toward Java-like APIs out of that ignorance.
This is not limited to Ruby; I also see it in TypeScript, which is very much a contrived system for real-world usage while making little use of JS's dynamism.
I'm glad we agree that adding type annotations to code is a good thing. You aren't arguing against that–you are just arguing implementation details. If I'm designing a 5 9s service tasked with processing gigabytes of data every unit time, I don't want to pay the runtime cost of checking the types for every mundane operation like measuring the length of a string.
It is evidently not a solved problem. Why would large financial services companies be working on these things we're talking about if that were the case?
Do you know what solves the problem of checking every type of every thing you want to use before you use it, where the checks don't incur any performance penalty at runtime? Static type checkers!
Modern compilers usually don't even require any explicit type declarations, as they can infer types at assignment. So they provide more safety with less code than your examples.
> Typed languages are suitable for larger projects but are often less flexible. Untyped languages allow for rapid development, but scaling teams and codebases with them can be difficult.
Well, some of us disagree with the statement that untyped is not suitable for large teams. And that's why we use Ruby. There are a lot of very good typed languages out there if you do want types. I feel Square and Stripe are pushing their own codebase issues onto Ruby in general, as if it's our problem to solve, which is not cool.
Interesting. I’m surprised they didn’t opt to do this inline with the rest of the ruby code, because now they can diverge from each other. It’s a bit like a separate header file in C/C++/Obj-C, except in those cases the compiler will yell at you if the implementation doesn’t match the header. Having it blow up at runtime instead doesn’t feel like such a big change from the way it is now, other than helping out IDEs.
> I’m surprised they didn’t opt to do this inline with the rest of the ruby code,
As they mention in the post, they followed TypeScript's approach here. The benefit is that it allows you to layer typing into an existing codebase in a non-disruptive way.
But that's not really TypeScript's main approach, is it? When I think of TypeScript I think of the inline colon annotations, not the external files. Those are just a bridge for legacy code.
But as I mentioned, the downside of this is that any mistakes don't become evident until runtime. While the Python way has the same problem (they're not compiled languages, after all), by inlining into the existing source there are fewer chances for divergence to happen.
Remember this is designed by companies that already have existing, large, ruby codebases. For them, it makes a lot of sense to be able to incrementally add typing without having to make changes to the underlying code itself.
You can do both. Python allows for external .pyi files, for situations where you can't modify the underlying library (for example: it's written in C). There are tons of them: https://github.com/python/typeshed, but you can still add types to new code inline.
You can, but clearly the people designing this system have weighed the pros and cons and found that there would be more benefits to them to leave the source code unchanged.
Yeah, there was a ton of discussion on this in the Ruby bugtracker and at core meetings. Matz is very sensitive to breaking the language with Ruby 3 and the core team is doing their best to ensure an easy transition.
That use case doesn't apply to most of us, yet the approach favored by those companies with large codebases is what goes into the core? I guess Ruby itself already has a large codebase, so it seems they got aligned, but they forgot to align with the rest of the world.
> As they mention in the post, they followed typescript's approach, here.
They didn't, though! That's what's confusing me. TypeScript has inline types. .d.ts files are typically for JavaScript files that don't have types embedded.
> They didn't, though! That's what's confusing me.
Sure they did. It's the 5th paragraph in the post:
"We defined a new language called RBS for type signatures for Ruby 3. The signatures are written in .rbs files which is different from Ruby code. You can consider the .rbs files are similar to .d.ts files in TypeScript or .h files in C/C++/ObjC. The benefit of having different files is it doesn't require changing Ruby code to start type checking. You can opt-in type checking safely without changing any part of your workflow."
Well, .rb files won't have the standard type signatures (that could be consumed by various typecheckers), but they could still have a particular typechecker's custom annotations…
After reading the article I'm not sure, but.. it seems reasonable to assume that you can do both. Inline and separate files. The same way TypeScript does it.
> Ruby 3 has no plans to change Ruby’s syntax. To have type annotations for methods live in the same place as the method definition, the only option will be to continue using Sorbet’s method signatures.
> Typed versus untyped is a 30-year-old issue for programming languages.
I'm pretty sure the merits of typed vs untyped has been going on since the 1950s at least. 30 years is such a specific period of time that it makes me wonder what happened in the early 90s that the author is referring to.
Crystal is pretty unstable at the moment (in that it changes often). I have a very small project in crystal (discord bot with only a couple of commands that CRUDs a database). Every time I deploy to heroku, if heroku updated crystal, the bot breaks. I have to spend an hour determining what broke. The language changes so often that either I update with heroku or I can't use any newer libraries for development.
And then you get stuck in very opaque error loops with the typing, where it's expecting e.g. a Number. No, not that number, a different type of number. No, not that type either. No, not that type. Most of the code I've written has been typecasting.
Crystal does have some significant benefits over Ruby (which is why i'm using it - better memory use, better for scaling for my purpose) but I just spent time rewriting the non-user facing, non-scaling part of the bot in Ruby so I could actually get stuff done instead of fighting the language.
Of course a lot of this might be my inexperience with Crystal, but as a Ruby dev for 13 years, it's not as easy as just switching over from Ruby to Crystal. I've had this bot running 4 years and it hasn't got any easier for me.
Yeah, the issue is more that I can either keep Crystal updated and use all libraries, or keep it frozen and be restricted in what I use. It's not mature enough yet to be able to lock it down and have more or less Crystal's entire shard collection available to me, like I can do in Ruby. Reminds me more of Ruby a decade ago with 1.8.7 and 1.9.2 and 2.0 (which is fine for it to be at, esp for an evolving language, but it's not "there yet" to be a replacement for Ruby)
One incident that stands out is that certain Postgres support was only available in the latest version of a shard, which required the latest version of Crystal, which wasn't compatible with 3 of the other shards I was using.
> Crystal is pretty unstable at the moment (in that it changes often)
This has never been my experience. I have a server written fully in Crystal running in production, serving millions upon millions of requests on Heroku, and Crystal doesn't break a sweat.
> it doesn't exist because nobody really wants it.
Clearly not true, as there's two separate fullstack frameworks being actively developed[0][1], not mentioning the ones in the past (sails.js, meteor.js).
I think the parent comment was referring to the overall market position of RoR, and I'm saying that Node isn't likely to achieve an equivalently dominant share going forward, due at least in part to the wide variety of options available.
You will always find someone somewhere picking up a language / framework and thinking it's production-ready. From my experience, Crystal is not ready, and in the example you gave there is absolutely no reason not to use Rust. They needed C bindings, and the fact that they started using Crystal years ago, when the state was even worse, is very worrisome.
Crystal has a nice AST based Macro system. IMHO it makes up for it. The problem with Crystal is no direct support for Windows OS compilation which is a non starter for many developers.
I guess this is just the foundation: a way to check types before runtime. I can see a lot of gems making this experience better, so I'm not too worried about that. It will become better, and it is optional.
For the time being I think this kind of type checking is only worthwhile in big projects, for smaller projects I have found Sorbet never finds an error, so it's just extra work to generate the files on a big change.
I use unit tests as a design and documentation tool.
Since we already have a specification tool (MiniTest) in the StdLib, it would be interesting if we could combine the rbs files with spec unit tests.
I already have my matching test file open anyway. Having the typing information in the same place would encourage the use of types, tests and documentation.
Matz had mentioned this as a key reason he was interested in working on this, as well as a "language server" with a lot of really interesting features like what JS/TS has with VS Code.
I had the good fortune to hear him talk about it at length at a conference a while ago and there's all types of fun stuff on the way.
As an avid Rubyist I have no interest in introducing types into a dynamic language. I would just rather use C# or Java. I never understood why people try to make a square peg fit in a round hole.
I think of it a little differently. I'm excited. I love these incremental type systems. I don't want to write C# all the time at all, but I do find myself missing it occasionally. These systems are excellent for easing that pain when a language like C# isn't an option.
When a statically typed, compiled language is an option I tend to choose Rust lately just for novelty and curiosity, but it's rare that I have those options. When I don't, I find these tools are a godsend. You don't really lose flexibility at all.
It’s insane to me that people are still arguing against static types. TypeScript has proven that you can add the safety of static types without taking away any of the flexibility of dynamic typing. Every time you dive deep into source code to figure out what a field is called or what a function expects, remember that you could have eliminated that completely with static types.
Perhaps people who have not used JS and TS assume that typings can just live in your head, but typings are a lot more than scalar and return-value annotations: for example, defining the structure of a hash and applying it dynamically depending on a parameter's value, or giving a string enum-like behavior that limits what can be assigned.
Ruby has types already, everywhere. An optional system that annotates these types and expresses relationships between them—allowing you to specify constraints and detect errors that you would otherwise not notice—is a win-win.
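As a sketch of what such optional annotations look like in RBS (the file name, class, and methods here are hypothetical, loosely modeled on the syntax shown in the announcement):

```rbs
# sig/string_utils.rbs — describes types without touching the .rb file
class StringUtils
  def self.titleize: (String) -> String
  def self.words: (String, ?separator: String) -> Array[String]
end
```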
How does this interact with method_missing, the soul of rails? Obviously much of the rails api is defined dynamically, not statically. I could see runtime checks having some value but I’m not sure how an IDE could take advantage of this, period. I’d imagine you’d at least need to generate methods (rather than parse and route messages at runtime) to make this remotely viable.
I’m not super familiar with ruby outside of my work so I’m not sure if this reliance on method_missing is more widespread than rails.
You're correct in your assumption you'd need to generate methods, from my quick investigation into how Sorbet handles it - see https://github.com/chanzuckerberg/sorbet-rails for some details.
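To illustrate why generated methods matter, here's a hypothetical sketch of the same dynamic attribute built both ways — a checker can only emit signatures for the second, because the first has no method to point at until a message actually arrives at runtime:

```ruby
# Resolved at call time; invisible to static analysis.
class ViaMethodMissing
  def initialize(attrs)
    @attrs = attrs
  end

  def method_missing(name, *args)
    @attrs.key?(name) ? @attrs[name] : super
  end

  def respond_to_missing?(name, include_private = false)
    @attrs.key?(name) || super
  end
end

# Real methods exist after initialize, so tooling in the
# sorbet-rails vein can generate concrete signatures for them.
class ViaDefineMethod
  def initialize(attrs)
    @attrs = attrs
    attrs.each_key do |name|
      define_singleton_method(name) { @attrs[name] }
    end
  end
end
```

Both respond identically at runtime; the difference only shows up to tools that read the code without running it.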
Considering how prevalent Rails is for Ruby, we can assume that the majority of Ruby codebases are web apps. There are tons of ways to scale Ruby web apps for high traffic, and only a tiny minority of companies reach a scale (like Twitter) where Ruby becomes infeasible.
Separate files for types with no inline annotations possible? What an embarrassing compromise. This is all because Matz explicitly won't allow type signatures in .rb files. I wonder how long it'll be until there's a hostile fork if he doesn't change his mind.
If you can't calmly go "ah, I see why they did X, but I would prefer if they did Y," I think the problem is only on your end. Let's be adults/engineers.
I've found that if all you can say is "I hate this", you usually don't actually understand the trade-offs.
In the meantime, you have Sorbet available. What's the problem?
Adults and engineers both hate things sometimes (tone policing, for example).
Both Sorbet and RBS have put huge amounts of effort into bolting type systems onto Ruby in ways that don't run afoul of Matz's categorical, in-principle rejection of adding type-annotation syntax to the core language. Either project would have much better ergonomics without having to bend to this constraint.
As ruby's user base skews further and further away from the hobbyist market and toward the startups that began in the period when ruby was 'cool' that have now grown up into enterprises, this pressure will continue. If Matz doesn't recant, the two possible futures are:
1. Deep compromise (see Sorbet, RBS) to keep types out of the core language; or
2. A hard fork, or a one-level-up language like typescript.
I wouldn’t call it “embarrassing.” What is the actual benefit of having inline type annotations? What is the actual downside of having them in a separate file?
If we're going to add type signatures to code, having them visible alongside their definition site is a sizeable part of the value; and, littering directories with doubled files feels suboptimal to say the least. The comparison to typescript in the article isn't really fair, since typescript supports both modes of operation - projects can choose which they'd like to use.
How hard is it to imagine? You have to keep two files in sync, and you can't type anything inside a method.
I was even hoping to adopt Ruby as my main language, having used it before, but I'm about to lose any interest when its direction is this decoupled from the rest of the world.
Have you ever worked with a language that has header files (C/C++) or a language that can use them optionally (OCaml)? In practice, keeping the files in sync isn't difficult. In fact, it ends up being better (for me) in terms of readability, because I can look up the type definitions in one place, store them as context, and then read code that isn't littered with type annotations. Type annotations add quite a bit of noise to code. I think that's what Matz is going for here. You need to be able to keep the readability of Ruby, which he's dedicated his life to.
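As a concrete sketch of that split (file and method names invented for illustration), the implementation stays annotation-free while a sibling `.rbs` file carries the contract:

```rbs
# merchant.rbs — read this for the contract...
class Merchant
  attr_reader name: String

  def initialize: (name: String) -> void

  def charge: (Integer amount_cents, currency: String) -> bool
end

# ...while merchant.rb stays plain Ruby, e.g.
#   def charge(amount_cents, currency:)
#     amount_cents.positive?
#   end
```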
Not to anecdote too hard, but the practice of doing type signatures out-of-line that you are describing is my absolute least favorite part of OCaml, which is otherwise a very lovely language. As for C++, it has enough other stuff going on that I probably can't say the header files are my least favorite feature, but they certainly don't make life easy. I think it makes sense to challenge that decision from an ergonomics standpoint, even if it makes sense within the constraints the Ruby core team has decided upon.
I very much like OCaml signature files. Jane Street recommends using them, as an example. There's no other way to communicate the high-level contract of a module. With type signatures that are inline with the code, the high-level contract gets lost.
The benefit is being able to see what type a variable is without having to open a separate file. Having the option to also put them in a separate file (à la TypeScript) is perfectly fine.