> If you "accidentally" implement Write([]byte) (int, error), odds are it'll still be doing something at least semi-sensible if it is accidentally used as an io.Writer.
Funny, not too long ago I encountered a logging library with a writer that expected each write call to pass in a fully-formed log message, which happened to be in a JSON format. That meant that if you passed the writer to anything that expected a properly byte-oriented stream - like, say, a JSON serialization library - the writing would typically be done in chunks, and each chunk would be sent in isolation to a remote server that was expecting fully-formed JSON, and that server would silently drop the malformed data. It was weeks before I figured out what was going wrong.
This would be a great refutation of the article if it had happened in Go, but no, this was Rust, and the library authors had explicitly marked their type as `impl std::io::Write` without understanding why that wasn't appropriate.
I guess the moral is that semi-sensible isn't good enough: the real danger isn't that you end up shooting a physical gun; it's that you shoot your video game gun in a subtly wrong way that takes ages to track down. Failing to compile is loads better.
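To make the failure mode concrete, here's a minimal Go sketch of the same shape of bug (the types and names are made up, not the vendor's actual API):

    // remoteLogger assumes every Write call carries one complete JSON log
    // message - a much stronger contract than io.Writer actually promises.
    type remoteLogger struct {
        send func(msg []byte) // ships one "message" to the logging backend
    }

    func (l *remoteLogger) Write(p []byte) (int, error) {
        l.send(p) // treats p as a whole, well-formed JSON document
        return len(p), nil
    }

    // A caller who only sees "it's an io.Writer" is free to chunk:
    //
    //     io.Copy(logger, bigJSONDocument)  // copies in ~32 KiB pieces
    //     w := bufio.NewWriter(logger)      // flushes whenever its buffer fills
    //
    // and each fragment then reaches the backend as a separate, malformed "message".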
The problem is that the logging library isn't typed enough: it should require some JSON document object instead of text. If it requires text, it must actually accept text.
Silently dropping log messages, without even logging that logging is failing, is an even worse defect: it hurts even when there is no type confusion at all, e.g. when a client earnestly tries to pass JSON as text but gets the quoting, escaping, or commas slightly wrong, precisely because it is text.
So the logging library is inadequately designed and not "semi-sensible" at all.
While serde is the de facto standard for JSON in Rust, it's not actually in the standard library, so there's no one true "JSON document" type for the logger to accept. Anyway, the point of the JSON serializer taking a writer is so that it can push tokens straight to an IO device (the network, in this case), one by one, without having to hold the entire abstract structure in memory. And even if you had a dedicated "JsonWriter" interface for streaming that validates on the way out, you'd have cases where a string is already known to be valid JSON but to send it you have to pay the cost of an extra validation for no good reason. I love me some strong, static types, but sometimes you gotta let bytes be bytes.
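In Go terms (since that's what the article is about), the two halves of that argument look roughly like this - the function and variable names are placeholders, and the snippet assumes the encoding/json and net packages:

    // A sketch of both cases, assuming conn is some net.Conn.
    func send(conn net.Conn, payload any, alreadyValidJSON []byte) error {
        // Stream the structure straight to the device as it's serialized,
        // instead of forcing the caller to materialize the whole document first.
        if err := json.NewEncoder(conn).Encode(payload); err != nil {
            return err
        }
        // Bytes already known to be valid JSON can just be written as bytes;
        // re-parsing them only to prove what we already know would be pure overhead.
        _, err := conn.Write(alreadyValidJSON)
        return err
    }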
The silent dropping of bad input - yeah, that part was just bad developer experience on the vendor's part.
If they want a whole JSON document here - which apparently they do - but don't want to just admit that everybody uses serde, they can define a trait like OurJSONLogRecord, provide an implementation of it for the serde type, and let you implement it for whatever internal JSON documents are ready to be logged.
This OurJSONLogRecord still surfaces the undocumented assumption, "Oh, we need the entire JSON document, we didn't realise anybody would want to stream data" whereas the Write trait does not.
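A rough Go analogue of the same idea, just to show the shape (the interface and function names here are hypothetical):

    // JSONLogRecord names the real requirement - "hand me one complete JSON
    // document" - instead of borrowing io.Writer's much weaker contract.
    type JSONLogRecord interface {
        LogJSON() ([]byte, error) // one complete, well-formed document
    }

    // Emit ships exactly one message per record; send is whatever transport
    // the logging backend uses.
    func Emit(send func([]byte) error, rec JSONLogRecord) error {
        doc, err := rec.LogJSON()
        if err != nil {
            return err
        }
        return send(doc)
    }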
I find it so interesting that there is such a huge chasm between "what we know can be done" and "what we do in practice in industry".
How can it be that in so many programming languages, adhering to a contract means nothing but "these methods have similar calling signatures" and "I pinky swear I implement the semantics of std::io::Write " ?
Haven't the OCaml people entered the next metaphysical level of understanding? When they speak of "types" they don't mean "uint32_t" or a class.
They seem to operate at a much higher level of abstraction. They mean something more profound when they speak of "types", which I don't understand. Also Ada seems interesting, with more spelled-out contracts.
That depends on the types. You can have something as basic as "int", something a bit more complex like arrays, or something even more complex like GADTs https://v2.ocaml.org/manual/gadts.html
It's the SDK for Fastly's edge computing service, so not something you're likely to run into without getting paid for it. And no, still has the brokenness, still has the internal comment warning their own engineers not to do it wrong[1]. I'm doubtful that my support ticket about the issue ever made it to anyone who actually knows Rust.
(edit to add: the feature was in beta at the time and we were early adopters, so Support not knowing how to field issues about it was sorta understandable. Their engineers seemed cool when I had the chance to talk to them)
The logging utilities are just a frontend to a bunch of supported logging/tracing providers, many but not all of which accept plain text. So yes, it was technically a separate system (Honeycomb) doing the rejection, but one that was blessed by the vendor. And the only debugging facility I had available, which was duplicating the payload to stdout and tailing that, masked the issue because stdout is an actual stream.
And to be clear, the system would be broken for the plaintext providers too, just more obviously broken as you'd presumably have a bunch of JSON tokens (or segments of a format string or whatever) show up as individual log lines.
> In my 23 years of programming, I have never experienced this bug. I've also never heard of anyone else experiencing this. I've even said this in fora where I should have had someone pop up to correct me
This is a joke I’m not getting, right?
Somebody is forgetting that comparison and arithmetic operators are an interface on a type, and that almost every popular language (even many that are ostensibly “statically typed”) have ubiquitous nightmare scenarios surrounding implicit casts.
And of course, it’s also common where data comes from an i/o source and then is magically assigned a type by an ORM or deserializer or framework helper.
Or when working with large projects with lots of modularity and abstract interfaces, where the names of parameters, methods, and members become increasingly generic.
People work around these issues and learn how to avoid them with good practice, but they happen every day.
For example, in Scala's standard library there are a ton of implicit conversions that result in some truly horrendous behavior where types get autoconverted hither and thither if they don't match. So instead of a simple type error, you get nonsense when the compiler manages to find a way to cast A to B. And then you might get a crash because you used B wrong.
I have heard that this is one of the things that Scala 3 fixes, but I haven't done any programming in that yet.
> Somebody is forgetting that comparison and arithmetic operators are an interface on a type, and that almost every popular language (even many that are ostensibly “statically typed”) have ubiquitous nightmare scenarios surrounding implicit casts.
For what it's worth (and in the context of the article), Go doesn't do implicit conversions like this: you get a compiler error if you compare an int32 to a uint64, for example, and you always have to convert one to the other type explicitly. That goes for comparisons and arithmetic (neither of which are overloadable in Go).
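A quick sketch of what that looks like in practice:

    package main

    import "fmt"

    func main() {
        var a int32 = -1
        var b uint64 = 1

        // fmt.Println(a < b) // does not compile: invalid operation (mismatched types int32 and uint64)

        // You have to pick a conversion explicitly - and own its consequences:
        fmt.Println(uint64(a) < b)       // false: the negative int32 wraps to a huge uint64
        fmt.Println(int64(a) < int64(b)) // true, provided b is known to fit in int64
    }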
> and that almost every popular language (even many that are ostensibly “statically typed”) have ubiquitous nightmare scenarios surrounding implicit casts
Well, C/C++ implicit casts are very, very bad, but that doesn't mean another language can't define sensible 'implicit cast' rules.
No int <-> unsigned implicit casts, but allow intY <- intX implicit casts when Y >= X (and the same for unsigned).
The major argument is that "if it's wrong, nothing bad will happen, because it will break in runtime". Isn't it better to break before things are in production, like at compile time?
Things "break" all the time in production, regardless of type system. The amount of instrumentation you have to have anyways makes this whole conversation almost moot.
Even {insert your favorite language here} programmers write bugs.
Segfaults can happen regardless of language/type system.
I haven't seen a single SRE say, "oh, you use X language, with Y type system? Ok cool, you don't need us then"
The best apples-to-apples experiment for the value of type systems is JS vs TS.
When I write TypeScript, I have so few runtime errors that I'm often in disbelief. Sometimes I have zero runtime errors on my first test of an application.
Contrast this with my own writing of JavaScript (same coder, same runtime, etc.) and it's very clear. JS needs many, many unit tests that TS just doesn't.
Yea, I program in Python with a very strict type checking configuration, and I have never gotten a runtime error in production due to type related issues (like attribute errors or key errors), only weird edge case stuff or algorithmic mistakes.
The goal of a type system is not to prevent all bugs, or to prevent all bad things happening at runtime.
The goal is to catch as many bugs as possible BEFORE it runs in production. And even if you just catch a single bug, you have already won. In particular if the type system is balanced in such a way so there's not much additional effort involved (e.g. by using type inference).
This sounds like the Law of Averages. Not all things break at the same frequency, and noticing that doesn't mean you're saying something else is perfect; other things could just break less often, for different reasons.
Of course. But fixing a bug in production is 100x more expensive than fixing the same bug during development. Ideally, we minimize the number of things we rely on SREs to manage.
Nope. Not true. I haven’t had a bug in production the last 5+ years. And I am maintaining large scale C++ software with massive refactorings and features added. The key is having solid tests.
Really? Typed languages are way safer and break less, which is what you want - and you want breakage to happen as early as possible in the dev cycle, for velocity.
Yes but things can break less with a robust type system. This is categorically true.
For example, Elm is a programming language that cannot crash, thanks to its type system. I mean, it can crash by exceeding system resources or through the FFI, but it cannot crash any other way.
The argument the OP is making is that the difference achieved by a type system is negligible. I would argue that in some cases this is true, but Elm is a case where it's not.
If automated tests/linters are a part of your build system, then they basically are breaking at compile time. (Having to write tests to mimic the functionality of a compiler seems wasteful to me, though.)
Type checking is a form of proof. It proves your system is type safe. You would need billions or trillions of tests to achieve type safety equivalent to a type checker.
A successful test only proves something for a single test case. In fact, it is impossible to do the equivalent of type checking in the runtime code of your program unless the runtime code has the ability to "reflect" on its own source code.
There are also complex type systems that can prove correctness and completely eliminate the need for tests altogether, but this style of programming is really challenging and time consuming.
As someone who did the first ~2/3 of my career in Java but has been in Node/TypeScript for the past several years, I can't say enough how much I think the structural typing of TypeScript allows me to have much higher productivity than the nominal typing of Java.
Primarily it has to do with making refactorings much, much easier.
I can't help but wonder how much of your productivity boost was due to leaving Java vs. due to using Typescript. Java is just about the most unjustifiably verbose language I can think of off the top of my head.
It's not just verbose; it has made bad design choices that force the language to be verbose (to stay within its own limitations), which results in inferior implementations.
The "new object" paradigm for instantiation (instead of Class.new, or some other generator), the static keyword, the final keyword, no global scope (eg for singletons), the inability to do non-inheritance mixins without 3rd party libraries, a full pre-runtime to bypass some of these issues in the form of Spring. Many of these work against testability and JUnit just can't keep up. It's horrific and it's not going to get better anytime soon.
Interesting how Java manages to chug along for decades on backends with arguably the biggest code bases out there (running basically most of the services at FAANG as well as outside of it), when it is so “inferior”. Empirically, I would say it does very, very well on the maintainability front.
> the answer one doesn’t want to read is that the “language doesn’t matter”, at least not in any significant way.
When you put an outrageous amount of effort into (what amounts to) manual testing in closed environments before release, you're fighting the language but you just don't care.
> Empirically I would say that it does very very well on maintainability’s count
FAANG companies don't even release data related to how program defects are detected and addressed (publicly), so the assertion is literally baseless. Given enough time and effort, you can overcome the inability to do simple things. That doesn't make the tooling better, just the perception. This is common, historically, in tech (re Mechanical Turk).
They harm testability for me, because I have to use java to test them. There's a larger discussion about "what is testing" and "how to properly test" which I'm not going to get in to here, but would be happy to talk about in another forum.
JUnit (and all the inferior alternatives) can't properly mock them or verify how/when they are called (despite running in a closed test container, which is how JUnit executes them). If you have ever tried to deal with native methods (or FAANG libs that like to use static final methods), there's a practical barrier to how much state you can avoid in testing. That barrier is java's ecosystem itself (JVM + syntax + tooling + ethos).
Take my opinion with a huge pound of salt because I've only ever used Java for smaller projects but...
A large part of Java's verbosity is optional, but it feels to me that people do it because "That's how you write Java!".
But you don't HAVE to write an interface, abstract class, and factory for every class. You can just...write a class. You don't need to write a class that has nothing but an "invoke" function. You can just...write a function. You don't need 10 layers of abstraction for simple cases.
People exacerbate the problem but the language itself is pretty darn verbose too. Just look at how much code you need to instantiate a generic array for instance.
My experience is very much the reverse. Even at present I’m looking at different java to js compilers (teavm, j2cl) because I don’t want to write a frontend app's business logic part in js/ts.
> I’m looking at different java to js compilers (teavm, j2cl) because I don’t want to write a frontend app's business logic part in js/ts
I've seen people do this (rewrite a Java app in JS) and then go completely off the rails, instead of, you know, writing it just like they would have (read: already did the first time) in Java. Bonus points when it's followed by the conclusion that JS sucks. It hints towards irrationality on their part. It's as if the structure and order imposed by Java is the _only_ thing keeping them on the straight and narrow, because they themselves can't be trusted to act responsibly and simply follow the same principles on their own when no one is around forcing them to.
Languages do have their respective idiomatic structure, you write JS differently than Java, and personal preferences may alter one’s decision.
But the reason I plan on going down this road is that I don’t want to program the complex business logic twice; it is quite important that both backend and frontend reliably calculate the same results.
It is much easier to handle it through a compiler that will uphold Java’s semantics than to try to write the same thing in JS - where you might easily do float arithmetic when you didn’t intend to.
> Languages do have their respective idiomatic structure, you write JS differently than Java
I object to this specifically as it relates to JS because the sort of code that mainstream "JS" programmers write nowadays ends up being passed off as idiomatic even though it really shouldn't. If you look at well-written JS in very large JS codebases with high standards (like Firefox, at least as it was circa ten years ago, or Apple's Web Inspector[1]), then you realize idiomatic JS looks a lot more like Java, C#, and Objective-C than what you find getting pushed to GitHub today—and (very soon after) discarded, since the half-life of an NPM-powered codebase is something like 2–3 years.
The problem is that the style of development associated with the NodeJS+NPM crowd has cannibalized the JS ecosystem, so almost anyone gazing in on the situation today is going to have a _very_ skewed view of what "idiomatic" means.
The perverse thing is that you will encounter more of an impedance mismatch with JS-the-language if you actually try to follow the crowd. This is a big driver for toolchain/framework churn—because they're all fighting the language instead of just going with it.
My guess is that a big reason why you do this is because your backend is in Java. And that I can agree with - having a single language across both the front and backends provides huge benefits to team productivity, primarily that frontend and backend teams are much less "siloed" in their respective languages.
But that said, if starting from scratch, I think a Node/TS backend and a TS frontend is a much, much better choice.
> In my 23 years of programming, I have never experienced this bug. I've also never heard of anyone else experiencing this. I've even said this in fora where I should have had someone pop up to correct me; a corollary to Cunningham's Law is that if you say something that should be controversial but nobody pops up to correct you, it must not be controversial after all.
This is such an odd claim. Mixing up a string for a list of strings, which both satisfy the iterable interface, happens all the time in Python. Does the author, and his acclaimed fora, simply not use Python?
The thing the author has never experienced is someone accidentally implementing an existing interface, and then code expecting this interface to call that accidental implementation in a way that creates a bug.
Really? Iterating over a string should give you some kind of {char, code point, grapheme cluster} or whatever; iterating over a list of strings should give you a string. Those are not the same type, and any nominative type system should surely realise this.
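In Go, for instance, the distinction is enforced outright (a small illustrative sketch, not from any real codebase):

    func shout(words []string) { /* ... */ }

    func demo() {
        for _, r := range "héllo" { // r is a rune (a Unicode code point)
            _ = r
        }
        for _, w := range []string{"héllo", "world"} { // w is a string
            _ = w
        }
        // shout("héllo") // does not compile: cannot use a string where []string is expected
        shout([]string{"héllo"}) // fine
    }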
I've never intentionally passed a string to a function that expects a list of strings. But I have done it accidentally very often and received an incorrect return value (instead of an exception).
I'm including functions like min() there: technically it can take a string argument, but I've never wanted it to.
> In my 23 years of programming, I have never experienced this bug. I've also never heard of anyone else experiencing this. I've even said this in fora where I should have had someone pop up to correct me
I cannot today find the bug, but as I recall an early version of Golang's stdlib included some HTTP handler that would "interface upgrade" your Reader to a ReadCloser and close it.
This could be bad if you expected functions that accept Reader to not call close!
We experienced this bug at Twitch and found it quite troublesome.
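I can't point at the original code either, but the "interface upgrade" pattern in question looks roughly like this (a sketch, using the io package):

    // The callee only asked for an io.Reader, but quietly probes for - and
    // uses - a bigger interface behind the caller's back.
    func consume(r io.Reader) error {
        if _, err := io.Copy(io.Discard, r); err != nil {
            return err
        }
        if c, ok := r.(io.Closer); ok {
            c.Close() // surprise: the caller may not have been done with r
        }
        return nil
    }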
Duck typing and structural typing are not the same thing (for safety), IMHO.
I think structural typing is safer and better overall, and the best exponent of it is the relational/SQL model.
What makes it good is that structural types are more about the data itself, and if the code is built around that concept, it is very productive and safe in practice.
What makes duck typing unsafe is that it is easy to "break" the type at runtime (i.e. monkey patching, adding or removing things), whereas I expect structural typing to retain its safety better...
I get that issue all the time, recursing on list structures in Python. You always have to test for types, because strings, dictionaries and a whole ton of objects are iterable too.
The general case this article describes is a type being confused for another type. This happens to me all the time when programming with Python and Pandas, and somewhat less often in Elixir. The `+` operator in Python will concatenate sequences as well as add numbers together, and I have definitely forgotten to parse strings before adding them together before. This stuff happens all the time, people who prefer dynamic languages just don't see it as a problem.
I'm not gonna claim the conclusion is wrong (or right), but Shoot() is probably not the most compelling example one could try to, well, shoot down.
How about a function like add()? It might mean something like BigInteger.add, or something like List.add, or Set.add, and these could all satisfy the same interface while behaving very differently.
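To make that concrete, here's a deliberately contrived Go sketch (none of this is from any real library):

    // The same one-method interface, two very different meanings behind it.
    type Adder interface {
        Add(n int)
    }

    type Counter struct{ total int } // arithmetic accumulation

    func (c *Counter) Add(n int) { c.total += n }

    type IntSet struct{ members map[int]bool } // set insertion

    func (s *IntSet) Add(n int) {
        if s.members == nil {
            s.members = map[int]bool{}
        }
        s.members[n] = true
    }

    var _ Adder = (*Counter)(nil) // both satisfy the interface structurally...
    var _ Adder = (*IntSet)(nil)  // ...with completely unrelated semantics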
Satisfying the interface was just one of the things the article lists as reasons why breakage is quite implausible. Your example doesn't address the others though.
For example, how did you accidentally get an instance of "Set" in the code path doing arithmetic on "BigInteger" instances, and why didn't that "Set" fail earlier during the other operations that are most likely being performed on the BigInteger (parsing it from a string, doing other arithmetic on it, etc)?
> For example, how did you accidentally get an instance of "Set" in the code path doing arithmetic on "BigInteger" instances, and why didn't that "Set" fail earlier during the other operations that are most likely being performed on the BigInteger (parsing it from a string, doing other arithmetic on it, etc)?
Because I got it as a return value from someone else’s function, and then this was the next thing I did with it.
So your code calls a function that is named in a way that indicates it might return a set or an integer (and its parameters make sense in both situations), then you take the return value and call add() on it ambiguously, and then you do nothing else with it? You don't loop over it or query it for set membership or pass it to any functions or multiply it?
This is exactly the point of the article. Extremely extremely unlikely.
But if you have a real world example of this, I'm sure the article's author is interested.
I wasn't claiming it would. I was just saying Shoot() isn't exactly a common function name that would collide with anything to begin with, so it's a bit of a strawman as an example before you even need to list out multiple criteria. I'm pretty sure I've never written a Shoot() function in my life, let alone worried about it colliding with something. Compared with add(), get(), setValue(), read(), etc. which are incredibly common and thus more realistic candidates. That's all.
I'm not a massive fan of duck typing, but it's literally never even occurred to me to worry about the concern addressed in this article (that an object accidentally conforms to an interface and is used as such).
One major thing I like about having to implement a defined interface is that it forces people creating an API to actually define what that interface is. And in typing out that interface there's a fair chance they'll actually write doc comments.
IME a lot of libraries that do duck-typing (for instance Python) make very little effort to explain what methods I'm supposed to implement or what the contract is supposed to be for those methods.
It also forces users of the API to explicitly state in code that they're intending to implement that interface, as opposed to just implementing a handful of seemingly random methods whose purpose will not be immediately clear to readers.
Also, AFAICT duck typing is incompatible with the approach taken in Rust or Haskell where the definition of a type can be decoupled from its implementation of various interfaces (called traits in Rust or type classes in Haskell), and where methods or attributes in different interfaces implemented for the same type can have the same name and not clash, because the object isn't the namespace for its methods. I strongly prefer this approach.
> One major thing I like about having to implement a defined interface is that it forces people creating an API to actually define what that interface is. And in typing out that interface there's a fair chance they'll actually write doc comments.
This has another interesting beneficial side-effect, when embraced: it often leads not just to better API design, but better design in general. Defining an interface necessarily means thinking about it. Which in turn generally leads to thinking about how it’s consumed, at least a fuzzy picture of the code implementing it. All of which can also percolate out into other parts of a system.
With interface-typing, is there a right way to use the type system to declare "this function takes a (statically-certifiable) positive integer", not because it would fail to output a value for other numbers, but because its output would be incorrect or meaningless?
In scientific programming I've often wanted that, to push asserting particular constraints on values to the caller and/or the type system (whether that's compile-time or runtime type checks). But those constrained types would always match the interface of their unconstrained forms.
You would have to write a wrapper class for an integer that can only ever be constructed from a positive integer without throwing an exception, and then have it implement some PositiveInteger interface. The consumer would then have to trust that the provider has held true to the contract.
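Roughly, in Go (all the names here are hypothetical; uses fmt for the error):

    // PositiveInt can only be obtained through NewPositiveInt, so holding one
    // is (by convention) proof that the value inside is positive.
    type PositiveInt struct{ n int }

    func NewPositiveInt(n int) (PositiveInt, error) {
        if n <= 0 {
            return PositiveInt{}, fmt.Errorf("not a positive integer: %d", n)
        }
        return PositiveInt{n: n}, nil
    }

    func (p PositiveInt) Value() int { return p.n }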
Needless to say, this code would be very verbose and ugly - what you would actually want is dependent typing, which solves this problem at the type system level.
Here is the implementation for Nat in Idris, for example:
    data Nat = Z | S Nat
With dependent typing, you can specify constraints on the values of function parameters, and have the compiler prove that only such values can be passed to those functions. For example, the following function only takes non-zero natural numbers: the auto-implicit IsSucc proof can only be constructed for an argument of the form S n, so the argument cannot be Z:
    addTwoNonZero : (n : Nat) -> {auto nonZero : IsSucc n} -> Nat
    addTwoNonZero n = 2 + n
Contracts are ways to specify invariants for the runtime to verify. For example you could write a function with a contract saying that it takes a string containing all lowercase characters and returns a string containing all upper case characters. As far as the type system is concerned it just takes and returns strings but the runtime won't let a caller give you a value you don't want and won't let you return a value that doesn't match what you promised.
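Go doesn't have a contract system, but as a rough sketch of the idea, the same invariants can be written as explicit runtime checks at the function boundary (the function name is made up; uses the strings package):

    // Shout emulates a contract: precondition "input is all lowercase",
    // postcondition "output is all uppercase", both checked at runtime.
    func Shout(s string) string {
        if s != strings.ToLower(s) {
            panic("contract violation: input must be all lowercase")
        }
        out := strings.ToUpper(s)
        if out != strings.ToUpper(out) {
            panic("contract violation: output must be all uppercase")
        }
        return out
    }

(The postcondition check is trivially satisfied by this particular implementation; the point of a contract system is that it verifies it for implementations that aren't so obviously correct.)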
Dependent types essentially let you do the same but at the type level. For example you could have a function whose type says it takes a list of length N and a list of length M and returns a list of length N+M. Or maybe a type that represents a tuple containing a number N and a list of length N. Your code wouldn't type check unless the type system can prove those properties. There's a lot of overlap with theorem provers in this space. Agda, Idris, and Coq are some decently popular languages in this category.
I find dependent types to be extremely powerful but sometimes the added overhead of figuring out how to encode an invariant into the type system is overwhelming.
A common approach in TypeScript, which doesn’t require a wrapping object or any runtime overhead, is called “branding” (or a variety of other names for similar concepts). It works by applying additional metadata about the value which makes it nominally incompatible with other values of the same underlying type. You can accomplish this a variety of ways, all of which have some tradeoffs. My (current) preference is to declare a class where the metadata is defined as a generic value assigned to a private property or a unique symbol, e.g.:
    declare class Branded<Meta> {
      private meta: Meta;
    }

    type Brand<T, Meta> = T & Branded<Meta>;

    type UInt = Brand<number, 'UInt'>;
Yes. It’s called Refinement Types. I wish Rust had come with Refinement Types from day one. That would have been a massive step forward, instead of the much smaller step forward that Rust is currently taking.
In addition to all the problems others have already noted, duck typing also loses the machine-checked documentation of explicit typing. With types, I don't have to guess what a function accepts or returns, and when I refactor, I can count on the compiler to find many issues for me.
This is not a tradeoff, where I sacrifice development speed for safety. I gain speed, because types help guide me, and catch errors I would otherwise need tests for.
I'm not sure if this counts, but where I run into this sort of problem it's with conventional methods/operators, like `operator+`. I've had lots of bugs over the years from something doing string concatenation or list append instead of integer addition, or vice versa.
I've used JavaScript and TypeScript professionally for more than 7 years now and I'm still discovering novel ways structural typing can break a system. It's not like you don't benefit from the reduced need to type everything everywhere, but there are still holes that can leave subtle bugs in your system.