
> Tony Hoare apologized for inventing the null reference: I call it my billion-dollar mistake.

Have we learned nothing? Many languages without NIL/Nil/nil/null/NULL already existed when Golang was born.

And this attempt to heal the damage hurts, too:

> “make the zero value useful” philosophy

This is like trading programming by exception (Java) for programming by ignorance (of errors).

I like a few things about Golang (handling of numeric literals, for example), but not this thing.




I think the idea of "useful zero values" in Go is a mistake that grew out of bias from being at Google. Protocol Buffers implement default zero values (https://developers.google.com/protocol-buffers/docs/proto3#d...) and there's no way of discerning whether something is "false" or "unset", the empty string or unset, etc. In that context, it makes perfect sense for Go to have similar behaviors for zero values.

In some respects, you can get around this by simply adding another field. You can have `bool over18; bool over18IsSet` and I'm guessing that Google's internal usage of protobufs does this.
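
In Go terms that workaround is just a second field carrying presence; a minimal sketch (the struct name is mine, the field names come from the comment above):

    // Presence tracked manually alongside the value itself.
    type AgeCheck struct {
        Over18      bool // the value; false is also the zero value
        Over18IsSet bool // distinguishes "false" from "never set"
    }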

In a certain way, even getting rid of null/default values doesn't fix all problems when it comes to things like updating data. Think about updating a record where a field could be set or unset - let's say a person's age could be a number or empty. If I want to send a request updating their name, maybe I send `{"name": "Brian"}` because I don't want to update the other fields. How do I unset Brian's age? `{"age": null}` makes some sense, but a Java (and many other languages') deserializer will have null for age with `{"name": "Brian"}` too. I mean, the age field has to be set to something in the Java object. You could manually read the JSON, but that's janky and brittle - and hard in terms of interoperability with libraries and languages.
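
A minimal Go sketch of that ambiguity with encoding/json (the Person type and field names are just for illustration):

    package main

    import (
        "encoding/json"
        "fmt"
    )

    // Person uses pointer fields, the usual Go idiom for "maybe absent".
    type Person struct {
        Name *string `json:"name"`
        Age  *int    `json:"age"`
    }

    func main() {
        var a, b Person
        json.Unmarshal([]byte(`{"name": "Brian"}`), &a)              // age omitted
        json.Unmarshal([]byte(`{"name": "Brian", "age": null}`), &b) // age explicitly null
        // Both decode Age to nil, so "don't touch age" and "unset age"
        // are indistinguishable on the receiving side.
        fmt.Println(a.Age == nil, b.Age == nil) // true true
    }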

Maybe Google's protobuf designers would argue, "you really need to have explicitness around your values and forcing defaults means forcing engineers to deal with things explicitly."

I don't think I agree with that. I don't like Go's nulls and default values. I think most languages are moving away from that kind of behavior with other new languages like Kotlin and Rust going against it and older languages like C#, Java, and Dart trying to bolt on some null-safety (Java via the `Optional` object and C# and Dart via the opt-in `TypeName?` similar to Kotlin). It's possible that this is a wrong direction chosen by many languages. We've seen bad programming language fads before. In this case, I think we're on the right track and Go's on the less-great side.

Go has a lot to like, but this is one of those odd decisions. I understand why they did it. Go comes from Google where Protocol Buffers have similar default-value behavior. I think Go would be better if it had made some different decisions in this area.


I don't think it has anything to do with protocol buffers, but the behavior derives from the same intrinsic motivation.

If you don't have a zero value, a programmer has to pick one. What are they going to pick? Probably what the language picks for you, "int n = 0;", 'string foo = "";', etc. For a language, it doesn't really matter which side you pick (force programmers to select a value, or auto-assign one). For network protocols, defining empty is an important optimization -- if the client and the server are guaranteed to agree, you don't have to send the value. This is especially important where the client and the server aren't released at exactly the same time; the server may have a new field in the Request type that the client doesn't fill in. With a predefined zero value, it doesn't matter. (You can always add fields to your message to get the same effect, if you actually care. I've never seen anyone do this in any API, including ones that use serialization that doesn't have the concept of zero values. It's why Javascript has the ?. operator!)
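
An encoding/json analogy for that optimization (not protobuf itself, just the same idea): with a zero value both sides agree on, the field can simply be left off the wire.

    package main

    import (
        "encoding/json"
        "fmt"
    )

    // Request's zero-valued field is omitted entirely when encoded.
    type Request struct {
        FoosToReturn int `json:"foos_to_return,omitempty"`
    }

    func main() {
        b, _ := json.Marshal(Request{})
        fmt.Println(string(b)) // {} -- nothing sent, the receiver's zero fills it in
        var r Request
        json.Unmarshal(b, &r)
        fmt.Println(r.FoosToReturn) // 0
    }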

Finally, Go came out in the proto2 era, which did have the concept of set and unset fields (and let the proto file declare arbitrary default values). Honestly, I wrote a ton of Go involving protos at Google, and never saw proto3 until after I left Google.


> If you don't have a zero value, a programmer has to pick one.

Most languages without null have an optional type which is used exactly for that. In these languages, an explicit None value exists for optional things, and the compiler then forces you to check whether your value is set before you can use it. Serialization libraries then get to decide how to handle these optional values, which can include not sending the field at all.

It's one of those things that may be hard to think about from an external pov, but it works just fine.

For users of these languages, proto2 was okayish, and proto3 was a massive regression. Another thing that's missing in protobuf is the ability to define union types. That's one of the features most frequently requested of serialization protocols by people coming from typed functional languages.


For programming languages, I agree with you. Though I don't really see how "int*" is different from "Optional<int>". You can write:

    func foo(maybeInt *int) {
      if maybeInt == nil { panic("not so optional!!!!") }
      ...
    }
Just as easily as:

    func foo(maybeInt Optional<int>) {
       switch maybeInt {
       case None:
          panic("not so optional!!!!!")
       ...
       }
    }
To me, it just isn't a big deal. Your program is going to crash and return unexpected results when it expects something to exist and it doesn't, and the type system won't save you. (Even Haskell crashes at runtime when a pattern match doesn't resolve. Don't see how that's any different than a nil pointer dereference. Your live demo is ruined.)

For protocols, an Optional type just pushes the problem one level down. Is the optional value "None" because the client didn't know the field existed, or because they explicitly set it to "None"? You can't tell.

I think rather than going 3 levels deep, it's easier to just define the default values and not distinguish between these three cases. If you want an Optional value, you can make yours as complex as you wish:

    message Optional {
      int32 value = 1;
      bool empty_because_the_user_said_so = 2;
      bool client_has_version_of_proto_with_this_field = 3;
    }

Now if you get (0, false, false), you know that's because the client is outdated. If you get (0, false, true), you know that's because the user didn't feel like sending a value. And if you get (0, true, true), you know the user wanted 0. (Of course, there are all the other cases that you have to handle -- what about (1, false, true), or (1, true, false)?)

I think you'll find that nobody but programming language purists wants this feature. If your message is:

    message FooRequest {
      int32 foos_to_return = 1;
    }

You do the right thing regardless of whether the 0 in foos_to_return is what the user wanted, something the user forgot to set, or the user has an old version of FooRequest ("message FooRequest{}").


> Though I don't really see how "int*" is different from "Optional<int>"

They wouldn't really be different if a pointer were only used to express optionality.

However, in go, that's not the case. Pointers are used to also influence whether a receiver can mutate itself (pointer receivers for methods), to influence memory allocation, and as an implementation detail for interfaces.

If I have "func makeRequest(client *http.Client)", how can I know if the function will handle a nil client (perhaps using a default one), or if the client expects me to pass in a client, and just uses a pointer because idiomatically '*http.Client' is passed around as a pointer?

The answer is, I can't know. However, if we have what rust has, which is Box and Option as two different things, we can get 'fn makeRequest(client: Option<Client> | Box<Client> | Option<Box<Client>>)'. We've made it so the type system can express whether something is optional, and separately whether something is on the heap.

In go, those two things are conflated.

Similarly, rust has 'mut' as a separate thing from 'Option', which is another case where the type-system can express something the go pointer also is sorta used for.

In practice, I think there's a clear difference. Most of the pointers I see in Go (like pointer receivers on methods) are not pointers because they're expressing "this might be nil", they're pointers because the language pushes you to use pointers in several other cases too.
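
A small Go sketch of that conflation (Counter is a made-up type): the pointer receiver is there so Add can mutate the value, yet the same pointer also silently admits nil.

    package main

    // Counter uses a pointer receiver for mutation, not for optionality.
    type Counter struct{ n int }

    func (c *Counter) Add(delta int) { c.n += delta }

    func main() {
        var c *Counter // nil, and the type system is fine with that
        c.Add(1)       // compiles, then panics at runtime
    }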


> If I have "func makeRequest(client http.Client)", how can I know if the function will handle a nil client (perhaps using a default one), or if the client expects me to pass in a client, and just uses a pointer because idiomatically 'http.Client' is passed around as a pointer?

C/C++ implementations generally have a [[nonnull]] attribute to that effect.


> Though I don't really see how "int*" is different from "Optional<int>".

One ("Optional<int>") requires you to handle the possibly-empty case before you can access the value; the other ("int*") expects you to remember to check every time.

The following is completely legal, and results in a runtime error:

    func foo(maybeInt *int) {
      x := *maybeInt // runtime panic when maybeInt is nil
      _ = x          // only to satisfy the unused-variable check
    }
The following is not, and results in a compile-time error:

    func foo(maybeInt Optional<int>) {
       x := *maybeInt // compile error: need to unwrap maybeInt
    }


Annoying that the various Go linters I have installed (`go vet`, `golangci-lint`, `gosec`) don't catch "possible use of pointer before nil-check" - you'd think it would be an obvious case for those to handle.


That would be very annoying to use as many pointers are just assumed to be non-nil and you don't have any other option but to assume they're non-nil.

Also since Go supports nil receivers, it would have to require that every "pointer method" be checked for nil. Which technically they should, but...
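
For reference, a sketch of the nil-receiver idiom being alluded to (List is a hypothetical type): the method deliberately treats a nil receiver as meaningful.

    package main

    import "fmt"

    // A nil *List is deliberately treated as the empty list.
    type List struct {
        value int
        next  *List
    }

    func (l *List) Len() int {
        if l == nil {
            return 0 // nil receiver is valid here by design
        }
        return 1 + l.next.Len()
    }

    func main() {
        var l *List          // nil
        fmt.Println(l.Len()) // 0, no panic: the method handles nil itself
    }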


I'd imagine it'd be one of the "disabled by default" in `golangci-lint`, for example - helpful for when you're doing an in-depth review of the code but probably not something you want running on every CI invocation.


There are two major differences.

For one, the compiler can force you to check for None. Trying to use an Option<T> as a T is a compile-time error; you have to write the pattern match to get at the Some case.

For another, and this is the big one, you can write a function which takes a T, and you can't pass it an Option<T>. The compiler can statically confirm if your variable has already been "nil checked".

I used to write a lot of Java code, and since you can't know for sure that null-checking has been performed (among other reasons, that might change with a new code path) you just kinda sprinkle it everywhere. And still forget and get runtime errors.
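
A hedged sketch of that second point in Go itself, assuming a Go version with generics (this Option is a toy type of my own, not anything in the standard library):

    package main

    import "fmt"

    // Option is a toy optional type: a value plus a presence flag.
    type Option[T any] struct {
        value T
        ok    bool
    }

    func Some[T any](v T) Option[T] { return Option[T]{value: v, ok: true} }
    func None[T any]() Option[T]    { return Option[T]{} }

    // Get makes the caller confront the "missing" case explicitly.
    func (o Option[T]) Get() (T, bool) { return o.value, o.ok }

    // double takes a plain int: you cannot hand it an Option[int]
    // without unwrapping first -- the compiler refuses.
    func double(n int) int { return n * 2 }

    func main() {
        maybe := Some(21)
        // double(maybe) would be a compile-time error.
        if v, ok := maybe.Get(); ok {
            fmt.Println(double(v)) // 42
        }
    }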


    func foo(maybeInt Optional<int>) {
       switch maybeInt {
       case None:
          panic("not so optional!!!!!")
       ...
       }
    }
This function would never take an Optional in real life if it's just going to crash on None, so I'm not sure what you're getting at here. The benefit of using Optional/Maybe is that you're encoding at the type level whether it makes sense for a given variable to be nothing, and if it does make sense, the compiler makes sure you check whether it's nothing or not. This is an example where that doesn't make sense, so the type should just be int instead of Optional<int>.


I feel like this discussion is departing from the simple clarification about what alternatives exist in the design space, and entering a debate about the merits of these alternatives vs. Go's choice, which was not my aim.

While I'd be happy to discuss my experience using these alternatives and how practical they are beyond purism, it doesn't really belong in a discussion about Go.


One major difference is that *int is mutable. Using it as a replacement for "optional" values is potentially dangerous because you can't guarantee that it is never modified.


> For a language, it doesn't really matter which side you pick (force programmers to select a value, or auto-assign one).

Even (modern) C/C++ handles this aspect of memory safety better: there is no default value, and reading from an uninitialized value gets flagged at compile time (usually as a warning rather than a hard error, because C/C++ have baggage).


> Protocol Buffers implement default zero values (https://developers.google.com/protocol-buffers/docs/proto3#d...) and there's no way of discerning whether something is "false" or "unset", the empty-string or unset, etc.

This was one of the significant changes from proto2 to proto3.

This was also met with much opposition internally, and recently changed as of v3.12 (released last year).

https://github.com/protocolbuffers/protobuf/blob/master/docs...

https://github.com/protocolbuffers/protobuf/releases/tag/v3....


Our project got bitten by this hard. I was under the impression that they did this to enable memcpy/memmove into structs, which didn't feel like a compelling motivation, if you ask me.


>unset age

In my horrible opinion, we shouldn’t ditch null. We must introduce null flavors (subclasses) instead and fix our formats to support these. One null for no value, one for not yet initialized, one for unset, one for delete, one for a type’s own empty value, one for non-single aggregation (think of selecting a few rows in a table and a detail field that shows either a common value, or a “<multiple values>” stub - this is it), one for SQL NULL, one for a pointer, one for non-applicable, similar to SQL. Oh, and one for not-there-yet, for async-await (a Promise in modern terms). These nulls should be enough for everyone, but we may standardize a few more with time. Seriously, we have three code paths: normal, erroneous and asynchronous. Why not have a hierarchy of values for each?

Semantically all nulls must be equal to just “null” but not instanceof null(<other_flavor>).

Edit: thinking some more, I would add null for intentionally unspecified by data holder (like I don’t share my number, period), null for no access rights or more generic null for “will not fetch it in this case”. Like http error codes, but for data fields.


We have that in JavaScript with undefined. It's awful.

Here is a different proposal. Let's allow people to define their own types of missing values. We'll call it Nullable<T> or Maybe<U>.


A usual Maybe(Just, Nothing) doesn’t cover these use cases, because Nothing is just a typesafe null as in “unknown unknown”. Case(Data T, Later T, None E, Error E) could do. It is all about return/store values, because you get values from somewhere, and it’s either data of T, promise of T, okayish no value because of E, or error because of E. Where E is a structured way to signal the reason. No other kinds of code paths exist, except exceptions, it seems. (The latter may be automatically thrown on no type match, removing the need for try-catch construct.)
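
A rough Go sketch of that Case(Data T, Later T, None E, Error E) shape, assuming generics; the names and layout are my own guess at what the comment describes, since Go has no built-in sum types:

    package main

    import (
        "errors"
        "fmt"
    )

    // Kind tags which of the four cases a value is in.
    type Kind int

    const (
        Data  Kind = iota // a value of T is present
        Later             // promised but not yet available (async)
        None              // deliberately absent; Reason explains why
        Error             // failed; Reason explains why
    )

    // Case bundles the tag, the payload, and a structured reason E,
    // modelled here as Go's error type.
    type Case[T any] struct {
        Kind   Kind
        Value  T
        Reason error
    }

    func main() {
        c := Case[int]{Kind: None, Reason: errors.New("user declined to share")}
        fmt.Println(c.Kind == None, c.Reason) // true user declined to share
    }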


My point is, there is no one-size-fits-all. Maybe you only have Some(data)/Nothing. Maybe you have a Some(data)/NoData/MissingData/Error(err)/CthuluLivesHere.

It's better to develop one for yourself that suits your needs, rather than just a set of null-likes that are similar in meaning but different in semantics.


Indeed: your language needs to support the ad-hoc creation of these primitives in a first-class way. (Which is why I still consider a typed language without union types to be fundamentally crippled.)


undefined is awful because you can use it anywhere. Done properly you would only be able to use it in APIs that specifically need to deal with that form of null.


It’s also awful because it’s unnecessary when we can just define e.g. Option<T> at the library level.


What you want is a 'bottom' class (as opposed to 'top' = Object), not null. Essentially, a class that subclasses everything to indicate some problem. Look at how 'null' works: the class of 'null' (whether it can be expressed in a language or not) is a subclass of anything you define, so you can assign 'null' to any variable of any class you define. This is how 'bottom' works, if you want it as a class. But you already recognise that this is not really what you want: you want specialised sub-classes representing errors of specific classes you defined, which are all superclasses of a global bottom class.

Such a system can be done, but it is probably super ugly and confusing. The usual answer instead is: exceptions, i.e., instead of instantiating an error object, throw an exception (well: you do instantiate an error object here...). That works, but if overdone, you get programming by exception, e.g., when normal error conditions (like 'cannot open file') are mapped to exceptions instead of return values.

The usual answer to that problem then is to use a special generic error class that you specialise for your type, the simplest of which is 'Optional' from which you can derive 'Optional<MyType>'. You can define your own generic type 'Error<MyType>', with info about the error, of course. I think (please correct me if I am wrong), this is currently the state of the art of doing error handling. It's where Rust and Haskell and many other languages are. I've seen nothing more elegant so far -- and it is an ugly problem.


Yeah, my gp[2][0] comment addresses okayish error values with Case(...). I'm curious what you think of this type. What would a language look like if that were built in?


As I said, it will get super-ugly, and it has not been done (in any language with more than 1 user), I think. Why? Because you will want an error class for a whole tree of classes you define, and it is not so trivially clear what that should look like. A simple 'bottom' (i.e., 'null') works. But e.g. you have 'Expr' for your expressions and you want 'ExprError' to be your error class for that that subclasses all 'Expr' and is a superclass of bottom. Now when you define 'ExprPlus' and 'ExprMinus' and 'ExprInt' and so on, all subclasses of 'Expr', you still want 'ExprError' to be a subclass of those to indicate an error. That is the difficult part: how to express exactly what you want? What does the inheritance graph look like? At that point, languages introduced exceptions. And after that: generic error classes: 'Optional<Expr>' and 'Error<Expr>', etc., without a global 'bottom'. This forces you to think about an error case: you cannot just return ExprError from anything that returns Expr, but you need to tell the compiler that you will return 'Optional<Expr>' so the caller is forced to handle this somehow.


Most people start using Result/Either[0] when they need to define a reason for a value being missing. Then you can decide how to handle arbitrarily different cases of failure with pattern matching, or handle them all the same. The error types themselves are not standardized as far as I know, but I'm not sure how useful it is to standardize these differences at the language or standard library level. Is the theory that people don't use the Result type correctly as is?

[0] https://doc.rust-lang.org/std/result/ https://caml.inria.fr/pub/docs/manual-ocaml/libref/Result.ht... https://hackage.haskell.org/package/base-4.15.0.0/docs/Data-...


It's very usual in Haskell to define some error enumeration and pass your data around as `Either ErrorType a`. It's not a bad way to organize your code, but there is no chance at all that you'll get some universal error enumeration that will be useful for everybody.


> In a certain way, even getting rid of null/default values doesn't fix all problems when it comes to things like updating data.

The way this is addressed in Google's public APIs is with the use of a "field mask"[1]. You provide a list of dotted paths to the fields you want to update. I'm not sure if that serves as an indictment of the design decisions made in protobuf, or if it's just one less-bad tradeoff among several bad ones.

[1]: https://github.com/protocolbuffers/protobuf/blob/e9360dfa53f...
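
Roughly, the request shape looks like this (a sketch of the idea only, with made-up names, not the actual Google API types): the mask lists which fields the update touches, so an untouched field and an explicitly cleared one stay distinguishable.

    package main

    import "fmt"

    // Sketch of a field-mask style update.
    type Person struct {
        Name string
        Age  int
    }

    type UpdatePersonRequest struct {
        Person     Person   // new values for the masked fields
        UpdateMask []string // dotted paths to write, e.g. "age"
    }

    func main() {
        // Only "age" is touched; Age's zero value here really means "clear it",
        // because the mask says so. Name is left alone even though it's "".
        req := UpdatePersonRequest{Person: Person{Age: 0}, UpdateMask: []string{"age"}}
        fmt.Println(req.UpdateMask)
    }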


The only apt comparison IMO is Rust; the other languages, with their JIT-compiled runtimes, aren't really useful comparisons.

To me this looks like a discussion about syntax and ergonomics. Go provides the same mechanisms:

* for safe dereferencing, check for nil first, e.g. "if valRef != nil { val := *valRef }" (see the sketch below)

* for potentially-unsafe dereferencing, use "val := *valRef"
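
A minimal sketch of those two access patterns (safeDeref is just a helper name I made up):

    package main

    // safeDeref is the "safe" path: check before dereferencing.
    func safeDeref(valRef *int) (int, bool) {
        if valRef == nil {
            return 0, false // fall back to the zero value
        }
        return *valRef, true // an unguarded *valRef would panic on nil
    }

    func main() {
        var valRef *int
        val, ok := safeDeref(valRef) // safe: ok reports presence
        _, _ = val, ok
        // val := *valRef            // potentially unsafe: panics when valRef is nil
    }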

Every language in that list has equivalent mechanisms. In Rust you can use one of these methods: https://doc.rust-lang.org/std/option/enum.Option.html or pattern matching. That's a whole lot to pick from.

But Rust also makes it more complicated to reason about your memory, see for example https://doc.rust-lang.org/std/option/#representation

So, given Go's design tenets, using pointers makes a lot of sense to me. It is easy to reason about them in terms of memory and resource consumption, there are only a few ways they can be used, pass-by-value semantics further reduces the centrality of them, and they don't require a JIT compiler to be efficient.


Why does whether a language is JIT or not make any difference? C# is usually jitted but you can AOT compile it just like Go if you want, and you could JIT go if you want.


The memory layout of a pointer is quite different from a full-blown object. E.g. Optionals are only efficient in Java because the HotSpot compiler optimizes them at runtime. And it further obfuscates the memory layout--is it an object or not? How much memory will it use at runtime? How long does it take to become optimized? Am I paying the cost of a method call or not?


At least in Java it's rather easy with PrintAssembly, if you do care about performance.


An Option<T> is literally a pointer to T with some compile-time semantics. Which part of the memory model is hard to reason about?

Edit: Never mind, I forgot this is only if T is a reference. Otherwise it's laid out like a normal enum.


Yes this is basically what I was talking about. It becomes pretty tricky to understand memory layout if an optional type tries to masquerade as a regular type (hence my apparently controversial reference to JIT compilers). I guess my point is that Go pointers are language primitives for that reason, and they support the fundamental "safe" and "unsafe" access operations that all those other languages have. So I don't think there's anything fundamentally different between the safety of Go pointers and optional types, but they are easier to reason about from a memory model perspective (they are laid out in memory exactly as you would expect).

Relatedly, in practice, a lot of Rust code I've worked with is littered with unwrap() calls.


You only have to deal with the extra complexity if you choose to put non-pointers into an Optional, though. If you use the same capabilities go has, there's no problem.

Not that "maybe it uses an extra byte, maybe it doesn't" is going to matter in 99% of situations.


> the other languages with JIT compiling runtimes aren't really useful comparisons

Interesting side-point re. language comparisons I noticed recently -- Java is often benchmarked together with compiled languages, although I would say it's only half-compiled (to bytecode, not to machine code). That's


Bytecode compilation does extremely little for performance; it's all about the JIT. HotSpot was simply the first JIT compiler good enough to be comparable to gcc's -O2. Java is very much compiled to machine code/assembly.


But then why isn't it compared in the same ballpark as JS and Python? They all have JITs as well. Yet Java is compared against AOT-compiled languages.


I had to do a double-take; Rust, one of the languages leading the mainstream away from nullables, was created by Graydon Hoare (I looked it up and there's no relation to Tony)

But yeah: I once heard Go described as "a C fan-fiction". It was designed by and for people who use and like C, and just wanted some niceties built on top of it for their use case of writing distributed systems (garbage collection, strings, easy networking and multithreading, easy cross-platform builds, very basic polymorphism). It was not designed by people who were interested in stepping back and re-evaluating the big picture beyond those practical necessities.

I see it as C++ (literally, "C improved"), for the modern world (instead of the 80s). For all the good and bad that implies.


> But yeah: I once heard Go described as "a C fan-fiction".

It's more of an alt-java than a C fan-fiction.

Most of the stuff people are interested in in C, Go dropped. However, while Go clearly disagreed with Java on the specific means and the most important bits, its core goals and aesthetics are strongly reminiscent of Java's.


I think nil/null is a symptom of the real issue: the language permits partially constructed data types, so it has to assign something.

That's a consequence of the "always mutable" model whereby the responsibility of initialization can be shared by the object constructor and the caller.
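
A tiny Go illustration of that point (Server here is a made-up type, not net/http's): every declaration yields a fully zeroed value, so a struct can exist before anyone has decided what its fields should hold.

    package main

    import "fmt"

    // Server is "constructed" the moment it's declared, whether or not
    // anyone has filled in its fields.
    type Server struct {
        Addr    string      // "" until someone sets it
        Handler interface{} // nil until someone sets it
    }

    func main() {
        var s Server // legal: partially initialized by zeroing
        fmt.Println(s.Addr == "", s.Handler == nil) // true true
    }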

But there are many cases where it's very intuitive for the caller to set up an object, especially in the objects as actors model.

I think to make that work, you'd want to track the evolution of the object's state, especially noting when all fields have defined values.


For what it's worth, it's more ceremony, but you can work fine with partially constructed data types without null; it's just that the nature of their construction either needs to be encoded in the types directly, or they need to be initially constructed with defaults and you need to decide whether you want them to explode/panic if they're not completed, or just return potentially unexpected defaults.

I've seen all these approaches in Rust.


> That's a consequence of the "always mutable" model whereby the responsibility of initialization can be shared by the object constructor and the caller.

It's not though. An "always mutable" model can require that objects be fully initialised up-front. Rust does exactly that.


I don't think mutability is problem here. Many imperative languages don't have null.


My experience writing Go is that nil pointer dereferences are really not that big of a problem in practice. Linters and unit tests are quite good at catching those.


One can say that about literally any language with nil/null, and it's been empirically demonstrated beyond reasonable doubt at this point.

The reality is that you don't know where your nil-dereference bugs exist because your language doesn't detect and force you to eliminate them.


A nil-dereference is very easy to spot in production: you have a panic and a stack trace in your logs.

Which means you can check after a few years how many nil-pointer exceptions you got in your actual system. In my experience that number is really low.


> very easy to spot in production: you have a panic and a stack trace in your logs.

Not a great experience for your users though. And definitely not great for developers who get paged to handle these production incidents that could have been caught at compile time.


My experience is that any Golang codebase that makes use of protobuf ends up having DoS vulnerabilities (because they forget to check for nil somewhere).


Yeah, it is a strange choice, maybe just reflecting what the creators of Go are used to. They have always had null and haven't had any problems with it, so why change?

Null made sense in the early days of C; an option type back then would have been prohibitively expensive. Today, not so: in any language that can afford garbage collection, the cost of an option type is lost in the noise, and modern compilers can often optimize the cost away entirely (though that would hurt Go's fast compile times a little bit).


Rust has a zero-runtime-cost Option<T>. Presuming T is a non-nullable pointer, it compiles to something that uses the null pointer to store the None variant. (This isn't a special case for Option in the compiler; the optimization applies to any two-variant enum where one variant can store a pointer and the other variant stores nothing.)


Early C compilers were not optimising compilers. The hardware was very limited.

Although I would agree it was possible to design the language much better to avoid null, most buffer overflows, etc., C was not designed to be good from a PL standpoint. It was designed to get something written and then evolved.


Early C compilers were non-optimizing compilers because C is among the hardest languages to optimize, not because the hardware was limited. Frances Allen described C as a "serious regression" in language design which "destroyed our ability to advance the state of the art in automatic optimization" because of how its semantics blocked most known optimizing transformations.


> Null made since in the early days of C, an option type back then would have been prohibitively expensive.

Of course not. Option types are as old as C, and they don't cost anything more at runtime than a regular pointer, or a union tag, costs.


If they required values to always be initialized, that would either require a concept of constructors (the whole OOP thing they tried to avoid), or require all struct fields to always be explicitly initialized at every instantiation (which can get awkward), or allow specifying default field values (which sounds like the implicit magic Go tries to avoid too). In Go, by-value semantics is more common than in Java with its mostly by-ref semantics, so nil is less of an issue. It's also easier to implement (the allocator already memsets to zero for safety). So it's just a combination of factors; they took a bunch of shortcuts. Not defending it, but their choice makes sense, too.


> Have we learned nothing?

I suppose not, if you are considering "we" as "the set of people that designed languages like Go or justify the design decisions made by such group".

If you are considering "we" as humanity, in general, I'd say that no, "we" actually learned a lot and "we" already know better. But you still have some people building rockets to try to prove the earth is flat, unfortunately.


Well, Go's goal is a lowest-common-denominator language with the fewest possible features, one that's as easy as possible for new graduates to learn. This essentially makes it almost a toy language in fancy clothes. A toy language with the money of Google backing it, so it does actually feel usable.


> the easiest to learn for new graduates

And I'd say it fails at that goal, judging by the number of gotchas in fundamental concepts of the language like slices or interfaces.

A surprising behavior of slices that had discussions on HN, and something I discovered in my own code as well: https://news.ycombinator.com/item?id=9555472

And of course this gem was on HN just yesterday: https://news.ycombinator.com/item?id=26631116


> easiest to learn for new graduates

This was never a goal of Go. Go was conceived out of frustration with C++. They wanted to reduce language complexity and build times (among other things). For me personally it's C without the hassle.


This is false. Here it is straight from the creator of the language: https://www.youtube.com/watch?v=uwajp0g-bY4


"Not capable of understanding a brilliant language." Wow as a googler I find this... Awful.


Man, people are really downvoting accurate information just because the truth hurts.



