
I think the idea of "useful zero values" in Go is a mistake from bias grown out of being at Google. Protocol Buffers implement default zero values (https://developers.google.com/protocol-buffers/docs/proto3#d...) and there's no way of discerning whether something is "false" or "unset", the empty-string or unset, etc. In that context, it makes perfect sense for Go to have similar behaviors for zero values.

In some respects, you can get around this by simply adding another field. You can have `bool over18; bool over18IsSet` and I'm guessing that Google's internal usage of protobufs does this.
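A minimal Go sketch of the companion-flag workaround described above. The `Profile` type and `describeAge` helper are hypothetical, written by hand for illustration rather than generated from any real .proto file.

```go
package main

import "fmt"

// Profile is a hypothetical message type (an assumption for this sketch).
type Profile struct {
	Over18      bool
	Over18IsSet bool // companion presence flag, as the comment suggests
}

// describeAge distinguishes "never set" from an explicit false.
func describeAge(p Profile) string {
	switch {
	case !p.Over18IsSet:
		return "unknown"
	case p.Over18:
		return "adult"
	default:
		return "minor"
	}
}

func main() {
	fmt.Println(describeAge(Profile{}))                                // unknown
	fmt.Println(describeAge(Profile{Over18IsSet: true}))               // minor
	fmt.Println(describeAge(Profile{Over18: true, Over18IsSet: true})) // adult
}
```

The cost is that nothing stops a caller from setting `Over18` and forgetting `Over18IsSet`; the two fields are only linked by convention.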

In a certain way, even getting rid of null/default values doesn't fix all problems when it comes to things like updating data. Think about updating a record where a field could be set or unset - let's say a person's age could be a number or empty. If I want to send a request updating their name, maybe I send `{"name": "Brian"}` because I don't want to update the other fields. How do I unset Brian's age? `{"age": null}` makes some sense, but a Java (and many other language) deserializer will have null for age with `{"name": "Brian"}` too. I mean, the age field has to be set to something in the Java object. You could manually read the JSON, but that's janky and brittle - and hard in terms of interoperability with libraries and languages.
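The same ambiguity shows up in Go, which makes it easy to demonstrate. In this sketch (the `Person` type and `ageState` helper are hypothetical), decoding into a struct with a `*int` field gives `nil` for both a missing key and an explicit `null`; telling them apart means dropping down to a map of raw JSON values, which is the "manually read the JSON" route.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Person is a hypothetical record type used only for this illustration.
type Person struct {
	Name string `json:"name"`
	Age  *int   `json:"age"`
}

// ageState reports how the "age" key appears in a JSON patch:
// "absent", "null", or "set".
func ageState(patch []byte) (string, error) {
	var fields map[string]json.RawMessage
	if err := json.Unmarshal(patch, &fields); err != nil {
		return "", err
	}
	raw, ok := fields["age"]
	switch {
	case !ok:
		return "absent", nil
	case string(raw) == "null":
		return "null", nil
	default:
		return "set", nil
	}
}

func main() {
	patches := []string{`{"name": "Brian"}`, `{"name": "Brian", "age": null}`}
	for _, patch := range patches {
		var p Person
		if err := json.Unmarshal([]byte(patch), &p); err != nil {
			panic(err)
		}
		state, err := ageState([]byte(patch))
		if err != nil {
			panic(err)
		}
		// The struct sees Age == nil in both cases; only the raw map differs.
		fmt.Printf("Age == nil: %v, key state: %s\n", p.Age == nil, state)
	}
}
```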

Maybe Google's protobuf designers would argue, "you really need to have explicitness around your values and forcing defaults means forcing engineers to deal with things explicitly."

I don't think I agree with that. I don't like Go's nulls and default values. I think most languages are moving away from that kind of behavior with other new languages like Kotlin and Rust going against it and older languages like C#, Java, and Dart trying to bolt on some null-safety (Java via the `Optional` object and C# and Dart via the opt-in `TypeName?` similar to Kotlin). It's possible that this is a wrong direction chosen by many languages. We've seen bad programming language fads before. In this case, I think we're on the right track and Go's on the less-great side.

Go has a lot to like, but this is one of those odd decisions. I understand why they did it. Go comes from Google where Protocol Buffers have similar default-value behavior. I think Go would be better if it had made some different decisions in this area.




I don't think it has anything to do with protocol buffers, but the behavior derives from the same intrinsic motivation.

If you don't have a zero value, a programmer has to pick one. What are they going to pick? Probably what the language picks for you, "int n = 0;", 'string foo = "";', etc. For a language, it doesn't really matter which side you pick (force programmers to select a value, or auto-assign one). For network protocols, defining empty is an important optimization -- if the client and the server are guaranteed to agree, you don't have to send the value. This is especially important where the client and the server aren't released at exactly the same time; the server may have a new field in the Request type that the client doesn't fill in. With a predefined zero value, it doesn't matter. (You can always add fields to your message to get the same effect, if you actually care. I've never seen anyone do this in any API, including ones that use serialization that doesn't have the concept of zero values. It's why Javascript has the ?. operator!)

Finally, Go came out in the proto2 era, which did have the concept of set and unset fields (and let the proto file declare arbitrary default values). Honestly, I wrote a ton of Go involving protos at Google, and never saw proto3 until after I left Google.


> If you don't have a zero value, a programmer has to pick one.

Most languages without null have an optional type which is used exactly for that. In these languages, a None value exists only where something is declared optional, and the compiler then forces you to check whether the value is set before you use it. Serialization libraries are then free to handle these optional values as they wish, for example by not sending the field at all.

It's one of those things that may be hard to think about from an external pov, but it works just fine.

For users of these languages, proto2 was okayish, and proto3 was a massive regression. Another thing that's missing in protobuf is the ability to define union types. That's a frequently requested feature for serialization protocols from users of typed functional languages.


For programming languages, I agree with you. Though I don't really see how "int*" is different from "Optional<int>". You can write:

    func foo(maybeInt *int) {
      if maybeInt == nil { panic("not so optional!!!!") }
      ...
    }
Just as easily as:

    func foo(maybeInt Optional<int>) {
       switch maybeInt {
       case None:
          panic("not so optional!!!!!")
       ...
       }
    }
To me, it just isn't a big deal. Your program is going to crash and return unexpected results when it expects something to exist and it doesn't, and the type system won't save you. (Even Haskell crashes at runtime when a pattern match doesn't resolve. Don't see how that's any different than a nil pointer dereference. Your live demo is ruined.)

For protocols, an Optional type just pushes the problem one level down. Is the optional value "None" because the client didn't know the field existed, or because they explicitly set it to "None"? You can't tell.

I think rather than going 3 levels deep, it's easier to just define the default values and not distinguish between these three cases. If you want an Optional value, you can make yours as complex as you wish:

    message Optional {
      int32 value = 1;
      bool empty_because_the_user_said_so = 2;
      bool client_has_version_of_proto_with_this_field = 3;
    }

Now if you get (0, false, false), you know that's because the client is outdated. If you get (0, false, true), you know that's because the user didn't feel like sending a value. And if you get (0, true, true), you know the user wanted 0. (Of course, there are all the other cases that you have to handle -- what about (1, false, true), or (1, true, false)?)

I think you'll find that nobody but programming language purists want this feature. If your message is:

    message FooRequest {
      int32 foos_to_return = 1;
    }

You do the right thing regardless of whether the 0 in foos_to_return is what the user wanted, something the user forgot to set, or the user has an old version of FooRequest ("message FooRequest{}").


> Though I don't really see how "int*" is different from "Optional<int>"

They wouldn't really be different if a pointer were only used to express optionality.

However, in go, that's not the case. Pointers are used to also influence whether a receiver can mutate itself (pointer receivers for methods), to influence memory allocation, and as an implementation detail for interfaces.

If I have "func makeRequest(client *http.Client)", how can I know if the function will handle a nil client (perhaps using a default one), or if the client expects me to pass in a client, and just uses a pointer because idiomatically '*http.Client' is passed around as a pointer?

The answer is, I can't know. However, if we have what rust has, which is Box and Option as two different things, we can get 'fn makeRequest(client: Option<Client> | Box<Client> | Option<Box<Client>>)'. We've made it so the type system can express whether something is optional, and separately whether something is on the heap.

In go, those two things are conflated.

Similarly, rust has 'mut' as a separate thing from 'Option', which is another case where the type-system can express something the go pointer also is sorta used for.

In practice, I think there's a clear difference. Most of the pointers I see in go (like pointer receivers on methods) are not pointers because they're expressing "this might be nil", they're pointers because the language pushes you to use pointers in several other cases too.
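To make the ambiguity concrete, here is a small Go sketch with two hypothetical helpers (`resolveClient` and `clientTimeout` are names invented for this example). Both take `*http.Client`, yet one treats nil as "use the default" and the other panics on nil, and nothing in the signatures says which is which.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// resolveClient treats nil as "use http.DefaultClient".
// This is one convention; the signature alone does not reveal it.
func resolveClient(client *http.Client) *http.Client {
	if client == nil {
		return http.DefaultClient
	}
	return client
}

// clientTimeout assumes a non-nil client. Passing nil compiles fine
// and panics at runtime on the field access.
func clientTimeout(client *http.Client) time.Duration {
	return client.Timeout
}

func main() {
	fmt.Println(resolveClient(nil) == http.DefaultClient) // true
	custom := &http.Client{Timeout: 5 * time.Second}
	fmt.Println(clientTimeout(resolveClient(custom))) // 5s
}
```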


> If I have "func makeRequest(client *http.Client)", how can I know if the function will handle a nil client (perhaps using a default one), or if the client expects me to pass in a client, and just uses a pointer because idiomatically '*http.Client' is passed around as a pointer?

C/C++ implementations generally have a [[nonnull]] attribute to that effect.


> Though I don't really see how "int*" is different from "Optional<int>".

One requires you to handle the possibly-nil case in order to access the pointer, the other expects you to remember to check every time.

The following is completely legal, and results in a runtime error:

    func foo(maybeInt *int) {
      x := *maybeInt // runtime panic
    }
The following is not, and results in a compile-time error:

    func foo(maybeInt Optional<int>) {
       x := *maybeInt // compile error: need to unwrap maybeInt
    }


Annoying that the various Go linters I have installed (`go vet`, `golangci-lint`, `gosec`) don't catch "possible use of pointer before nil-check" - you'd think it would be an obvious case for those to handle.


That would be very annoying to use as many pointers are just assumed to be non-nil and you don't have any other option but to assume they're non-nil.

Also since Go supports nil receivers, it would have to require that every "pointer method" be checked for nil. Which technically they should, but...
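A short Go sketch of the nil-receiver point, using a hypothetical `Counter` type. Calling a method on a nil pointer is legal; whether it panics depends entirely on what the method body does with the receiver.

```go
package main

import "fmt"

// Counter is a hypothetical type used only for this illustration.
type Counter struct{ n int }

// Value is nil-safe: it checks the receiver before touching fields.
func (c *Counter) Value() int {
	if c == nil {
		return 0
	}
	return c.n
}

// Inc is not nil-safe: on a nil receiver, c.n++ panics at runtime.
func (c *Counter) Inc() {
	c.n++
}

func main() {
	var c *Counter
	fmt.Println(c.Value()) // 0: the method call itself is fine on nil

	c = &Counter{}
	c.Inc()
	fmt.Println(c.Value()) // 1
}
```

A linter that flagged every unchecked pointer would have to flag `Inc` too, which is the annoyance described above.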


I'd imagine it'd be one of the "disabled by default" in `golangci-lint`, for example - helpful for when you're doing an in-depth review of the code but probably not something you want running on every CI invocation.


There are two major differences.

For one, the compiler can force you to check for None. Trying to use an Option<T> as a T is a compile-time error, you have to write the pattern match to use the Some case.

For another, and this is the big one, you can write a function which takes a T, and you can't pass it an Option<T>. The compiler can statically confirm if your variable has already been "nil checked".

I used to write a lot of Java code, and since you can't know for sure that null-checking has been performed (among other reasons, that might change with a new code path) you just kinda sprinkle it everywhere. And still forget and get runtime errors.
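For what it's worth, the second difference can be approximated in Go with generics (Go 1.18+). This is a sketch only: `Option`, `Some`, `None`, `Get` and `OrElse` are hypothetical names, not a standard library type. Unlike Rust or Kotlin, Go will not force you to look at the `ok` result, but it will refuse to pass an `Option[int]` where an `int` is expected.

```go
package main

import "fmt"

// Option is a hypothetical generic optional type (not part of the
// standard library). Its zero value means "nothing".
type Option[T any] struct {
	value T
	ok    bool
}

// Some wraps a present value.
func Some[T any](v T) Option[T] { return Option[T]{value: v, ok: true} }

// None returns an empty Option.
func None[T any]() Option[T] { return Option[T]{} }

// Get returns the value and whether it is present (comma-ok style).
func (o Option[T]) Get() (T, bool) { return o.value, o.ok }

// OrElse returns the value if present, otherwise the fallback.
func (o Option[T]) OrElse(fallback T) T {
	if o.ok {
		return o.value
	}
	return fallback
}

// double takes a plain int, so callers must unwrap first.
func double(n int) int { return 2 * n }

func main() {
	age := Some(42)
	// double(age) // does not compile: Option[int] is not an int
	if n, ok := age.Get(); ok {
		fmt.Println(double(n)) // 84
	}
	fmt.Println(None[int]().OrElse(18)) // 18
}
```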


    func foo(maybeInt Optional<int>) {
       switch maybeInt {
       case None:
          panic("not so optional!!!!!")
       ...
       }
    }
This function would never take an Optional in real life if it's just going to crash on None so I'm not sure what you're getting at here. The benefit of using Optional/Maybe is that you're encoding at the type level whether it makes sense for a given variable to be able to be nothing, and if it does make sense, the compiler makes sure you check whether it's nothing or not, but this is an example of where that doesn't make sense so the type should just be int instead of Optional<int>.


I feel like this discussion is departing from the simple clarification about what alternatives exist in the design space, and entering a debate about the merits of these alternatives vs. Go's choice, which was not my aim.

While I'd be happy to discuss my experience using these alternatives and how they are practical beyond being a purist concern, that doesn't really belong in a discussion about Go.


One major difference is that *int is mutable. Using it as a replacement for "optional" values is potentially dangerous because you can't guarantee that it is never modified.
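A small Go sketch of the mutability concern, with a hypothetical `Profile` type. The struct is passed by value, but the copy shares the same pointer, so the callee can still change the caller's data.

```go
package main

import "fmt"

// Profile is a hypothetical type that uses *int to mean "age may be unset".
type Profile struct {
	Age *int
}

// birthday receives a copy of Profile, but the copy's Age pointer
// still aliases the caller's int, so the write is visible outside.
func birthday(p Profile) {
	if p.Age != nil {
		*p.Age++
	}
}

func main() {
	age := 41
	p := Profile{Age: &age}
	birthday(p)
	fmt.Println(age) // 42: modified through the "optional" pointer
}
```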


> For a language, it doesn't really matter which side you pick (force programmers to select a value, or auto-assign one).

Even (modern) C/C++ handles this aspect of memory safety better: There is no default value, and reading from an uninitialized value is a compile time error (usually, because C/C++ have baggage).


> Protocol Buffers implement default zero values (https://developers.google.com/protocol-buffers/docs/proto3#d...) and there's no way of discerning whether something is "false" or "unset", the empty-string or unset, etc.

This was one of the significant changes from proto2 to proto3.

This was also met with much opposition internally, and recently changed as of v3.12 (released last year).

https://github.com/protocolbuffers/protobuf/blob/master/docs...

https://github.com/protocolbuffers/protobuf/releases/tag/v3....


Our project got bitten by this hard. I was under the impression that they did this to enable memcpy/memmove into structs, which didn't feel completely motivated if you ask me.


>unset age

In my horrible opinion, we shouldn’t ditch null. We must introduce null flavors (subclasses) instead and fix our formats to support these. One null for no value, one for not yet initialized, one null for unset, one for delete, one for type’s own empty value, one for non-single aggregation (think of selecting a few rows in a table and a detail field that shows either a common value, or a “<multiple values>” stub - this is it), one for SQL NULL, one for a pointer, one for non-applicable, similar to SQL. Oh, and one for not-there-yet, for async-await (a Promise in modern terms). These nulls should be enough for everyone, but we may standardize a few more with time. Seriously, we have three code paths: normal, erroneous and asynchronous. Why not have a hierarchy of values for each?

Semantically all nulls must be equal to just “null” but not instanceof null(<other_flavor>).

Edit: thinking some more, I would add null for intentionally unspecified by data holder (like I don’t share my number, period), null for no access rights or more generic null for “will not fetch it in this case”. Like http error codes, but for data fields.


We have that in JavaScript with undefined. It's awful.

Here is a different proposal. Let's allow people to define their own types of missing values. We'll call it Nullable<T> or Maybe<U>.


A usual Maybe(Just, Nothing) doesn’t cover these use cases, because Nothing is just a typesafe null as in “unknown unknown”. Case(Data T, Later T, None E, Error E) could do. It is all about return/store values, because you get values from somewhere, and it’s either data of T, promise of T, okayish no value because of E, or error because of E. Where E is a structured way to signal the reason. No other kinds of code paths exist, except exceptions, it seems. (The latter may be automatically thrown on no type match, removing the need for try-catch construct.)


My point is, there is no one-size-fits-all. Maybe you only have Some(data)/Nothing. Maybe you have a Some(data)/NoData/MissingData/Error(err)/CthuluLivesHere.

It's better you develop one for you and that suits you, rather than just a set of null-likes that are similar in meaning, but different in semantics.


Indeed: your language needs to support the ad-hoc creation of these primitives in a first-class way. (Which is why I still consider a typed language without union types to be fundamentally crippled.)


undefined is awful because you can use it anywhere. Done properly you would only be able to use it in APIs that specifically need to deal with that form of null.


It’s also awful because it’s unnecessary when we can just define e.g. Option<T> at the library level.


What you want is a 'bottom' class (as opposed to 'top' = Object), not null. Essentially, a class that subclasses everything to indicate some problem. Look at how 'null' works: the class of 'null' (whether it can be expressed in a language or not) is a subclass of anything you define, so you can assign 'null' to any variable of any class you define. This is how 'bottom' works, if you want it as a class. But you already recognise that this is not really what you want: you want specialised sub-classes representing errors of specific classes you defined, which are all superclasses of a global bottom class.

Such a system can be done, but it is probably super ugly and confusing. The usual answer instead is: exceptions, i.e., instead of instantiating an error object, throw an exception (well: you do instantiate an error object here...). That works, but if overdone, you get programming by exception, e.g., when normal error conditions (like 'cannot open file') are mapped to exceptions instead of return values.

The usual answer to that problem then is to use a special generic error class that you specialise for your type, the simplest of which is 'Optional' from which you can derive 'Optional<MyType>'. You can define your own generic type 'Error<MyType>', with info about the error, of course. I think (please correct me if I am wrong), this is currently the state of the art of doing error handling. It's where Rust and Haskell and many other languages are. I've seen nothing more elegant so far -- and it is an ugly problem.


Yeah, my gp[2][0] comment addresses okayish error values with Case(...). I'm curious what you think of this type. What would a language look like if that was built-in?


As I said, it will get super-ugly, and it has not been done (in any language with more than 1 user), I think. Why? Because you will want an error class for a whole tree of classes you define, and it is not so trivially clear what that should look like. A simple 'bottom' (i.e., 'null') works. But e.g. you have 'Expr' for your expressions and you want 'ExprError' to be your error class for that, which subclasses all 'Expr' and is a superclass of bottom. Now when you define 'ExprPlus' and 'ExprMinus' and 'ExprInt' and so on, all subclasses of 'Expr', you still want 'ExprError' to be a subclass of those to indicate an error. That is the difficult part: how to express exactly what you want? What does the inheritance graph look like? At that point, languages introduced exceptions. And after that: generic error classes: 'Optional<Expr>' and 'Error<Expr>', etc., without a global 'bottom'. This forces you to think about an error case: you cannot just return ExprError from anything that returns Expr, but you need to tell the compiler that you will return 'Optional<Expr>' so the caller is forced to handle this somehow.


Most people start using Result/Either[0] when they need to define a reason for a value being missing. Then you can decide how to handle arbitrarily different cases of failure with pattern matching, or handle them all the same. The error types themselves are not standardized as far as I know, but I'm not sure how useful it is to standardize these differences at the language or standard library level. Is the theory that people don't use the Result type correctly as is?

[0] https://doc.rust-lang.org/std/result/ https://caml.inria.fr/pub/docs/manual-ocaml/libref/Result.ht... https://hackage.haskell.org/package/base-4.15.0.0/docs/Data-...
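Go has no sum types, but the closest everyday analogue to "a reason for the value being missing" is a `(T, error)` return with sentinel errors. A sketch under that assumption follows; `lookupAge`, `describe`, and the two error values are hypothetical names for this example. The caller chooses whether to distinguish the failure cases or treat them alike, much like matching on a Result.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical sentinel errors naming two different reasons for "no age".
var (
	ErrNoRecord = errors.New("no such record")
	ErrAgeUnset = errors.New("age not set")
)

// lookupAge returns the age, or an error saying why there is none.
func lookupAge(db map[string]*int, name string) (int, error) {
	age, ok := db[name]
	switch {
	case !ok:
		return 0, ErrNoRecord
	case age == nil:
		return 0, ErrAgeUnset
	}
	return *age, nil
}

// describe handles each failure reason differently.
func describe(db map[string]*int, name string) string {
	age, err := lookupAge(db, name)
	switch {
	case errors.Is(err, ErrNoRecord):
		return name + ": unknown person"
	case errors.Is(err, ErrAgeUnset):
		return name + ": age withheld"
	default:
		return fmt.Sprintf("%s: %d", name, age)
	}
}

func main() {
	thirty := 30
	db := map[string]*int{"Brian": &thirty, "Ada": nil}
	for _, name := range []string{"Brian", "Ada", "Zed"} {
		fmt.Println(describe(db, name))
	}
}
```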


It's very usual in Haskell to define some error enumeration, and transit your data in `Either ErrorType a`. It's not a bad way to organize your code, but there is no chance at all that you'll get some universal error enumeration that will be useful for everybody.


> In a certain way, even getting rid of null/default values doesn't fix all problems when it comes to things like updating data.

The way this is addressed in Google's public APIs is with the use of a "field mask"[1]. You provide a list of dotted paths to the fields you want to update. I'm not sure if that serves as an indictment of the design decisions made in protobuf, or if it's just one less bad tradeoff among several bad ones.

[1]: https://github.com/protocolbuffers/protobuf/blob/e9360dfa53f...
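The idea behind a field mask can be sketched in a few lines of Go. This is a hand-rolled illustration, not the real `google.protobuf.FieldMask` API; `Person`, `applyUpdate`, and the path names are assumptions made for the example. Because the mask lists which fields to touch, an unset `Age` in the patch can mean "clear it" when `"age"` is in the mask and "leave it alone" when it is not.

```go
package main

import "fmt"

// Person is a hypothetical record type used only for this illustration.
type Person struct {
	Name string
	Age  *int
}

// applyUpdate copies from patch into dst only the fields named in mask.
// Fields not listed in mask are left untouched, whatever patch contains.
func applyUpdate(dst *Person, patch Person, mask []string) error {
	for _, path := range mask {
		switch path {
		case "name":
			dst.Name = patch.Name
		case "age":
			dst.Age = patch.Age // nil here means "unset the age"
		default:
			return fmt.Errorf("unknown field path %q", path)
		}
	}
	return nil
}

func main() {
	age := 30
	p := Person{Name: "Bryan", Age: &age}

	// Rename only: age is not in the mask, so it survives.
	_ = applyUpdate(&p, Person{Name: "Brian"}, []string{"name"})
	fmt.Println(p.Name, p.Age != nil) // Brian true

	// Explicitly clear the age.
	_ = applyUpdate(&p, Person{}, []string{"age"})
	fmt.Println(p.Name, p.Age != nil) // Brian false
}
```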


The only apt comparison IMO is Rust, the other languages with JIT compiling runtimes aren't really useful comparisons.

To me this looks like a discussion about syntax and ergonomics. Go provides the same mechanisms:

* for safe dereferencing, use "val, ok := *valRef"

* for potentially-unsafe dereferencing, use "val := *valRef"

Every language in that list has equivalent mechanisms. In Rust you can use one of these methods: https://doc.rust-lang.org/std/option/enum.Option.html or pattern matching. That's a whole lot to pick from.

But Rust also makes it more complicated to reason about your memory, see for example https://doc.rust-lang.org/std/option/#representation

So, given Go's design tenets, using pointers makes a lot of sense to me. It is easy to reason about them in terms of memory and resource consumption, there are only a few ways they can be used, pass-by-value semantics further reduces the centrality of them, and they don't require a JIT compiler to be efficient.


Why does whether a language is JIT or not make any difference? C# is usually jitted but you can AOT compile it just like Go if you want, and you could JIT go if you want.


The memory layout of a pointer is quite different from a full-blown object. E.g. Optionals are only efficient in Java because the HotSpot compiler optimizes them at runtime. And it further obfuscates the memory layout--is it an object or not? How much memory will it use at runtime? How long does it take to become optimized? Am I paying the cost of a method call or not?


At least in Java it's rather easy with PrintAssembly, if you do care about performance.


An Option<T> is literally a pointer to T with some compile-time semantics. Which part of the memory model is hard to reason about?

Edit: Never mind, I forgot this is only if T is a reference. Otherwise it's laid out like a normal enum.


Yes this is basically what I was talking about. It becomes pretty tricky to understand memory layout if you try to masquerade as a regular type (hence my apparently controversial reference to JIT compilers). I guess my point is that Go pointers are language primitives for that reason, and they support the fundamental "safe" and "unsafe" access operations that all those other languages have. So I don't think there's anything fundamentally different between the safety of Go pointers and optional types, but they are easier to reason about from a memory model perspective (they are laid out in memory exactly as you would expect).

Relatedly, in practice, a lot of Rust code I've worked with is littered with unwrap() calls.


You only have to deal with the extra complexity if you choose to put non-pointers into an Optional, though. If you use the same capabilities go has, there's no problem.

Not that "maybe it uses an extra byte, maybe it doesn't" is going to matter in 99% of situations.


> the other languages with JIT compiling runtimes aren't really useful comparisons

Interesting side-point re. language comparisons I noticed recently -- Java is often benchmarked together with compiled languages, although I would say it's only half-compiled (to bytecode, not to machine code). That's


Bytecode compilation does extremely little for performance; it's all about the JIT. HotSpot was simply the first JIT compiler good enough to be comparable to gcc's -O2. Java is very much compiled to machine code/assembly.


But then why isn't it compared in the same ballpark as JS and Python? They all have JITs as well. But Java is compared against AOT-compiled languages.



