a[low:high:max] in Golang – A Rare Slice Trick (build-your-own.org)
130 points by weird_user on March 18, 2023 | 125 comments



This paragraph is scary. It looks like a good idea for the Golang version of the Underhanded C Contest.

> This trick is useful for returning a slice from an immutable array; if you accidentally append to the supposedly immutable slice, a copy is forced and no data is overwritten because there is no more capacity left.


I don't know much Go but what does it mean to "accidentally" append to an immutable array? Are there just no particular guards for arrays and they are saying something which should be treated as immutable?


Go has no way to ensure "const correctness". If you pass around a slice (which you'll be doing a lot, since it's essentially Go's equivalent of a vector), everyone can modify the slice however they please. It's just a fat pointer.

So yes, something which should be treated as immutable.

The way Go works here easily leads to bugs, especially when concurrency is involved and slices get captured or used by goroutines.

If you send a slice to a function, they essentially receive a pointer pointing at the same block of memory as in your calling function. They can modify it. If they append to the slice, however, and surpass the capacity of the slice they received, then their slice gets reallocated and doesn't point at the same block of memory anymore.

In other words, if you receive a slice, append to it, and then change the first value of the slice, then this might modify a slice (or array) somewhere completely different, depending on whether your append surpasses the allocated capacity of the slice or not.

Welcome to Golang.
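
To make the capacity dependence concrete, here is a minimal sketch. The helper names `appendAndOverwrite` and `firstAfterCall` are made up for illustration; the only thing that differs between the two calls is whether the slice has spare capacity.

```go
package main

import "fmt"

// appendAndOverwrite is a hypothetical callee: it appends one element and
// then writes to index 0 of the slice it now holds.
func appendAndOverwrite(s []int) {
	s = append(s, 99)
	s[0] = -1
}

// firstAfterCall builds a slice with the given length and capacity, hands it
// to appendAndOverwrite, and reports what the caller sees at index 0.
func firstAfterCall(length, capacity int) int {
	s := make([]int, length, capacity)
	s[0] = 1
	appendAndOverwrite(s)
	return s[0]
}

func main() {
	// Spare capacity: append reuses the caller's array, so the write is visible.
	fmt.Println(firstAfterCall(3, 4)) // -1
	// No spare capacity: append reallocates, so the write lands in a copy.
	fmt.Println(firstAfterCall(3, 3)) // 1
}
```

The callee's code is identical in both cases; only the caller's choice of capacity decides whether the write leaks back.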


But also strings are actually immutable sometimes (at the very least when hardcoded), and they can be converted to a `[]byte` slice that looks mutable, but panics when modified. AFAIK no other slice-like data in Go behaves like this.

Fun!


This is misleading enough it's a lie; to do this you need to use the `unsafe.Pointer` type together with a type named `reflect.StringHeader`, not just `[]byte` and `string`.


It means there is special protection for string-data that is not available to anything else, and that library-authors get panic reports about immutable byte slices. To that degree, it's something that the specialized type exposes you to but the type system does not adequately protect or signal.

But yes, AFAIK this requires use of `unsafe`. Which does change things, and puts the blame for misuse squarely on the immutable-byte-slice creators.


> It means there is special protection for strings that is not available to anything else

Lots of languages have no C/C++-style const objects yet immutable strings. It's weird to pick on Go specifically for this. (Since it's the JVM model, at this point it may even be a majority of mainstream languages.)

If you mean specifically the panic when modifying a static string, is this not just the default `mprotect` on rodata? You can do that to any memory page you want. It's a feature of the kernel's memory management, not the type system or even the language runtime.

> library-authors get panic reports about immutable byte slices.

I'm really skeptical this happens in any meaningful amount. Go offers the feature to transform a byte slice you know you won't want anymore into a string without a memory allocation; or to pass a string to a function which wants a []byte and will not mutate it, usually to wrap an optimized implementation working on both strings and []byte. Modifying a string's contents is always undefined behavior, even if it doesn't panic immediately - the compiler will assume strings are immutable and make "as if" judgements accordingly.


> Since it's the JVM model

Java model. Kotlin collections are immutable by default (not sure about Scala, but I suspect they behave the same).


I'm pretty sure it's the JVM model. Kotlin collections are "immutable" in that they have no mutating accessors, which is not the same thing as a memory region defined to be immutable by the specification which can be relied on to e.g. trivialize certain compiler optimizations. (I plead ignorance about Kotlin Native, maybe it does such things.)

You can also look at what kinds of things are allowed in class constant pools; you will not find any collections.

https://docs.oracle.com/javase/specs/jvms/se7/html/jvms-4.ht...


If you specifically mean RO memory pages, then yes, that's not the same thing. The language and the runtime will enforce collection immutability though, throwing an exception if you attempt to mutate. Poking at the underlying byte code will defeat this, of course.


It seems pretty absurd to me that you would blame Go’s design for silly things people do with unsafe. Yes, I wish Go had more/better immutability semantics and it’s slightly odd that strings are immutable but other types are not, but griping that something bad happened when you used a package called “unsafe” is pretty silly.


I don't think you can convert a string to []byte without unsafe except by copying?


Agreed. You can copy a string into a byte slice, but I don’t think you can convert one into a byte slice that panics on mutation.


Is it panicking on mutation because Go has some special logic, or is it panicking because the text segment is PROT_READ? If it's the latter, sure you can do that too.


I don’t know what PROT_READ is, but as far as I know, Go doesn’t allow you to convert a string into a byte slice apart from ‘unsafe’ (and presumably you’re free to modify that without panicking, but even if not, you used unsafe so the onus is on you to know what you’re doing). There’s syntax support for creating a byte slice by copying a string’s data, but that won’t panic on mutation.
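
For reference, a small sketch of the safe conversion being described. `mutateCopy` is a hypothetical helper; the point is only that the built-in `[]byte(s)` conversion gives you an independent copy, so writing to it neither panics nor changes the string.

```go
package main

import "fmt"

// mutateCopy converts s to a byte slice with the built-in conversion,
// changes the first byte of that copy, and returns both values.
// It assumes s is non-empty.
func mutateCopy(s string) (string, string) {
	b := []byte(s) // behaves as a copy of the string's bytes
	b[0] = 'J'
	return s, string(b)
}

func main() {
	orig, changed := mutateCopy("hello")
	fmt.Println(orig, changed) // hello Jello
}
```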


This is a pretty good vignette of how Go is designed. "Users don't need this feature (immutability, polymorphism, etc.), let's not include it. Ah crap, turns out we actually need it for a core language feature. Is there a lesson we can take away here? Nah, just make an ad-hoc implementation of the functionality in this one place."


"X for me but not for thee" is very much the vibe Go gives me, yeah.

I mean, I have completely replaced my adhoc Python use with it and I'm much happier. But it's terrifying to use to build large systems.


Part of it is that "large systems" are almost all combinations of small systems with proto boundaries, so it's not much of an actual risk unless you're making giant monolith code.


"Proto" meaning protobuffers?


I think it rather means "communication via some protocol" in general.

In other words if we are not talking about giant monolith - we have N small systems to begin with.


I meant specifically protobuffers in that case, but sure, that works.


Const is a hard language feature to design well. There are a lot of pitfalls in the way const is used in C++ and TypeScript (I’m including “readonly” in the discussion).

Rust gets it right, but Rust is relatively complicated.

The languages which are most similar to Go are Java and C#, both of which also lack const types, or have a very limited version of constness.


What are some of these pitfalls in C++'s const design? I assume the issue is that it's difficult to guarantee const correctness as long as you have the ability to fiddle with pointers?

Function signatures showing when a parameter gets modified and when not (such as in Rust, or even in properly const-speckled C++) would benefit Go, but I'm not sure how difficult it'd be to implement that. I could see it being difficult considering that one of Go's core features (slices) results in opaque and overlapping memory ownership.


For what it’s worth it sounds no worse than Java: you have immutable strings but you have no way to enforce that you can’t modify something that you have received, which is why the underlying char array for a string has to be defensively copied if the user asks for `asArray()` or whatever it is.

Strings being immutable doesn’t seem like a special case, though. The underlying mutable array is just encapsulated, which is something that you can implement yourself. (But reflection… maybe you can break the rules with reflection.)


Java gets a pass because it was early. Go had time to learn from its mistakes and…just refused to do so.


No worse, possibly better, yeah. In a fair number of ways I prefer it over Java, e.g. reflection being so limited makes code dramatically easier to understand with confidence, because a large number of common reflection shenanigans in Java simply can't exist at all in Go. The clean slate and lack of inheritance also means the incredible towers of inheritance insanity simply don't exist - it's wonderful.

But I work on a pretty big system. The near-inability to both efficiently and safely abstract things is a big problem, and it'll be years before mature uses of generics truly start to address that... when it even can. Then I miss Java quite a lot. Or maybe more accurately Kotlin. Or sophisticated code generation and bytecode modification. Or MAT. Or...


I don't think you can reflect onto the standard library anymore, with the recent "strong encapsulation".


I mean, what language got everything exactly right from day 0? Yes, Go started from a minimalist position, but that’s been a wildly successful decision—Go lacks a lot of the cruft of other languages. Go is an easily understandable language, it compiles super quickly, it has top notch tooling (e.g., compiling almost any Go project on any system with ‘go build’ and even cross compile by changing a couple of env vars), it compiles to relatively small[^1] static binaries by default, and it does all of this with pretty good performance.

[^1]: Someone is going to come in with some rant about how big a “hello world” binary is compared to C, as if this is emblematic of some real-world use case.


A cynical view is that 80% of Go's success is a result of the excellent tooling, and the fact that it's associated with and pushed by Google, while the underlying language is "mediocre" at best.

In theory I agree with Go's minimalist perspective, it just also feels inconsistent, and like they made a whole lot of bad decisions along the way. Favoring C-style enums over proper sum-types is one of the biggest ones, and ties into Go's error handling, which continues to be a major talking point.


I fully agree with the enums thing—that’s by far my biggest gripe with the language. But even still, I can be a lot more productive in Go than I can in any other mainstream language, including languages that have sum types (Rust, OCaml, etc) which isn’t to say sum types make me less productive but rather that those languages have other productivity issues that outweigh the benefits afforded by sum types. Basically, Go gets a whole lot of little things right that most languages miss, but everyone fixates on “generics” and “sum types” which are, overall, quite small things IMHO.


Is the excellent tooling not also a property of a minimal language? Lisp has the "least" syntax/semantics and the best tooling, because it's so easy to write tools. Go has more complex syntax/semantics, but still much easier to wrangle e.g. control flow out of an AST, compared to other Algol derivatives.


Ironically that is exactly the same approach they had with C, so at least they are consistent.

"Although we entertained occasional thoughts about implementing one of the major languages of the time like Fortran, PL/I, or Algol 68, such a project seemed hopelessly large for our resources: much simpler and smaller tools were called for. All these languages influenced our work, but it was more fun to do things on our own."

-- https://www.bell-labs.com/usr/dmr/www/chist.html


I don’t think you can fairly call go’s generics an “ad-hoc” implementation. They added syntax for it.


I think in this context "ad hoc" refers to the context in which that syntax was added. IIRC the original creators were against generics ever being added to Golang, so they wouldn't have thought about their eventual introduction when choosing Go's initial syntax. The result is that the generics that eventually were added feel awkward and "bolted on" to many people.

(I don't have any strong opinions on it personally, because I'm not invested in that particular ecosystem. I'm merely attempting to distil what I've heard from other people.)


Maybe you shouldn't play a game of telephone like that.

Go generics are exceptionally well designed.

The reason it took so long for Go to get generics is because Go designers took their time to arrive at a design that fits with the rest of Go.

It's not rushed, it's not "bolted on".

They did several designs that they rejected before they accepted the design that got implemented.


This is a truly unreal level of blub paradox and/or brown nosing. It's almost a complete inversion of reality.

Go's maintainers had to be beaten into bolting on a poorly-done implementation of polymorphism over like a decade. I can't imagine anyone who's used any language with polymorphism baked in to the design describing Go's implementation as "exceptionally well designed". If this is a Poe's law thing and you're just joking, then you got me.


If you are going to make such a claim then please back it up rather than state opinions as facts.



Yeah. Go generics are quite crippled. I'm still very, very glad that they exist though - even crippled, they're a vast improvement.

I think they took a... rather extremely-conservative step towards what their generics will eventually be, at which point they'll probably be pretty reasonable. As it stands now they're kinda weird and very incomplete, though thankfully simple (in behavior).

They did at least leave syntactic and semantic room to improve them though, so I think it'll happen eventually. It was cut off at a safe point. They just need to be brow-beaten further, hopefully this small success won't stop the pressure.


I'm a big fan of Go, have been using it since r59 (pre 1.0), professionally working with it the past ~8 years at one of the earliest companies to adopt it.

The fact that you cannot have a generic method at all, and instead have to rewrite methods as functions.. that seems like a pretty glaring flaw.
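
A sketch of what that restriction looks like in practice, assuming the Go releases current when this thread was written (1.18 through 1.20). `List` and `Map` are hypothetical names used only for illustration.

```go
package main

import "fmt"

// List is a hypothetical generic container used only for illustration.
type List[T any] struct {
	items []T
}

// A method may use the receiver's type parameter T...
func (l List[T]) Len() int { return len(l.items) }

// ...but it cannot declare a new one, so this would not compile:
//
//	func (l List[T]) Map[U any](f func(T) U) List[U]
//
// The usual workaround is a top-level generic function instead.
func Map[T, U any](l List[T], f func(T) U) List[U] {
	out := List[U]{items: make([]U, 0, len(l.items))}
	for _, v := range l.items {
		out.items = append(out.items, f(v))
	}
	return out
}

func main() {
	nums := List[int]{items: []int{1, 2, 3}}
	strs := Map(nums, func(n int) string { return fmt.Sprint(n * 2) })
	fmt.Println(strs.items, strs.Len()) // [2 4 6] 3
}
```

The practical cost is that calls read as `Map(nums, f)` rather than `nums.Map(f)`, so they do not chain.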

I'm happy Go got some form of generics, definitely, but they really do feel bolted on to the language.


go's authors were never against generics, they just didn't consider them a table stakes feature

go's generics are pretty good, i've never heard anyone disparage them as "bolted on" or hard to use


I haven’t heard many gripes about the syntax, but I have heard plenty of gripes about how long it took them to add them. Frankly my biggest grievance with Go’s generics is the goofy dictionary implementation that makes performance difficult to reason about. I also think the people who complain the loudest about missing generics were often just unaware that there are other (often simpler) ways to achieve the same thing—it often feels like people were just angry that Go didn’t look exactly like $theirFaveLang. There are definitely some use cases that are improved by generics, but most complaints were about avoiding writing relatively simple loops (iterator chains are more readable in the trivial cases, but quickly become less readable particularly when you need to short circuit on errors and so on—a loop is often cleaner, clearer, and faster to implement).


I had to add a code generator, because of the lack of generics


Yeah, like I said, there are some cases where generics are genuinely helpful. It’s just a small percentage of the code I tend to write (generics have been out for quite a while, and I still avoid them in Go despite my familiarity with languages like Rust, TypeScript, etc that use them extensively).


When you finally cave and jury-rig bolt on language features like a decade after the fact, that's not quite as ad-hoc as what they had for the first decade (generics for builtins only), but it's still pretty ad-hoc in the sense of "not being derived from a coherent theory"


Which modern languages do const properly?


Rust and Haskell come to mind.


Any ML derived language.


"one goroutine has access to the value" :O

If you stick to "values" Go works as originally advertised (see below). With values iirc the consensus became that for high performance the channel overhead was too much of a hit and so back to locks and 'traditional' concurrency.

The concurrency issue in PLTs is not a logical puzzle. It's a performance challenge revolving around copying stuff and hand-offs. With multicores thrown in, it seems to really require addressing the challenges once and for all (as services) at the OS and possibly even hardware layer. IF we have efficiencies at the hw & os around 'message passing', any variation on CSP would address the concurrency issue, "by design", as it says below in Go's "Effective Go" documentation.

https://web.archive.org/web/20091111073232/http://golang.org...

https://web.archive.org/web/20091113154825/http://golang.org...

Share by communicating

Concurrent programming is a large topic and there is space only for some Go-specific highlights here.

Concurrent programming in many environments is made difficult by the subtleties required to implement correct access to shared variables. Go encourages a different approach in which shared values are passed around on channels and, in fact, never actively shared by separate threads of execution. Only one goroutine has access to the value at any given time. Data races cannot occur, by design. To encourage this way of thinking we have reduced it to a slogan:

Do not communicate by sharing memory; instead, share memory by communicating.
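
A minimal sketch of the hand-off style that passage describes. `sumViaChannel` is a made-up name, and the "only one goroutine has access" property here is a convention the code follows, not something the compiler checks.

```go
package main

import "fmt"

// sumViaChannel hands a slice to a worker goroutine over a channel and waits
// for the result. By convention (not enforced by the compiler) the sender
// does not touch the slice after sending it.
func sumViaChannel(values []int) int {
	work := make(chan []int)
	result := make(chan int)

	go func() {
		total := 0
		for _, v := range <-work { // the worker now "owns" the slice
			total += v
		}
		result <- total
	}()

	work <- values
	return <-result
}

func main() {
	fmt.Println(sumViaChannel([]int{1, 2, 3, 4})) // 10
}
```

Because a slice is sent as a header that points at the same array, nothing stops the sender from writing to `values` after the send; that gap is what the rest of this thread is about.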


On the bright side, if one tests with race detection enabled these problems are usually made apparent.


This issue can trivially occur in sequential code.


Here's a succinct demonstration: https://go.dev/play/p/Li6_Rpe2R5L

Basically: a slice is a triple of {ptr to allocation, number of elements used, size of allocation}

the slices will share the same underlying allocation. By specifying the third parameter in the slice function, you set the capacity equal to the number of elements in your slice. This forces the next append to reallocate and copy the contents into a new region.

Without bounding the capacity when creating a new slice, the append operation could possibly continue using the same allocation, shared with the original slice. It could possibly resize and copy as well. It depends on how full the slice is, and is an implementation detail subject to change between versions.
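
In case the playground link rots, a small sketch of the same idea. `appendToView` is a hypothetical helper that appends to a view of the first two elements and then reports what the original slice holds at index 2.

```go
package main

import "fmt"

// appendToView takes a view of the first two elements of backing, appends 99
// to it, and returns what backing[2] holds afterwards. When limit is true the
// view is built with a three-index slice so it has no spare capacity.
func appendToView(limit bool) int {
	backing := []int{1, 2, 3, 4}
	view := backing[0:2] // len 2, cap 4
	if limit {
		view = backing[0:2:2] // len 2, cap 2
	}
	view = append(view, 99)
	return backing[2]
}

func main() {
	fmt.Println(appendToView(false)) // 99: append wrote into backing
	fmt.Println(appendToView(true))  // 3: append had to reallocate
}
```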


Here's another illustration of how slices sometimes alias each other, sometimes not. https://mobile.twitter.com/erikcorry/status/1561635841495236...



Append mutates or copies-and-appends based on the capacity of the slice it's given. And since slicing an array or a slice gives you just a view, not a copy, it can cause "spooky action at a distance" and mutate things you didn't expect, especially if it was a slice with extra capacity of data used somewhere else. Which is what this article is describing.

Plus the knowledge of the change in length (to see the appended data) is only visible with the returned slice (a new view with the larger length), but the underlying data is changed either way.
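
A short sketch of that last point, with made-up helper names: the callee throws away the slice that `append` returns, so the caller's length never changes, yet the shared array has still been written to.

```go
package main

import "fmt"

// discardAppend appends to s but drops the returned slice, as a careless
// callee might.
func discardAppend(s []int) {
	_ = append(s, 42)
}

// hiddenWrite reports the caller's length after the call and the value that
// now sits just past that length in the shared array.
func hiddenWrite() (length int, hidden int) {
	buf := make([]int, 2, 4) // [0 0] with room for two more
	discardAppend(buf)
	return len(buf), buf[:3][2]
}

func main() {
	length, hidden := hiddenWrite()
	fmt.Println(length, hidden) // 2 42
}
```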

It's not often a source of errors, but when it is it can be extremely hard to diagnose.


So in other words, appending to a slice may overwrite data in the middle of a different slice?


Yep. Depending on how the slice was constructed, not how it is used.

Slices (largely) behave like this in most languages, Go's contribution is mostly that slices and append are ubiquitous, so pretty much every Go coder is exposed to it. Prior to generics, literally any alternative was so much more work they essentially haven't been used. That may change in the future now that we do have (very simplistic) generics, but only time will tell.


> Slices (largely) behave like this in most languages

Is that really the case? I can’t think of many languages that let you construct a view into the middle of array, pass that view as an argument to a function call, and allow that function to add new values into your array via the view.

In Python or JS, for example, the “slice” would just be a brand new array, right?


If the slice syntax creates a copy, I would argue that it's simply syntactic sugar for array copying, not actually producing a slice (i.e. there is no slice type). But yes, `somefunc(ary[1:])` in Python produces a copy, not a reference to the underlying value. You could build a more "true" slice class, but the builtin stuff doesn't do that. JavaScript is similar.

Java however has `List<>.sublist` which largely behaves like Go: https://docs.oracle.com/javase/8/docs/api/java/util/List.htm... . Sometimes these kinds of things are also referred to as "views". Go calls them slices though, and this is a Go article.

edit: C# apparently has "spans": https://learn.microsoft.com/en-us/archive/msdn-magazine/2018...


It doesn’t look like any of those allow you to arbitrarily add and remove elements from the parent list in the same way Go does (if I understand Go slices correctly).

Java does let you make modifications, but only under very tight restrictions:

> The semantics of the list returned by this method become undefined if the backing list (i.e., this list) is structurally modified in any way other than via the returned list. (Structural modifications are those that change the size of this list, or otherwise perturb it in such a fashion that iterations in progress may yield incorrect results.)

I take that to mean the parent list can’t be modified at the same time, nor can multiple views be modified.

So, I think Go’s ability to do those structural modifications on slices is rather unusual (as well as error-prone and not obviously useful).


Go slices cannot change the structure (len or cap) of their "parent" slice. (Unlike Java, the semantics of every operation are even well-defined.) In some sense this is the root of the problem - the "child" slice can start using spare capacity it 'inherited' with no way to tell the parent it has done so, and the parent may inadvertently pass some of its length as capacity down to its child.


You can't insert or remove elements, but you can overrun the buffer??

I have a hard time thinking of a situation when that would be useful at all, let alone useful enough to be the default and widely-used behaviour! If I want to let the callee add stuff to the end of my list, can't I just pass them the whole list and let them modify that?


You cannot "overrun the buffer" nor can a callee add stuff to the end of your list; you can give them a memory buffer with unused space and they may use it, and then you can also mistakenly use it later.

Honestly, you seem to be too detached to understand this. If you really want to know how it works go through some official Go documentation. It's not fundamentally different than some feature in other languages, it's just that in Go this is the default growable list type and in other languages it's usually a non-default type.

As for whether it's ultimately useful - of course we can debate, but everyone still builds lists one element at a time and allocators still allocate by doubling, so there's at least one immediate and obvious use of capacity beyond the used length.


I coincidentally was just looking at some docs, and I think I have a handle on it (to it?:) now.

The specific strange design decision in Go is that you can create a child slice that has its own starting offset and length, but may or may not inherit its parent’s capacity. Just because it’s well-defined doesn’t mean it isn’t tricky.

Even if you kept the semantics the same but made cap=len for subslices by default, surely that would be an improvement. Rather than a “rare slice trick”, it ought to be normal. Or is there an advantage to the current default that I’m overlooking?


I mean, if you want Java, it's right there.

The advantage is that you get the fast path by default. Since Go programs are generally not awash in buffer reuse bugs despite its semantic trickiness, this seems like something different languages can reasonably prioritize differently.


cap==len is effectively the default - most slicing is done with [start:end], not [start:end:additional-cap]


No it’s not -- I just checked the language specification and it says:

After slicing the array a

    a := [5]int{1, 2, 3, 4, 5}
    s := a[1:4]

the slice s has type []int, length 3, capacity 4

Many of us here are arguing that the default should be cap==len==3, but it’s not, it’s 4.
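
This is easy to check yourself. A minimal sketch using the spec's array; the `capacities` helper is made up so both forms can be compared side by side.

```go
package main

import "fmt"

// capacities returns the capacity of a[1:4] and of a[1:4:4] for the array
// used in the spec's example.
func capacities() (twoIndex, threeIndex int) {
	a := [5]int{1, 2, 3, 4, 5}
	s := a[1:4]   // two-index form: capacity runs to the end of a
	t := a[1:4:4] // three-index form: capacity is max - low
	return cap(s), cap(t)
}

func main() {
	fmt.Println(capacities()) // 4 3
}
```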


Welp. TIL.

Yeah, that's much more error prone I think. It's the sort of thing that only kinda makes sense on zero-valued arrays... and even then it's dubious at best.

Bleh. I'm gonna go review some old code now D:


Another example is typed arrays in JavaScript, which have both .slice() and .subarray(). One creates a copy, and one creates a new view into the same underlying memory.


You can’t change the length of a JS ArrayBuffer, or write past the end of your view.


ArrayBuffers have a byteLength and a maxByteLength and this is similar to Go's len and cap, with the `resize()` being equivalent to a reslice or append on the subview.


The difference is that those operations are on the root ArrayBuffer in JS, not the TypedArray views on that buffer.

In Go, the underlying array is fixed, and each slice independently has resize/reallocate operations.

The JS approach seems better and more comprehensible to me. (I note also that MDN lists “resizable” and “maxByteLength” as new experimental features.)


> Slices (largely) behave like this in most languages

Only on assignment, not in appending.

> Go's contribution is mostly that slices and append are ubiquitous

Go's contribution is the conflation of slices and vectors, which in most languages are separate (or really most languages only have the latter and don't provide access to backing arrays, thus precluding this specific confusion).


In C++, until C++20’s std::span, you would just use std::vector for stuff you can modify and const std::vector for stuff you can’t.

I think the real problem is that Go’s type system doesn’t catch the common error of keeping a reference to something you don’t own, or similar errors. C# catches some of these errors by letting you return a IReadOnlyList<T> or some other restricted type. Golang has ways to narrow certain types, but the slice type is primitive and has no narrowed read-only variation. C# instead forces you to pay a higher runtime cost, because you’re paying a lot more for indirection with IReadOnlyList<T>.

People fret about the difference between T[] and List<T> in C#, and they do come with different performance characteristics… Go’s choice to use slices everywhere does have a certain advantage that you’re getting the fast path everywhere, and you’re spending less time thinking about which one to use.

Not trying to say that Go is “right” here, it’s just my viewpoint that these language design decisions are rational.


> People fret about the difference between T[] and List<T> in C#

That doesn't align with my experience. I've worked for small (4 programmers) and large (100s of programmers) C# shops, and I don't recall people "fretting" about T[] and List<T>. People see T[] as a non-growable, less useful version of List<T>. HashSet<T> vs List<T> seems to cause much more trouble for novice (and sometimes experienced) C# programmers.


I'm thinking more about the places where people care about performance, since accessing a T[] is faster than accessing a List<T> (less indirection).


There are no immutable arrays. If you have a slice of capacity ten with four elements, you can write a fifth element to it. If you reduce the capacity to four, then a new slice must be allocated to store five elements.

The backing store for a slice is just a pointer. The rule is you don't give people pointers that you don't want them to write to.


That's just not true. There are immutable arrays in length. [4]int cannot be appended to. The backing of all slices are array types and array doubling is used for appends that go beyond the capacity.

https://go.dev/ref/spec#Array_types


No array can change its length, the length is part of the type. There really are no "immutable array"s.


Yeah, that was a mistake.


Sounds like they're just badly describing copy-on-write.

The original slice uses no extra memory because as long as you treat it as immutable it won't actually make a copy. As soon as you try to modify it, then the copy is made and the extra memory used.


Yes-ish; because go has no immutable slices, it's just copy-on-realloc. Which is of course obvious if you realize that's what it's actually doing…


> Sounds like they're just badly describing copy-on-write.

It’s not quite copy-on-write, and what it’s really describing is a workaround / safeguard.

The problem is that by default Go slices will have as large a capacity as they can based on the parent, this is a problem if you return a slice to a still-in-use slice or array, and the caller decides to use that slice as a vector (either because the contract is not well documented or because they fucked up): appending to the “borrow” will happily go and stomp over the backing array, which may be holding in-use data of another slice or the original array.

By forcing the slice to have no extra capacity, if a caller tries to append it’ll force a “fork” by realloc-ing and avoid the issue.


I know even less go but it sounds like it prevents accidentally writing past the end of the array by making a copy and appending to that.


why do people write articles about go features? when PHP was in its prime, almost nobody wrote blogs explaining how they found some philosophy in PHP. there's a reason for that.

when you see someone open their article explaining a language feature by talking of the implementation details or specific use cases, that's a language smell (of course all industrial PLs stink).

ironically go is the only post 80s language that uses "memory safe" as a marketing point (even though they all are), yet go has the most memory unsafety of post 80s industrial languages. you can parse something and pass on a slice somewhere. if you mistakenly slice that slice with a bigger size - this incorrect size being the programmer bounds check error - you restore some of the original array that was supposed to be cut off and the next operation working on that slice will thus modify or leak data:

    package main
    import "fmt"
    func main() {
            a := [3]int{1, 2, 3}
            b := a[0:2]       // len 2, cap 3: a[2] is cut off from b
            fmt.Println(b[1]) // 2
            c := b[0:3]       // reslicing up to cap(b) brings a[2] back
            fmt.Println(c[2]) // 3
    }

    $ go run a.go
    2
    3

the other example of memory unsafety in go being that modifying slices between threads can lead to actual memory corruption, not just simulated memory corruption as above

the point here is that this footgun doesn't even have a real point outside of some insane performance argument. nobody would ever design something like this without massive cognitive dissonance (aside from industrial PLs, which just copy and modify the previous industrial PL, C in this case). all go's primitives are rigged like this with unintuitive behaviors. it's amazing how much such a simple language with small scope can get wrong. and i expect nothing less from people who go around saying "zeroeth". DAY OF THE BOBCAT SOON


Yeah, the most surprising thing about Go's slice expressions is that you can reslice a slice beyond its length, as long as it's still within its capacity.

I wonder how many off-by-one bugs have happened undetected because a slice is unintentionally resliced beyond its length. Instead of crashing, so that the issue is known early, the program will still run with inconsistent data.


calling articles about language features and/or their implementations a "smell" is some pretty insane stuff

the slice behavior you demonstrate there is well-defined by the language spec, it's totally memory safe, it doesn't demonstrate memory corruption or anything like that

go is probably the most successful new language since java, if you don't like it that's fine, but it's nonsensical to call its design decisions "wrong"


I’m pretty sure the most successful language since Java is JavaScript.

Of course it has even more bad design decisions than Go, but people don’t leap out to defend them quite so much.


I'm not sure how your snippet above exemplifies memory unsafety.

Concurrent access does let you hit some 'fun' behavior, but you have to be doing pretty dumb things to hit it. And while the implementation may be able to save you from something like that, such things would likely bubble up elsewhere (disk i/o, network i/o, etc) if doing that kind of thing.


I would also consider this to be memory unsafe. If you have an "array", you should not be able to index (or slice) beyond its bounds. If you are allowed to do so, you may have unpredictable junk in your array.


This is not any usual definition of memory-unsafety; the contents may be useless to your task at hand but it's well-defined.


TLDR:

The best part of Go is that there is very little magic in Go. If you understand that slices are just fat pointers implemented as a built-in, there is nothing confusing about them. I can understand every part of a Go program, all the way down to the language syntax that generates assembly. I don't have to be afraid of or be mystified by any language feature, because 1) there are few, 2) they are just programs implementable in Go. This does not happen with many languages.

Longer version:

Go didn't need to add slices as a language feature (it could have been a library function of containers, as fat pointers are not a new thing), but having it in the language makes using them easy. And not having generics at the start sort of forced their hand.

And as slices are just fat pointers to an underlying array, obviously it's not multi-thread safe.

So if you understand that slices are just C-style structs with pointer to data, a length counter and a capacity counter, then nothing in your example code is surprising. There is no hidden memory copy, no hidden synchronization lock to make it thread safe. And Go's a = append(a, item) now makes sense, because if 'a' grew in size, append would have to create a new underlying array, and a new slice struct with a pointer to new data. To me, it's much easier to reason about what the code is doing than other languages with Array types.
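
To make the `a = append(a, item)` point concrete, here is a small sketch (function names are hypothetical) of the two cases: appending while spare capacity exists writes into the shared backing array, while appending to a full slice allocates a new one.

```go
package main

import "fmt"

// appendWithinCap appends to a sub-slice that still has spare capacity,
// so the write lands in the backing array shared with orig.
func appendWithinCap() []int {
	orig := []int{1, 2, 3}
	sub := orig[:2]     // len 2, cap 3
	_ = append(sub, 99) // fits within cap: overwrites orig[2]
	return orig
}

// appendBeyondCap appends to a slice that is already full, so append
// allocates a new backing array and the original is left alone.
func appendBeyondCap() []int {
	orig := []int{1, 2, 3}   // len 3, cap 3
	grown := append(orig, 4) // no room: elements are copied to a new array
	grown[0] = 100           // only changes the copy
	return orig
}

func main() {
	fmt.Println(appendWithinCap()) // [1 2 99]
	fmt.Println(appendBeyondCap()) // [1 2 3]
}
```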

> nobody would ever design something like this without massive cognitive dissonance

Somebody did, without any cognitive dissonance. And I like it :)

> just copy and modify the previous industrial PL, C in this case

Go really wanted to be "A Better C". The language is not much larger than C, removed a bunch of C foot-guns, and it's as capable as Java, if not a bit more. I think the compromises Go made were well considered compared to other C-family languages.


> The best part of Go is that there is very little magic in Go. If you understand that slices are just fat pointers implemented as a built-in, there is nothing confusing about them.

It's just a tautology. If you understand something, of course by definition you aren't confused by it. By the same logic, all languages have "very little magic."


But the big difference is time to understand it


Offloading that pain of learning as the pain of using.


How can you call slices "little magic" when append may or may not modify the original slice?

So you pass something as a value, and something else as a pointer. But slices are passed as values but sometimes act like pointers.


Why not just use c at that point?


Why not use C instead of Better C? Is that what you're asking?


probably because it's simpler and there are fewer things to worry about. few write their web app in C.


Oh, I'd rather use this version:

  package main

  func main() {
    a := [2]int{1, 2}
    b := a[0:1:1]
    c := b[0:2] // panics at run time: 2 > cap(b) == 1
    println(b[0], c[1])
  }


Tangential, but if OP is the owner of the site, could you talk a little bit about your book writing process?

- How you write

- How you render the PDF

- How you develop your plans

That kind of thing. I find it super interesting to self-publish software books and I've been slowly writing one for about a year now. Really curious about this stuff in general and it looks like you've got a solid process down.


From the author of Crafting Interpreters and Game Programming Patterns, some interesting stuff here about how he went about his two books which are excellent quality http://journal.stuffwithstuff.com/category/book/


My fave golang slice trick is the len of an empty slice is 0, but the slice itself is == to nil, but the len of nil won't compile. Can't understand that one.

https://go.dev/play/p/MslCkBphl7q?v=gotip


Because nil is a special thing in Go.

nil is not a value, it's a predeclared identifier.

it represents zero value for pointers, interfaces, maps, slices, channels and function types, representing an uninitialized value

len(nil) feels like it should work if you think of nil as the same as "value of empty array"

but what should be:

  var p *Struct
  len(p)
  ???

There's no "length of zero-valued pointer".


Because the bare value `nil` has no type. Typing it, e.g. `len([]int(nil))` works fine.
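
A short sketch of that, assuming nothing beyond the built-in `len`. The helper name `typedNilLens` is made up.

```go
package main

import "fmt"

// typedNilLens returns the lengths of a typed nil slice and a typed nil map.
// The untyped identifier nil on its own, as in len(nil), does not compile.
func typedNilLens() (int, int) {
	return len([]int(nil)), len(map[string]int(nil))
}

func main() {
	var s []int // zero value of a slice type, which is nil
	fmt.Println(s == nil, len(s)) // true 0
	fmt.Println(typedNilLens())   // 0 0
}
```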


This is the most satisfying answer, but it doesn't make the implications less complex.


What implications did you have in mind?

In languages with null and type inference, `var x = null` is probably not going to infer the type you want. In languages with function overloading (which is essentially the case for Go `len`), `f(null)` is going to be a compile-time error if multiple overloads are potentially null.


the ability to cast nil as a zero length array means there's no difference between a function that returns a zero length array and a function that returns nil (assuming some casting process takes place). It could be a subtle and annoying bug to track down the difference.


nil isn't cast to an empty slice (Go doesn't have casts, except maybe the new pointer-to-array syntax if you want to count that), nil is the default value of a slice, and that value is also empty. Other empty slices may be non-nil, because they may have capacity, or have been sliced out of another buffer, etc.

Of the various legitimate issues around nil (box vs. unboxed, nil receivers, nilability of all pointers), this is the most not-actually-ever-an-issue.


Oddly the first issue I came across today was folks being confused about this. I haven't gotten to the bottom of where the code is now, but this go library was messing up len == 0 vs nility and someone forked it to fix it: https://github.com/algorithmiaio/mapstructure/pull/1 It may not be a common bug but that doesn't make it less of a pitfall.


No, this is a terrible idea and you're in for a world of pain. If you want to distinguish "not set" from "length zero" use a *[]T or ([]T, bool), not a nil vs. non-nil-empty []T. A nil slice is not any special kind of empty slice, it is just the most efficient representation of an empty slice.


> the len of an empty slice is 0, but the slice itself is == to nil

That’s not true. An empty slice is initialized: []T{} or make([]T, 0); it’s not equal to nil.[1] The zero value nil slice is technically not an “empty slice”. Colloquially you may call a nil slice an empty slice, but the nil-ness is still an important distinction that manifests in e.g. encoding/json.Marshal; nil marshals to null, whereas an initialized empty slice marshals to [].

If you want to test the emptiness of a slice, test the length, don’t compare it to nil.

[1] https://go.dev/play/p/IP2NIgwvaTR?v=gotip
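
For readers who don't want to open the playground, a minimal sketch of the same distinction with `encoding/json` (the helper name `marshalBoth` is made up):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// marshalBoth encodes a nil slice and an initialized empty slice.
func marshalBoth() (string, string) {
	var nilSlice []int      // nil, len 0
	emptySlice := []int{}   // non-nil, len 0
	n, _ := json.Marshal(nilSlice)
	e, _ := json.Marshal(emptySlice)
	return string(n), string(e)
}

func main() {
	var nilSlice []int
	emptySlice := []int{}
	fmt.Println(nilSlice == nil, emptySlice == nil) // true false
	fmt.Println(len(nilSlice), len(emptySlice))     // 0 0
	fmt.Println(marshalBoth())                      // null []
}
```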


How does it look with json.Unmarshal? Does a JSON null decode into nil or an empty slice? And a missing JSON field?


Nil slices (which are also empty) serialize to `null` instead of `[]` when using the default json encoder.
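
On the decoding side, a sketch of what `json.Unmarshal` does for the three cases asked about. The `payload` struct and `decode` helper are hypothetical, for illustration only.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// payload is a hypothetical struct used only for this illustration.
type payload struct {
	Items []int `json:"items"`
}

// decode unmarshals doc and reports whether Items ended up nil, plus its length.
func decode(doc string) (isNil bool, n int) {
	var p payload
	if err := json.Unmarshal([]byte(doc), &p); err != nil {
		panic(err)
	}
	return p.Items == nil, len(p.Items)
}

func main() {
	fmt.Println(decode(`{"items": null}`)) // true 0  (null sets the slice to nil)
	fmt.Println(decode(`{}`))              // true 0  (missing field is left untouched)
	fmt.Println(decode(`{"items": []}`))   // false 0 (empty array gives a non-nil slice)
}
```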


Question: what is the reason for the silent copy when append exceeds the original slice cap?

It's a footgun avoided by reading the spec and (maybe) remembering it in practice, but it feels like it would be safer to throw a comp error and force the user to deal with it when a user is trying to exceed the cap of the underlying array?

Alternative is defensively using len() and cap() for slice ops in which case error-ing out feels more ergonomic.


Because you would not have any growable vector/list structure otherwise.

The real problem is that Go merrily lets you have copy-and-append operation on a slice (good), subviews of a slice so that you can share subsets of the data without copying it (good), at the same time (very bad: any operation on either will lead to confusion).

In most languages, subslicing gives you something of another type that can't be modified (or at least not accidentally). But in Go, if I call a function `do_smth_with_slice(xs []byte)`, there is no way for me to know whether this function expects `xs` to be mutable or not.

I'm sure that e.g. a C++ function taking an std::view can do some forbidden magic to still modify the underlying data, but at least the original intent is made clear by the argument type.


> Question: what is the reason for the silent copy when append exceeds the original slice cap?

Because Go slices play double duty as vectors. And that is the usual behaviour of a vector.

And the issue is the opposite situation, when appending does not exceed the original slice cap. The entire point of the slice trick is to force a resize (and thus a copy) on append.
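
A sketch of that difference, with a made-up helper `appendToView` and a local array standing in for the "immutable" data:

```go
package main

import "fmt"

// appendToView takes a two-element view of a local array, appends to it,
// and returns the array so the caller can see whether it was overwritten.
func appendToView(capped bool) [4]int {
	table := [4]int{10, 20, 30, 40}
	var view []int
	if capped {
		view = table[0:2:2] // cap == len: append has no room and must copy
	} else {
		view = table[0:2] // cap 4: append writes into table[2]
	}
	_ = append(view, 99)
	return table
}

func main() {
	fmt.Println(appendToView(true))  // [10 20 30 40]
	fmt.Println(appendToView(false)) // [10 20 99 40]
}
```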

> it feels like it would be safer to throw a comp error and force the user to deal with it when a user is trying to exceed the cap of the underlying array?

It would be safer to have not confused slices and vectors, but half-adding that confusion sounds even worse, your suggestion would only keep the worst parts, and would require hand-rolling the rest every time.


Erroring on appending to a slice would require checking every call to append for an error. I'd find it more surprising for append to error on resize since append implies a growable array.


This to me is a glaring case of a poorly named variable.

a[low:high:capacity] would be much easier to understand at first sight.


As high is not len, max is not capacity, low gets subtracted from both.


0. Only 2 of 3 values are necessary.

1. Can 1 value be unspecified?

2. Can 2 values be unspecified?

3. Can 3 values be unspecified?

4. Are conflicting values interpreted as intersection of ranges rather than union?

Note 1-3. With 2 values, I believe x[:] is how to lift a sized array into a more generic slice type.


Regarding conflicting ranges—

In a[i:j:k], 0 <= i, i <= j, j <= k, and k <= cap(a). If not, then the operation will panic.
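
A quick way to check those conditions, using a hypothetical helper `slice3` that recovers from the panic. The indices are passed as variables so the checks happen at run time rather than being rejected by the compiler.

```go
package main

import "fmt"

// slice3 evaluates a[i:j:k] and reports whether doing so panicked.
func slice3(a []int, i, j, k int) (s []int, panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	return a[i:j:k], false
}

func main() {
	a := make([]int, 5) // len 5, cap 5

	_, p := slice3(a, 1, 3, 4)
	fmt.Println(p) // false: 0 <= 1 <= 3 <= 4 <= cap(a)
	_, p = slice3(a, 3, 2, 4)
	fmt.Println(p) // true: i > j
	_, p = slice3(a, 1, 4, 3)
	fmt.Println(p) // true: j > k
	_, p = slice3(a, 1, 3, 6)
	fmt.Println(p) // true: k > cap(a)
}
```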


Your questions are really unclear.

In the 2-parameter form `a[low:high]`, both values can be left out, defaulting to 0 and len(a) respectively. In the 3-parameter form, only the leading value can be left out, defaulting to 0.
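
A sketch of the legal forms with the resulting length and capacity of each. The helpers `lenCap` and `forms` are made up for illustration.

```go
package main

import "fmt"

// lenCap returns the length and capacity of s as a pair.
func lenCap(s []int) [2]int { return [2]int{len(s), cap(s)} }

// forms evaluates each legal way of omitting indices on a 5-element array.
func forms() [4][2]int {
	a := [5]int{0, 1, 2, 3, 4}
	return [4][2]int{
		lenCap(a[:]),    // low and high omitted: the whole array as a slice
		lenCap(a[2:]),   // high omitted: defaults to len(a)
		lenCap(a[:2]),   // low omitted: defaults to 0
		lenCap(a[:2:3]), // 3-index form: only low may be omitted
	}
	// a[1:2:] and a[1::3] are rejected by the compiler.
}

func main() {
	fmt.Println(forms()) // [[5 5] [3 3] [2 5] [2 3]]
}
```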

I've no idea what (4) is asking about.


So this is the same as

    a[low:min(high, max)]

?


No.
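
To spell out why: the third index does not clamp the length, it sets the capacity. Since high <= max is required anyway, `min(high, max)` is just `high`, so the two expressions have the same length but different capacities. A small sketch with a hypothetical helper `compare`:

```go
package main

import "fmt"

// compare returns {len, cap} for a[lo:hi:mx] and for a[lo:min(hi, mx)],
// taken from an 8-element slice.
func compare(lo, hi, mx int) (full, two [2]int) {
	a := make([]int, 8)

	x := a[lo:hi:mx] // len hi-lo, cap mx-lo

	upper := hi
	if mx < upper {
		upper = mx
	}
	y := a[lo:upper] // len upper-lo, cap cap(a)-lo

	return [2]int{len(x), cap(x)}, [2]int{len(y), cap(y)}
}

func main() {
	full, two := compare(2, 4, 5)
	fmt.Println(full, two) // [2 3] [2 6]
}
```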



