> In Rust, a boxed pointer sometimes includes an extra word (a “vtable pointer”) and sometimes doesn’t. It depends on whether the T in Box<T> is a type or a trait. Don’t ask me more, I do not know more.
For those wanting to know more about this, the idea is that types whose size is unknown at compile-time receive this two-word representation. I tend to refer to these as "fat pointers", which is terminology from Cyclone (though Cyclone's fat pointers serve a different purpose). More documentation on these can be found at https://doc.rust-lang.org/beta/nomicon/exotic-sizes.html#dyn... and in the section in the book on slices (terminology taken from Go, whose slices are similar though with an extra word) https://doc.rust-lang.org/book/second-edition/ch04-03-slices...
> terminology taken from Go, whose slices are similar though with an extra word
Interesting; I've always thought of Rust slices as being rather different from Go slices. A Rust slice is always used through a reference and doesn't own its data, whereas a Go slice is not generally used through a pointer and sometimes points to a heap allocated section of memory, so it's basically the union of Rust's vectors and slices.
Tangentially, the inability to tell whether some value is heap allocated or not from the type is one of my main gripes when working with Go as opposed to Rust; in Rust, I can be sure that `Vec`, `String`, `Box`, `Rc`, and `Arc` are all heap allocated and that slices, arrays, `str`, `&T`, and `&mut T` are not. In Go, slices and pointers might be heap allocated--or they might not.
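To make that concrete, here's a minimal sketch (the variable names are just for illustration): the owning types at the top each allocate, while the borrowed and fixed-size types below never allocate on their own.

    fn main() {
        // Owning types: each of these performs a heap allocation.
        let v: Vec<i32> = vec![1, 2, 3];     // buffer on the heap
        let s: String = String::from("hi");  // bytes on the heap
        let b: Box<i32> = Box::new(42);      // one i32 on the heap

        // Borrowed / fixed-size types: none of these allocate by themselves.
        let arr: [i32; 3] = [4, 5, 6];       // lives on the stack
        let slice: &[i32] = &arr;            // pointer + length, no allocation
        let text: &str = "hello";            // points into the read-only binary
        let r: &i32 = &*b;                   // borrows the heap value, no new allocation

        println!("{:?} {} {} {:?} {} {}", v, s, b, slice, text, r);
    }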
> in Rust, I can be sure that `Vec`, `String`, `Box`, `Rc`, and `Arc` are all heap allocated and that slices, arrays, `str`, `&T`, and `&mut T` are not.
That's not really true, because you can easily create e.g. a `&str` out of a `String` or a `&[T]` out of a `Vec<T>`.
The String and Vec objects live on the stack in the same way, but they contain pointers to heap data. I think the grandparent was trying to get at the fact that &str etc don't force things onto the heap nor do they keep things on the heap alive.
Yep, from looking at all the comments in response to my original comment, it seems like I didn't do a great job explaining what I meant here; the basic idea is that if I'm trying to optimize my program by minimizing heap allocations, I can safely ignore any instances of &str, &[T], etc. and just focus on String, Vec, etc.
`String` copies are always `String`s and will live on the heap. AFAIK there are no stack strings (plain `str` which I guess is what you mean?)
IIRC small fixed-size byte arrays (`[u8; N]`) are sometimes allocated on the stack depending on their size, but they are plain bytes and not full-fledged UTF8 strings. You can convert them into `String`, but that would heap-allocate them.
To expand on this: in Rust, copy strictly means there is a new owner that is tasked with deallocating that data once it goes out of scope. Move is the same, but the ownership is transferred (thus the old owner is no longer responsible for deallocating anything) instead of having a new copy and an additional owner. Copy always results in a new object of the same type.
& types are always borrowing. You can't copy into a reference since references are just borrowing of data, and owners of references won't deallocate anything (since they assume that, as long as they can hold an &, the data they reference is still alive, which is true because owners can't deallocate anything if there is a borrow in place).
EDIT: I was wrong. As sibling comment says, you can convert a stack-allocated fixed-size array slice into `&str` with `str::from_utf8`.
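For example, something like this (a small sketch; only the last step allocates):

    fn main() {
        // A fixed-size byte array on the stack; no heap allocation involved.
        let bytes: [u8; 5] = *b"hello";

        // Borrow it as a &str: this validates UTF-8 but does not allocate.
        let s: &str = std::str::from_utf8(&bytes).expect("valid UTF-8");
        println!("{}", s);

        // Only this step allocates, copying the bytes into an owned String.
        let owned: String = s.to_string();
        println!("{}", owned);
    }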
> whereas a Go slice is not generally used through a pointer
I think this is referring to a syntactic difference more than an implementation difference. In Rust, a &[u8] (usually called a "slice" but maybe more technically a "slice reference") is a pointer + a length. This is basically the same as Go's []byte, which is a pointer + a length + a capacity.
Rust also sometimes uses the [u8] type (without the &). This is an "exotic" type, in that it has no fixed size. It refers to the bytes inside the slice, but it's not really a pointer to them -- it is the bytes themselves, however many of them there might be. This mostly comes up when you're dealing with generic traits like AsRef or Deref, which will put the & back in all of their method signatures.
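For instance, this is roughly how the unsized `[u8]` shows up through `Deref` and `AsRef` (a sketch; the type annotations are only there to make the point visible):

    use std::ops::Deref;

    fn main() {
        let v: Vec<u8> = vec![1, 2, 3];

        // Vec<u8>'s Deref target is the unsized [u8]; the & comes back in the
        // method signature: fn deref(&self) -> &Self::Target.
        let a: &[u8] = v.deref();

        // Same story with AsRef: the trait parameter is the unsized [u8],
        // but what you actually get back is a &[u8].
        let b: &[u8] = v.as_ref();

        println!("{:?} {:?}", a, b);
    }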
I'm not sure that I'd agree that `[]byte` in Go is "basically the same as" `&[u8]` in Rust; making a `&[u8]` will never cause a heap allocation, which is what I was trying to get at with my original comment. If I want to track down all of the heap allocations in my program in Rust, I can safely ignore all of my slices, whereas in Go, I have to carefully reason about each usage of one.
Got it. In that sense a Go slice and a Go array pointer (a *[n]byte) could have the same effect, is that right?
What I wanted to emphasize was that whether you're reading bytes through a Go []byte or a Rust &[u8], the same "number of hops" is happening at runtime.
Normally, it isn't! Unfortunately, "good enough" is relative, and for some applications this is vitally important.
I'm not trying to knock Go's performance here; from a naive standpoint, GC'ing only some pointers is better than GC'ing all of them as in more traditional garbage-collected languages, but it's still easier to know exactly what is being heap allocated and what isn't in a language like Java, precisely because you know that all objects are on the heap.

From what I've seen of low-level optimizations in Go code, they rely heavily on techniques like generating flame graphs to analyze where allocations are occurring, which IMO isn't a very good workflow, whereas in Rust you could do this much more easily by just looking at the types that are used. I don't think this approach is necessarily incompatible with garbage collection; theoretically a language like Go could have separate vector and slice types like Rust does, and I think that would make these types of optimizations much easier!
(I'm also not sure why you were downvoted for asking this; it's a perfectly reasonable question)
I wonder if you have seen the OpenHFT project written in pure Java for high frequency trading (https://github.com/OpenHFT). Would 175 _million_ trading transactions per second on modest hardware be considered good enough? Check out the Chronicle log in the same project, that persists tens of millions of records on disk.
All it takes is a basic understanding of cache architecture and of generational GC, and simple data structures.
Sure, for high frequency trading, I think that's good enough! On the other hand, if I'm writing an operating system or a device driver, getting a GC'd language to be "good enough" is a very different type of problem.
As an aside, I don't think Java actually suffers from the specific problem that I was mentioning in my original comment, namely that it's hard to tell what's on the heap or not. I was under the impression that all objects in Java are on the heap, which makes it trivial to determine whether something is heap-allocated or not based on the type, like in Rust.
> On the other hand, if I'm writing an operating system or a device driver, getting a GC'd language to be "good enough" is a very different type of problem
Niklaus Wirth's Oberon OS (written in Oberon), Microsoft's Singularity OS (written in a variant of C#), the Mirage unikernel written in OCaml, these are all examples of OSs written in GC'd languages. I am not aware of performance being an issue in any of these cases. Oberon was extensively used at ETH, and the components of Mirage that I am aware of (such as their TLS and DNS) are competitive in performance with their C counterparts.
> I don't think this approach is necessarily incompatible with garbage collection; theoretically a language like Go could have separate vector and slice types like Rust does, and I think that would make these types of optimizations much easier!
Absolutely. It's not a GC issue, it's a design issue. Adding this kind of control would make the language harder to use.
Making the hard case easier to handle for experts makes the easy case harder to handle for everyone.
There are problems when you're taking a slice and storing it for a potentially long time. If the slice is actually a subslice of a super large array, the backing array will be kept around until the subslice goes away (assuming the implementation doesn't try to be clever about this, and it probably doesn't).
If the Go compiler can't tell anything about the length of storage of a slice, it'll end up on the heap. The stack is usually only used when escape analysis determines that the value does not survive the function call, at least IIRC.
Slices should be capable of being partially deallocated so long as the backing arrays are not referenced anymore.
Go doesn't have a generational & compacting GC (where allocation can be made really efficient: just bumping a pointer), hence heap allocations are expensive in Go (but idk the details, and maybe they're not as expensive as they are in C or Rust).
Then to avoid performance penalties, you need to reduce allocations to the minimum, but since Go uses escape analysis to decide whether to allocate on the heap or not, you don't have full control over what is heap-allocated or not, and avoiding allocations can be quite tricky.
Rather than being a comment on where each was stored, my comment was intended to highlight how both Rust and Go slices are fixed-size pointer+metadata "windows" into some underlying array.
Fair enough! Not having done any PL work in a while, I tend to think of types in terms of the properties of how I use them rather than their implementation, so I was just surprised to see them compared this way.
Isn't it the case that the compiler allocates objects on the heap if the reference escapes the function, and allocates on the stack if the reference doesn't?
The difficulty is knowing when that happens, and you're probably guessing wrong (the -m gcflag will tell you).
Furthermore, fitting Go's theme, the escape analysis is pretty simplistic, so there are many cases where it will somewhat unexpectedly assume escape (note: the link is from 1.15; some cases have been fixed since, like the …arg one or the slice assignment): https://docs.google.com/document/d/1CxgUBPlx9iJzkz9JWkb6tIpT...
As I mentioned in a couple sibling comments, it looks I didn't do a great job explaining what I meant here, but my basic point was that if I want to optimize my program by minimizing heap allocations, I can ignore all `&str`, `&[T]`, `&T`, etc. and just focus on `Vec`, `String`, `Box`, etc. In Go, there isn't any clear boundary like this, so I'm forced to reason about every slice and pointer to determine if they're doing heap allocations or not.
They are both pointers, but `b` is an owned pointer (it "owns" a heap allocated value) and `r` is a borrowed/reference pointer (a "pure" reference to something (heap or stack) that it doesn't own and isn't responsible for).
`b` cannot be moved, mutated, or deallocated until `r` is gone. When `b` goes out of scope the heap value it points to will be automatically deallocated, unless `r` still exists somewhere (saved off in a struct for example), in which case the program won't compile.
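A tiny sketch of that (the variable names just mirror the discussion, not the article's actual code):

    fn main() {
        let b = Box::new(5);      // b owns a heap-allocated i32
        let r = &*b;              // r borrows that same i32; it owns nothing

        println!("{} {}", b, r);  // both read the same value

        // drop(b);               // uncommenting this won't compile:
                                  // b can't be destroyed while r is still used
        println!("{}", r);
    }                             // r going out of scope frees nothing;
                                  // b going out of scope frees the i32 exactly once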
They both point to the same place. However, r doesn't have ownership, so when r goes out of scope, the memory won't be freed. b does have ownership, so when b goes out of scope, the memory will be freed.
OK, but how does it link to where the value is allocated?
My question was, is it wrong to say that `b` isn't heap allocated either, since: «Here, b is on the stack (or in a register), but is pointing to something on the heap.»
It's not wrong: `b` itself isn't heap allocated, because the pointer is not stored on the heap (only the value it owns is). `&T`s can refer to something anywhere, heap or stack, and can also be anywhere, heap or stack. A Box<&i32> is going to have a &T on the heap.
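As a concrete sketch of that last point:

    fn main() {
        let x: i32 = 7;                          // x lives on the stack
        let boxed_ref: Box<&i32> = Box::new(&x); // the &i32 itself is stored on the heap
        println!("{}", **boxed_ref);             // heap -> stack: two hops to reach 7
    }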
It's a very important topic to understand to be productive in Rust.
My knowledge of C made learning Rust so much harder for me. It's really hard to stop thinking in pointers. While Rust's references are technically implemented as pointers, for the purpose of "fighting with the borrow checker" it makes more sense to think of them as read/write locks for regions of memory.
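Here's a rough sketch of that mental model (the error text in the comments is paraphrased, not the exact compiler output):

    fn main() {
        let mut v = vec![1, 2, 3];

        let read = &v;        // shared borrow: roughly a read lock
        // v.push(4);         // a write while a "read lock" is held is rejected:
                              // "cannot borrow `v` as mutable because it is
                              //  also borrowed as immutable"
        println!("{:?}", read);

        let write = &mut v;   // exclusive borrow: roughly a write lock
        write.push(4);        // fine: no other reader or writer exists right now
        println!("{:?}", v);  // the exclusive borrow has already ended here
    }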
Yeah, interestingly I think it's hard to understand what is going on with C because `T*` pointers can be used for many things. I found it easier to go to C after doing Rust because I had a deeper understanding of the semantics behind them. Was frustrating though because it never caught my mistakes!
I spent a whole day struggling with a bug because I equated &[T] with *T and thought you could cast one to the other (for interop with C++ code). It took me too long to figure out that &[T] is two words long, but now it's obvious. I'm not sure where I thought the "length" part was being stored.
For reference, the proper way to get a *const T from a &[T] is the .as_ptr() method. The way &[T] (and any other type not marked with #[repr(C)]) is laid out is implementation-defined, so accessing its internals isn't recommended.
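Something like this, for example (a sketch; the C function in the comment is hypothetical, so the unsafe call is left commented out):

    fn main() {
        let v: Vec<u8> = vec![1, 2, 3];
        let s: &[u8] = &v;

        // A &[T] is (data pointer, length); hand the two parts to C separately.
        let ptr: *const u8 = s.as_ptr();
        let len: usize = s.len();

        // Hypothetical C signature: void consume(const uint8_t *data, size_t len);
        // unsafe { consume(ptr, len) };
        println!("{:p}, {} bytes", ptr, len);
    }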
As someone who has just spent the last 15 minutes escaping to HN from Rust due to reference errors, this was amazingly useful and actually helped me fix the error I was getting.
If you ever find yourself stuck for too long, drop a code snippet into https://play.rust-lang.org/ and share a link to the code in the #rust or #rust-beginners IRC channels (irc.mozilla.org). Very friendly community.
I second this. I asked a bunch of dumb questions on IRC, Reddit, and the forums and every single time the responses were so patient and helpful.
I work at GitHub and I’ve been telling people that for the future of open source we really ought to be looking at the Rust community, both the amount of automation they have and also their general communication style.
> These 3 types all have equivalent reference types (again: a reference is a pointer to memory in an unknown place): &[T] for Vec<T>, &str for String, and &T for Box<T>.
This seems to accidentally imply that these reference types are for things on the heap, i.e., that &T is the borrowed equivalent of Box<T>, which is not true. All three of these reference types can point to memory not on the heap. The former two 'usually' don't, while the latter will vary wildly depending on the application.
&[T] are commonly created from stack allocated arrays, and &str are even more commonly created from read only string literals... so I don't think it's correct so say that those "usually" point to things on the heap. (But of course the definition of "usually" could vary, it wouldn't shock me to find out they did 60% of the time).
Or did you mean &T usually points to things on the heap, in which case I should just say it very very commonly points to stack allocated things as well.
> &[T] are commonly created from stack allocated arrays,
Really? I would say that in my typical Rust code &[T] is created from a heap-allocated array >90% of the time. Most functions that do not require ownership of an argument will use &[T] and not &Vec<T> (or perhaps S: AsRef<[T]>), since &[T] works for stack and heap memory and &Vec<T> is automatically converted to &[T] through Deref coercion.
E.g.:
    fn main() {
        let v = vec![1, 2, 3, 4, 5];
        blah(&v);
    }

    fn blah<T>(s: &[T]) {
        println!("{}", s.len());
    }
When you pass a `Vec<T>` directly to a non-mutating function or method not implemented on `Vec<T>` itself, you pass it as a `&[T]`. But more often I pass it as part of a struct, so it remains (indirectly) a `&Vec<T>`. However, pretty much whenever you use a stack allocated array you use it as a &[T], part of a struct or not. I'm sure I use a heap allocated &[T] more often, but I doubt it reaches 90%.
For &str you have to remember that every string literal in your program is one. When you do `some_String.starts_with("/mnt")`, `println!("hi there {}", name)`, etc you are using a new &str. I suspect most programs use more static strings than dynamic Strings (particularly since Rust isn't heavily used in GUIs yet).
> The most important thing about Rust (and the thing that makes programming in Rust confusing) is that it needs to decide at compile time when all the memory in the program needs to be freed.
> ...
> When the function blah returns, x goes out of scope, and we need to figure out what to do with its my_cool_pointer member. But how can Rust know what kind of reference my_cool_pointer is? Is it on the heap?
> ...
> If we knew that my_cool_pointer was allocated on the heap, then we would know what to do when it goes out of scope: free it!
The way this is written kind of seems to suggest that Rust will sometimes free heap memory when a reference to that memory goes out of scope, which I think is misleading.
As I understand it, this is not the case, and the point is just that Rust needs to be able to prove that nothing else freed the referenced heap memory at any point where the reference may be used.
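In other words, something like this (a sketch): the heap buffer is freed when the owner goes out of scope, never when a reference does.

    fn main() {
        let s = String::from("hello");  // s owns a heap-allocated buffer
        {
            let r = &s;                 // r merely borrows it
            println!("{}", r);
        }                               // r goes out of scope: nothing is freed
        println!("{}", s);              // s is still perfectly usable
    }                                   // s goes out of scope: the buffer is freed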
Great post! I appreciate the socratic style. I agree with other posters that stuff like this is important to be comfortable with when writing Rust, and more material like this blog post is fantastic. I think if I were to write a part 2 of this blog post, it would be about learning how to read Rust code such that you know what is a reference and what isn't, and more pointedly, when something is behind two references. These things are important for effectively using pattern matching among other things.
With that said, I'd like to add some advice by spring-boarding off a part of the post.
> Converting from a Vec<T> to a &[T] is really easy – you just run vec.as_ref(). The reason you can do this conversion is that you’re just “forgetting” that that variable is allocated on the heap and saying “who cares, this is just a reference”. String and Box<T> also have an .as_ref() method that converts to the reference version of those types in the same way.
While on the surface this is absolutely correct, there is a subtle point missing here: as_ref on Vec/String/Box is implemented as part of the AsRef[1] trait, which is _intended_ for use in generic programming. Aside from intent, practically speaking, using as_ref in a non-generic context can often be somewhat unergonomic, since depending on how you use it, it might require a type annotation (because it's generic!).
Where AsRef is useful is in making the types of parameters to functions a bit more liberal. One particularly convenient place where it's used in the standard library is for defining functions that accept file paths. For example, the type signature of the function that opens a file is[2]:
fn open<P: AsRef<Path>>(path: P) -> Result<File>
Basically, this function says that it accepts a parameter `path` with a type `P` that can be infallibly converted into a `Path`. Why is that convenient? Because lots of useful types implement `AsRef<Path>`. They include OsStr, Cow<'a, OsStr>, OsString, str, String, PathBuf, and of course, Path itself. This is what lets you write `File::open("foo/bar")`. Without the generic `AsRef<Path>` constraint, the signature would look like this:
fn open(path: &Path) -> Result<File>
Which would mean that you'd need to write something like `File::open(Path::new("foo/bar"))` instead.
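To illustrate the flexibility (a sketch; the path is made up, so each call will just return an Err on most machines):

    use std::fs::File;
    use std::path::{Path, PathBuf};

    fn main() {
        // All of these compile because each argument type implements AsRef<Path>:
        let _ = File::open("foo/bar");                 // &str
        let _ = File::open(String::from("foo/bar"));   // String
        let _ = File::open(Path::new("foo/bar"));      // &Path
        let _ = File::open(PathBuf::from("foo/bar"));  // PathBuf
    }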
So what's the alternative to using `as_ref` if I'm here poo-pooing it? In my experience, the typical thing to do here is to rely on something called deref. That is, if `s` is a `String` then `*s` is a `str` and `&*s` is a `&str`. In many cases, the explicit dereference (so that's `&s` instead of `&*s`) can be elided and the compiler will "auto-deref" for you. For example, given a function like the following
fn repeat(string: &str, count: u64) -> String
and a string `s` with type `String`, then
repeat(&s, 5)
will "just work." If you prefer the explicit, then I think the recommendation is to use type specific conversion methods. For `Vec<T>`, `as_slice` will give you a `&[T]`. For `String`, `as_str` will give you a `&str`.
OK, that's enough for now! This rabbit hole goes deeper, but I'll stop here. :)
> One question I have (that I think I will just resolve by getting more Rust experience!) is – when I write a Rust struct, how often will I be using lifetimes vs making the struct own all its own data?
If I were forced to give a pithy answer to this question, then I think I would say (predominantly from the perspective of a library writer): "It's a healthy mix, but if I don't care about performance for $reasons, I can usually ignore lifetimes in the types I define."
I recommend watching this excellent rustconf 2017 talk for more information; it heavily features information on how zero-sized types can be used: https://www.youtube.com/watch?v=wxPehGkoNOw
By opening with "Not true" you're establishing a contrarian position, which puts people -- likely the author, potentially even the reader -- on the defensive, emotionally.
It's sufficient and actually a lot nicer to simply state your point: e.g. "Zero-sized structs are quite useful too."
I agree from the author's point of view, but as a reader I enjoy contradiction and argumentation, because that's where I learn most. So when I see someone starting with `not true` or `I disagree`, I'm immediately interested in reading more. YMMV though.
> I know in Java you have boxed pointer versions of primitive types, like Integer instead of int. And you can’t really have non-boxed pointers in Java, basically every pointer is allocated on the heap.
That's not true. In Java, pointers can very well be allocated on the stack, but the objects that they point to will be on the heap.
So the article is pretty consistently misleading/incorrect. A pointer is a data structure like any other; in fact, Java is pass-by-value: the pointer values are copied when objects are passed as function arguments.
To me it seems it's just using different terminology than you expected. I've heard and used the article's version plenty of times and it generally works in context.
For me, the question in Rust is not, what's a reference. But how do I find all functions applicable to a given type? In C/C++, I can just grep the header files for the type name and voilà. I find header-less languages like Rust or Swift really obscure in that way.
Most crates have documentation available as well (generally linked directly from their entry on crates.io) and if it's not online for some reason you can just run "cargo doc" to generate it locally. Randomly taking the "image" crate as an example: https://docs.rs/image/0.17.0/image/
Who greps header files in 2017 (or even 2010)? Just fuzzy search a few characters that more or less look like what you want in your IDE's search box.
I still grep header, as well as implementation, files a lot.
I miss being able to fuzzy search sometimes, but I keep coming back to vim. IDEs just don't cut it for me. They are too slow (Visual Studio 2017 on my desktop from 2011 is unbearable for even starting a new project). And most things I really need to do - in vim they are a few memorized keypresses or a plain shell command in a Makefile away, while in IDEs I have to dig through wizards which really brings me out of the zone.
Not relying on API search much has the huge advantage of not relying on external APIs, which leads to good modularization. As a general rule, a module shouldn't call into other modules much.
And by the way it's the same for OOP: OOP has the advantage of supporting IDE member/method autocomplete (noun first syntax), but it's just the wrong mindset for me and leads to really broken architectures.
> Not relying on API search much has the huge advantage of not relying on external APIs, which leads to good modularization. As a general rule, a module shouldn't call into other modules much.
When writing Rust, you'll likely use the standard library a lot; this rule might not be as applicable as in other languages/environments.
Data structures (vectors, hashmaps, trees), I/O, etc. are all part of the stdlib, and their rich feature sets make an API reference essential. You can certainly write Rust without it, but you’d be missing out on a lot of useful functionality.
I'm a bit confused, can you be more specific as to what you're asking for? grepping for types works just as well for headerless languages as it does for C++, though "finding all functions applicable to a given type" can't be done via grep for either C++ or Rust given that generics exist.
Here, you have sections for: 'Methods', 'Methods from Deref<Target=[T]>' and 'Trait Implementations', and then it seems that if you look through all these sections, you can see everything that can be called on this type, highlighted in the same light brown colour.
It would be quite nice to get an alphabetically ordered list of just these method names, too.
Header diving certainly works well for some C/C++ codebases, but not all I've sadly discovered. The rough analog in Rust might be grepping rustdoc generated documentation, which should at least generally tell you what exists / is publicly exposed. Grepping the full source with extra filters like \bfn\b or \bpub fn\b might be another option.
Like C++, you can also (ab)use intellisense to find a lot of them as well. I should hack more on Visual Rust to improve the situation there...
Generally most inherent methods are listed in the same file as the type.
Trait implementations may bring in other methods and may be listed elsewhere, but C++ doesn't help with this either (C++ doesn't have traits, but there are common patterns that provide similar functionality).
Most folks use the autogenerated docs (cargo doc), which list all the methods. But also when reading code it's not hard to grep for impls.
So in comparison to C++, would it be correct to say that Box<T> is like unique_ptr<T>, Vec<T> is like vector<T>, and that references are the same in both languages?
Rust's Box and Vec are analogous to C++'s unique_ptr and vector, yes. But references in Rust really aren't anything like C++ references, given that Rust references 1) are first-class, 2) come in two varieties (mutable/exclusive and immutable/shared), 3) feature mechanically-checked lifetimes, and 4) will be two words in size (rather than one) if the underlying type is dynamically-sized.
What makes them more first-class than C++ references? E.g. in C++, given a type T, you can use `std::add_lvalue_reference<T>`, `std::remove_reference<T>`, overload on references, check if a type is a reference to another...
C++ reference types are first-class. But instances of reference types are not first-class values. References are not objects, in standard speak: they do not have a memory location, you can't take their address, and you can't pass them to functions (a reference parameter means passing a value by reference, not a reference by value). And so on. Rust references are more like C/C++ pointers or Java references in that they are actual values, and AFAIK Rust functions, like Java and C functions, are strictly pass-by-value.
I think they just meant that Rust references are like normal first-class generic types. E.g., you can nest them to get a &mut &T, since they behave more like a pointer in that regard.
C++ references on the other hand are more like modifiers of a type, e.g. you can have a T or a T&, but having a (T&)& does not make sense. (Outside of templates, where it gets folded down to a T&.)
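A short sketch of that nesting on the Rust side (the names here are made up):

    fn retarget(slot: &mut &str) {
        // Re-point the outer reference at a different str; nothing is copied.
        *slot = "second";
    }

    fn main() {
        let mut s: &str = "first";
        retarget(&mut s);       // a &mut to a &str: a reference to a reference
        println!("{}", s);      // prints "second"
    }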
> I’ve written a few hundred lines of Rust over the last 4 years, but I’m honestly still pretty bad at Rust and so my goal is to learn enough that I don’t get confused while writing very simple programs.
This makes me feel hopeless, as I'm only about to start using Rust in my hobby projects after reading the essential book chapters. I hope it's just excessive humility on her part? At the same time, I'm excited because if I commit myself to mastering such a language it can make me stand out. I still have an opportunity to be an early adopter, and have a head start in a promising new language.
A few hundred lines over the course of four years would imply that the author is idly dabbling with Rust rather than using it in anger. (It also implies that she's been using Rust since before its 1.0 release, which would probably make it harder to get a handle on modern Rust, as it changed significantly back then.) Trust me, it won't take you anywhere close to four years to get proficient in Rust. :)
Don't forget Stack Overflow! It doesn't have all the answers easy to google, but /u/shepmaster is doing an amazing job as a curator there. You usually get an answer in less than half an hour (assuming he's awake, but I'm not even sure he sleeps :p)
Really, just this year is the first time I've been using it for larger side projects. While I still run into some things with the borrow checker, I find I'm much better at predicting them and figuring out a strategy around them. Really, once you get through the book and are comfortable with the types, you just need to start working on something bigger. You will need to change things and refactor as your original ideas don't pan out, but you learn from it.