> std::vector uses constructors and destructors to create and destroy objects which in some cases can be significantly slower than memcpy().
This is precisely what vector::emplace() solves, and std::move should be faster than swap and pop.
Modern C++ has changed a lot; this article ignores the massive improvements added in C++11, 14, and 17.
> This is precisely what vector::emplace() solves, and std::move should be faster than swap and pop.
The whole swap-and-pop section weirded me out. Maybe I just don't know enough about C++, but saying that assignment (a[i] = a[n-1]) will call the destructor seems false.
As far as I know, the compiler should generate an implicitly defined copy assignment operator for these fixed-size PODs, and it should be as performant as memcpy.
But again, I don't have years and years of in-depth C++ experience, so I would be grateful if an expert could shed more light on this.
In theory erase could return a move iterator, meaning that you could omit the call to std::move. That wouldn't be backwards compatible, though, so it's not going to happen.
There is no erase that takes an index, so I assume that n = a.end(). Also it is missing a dereference:
    a[i] = std::move(*a.erase(a.end()-1));
but erasing the one-before-the-end returns the (new) end iterator, which obviously is not referenceable. In general, after calling erase, it is too late to access the erased element.
You want something like:
    #include <utility>  // for std::move

    template<class Container, class Iter>
    auto erase_and_return(Container&& c, Iter pos)
    {
        auto x = std::move(*pos);  // grab the element before it is erased
        c.erase(pos);
        return x;
    }
Also in the general case it doesn't make sense for erase to return a move iterator.
You are correct. A trivial copy assignment operator makes a copy of the object representation as if by std::memmove. All data types compatible with the C language (POD types) are trivially copy-assignable.
I assume you mean aligned on boundaries? I picked that up from https://en.cppreference.com/w/cpp/language/copy_assignment and it also says that memmove falls back to std::memcpy when there is no overlap between source and destination.
The article is just doing a generic cargo cult warning there. Not bad as a general C++ gotcha warning, but definitely incorrect in this specific case.
As per the author's constraints these are "POD types that are trivially memcpy-copyable", so by definition the copy constructors will never do anything. Much less "allocate memory" as the author claims.
> Please don't comment on whether someone read an article. "Did you even read the article? It mentions that" can be shortened to "The article mentions that."
My understanding is that the primary reason for game developers using custom libraries is not so much performance but a) historically, console compilers and especially standard libraries have been extremely buggy, and b) it is good to have a single implementation across platforms instead of having to deal with quirks and implementation divergence.
From what I've heard, there are two more major reasons to not use STL for gamedev.
- Debug build performance. Release builds of C++ code using the STL are generally pretty fast, but Debug builds suffer a lot (Visual Studio's std::vector implementation in particular is notoriously horrible in debug builds). Debug executable speed matters when you are debugging a game; you don't want to test your first-person shooter at 1 FPS!
- Build speed. Because of heavy use of templates and historical cruft, the STL slows down your build times a lot. The build-test cycle is very important when designing games; you don't want to wait a few hours after you've changed a few lines of code to tweak a new feature. Gigantic distributed build servers alleviate this problem a bit, but they are pretty cumbersome to set up nonetheless.
For MSVC, the debug checks are fairly customizable through judicious use of the appropriate debug macros. One can also enable optimizations with debug symbols, but the debugging experience can be jarring.
I'm not a game developer, but I have spent a decade doing C++ on Windows, and at a former employer we had several different debugging profiles depending on the severity/difficulty of reproducing/debugging an issue. Our "normal" debug profile had all of the debug checks in the std lib disabled, and we could only effectively debug our own code. Not sure if game developers don't do this, or if it's still not performant enough.
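As a rough sketch of what that looks like with MSVC: _ITERATOR_DEBUG_LEVEL is a real MSVC macro, but the exact configuration below is only illustrative. It has to be set before any standard header is included (ideally on the compiler command line) and must match across every binary you link.

    // Dial down MSVC's checked-iterator machinery in a debug build.
    // 2 is the debug default, 0 disables the per-access bookkeeping.
    #define _ITERATOR_DEBUG_LEVEL 0
    #include <cstddef>
    #include <vector>

    int main() {
        std::vector<int> v(1000);
        for (std::size_t i = 0; i < v.size(); ++i)
            v[i] = static_cast<int>(i);  // no iterator/bounds checks at level 0
    }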
One problem with using different debug macros in your debug build is that any libraries you link in must also be using the same flags. This is not necessarily possible for binary releases as they will assume certain standard library flags to exist in the debug builds (like iterator checking levels).
At work we don't use a debug build in the traditional sense, it's what you call a no-optimisations build where the code is compiled without most optimisations but otherwise the flags are the same as a release build. Some teams also go a step further and compile most of the code in release but some of their code with optimisations disabled.
> One problem with using different debug macros in your debug build is that any libraries you link in must also be using the same flags.
They don't have to be, but it certainly makes things a world easier. If the flags are not the same, you certainly have to be very careful about passing objects across DLL boundaries.
At the companies I've done C++ work at, we've always had the source for all non-C libs and compiled any C++ libs ourselves (except for Windows libs, but they also provide checked debug libs), so we could control the flags.
Are all those "best practices" valid for modern C++ ? I mean one statement says "Pass and return containers by reference instead of value.". This is in contradiction to modern C++ where you return containers by value and rely on copy-elison/RVO.
https://stackoverflow.com/questions/15704565/efficient-way-t...
The best way to tell is to try it the modern way and then look at the assembly code generated on something like godbolt.org. If it ends up being less efficient then you change it to accept a non-const reference to store the result in as a parameter instead.
Though if you'll be calling the same function repeatedly to accumulate content into a single container, it is far more efficient to have a function that takes an output reference rather than one that returns a new container. This results in fewer memory allocations, and you can also pre-allocate the size once before calling those functions.
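A minimal sketch of the two styles being compared (function names and bodies are just illustrative):

    #include <vector>

    // Return by value: relies on RVO/move; every call allocates its own buffer.
    std::vector<int> make_ids() {
        std::vector<int> out;
        out.push_back(42);  // ... fill out ...
        return out;         // NRVO or a cheap move, no deep copy
    }

    // Output parameter: the caller owns the buffer, can reserve once and
    // reuse it across many calls, so repeated accumulation allocates less.
    void append_ids(std::vector<int>& out) {
        out.push_back(42);  // ... push_back into out ...
    }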
On the tooling side, it might be nice if there were a way to annotate a function so that it creates a warning if the compiler cannot use copy elision for the return value. (To be honest, I haven't checked the documentation for this specific thing.)
The warning that I would want would trigger when someone changes the function and prevents or suppresses copy elision from happening. Like for example adding a check at the start of the function and returning a default container.
I'm not sure if C++14 or C++17 has fixed this, but if the object was not copy-constructible, the compiler would emit an error when it was returned by value, even if RVO/NRVO was meant to be used. I figure that's because semantically you still needed copy construction of the object to be possible.
IIRC it was changed in C++17. Now in the plain RVO case (returning a prvalue), copy elision is guaranteed, so no copy/move constructor is required (and in fact the compiler will not call one even if it exists). NRVO of a named local still needs an accessible copy or move constructor, though.
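A small sketch of what that guaranteed elision allows, assuming a C++17 compiler (the type and function names are made up for illustration):

    #include <mutex>

    struct Pinned {
        std::mutex m;  // makes the type non-copyable and non-movable
        Pinned() = default;
        Pinned(const Pinned&) = delete;
        Pinned& operator=(const Pinned&) = delete;
    };

    Pinned make_pinned() {
        return Pinned{};  // prvalue: constructed directly in the caller's storage
    }

    int main() {
        Pinned p = make_pinned();  // OK in C++17, ill-formed in C++14
        (void)p;
    }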
vector::emplace() still needs to construct the object; it just happens in place and avoids a redundant copy of an already constructed object. Same with std::move(). As such, the blog post is correct.
Using POD structs that can be zero-initialized and memcpy'd may indeed be faster, especially when these are bulk operations.
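For example, a bulk copy of a POD array can be a single memcpy rather than a per-element loop (a rough sketch; Vertex and snapshot are illustrative names):

    #include <cstring>
    #include <vector>

    struct Vertex { float pos[3]; float uv[2]; };

    // One bulk copy instead of per-element constructor/assignment calls.
    void snapshot(const std::vector<Vertex>& src, Vertex* dst) {
        std::memcpy(dst, src.data(), src.size() * sizeof(Vertex));
    }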
It's not clear from the article, but I suspect the author is talking about what happens when the vector is resized and has to move existing elements, which is a real problem.
That is correct, yet the compiler's definition of what is trivially copyable might be stricter than what you expect. For example, objects that are trivially relocatable can also be memcpy'd for reserve/realloc, but the compiler will not be able to figure that out on its own.
std::vector itself falls in this category: trivially relocatable, but definitely not trivially copyable. So a vector of vectors will not necessarily be able to use memcpy and will instead fall back to copy/move assignment. This is not very significant for performance with this type (a vector move being cheap), but it's a language gotcha nonetheless, as the move constructor will be called n times on every capacity change.
Since C++11, you can use template traits to determine if a type is trivially copyable, and even add static_asserts to your code to ensure future changes dont break expectations.
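Something along these lines, as a sketch (Particle is just an illustrative POD type):

    #include <cstdint>
    #include <type_traits>
    #include <vector>

    struct Particle { float pos[3]; float vel[3]; std::uint32_t flags; };

    // Fails to compile if a future change makes Particle non-trivially copyable.
    static_assert(std::is_trivially_copyable<Particle>::value,
                  "Particle must stay memcpy-able");

    // std::vector itself is not trivially copyable, as discussed above.
    static_assert(!std::is_trivially_copyable<std::vector<int>>::value,
                  "std::vector is not trivially copyable");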
Trivially copyable is a word of power (well, two words, I guess): its meaning is well defined and you can statically assert for it.
What unfortunately is not defined is (trivially) relocatable, as that's not a property that can safely be inferred, so it is not (yet) part of the standard. Some libraries still have this concept and require some sort of opt-in.
> So in general, if both push_back() and emplace_back() would work with the same arguments, you should prefer push_back(), and likewise for insert() vs. emplace().
That's an interesting point the tip makes. Is there guidance on how to use the C++17 form of emplace_back(), which returns a reference to the constructed element?
The reference-returning emplace_back() is used frequently in the code to construct a new struct element in place and then fill in its members, as opposed to creating a new struct and then calling push_back() to copy it in.
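A short sketch of that construct-then-fill pattern, assuming a C++17 compiler (the Enemy type is just for illustration):

    #include <string>
    #include <vector>

    struct Enemy { int hp = 0; std::string name; };

    int main() {
        std::vector<Enemy> enemies;
        Enemy& e = enemies.emplace_back();  // C++17: returns a reference to the new element
        e.hp = 100;
        e.name = "grunt";
    }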
No, the problem is that std::vector still calls the constructor and destructor of each and every object in the array at least once. This is a performance loss if they don't do anything - you have to rely on the compiler to inline the call and then remove the code. For POD data structures it can be significant, because those are usually the largest arrays in your application. This is why e.g. Facebook's Folly library detects POD types in its vector and doesn't call ctors and dtors at all.
Similarly, std::vector has to allocate more memory every time it has to grow and then copy all its contents, whereas for POD data types you can just use realloc, which can save copies.
These are all borderline microoptimisations, but they matter for realtime highly responsive software. Or just in general when you need to squeeze out every last bit of performance.
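For context, a very rough sketch of the realloc-based growth being described, only valid for trivially copyable element types (PodBuffer is an illustrative name, not a real library type; error handling and the formal object-lifetime rules are glossed over):

    #include <cstddef>
    #include <cstdlib>
    #include <type_traits>

    template <class T>
    struct PodBuffer {
        static_assert(std::is_trivially_copyable<T>::value,
                      "only trivially copyable types can be grown with realloc");

        T* data = nullptr;
        std::size_t size = 0;
        std::size_t capacity = 0;

        void push_back(const T& value) {
            if (size == capacity) {
                capacity = capacity ? capacity * 2 : 16;
                // realloc may extend the block in place; otherwise it copies the
                // bytes for us, with no per-element move/copy constructor calls.
                data = static_cast<T*>(std::realloc(data, capacity * sizeof(T)));
            }
            data[size++] = value;
        }

        ~PodBuffer() { std::free(data); }
    };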
std::move'ing a vector does not call the ctor/dtor of every element within the vector, but that might not be what you're referring to.
If you want an `A` struct/class, you'll call the ctor/dtor, that's true. But for POD types, if the ctor/dtor does nothing, they are trivial to inline and will incur no runtime overhead by any compiler nowadays.