Fast support for atomic types is indeed a good reason to do this sort of hackery - that's one of the reasons it is used in kernels. The performance difference between void* and int-specialized hashmaps/vectors/sets can be very significant (easily an order of magnitude, if not more, in some real cases).
I wish there were a language (not C++) that could help with this kind of thing. Unfortunately, it becomes difficult very fast. One interesting approach is K, as suggested by some FreeBSD hackers, but it never went into production AFAIK (http://wiki.freebsd.org/K)
I'd question how much of that impact comes from specializing the collection code, and how much of it comes from allocation and locality. The temptation with void* collections is to malloc everything, which is lethal for performance, and one of the very few non-bug things I've come across whose fix actually yields a 10x performance improvement.
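To make the allocation/locality point concrete, here is a rough sketch of the two layouts being compared. All names (ptr_vec, int_vec and their helpers) are made up for illustration and are not taken from any library; it only shows where the extra mallocs and the extra pointer hop come from, and makes no claim about the actual speed difference.

```c
#include <stdlib.h>
#include <stddef.h>

/* Generic flavour (hypothetical): the vector holds void*, so every
 * element lives in its own malloc'd block somewhere on the heap. */
struct ptr_vec {
    void **items;
    size_t len;
};

/* Specialized flavour (hypothetical): ints stored inline, contiguously. */
struct int_vec {
    int *items;
    size_t len;
};

/* Fill with 0..n-1, one malloc per element. Returns 0, or -1 on failure
 * (the caller can still call ptr_vec_free to release what was built). */
int ptr_vec_fill(struct ptr_vec *v, size_t n)
{
    v->len = 0;
    v->items = malloc(n * sizeof *v->items);
    if (v->items == NULL && n > 0)
        return -1;
    for (size_t i = 0; i < n; i++) {
        int *boxed = malloc(sizeof *boxed);
        if (boxed == NULL)
            return -1;
        *boxed = (int)i;
        v->items[v->len++] = boxed;
    }
    return 0;
}

/* Summing needs a pointer dereference per element. */
long ptr_vec_sum(const struct ptr_vec *v)
{
    long sum = 0;
    for (size_t i = 0; i < v->len; i++)
        sum += *(const int *)v->items[i];
    return sum;
}

void ptr_vec_free(struct ptr_vec *v)
{
    for (size_t i = 0; i < v->len; i++)
        free(v->items[i]);
    free(v->items);
    v->items = NULL;
    v->len = 0;
}

/* Fill with 0..n-1 using a single allocation. Returns 0, or -1 on failure. */
int int_vec_fill(struct int_vec *v, size_t n)
{
    v->len = 0;
    v->items = malloc(n * sizeof *v->items);
    if (v->items == NULL && n > 0)
        return -1;
    for (size_t i = 0; i < n; i++)
        v->items[v->len++] = (int)i;
    return 0;
}

/* Summing walks one contiguous block. */
long int_vec_sum(const struct int_vec *v)
{
    long sum = 0;
    for (size_t i = 0; i < v->len; i++)
        sum += v->items[i];
    return sum;
}

void int_vec_free(struct int_vec *v)
{
    free(v->items);
    v->items = NULL;
    v->len = 0;
}
```

Both versions compute the same thing; the first pays n+1 allocations and an indirection per access, the second pays one allocation and none. How much of any measured gap is due to each cost would have to be benchmarked on the workload in question.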
Memory allocation can indeed be amortized with a specialized allocator, but that's not what I had in mind. The context I usually operate in is numerical computation, and the indirection cost is very high in those cases, especially when a specialized allocator would otherwise let you access memory in blocks. It would be very hard to do well for sure. It is well known that the allocator is one of the main weaknesses of the STL (one of the reasons for the existence of the Electronic Arts STL).
I am certainly not advocating doing this in general - I think the need for atomic support in generic collections is quite low (I have been investigating the issue recently to add fast and generic support for sparse matrices in scipy). I am pretty sure the macro-based, specialized ones used in FreeBSD (tree/queue.h) and Linux (rbtree, list) have been benchmarked to hell, though, and would trust them more than most STL implementations.
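For readers who have not seen the macro approach, here is a minimal sketch of the general idea: a macro that stamps out a typed container per element type. This is NOT the queue.h/tree.h or Linux rbtree/list API (those are intrusive structures with their own macros); DEFINE_VEC, ivec and dvec are hypothetical names invented for this example.

```c
#include <stdlib.h>
#include <stddef.h>

/* Hypothetical macro: generates a growable array of `type` called
 * `struct name`, plus name_push() and name_free(). Elements are stored
 * inline, so there is no void* indirection and no per-element malloc.
 * name_push() returns 0 on success and -1 if realloc fails (the vector
 * is left unchanged in that case). */
#define DEFINE_VEC(name, type)                                  \
    struct name { type *data; size_t len, cap; };               \
    static int name##_push(struct name *v, type x)              \
    {                                                           \
        if (v->len == v->cap) {                                 \
            size_t cap = v->cap ? v->cap * 2 : 8;               \
            type *p = realloc(v->data, cap * sizeof *p);        \
            if (p == NULL)                                      \
                return -1;                                      \
            v->data = p;                                        \
            v->cap = cap;                                       \
        }                                                       \
        v->data[v->len++] = x;                                  \
        return 0;                                               \
    }                                                           \
    static void name##_free(struct name *v)                     \
    {                                                           \
        free(v->data);                                          \
        v->data = NULL;                                         \
        v->len = v->cap = 0;                                    \
    }

/* One instantiation per element type, each with its own typed API. */
DEFINE_VEC(ivec, int)
DEFINE_VEC(dvec, double)
```

Usage would look like `struct ivec v = {0}; ivec_push(&v, 42); ... ivec_free(&v);`. The compiler sees concrete types everywhere, which is what gives the specialized version its type checking and its inline storage, at the price of macro-heavy code that is harder to debug.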
Seriously, why not C++? It's just C with some stricter type checking. Pretend it's C, then judiciously grab a pair of angle brackets when you need them. No macros required.
C++ has portability issues (where portability does not just mean supporting g++ and MSVC), poor interoperability, and is much harder to maintain in a distributed team of people with varying ability in the language (e.g. open source).
When those are not issues, C++ is appropriate. Otherwise, it is a pain.