Show HN: Flexible C memory allocation scheme with leak checking (github.com/mulle-nat)
52 points by mulle_nat on Oct 14, 2016 | 20 comments



1) The documentation doesn't show what file needs to be included; I think it's "mulle_allocator/mulle_allocator.h"

2) It can be convenient, in mulle_allocator.h, to have a define: #define malloc mulle_malloc. That way you don't actually need to dig through the code and replace anything.

3) If you're doing #defines, you can do something like this: #define malloc(a) mulle_malloc(a, __LINE__, __FILE__), which will let you keep track of the line and file where the memory was allocated (see the sketch after this list).

4) If you're on Linux (which C programmers often are not), then mtrace() works really nicely for this.

5) It's always better to have more options for memory debugging, so good job!
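As a rough sketch of the idea in point 3, assuming a hypothetical tracking wrapper (the real mulle_malloc takes only a size, so the three-argument call here is purely illustrative):

    #include <stddef.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* hypothetical wrapper: records the call site, then forwards to the real malloc */
    static void   *tracking_malloc( size_t size, int line, char *file)
    {
       fprintf( stderr, "malloc( %zu) at %s:%d\n", size, file, line);
       return( malloc( size));
    }

    /* define this AFTER the wrapper above, so the wrapper's own malloc() call
       still reaches the real allocator; every later malloc() call is tracked */
    #define malloc( a)   tracking_malloc( (a), __LINE__, __FILE__)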


I will update the documentation with the suggestion. I would not want to place the malloc #define in mulle_allocator.h though. First, it's too surprising. And second, it defeats the possibility of isolating certain memory operations.


This seems to be a malloc()/free() wrapper that keeps track of malloc()ed/realloc()ed/free()d pointers. I see that it's been submitted by the author, so one hopefully helpful comment: it would be good to highlight why one would use this instead of one of the well-established alternatives (Boehm GC, Valgrind, ASan).

The "run after every test case" example actually gives a fairly good reason, but note that realistic code bases probably have a few caches, freelist-optimizations or other exceptions to "deallocate all memory after every test".


> realistic code bases probably have a few caches, freelist-optimizations or other exceptions to "deallocate all memory after every test".

If an open source library has memory allocated at finish, to me that's a warning to find a different project.

If it's something I get assigned at work, I still have to work on it; but it lets me know I'll probably be cleaning up many other things as well: the programmers who came before were sloppy.


> If an open source library has memory allocated at finish, to me that's a warning to find a different project.

You need to read this:

https://news.ycombinator.com/item?id=8305283

Basically, it's faster to simply exit and not try to walk data structures deallocating everything.


I think one should differentiate between app/tool code and library/plugin/daemon code.

A typical tool like git starts, runs for a while, and then exits. In that case, worrying too much about memory cleanliness is counterproductive.

This changes when you are writing a data structure, say a hashtable: you don't want a hashtable that leaks during operation. It changes when you write a daemon that runs for a long time: you don't want it to consume the entire address space. And it changes when you write plugins that get repeatedly loaded and unloaded.


That rarely becomes an issue, and should not be optimized for by default.


Doing work that makes your library slower so that certain primitive memory checkers don't get upset doesn't seem like a good use of time to me. Especially when more sophisticated tools (Valgrind at least, probably others) can differentiate between actually leaked memory and stuff that's still reachable at program termination. Obviously there are scenarios where this is not appropriate, but it seems like a reasonable choice in many cases.


If your own code is clean other than de-allocating at exit, then I'm fine with that.

If not, I'm asking you to kindly leave the industry, and not inflict your messy code on others.


What's the advantage to deallocating on exit?


The good feeling of having cleaned your house before it gets sucked into a black hole :)


You know your code is clean.


Hi, BitKeeper guy here (BK predates git, first distributed SCM). I built what I'm sure is a less good version of this; I called it purify. You included purify.h, and it warned you about everything it could by redefining malloc/free/etc. to versions that kept track of allocations. So there is some overlap with the system posted here.

We were very careful to free everything before exit. It just felt cleaner.

It was also a bad idea. Until a few years ago, BK used an ASCII file format, which meant it read in the graph with all the metadata, parsing that, malloc-ing each string, each struct, etc. At exit we carefully freed all that.

Why was that a bad idea? If you wanted to link BK into your long-lived IDE or wanted it to be a library, then it was a very good idea. But we never did that, and BK doesn't really want to be used that way. So it became a performance problem. Consider:

bk changes -1 # same as git log -1

That used to parse the entire changeset graph, build an in-memory version of it, find the tip (HEAD), print out the comment, then recursively free everything.

All for the holy grail of "knowing your code is clean". Clean is good unless it is pointless.

Another way to say the same thing: you could claim that all code should be written to be thread-safe, even if you never multithread it. That's a lot of work; maybe not so bad for people who are good at threading, but it's work for most people (I'm a former Sun kernel hack, I get threads, it's not that hard once you get over the hump, but a lot of people don't). If you have no intention of multithreading the code, then making it thread-safe is just "clean code" masturbation.

I'm all for clean code where you are going to reap some value. In our case we did all that clean-code memory management and eventually said screw it: we changed the file format so we could mmap() it and page it in (read/write-protect everything, take the fault, and have the fault handler page it in). No alloc, no free, way better performance.
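For the curious, the mmap-and-fault-handler idea looks roughly like this. This is only a Linux-flavored illustration of the technique, not BitKeeper's code, and note that mprotect() is not formally async-signal-safe even though this pattern is widely used:

    #define _POSIX_C_SOURCE 200809L
    #include <fcntl.h>
    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    static char     *base;       /* start of the PROT_NONE mapping */
    static size_t   length;      /* size of the mapped file        */
    static long     pagesize;

    /* grant read access to just the page that faulted, then let the access retry */
    static void   fault_handler( int sig, siginfo_t *info, void *ctx)
    {
       char   *addr = info->si_addr;

       (void) sig;
       (void) ctx;
       if( addr < base || addr >= base + length)
          _exit( 1);                             /* a genuine wild pointer */
       addr = base + ((addr - base) / pagesize) * pagesize;
       mprotect( addr, pagesize, PROT_READ);     /* "page it in" on demand */
    }

    int   main( int argc, char **argv)
    {
       struct sigaction   sa;
       struct stat        st;
       int                fd;

       if( argc < 2)
          return( 1);
       fd = open( argv[ 1], O_RDONLY);
       if( fd < 0 || fstat( fd, &st) < 0 || st.st_size == 0)
          return( 1);

       pagesize = sysconf( _SC_PAGESIZE);
       length   = (size_t) st.st_size;
       base     = mmap( NULL, length, PROT_NONE, MAP_PRIVATE, fd, 0);
       if( base == MAP_FAILED)
          return( 1);

       memset( &sa, 0, sizeof( sa));
       sa.sa_sigaction = fault_handler;
       sa.sa_flags     = SA_SIGINFO;
       sigaction( SIGSEGV, &sa, NULL);

       /* touching the mapping faults pages in lazily: no alloc, no free */
       printf( "first byte: %c\n", base[ 0]);

       munmap( base, length);
       close( fd);
       return( 0);
    }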

All those years of clean code bought us nothing. I get that if you are making a library, the rules about cleanliness are very, very different. But for an app that is going to run and exit, don't push cleanliness down people's throats; it isn't helping in a lot of cases.


Upvoted; thanks for taking the time!


This seems like cargo-cultism. How is doing something redundant and potentially harmful just to conform to a general guideline any cleaner than exercising judgement?


I'm intrigued that you feel so strongly about this. I therefore refer you to this comment: https://news.ycombinator.com/item?id=12710990


An advantage over Valgrind, for example, is that the same test works on Windows, Linux, and OS X without having to interface with different tools. The use of the "allocator" struct enables you to isolate the code under test, so that other caches do not obscure the result. The thing is, it's not just a wrapper, but also a decoupler.


How do you redirect strdup() et al to your malloc? What's wrong with malloc hooks?

http://www.gnu.org/software/libc/manual/html_node/Hooks-for-...
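(For context, the hook mechanism from that GNU manual page looks roughly like the sketch below. It was deprecated for years and removed in glibc 2.34, so treat this as historical; the logging body here is just an example of what a hook might do.)

    #include <malloc.h>
    #include <stdio.h>
    #include <stdlib.h>

    static void   *(*old_malloc_hook)( size_t, const void *);

    /* log every allocation, temporarily restoring the old hook so that
       printf's own allocations don't recurse back into this hook */
    static void   *my_malloc_hook( size_t size, const void *caller)
    {
       void   *result;

       __malloc_hook = old_malloc_hook;
       result        = malloc( size);
       printf( "malloc( %zu) from %p returns %p\n", size, (void *) caller, result);
       old_malloc_hook = __malloc_hook;
       __malloc_hook   = my_malloc_hook;
       return( result);
    }

    int   main( void)
    {
       char   *p;

       old_malloc_hook = __malloc_hook;     /* install the hook globally */
       __malloc_hook   = my_malloc_hook;

       p = malloc( 16);                     /* logged by the hook */
       free( p);

       __malloc_hook = old_malloc_hook;     /* restore */
       return( 0);
    }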


Well, for strdup there is mulle_strdup. Who is Al? :) There is nothing wrong with malloc hooks, but they are global and you can't isolate on a per-data-structure basis.

Also, when you call malloc you know it's malloc, but a call to mulle_allocator_malloc could end up somewhere else (your own memory manager). That's the decoupling aspect.
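A minimal sketch of that decoupling: a data structure that takes a struct mulle_allocator * instead of calling malloc/free directly, so a test can hand it a checking allocator while the rest of the program keeps the default. The header path comes from the first comment, and mulle_allocator_malloc is mentioned above; mulle_allocator_free and the exact signatures are assumptions here, not quoted from the library.

    #include <stddef.h>
    #include <mulle_allocator/mulle_allocator.h>

    struct my_buffer
    {
       struct mulle_allocator   *allocator;   /* remembered for later frees */
       char                     *bytes;
    };

    void   my_buffer_init( struct my_buffer *p,
                           size_t size,
                           struct mulle_allocator *allocator)
    {
       p->allocator = allocator;
       p->bytes     = mulle_allocator_malloc( allocator, size);
    }

    void   my_buffer_done( struct my_buffer *p)
    {
       mulle_allocator_free( p->allocator, p->bytes);
    }

In a test you would pass in the leak-checking allocator; in production code, the default one.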


We use something similar to this for object creation/deletion too. We set up macros called watchallocation and watchdelete that keep track of when objects get new'ed and deleted, so we can track memory leaks. In release mode they just vanish. Quite helpful.
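Roughly what such debug-only macros might look like, using the names from this comment; the counter "registry" is invented here for illustration:

    #include <stdio.h>
    #include <stdlib.h>

    #ifndef NDEBUG

    static long   live_objects;     /* crude registry: just a counter here */

    # define watchallocation( p)   (++live_objects, \
                                    fprintf( stderr, "+ %p %s:%d\n", (void *) (p), __FILE__, __LINE__))
    # define watchdelete( p)       (--live_objects, \
                                    fprintf( stderr, "- %p\n", (void *) (p)))
    #else
    # define watchallocation( p)   ((void) 0)    /* compiles away in release mode */
    # define watchdelete( p)       ((void) 0)
    #endif

    struct thing   { int   x; };

    int   main( void)
    {
       struct thing   *t = malloc( sizeof( *t));

       watchallocation( t);
       watchdelete( t);
       free( t);
    #ifndef NDEBUG
       fprintf( stderr, "leaked objects: %ld\n", live_objects);
    #endif
       return( 0);
    }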



