Basics of Memory Addresses in C

glimcat · on Aug 18, 2012

The fastest way to grok memory issues is to do some work in Assembly. Few programmers will use Assembly for regular work, but the intuition it fosters will serve you everywhere.

p4bl0 · on Aug 18, 2012

I think this is very good advice, but for me it was the other way around: using assembly was easy when I had to, because I already grokked pointers from C.

I remember in a programming language course we had to write a compiler for a subset of OCaml to MIPS assembly. I had literally no trouble adding support for references, while for others who did not grok C pointers (or had never programmed in C), it was less easy.

malkia · on Aug 18, 2012

For MSVC:

cl /FAsc source.c/source.cpp gives you assembly right there from the command line. For gcc: gcc -S

For dumping disassembly: DUMPBIN, or LINK /DUMP for MSVC, objdump -d/-D for binutils based (gcc, and others).

Debuggers: WinDBG (free), OllyDbg (Windows), gdb/ddd even WinGDB (not free)

cynwoody · on Aug 18, 2012

Absolutely.

First, learn assembly. Pick a machine architecture — it's not that important. For extra credit, pick multiple machine architectures and do something in each.

Then learn Lisp.

heretohelp · on Aug 18, 2012

Then learn Forth.

And Morse code.

You'll need Haskell for dat Damas-Milner sexiness.

Wait wait, we need to pile on some Ruby and JavaScript, you don't want to be some sort of weirdo that can't make a web app do you?

Hrrrrm. Yep, need some APL. With Unicode.

That'll really hammer the concatenative into your brain.

derleth · on Aug 18, 2012

> The fastest way to grok memory issues is to do some work in Assembly.

And get some footing in OS design. Because, ultimately, your program's entire view of memory is just another API it uses to communicate with the OS.

tjoff · on Aug 18, 2012

I appreciate the distinction between arrays and pointers, but the article fails to mention a similar pitfall: the assumption that the size of a struct is the sum of the sizes of its members, which isn't necessarily true.

It's like how everyone learns that (INT_MAX + 1) == INT_MIN (even non-developers seem to know this), yet it is actually undefined behavior in C/C++. I feel that just noting that it isn't the whole truth (such as noting that the OS handles the memory behind your back) is quite valuable, even when learning the basics.

Otherwise you might end up feeling, as I do, that your foundation is shaky and built up on lies - not really knowing what "facts" you can trust.
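
To make the overflow point concrete, here is a minimal sketch of adding two ints without ever evaluating an expression that overflows. The helper name `checked_add` is hypothetical and not something from the article; it only illustrates testing the range before doing the arithmetic.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores a + b in *out and returns true only when the
   sum fits in an int. The range test runs before the addition, so the
   undefined signed overflow is never evaluated. */
bool checked_add(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *out = a + b;
    return true;
}
```

A caller would check the return value first and read `*out` only when it is true, instead of relying on the sum wrapping around.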

tolos · on Aug 27, 2012

I went through this learning experience with something similar to a struct with 32-bit items, a 16-bit item, then a 64-bit item. The 64-bit item was being aligned to a friendly 32-bit address, which left 16 bits of unassigned memory between the 16-bit and 64-bit items.
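
A rough way to see this kind of padding for yourself is sketched below. The struct `record` and both function names are made up for illustration, and the exact offsets depend on the compiler and ABI (a 64-bit target will often leave more than 16 bits before the last member), so no particular numbers are promised.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical layout loosely modeled on the description above. */
struct record {
    uint32_t a;
    uint32_t b;
    uint16_t c;
    uint64_t d;
};

/* Sum of the member sizes, ignoring any padding the compiler adds. */
size_t record_member_bytes(void)
{
    return sizeof(uint32_t) * 2 + sizeof(uint16_t) + sizeof(uint64_t);
}

/* Prints where the last two members actually land and the total size.
   The values are ABI-dependent. */
void print_record_layout(void)
{
    printf("offsetof(c) = %zu\n", offsetof(struct record, c));
    printf("offsetof(d) = %zu\n", offsetof(struct record, d));
    printf("sizeof      = %zu (members alone: %zu)\n",
           sizeof(struct record), record_member_bytes());
}
```

Calling `print_record_layout()` and comparing the two sizes shows how many bytes went to padding on your platform.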

sswezey · on Aug 18, 2012

One thing to note:

Maybe you should explain a little more why the first element of an array has the same memory address as the array itself, and relate that to why array indexing is 0-based: the index is the number of elements to step from the beginning of the array.
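
Something like the following sketch might help show it. Both helper names are hypothetical; the only fact relied on is that C defines `a[i]` as `*(a + i)`.

```c
#include <stddef.h>

/* Hypothetical helper: reads element i by explicit pointer arithmetic.
   a[i] is defined as *(a + i), so index 0 is the start of the array. */
int element_at(const int *a, size_t i)
{
    return *(a + i);
}

/* Hypothetical helper: distance in bytes from the start of the array to
   element i, which works out to i * sizeof(int). */
size_t byte_offset(const int *a, size_t i)
{
    return (size_t)((const char *)&a[i] - (const char *)a);
}
```

With `int v[4]`, `byte_offset(v, 0)` is 0, which is just another way of saying the first element sits at the array's own address.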

dschatz · on Aug 18, 2012

Nice introduction. I think it is worth pointing out that much of what you discuss is implementation dependent: the C standard doesn't require an implementation to lay out data in memory in any particular way. Instead, it requires that access semantics behave in a particular way. These semantics, in turn, align with easy, low-level implementations.

justincormack · on Aug 18, 2012

Er yes, virtual memory is an allowed implementation!

nemetroid · on Aug 18, 2012

I like the style, but it was basics indeed. I look forward to the following parts. Something that initially confused me about sizeof on arrays is the somewhat deceptive parameter form `char s[]`.
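
A small sketch of what I mean, with made-up function names. The parameter looks like an array but is adjusted to `char *`, so `sizeof` reports the size of a pointer; many compilers warn about exactly this line.

```c
#include <stddef.h>

/* Despite the brackets, s is adjusted to char * here, so sizeof yields the
   size of a pointer, not of the caller's array. */
size_t size_seen_by_callee(char s[])
{
    return sizeof s;
}

/* In the scope that declares the array, sizeof sees the full array type. */
size_t size_seen_by_owner(void)
{
    char buf[100];
    return sizeof buf;
}
```

Passing a `char buf[100]` to `size_seen_by_callee` gives `sizeof(char *)`, while `size_seen_by_owner` gives 100.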

denniskubes · on Aug 18, 2012

Yeah, I agree it is basic. I needed to lay out some of these concepts before doing the deep dive on arrays. I'm writing that post now; it will be out in the next couple of days.

p4bl0 · on Aug 18, 2012

"Arrays are not pointers". Right. And wrong, actually.

The article says that arrays are different from pointers, but it does not prove it. It is quite simple to prove, see the program below.

Also, it's not interesting to limit the definition of arrays to just the locally and statically declared ones. If you do that, then something like 90% of C programs (if not more; I think I never wrote such a C program except for exercises in class) don't use arrays at all. In all the other cases (arrays passed as arguments to a function, dynamically allocated arrays…), they are the same as pointers. Again, see the program below.

In reality, it is a bit pedantic to insist on this distinction, except for the rare case where it is a performance issue (the arrays of the article require one less memory access, the one to get the address of the memory at which the array starts).

    #include <stdio.h>
    #include <stdlib.h>
    
    /* %p expects a void *, so every address is cast before printing. */
    void
    f (char a[], char *b, char *c)
    {
      printf("Once passed to a function as arguments:\n\n");
    
      /* As a parameter, char a[] is adjusted to char *a. */
      printf("What the article limits the definition of array to:\n");
      printf("&a      = %p\n", (void *) &a);
      printf("a       = %p\n", (void *) a);
      printf("&(a[0]) = %p\n", (void *) &(a[0]));
      printf("a + 1   = %p\n", (void *) (a + 1));
      printf("&(a[1]) = %p\n", (void *) &(a[1]));
      printf("\n");
    
      printf("Pointer to an array:\n");
      printf("&b      = %p\n", (void *) &b);
      printf("b       = %p\n", (void *) b);
      printf("&(b[0]) = %p\n", (void *) &(b[0]));
      printf("b + 1   = %p\n", (void *) (b + 1));
      printf("&(b[1]) = %p\n", (void *) &(b[1]));
      printf("\n");
    
      printf("Pointer to dynamically allocated memory:\n");
      printf("&c      = %p\n", (void *) &c);
      printf("c       = %p\n", (void *) c);
      printf("&(c[0]) = %p\n", (void *) &(c[0]));
      printf("c + 1   = %p\n", (void *) (c + 1));
      printf("&(c[1]) = %p\n", (void *) &(c[1]));
      printf("\n");
    }
    
    int
    main (void)
    {
      char a[4];
      char *b = a;
      char *c = malloc(sizeof(*c) * 4);
    
      /* Bail out if the allocation failed. */
      if (c == NULL)
        return 1;
    
      printf("What the article limits the definition of array to:\n");
      printf("&a      = %p\n", (void *) &a); /* behavior differs only here, this
                                                is the difference with pointers */
      printf("a       = %p\n", (void *) a);
      printf("&(a[0]) = %p\n", (void *) &(a[0]));
      printf("a + 1   = %p\n", (void *) (a + 1));
      printf("&(a[1]) = %p\n", (void *) &(a[1]));
      printf("\n");
    
      printf("Pointer to an array:\n");
      printf("&b      = %p\n", (void *) &b);
      printf("b       = %p\n", (void *) b);
      printf("&(b[0]) = %p\n", (void *) &(b[0]));
      printf("b + 1   = %p\n", (void *) (b + 1));
      printf("&(b[1]) = %p\n", (void *) &(b[1]));
      printf("\n");
    
      printf("Pointer to dynamically allocated memory:\n");
      printf("&c      = %p\n", (void *) &c);
      printf("c       = %p\n", (void *) c);
      printf("&(c[0]) = %p\n", (void *) &(c[0]));
      printf("c + 1   = %p\n", (void *) (c + 1));
      printf("&(c[1]) = %p\n", (void *) &(c[1]));
      printf("\n");
    
      f(a, b, c);
    
      free(c);
      return 0;
    }

Here is a possible output of this program:

    What the article limits the definition of array to:
    &a      = 0x7fff5590ff00
    a       = 0x7fff5590ff00
    &(a[0]) = 0x7fff5590ff00
    a + 1   = 0x7fff5590ff01
    &(a[1]) = 0x7fff5590ff01
    
    Pointer to an array:
    &b      = 0x7fff5590fef8
    b       = 0x7fff5590ff00
    &(b[0]) = 0x7fff5590ff00
    b + 1   = 0x7fff5590ff01
    &(b[1]) = 0x7fff5590ff01
    
    Pointer to dynamically allocated memory:
    &c      = 0x7fff5590fef0
    c       = 0x202f010
    &(c[0]) = 0x202f010
    c + 1   = 0x202f011
    &(c[1]) = 0x202f011
    
    Once passed to a function as arguments:
    
    What the article limits the definition of array to:
    &a      = 0x7fff5590fec8
    a       = 0x7fff5590ff00
    &(a[0]) = 0x7fff5590ff00
    a + 1   = 0x7fff5590ff01
    &(a[1]) = 0x7fff5590ff01
    
    Pointer to an array:
    &b      = 0x7fff5590fec0
    b       = 0x7fff5590ff00
    &(b[0]) = 0x7fff5590ff00
    b + 1   = 0x7fff5590ff01
    &(b[1]) = 0x7fff5590ff01
    
    Pointer to dynamically allocated memory:
    &c      = 0x7fff5590feb8
    c       = 0x202f010
    &(c[0]) = 0x202f010
    c + 1   = 0x202f011
    &(c[1]) = 0x202f011

As we can see, in practical use arrays and pointers can really be seen as the same thing. So again, unless you are optimizing a program where you can statically declare your arrays and access them a lot (e.g., you are doing matrix multiplication), the difference between arrays and pointers does not really matter.

_kst_ · on Aug 18, 2012

Thank you for making it clear that arrays are not pointers!

I posted a few comments on the site.

iopuy · on Aug 18, 2012

Easy to follow, well written, can't wait for the follow-ups. Thanks!