Hacker News
Why C Is Not Assembly (james-iry.blogspot.com)
93 points by Scriptor on Sept 14, 2010 | hide | past | favorite | 54 comments



I suspect this guy never compiled the code he posted. Not that his claim was completely wrong, but gcc-4.4.4, gcc-4.5.1, and clang-1.1 (LLVM 2.7) all fail to vectorize that snippet. With `gcc-4.5.1 -O3 -ftree-vectorize -ftree-vectorizer-verbose=9`, I get

    note: not vectorized: number of iterations cannot be computed.
    note: bad loop form.
    note: vectorized 0 loops in function.
However, gcc-4.5.1 and gcc-4.4.4 vectorize it when the operation is written as a standard `for` loop. Clang never vectorizes it.
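For readers without the article at hand, a hypothetical reconstruction (not the article's exact snippet) of the loop shape the vectorizer of that era could handle is a plain counted `for` over arrays:

```c
#include <stddef.h>

/* Hypothetical reconstruction, not the article's snippet: a counted for
   loop whose trip count the compiler can compute up front, which is what
   gcc's -ftree-vectorize needs to see.  restrict promises no aliasing. */
void add_arrays(float *restrict dst, const float *restrict a,
                const float *restrict b, size_t n) {
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
}
```

With a pointer-chasing loop or an uncomputable trip count, the vectorizer bails out with the "number of iterations cannot be computed" note quoted above.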


Try -march=<cpu>


Makes no difference because it's not the issue.


The author is not a moron but he is also not correct. Whether or not he thinks this is the best practice, ANSI C is not what is written in most of the important C systems applications. GCC intrinsics, Intel intrinsics and extensions, Nvidia extensions, inline assembler -- these are all things which you will find in most (if not all) of the very performant and useful C systems programs. Non-ANSI C is almost the de facto implementation in systems development and is mostly what people think of when they say "C is high level assembly." Moreover, people have to realize that this will never change unless intrinsics fall out of favor. For example, how would you begin to define a VMENTER intrinsic in the ANSI C standard? The author spends a lot of time railing against undefined behavior, but there is a very big difference between undefined behavior and implementation defined behavior. Simply because someone whips out GCC and says "see? GCC does it correctly" doesn't mean they're wrong unless they're only talking about undefined behavior.

So I guess my perspective is that there are two different types of people using C: people who are writing systems applications and people using ANSI C. Personally, I would guess that the first group is much, much larger.


Sigh...pedantry at its finest. From TFA:

> Most real C implementations will go some distance beyond the standard(s), of course, but I have to draw the line somewhere.

You missed the guy's point, which was essentially that C code is munged and transformed by a good optimizing compiler, whereas assembly is left untouched by a good assembler.


No, I don't think I did. I helped to write an optimized operating system kernel and virtual machine monitor for high performance computing. We absolutely treated C as high-level assembly. C provided a cost-saving measure in that we didn't have to run off customized asm blocks to do trivial one-liners that would be the same in the VMM implementation but slightly different in object code (two different opcodes). Not only is this common practice in academic and production systems development, but it is also the recommended practice from most of my colleagues who added the intrinsics for their subsystems (Intel and Microsoft being my personal experiences).

> You missed the guy's point, which was essentially that C code is munged and transformed by a good optimizing compiler, whereas assembly is left untouched by a good assembler.

That's irrelevant. They are both restricted on the basis of being functionally equivalent to the input code.


The fact that C can be relatively easily and predictably converted into reasonably efficient assembly doesn't make it an assembler.

All compiled languages are "restricted on the basis of being functionally equivalent to the input code", unless you meant something by that statement that I don't understand. The reason I'm uncertain is because I wouldn't presume that you are asserting that all deterministically compiled, correct compilers are high-level assembly languages.

I would add to what he writes in the article, however (and I did in a comment there): C doesn't model the von Neumann architecture very well. It lets you create data structures just fine, but the capability to create code, dynamically, at runtime, surely the defining characteristic of a shared instruction-data architecture, is all but completely absent.
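A sketch of how far ISO C actually gets (all names here are mine, purely illustrative): data can be built at runtime, but "code" can only be selected among functions compiled ahead of time, via function pointers.

```c
/* Illustrative sketch: ISO C builds data structures at runtime freely,
   but the closest it comes to runtime code creation is dispatching over
   functions that already exist, e.g. through a function-pointer table. */
static int add_op(int a, int b) { return a + b; }
static int mul_op(int a, int b) { return a * b; }

typedef int (*binop)(int, int);

int apply(int which, int a, int b) {
    binop table[] = { add_op, mul_op };  /* "dynamic" choice, static code */
    return table[which](a, b);
}
```

Actually generating new instructions at runtime requires non-standard machinery (executable memory, cache flushes) entirely outside the language.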


Perhaps I've been unclear. C has characteristics of high-level assembly language because it allows you to integrate machine opcode transformations directly into the syntax of the language itself instead of hiding them in the compiler/interpreter. In other words, C has an element of predictability in the opcodes it will generate for a given architecture. The exact opcodes it generates (SIMD, etc.) aren't extremely important as long as this capability is preserved.

> The fact that C can be relatively easily and predictably converted into reasonably efficient assembly doesn't make it an assembler.

Never said it was.


@barrkel

You make some good points and there is one which I thought about but didn't touch on because I didn't think it was important.

Consider: most systems and kernel developers do not write C. That is, they do not simply target "the C programming language." Instead, they often target GCC or the Intel C compiler. Even more, they usually target an architecture. In theory one may have to deal with many of the complexities of different implementations, but in practice development is usually restricted (or sometimes duplicated). I think that you may be taking my position a bit too concretely: I don't support the statement that all opcode output is predictable, nor do I support the statement that no other language has elements of the same techniques which make C so useful for systems development.

Simply, view my post as a (possibly slightly exaggerated) relation of the systems C development process from one low-level kernel developer to another. From the bird's-eye view (which is the feeling I get from the original post) C may look abstracted and neat, but my experience says that the opposite is in fact true and that things tend to get a lot dirtier in the details.


I don't think C is necessarily predictable in the opcodes it will produce for any given architecture. The code produced for a big switch statement will differ hugely between compilers for the same architecture; some use binary searches, some use hash tables, some use multiple indirect jumps, etc.
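As a small illustration (mine, not from the thread): nothing in the source of a switch dictates its lowering, so two compilers for the same architecture can legitimately produce very different code for it.

```c
/* The lowering of this switch is entirely up to the compiler: a jump
   table, a chain of compares, or a binary search are all fair game,
   and nothing in the source selects among them. */
const char *day_kind(int day) {
    switch (day) {
    case 0: case 6:
        return "weekend";
    case 1: case 2: case 3: case 4: case 5:
        return "weekday";
    default:
        return "invalid";
    }
}
```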

And conversely, there are many other languages that have a similar degree of predictability in the code they produce for a given architecture; Pascal, for one, which is almost isomorphic with C in most practical implementations.


> They are both restricted on the basis of being functionally equivalent to the input code.

But that goes for any compiled language, unless the compiler contains a bug.


Yes, I think the difference is that C allows for close interaction with the machine architecture. Normally I don't care what exact opcodes the compiler turns X language into; however, C allows me to wrap direct hardware access into the syntax of the language itself. In essence, it mostly doesn't matter what exact opcodes the preceding for loop is turned into; it matters that in the next block I can take that result, move it to rax and rcx, and use an intrinsic to execute a CPUID instruction.


Worse, you can leave out 'compiled'. According to that logic, captain Picard programs in assembly whenever he starts a sentence with "Computer, "


I think the author misses the point of the phrase "C is just portable assembly". It doesn't literally mean that C is just like assembly; it's usually a shorthand for saying the following things:

1. C is used in most contexts where you previously had to use assembly, and would use assembly if C didn't exist.

2. C is the language with the "lowest level" code imaginable.

3. What this means is, you can map almost any command in C into a specific command or a few specific commands in assembly.

That last line is the important part. C is optimized in the sense that every command in C will give you a deterministic amount of commands in assembly. When reading C code, someone who knows assembly can usually tell you what will happen in the compiled code. You won't see an operator which doesn't have a deterministic runtime or memory footprint. In fact, most of the questions on "Why C has this"/"Why C doesn't have that" can be answered exactly like that: you can't implement it with a deterministic set of commands.

In those senses, C is just like Assembly, at least more so than any other language around.


"every command in C will give you a deterministic amount of commands in assembly"

This is most definitely not true on any modern compiler. The generated assembly depends on the program as a whole and may not correspond fragment to fragment in any readily comprehensible way.

One example is a variable in C may correspond to a number of different storage locations at different points in the program graph, owing to SSA form (or individual optimizations that transform the program in a similar manner).

Another example is how pointer aliasing affects generated code: copying arguments into temporaries before arithmetic can actually result in fewer assembly instructions as the compiler can determine no aliasing is possible.
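A hedged sketch of that aliasing point (the names are mine): copying a pointed-to bound into a local tells the compiler the loop body cannot change it, so the load can be hoisted out of the loop.

```c
/* Sketch of the aliasing effect: without the local copy, the compiler
   must assume dst[i] might alias *bound and reload *bound every
   iteration.  The temporary is the programmer's promise that no
   aliasing occurs, and it typically yields fewer instructions. */
void scale_all(int *dst, const int *bound, int len) {
    int k = *bound;               /* local temporary: provably unaliased */
    for (int i = 0; i < len; i++)
        dst[i] *= k;
}
```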

Add to this the more mundane and well-understood optimizations like inlining, constant propagation, and later passes of optimization over the results of prior ones, and the result is that it's very hard to know exactly what assembly will correspond to any arbitrary part of a C program. Mapping memory locations to variables, or assembly instructions to C program lines, is not trivial.

I know this seems like nitpicking, but programmers using the mistaken mental model of C being textual macros for assembly leads to poorly performing code at best, and a variety of security vulnerabilities at worst. I think if you actually need to know what's going on at the machine level, it's very important to know that the C abstract machine is most definitely NOT what the real machine on your desk is doing.


Also, like the article mentions, it has abstractions that you can't get rid of, like the stack.


> 3. What this means is, you can map almost any command in C into a specific command or a few specific commands in assembly.

Is that really true? It depends very much on your compiler and on the optimizations that it uses. I'd say that the level of optimization performed is proportional to the similarity between input and output. Straightforward translation makes for crappy optimization.


The important point is that a C command will result in a deterministic amount of actual commands.

For example, C won't implement the "raise to the nth power" operator (in Python, `2**10` means 2 to the power of 10, for example), because it can't be done with one assembly instruction, only with a loop.

Even with optimizations, you can reason about your code as if every C command is one Assembly command, and you won't be far off (you can say that every C command is O(1) assembly commands). This is very different from other languages.


Not sure I buy that.

C has a division operator, despite the fact that plenty of CPUs (ARM, for a common example) don't have a divide instruction. Typically, the compiler generates a call to a runtime support library (e.g. libgcc) with division functions for various datatypes, and yes, those functions typically involve some looping.
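For illustration, such a runtime routine is essentially a shift-and-subtract loop. Here is my own minimal sketch of unsigned restoring division, the kind of thing a support function like libgcc's `__udivsi3` does; it is not libgcc's actual code.

```c
#include <stdint.h>

/* Sketch of restoring division: the kind of loop a runtime support
   routine performs on CPUs without a divide instruction.  Division by
   zero is the caller's problem, as with the hardware instruction. */
uint32_t soft_udiv(uint32_t num, uint32_t den) {
    uint32_t quot = 0, rem = 0;
    for (int i = 31; i >= 0; i--) {
        rem = (rem << 1) | ((num >> i) & 1);  /* bring down next bit */
        if (rem >= den) {
            rem -= den;                        /* subtract fits: quotient bit 1 */
            quot |= (uint32_t)1 << i;
        }
    }
    return quot;
}
```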


That's true, and it's also a bit of a nitpick. How many of us actually work on multiple architectures or with different C compilers? For most systems-level stuff, you are simply using gcc on x86 linux, with plain-vanilla glibc, so honestly, I know what that div op is going to map to. If you've been at this for a while, you know what to expect when you write a simple for loop vs some pointer arithmetic. And if you happen to be an ARM guy, you probably are familiar with all the quirks of your platform as well.


I agree that experience will develop an intuition for how C statements map to assembly. But experience is no replacement for standards -- no single implementation defines the language. In most cases, it's more useful to reason about the guarantees within the language than to infer what the compiled assembly will be. Possible? Yes, if, as you said, development is on homogeneous platforms. Painful? If you need to know how 5 compilers act instead of 1, I suspect so.


The point is not to know the actual assembly. The point is that you'll never run an operation that might take a non-deterministic amount of time to run.

Contrast to, say, Python (as an obviously exaggerated counterexample). Just calling a function in Python means performing a lookup in a hash table.


Dividing one 32-bit number by another is a constant-time operation even if on some machines (e.g. ARM) it is implemented as a loop. This is what matters. Raising a number to the nth power is an O(log n) operation.


Division takes a data-dependent number of cycles on all x86-64 processors.


But the dividing instruction is still O(1).


To decode? Why does the user care how long it takes to decode, they care how long it takes to run. We can provide upper bounds for exp, sin, and erfc too.


"3. What this means is, you can map almost any command in C into a specific command or a few specific commands in assembly."

Do you have any idea how wildly inaccurate that statement is? If you have no experience at all with assembly on any processor, or perhaps only on the x86 family, then I can understand your statement. It's still wildly inaccurate, though.


Nice article; short, sweet and enough to pique my curiosity about modern assembly.

Some aphorisms like "C has the efficiency and speed of assembler, with the portability and readability of assembler" come to mind, but those are from ancient history in programming years, and usually said by people proficient in both.

There are other assertions (like jwz in the 'Java Sucks' rant) that refer to C as "a PDP-11 assembler that thinks it's a language", but again an exaggerated statement made by someone who knows.

I have not met anyone who primarily codes in an interpreted language who thinks that C is just syntactic sugar for assembly. Maybe because most of them don't program in C.

I can certainly see some amateur programming pundits (or maybe just forum dickheads) regurgitating the lines without understanding them, but there is a world of difference between the examples touched on, and the actual syntactic sugar of recent java releases (generics, unboxing).


It's not too much of a stretch to look at just 'C' as a very sophisticated macro assembler.

And you should probably strip out the 'macro' bit there, because in reality that's the pre-processor doing its work. Even though there isn't a C compiler without a pre-processor, technically speaking it is not part of the compiler, since all it does is output more C.


"And probably you should strip out the 'macro' bit there because in reality that's the pre-processor"

I disagree. Looping constructs that typically would be macros in an assembler such as 'while' and 'for' are C constructs. Also, C itself has automatic field offset computations.


>>automatic field offset computations

A macro for subroutine entry could reserve stack space and define offset constants; a macro for subroutine return would release the extra space.

It could be neat to use, if the processor architecture has sane addressing modes. (I might have seen/done something similar... doesn't feel like a new idea to me. :-) )

(The entry macro might need a simple preprocessor.)

Disclaimer: C and lower was another life, dimly remembered. :-)


It's not too much of a stretch to look at just ' FIXME' as a very sophisticated macro assembler.

s/FIXME/Ruby, or Python, or Perl, or any other language. Where does one draw the line that distinguishes a language from a sophisticated enough assembler?


Those are interpreters, not compilers.


If you're not comfortable with interpreted languages, substitute for Java, C++ or other compiled languages. My question still stands for itself.


No it doesn't.

In Java, any kind of minimal assignment can explode behind the scenes into a whole pile of function calls; that would never happen in C, where what you see is what you get.


It's interesting to compare the real difference here. In C, what you see at compile-time is what you get. As compared to an interpreted language where what you see at compile-time is probably going to be completely different from what happens at runtime. The difference is really where the optimizations occur. Runtime optimizations have certainly come a long way.


jasonwatkinspdx and hackermom have explained why your statement is highly inaccurate.


I have only written a small amount of assembly code, but I cannot imagine how someone who has written even a tiny bit of either could make such a mistake.


The statement "C is portable assembler" makes sense in exactly one case: writing a UNIX kernel in the 1970s. Both the assembler and C versions of the kernel would be single-threaded, both would have a standard stack-usage convention (where cdecl is just a codification of the particular assembler stack convention K&R agreed upon), and neither would be run through an optimizer (as an optimizing C compiler didn't exist until quite a while later.)


Even today, when you look at the core routines of, for instance, X Windows, C is still used in exactly that way. You don't have to, but you definitely can, and even though C compilers have advanced tremendously, the fact that there is a direct 1:1 correspondence between input and output is exactly why C is used in those situations.

C supports multiple stacks if you tweak setjmp and longjmp just right, and using co-operative multi-threading is possible without any OS support. Not very useful if you really have to do two things at the same time but a lot nicer than interleaving a bunch of code.

And an optimizing compiler is still a transformation of the input according to a given ruleset, you could not get the same level of optimization by just processing the output of the code generation stage of the compiler because you would lose a bunch of higher level information that is invaluable when optimizing the code but you could see it as just another stage in 'transforming' from one language to another without losing any functional bits along the way.


There simply isn't a 1:1 correspondence between C and its machine code; not in size, and not in functionality.

Any half-decent compiler is going to perform a non-trivial transformation of a big switch statement to make it efficient. The expected performance semantics of a switch (something better than O(n) in the number of cases) rules out simplistic iterated jumps in big cases. The programmer expects O(1), or at worst, O(log n).

But more importantly, CPUs generally have many more capabilities than are exposed by C, and this is where the "1:1 correspondence" really falls down. Assemblers generally have disassemblers that you can transparently round-trip through. That's a little harder in C.

Perhaps you meant an injective relation, rather than a bijective one? But that's a long way short of an assembler.


"But more importantly, CPUs generally have many more capabilities than are exposed by C."

Hear, hear! Moreover, that has always been the case. As an example, try implementing multiple-word addition in C. On most architectures, you will learn that not having access to a carry bit makes that harder than it could be.
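Concretely, a 128-bit add built from 64-bit halves has to rederive the carry with a comparison, where the hardware's add-with-carry instruction would read it from the flag for free. A sketch (names are mine):

```c
#include <stdint.h>

/* 128-bit addition from 64-bit halves: with no access to the CPU's
   carry flag, C must recompute the carry as an unsigned-wraparound
   test, which a single adc instruction gets for free. */
void add128(uint64_t a_hi, uint64_t a_lo,
            uint64_t b_hi, uint64_t b_lo,
            uint64_t *r_hi, uint64_t *r_lo) {
    uint64_t lo = a_lo + b_lo;      /* may wrap around */
    uint64_t carry = (lo < a_lo);   /* wrapped => carry out was 1 */
    *r_lo = lo;
    *r_hi = a_hi + b_hi + carry;
}
```

(Modern compilers often recognize this idiom and emit the carry instruction anyway, but the language itself gives you no way to ask for it directly.)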


Yes, you're right, the switch statement is a good example of how modern compilers fudge the boundary.

But the main point is that the difference between the generated code and the stuff you write is relatively small. When looking at the assembly that a C compiler generates, I have relatively little trouble following the relationship between the two, and I can make reasonable predictions about what will pop out on the other end.

And of course processors are 'richer' than what most C compilers will use, especially when it comes to special instructions that have no equivalent in the C language.

I've worked on a 'decompiler' for the Mark Williams C compiler (yes, that's pretty long ago), and at the time the above still held true, today the boundaries are definitely fuzzier, mostly due to the increased smarts of compiler writers for the optimization stage.

GCC is clever enough to optimize whole branches of code out of existence if you set it to be aggressive enough and the code was written naively; that's one way of dramatically losing that 1:1 correspondence.


I have to argue against most of the comments here though: I think, after a very short time with C, you get a clear sense of what will happen in the assembly. It's also pretty trivial to reach that stage with C++ - really, all the basic control structures are implemented in a pretty standard way. Sure, sometimes it throws a curveball, but you know what? It was the exception rather than the rule.

I'm fully convinced that I could sit down with the average C++ program and accurately predict the majority of the generated assembly. It's really not that hard - the compiler doesn't have THAT many instructions, and it only uses them in a limited set of occasions!

-- Ayjay on Fedang #coding


> For example, C won't implement the "raise to nth power" operator (in Python, `2**10` means 2 to the power of 10, for example), because it can't be done with one assembly instruction, only with a loop.

Not a great example of your point. Here is Analog devices Assembly for the ADSP-2100 Family. Assuming Ar is loaded with 2:

sr=LSHIFT ar by 10;

Even without the assumption, no loop is needed for raising 2 to some power.


What you are doing there is multiplying something by a power of two, and 'ar' should have the value 1 for 'sr' to be set to two to the power of 10.

Multiplying by a power of 2 is easily done in C using the shift operator.
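A one-liner (mine) making that concrete: the C shift does the same job as that LSHIFT instruction.

```c
/* 2 to the nth power as a single shift; valid for n smaller than the
   bit width of the type. */
unsigned pow2(unsigned n) {
    return 1u << n;
}
```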


I support the idea of labeling the 'sophistication level' of blog posts in their titles, but a bit of punctuation would be nice.


The author was trying to play on "more on...".


I considered using a different title for the submission, but I figured he was using "moron" as a modifier.


Reading only the comments, perhaps C is just a low-level language very well suited for many applications that don't require classes and objects. Is there something more in the original post?


and incidentally, C tends to be a better object-oriented language than C++ :)


Just like C is not assembly, digging in the Ruby/Python/etc. source code is not hacking.


Here's why: if you are not programming at the mnemonic level for the target processor, you are not writing assembly. Period. Saying anything else is just romanticizing and paraphrasing (and in my opinion there is no room for that in technical matters at this level).


+1 from me. It appears that not many among the HN crowd are willing to accept this, and I'm curious why this matter needs to be overanalyzed, in true geek fashion.



