The Unreasonable Effectiveness of C (damienkatz.net)
431 points by daschl1 on Jan 10, 2013 | 384 comments



Oh for heaven's sake. Yet more ignorance.

A more realistic view of C:

- C is straightforward to compile into fast machine code...on a PDP-11. Its virtual machine does not match modern architectures very well, and its explicitness about details of its machine means FORTRAN compilers typically produce faster code. The C virtual machine does not provide an accurate model of why your code is fast on a modern machine. The social necessity of implementing a decent C compiler may have stifled our architecture development (go look at the Burroughs systems, or the Connection Machine, and tell me how well C's virtual machine maps to them).

- C's standard library is a joke. Its shortcomings, particularly around string handling, have been responsible for an appalling fraction of the security holes of the past forty years.

- C's tooling is hardly something to brag about, especially compared to its contemporaries like Smalltalk and Lisp. Most of the debuggers people use with C are command line monstrosities. Compare them to the standard debuggers of, say, Squeak or Allegro Common Lisp.

- Claiming a fast build/debug/run cycle for C is sad. It seems fast because of the failure in this area of C++. Go look at Turbo Pascal if you want to know how to make the build/debug/run cycle fast.

- Claiming that C is callable from anywhere via its standard ABI equates all the world with Unix. Sadly, that's almost true today, but maybe that's because of the ubiquity of C rather than the other way around.
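
To make the string-handling bullet concrete, here is a minimal sketch, assuming a hypothetical helper named `copy_string` (it is not part of any library). `strcpy` has no idea how big the destination is; a bounded copy built on C99 `snprintf` does, and can report truncation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: copies src into dst, which holds cap bytes.
 * Returns true if the whole string fit, false if it was truncated.
 * Unlike strcpy, it never writes past dst[cap - 1], and dst is always
 * NUL-terminated on return. */
bool copy_string(char *dst, size_t cap, const char *src)
{
    assert(dst != NULL && src != NULL && cap > 0);
    /* snprintf returns the length the full output would have had. */
    int needed = snprintf(dst, cap, "%s", src);
    return needed >= 0 && (size_t)needed < cap;
}
```

With a 4-byte buffer, copying "abcdef" stores "abc" and returns false instead of overrunning the buffer.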

So, before writing about the glories of C, please go familiarize yourself with modern FORTRAN, ALGOL 60 and 68, Turbo Pascal's compiler and environment, a good Smalltalk like Squeak or Pharo, and the state of modern pipelined processor architectures.


I'm pretty ignorant about this stuff, so please don't think I'm trolling.

I'm confused when you speak of a virtual machine with regard to C... can you explain what you mean by this?

I had to wikipedia the Burroughs machine. I guess the big deal is that it's a stack machine? It looks very interesting and I plan to read more about it. But I guess I don't understand why that is a hindrance to C.

The JVM is a stack machine, isn't it?

btw, I haven't read the article yet. It's my habit to check comments first to see if the article was interesting, and seeing your comment made me want to reply for clarification.


The Burroughs was a stack machine, but that's only the beginning. Look at how it handled addressing hunks of memory. Bounds checked memory block references were a hardware type, and they were the only way to get a reference to a block of memory. So basically, null pointers didn't exist at the hardware level, nor out of bounds writes to arrays or strings. Similarly, code and data were distinguished in memory (by high order bits), so you couldn't execute data. It simply wasn't recognized as code by the processor. Also interesting, the Burroughs machines were built to use a variant of ALGOL 60 as both their systems programming language (as in, there wasn't an assembly language beneath it) and as their command language. The whole architecture was designed to run high level procedural languages.

C defines a virtual machine consisting of a single, contiguous block of memory with consecutive addresses, and a single core processor that reads and executes a single instruction at a time. This is not true of today's processors, thanks to multicores, multiple pipelines, and layers of cache.


No, C does not define a single contiguous block of memory with consecutive addresses. It _does_ qualify that pointers are scalar types, but that does not imply contiguity or consecutive addresses (with the exception of arrays).

There is no requirement in C that you be able to execute data.

You certainly could have a C implementation that bounds-checks strings and arrays. (See e.g. http://www.doc.ic.ac.uk/~phjk/BoundsChecking.html for a very old paper on how you might do that)

The "abstract machine" of C explicitly does _not_ make reference to the memory layout. (cf 5.1.2.3 of the spec)

It also makes no reference to the number of cores, and order of execution is not one at a time, but limited by sequence points.

That's the whole point of C - it is very loosely tied to actual hardware, and can accommodate a wide range of it, while still staying very close to actual realities.

Edit: Please don't take this as a "rah, rah, C is great" comment. I'm well aware of its shortcomings. I've spent the last 20+ years with it :)


I would argue that C's problem is not that it's too strictly defined, but that it's too poorly defined. An in-depth look into all the cases of undefined behavior in C will show what I mean.
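
One small illustration, as a sketch rather than anything definitive: signed integer overflow is undefined in C, so an overflow check has to run before the addition, not after it. The helper name `checked_add` is made up for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: adds a and b without ever executing an
 * overflowing signed addition, which would be undefined behavior.
 * Returns true and stores the sum in *out on success; returns false
 * and leaves *out untouched if the sum would not fit in an int. */
bool checked_add(int a, int b, int *out)
{
    assert(out != NULL);
    if (b > 0 && a > INT_MAX - b)
        return false;   /* would overflow past INT_MAX */
    if (b < 0 && a < INT_MIN - b)
        return false;   /* would overflow below INT_MIN */
    *out = a + b;
    return true;
}
```

The tempting alternative, computing `a + b` and then testing whether the result wrapped, is exactly the kind of code an optimizer is allowed to delete.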

You want to really understand C? Read this[0]. John really understands C.

[0] http://blog.regehr.org/


Can't upvote this enough. Well, except that I'd replace "poorly" with "vaguely". "Implementation-defined behavior" is there for very good reasons in every single case it's there.

Sidenote: With John's name, I'd be tempted on a daily basis to change the last two letters of my name to a single 'x' ;)


You mean like this?

  $ whoami | sed 's/hr/xx/'


No, he meant:

    $ whoami | sed 's/hr$/x/'


Minor nit: She meant :)


I think you mean:

  $ echo "$PARENT" | sed 's_r/x_r$/_'

P.S. It's a bit hard to believe I misread that... thanks.


Random tip: in Bash (at least), you can execute the previous command with some changes using ^..^..^, e.g.

  $ echo john regehr | sed s/hr/xx/
  john regexx
  $ ^r/x^r$/^
  echo john regehr | sed s/hr$/x/
  john regex

(The second-to-last line is just bash printing the new command.)


My sed knowledge isn't very advanced. What is your invocation supposed to do?


You can use any separator you want in an s/../../ expression, not just /. In this case the separator is _ (this technique allows you to use / without creating a "picket fence": s/r\/x/r$\//).

So the regex just means replace "r/x" with "r$/".


I'd argue that there's a big distinction between C as described in the standard and C as actually used in real-world code, and the latter has much stricter semantics and is harder to work with. A C compiler that follows the standard but not the implied conventions won't do well.

For example, take NULL. Even on a machine with no concept of NULL, you could easily emulate it by allocating a small block of memory and having the pointer to that block be designated as "NULL". This would be perfectly compliant with the standard, but it will break all of the code out there that assumes that NULL is all zero bits (e.g. that calloc or memset(0) produce pointers whose values contain NULL). Which is a lot of code. I'm sure that many other examples can be found.
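
A minimal sketch of the distinction, using a hypothetical `struct node`: the standard guarantees that assigning `NULL` yields a null pointer, but it does not guarantee that all-zero bytes from `calloc` or `memset` do, even though that holds on essentially every mainstream platform.

```c
#include <stdlib.h>

/* Hypothetical node type, for illustration only. */
struct node {
    int value;
    struct node *next;
};

/* Portable: the pointer member is set by assignment, so it is a null
 * pointer whatever bit pattern the implementation uses for one. */
struct node *node_new(int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->value = value;
    n->next = NULL;
    return n;
}

/* Conventional, but not guaranteed by the standard: calloc zeroes the
 * bytes, and callers then assume all-zero bytes read back as a null
 * pointer. This is the assumption real-world code leans on. */
struct node *node_new_zeroed(void)
{
    return calloc(1, sizeof(struct node));
}
```

Code written like `node_new` would survive the emulated-NULL machine described above; code written like `node_new_zeroed` is what would break.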


"C defines a virtual machine consisting of a single, contiguous block of memory with consecutive addresses"

This is 100% false. The C standard makes no mention whatsoever of memory. I don't know much about the Burroughs machine, but it sounds like it would map very well to the C virtual machine:

C permits an implementation to provide a reversible mapping from pointers to "sufficiently large integers" but does not require it.

A pointer to an object is only valid in C (i.e. only has defined behavior) if it is never accessed outside the bounds of the object it points to.

Converting between data pointers and function pointers is not required to work in the C standard either.

C does require that you have a NULL pointer that has undefined behavior if you dereference it, but this could be trivially done by the runtime by allocating a single unit of memory for it.
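
In C99 terms, the optional pointer-to-integer mapping is `uintptr_t` from `<stdint.h>`. A minimal sketch, with `round_trip` as a made-up name and `UINTPTR_MAX` used as the usual way to detect whether the type exists:

```c
#include <stdint.h>

/* uintptr_t is an optional type: an implementation that cannot offer
 * the round trip need not define it. Testing UINTPTR_MAX is the
 * common way to find out. */
#ifdef UINTPTR_MAX
/* Converts a valid object pointer to an integer and back. The standard
 * only promises that the result compares equal to the original
 * pointer; it says nothing about what the integer value looks like. */
void *round_trip(void *p)
{
    uintptr_t bits = (uintptr_t)p;
    return (void *)bits;
}
#endif
```

Nothing here requires the integer to be an address in a flat memory space, which is why a descriptor-based machine could still implement it.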


>C defines a virtual machine consisting of a single, contiguous block of memory with consecutive addresses, and a single core processor that reads and executes a single instruction at a time. This is not true of today's processors, thanks to multicores, multiple pipelines, and layers of cache.

Which is true, for a rather stretched definition of "virtual machine" (which falls apart at the kernel level, because it's pretty hard to work with a machine's abstraction when you're working directly on the hardware).

The problem with the virtual machine comparison is that C doesn't mask ABI access in any meaningful way. It doesn't need to, since it's directly accessing the ABI and OS. So the argument that C isn't multithreaded is rather shortsighted, because C doesn't need that functionality in the language. It's provided by the OS.


FYI when discussing the ISO C standard the term "virtual machine" is well understood to be the abstracted view of the hardware presented to user code. Things well defined in it are portable, things implementation defined are non-portable, and things undefined should be avoided at all costs.


As a C programmer this is like watching virgins attempt to have sex. Normal people just write some code which does some sh*t and that's OK. We don't need to deeply reflect on whether it's cache optimal, because that will change next week. Just good clean C. When did that become a strange thing to do?


"good clean C"

Is there such a thing? It seems like every C program, even one that is praised as being excellently written, is a mess of pointers, memory management, conditional statements that check for errors, special return value codes, and so forth.

To put it another way, look at the difference between an implementation of a Red/Black Tree in C and one written in Lisp or Haskell. Not only are basic things overly difficult to get right in C, but C does not become any easier as problem sizes scale up; it lacks expressiveness and forces programmers to deal with low-level details regardless of how high-level their program is.


Um. Read Bentley. Get back to me. Yesterday's old shit is last week's high level. Turns out clear thought in any language is the main thing.


"Turns out clear thought in any language is the main thing."

No, the ability to express your thought clearly is the main thing -- and that is why languages matter. If your code is cluttered with pointer-juggling, error-checking conditional statements, and the other hoops C forces you to jump through, then your code is not clear.

Try expressing your code clearly in BF, then get back to me about this "languages don't matter as long as you have clear thought" philosophy.


Can both of you guys just get back to me later? Kinda busy now.


Sure. RBTree in C is that ugly. Take your time.


If you can't see the flaws in C you're probably writing poorly optimized, security-atrocious C.


I'm a professional pentester and I have been a C programmer for well over 5 years, but I acknowledge that my C is probably still pretty bad :) how about you? :)

P.S: now I have figured you out (on a very basic level of course) and I have a lot of respect, but nonetheless, let's play :)


I've been writing kernel code in C for about 8 years, including a hardware virtualization subsystem for use on HPCs. I used to teach Network Security and Penetration, but I lost interest in security and moved on to programming language development.

My code, in any language, is full of bugs. The difference is that in C my bugs turn into major security vulnerabilities. C is also a terrible language in that you never write C -- you write a specific compiler's implementation of C. If a strict optimizing compiler were to get a hold of any C I've ever written, I'm sure it would emit garbage. All the other languages I write code in? Not so much.

That said, is C useful? Hell yes.


Based on that I will buy you your beverage of choice at any conference you choose :)

P.S.: I've probably written commercial stuff you work with and also I don't give a shit if you give a shit, if you see where I am coming from. I have a pretty good idea of what the compiler will do and I will be pissed off if it doesn't do that. It normally does.


Thanks. I hope you didn't take my first comment as an insult.

What I meant by that is C is not just something you sit down with after skimming the specification and "bang out." There are years of community "inherited knowledge" on how to write C so it doesn't result in disaster. The very need for these practices exemplifies the flaws in C as a language -- by the very nature of working around these vulnerabilities, you acknowledge that they are vulnerabilities. Thus, if one doesn't see C's issues then one is doomed to C's mistakes (this sentence is very strange when read out loud).


I think that your situation is pretty different from most programming projects in that you are way closer to the machine than most people need to be. Also, you are working on an OS, which is particularly sensitive to compiler nuances. I would have a hard time imagining different compilers spitting out garbage with the standard "hello world". Now the almost mandatory caveat: I know that C has its flaws, but not all programming projects are the same. Projects which are not like yours will have the "You write a specific compiler's implementation of C" problem in way smaller doses than you (possibly to the point of not having it at all, like hello world).


I'll have to read more about the memory references to get a feel for that.

However it speaks of a compiler for ALGOL... it was compiled down to machine instructions. Assembly is just a representation of machine instructions, so I don't see how it can be said to not have an assembly language.

Maybe nobody ever bothered to write an assembler, but that doesn't mean that it somehow directly executes ALGOL.

Thanks for your replies, you have given me some food for thought.

[1]http://en.wikipedia.org/wiki/Burroughs_large_systems_instruc...


> However it speaks of a compiler for ALGOL... it was compiled down to machine instructions. Assembly is just a representation of machine instructions, so I don't see how it can be said to not have an assembly language.

In this sense, you're completely right. But I think that people who grok the system mean something a bit different when they say it doesn't have an assembly language. (Disclaimer: I have no firsthand experience with Burroughs mainframes.)

The Burroughs system didn't execute Algol directly, true. But the machine representation that you compiled down to was essentially a high-level proto-Algol. It wasn't a distinct "first-class citizen". It was, if you like, Algol "virtual machine bytecode" for a virtual machine that wasn't virtual.

If you're writing in C, or some other higher-level programming languages, there are times when you want more fine-grained control over the hardware than the more plush languages provide. That's the time to drop down to assembly code, to talk to the computer "in its own language".

The Burroughs mainframes had nothing analogous to that. The system was designed to map as directly to Algol as they could. Its machine language wasn't distinct from the higher-level language that you were supposed to use. To talk to a Burroughs system "in its own language" would be to write a rather more verbose expression of the Algol code you'd have had to write anyway, but not particularly different in principle.

So, I guess the answer to whether or not the Burroughs systems did or did not have an assembly language is a philosophical one. :P


C doesn't care for fancy terms like VM, multicore, threads, ... But you can always make a library and implement what you need. This approach has advantages; for example, you can share memory pages between processes, because that kind of stuff is part of the hardware/OS, not the C language. It would be stupid to implement it directly in the C language. You will now say that this is the reason why C is bad; I say it is the reason why C has been so popular all these years.


> C defines a virtual machine consisting of a single, contiguous block of memory with consecutive addresses, and a single core processor that reads and executes a single instruction at a time. This is not true of today's processors, thanks to multicores, multiple pipelines, and layers of cache.

This type of machine has become so ubiquitous that people have begun to forget that once upon a time, other types of machines also existed. (Think of the LISP Machine)


It may be the parent comment is referring to the Runtime-Library when using the term Virtual Machine.


No, I am referring to the virtual machine defined by the C language.


I'd say "abstract virtual machine." You are just confusing people. "Virtual machine" most commonly refers to a discrete program that presents a defined computational interface that everyone calls the virtual machine. This VM program must be run independently of the code you wrote.

For C there is no such virtual machine process. The "virtual machine" for C is abstract and defined implicitly.


If you're going to be pedantic, use the right terms. The C language defines an 'abstract machine', not a 'virtual machine'.


Second this; in all my years (granted, not a lot, but enough), this is the first time I've heard anyone claim that C has a virtual machine. You can hem and haw and stretch the definition all you want, but when it compiles to assembler, I think that most reasonable people would no longer claim that's a "virtual" machine.

Edit: if you want to argue that C enforces a certain view of programming, a "paradigm" if you will (snark), then say that. Don't say "virtual machine", where most people will go "what? when did C start running on JVM/.NET/etc?".


It may be less confusing to say "C's underlying model of computation".


Given the way that LLVM has come onto the scene, I'm not sure I'd agree. C defines assumptions in the programming environment and does not guarantee that it at all resembles the underlying hardware. You are never coding to the hardware (unless you are doing heinous magic), you're coding to C. That's a "virtual machine" to me.

The concept of C as a virtual machine isn't new (I first heard it around 2006 or so? I don't think it was new then) and it's much more descriptive than referring to its "model of computation".


It's more descriptive, but somewhat incorrect.

The common definition of a process virtual machine is that it's an interpreter that can be written to that essentially emulates an OS environment, giving abstracted OS concepts and functionality. This aids with portability. Another concept of virtual machines in general is, for lack of a better term, sandboxing. You're limited to only the functionality that the VM provides.

C goes halfway with that. You generally don't need to care about most OS operations if you're using the standard library (which abstracts most OS calls), but you definitely do need to care about the underlying OS and architecture if you're doing much more than that. Also, the C definition itself doesn't provide threads or IPC, both of which are provided by the POSIX libraries. You're also allowed to directly access the ABI and underlying OS calls through C.

The best example of C not really having a VM is endianness. If C had a "true" virtual machine, the programmer really shouldn't need to be aware of this. But everyone that's written network code on x86 platforms in C knows that you need to keep it in mind. Network byte order is big endian, but x86 is little endian, so you need to translate everything before it hits the network.
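
A minimal sketch of that translation step, with `store_be32` and `load_be32` as made-up names. POSIX code would normally call `htonl`/`ntohl`; this version stays in standard C and works on values rather than on memory layout, so it gives the same bytes on any host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Writes v into out[0..3] in network (big-endian) byte order.
 * Shifts operate on the value, not on how the host stores it, so the
 * result is identical on little- and big-endian machines. */
void store_be32(unsigned char out[4], uint32_t v)
{
    assert(out != NULL);
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/* Reads a big-endian 32-bit value back out of in[0..3]. */
uint32_t load_be32(const unsigned char in[4])
{
    assert(in != NULL);
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}
```

The endianness leak shows up when code instead copies the raw bytes of an integer with `memcpy` or a pointer cast and sends them as-is.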

EDIT: I think LLVM is somewhat of a red herring in this context. Realistically, unless you're writing straight assembly, there's nothing stopping anyone from writing a VM-based implementation for any language. The problem with C and the other mid to low level languages is that if you're writing the VM, you need to provide a machine that not only works with the underlying computational model, but also provide abstractions for all the additional OS-level functionality that people use.

So C could definitely become a VM-based language, especially if the intermediate form is flexible enough.


You mean like the TenDRA compiler?

http://en.wikipedia.org/wiki/TenDRA_Compiler

Or the C compiler for i5/OS (ILE bytecodes) with an OS wide JIT?

http://publib.boulder.ibm.com/infocenter/iseries/v7r1m0/topi...

People keep on mixing languages with their implementations.

It should be compulsory to have compiler design classes in any informatics course.


"The common definition of a process virtual machine is that it's an interpreter that can be written to that essentially emulates an OS environment, giving abstracted OS concepts and functionality."

Is it? I have seen "virtual machine" used to describe the process abstraction and to describe the IR in compilers (hence "Java Virtual Machine"), and to describe the Forth environment (similar to compilers).


Using the term "virtual machine" to refer to the "virtual PDP-11" that C exposes to programs is possibly older than the internet.


What's so confusing about the term virtual machine? It's an abstraction of the underlying machine.


For the same reason that if you put a hunk of chocolate in the oven and called it "hot chocolate," people would be confused that it's not a warm beverage of chocolate and milk.

That is, the phrase "virtual machine" is usually assumed to be the name for a piece of software that pretends to be some particular hardware. It is less commonly used to mean a "virtual machine", that is, not a noun unto itself, but the adjective virtual followed by the noun machine.


The term "virtual machine" is already pretty overloaded. This isn't referring to virtualized hardware in the VMWare sense or a language/platform virtual machine in the JVM sense. Rather, it's talking about how C's abstraction of the hardware has the Von Neumann bottleneck baked into it, so it clashes with fundamentally different architectures like the Burroughs 5000's.


No need to downvote, guys.

The C language specification [0] defines an abstract machine and defines C semantics in terms of this machine. 5.1.2.3 §1:

> The semantic descriptions in this International Standard describe the behavior of an abstract machine in which issues of optimization are irrelevant.

[0] http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf


If C makes use of a virtual machine, then why do we have to recompile it for every new machine/platform?


"Virtual machine" in this context refers to the computation model of the language. In C, that model is essentially a single-CPU machine with a big memory space that can be accessed with pointers (including a possibly disjoint memory space for program code).

Other models are possible; for example, lambda calculus and combinator logic are based on a model where computation is performed by reducing expressions, without any big memory space and without pointers. Prolog is based on a model where computation is performed by answering a query based on a database of rules. These are all "virtual machines" -- the realization of these computation models is based on compiling a program for a specific machine. It is not different with C; C just happens to use a model that is very similar to the real machine that a program executes on (but it is not necessarily identical e.g. you probably do not have so much ram that any 64-bit pointer would correspond to a real address on your machine).


Because C compilers write out to native code. It may help to think that C is the virtual machine language (as well as its specification for the virtual machine). This concept has been extended by things like Clang, that transform C to a somewhat more generic underlying language representation (LLVM bytecode) before compiling to native code.

You can ahead-of-time compile Mono code to ARM; that doesn't mean it's not defining a virtual execution environment.


See silentbicycle's sibling comment.


I'm also a novice on low-level stuff, but if I had to guess...

I'd guess that the virtual machine of C pertains to the addressing and the presentation of memory as a "giant array of bytes". Stack addresses start high and "grow down", heap addresses start low. These addresses need not exist on the machine. For example, two running C processes can have 0x3a4b7e pointing to different places in machine memory (which prevents them from clobbering each other).

Please, someone with more knowledge than me, fill me in on where I'm right and wrong.


C does not require the presentation of memory as a "giant array of bytes"---certainly when you have a pointer, it points to an array of bytes (or rather, a contiguous array of items of the pointed-to type) but that's about it. The stack does not have to start high and grow down (in fact, the Linux man page for clone() states that the stack on the HP PA architecture grows upward) and the heap doesn't necessarily start low and grow up (mmap() for instance).

You are also confusing multitasking/multiprocessing with C. C has no such concept (well, C11 might, I haven't looked fully into it) and on some systems (like plenty of systems in the 80s) only one program runs at a time. The fact that two "programs" both have a pointer to 0x3A4B7E that reference two physically different locations in memory is an operating system abstraction, not a C abstraction.


C pointer aliasing defeats certain compiler optimizations that can be made in other languages, and is frequently brought up in C vs FORTRAN comparisons. I think that's probably what the GP had in mind.


C99 includes restricted pointers, but support is a bit spotty. Microsoft's compiler (which is of course really just a C++ compiler) includes it as a nonstandard keyword, too.

http://en.wikipedia.org/wiki/Restrict
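
A minimal sketch of what the C99 keyword buys you, using a made-up kernel called `scale_add`. Without `restrict`, the compiler has to assume that a store to `dst[i]` might change `a[]` or `b[]`; with it, the caller promises the arrays do not overlap.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel: dst[i] = a[i] * k + b[i].
 * restrict is a promise from the caller that, during this call, the
 * three arrays do not overlap. The compiler may then keep values in
 * registers or vectorize the loop without reloading a[] and b[] after
 * every store to dst[]. Breaking the promise is undefined behavior. */
void scale_add(size_t n, float *restrict dst,
               const float *restrict a, const float *restrict b, float k)
{
    assert(dst != NULL && a != NULL && b != NULL);
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] * k + b[i];
}
```

FORTRAN gets the same guarantee by default for dummy arguments, which is the usual explanation for the performance gap mentioned above.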


That makes a lot of sense. Thanks!


Well that is how memory, the hardware we have and also all normal operating systems work, but if you want to discuss other stuff we can do that too :) Try serious debugging and you will find that all your preconceptions are confirmed, yet it's still hard to know WTF is going on.


The "memory is just a giant array of bytes" abstraction hasn't been true ever since DRAM has existed (because DRAM is divided into pages), ever since caches were introduced, and certainly isn't true now that even commodity multiprocessors are NUMA with message-passing between memory nodes.


Look, if we want to be super anal about shit, all memory is slowly discharging capacitors with basically random access times based on how busy the bus circuitry is with our shit this week. It turns out that memory is really complicated stuff if you look at it deeply, but the magic of modern computer architecture is that you get to (hopefully) keep your shelf model for as long as you can. If you were to try to model actual memory latency, here's a shortcut: you can't. That's why everyone bullshits it.


Fair point. At this point, it's a very leaky abstraction because not all levels of "random access" (e.g. L1 cache vs. main memory) are created equal.


True, and this is my biggest problem with writing optimized code in C -- it takes a lot of guessing and inspecting the generated assembler and understanding your particular platform to make sure you're ACTUALLY utilizing registers and cache like you intend.

If there were some way of expressing this intent through the language and have the compiler enforce it, that'd be fantastic :)

That said, there's really not a better solution to the problem than C, just pointing out that even C is often far less than ideal in this arena.


I'm actually spending much of my time these days having a go at writing HPC codes in Haskell.

So far, the generated assembly looks pretty darn awesome, and the performance pretty competitive with alternatives I have access to :)


Madhadron, you make a lot of claims but provide no detail. Also using terms like "virtual machine" with respect to C is plainly ridiculous and a case of bullshit baffles brains.

Turbo Pascal vs C? Really? In its time Turbo Pascal was an amazing piece of software but in the grand scheme of things it is a pimple compared to the whale that is C. Please compare all software written in Turbo Pascal as opposed to C if you have any doubts. The same goes for Smalltalk, Lisp, Algol 60/68. All great products/languages but very niche.

Fortran can be faster than C in some areas but again it is a niche language.

I could go on a lot more but quite frankly I don't think your post merits much more discussion and is borderline trollish.


> Madhadron, you make a lot of claims but provide no detail. Also using terms like "virtual machine" with respect to C is plainly ridiculous and a case of bullshit baffles brains.

C as a very thin virtual machine is a common conception and not an incorrect one--C runs on many systems with noncontiguous memory segments but presents it as a single contiguous space, for example. The idea of C as a virtual machine is much of the basis of LLVM, and to the best of my knowledge I've never worked on a computer where C represented the underlying hardware without significant abstractions.

If you're going to accuse somebody of trolling, you should know what you're talking about first.


> C as a very thin virtual machine is a common conception and not an incorrect one

I have worked extensively in the past with C and have never heard it referred to as a virtual machine, thin or otherwise. I understand that the OP probably means "computation model" or something similar but felt that the use of the phrase "virtual machine" was a bit on the bombastic side and that in addition to the general tone of the post made me think the post was borderline trolling.

BTW. I would be quite happy to be proved wrong about C commonly being referred to as a thin virtual machine - what books/literature refer to C in this way?


"I have worked extensively in the past with C and have never heard it referred to as a virtual machine, thin or otherwise."

True. Usually other terms are used; for instance, "memory model". If you google that you'll find some things. As you read them, notice that they may or may not match the hardware you are actually running on, and given the simplicity of C's model, nowadays almost never do.

C is a low-level language that lets you get close to the machine, and even lets you drop into assembler, but it is true that it is enforcing/providing a certain set of constraints on how the library and code will act that do not necessarily match the underlying machine. It may not be a "virtual machine", depending on exactly how you define that term, but the idea isn't that far off.

Also, this is a good thing, not a criticism. If it really was just a "high level assembler" that provided no guarantees about the system, it would be as portable as assembler, which is to say, not at all.

For a much more clear example of this sort of thing in a language similarly thought to be "close to the metal", look at C++'s much more thorough specification of its memory and concurrency model, and again notice that this model is expected to be provided everywhere you can use a C++ compiler, regardless of the underlying hardware or OS. It is, in its own way, a virtual machine specification.


> C as a very thin virtual machine is a common conception and not an incorrect one--C runs on many systems with noncontiguous memory segments but presents it as a single contiguous space, for example.

C does no such thing. That is the job of the MMU, and C has nothing to say about it. You're going to have a hard time convincing anyone that a language without a runtime somehow has a VM. That's nonsense.


See the above discussion about what he meant by "virtual machine." It's not what you think it is, and it does exist.


It doesn't matter, his description is still the same kind of nonsense in line with having a VM. I get the feeling that he thought the thing he meant to say is actually the thing he said.


Sorry, I don't see that. What he said is clear to me: The C virtual machine does not provide an accurate model of why your code is fast on a modern machine.

He's saying that the abstraction that C provides has deviated significantly from the underlying hardware. Considering the kinds of hoops that modern processors go through, this is a valid point.

And the above should answer the sibling's question, too.


He also said "C runs on many systems with noncontiguous memory segments but presents it as a single contiguous space." That is 100% false. In C, pointers to different objects are different. They're not even comparable. There is no concept of contiguous memory beyond that of a single array. Two arrays are not contiguous.


Allocate an array that crosses segments. Watch the details be hidden from you.


I'm not aware of an implementation that does that, and it's not required by the standard. A more common solution is to just limit the size of an array to the size of a segment (size_t max would also then be the segment size). If you think the C standard requires crossing segments, what section, what words?


Sorry, I don't see that.

Try reading the part I quoted. I think you're conflating my criticisms with that of someone else. Yes, C abstracts the hardware, and always has. That's the point I made in the first place: its memory abstraction is a prerequisite. Hardware and operating systems provide that to C, in direct contradiction to the comment I replied to (which claims C provides it.)


More importantly, all other languages have it, too. So why even bring it up?


The term "virtual machine" has several meanings, only one of which is what you're thinking of (something like the JVM or CLR). In the context Madhadron's using it, "virtual machine" means the abstraction of the hardware that the C compiler presents to the running program. E.g. it presents an illusion of sequential execution even on machines such as Itanium that are explicitly parallel at the hardware level. It presents the illusion of memory being a uniform array of bytes, when on a modern machine it's actually composed of several disjoint pieces.

CPUs and operating systems go to substantial lengths to present programs with a "C machine" on modern hardware. Your multi-core CPU doesn't look like a PDP-11, but it uses reorder buffers to make execution look sequential and cache coherence built on message passing to make memory look uniform.


Also using terms like "virtual machine" with respect to C is plainly ridiculous and a case of bullshit baffles brains.

Note that the C language standards are written in terms of an "abstract machine". For example, "When the processing of the abstract machine is interrupted by receipt of a signal, only the values of objects as of the previous sequence point may be relied on" from 5.1.2.3.
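To make the quoted rule a bit more concrete, here is a minimal sketch of the kind of code it has in mind: a handler that only stores to a `volatile sig_atomic_t` flag, the classic portable choice. The names `on_signal`, `install_handler` and `consume_signal` are made up for illustration; `signal`, `SIGINT` and `sig_atomic_t` are standard C.

```c
#include <assert.h>
#include <signal.h>

/* Flag shared between the handler and normal code. */
static volatile sig_atomic_t got_signal = 0;

/* The handler does nothing except store to the flag. */
static void on_signal(int signo)
{
    (void)signo;
    got_signal = 1;
}

/* Install the handler for SIGINT; returns 0 on success, -1 on failure. */
int install_handler(void)
{
    return signal(SIGINT, on_signal) == SIG_ERR ? -1 : 0;
}

/* Report whether a signal arrived since the last call, and clear the flag. */
int consume_signal(void)
{
    int seen = got_signal;
    got_signal = 0;
    assert(seen == 0 || seen == 1); /* the handler only ever stores 1 */
    return seen;
}
```

After `install_handler()`, calling `raise(SIGINT)` runs the handler before `raise` returns, so the next `consume_signal()` reports 1 and the one after that reports 0.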


Seems to me that the GP was referring to the C abstract machine but used a similar, more common phrase by mistake.

Granting that, I can't say I agree with his point: C has probably run on more machines than any other language. As an abstraction over machines it is, if not perfect, clearly good enough.


If you don't understand what "virtual machine" means in this context, why are you posting?


Maybe because he uses C and has never heard someone referring to VM and C in the same sentence. I'm also interested in this because I also don't get wtf a "C VM" is. He could be referring to the layer below, which includes the hardware and OS combo. You know, memory pages, translations, ... Is he talking about that? :-)


Yes.

All languages have a virtual machine, or if you prefer abstract machine, that describes how the language sees the hardware.

CINT is a C/C++ interpreter for example,

http://root.cern.ch/drupal/content/cint

Programming languages are independent of their canonical implementation.

Usually the choice of a compiler, interpreter or JIT as the default implementation is a matter of return on investment for the set of tasks the language is targeted at.


He's using the term VM to mean a machine abstraction, which C really is, in the same sense that the OS is a VM.

It abstracts you away from device drivers, processor assembly, and memory (e.g. using files to represent a collection of data blocks on a block device).

C is a VM, but not in the same way the Java VM is a VM, because that is a VM on top of another VM (the OS and processor assembly).

Is it a software abstraction of a hardware system? Then it's a VM.


Software abstraction of a hw system sounds ok. VM in a C context is a little bit weird ... but it's only terminology.


I believe he actually intends to talk about the C abstract machine, which is a well-defined technical term (and is actually used extensively in the C standard).


> So, before writing about the glories of C, please go familiarize yourself with modern FORTRAN, ALGOL 60 and 68, Turbo Pascal's compiler and environment, a good Smalltalk like Squeak or Pharo, and the state of modern pipelines processor architectures.

All true, but how are you going to implement a single project using just the best parts of FORTRAN, ALGOL 60, Turbo Pascal and Smalltalk?


Yes, and that's the tragedy of it all...and the real reason everyone uses C. It's not that C is a well designed language appropriate for highly optimized compiling, with wonderful tooling and an elegant standard library that takes care of the difficult or insecure or repetitive tasks.

It's that I used to be able to sit down at any Unix machine in the world, shove some code into an editor, call some system libraries, get it to compile, and maybe even step through it in a debugger, and go home at the end of the day, and I could do that because that's what its designers were doing with it, so they made sure that all of it at least functioned at a good enough level to get through the day. Not always anymore, since a lot of Linux distributions no longer install the C compiler by default.


The fact that it is "not always anymore" may be a blessing in disguise. The focus of attention in the last decade or so has been all around heavier, VM-centric languages like Java or Ruby, but opportunities for new systems programming technology are coming up more often these days.


Fortran compilers sometimes produce faster code for a limited subdomain of problems because Fortran's language semantics are less restrictive: all arrays are assumed to not alias and the compiler has more freedom to reassociate arithmetic and apply other unsafe arithmetic transformations that are disallowed by C's stricter model.

C compilers let you specify that you want similarly relaxed semantics via compiler flags and language keywords; however, they also allow the programmer to have stricter behavior when necessary, e.g. allowing one to operate on memory that may actually alias and still get correct behavior.
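As a rough sketch of the keyword side of this, C99's `restrict` lets you make the Fortran-style no-alias promise per pointer. The function names below are hypothetical; the only thing taken from the language itself is the `restrict` qualifier.

```c
#include <assert.h>
#include <stddef.h>

/* Default C semantics: dst and src may overlap, so the compiler has to
 * be conservative about reordering or vectorizing the loads and stores. */
void scale_may_alias(float *dst, const float *src, float k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = k * src[i];
}

/* With restrict the caller promises that dst and src do not overlap,
 * which is the assumption a Fortran compiler gets for array arguments.
 * Breaking the promise is undefined behavior. */
void scale_no_alias(float *restrict dst, const float *restrict src,
                    float k, size_t n)
{
    assert(dst != src || n == 0); /* cheap sanity check, not a full overlap test */
    for (size_t i = 0; i < n; i++)
        dst[i] = k * src[i];
}
```

Both versions compute the same result when the arrays are distinct; only `scale_may_alias` may legally be called with `dst == src` to scale in place. Whether the `restrict` version is actually faster depends on the compiler and target.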


Also in Fortran you can get aliasing pointers, if you ask for it (mark the pointed variables or arrays as 'target').


The article argued that C was effective in practice, not that it was better than X in some other sense. Would you prefer to write production code in FORTRAN, ALGOL 60 and 68, Turbo Pascal, or Smalltalk to writing it in C today?


Yes. Though I tend to use PLT Racket these days when the choice of language is not constrained by other factors.


Seriously? Actual production code in ALGOL in 2013? If I ever had to deal with that code, I think I'd feel about like you would have if I wrote this comment in Aramaic because I prefer it to English on numerous grounds. I'd very much prefer the extremely unpopular Racket which is at least alive to some degree.

I think the author's criteria for "effectiveness" are very much different from yours, hence the different conclusions.


You might enjoy this link ;)

http://cowlark.com/2009-11-15-go/


I've read this comparison between Go and ALGOL-68, and I did enjoy it. Here are quotes from it that explain the difference between "good" and "effective" - exactly the point that I was making here (the article we're discussing claiming that C is effective, not that it is good in whatever sense):

"So why am I comparing Go with a language 41 years older? Because Go is not a good language."

But...

"I'm not suggesting that everyone drop Go and switch to Algol-68. There's a big difference between a good programming language and an effective programming language. Just look at Java. Go has a modern compiler, proper tools, libraries, a development community, and people wanting to use it. Algol-68 has none of these. I could use Go to get proper work done; I probably couldn't in Algol-68. Go is effective in a way that Algol-68 isn't."

Note that the author of a parent comment actually claimed to prefer ALGOL-68 to C for writing production code today, in practice.


Modern FORTRAN is absolutely horrific, to the extent that F77 is far more popular than more recent flavors (like F90 or F95). It suffers from the same types of problems that the author identifies in C++ and Java.

As for the standard library, I learned from K&R and have never been surprised by the library (insofar as I can reasonably predict what will happen, given the guidelines). I cannot say that about the C++ standard library (don't get me started on STL) or Java.

And as for debugging, it took me years to become adept at parsing STL error messages (wading through cruft like basic_string to get to the essence of the error).

However, I do agree that Borland and Turbo Pascal are exemplary.


It's not about being surprised by the standard library, there are no surprises there, it's about being appalled by the standard library.


The overarching point in the Erlang example is that the bug stemmed from something in Erlang itself. C and the standard library are sufficiently documented and tested that there is no ambiguity. (Regarding the natural counterpoint on indeterminate expressions like I++ + ++I, the language specification is also clear on the nature of indeterminate forms.)


I thought the major point is that it's a "high level" language, high enough that a compiler can do serious optimizations. Read Fran Allen's interview in Coders at Work, she despairs of writing efficient compilers/programs in C because it's so low level.

Now, sometimes you need that low level access, but C is used way beyond those niches.


FORTRAN 77 is far more popular because it took so long for a GNU FORTRAN 9x compiler to appear, and meanwhile there are millions of lines of FORTRAN 77 in enormous libraries.

The problem with the standard library is that it is tiny and its basic types are often ill designed. Certainly it's fairly consistent, and so fairly unsurprising, but that's not the argument.


It also has to do with the type of people that program in FORTRAN. They often have backgrounds more grounded in Maths or Physics than computers. I don't think the GNU compiler had much to do with it because people using FORTRAN are more likely to be using something else (we use Solaris Studio where I work).


The C 'virtual machine' is mostly effective at polarizing otherwise rational individuals.

The use of the term 'virtual machine' for C is appropriate in the case of an individual wanting to purposefully show that he has a 'better' handle on the semantics, and that a mere mortal surely would not understand his glorious knowledge.


Can you elaborate on your first point a bit? How, specifically, does C's machine model not fit current architectures well, aside from multiple cores? How is Fortran's model better?


C is based on a simplistic view of the computer as a Turing machine (or von Neumann machine, if you'd prefer).

Since the 70s or 80s, CPUs have gotten a lot faster while memory access has only gotten incrementally faster. This means that CPU manufacturers have put an increasingly sophisticated caching system on the CPU die to speed up access to main memory contents.

C's simple model, which allows pointer arithmetic, aliased pointers that refer to overlapping memory, etc., means that compilers can't produce the optimal cache behavior on the CPU, because they can't reason definitively about when a given region of memory will be accessed based on the variables being accessed (a pointer might have been modified or might overlap another one). They also have trouble invoking things like SIMD instructions, which do math quickly on vectors. Fortran, with stricter rules about arrays, is more conducive to this.
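A small, hypothetical illustration of why possible aliasing ties the compiler's hands (the function name `add_twice` is made up for this sketch):

```c
#include <assert.h>
#include <stddef.h>

/* Adds *b to *a twice. If the compiler knew a and b never alias, it
 * could load *b once and reuse the value. Because they may point at the
 * same int, the first store to *a might change *b, so *b has to be
 * reloaded before the second addition. */
void add_twice(int *a, const int *b)
{
    assert(a != NULL && b != NULL);
    *a += *b;
    *a += *b;
}
```

With distinct objects holding 1, `*a` ends up as 3. With `a` and `b` pointing at the same object holding 1, the result is 4, because the first addition doubles the value that the second one then reads. The compiler has to preserve both behaviors unless it can prove which case applies.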


Pointer aliasing can cause additional problems with today's multiple-issue (superscalar) CPUs, which are nothing like the CPUs of the 80s. Because pointer aliasing ties the compiler's hands, compilers may have trouble reordering code to maximize a single core's instruction level parallelism.


Aliasing is not really an architecture specific concern.

Furthermore the vast majority of languages allow free aliasing, differing from C in what things pointers are allowed to point at and not in the number of pointers which may point at them.


I agree. C is outdated and does not represent current machines very well. Nor does it benefit from the fruits of years of programming language research. Different layers of cache? Multiple cores? Stream processing? A type system that is the equivalent of sticks and stones in the age of guns? The joke of a "macro" system that is a minuscule improvement over sed expressions in your Makefile?

However, it is here to stay, just like Inch, Foot, and Fahrenheit. Not that the alternatives are worse; it's just that the cost of switching over and training a whole lot of people is back-breaking.

Note that most unfortunately, it is being taken over by javascript (false == '0'), which is a far worse language in many aspects.


> - C is straightforward to compile into fast machine code...on a PDP-11. Its virtual machine does not match modern architectures very well, and its explicitness about details of its machine mean FORTRAN compilers typically produce faster code...

Nothing is straightforward to compile into efficient machine code, and FORTRAN is not at all easier than C. Many compilers (e.g. GCC) use the same back-end for FORTRAN, C and other source languages. FORTRAN's clumsy arrays are a little nicer to the compiler than C's flexible pointers but the "restrict" keyword solves this.

> - C's tooling is hardly something to brag about, especially compared to its contemporaries like Smalltalk and Lisp. Most of the debuggers people use with C are command line monstrosities.

Yes, C debuggers may be a little clumsy compared to modern source level debuggers. But you can plug gdb in to almost anything, from a regular user space application, a microcontroller chip like Arduino, a kernel image running in QEMU, remotely debug Win32 code on Linux, a live CPU with a debugger dongle, etc.

> - Claiming a fast build/debug/run cycle for C is sad. It seems fast because of the failure in this area of C++. Go look at Turbo Pascal if you want to know how to make the build/debug/run cycle fast.

What's the point in compiling slow code quickly? Turbo Pascal was not really a good compiler, especially if you'd compare the output to a modern compiler.

Also take a look at the progress going on with LLVM/Clang/LLDB, they're working on better debugging and REPL-like environments for C.


Is there any way for software, even in assembly, to make intelligent use of on-cpu ram caches? I thought they were completely abstracted by the hardware.


Yes. You read about how the cache works, and design the code to take that into account. This is usually done through techniques like cache coloring (http://en.wikipedia.org/wiki/Cache_coloring), loop tiling (http://en.wikipedia.org/wiki/Loop_tiling) and other techniques that depend on the way that the cache works internally.

In addition, many CPUs provide prefetch instructions to make sure data is in cache before it's used, or cache-skipping loads and stores to prevent polluting cache with data that's only ever touched once.
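As a rough sketch of the loop tiling idea mentioned above, here is a blocked matrix transpose in plain C. The name `transpose_tiled` and the tile size of 32 are assumptions for illustration; a real tile size would be tuned to the cache of the target machine.

```c
#include <assert.h>
#include <stddef.h>

enum { TILE = 32 }; /* assumed tile edge, not tuned for any particular cache */

/* Transpose the n-by-n row-major matrix src into dst, one TILE x TILE
 * block at a time, so the rows of src and the columns of dst touched in
 * the inner loops stay in cache while they are being reused. */
void transpose_tiled(double *dst, const double *src, size_t n)
{
    assert(dst != src); /* out-of-place only */
    for (size_t ii = 0; ii < n; ii += TILE) {
        for (size_t jj = 0; jj < n; jj += TILE) {
            size_t i_end = ii + TILE < n ? ii + TILE : n;
            size_t j_end = jj + TILE < n ? jj + TILE : n;
            for (size_t i = ii; i < i_end; i++)
                for (size_t j = jj; j < j_end; j++)
                    dst[j * n + i] = src[i * n + j];
        }
    }
}
```

The result is identical to the naive two-loop transpose; only the order in which memory is touched changes, which is the whole point of the technique.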


Well, I am not as well-versed in optimization, but there is this:

https://en.wikipedia.org/wiki/Cache_timing_attack

If you can exploit the effect that cache has on latency, it is reasonable to assume that you can manipulate it in other ways.


Disclaimer: I am a C programmer.

C is straightforward to compile on pretty much every single CPU out there. A C compiler is required to certify many very common CPU's before they go into manufacturing, in fact. You cannot go to silicon, in many fabs, unless you have demonstrated the CPU can handle code; most of the code doing all this testing, is in C (lots of Assembly too). C is a front-line language, at the fabrication level.

- C's standard library may be sparse, but the diversity of other libraries out there means that you can do your own thing, and have complete control over what gets packed into your final binary image. This is a given, because it is powerful. A scan of /usr/lib, /usr/local/lib, or a search for .so/.dylibs in /Applications, for example, gives me, quite literally, nothing I can't link to with a C compiler.

- My fast edit/build/run/debug cycle goes like this: hit the hot-key in my editor, watch output, done. Of course, I also have periods where I must sit and compile a larger collection of libraries; usually at point breaks in my development cycle, but this is all taken care of by my build server, which runs independently of development servers. With C, I've been able to easily scale my build/release/developer binary packaging requirements from "needs to be instant interaction between new code and running tests" to "needs an hour to generate keys and dependencies and be prepared for a solid few years of use". C's build time scales according to the productivity of the developer.

- C is highly productive, if you do not trip over key things. Yes, building your own libs package should be a core part of professional C competency; any C developer who can't add a few lines of code to a multi-million-line codebase without sitting there waiting for the kettle to boil while it builds for testing shouldn't be allowed in any production environment. A proper C Makefile can build entire operating systems-worth of applications; for domain-specific apps, I hardly even notice the build time any more, in between edits and updates. C code is built so fast and integrated so well with repository/testing tools that, in my opinion, build time is simply not relevant.

Now, if you need a prepackaged thing, and don't want to hassle with needing to know how to assume an iron grip over the dependencies issue, then of course C is a big hassle. Other languages have their time, and place.

You know what else I like about C? Lua. With C and Lua, and a careful selection of optimal libraries, you can build a development environment beyond compare in any other realm. The pure joy of understanding how a table-driven application goes from:

work_q = {} and work_q = { {name="empty trash", when=os.now(), msg="The system trash is emptied.."}, {name="copy file", when=os.now()+1000, msg="A backup is being performed..", cmd="tar -cvf /mnt/backups /home/*"}, }

.. to a working C-layer on multiple platforms, all functioning exactly the same way, all working from a single binary .. well then, this joy is greater than many other language/systems of development.

Welcome to C in the 21st Century.


Well put.


I agree with all of that.

Basically C is where it is because of a combination of timing and market forces. Market forces are usually underestimated by theoretical computer scientists, but they have a much stronger say in the success of a language than anything else.

However I still believe knowing C inside out is an essential skill, and as bad as it is, it's better than most of the alternatives for "low level" code. They usually either try too hard to be different or don't offer enough to substitute C and all the traction, coders, literature and codebase around it. And, as you mentioned, the fact that architectures became historically forced to play nice with C's machine abstraction.


Could you give a few links to papers on Burroughs systems and the Connection Machine? Is this article on LtU [1] and Daniel Hillis's Connection Machine paper and book the sources I ought to look at?

Titles to some papers or some comparisons on a mailing list would be awesome to have if you know any off the top of your head.

[1] http://lambda-the-ultimate.org/node/87


> The social necessity of implementing a decent C compiler may have stifled our architecture development

The idea that specific requirements of C had started driving architecture development blew my mind the first time I came across it. It is still a point that does not get discussed nearly enough at a popular level.


None of what you said supports your initial claim that the author is ignorant. In fact, even if we pretend that everything you said is true, none of it contradicts the article, or the author's conclusions. Yes, C has lots of areas where other languages are better. And yet, it is still the most practical language to use because of the combination of things that it does well.


"it is still the most practical language"

Only in extremely limited contexts: low-level code for operating systems that happened to have been written in C. Otherwise, there is a better language for pretty much every use-case of C.


How might you suggest that I write a good compression, hashing, or encryption library with a lot of bit twiddling?

Any good alternatives to C?


Well, there is always Lisp:

http://psg.com/~dlamkins/sl/chapter18.html

At least SBCL will generate efficient machine language for bit vector manipulations if you tell it the sizes of the vectors (or if it can infer that information). It might just be a matter of opinion, but I would say that Lisp beats the pants off C in terms of programming; you are almost never going to have to keep track of pointers or memory allocations, you have far less undefined behavior to deal with, and it is generally a more expressive language. SBCL and CMUCL also support a non-standard feature that is similar to inline assembly language in C, the VOP system, that allows you to utilize obscure instructions or change how your code will be generated, if there is some reason for doing that (e.g. if your encryption library will use AESNI).

Of course, since you said "library," I assume you meant for this to be used in other programs, possibly programs not written in Lisp. Unfortunately, SBCL's support for that is still rough; commercial Lisp compilers may have better support for it. Of course, if you were targeting an OS written in Lisp, things would probably be different -- my view is that C's popularity is mostly a result of how many systems were written in C i.e. how much legacy code there is, and that were it not for that momentum C would be written off as an inexpressive and difficult to work with language.


low-level code for operating systems that happened to have been written in C

You say that like it is an accident that just about every major OS today is written in C or C/C++


"You say that like it is an accident that just about every major OS today is written in C or C/C++"

Is there some technical feature of C or C++ that makes those languages good for writing OSes, or that made OSes written in those languages overwhelmingly successful? You might argue that the simplicity of a C compiler made Unix successful because it helped with portability, but C is by no means unique in this regard -- a Lisp compiler can be equally simple, and can be used to bootstrap a more sophisticated compiler. Really, Unix won because the only real competition Unix vendors faced in the minicomputer and workstation market came from other Unix vendors; C was riding on the coattails of Unix. One would be hard-pressed to argue that Windows won because C++ is a great language, especially considering how Windows was plagued with bugs and security problems during its rise to prominence in the 90s (many of which resulted from bad pointer mechanics, which is what happens when you use a language that requires you to juggle pointers).


One would be hard-pressed to argue that Windows won because C++ is a great language

I'm not arguing that. I'm not even saying I know precisely why C or C/C++ is good. I'm just saying, when every major OS is written in it, can you really dismiss it so easily?


No, there isn't. What would you write a database in? Or an SMTP server, or IMAP, or HTTP, or whatever else. There's a reason all that stuff is almost always done in C.


C is a fantastic high level language.

Nonsense. This sudden C fad is totally baffling to me. C is a great language for low-level work and it does have an attractive minimal elegance but is in absolutely no sense of the term a high level language. A language with next to no standard containers or algorithms, manual memory management, raw pointers, a barely functional string type and minimal standard library and concurrency primitives is just the wrong choice for any application that doesn't need the kind of low-level control C provides.

Go ahead and jump on the bandwagon but don't come crying to me when you wind up with an unmaintainable mess riddled with security holes and fiendishly subtle memory errors.


> C... is in absolutely no sense of the term a high level language.

Of course it is. You're not worrying about how many registers your CPU has, or when you swap a register out to memory. All that has been abstracted away for you. The fact that you can access memory locations directly gives you some low-level access, but in a very real sense C is a high-level language. There are higher-level languages with more abstraction, of course. From the article: "It's not as high level as Java or C#, and certainly no where near as high level as Erlang, Python, or Javascript."


The problem is that the "level" of a language is relative to other languages. If C is a high level language, that means you can group it into the same pool as Java, C#, Haskell, Python, and Ruby. Don't you see a bit of a difference here? Unless there is a significant number of actual languages, not concepts, that are below C, it logically has to be called a "low level language" because there really isn't much below it but there are TONS of languages above it.

At the very least, it's something like a medium-level language. Grouping it with much higher-abstracted languages is just wrong.


Here's the thing: "high-level", when applied to a programming language, has a historical context. It means something specific (see: http://en.wikipedia.org/wiki/High-level_programming_language), and what it means and has always meant is that the language in question abstracts away registers and allows structured programming (as in, through the use of if, while, switch, for, as opposed to using labels and branching).

It's fine if you want to call "high-level" relative. But it should be acknowledged that "high-level" is not simply a strict comparison between the features of one language and another. And it SHOULD be acknowledged that C is, by convention, a high-level language. Python is definitely a higher-level language, but C is still a high-level language.
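For anyone who hasn't seen the contrast, here is a small sketch of what "structured programming as opposed to labels and branching" looks like, with both styles written in C. The function names are made up for illustration; the second version is roughly the shape you would write by hand in assembly.

```c
#include <assert.h>
#include <stddef.h>

/* Structured version: the loop construct carries the control flow. */
long sum_structured(const int *v, size_t n)
{
    assert(v != NULL || n == 0);
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* The same loop spelled with labels and branches. */
long sum_with_goto(const int *v, size_t n)
{
    assert(v != NULL || n == 0);
    long total = 0;
    size_t i = 0;
loop:
    if (i >= n)
        goto done;
    total += v[i];
    i++;
    goto loop;
done:
    return total;
}
```

Both functions return the same sum for the same input. In the first the control flow is visible in the syntax; in the second the reader has to reconstruct it from the jumps.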


Here's the thing: The English language is polysemous. Computer science jargon even more so.

Meaning that any particular term can often have many gradations of meaning. So "X means Y" does not necessarily imply "X does not mean Z". Especially when Y and Z are similar concepts.

I suppose we could lament the inherent ambiguity of the jargon, but the truth is that for the most part problems only result when ambiguous language is used in combination with an argumentative person armed with equal measures of pedantic zeal and failure to grasp the fundamental characteristics of natural language. Without that element, for the most part any reasonably knowledgeable person should be able to figure out which of the particular meanings is at play from context. For example, even though the term "high level language" is used in multiple ways, the statement "C is not a high level language" is not all that ambiguous, even when considered in isolation. As long as you're willing to grant that the person making the statement is not an idiot, then it's trivial to determine that they weren't using the "everything but machine and assembly language" definition.


> polysemous

I did not know this word. Thank you.


Did you read the "Relative Meaning" section of that link? It specifically discusses C.


Yes, I did read the section. And I'm not blind to the fact that languages will age, and there are certainly many, many languages more advanced and higher-level than C is now. I think that's good. It's a bit weird to me to see people who really never programmed in C, and for whom the concept of a pointer is a bit foreign, but time marches on.

I suppose my quibble is semantic at its heart. To say C is not a high-level language may be an opinion, both valid and not without reason. But it may also show a lack of perspective -- some missing history. It's a lower-level language compared to many, without question. But it's still a high-level language, and I have not only my reasons for saying so, but evidence to the point.


Under that classification, what is a low level language then? Even Forth abstracts away registers, and allows structured programming (of a sort). Is the set of low level languages that people have heard of in 2013 an empty set?


"Heard of", maybe not. But "working with" I can believe. A friend of mine was writing an assembler with a built-in arithmetic expression syntax last year.

Clearly it was low-level, because it wasn't trying to abstract away hardware - but it also wasn't just a straight-up (macro) assembly language, since it provided some specific features that needed a more sophisticated compiler to work properly.

When we speak of low-level coding, almost any abstraction can constitute a huge jump in power from the machine language.


Yes, and refusing to update (human) language in response to changes in the computing landscape is exactly the reason C fanatics don't see the weaknesses of C.


You can do some very interesting things with function pointers, structs, unions and judicious use of void pointers. I would absolutely group it with Java. Not saying they are very very similar, but you can do similar things.

Now grouping Java with Haskell, that doesn't seem like a reasonable grouping.

Of course, it's all swings and roundabouts really. You can draw lines anywhere.
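A minimal sketch of the sort of thing that can be done with structs, function pointers and void pointers: a hand-rolled "interface" that is vaguely comparable to a Java interface with one method. All the names here (`shape`, `rect`, `square`, `total_area`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* An "interface": one function pointer plus an opaque pointer to the
 * object it operates on. */
struct shape {
    double (*area)(const void *self);
    const void *self;
};

struct rect   { double w, h; };
struct square { double side; };

double rect_area(const void *self)
{
    const struct rect *r = self; /* void* converts implicitly in C */
    return r->w * r->h;
}

double square_area(const void *self)
{
    const struct square *s = self;
    return s->side * s->side;
}

/* Works on any mix of shapes without knowing their concrete types. */
double total_area(const struct shape *shapes, size_t n)
{
    assert(shapes != NULL || n == 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += shapes[i].area(shapes[i].self);
    return sum;
}
```

Nothing stops you from pairing `rect_area` with a pointer to a `struct square`; the compiler cannot check that, which is where the "judicious" part comes in.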


"You can do some very interesting things with function pointers, structs, unions and judicious use of void pointers"

You can also do interesting things with straight assembly language. How does that make a language high level?

That you would ever deal directly with pointers is evidence that C is not really a high level language.


Yes indeed, you can do some very interesting things like interpret the raw bits of a float as a character and crash or corrupt your program.


When you use void*s, you do so carefully. You can make mistakes in any language. You can get objects in Python that don't do what you expected because there's no type safety, leading to crashy behaviour (granted, no memory corruption). I think you might have misread my comment as "you can't make mistakes in C".
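For what "carefully" can look like when reinterpreting raw bits, here is a small sketch that reads the bit pattern of a float by copying it with `memcpy` rather than casting pointers. The name `float_bits` is made up, and the code assumes `float` is a 32-bit IEEE 754 type, which is common but not guaranteed by the standard.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the raw bit pattern of f. Copying the bytes is well defined,
 * whereas dereferencing a uint32_t* that points at a float is not.
 * Assumes float and uint32_t have the same size. */
uint32_t float_bits(float f)
{
    uint32_t u;
    assert(sizeof f == sizeof u);
    memcpy(&u, &f, sizeof u);
    return u;
}
```

Under the IEEE 754 assumption, `float_bits(1.0f)` is `0x3F800000`.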


It may have been a high level language when it was a choice between ASM and C but the definition of "high level" has moved on. C is no longer a high level language.

Just like diamonds are no longer at the top of the Mohs scale, C is no longer anywhere near the top of "high" in "High level" languages.


C is and will (probably) always be a high level language by definition. That it's categorized as a lower level high level language than say Python or Ruby does not magically make it a low level language. There are low level languages which map mnemonics to instruction sets and high level languages that provide an abstraction over having to use the former and are more akin to the written word. That hasn't changed just because we have "higher level high level languages" these days.

Language itself is a highly variable thing. People can understand something completely different from what the originator intended, but it doesn't change what said originator meant by it. People can always say "oh C is not a high level language because I think that a low level language should be X and Y" the same way they can say "oh men have to have a beard and know how to fight to be real men", but in reality men - beards or not - are still fundamentally men.


A categorization that contains all but one language (Assembler) is not a useful categorization.

Besides, why die on this hill? So everybody suddenly concedes that, fine, C is a "high level language". Is anybody's opinions going to be changed? No.

So, how about we stick to useful definitions, and agree that in modern times C is a low-level language, and, likewise, understand that agreeing to that isn't going to change one letter of the C specification or remove one line of C's libraries or anything else?

There's no gain to be had by anyone in this silly line of argument.


But that's exactly the point I was trying to make, albeit I wasn't very good at getting it across the internetz. Assembler is not truly "one language" it's actually a collection of mnemonics that are all extremely similar but based on specific architectures and instruction sets.

The point I want to get across is that for you C is a low level language. To the guy programming a fancy toaster in whatever version of an Assembly language, C is a high level language. To be anally retentive and follow your comment about the ranking of languages: if I follow that logic, then the only low level languages would be C, Assembly, and perhaps a compiler-compiler from the early '70s whose name I can't remember. The difference between Assembler and C is extremely jarring in comparison to the difference between C and Javascript (for example).

One thing to note is that C is used a lot for low level systems programming and because of it it's so commonly described as a low level programming language. That said, writing low level systems programs does not mean that we're exclusively using a low level language to do so.


That's being deliberately obtuse. How many languages are lower level than C, besides assembly? If your ranking has every language in the "high level" camp, then your ranking is poorly calibrated.


Assembly is not one language (even if they all look very alike). It's a collection of opcodes or mnemonics specific to a specific architecture with a specific instruction set that map to the latter. Even then writing assembly for MASM, TASM, WASM, or WhateverASM for Linux, Windows, or Winux (or whatever) can and will probably differ in structure and in the specific mnemonics used.

There's machine code, low level languages which are basically the plethora of different Assembly languages, and high level languages which provide abstractions to either or both of the former languages and resemble the actual written language. Putting C in the low level languages rank is doing a disservice to the language and not taking into account the bunch of people that program (for example) embedded devices in whatever architecture they're using.


> MASM, TASM, WASM, or WhateverASM

My favorite WhateverASM has got to be the "Single Pass Assembler".. or SPASM. I like to think the name was a backronym.


Hah! You deserve a billion upvotes because of the chuckles I just had remembering trying to explain the difference between one pass and two pass to a junior dev (which was completely irrelevant to our jobs at the time; I suspect he found something on Stack Overflow about it or something).


Forth is the only language I'd place between C and assembly.


Actually Forth is much higher level than C, since it allows you to define abstractions of any complexity in Forth itself.


I think what constitutes a "high level language" changes with every few iterations of Moore's law, as it becomes permissible to add layers and abstractions and virtual machines on top of the old "high level".


High-level? Yes. Fantastic? Not so much. Dependable? Sure. Well-supported? Generally.

C is the Latin of the modern programming world.


> You're not worrying about how many registers your CPU has

But you need to worry about size of registers. How many modern languages have basic data types without strictly defined sizes?


The C basic data types have minimum defined ranges, with the actual ranges being available as compile-time constants, and that's really all you need in almost all cases.


C is not a high level language, because it does not contain some abstractions that actual high level languages contain.

Here's a non-obvious example. With C, the programmer must always be aware of evaluation order. In fact, this must be specified even in cases where it is not important. For example:

    x = a + b;
    y = c + d;
Since the values x and y don't depend on each other, they could each be calculated in any order. However, with C, the programmer must explicitly decide on the temporal order that x and y are calculated (the order they appear in a function), even though it doesn't matter. And yes, the C compiler may decide to re-order those calculations under the covers as it decides (for performance reasons perhaps).

With Haskell, when you are outside I/O Monads and such, you don't specify the order of calculations like the above. Those two calculations may appear in that order in the source code, but the Haskell programmer knows that doesn't mean anything. The system may calculate x first, y first, or perhaps neither if those values aren't actually used elsewhere. Temporal ordering is not necessary (or desirable) under most circumstances.

You can never forget about temporal order in C, but there are higher level languages that allow this.


C wasn't high level when it was born, still less now.


You can wind up with "an unmaintanable mess riddled with security holes" in any language -- that's not unique to C. Regarding the flaws you have mentioned in C:

Its flaws are very very well known, and this is a virtue. All languages and implementations have gotchas and hangups. C is just far more upfront about it. And there are a ton of static and runtime tools to help you deal with the most common and dangerous mistakes. That some of the most heavily used and reliable software in the world is built on C is proof that the flaws are overblown, and easy to detect and fix.


Security holes you wind up with in a Java or whatever codebase take the form of logic errors allowing for compromise of user information, or SQL injection or command injection. These are bad, they totally compromise data and systems.

Security holes in C applications allow for the injection of low-level programs. This is way more annoying, IMO.

Also, the static and runtime tools for C are good, and getting better, but still very far from being a replacement for writing code in a memory-safe language.


Agreed, and in C you still have all the bugs of the type that you describe for Java or whatever...

It's just that there is a lot of C out there, which means we will see it forever in our systems somewhere. It will get better over time as tools improve the existing C code base.


"You can wind up with "an unmaintanable mess riddled with security holes" in any language"

Sure, but it is much, much, much easier to do in C than in other languages. By default you have unchecked array access, fixed width integers, strings terminated by a special character, and tons of exploitable undefined behavior. Writing secure code in C takes substantial effort, substantially more than in other languages.


That's a misleading statement and I think you know it. In general, you have to go extremely out of your way to end up with a remote code execution exploit with Java or C#.

Look at Microsoft's CVEs for 2012. Approximately all of the serious ones would be impossible with a proper type system. This is after all of Microsoft's static analysis tools and code reviews and security focus.


In traditional circles (read: not rubymongojs land), C is considered a high level language because you aren't writing assembly or maintaining stacks by hand.


Not in this decade... I used to do embedded programming and even we considered C/C++ to be somewhere in the middle, with Java being the "high level" language for our management software and C/C++ being the "low level" language for our embedded devices.


Did you run Java embedded?


No, we ran Java in the middle-ware.


You are, however, dealing with pointers, memory deallocation, the bit widths of integers, etc. C as a high level language is a questionable idea.


The author makes the point more effectively when he writes "It's blown everything else away to the point it's moved the bar and redefined what we think of as a low level language."


> C... is in absolutely no sense of the term a high level language.

One problem with the term "high level language," is that it was coined many, many decades ago. C is a "high level language" like FORTRAN is one. Remember, when C was invented, programming via punch cards was still a routine activity in many shops.

This is why the assertion that, "C is a fantastic high level language," or even that it's a high level language at all is so controversial: It exploits a term which is so old, changing expectations have tectonically shifted the programming field into a place where there is an opportunity for surprise and controversy.

There are "high level languages" which are minimal, and those which are not. There are "high level languages" which offer a large amount of abstraction and those which offer less. Right there are two spectra which can be used to classify languages, and C is stuck near the extreme lower left corner. (A minimal language which offers less abstraction.) Then you have languages like Lisp and Smalltalk which are even more minimal, but which offer higher degrees of abstraction. When one says, "high level language," one is talking about a very broad category.

Which is why you can do this: http://news.ycombinator.com/item?id=5037315


It's not actually controversial, it's just arguing about the definition. Change the phrasing to "C is a fantastic abstraction over the machine, but it's not at the same high level as, say, Erlang" and everyone agrees.

The article uses an old standard as a little trick to try to put C in a category it doesn't belong in.


It's not particularly controversial to say that C is a high-level language. Maybe a bit historical, but it hasn't lost any features since it was new and ultra-high-level for its time.


> a barely functional string type

There's a string type in C? All I see are character arrays and a few (poorly thought out) conventions for using them.


It may be cumbersome, but C has string handling: http://en.wikipedia.org/wiki/C_string_handling


This is an article about the standard library. C has no built in string handling. As an example, there are embedded systems that do not have support for string types due to a lack of support for libraries that handle it, yet they are 100% compliant with the C spec.


The C standard library is specified as part of the language specification document. The language defines two kinds of implementations: "hosted", which must support the entire standard library; and "freestanding", which need only support a small subset of it.

Those embedded systems that do not support the full standard library are 100% compliant freestanding implementations, but are not compliant hosted implementations.


How you can be 100% compliant with the C spec by omitting strlen is a mystery to me.



Forget to place that NUL terminator in the right place and boom!


As someone who's happy writing C, has written a lot of C at work and at home, I have to say "C fad" sounds a bit weird.

What I see talking to people and looking at forums (this one for example) is the exact opposite of a C fad. Frankly I wish it were the other way. I encounter fewer and fewer people with C knowledge, and especially fewer people who say they are happy writing C. New college grads sure as hell don't graduate in large numbers knowing C, even if a good chunk of them can churn out acceptable Java.


That's why we have valgrind and glib.

At the core, a good language shouldn't really have any of these things you say built in - they should be built on top of the language itself.


Having strings, containers, etc, not built into the language itself means that you can't practically use them. Libraries will all have their own incarnations, and as a result you get stuck dealing with the lowest common denominator of C arrays.



The problem is choice. If you're writing a library, no matter which containers you choose you'll have users that chose differently. So you either require full buy-in for your framework (like GTK does with GLIB) or you end up exposing the lowest common denominator of functionality (C strings and arrays).


I've long been a diehard C fan, and I still jump at the opportunity to use it, but the realization that choice can be harmful has gradually pushed me away from C.

The problem of choice with libraries is really only the beginning though. Coding style/formatting choice, language feature choice (infamously an issue with C++; which subset of the language to use is the source of most trouble in C++ teams in my experience), build system choice, etc are all sources of trouble. Most modern languages cover some of these with various effectiveness, but miss others. Unfortunately, the popular modern languages seem to do rather poorly in these areas.

I find either extreme comforting. The reckless freedom that C offers is alluring, but the bondage of a batteries included language with strong idioms covering the entire system makes for better programmers.


And so this buy-in occurs at the library level rather than the language level. One might see that as a good thing.


Those people would be wrong. When buy-in occurs at the library level, then the de-facto response is to fall back to the lowest common denominator, whatever that happens to be. In a language with standard containers, that lowest common denominator is a reasonable floor. Having containers in the standard library has basically zero downside aside from some perceived conceptual elegance. People who need custom containers can always define their own. People who just need to get a string from point A to point B, where A and B are libraries written by different people, can easily do so by falling back to the standard containers.


I agree with this so much. A few years ago I realized that my C code wasn't longer or harder to reason about than most higher-level languages (yes, that factors in the memory management and even error checking), except for one thing: those languages came with built-in resizable arrays and hash tables.

I was grossly disappointed when the C11 revision had nothing about some default algorithm implementations in the standard library. If a C hacker needs something more specialized than those default containers, they can go write their own (they would have anyway), but for the rest of us it allows getting something reasonable off the ground relatively quickly.

APR and Glib are nice in theory, but most software shouldn't require such dependencies. Also, every C hacker I've spoken to has a different opinion about the quality and usefulness of those two libraries; most dislike the two.


That's kind of the point. Why do you have to choose? Why isn't it possible with the language and its standard libraries? What happens when you mix APR code with glib-using code?


1. Because there is no standard defining that. 2. Because there is no standard defining that. 3. Nothing. Functions in both libraries are prefixed to avoid name clashes.


The lack of standard for 1+2 is the complaint here. Prefixing in the libraries is a workaround to make it possible at all for their separate (but similar) basic types to co-exist, but the libraries shouldn't have to define sane basic types for these things. At most they should have to add extended operations.

But because they don't, it must be reimplemented in each library that needs them, which adds to code size and semantic overhead. What might be helpful is for a good basic type library to be created which could be a common dependency for each (perhaps host it on CCAN), but the fact that this would be needed at all is the base complaint.


Isn't high level relative? I was once speaking to a friend who is an electrical engineer. While discussing some DSP stuff with him I explained some asm instructions, and suddenly he said "you are going high level" :) I am pretty sure that if he speaks to some physics/semiconductor guy he will probably get the same answer back from them.


C is (more specifically) a 3rd generation language, and when I was going through high school I remember it always being on the list of high level languages. It's only more recently, I think, that we consider it otherwise...


C is not a fad. The sudden love for C is the logical backlash caused by the abuse of lasagna layer upon layer of enterprise oo mvc enabled, xml configurable, orm supporting, osgi friendly huge monolith frameworks. A whole generation of developers have barely known simple and small. And they got tired of big enterprise and are searching for something "not as clumsy or random as a blaster. An elegant weapon, for a more civilized age." . Maybe it is the wrong direction in which they are searching - but I think that C mastery pays off in the long term.


Totally agree. Learning object oriented frameworks gives me a headache. I learned C back in 1991 for first-year CS classes. Consider all the software built in C: Linux, FreeBSD, Apache, etc. It's the dominant infrastructure of the web.

What language will be used to build the first human-level AI? I think it will probably be C, and less than 1 million lines of code.


> What language will be used to build the first human-level AI? I think it will probably be C, and less than 1 million lines of code.

I don't think so, but I respect your opinion.


C is a mid-level language and it dominates that space.

The high-level space still doesn't have a winner and it probably won't for some time. It has a lot of excellent languages that are all radically different (Clojure, Haskell, Python, Scala, Erlang) and each is really cool in some ways and very infuriating in others. C, of course, is really cool in some ways and very infuriating in others as well.

I've taken up learning C, and my feeling about it is that it's not a "jump in and figure it out" language, in the way that most technologies are, because you're absolutely right. Naive or "cargo cult" C is going to be a disaster from a security, code-quality, and production stability perspective.

For most technologies to qualify as safe, sane, and consensual, they have to be secure even when wielded by a programmer who cobbled together incomplete knowledge from a few dozen Internet articles about how to do stuff. Not C. There are "quick adoption" technologies with quick learning curves and "lifer" technologies that are quirky and hard to learn at first but provide long-lived knowledge. C is a lifer language. If you don't treat it as such, you'll faceplant.


C is hardly a fad given that it is the foundation for the vast bulk of software that you interact with everyday (if not the top layer, ones below it). But of course at some point there is a looping back when silver bullets fail to pay off in the longer term.


"A language with next to no standard containers or algorithms, manual memory management, raw pointers, a barely functional string type and minimal standard library and concurrency primitives is just the wrong choice for any application that doesn't need the kind of low-level control C provides."

That's what libraries are for, numbnut.


C/C++/Java. A programmer's version of Rock/Paper/Scissors.

Ignoring pre-history (BASIC, FORTRAN, PDP-11 assembler, Z80 assembler, Pascal), I started out in C, many years ago. I found myself using macros and libraries to provide useful combinations of state and functions. I was reinventing objects and found C++.

I was a very happy user of C++ for many years, starting very early on (cfront days). But I was burned by the complexity of the language, and the extremely subtle interaction of features. And I was tired of memory management. I was longing for Java, and then it appeared.

And I was happy. As I was learning the language, I was sure I missed something. Every object is in the heap? Really? There is really no way to have one object physically embedded within another? But everything else was so nice, I didn't care.

And now I'm writing a couple of systems that would like to use many gigabytes of memory, containing millions of objects, some small, some large. The per-object overhead is killing me. GC tuning is a nightmare. I'm implementing suballocation schemes. I'm writing microbenchmarks to compare working with ordinary objects with objects serialized to byte arrays. And since C++ has become a hideous mess, far more complicated than the early version that burned me, I long for C again.

So I don't like any language right now.


> Ignoring pre-history (BASIC, FORTRAN, PDP-11 assembler, Z80 assembler, Pascal)

A side effect of C universalization, especially with open source, is that people forget all about that pre-history. Hacker culture now is mostly C/Unix with a dash of Lisp and Smalltalk heritage.

But from what I remember of microcomputer culture in the early 80s, hackers in the field were doing line-numbered Basic and Assembly language - if you had exotic tastes maybe Forth or something. If you were a Basic guy C was DANGEROUS (my programming teacher spoke of C the way Nancy Reagan spoke of crack cocaine) and if you were an assembly guy C was a slow hog.

And if you had access to a "real" computer, chances are it was running PL/I, COBOL or Fortran, not C.

The 1983 video game Tron had levels named after programming languages: "BASIC", "RPG", "COBOL", "FORTRAN", "SNOBOL", "PL1", "PASCAL", "ALGOL", "ASSEMBLY", "OS" and "JCL". C apparently did not merit mention; its complete dominance didn't come until some ways into the PC era.


SNOBOL. I loved SNOBOL. A completely bizarre, very powerful language. I took two wonderful compiler classes from R. B. K. Dewar, who worked on the Spitbol implementation. He was also involved with the SETL language, which really impressed me.


And SETL influenced ABC, and then Python.


I was doing Turbo Pascal with some C only when requested to do so.

In the university UNIX forced me to move to C and later on C++. P2C was the only available Pascal compiler and a joke when compared with Turbo Pascal.


You may like Rust (disclaimer: I work on Rust). You get the choice of manual or automatic memory management, but no matter which you choose you don't sacrifice safety. The language is simpler than C++ but can be about as fast as idiomatic C++ and supports a bunch of higher-level features like closures and generics.


Rust is the one language that we as a mixed java/c++/perl/scala shop look at, and see a future in. Exactly because of its safety features. Keep up the good work!


I have very much the same pre/history, along with years of Perl. Then I discovered Go, and can't recommend it highly enough for exactly the reasons and specific case you state.


I attended a tutorial on Go, and while I liked much of what I heard, the tutorial was so poorly done that I did not leave with a good high-level understanding.

I also left with the impression that it is very early days with Go, in the sense that several parts of the language are evolving. (Or maybe I have this impression because the presenter loved diving into ratholes.)


If you have an hour go through http://tour.golang.org/#1. If you're like me and can't enjoy a new language without connecting it to the network and passing some JSON around, check out http://golang.org/doc/articles/wiki/ and take a look at http://golang.org/pkg/encoding/json/#Marshal (if, also like me, you're picky about field name casing, pay attention to the little bit about `json:"myName"`)


I'll second this.

I went through the Tour of Go over a 3-day weekend when I didn't have many obligations and messed with a reasonable portion of the examples although none of the exercises (I think).

I really like it, although I don't think it replaces C's niche. But I think for the level where you'd like some of the structure and goodies of C++ or Java without the complexity it's pretty compelling.

The main thing I've been wishing were different about Go recently is that I wish it could generate some kind of C style dlls that could be interfaced with from other languages. It seems that Go wants to be the top-level, so I don't really know a way to use it for libraries or extensions for other languages.

So instead I'm considering C and Lua - C for the library with possibly a linked or embedded Lua to give some higher level niceties.

My hope is that I could create extensions that are callable from other languages using C APIs and then in the extension push some of the actual computations to Lua when it makes sense to use things like lists/dictionaries.


C# is an improvement in this regard - it at least gives you structs for when you need to store related data together without any greater overhead than you'd have in C.

Unfortunately unless you're on windows it's not that fast compared to Java.


I still hope that Microsoft will eventually make Bartok part of the standard .NET tooling and with that a proper native implementation for C#.

Actually they are doing something like that already with MDIL in WP8.


What kind of systems are you building where many GB of memory and millions of objects are a problem? I mean, just the amount of memory and the number of objects by itself isn't a problem for Java nowadays.

You mention GC tuning and object overhead so it sounds like there is some high-frequency object creation happening?


It also has a few downsides:

- The build system is broken.

vs. make. qmake. cmake. autotools. scons. 'modern' makefiles (>_> what does that even mean? Yes, I'm looking at you Google) There's a whole ecosystem of tools out there to solve this.

- There are a few rubbish IDEs, most of which support c as a second class candidate.

VS has officially abandoned C; xcode grudgingly supports it. The CDT is mediocre. There's little or no support for refactoring or doing other things on large code bases, and the majority of the time the work has to be done manually.

- Because the build system is broken, dependency management is hell

Got one library that uses scons, another using autoconf, and want to build a local dev instance of a library without installing it into the system path? Goooood luck.

This is made even worse by arcane things like rpath, which mean that dynamic libraries only work when they are in specific paths relative to the application binary (I'm looking at you OSX).

- It's a terrible collaboration platform.

Why? Because the barrier to entry is high. To submit a patch that works and doesn't break other things, doesn't introduce serious security holes or memory leaks is hard.

Astonishingly, some successful projects like SDL are actually written in C, but most C projects are not collaboration friendly.

- There are no frameworks for doing common high productivity tasks in C, only low level system operations, or vastly complicated frameworks with terrible apis (>_> GTK).

I'll just whip up a C REST service that talks to mysql and point nginx at it. Um... yeah. I'm sure there's an easy way to do that. ...perhaps...?

How about a quick UI application? We'll just use GTK, that's portable and easy. ...Or, not very easy, and vastly complicated to setup and build.

These issues aren't unique to C, but they're certainly issues.

I'm not really sure I'd happily wander around telling people how amazing C is at everything.

It's a tool for some jobs; definitely not all.


> It's a tool for some jobs; definitely not all.

I'd completely agree. I would not use C for doing any web platform work (writing a REST service as you say). I may write a webserver in C if I had tight memory constraints.

Where I see C still being very useful: embedded applications, drivers, and latency sensitive code.

When you are trying to push the most I/O possible through a network interface or disk interface, C allows extremely tight controls on memory and CPU usage. Especially if you are coding to a specific CPU for an embedded product, you can really tweak the app to perform as required (though, this may require some assembly to do what you need).


I have done low-level and mobile programming on very restricted platforms and I cannot see any reason why in the world I would use C instead of C++. Basically there is always an opportunity to use C++ if you can use C. Myths that C++ is slower are spread by people who just do not know C++ well or are not skilled/clever enough to use it.


Indeed. And there are numerous reasons why C++ code can be (and often is) significantly faster: firstly, inlining of code that would have to go through function pointers in C (à la qsort vs std::sort); secondly, techniques such as expression templates in things like matrix libraries.


Bravo. I had to read 15 minutes worth of comments to finally find one person who actually knows the truth. And this little gem was downvoted. Bravo.


> There are no frameworks for doing common high productivity tasks in C, only low level system operations, or vastly complicated frameworks with terrible apis (>_> GTK).

> I'll just whip up a C REST service that talks to mysql and point nginx at it. Um... yeah. I'm sure there's an easy way to do that. ...perhaps...?

Search for libfcgi. It's a nice and simple C library that can be used behind nginx and, as a matter of fact, anything that supports fcgi. In this case routes are defined in nginx. But just think a bit, it's a nice little framework ;-)

Btw, I used exactly that a few months ago.


For the record, I found cmake to be the best. The output is great, the convenience is awesome, the only thing that bugs me is the strange syntax, which is fine because once you set it up it gets out of your way fine.


Though it doesn't reach all the dark corners that cmake does, I found premake to be much easier to figure out and customize. All the configuration is done in Lua and you can customize the state with a normal programming language as opposed to the crazy cmake arcana you sometimes have to resort to.


To both parents: whenever I have to install and work out how to debug yet another C build system on my Mac (default build settings never work on Macs), it leaves me wanting to (and indeed going on to) rewrite entire libraries I should be just using. (notable exception: autohell seems to work, some of the time, and it even sometimes fails with an error telling me what's wrong.)

When I use python extensions (often written in C), this problem doesn't arise because distutils, and sometimes cython, handle everything.

Python isn't the best C build system, but frankly it scores better than cmake and friends for ease of use.


Oh, yeah, definitely, I wish everything was as easy as "pip install". On the other hand, I can sympathize with nothing working on macs. This is because the environment plain sucks.

All the installed tools and libraries are old as hell and rarely get updated, and it's riddled with quirks and tools that are subtly different on osx. I hear it's gotten better with Mountain Lion.

Thank god for homebrew.


Macports is my tool of choice, but yes it's the only way to get things done! Unfortunately, plenty of scientific libraries are too obscure to be ported, leaving it to us scientists...

Interesting what you say about 10.8 - I might give it a shot then. I will be upgrading soon from snow leopard (2009 MBP) and I was going to wipe and go debian/XFCE, because of all the "new ideas" like "let's not have save as" etc, but if installing stuff is easier (and I know multitouch works) I might give it a go.


A halfway solution I'm reasonably happy with is to have a VirtualBox install of Debian on top of OSX, and install most scientific libraries there. I've considered just wiping and installing Debian as the main OS, but when I first looked the Linux driver support for my MBP wasn't great, and since then I just haven't gotten around to it. VirtualBox works fine for anything that isn't too heavy duty, and anything heavy-duty I'm running on a beefier server machine anyway.


This is a good idea. The problem I have is that I sometimes need basically all my RAM for simulations (yes, occasionally even on my laptop) and my current one only has 4GB (with OS X eating up about 1.5GB on startup for "kernel tasks") so it's dangerous to run a VM with more than 2GB. New laptop will def have 8GB though, so this solution becomes viable again, thanks :)


Oh no, I think slapping linux on it is the way to go, don't get me wrong :)

But I got stuck with it because I had to do ios dev at some point, so I learned to adapt ...

I hear arch and ubuntu both work great out of the box on macbooks.


hmm, If I can get Arch to install, I might give it a go. Ubuntu was a bit pants when I tried it for a month early last year (had installed Lion, was pants on the old machine, tried ubuntu before rolling back to SL.) Multitouch gestures for multiple desktops is a must now that I've got used to it...


By the way, I refactor C with regexps and text utils. In UNIX a pretty complicated refactoring can be done with a shell oneliner. This method actually rules. C allows that because it does not have namespaces. C is minimalistic so that it allows working with it with minimalistic tools rather comfortably.


Really?!

Have you ever looked at what IntelliJ allows as refactoring tools?


IntelliJ Idea for Java?


Yes.

There are lots of refactorings that cannot be done by simple search and replace, even for C, because they require semantic knowledge of the language.
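As a rough illustration of the kind of rename that plain search and replace gets wrong, here is a small sketch. The `Packet` type and the three functions are hypothetical, made up only to show three different things that all spell `len`:

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical record type; suppose we want to rename its `len` field to `length`.
struct Packet {
    std::size_t len;   // (1) the field we actually want to rename
};

// Uses the field: a rename of Packet::len must touch this line.
std::size_t payload_bytes(const Packet& p) {
    return p.len;
}

// An unrelated local variable that happens to share the name. A plain textual
// replace of `len` would also hit this local, and without word boundaries it
// would even mangle the call to std::strlen.
std::size_t text_bytes(const char* text) {
    std::size_t len = std::strlen(text);   // (2) unrelated local
    return len;
}

// The same three letters inside a string literal, which may be part of a
// wire format or log message that must not change.
std::string field_name() {
    return "len";                          // (3) data, not an identifier
}
```

A tool that understands scopes and types can tell (1) apart from (2) and (3); a regexp can only approximate that.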


Very true, especially when you consider how many namespaces C has. In C++, it's even harder.


you know that the c/c++ plugin for intellij is broken? so what are you using that is smart enough to parse and refactor c? (serious question as i've had to switch to eclipse).


I seldom use C or C++ nowadays and when I do it is only C++.

I rarely do IDE refactorings when in C/C++ because of what you state.

QtCreator and CDT do offer some nice things, although still far from what is possible in other languages.

If JetBrains produced a proper C/C++ IDE I would probably buy it for C++ work.


confused by this response i re-read the thread and it turns out i mis-read your comment (thought you said "can" instead of "cannot"). sorry!


There is a problem that each and every tool for semantic-aware C/C++ refactoring I have tried does not work 100% of the time (i.e. it sucks more than 0%). This happens particularly because C/C++ have header files that expand to huge source files and preprocessor macros driving the parsers nuts because the parsers of the refactoring tools are simplified for speed, and also in a typical C/C++ program there is a number of combinations of macros that can be active or non-active depending on compiler options. So in case of C it always comes out faster, easier and more reliable for me to use basic tools such as search/replace because C has very simple semantics without namespaces. I often do that even in C++ because of the mentioned reasons.


If you are using macros in C++, you are doing it wrong. They are a relic of a bygone age.


Thanks for letting me know, I try to use the macros as little as possible, but there are such things as platform-specific code, compiler differences, third-party libraries, standard libraries such as WinAPI and legacy code.
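For the platform-specific case, a minimal sketch of what such a macro use typically looks like. `_WIN32` is the macro predefined by compilers targeting Windows; the constant `kPathSeparator` and the helper `join_path` are hypothetical names chosen for this example:

```cpp
#include <string>

// Pick the separator at preprocessing time. _WIN32 is predefined by
// compilers targeting Windows; everything else is treated as POSIX-like here.
#if defined(_WIN32)
constexpr char kPathSeparator = '\\';
#else
constexpr char kPathSeparator = '/';
#endif

// Hypothetical helper: joins two path components with the platform separator.
inline std::string join_path(const std::string& dir, const std::string& file) {
    return dir + kPathSeparator + file;
}
```

Only one branch of the `#if` is ever seen by the compiler, which is also why refactoring tools that parse a single configuration can miss the other one.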


Platform-specific code is a very good use that eluded me at 1 o'clock this morning.

I tend to wrap legacy code in a C++ wrapper if I can, which lets me code against an idiomatic C++ API rather than whatever that particular developer chose. More often than not, it adds no overhead thanks to inlined functions...
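A minimal sketch of that wrapping style, assuming a made-up legacy API: the `legacy_buf_*` functions below are hypothetical stand-ins defined here only so the example compiles, and `Buffer` is the thin wrapper over them.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// --- Stand-in for a legacy C-style API (hypothetical) ---
struct legacy_buf {
    int*        data;
    std::size_t count;
};

legacy_buf* legacy_buf_create(std::size_t count) {
    legacy_buf* b = static_cast<legacy_buf*>(std::malloc(sizeof(legacy_buf)));
    if (!b) return nullptr;
    b->data = static_cast<int*>(std::calloc(count, sizeof(int)));  // zero-filled
    if (!b->data) { std::free(b); return nullptr; }
    b->count = count;
    return b;
}

void legacy_buf_destroy(legacy_buf* b) {
    if (b) { std::free(b->data); std::free(b); }
}

int  legacy_buf_get(const legacy_buf* b, std::size_t i) { return b->data[i]; }
void legacy_buf_set(legacy_buf* b, std::size_t i, int v) { b->data[i] = v; }

// --- Thin C++ wrapper: the constructor and destructor own the handle, and the
// member functions are defined in-class, so they are candidates for inlining. ---
class Buffer {
public:
    explicit Buffer(std::size_t count) : buf_(legacy_buf_create(count)) {
        if (!buf_) throw std::bad_alloc();
    }
    ~Buffer() { legacy_buf_destroy(buf_); }

    // Copying would double-free the handle, so it is disabled in this sketch.
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;

    std::size_t size() const { return buf_->count; }
    int  get(std::size_t i) const { return legacy_buf_get(buf_, i); }
    void set(std::size_t i, int v) { legacy_buf_set(buf_, i, v); }

private:
    legacy_buf* buf_;
};
```

Callers never see the create/destroy pair, so a forgotten `legacy_buf_destroy` is no longer possible on the C++ side.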


I always get bashed for saying I like C++, but I genuinely don't get how C programmers manage without code like this:

    std::map<std::string, std::vector<std::string>> foo;
One line and you've set up a nontrivial data structure with automatic memory management. No macro horrors (which are a diabolical way of implementing what C++ templates do well).
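To make that concrete, a short sketch of using such a map (C++11, since it relies on `>>` and range-based for). The function name `group_by_initial` and the grouping task are made up for illustration:

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical example: group words by their first letter.
std::map<std::string, std::vector<std::string>> group_by_initial(
        const std::vector<std::string>& words) {
    std::map<std::string, std::vector<std::string>> groups;
    for (const std::string& w : words) {
        if (w.empty()) continue;
        // operator[] creates an empty vector the first time a key is seen.
        groups[w.substr(0, 1)].push_back(w);
    }
    return groups;   // no manual cleanup: the map frees its keys and vectors
}
```

There is no explicit allocation or deallocation anywhere in the function, which is the point being made above.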

Since you can code like C in C++, I'm not sure why more people don't use C-with-templates as a programming style.


This is precisely my problem with C, and why I hate using it. Data structures are incredibly tedious.

Whenever I see younger people gushing about C, I can't help but wonder if they've actually written anything nontrivial with it. When used properly, C++ is a vastly more productive language.


C guy here. I'm assuming you're serious and not being sarcastic... which is a nontrivial assumption, because that line you posted looks like something out of the foul depths of hell.

You aren't going to run out of lines any time soon. Who cares whether your data structure definition is 1 line or 5?


> because that line you posted looks like something out of the foul depths of hell.

I must say I laughed out loud at that, mainly because I've been on both sides of the divide when it comes to opinions on verbose std definitions. It is an odd feeling to simultaneously feel revulsion and nostalgia towards a line of code.

I do think there is a non-trivial, and sometimes massive boon gained from concise definitions. It's the difference between an acronym in natural language vs referencing a concept by its full verbose name. The shorter the definition of a concept, the fewer units used by your working memory when referencing that concept, thus freeing your mind to higher level considerations.


I agree - I've found the STL a massive productivity boost, for all its ugliness. It's worth it for std::vector and std::string alone, and I also like std::map (even though the API teeters on the boundary between genius and insanity) and std::set. All there for your use, with your own data types, out of the box.

I discovered the STL one day in February 1999, by accident, when looking for something else in the VC++ help. Once I realised what I'd found, I gave up C entirely within about 1 week.


Foul depths of hell? I've never programmed in C or C++, but this seems to me to be a map from strings to vectors of strings. Reads easily. I don't like the :: namespace separator, but what can you do...


> but what can you do...

    using namespace std;

    map<string, vector<string>> foo;


I used to do this. But then I realized I would be importing all symbols from std into my code. Including ones that might get defined in the future (similar to python's 'from foo import *'), which might conflict with some of my own symbols.

Hence, I now explicitly import each symbol I need with something like 'using std::map;' (similar to python's 'from foo import bar, bar2').

I've found that the 'using' statement can be used even inside a function to restrict the importation to just a single function.

FYI.
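The two scopes described above can be sketched like this; `count_chars` is just a made-up function to have something to put the declarations in:

```cpp
#include <map>
#include <string>

// File scope: import exactly one name, comparable to Python's
// `from foo import bar`.
using std::string;

int count_chars(const string& text, char c) {
    // Function scope: std::map is usable unqualified only inside this body.
    using std::map;
    map<char, int> counts;
    for (char ch : text) {
        ++counts[ch];
    }
    return counts[c];   // 0 if the character never occurred
}
```

Outside `count_chars`, `map` still has to be written as `std::map`, so nothing else in the file is affected.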


That's good advice. But even when importing all of std, at least namespace collisions in C++ will result in some somewhat sane compiler error.

I not so fondly remember a day hunting down a cryptic C compiler error coming from some header that had

      #define F1 ...,
replacing F1 in my file by some cryptic mess...


Interesting; I never thought of that analogy to Python. I've just had it drilled into my head from everything I've ever read about C++ that "using namespace std" is terrible, and I shouldn't use it.

Great advice. Thanks!


Except the analogy to Python lacks one critical point: C++ headers and Python imports do not work in nearly the same way. Do not EVER use a "using namespace <foo>" in a header file; any code that #include's such a header will pull this in.

In .cpp files it's ok (and almost required if you're using boost, unless you want boost::foo::bar::type everywhere). It still requires a bit of thought, though.


Isn't it ugly to see the shift operator here? :-)


Unless you're on C++11 you'll be required to shove a space in there to get it to compile anyway ;)


Binary search trees cannot be implemented in 5 (reasonably-wide) lines of C.


And can't be _implemented_ in any language. C gives you at least the charming fact that you can implement one without any overhead of objects, classes, implicit memory management or some other concept, just having a lightweight implementation that you may use in an environment without any libraries, like std or else..


So what you're saying is that C lets you write it on your own. In fact, and this might sound crazy, but you can do that exact same thing in C++, it's just that people don't because the C++ standard library provides plenty of performant and featureful data structures for you.

Besides, anything the C++ standard template library map does when implementing a red-black tree is exactly what you would need to do if you wrote your own, minus having things like allocators (which are incredibly useful for those that need them). You speak of overhead due to "objects, classes and implicit memory management". Inevitably, any implementation of a data structure of substance in C ends up looking like c++-ified C. Want to create your hashtable? Call hashtable_init and get back a hashtable_t. Want to insert into it? Pass your hashtable_t, a key and a value into hashtable_insert. Even worse, you either have the choice of making all of your data void* in the data structure and casting into and out of your interface or making awful macros to define the various methods and structures for the types you would like to use. It's like you're stubbornly writing C++ in C.
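For readers who haven't seen the pattern, here is a hedged sketch of that C-style shape, written so it also compiles as C++. To stay short it is a fixed-capacity table with linear lookup rather than a real hash table, and every name in it (`table_t`, `table_init`, `table_insert`, `table_find`) is hypothetical:

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical C-style API in the spirit of hashtable_init / hashtable_insert.
enum { TABLE_CAPACITY = 16 };

struct table_t {
    const char* keys[TABLE_CAPACITY];
    void*       values[TABLE_CAPACITY];  // type information is erased here
    std::size_t count;
};

void table_init(table_t* t) { t->count = 0; }

// Returns false when the table is full. Keys and values are borrowed, not
// copied, and duplicate keys are not detected in this sketch.
bool table_insert(table_t* t, const char* key, void* value) {
    if (t->count == TABLE_CAPACITY) return false;
    t->keys[t->count] = key;
    t->values[t->count] = value;
    ++t->count;
    return true;
}

// Returns the stored pointer, or nullptr if the key is absent.
void* table_find(const table_t* t, const char* key) {
    for (std::size_t i = 0; i < t->count; ++i) {
        if (std::strcmp(t->keys[i], key) == 0) return t->values[i];
    }
    return nullptr;
}
```

The caller has to convert the result back, e.g. `int* p = static_cast<int*>(table_find(&t, "answer"));`, and nothing checks that an `int*` is what was actually inserted. A `std::map<std::string, int>` moves that check to compile time.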


I'm confused.

C has libraries too. I personally like the headers used by openbsd: http://www.openbsd.org/cgi-bin/cvsweb/src/sys/sys/tree.h?rev...


I'm not sure why I'm getting sucked into this again :-) but it doesn't have standard, portable libraries. It also doesn't have the right abstractions in the language to provide clean and safe (i.e. no void* and no brittle preprocessor magic) implementations of such.

I've written C for most of my life, you can do great things with it. I also have a love/hate relationship with C++ and Python. There is no perfect language, they all have to make tradeoffs.


yes, even standard libs, but they aren't as promoted as the c++ std libs, as most of them are just apis for system interaction; only a minority are useful out-of-the-box algorithms.

and third party libs are also possible, yes.


no, it's not c++ish; it's simply an implementation which is technically similar to c++, because at their core the syntax and semantics are similar. i didn't say that it's the new wheel, or better or worse. it's just free of third-party stuff, and compilable anywhere, dependent only on the compiler used. and i didn't claim that this is exclusive to C; sure, this applies to c++ as well, but not if you want to use feature-complete c++.


You do not need to cast to or from `void *` in C, and it's not idiomatic to do so.


Well, what I posted wouldn't be in a tidy codebase. You'd have something like:

    // foo.h
    typedef std::map<std::string, std::vector<std::string>> StringToVectorString;
    
    // foo.cpp
    // in several places you can just use:
    StringToVectorString foo;


> Who cares whether your data structure definition is 1 line or 5?

Why not use assembler and make it 50?


well the thing is, it also works for std::vector<string>, or std::vector<int>, or anything you want. Last time I checked in C you'd have to either duplicate all your array code for each type or resort to something like void*. For that alone I'd use C++ (yes, you can use it without classes if you want).


Last time I checked in C, all pointer types were the same storage size. There's no need to define the same data structure multiple times by using templates. You just use casts.


... which is fine for creating data structures containing pointers to things created elsewhere, but it doesn't work if you want your data structures to contain more complex data types, without the overhead of another layer of pointer redirections.
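A small sketch of the difference, with a hypothetical `Sample` type. The first function walks elements stored directly in the container's buffer; the second walks a container of `void*` and pays one extra dereference (and one unchecked cast) per element:

```cpp
#include <vector>

struct Sample {      // hypothetical value type
    double x;
    double y;
};

// Elements live inside the vector's own buffer: reading v.x touches the
// container's storage directly.
inline double sum_x_direct(const std::vector<Sample>& items) {
    double total = 0.0;
    for (const Sample& s : items) {
        total += s.x;
    }
    return total;
}

// The void*-style alternative: the container holds only addresses, so each
// element is reached through an extra pointer hop and a cast the compiler
// cannot verify.
inline double sum_x_indirect(const std::vector<void*>& items) {
    double total = 0.0;
    for (void* p : items) {
        total += static_cast<Sample*>(p)->x;
    }
    return total;
}
```

Both return the same sum; the difference is where the `Sample` objects live and who is responsible for keeping them alive.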


Nope. There is no need in C for all pointers to be the same size. An implementation is free to make a char* a different size from an int* (and some indeed exist that do). You can safely cast any data pointer to void*, and a void* back to any data pointer. POSIX requires that function pointers can do this; the C standard doesn't require it.

Now, that said, it is possible to write generic linked list routines (I've done it) but the devil is in the details.


In the past, when I worked on a C++98 project, I would have disagreed with you for one simple reason: template debugging sucks. gdb did not provide good template information (but may have improved in the last few years, I have not checked it recently). g++ had horrific template error messages.

Today, I'm looking at C++11 and using Clang, and I like what I see. Vastly improved compiler error messages, lambdas, very decent type inference, standardized smart pointers, good standard library (and Boost is always an option). I haven't tried heavy-duty debugging yet, but I suspect it still sucks. Overall — I'll probably be using C++ quite a bit more in the future. It is no longer the nasty pile of kludges on top of C which it used to be ten years ago.


I up-voted you because I agree with you. And let me tell you that in the "real world" of systems programming (where stuff gets done), C++ is widely used and loved by engineers. It's not hip and popular to say that, but it's true from what I have seen over the last 15 years.


Does that actually compile in C++11? I haven't used C++ in a while, but you used to need whitespace to separate the two closing angle brackets (i.e. ">>" should be "> >").

Personally, I use C for my embedded work and Python for test applications and data processing. C++ is in the middle and I haven't had a use for it.


A few compilers even before C++11 treated >> in template definitions as expected (e.g. VS2010) even though it wasn't standard, then the C++11 standard officially made it required. So I haven't found the >> thing to be a problem for a while now.


Yes, that got changed/fixed with c++11. So you can now use >> there.


I bet in 99% of cases using such a structure is a nonsense overhead, and in 85% the code relying on it can be written without a single heap allocation.


One man's nonsense overhead is another's labour-saving modern convenience ;)

(I like to write my code without heap allocations, but when you need an array of arrays of strings, indexed by another string, it doesn't half get a bit annoying to code up.)


What is equally powerful about modern C++ (quite possible to make it work on C++98 as well) is the ability to have template methods that can dump out pretty-printed fully customizable (by overloading operator << in conjunction with a stringstream) debugging for data structures.

Yes, template programming is not something I am likely to ever figure out but these things only need to be built once, by somebody other than me.
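For what it's worth, a minimal sketch of the idea, not any particular library's implementation: two `operator<<` templates plus a hypothetical `debug_string` helper that renders through a `std::ostringstream`. It only covers `std::vector` and `std::map`, and a vector of maps would need the map overload declared first:

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Print a vector as [a, b, c]; works for any T that itself supports operator<<.
template <typename T>
std::ostream& operator<<(std::ostream& os, const std::vector<T>& v) {
    os << '[';
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (i != 0) os << ", ";
        os << v[i];
    }
    return os << ']';
}

// Print a map as {key: value, key: value}, in the map's key order.
template <typename K, typename V>
std::ostream& operator<<(std::ostream& os, const std::map<K, V>& m) {
    os << '{';
    bool first = true;
    for (const auto& kv : m) {
        if (!first) os << ", ";
        first = false;
        os << kv.first << ": " << kv.second;
    }
    return os << '}';
}

// Hypothetical helper: render any streamable value into a string for logging.
template <typename T>
std::string debug_string(const T& value) {
    std::ostringstream out;
    out << value;
    return out.str();
}
```

With this in place, the `std::map<std::string, std::vector<std::string>>` from earlier in the thread can be dumped with a single `debug_string(foo)` call.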


But... you could have done it with a linked list!


"I always have it in the back of my head that I want to make a slightly better C. Just to clean up some of the rough edges and fix some of the more egregious problems."

I immediately thought of this: "Go is like a better C, from the guys that didn’t bring you C++" — Ikai Lan [1]

And that's what it feels like to me when I use it.

[1] http://go-lang.cat-v.org/quotes


To nitpick on a single statement:

> C has the fastest development interactivity of any mainstream statically typed language.

What? Despite all its shortcomings, Java in a modern IDE effectively has zero build time. The level of interactivity is as fast as that of interpreted languages. I haven't seen anyone manage this with C yet.


Agree completely about zero build time!

I've been using IntelliJ IDEA for Java development for the last 4 years, and before that I used C and C++ using Emacs for 7 years.

I am way more productive in IntelliJ IDEA than I was before. One reason is the instant feedback on syntax errors when I type the code. I don't need to compile to see them, as I used to in C and C++. Another reason is the navigation support you get in an IDE.

I've written more about the differences in development environment here: http://henrikwarne.com/2012/06/17/programmer-productivity-em...


That's the IDE not the language.

Java, itself, does not highlight your syntax errors. Nor does C.

There are IDEs out there that will do exactly the same thing for you for your C code.


The difficulty of writing static analysis software does depend on the language, though. In Visual Studio, for example, intellisense and autocompletion are vastly superior in C# compared to C++.

As a matter of interest, which C IDEs are you referring to? I'd like to check them out.


Try Xcode 4.5.x (which uses Clang as the default C compiler).

VS2012 has similar capabilities for C++.


VS2012 is actually the IDE I was thinking of when I said VS C++ support is inferior to C#'s. While there is intellisense, it does not display any documentation. The closest thing it offers is variable names for a method's arguments.


In fact Eclipse features a special compiler that builds your code immediately and also allows IDE to highlight your errors better than any static analysis would. So actually it kind of does. Assuming compiler IS the language.


Eclipse+CDT provides some of the same features as well. Very useful :)

http://www.eclipse.org/cdt/


Emacs can do this with Flymake. I would stick with IntelliJ, though.


I have worked on several projects in Java (using Eclipse as an IDE). In my experience there is a 'significant' latency between when I hit run, and my program starts. (My suspicion is this time is spent starting the JVM, not compiling).

Every (non kernel) C project I have worked with has effectively 0 build time after making changes. This is because they all used make which (like Eclipse and likely all IDEs) only recompiles the files you changed. However, there is not the startup time of the VM (which for every non Java VM I have seen is also effectively 0.)

As an aside, does anyone know why Java has an unusually high startup time?


Regarding why Java has a high startup time, here's Wikipedia's take on the matter: https://en.wikipedia.org/wiki/Java_performance#Startup_time


> Every (non kernel) C project I have worked with has effectively 0 build time after making changes.

For trivial changes maybe. Rebuild time will depend on the size/complexity of the project and the number of dependencies on the thing that was changed.

A change to a header file of a core library may cause a rebuild of many parts of the project thanks to dependencies built on header files. We have such a library in our product and modifying a single file can cause ~40% of the product to need to be rebuilt.

Think about changing something like:

     #define VERSION "1.2.3"
which is compiled into every library and binary within the product.


As a tip, I suggest running your code in debug mode, which allows you to rewrite method contents as the program is running. If you place a breakpoint in the method you are working in, then every time you save, the program will re-enter the method. I usually work in this mode.

JRebel is also worth investigating if you are getting paid to work with java code.


C and Java both have terrible productivity loops because they have a tedious compile step. Python you just run after editing. I was more productive in python using a text editor the very first week I learnt it than I ever was in Java or C++ even with IDE support and years of experience. Google App Engine python dev server reloads modified python files without server restart. That feature blew my mind after doing Java EE stuff (although that experience is 8 years old so maybe new Java servers can do this??). Talk about productivity!


Having a REPL is a big productivity boost for Python, not avoiding an explicit compile step. A compile step is utterly trivial to avoid (e.g. "make run" which builds and runs). Likewise, live reloading is possible in C using dlopen().


Yes modern java servers do this (if configured to do so of course).


At the risk of sounding like that guy, isn't this a property of all Smalltalk-descended languages?

And I mean on the basic language level, not "using a library to run a hot swap VM for my language" level like Eclipse and IntelliJ IDEA do.


I'd rather have a Repl than a "modern IDE".

C does not have or need a Repl because it's not designed for giant single programs anyway. Unix is the C Repl.


CERN researchers don't share your opinion:

http://root.cern.ch/drupal/content/cint


How is a unix shell going to help me debug the internals of my C program? You can use gdb from a terminal of course (yet another giant single program, funny how many of those I use every day…) but it's much more of a pain than a real REPL.

I noticed in another comment you said you are just learning C; perhaps you should wait till you have written non-trivial C before you declare what tools one does or does not need when writing it? C makes it very easy to make very nasty mistakes, so personally I take all the help I can get. I guess this means you think I'm an awful developer (according to your recent anti-IDE blog post), but I'd rather have code with fewer bugs than worry about violating some notion of 'purity' or hacker-hardcoreness.


1) Pick a popular language

2) Figure out something controversial to say about it and a justification for that

3) Write it up and post it to reddit, HN, &c

4) Enjoy the hits

EDIT: Which is enabled by this - http://news.ycombinator.com/item?id=5037649


Pretty much this. Instead of reading about how awesome C is, I would much rather read a concrete example of a problem which you were facing and how you used C to solve it. Get to learn something in the process.


Very true, but I do end up learning a lot from the HN discussions. I'm a C guy myself, but I try not to be biased. I always tell new programmers to try and not favor any language, as they all have their uses.


I feel like I'm always trying to make this point to people, and doing so with far less eloquence. The fact that Katz is (a) an experienced and accomplished developer and (b) someone with very non-trivial experience with hot, hip, super-high-level languages should not go unnoticed. He's not some cranky old embedded systems programmer; he's doing thoroughly modern development.

Someone the other day said that the great thing about C is that you can get your head around it. When Java, or C++, or Erlang, or Common Lisp, or Haskell programming becomes unbearable, it's almost always because the chain of abstractions overwhelms the person building the system. And to be honest, it's why I'm a little less enthusiastic than most of the people here about Rust.


> When Java, or C++, or Erlang, or Common Lisp, or Haskell programming becomes unbearable, it's almost always because the chain of abstractions overwhelms the person building the system.

I ran into a bug in an XOrg driver for an old r128 card a few years back. (I think I have that combination right.) As I remember it, I spent several hours tracing through layer upon layer of C macros to even begin to get an idea of what the code was doing. The bug turned out to be part of a structure getting overwritten, clobbering a part of a vertex definition with a color value due to a mix-up in the size of the structure. (I'm forgetting the details on this and I can't remember the bug number offhand or I'd look it up.)

In any event, those layers of macros made it almost impossible to trace what the heck the code was doing without intimate knowledge of every layer involved. I'd say chains of abstraction are more a feature of programming style than of the language itself. Albeit, the language can ease the load a little.


What do you feel is too complex about Rust?

The region system is obviously a source of complexity. But I feel as though sacrificing safety at compile time in the name of simplicity just leads to complexity later—you either pay the complexity tax at compile time, or you transfer the complexity of correct memory management to runtime, resulting in sessions sitting in front of gdb or valgrind.


I can't really disagree with what you're saying. But I take the point of the article to be that what makes development easy or hard is not as related to the feature set of the language as we think.

The common liturgy of language design is that x feature makes development easier, less error prone, safer, etc. Nearly always, the precise way in which x feature does this is as clear as a bell. But somehow, as we multiply the number of features, something truly destructive starts to happen. It's not just diminishing returns . . .

I think about a language like C++. I can see the reasons for every single thing in it. I can understand the arguments for why this or that is a good thing. But it is perilously easy to write very unmaintainable code in this language, and to understand why, I think we have to get beyond the usual debates (static vs. dynamic, functional vs. OO, compiled vs. interpreted, etc). Small vs. large might be one place to start.


Can you give me an example of a problem in Haskell, where the abstractions are so complicated you can't get your head around it, but putting it back in C makes it easy enough to deal with?

A more plausible explanation is that you wouldn't even attempt a similar design in C because it'd be obviously impossible.


An industrial operating system. In C, you know what you've got: its exact dimensions, location, and lifetime. Since you're not relying on a GC, you can make latency more predictable and with much lower upper bounds.

On that topic, I'd rather write crypto code in C too.


So, there are problems that are so complex that they can only be dealt with in Haskell?

That would explain why there are so few Haskell programs that anyone actually uses.


If you tried to use an F1 racecar as your day to day commuter, of course you would complain about it. It would be horrible. Tune the engine every day!? Tire grip changes over TIME? Must watch oil temps so carefully? What a pain in the ass. Also easy to crash.

If you tried to use your soccer-mom minivan to participate in F1 races, of course you would complain. I can't tune the engine? I can't tinker with the oil pressure?!? How do I adjust performance for different tracks?! This thing sucks. Also it's slow.

These kinds of debates about C and other "higher" level languages are growing tiresome. We live in a wonderful world full of different tools for different purposes. Maybe your grandma doesn't want to worry about details under the hood, and she also doesn't need to race F1 cars, so a minivan suits her just fine. Maybe you value being able to tune your code for maximum speed and you don't mind (or even enjoy) the challenges involved in C code. Good for you.

yawn...


I really like your analogy.


Well put sir!


C is the language that doesn't force any preconceived notions about how the world should work onto you. Sure, C strings are NUL-terminated by convention, but even that is something you can almost completely ignore if you want to build up your own parallel stack of software that does it differently.
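As one possible illustration of "doing it differently", here is a sketch of a string type that carries its length instead of relying on a terminating zero byte. The names `LenStr`, `lenstr_from` and `lenstr_equal` are hypothetical:

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical string type that stores its length explicitly, so it can hold
// embedded '\0' bytes that would cut a conventional C string short.
struct LenStr {
    const char* data;
    std::size_t len;
};

// Wrap a buffer of known size; the bytes are borrowed, not copied.
inline LenStr lenstr_from(const char* data, std::size_t len) {
    LenStr s = {data, len};
    return s;
}

// Compare by length and contents; a zero byte is just another character.
inline bool lenstr_equal(LenStr a, LenStr b) {
    return a.len == b.len && std::memcmp(a.data, b.data, a.len) == 0;
}
```

For the five bytes `"ab\0cd"`, `strlen` stops at the embedded zero and reports 2, while a `LenStr` built with length 5 keeps all of them.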

It is for this reason that C (and sometimes C++) are what people use when they have a new idea about higher-level programming abstractions. With C, you can be totally free of other people's big ideas (and their associated costs/complications) and invent something new.

Anyone who wants you to give up C in favor of their language/framework/VM/runtime is selling their own vision for how the world should work. That's fine and sometimes buying in will save you a lot of hassle. But what they're offering was almost certainly written in C or C++. Essentially their pitch is equivalent to "I have written the last C program you'll ever need."

In the early 90s that was Perl. In the late 90s, Java. In the early 2000s, .NET, then Ruby, then fast JavaScript.

This history is an oversimplification of course, but the real question is: would you really prefer a history in which we had, at some point, decided that the current VM that was in vogue should become the new replacement for C? That we'd tailor all our hardware to it, and no future VM would be "native" but would have to run on top of a different VM?

C is the key to why we have general-purpose computers that can run a wide variety of different languages efficiently, and why programs that need to be particularly efficient can always drop down to lower-level programming to get extra performance.


> C is the language that doesn't force any preconceived notions about how the world should work onto you

Some preconceived notions forced upon you by C, off the top of my head:

- problems should be solved by describing a linear sequence of steps (as opposed to logic/declarative programming)

- a variable can have different values at different times in the execution of a program (in contrast to standard mathematical conventions)

- a function can return different results when called multiple times with the same arguments (again, in contrast to the standard mathematical meaning of the term)

- there is random access storage of information (with constant time access and update)
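The second and third points can be shown in a few lines that are valid in both C and C++ (the function names are made up for the example):

```cpp
// A variable may hold different values at different points in time.
inline int reassign() {
    int x = 3;
    x = 4;        // legal: x now denotes 4, not 3
    return x;
}

// A "function" whose result depends on hidden state, so two calls with the
// same (empty) argument list return different values.
inline int next_ticket() {
    static int counter = 0;   // persists across calls
    return ++counter;
}
```

In the mathematical sense neither `x` nor `next_ticket` behaves like a variable or a function: each successive call to `next_ticket()` returns one more than the previous call.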


> problems should be solved by describing a linear sequence of steps (as opposed to logic/declarative programming)

Linear sequences of steps processing data arrayed in some linear fashion or other is what computers do. It is certainly what a single core does.

So you would rather have someone else write a C program for you that abstracts it away and calls it a new way to program? That's ok, just know that you'll still have those pesky linear steps under the hood. You know, at some point the neat source you write actually has to get converted into stuff the CPU can make the slightest bit of sense out of.

Of course it's nice to try to "get the computer to think like the programmer instead of the other way around" (kind of the motivation for the inventor of the compiler IIRC, in times when people really had to "speak binary" to the computer), but it also kinda sucks when people completely lose track of the machine they're programming, and think their abstraction layers du jour grow on trees or something.

> a variable can have different values at different times in the execution of a program (in contrast to standard mathematical conventions)

Your problem is nomenclature? Why not simply declare constant variables where you need them and move on? And are you really trying to twist C giving you the choice into "forcing preconceived notions" on programmers? That is hilarious, I will give you that much.

> a function can return different results when called multiple times with the same arguments (again, in contrast to the standard mathematical meaning of the term)

"Function" can also mean a lot of people coming together for a wedding or something. I think that's why you would call them "C function" if you wanted to be precise.

> there is random access storage of information (with constant time access and update)

Again, that's just how "the world" works. But C hardly came up with that, which is why all languages have it.. yes, all of them. Some abstract it away from you, sure, but they still have it. So you pay overhead and control "tax" in return for convenience and expressiveness; that's not a bad thing per se, often it's the sane and productive choice, but to me dissing C is just shitting where you eat. The only reasons I can imagine for it are jealousy or ignorance.


> Linear sequences of steps processing data arrayed in some linear fashion or others is what computers do

No, it's what some computers do. It's almost certainly not what the computer you are using currently does. It's quite possible it's not what any computer you've ever used does.

> So you would rather ...

I'm not expressing a preference. If you think I was, you've misunderstood my comment. I'm making observations, not judgements. I'm not anti-C, it's a fine language for many things; however I do think that the article's author is not being as even-handed as he claims when it comes to some of the shortcomings of C.

> Your problem is nomenclature?

Again, I'm not making a judgement. C exists in a paradigm where writing

    x = 3
    x = 4
makes sense. In other paradigms it would be logically inconsistent.

>>there is random access storage of information (with constant time access and update)

>Again, that's just how "the world" works.

That's absolutely not how "the world" works. Again, you've probably never even used a computer where it was true.

> dissing C is just shitting where you eat. The only reasons I can imagine for it are jealousy or ignorance.

I don't think I've made any statement anywhere in this discussion that's "dissing C".


How does this computer right here not consist of a whole lot of linear sequenceS? (Notice the plural, btw? And the fact that "linear" says nothing about "serial" vs "parallel", either?) How is data not arrayed linearly? Please elaborate, "because I said so" is not enough.


> Some preconceived notions forced upon you by C

Nope, not forced on you, because plenty of systems implemented in C define models in which none of what you mentioned is true. For example, the Haskell runtime is implemented in C. That's the whole point of what I'm saying. A stack with C at the bottom allows Haskell to exist with good performance even though the hardware/OS were not designed with Haskell in mind. The reverse is not true.

Why do you suppose that no serious VMs or language runtimes are written in Haskell (or other functional languages)?


> A stack with C at the bottom

You realize C is not at the bottom of the stack, right?

> Why do you suppose that no serious VMs or language runtimes are written in Haskell (or other functional languages)?

Because Lisp Machines didn't win.

> plenty of systems implemented in C define models in which none of what you mentioned is true

True, but that's the whole point of abstraction. C, in turn, gets rid of some of the "preconceived notions" of the layers under it.


> Because Lisp Machines didn't win.

And if Lisp Machines had won, do you really think they'd run JavaScript, Erlang, C, and Haskell (as a rough sample) as fast as our Von Neumann machines do today?


C is built to run on a MASOS (multiple address space operating system) like UNIX so it wouldn't run as fast on a system like the Lisp machines that uses a single address space. Many of the things C deals with like the file system and interprocess communication would be obsolete in a system with single address space orthogonal persistence. Running C on a Lisp machine will be possible but at a performance cost.

JavaScript would run faster than it does on Von Neumann machines because it would use the hardware support for dynamic languages, and since JavaScript runs in a browser independently of the OS, the new machine architecture and OS wouldn't be a problem. Haskell, insofar as it's purely functional, would probably run the same as it does now.


> C is built to run on a MASOS (multiple address space operating system)

What? You can run C on machines without a MMU, without an OS, or even to implement an OS.

> Many of the things C deals with like the file system and interprocess communication would be obsolete in a system with single address space orthogonal persistence.

What if people want a filesystem? They can't have one because the machine designer liked "single address space orthogonal persistence" better?

> JavaScript would run faster then it does on Von Nuemman machines because it would use the hardware support for dynamic languages > Haskell, in so far as its purely functional, would probably run the same as it does now.

I highly doubt these claims. Just because a language is "dynamic" doesn't mean it can take advantage of generic "hardware support for dynamic languages" to be as fast as highly sophisticated and language-specific VMs like V8 or IonMonkey. Just because a language is "functional" doesn't mean you can implement a thin translation layer that maps one onto the other with minimal overhead. This is fuzzy thinking that ignores the true complexities of high-performance language implementations.


> What?

The principal application of C was to build the UNIX MASOS. C gained popularity on UNIX systems because they already had support for the language. The vast majority of C programs use UNIX features that will be obsolete on a SASOS, so running these C programs will come at a performance cost.

> You can run C on machines without a MMU, without an OS, or even to implement an OS.

In my previous post I was referring to C programs which include C standard library files such as stdio.h, which will be obsolete because it uses the file system, and signal.h, which is used for IPC. If a C program uses any header files at all it is obsolete because it should be using functions rather than files as its unit of composition.

> What if people want a filesystem?

As persistent data (data that is independent of the lifetime of the program) is always mapped to a single global virtual address space in SASOS, there will be no need to have a file system. A file system interface can be provided, but it will be just another way of accessing virtual memory.

> They can't have one because the machine designer liked "single address space orthogonal persistence" better?

Although eliminating file systems will be enormously advantageous to developers, it will also be equally advantageous to users. Whenever a user of a UNIX system changes an object that user has to save those changes, usually by clicking File/Save in a drop down menubar. An orthogonally persistent system will be more accessible to ordinary users because they won't have to handle the details of saving data to the file system.

The user interface expert Jef Raskin set out to design a user interface based on the needs of the user rather than the needs of software, hardware, or marketing. The product of this search was Archy, which is an orthogonally persistent system.

The hierarchical nature of modern file systems also isn't ideal from a user interface design perspective. Forcing users to fit their data into a hierarchy is problematic. Using tagging and search will arguably make for an easier-to-use interface than a file system.

> Just because a language is "dynamic" doesn't mean it can take advantage of generic "hardware support for dynamic languages" to be as fast as highly sophisticated and language-specific VMs like V8 or IonMonkey.

The V8 virtual machine is implemented with a high degree of sophistication so that it can run large JavaScript programs. There will be no need to waste memory storing such a sophisticated VM in a Lisp machine because all large programs will be written in Lisp instead. JavaScript will only be used for small scripts in web pages, ensuring that users won't fall into the JavaScript trap. In this sense, JavaScript may run somewhat slower without a sophisticated VM, but that performance difference will be irrelevant because the language will only be used for small scripts.

http://www.gnu.org/philosophy/javascript-trap.html


They would because Lisp Machines were not some magical functional machines but pretty much standard CPU designs of their era with additional hardware support for object memory (tagged pointers, GC...) and relatively large word size.


Modern computer operating systems are systems of bloat. The amount of duplication between programming languages like C, C++, Objective C, C#, Java, JavaScript, Perl, Python, and Ruby, user programs, and applications is immense. To have a streamlined system that uses a single language all the way down to the machine itself, and that makes full use of the sharing capabilities of a single address space, would be truly magical!


C is the language that doesn't force any preconceived notions about how the world should work onto you.

... except for a type system based on the memory model of a machine which strongly resembles the PDP-11, a compilation strategy built for machines with less RAM than my car fob, and optimization possibilities limited by aliasing, plus everything kscaldef mentioned.


> except for a type system based on the memory model of a machine which strongly resembles the PDP-11

I'm interested in hearing about alternative memory models that are an improvement; every one I hear about is specific to some higher-level language or programming paradigm. Baking high-level language concepts into hardware is basically asking for stagnation.

> a compilation strategy built for machines with less RAM than my car fob

C is adapting (see: http://www.infoq.com/news/2012/11/llvm-modules)

> optimization possibilities limited by aliasing

"restrict" exists, and while it's not perfect, I have not seen anything that is a Pareto improvement over it. In other words, every alternative that avoids aliasing problems (that I have seen) makes other trade-offs that make it a worse choice overall (including FORTRAN, which I am sure you will mention).

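For readers who haven't used it, a minimal sketch of what "restrict" looks like in practice. The function name `add_into` is invented for this example. The qualifier is a promise made by the caller that the two arrays do not overlap; the compiler may rely on it (for instance to vectorize the loop) but does not verify it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: adds src into dst element by element.
 * restrict tells the compiler that dst and src never alias, so it
 * does not have to reload src[i] after each store to dst[i]. */
static void add_into(float *restrict dst, const float *restrict src, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++) {
        dst[i] += src[i];
    }
}
```

Calling it with overlapping arrays breaks the promise and is undefined behavior, which is the "not perfect" part.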

> I'm interested in hearing about alternative memory models that are an improvement;

Intel 80286 introduced segmented protected mode, where you had the option of placing each of the program's objects (struct, array, string, even a single variable, if necessary!) into its own segment described by privilege access level (0-3, RO or RW) and length. So you had hardware-based array length checking. Access an out-of-bounds index, BOOM - segfault!

This is actually not incompatible with C's memory model (it requires that only individual objects be stored in contiguous memory blocks), but is incompatible with how most C programs were written at the time so it never caught on.

Also, you had to distinguish between "near" and "far" pointers, there were few segment registers, etc. But had Intel developed segmentation further, and had C programmers adapted, we could have ended up in the world where you could run programs with performance of C and safety of Java.


Oh, spare me, high-level language that can't even do arithmetic properly. It's anything but "damn successful as an abstraction over the underlying machine".

Case in point: Whenever you're doing signed arithmetic, and it overflows, you're in the land of undefined behaviour. (See here for an example of how this can bite you: http://thiemonagel.de/2010/01/signed-integer-overflow/)
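
A minimal sketch of the usual workaround: do the range check before the addition, so the overflow never actually happens. The helper name `checked_add` is made up for this example; it is not taken from the linked post or from any library.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: stores a + b in *out and returns true, or
 * returns false (leaving *out untouched) if the sum would not fit
 * in an int. The comparison is done before adding, so no signed
 * overflow is ever evaluated. */
static bool checked_add(int a, int b, int *out)
{
    assert(out != NULL);
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return false;
    }
    *out = a + b;
    return true;
}
```

A check written after the fact, such as `if (a + b < a)`, depends on the overflow having already happened, which is exactly the case the compiler is allowed to assume never occurs.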

Another case in point: type-punning through pointer casting is also UB. [Going through a union is legal though.]
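
To illustrate the bracketed note, here is a hedged sketch of two ways to read a float's bits without casting pointers. It assumes `float` is 32 bits wide (common, but not guaranteed by the standard), and the function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copies the bytes of f into an integer. memcpy is allowed to copy
 * the representation of any object, so no aliasing rule is broken. */
static uint32_t float_bits(float f)
{
    uint32_t u;
    assert(sizeof u == sizeof f);   /* assumption: 32-bit float */
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Same result through a union: write one member, read the other. */
static uint32_t float_bits_union(float f)
{
    union { float f; uint32_t u; } pun;
    pun.f = f;
    return pun.u;
}
```

By contrast, `*(uint32_t *)&f` reads a float object through an incompatible pointer type, which is the case the comment calls UB.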

I'm pretty sure the author hasn't written a line of ANSI-C compliant code in his life, otherwise he'd never write something like this.


The author doesn't claim it's perfect or that there's no way to improve on it, but I think he has a very salient point, which is that most of the OO buzzwords and programming fads that come and go in "higher-level languages" simply end up making a bigger mess of things as a project grows. Basically, the author loves C because it is simple, straightforward, and restrictive -- it forces you to write [relatively] simple, straightforward code too, instead of concocting a terrible Frankenstein of custom classes and types intertangled into a grotesque, intractable mass of dependencies and subdependencies. If something is in C, you know it is going to be built from the basics, and in many cases, this simplicity is a life saver as a project matures.

There are definitely annoyances and issues, but they are known and can be taken into account with much less hassle than attempting to grok a Java project that requires you to traverse into the basest-level of classes like BusinessObject2013SingletonDispatcherFactoryFactory every time something needs to be debugged or fixed.

It doesn't mean you should use C for your Web 2.0 startup, but his point is well taken. I heard someone 'round these parts once acknowledge that the modern equivalent of spaghetti code (i.e., code that intractably descends through hundreds of code paths with gotos, etc.) is OO hell, i.e., code with huge dependency stacks and equally intractable and unjustifiable inheritance models, where you have to descend into all kinds of classes and special cases to make a meaningful change.

>I'm pretty sure the author hasn't written a line of ANSI-C compliant code in his life, otherwise he'd never write something like this.

Damien Katz is actually fairly accomplished, and it definitely sounds like he's written multiple lines of C to me.


> OO buzzwords and programming fads that come and go in "higher-level languages" simply end up making a bigger mess of things as a project grows

No, it's not the language features which make a mess. It's people lacking judgement and common sense. (Like trying to apply patterns everywhere. I've been a domain-specific [crypto] consultant on such a project and watched it smash the schedule by more than 2x. It wasn't the language [Java], it was stupid people.)

> sounds like he's written multiple lines of C to me

I don't dispute that he's written a lot of C. I DO, however, dispute that he's written any ANSI C. If you're writing ANSI C, you have to account for AT LEAST all of the following behaviors:

https://www.securecoding.cert.org/confluence/display/seccode... https://www.securecoding.cert.org/confluence/display/seccode...

For example, if f is some function, then in the sequence

    int x = 0;
    f(x++, x++);
the call to f is undefined behavior because x is modified twice without an intervening sequence point. Anybody claiming that such language is "high-level" is a moron, regardless of their "accomplishments".
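
For contrast, a small sketch of the well-defined way to say the same thing: give each increment its own statement, so every modification of x is separated by a sequence point. The body of `f` here is a made-up stand-in, since the comment leaves it unspecified.

```c
#include <assert.h>

/* Hypothetical stand-in for the f in the comment above. */
static int f(int a, int b)
{
    return a * 10 + b;
}

/* Each x++ is a full expression of its own, so the order of the
 * side effects is fixed by the language rather than left undefined. */
static int call_in_order(void)
{
    int x = 0;
    int first = x++;    /* first == 0, x == 1 */
    int second = x++;   /* second == 1, x == 2 */
    assert(x == 2);
    return f(first, second);
}
```

This costs two extra lines, but the result no longer depends on what a particular compiler chooses to do.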


In Scheme, argument evaluation order is not specified. So, (pretending that set! returns the assigned value), (f (set! x (+ 1 x)) (set! x (+ 1 x)) ) is undefined and unpredictable as well. I don't think anyone would claim Scheme isn't high level.


"Undefined behavior" in C is much more insidious than merely "unspecified" or "unpredictable". All those security holes and exploits resulting from buffer overflows, stack overwrites, heap spraying, etc. are manifestations of UB made possible by a badly written program. (Or by the interaction between an invalid program and an optimistic compiler, as in the case of assuming strict pointer aliasing.)


I can relate to what he's saying. I've also been falling back to C pretty often over the years and finding it pleasant when I arrive there. LuaJIT+C is my latest experimental compromise.

My thoughts in more depth at http://blog.lukego.com/blog/2012/09/25/lukes-highly-opiniona...


I take significant issue with this paragraph:

    When you write something to be fast in C, you know why 
    it's fast, and it doesn't degrade significantly with
    different compilers or environments the way different 
    VMs will, the way GC settings can radically affect 
    performance and pauses, or the way interaction of one 
    piece of code in an application will totally change
    the garbage collection profile for the rest.
The implementation of malloc & free that your C implementation uses significantly impacts performance (both time and space) just like the choice of GC implementation in other languages. It is not uncommon for memory fragmentation to be a serious problem for large or long-running C programs, and at the point that you have to start worrying about that and working around it, you've broken the abstraction in just the same way that the author complains about for Java, Erlang, Haskell, etc.
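
As an illustration of the kind of workaround meant here, a minimal sketch of a bump ("arena") allocator, one common way long-running C programs sidestep fragmentation for short-lived data. All names are hypothetical, and the fixed 16-byte alignment is an assumption that suits typical 64-bit platforms rather than a portable guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum { ARENA_ALIGN = 16 };  /* assumed sufficient alignment */

typedef struct {
    unsigned char *base;  /* one big block obtained from malloc */
    size_t cap;           /* size of that block in bytes */
    size_t used;          /* bytes handed out so far */
} arena;

/* Returns 1 on success, 0 if the backing allocation failed. */
static int arena_init(arena *a, size_t cap)
{
    assert(a != NULL);
    a->base = malloc(cap);
    a->cap = a->base ? cap : 0;
    a->used = 0;
    return a->base != NULL;
}

/* Hands out n bytes by bumping an offset; NULL when the arena is full. */
static void *arena_alloc(arena *a, size_t n)
{
    size_t start = (a->used + (ARENA_ALIGN - 1)) & ~(size_t)(ARENA_ALIGN - 1);
    if (start > a->cap || n > a->cap - start) {
        return NULL;
    }
    a->used = start + n;
    return a->base + start;
}

/* Releases every allocation at once without touching malloc/free. */
static void arena_reset(arena *a)
{
    a->used = 0;
}

static void arena_free(arena *a)
{
    free(a->base);
    a->base = NULL;
    a->cap = 0;
    a->used = 0;
}
```

Once a program needs something like this, it is tuning its memory manager by hand, which is the same broken abstraction the quoted paragraph holds against garbage-collected runtimes.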

Also, as others have pointed out, performance can also depend significantly on how well C's machine model maps to the actual machine architecture you're running on.


It's often worked out well to write the 10% most CPU-bound parts of large Java or PHP applications in C. C definitely has its place.

I wonder if this gentleman has tried Go? If he is a big C fan, Go might be a great choice for the other 90%. Not only is Go a safer version of C with lots of great modern features, it integrates very well with C via cgo.

If memory usage and speed are your preeminent concerns, C is certainly a force to be reckoned with: http://benchmarksgame.alioth.debian.org/u64q/which-programs-...


Closures? HA!

If you are working in C and think you might like closures and a different concurrency model than pthreads, then you should look into clang+libBlocksRuntime+libdispatch. That gives you the blocks(closures) known to OS X and iOS programmers and a pretty spiffy queue based concurrency model. I'm never going back to pthreads. You can't make me.

Slightly sadly, I haven't seen a distribution shipping a modern version of libdispatch. There is one in github, but it may or may not have a problem in its read/write support. I use the old version that comes in my distribution.


The author's next love affair will be Go, and he won't be back. I see it as a very real successor to C. I have written about 6k lines of Go (on a project that I had previously written in C) and I'm deliriously happy with it (partly because it's just more fun to write than C). Granted, you can't write a dylib or kernel, but it sounds like for the author's case it would be a good fit.


I am not sure about this. Every analysis of performance when it comes to Go shows it lagging behind even Java. It is simply not mature enough.


When you're an expert in C, you tend to forget how much you've really learned. And not just about the language itself, or the available libraries, but about your own coding style. As an experienced programmer, you have habits and intuition that make developing and debugging your code vastly easier than for a novice. The reason that C has given so much ground to a language like Java is not that Java is intrinsically more powerful, but that novice and intermediate programmers can be much more productive with it from the start. Companies don't want to hire a 10-year C veteran at $125/hr to code up their CRUD app, even if one were readily available. They want to be able to hire interchangeable $50/hr Java guys with a bit of experience, who won't have to waste days at a time trying to track down intermittent memory problems or re-writing the wheel because they aren't aware someone wrote a C lib for something Java has built-in.


Actually they will take a $125/hr Java guy and still be ahead in terms of time/cost.


> Faster Build-Run-Debug Cycles

Interactive languages blow anything with a Build-Run cycle away. "Faster" doesn't matter.

Having a Build cycle is not one of C's advantages. It is a trade off (and a good one, if you care about speed of execution vs speed of development).


I've recently been doing a little bit of C hacking on a small app with a web-based interface. I've missed the higher-level collections and I/O facilities, but some parts of the experience have been really refreshing. There are mature tools, the software builds in no time at all, and it's been easy to make it fast to both run and start up.

Dropping down to C has taken an attitude adjustment, and a willingness to drop certain kinds of features, but it's really been quite refreshing.


> I've had intense and torrid love affairs with Java, C++, and Erlang.

Oh, but this time it's different, this time it's the real thing?


"But amazingly it's proven much more predictable when we'll hit issues and how to debug and fix them. In the long run, it's more productive."

That's nothing to do with C. It's because Damien worked on CouchDB first, and learned a lot from that. If I were to go and rewrite our current system in Java or Scala, I'd also be able to predict where performance, and other, issues would crop up, simply because I've already encountered a number of them before.


"It wasn't, it was a race condition bug in core Erlang. We only found the problem via code inspection of Erlang. This is a fundamental problem in any language that abstracts away too much of the computer."

Another way to look at this is that all languages which offer poor abstractions force you to write more code, by definition. Up front you will have to write much more code, but when the inevitable bug comes up, you will have the luxury of debugging code you authored. For some individuals and teams, this trade-off is a sensible one.


Both Jeff Atwood[1] and Patrick Wyatt[2] have written posts pointing out that the bug is more likely to be in your code than in the platform.

There are instances where writing it yourself is the better option, but more often than not you will want to offload that burden onto a library that deals with the details so you can focus on doing what's important to your system/business.

[1] http://www.codinghorror.com/blog/2008/03/the-first-rule-of-p... [2] http://www.codeofhonor.com/blog/whose-bug-is-this-anyway


To folks who think C code is in general a mess of pointers, macros, and gotos: go download the Postgres codebase, build cscope, open vim, and try browsing the code.

Arguably one of the most beautiful C codebases you can get hold of.


I just did. I agree with you.


C is a fine language. So is Java, and so are Python, Perl, Javascript, as well as Lisp and Scheme.

Saying C is always effective without further qualification is like saying that high speed racing cars are always effective. You need vans, trucks, bulldozers, etc, etc.

Then there is the question of what is really meant by "effectiveness". Is it code execution speed, maintainability, collaboration, compile-execute cycle, extensibility, etc?

As always, it depends on what you want to do. In many cases C is the most logical choice. But in other cases Java or other languages get the job done more "effectively".

Anybody who is limiting him/herself to a single language to "rule them all" is certainly not effective.

Edit: Usual spelling mistakes.


Thank you for writing that. It makes me feel like I'm not completely insane because there are other people who think the same. If you know what you are doing - C is your best option. The problem is that there are thousands of people these days who do not know what they are doing and basically they do not understand how computers and operating systems work. They need very high levels of abstraction which is costly and usually introduces a whole bunch of new problems. Add all those weird frameworks/libraries on top of that and you end up paying 10x more for hardware and get 10x slower product. Though, it's really hard to find a decent and productive C programmer today, so people settle for what's easily available. Another problem is that those "cheaper resources" are defining what programming is these days. I think there is a huge difference between programming and scripting... Call me Dino. BTW, loved the banana example :)


"With C++ you still have to know everything you knew in C, plus a bunch of other ridiculous shit."

So true. C++ is designed by someone who is moderately smart but not a great engineer. There are too many details without much benefit, and you are encouraged to hide things in obfuscated ways rather than doing straightforward programming.


This should have been an article about having made the wrong language choice for the project at hand. As an attempt at an objective evaluation of C, it seems very confused in several places:

Faster Build-Run-Debug Cycles [...] C has the fastest development interactivity of any mainstream statically typed language.

I could understand someone praising Go for this, because the language designers specifically targeted fast builds for example by making the syntax easy to parse efficiently, hence achieving good build times while keeping the language relatively high level, but C simply does little for the programmer and hence the builds are faster than in Java or Haskell. You could maybe admire the decades of work put into C compilers, but not C the language itself for it.

Also, if you criticize high-level languages for being far removed from the computer it would be fair to point out that wider understood "development interactivity" of C compared to something like Smalltalk is rather bad. Consider also that this removal from the computer could be as much an argument against our current computer architecture as it is an argument against high-level programming languages. That we have invested decades into a technology and can't easily switch to a different one doesn't mean we should stay uncritical about its weaknesses.

Ubiquitous Debuggers and Useful Crash Dumps

I suspect this is a sign of disappointment with Erlang, where debugging seems to be a complete mess. The last time I was doing some development in it, I had to fire up some really weird-looking debugger and run the program under it just to get something resembling a stack trace with the filename and line number for the error I got in the program. But you do have lots of debuggers and crash dumps in the JVM, for example, and I don't think you all that often need to examine the level below the JVM when debugging JVM-based programs. It's somewhat funny to praise C for being convenient in interactions with other C code.

Contrast this to OO languages where codebases tend to evolve massive interdependent interfaces of complex types, where the arguments and return types are more complex types and the complexity is fractal, each type is a class defined in terms of methods with arguments and return types or more complex return types.

I don't know what this is even supposed to mean. From what I know most large scale C programs end up desperately trying to emulate some of the features of higher-level languages to create more strict module boundaries and prevent implementation details from breaking the library interfaces.


I re-read the article as a satire on C and thoroughly enjoyed it


I can't agree that C is as high level as C++. C++ templates and many features in C++11 are way higher level abstractions than C.


Templates and namespaces are reason enough to use C++; add in STL and you have a massive winner.


If your C code is portable then you've chosen the wrong language. C is best when assembly would have been appropriate but you want to write something in a more maintainable, higher-level language. C shines when you need to optimize for a specific operating platform.


My humble opinion: Scripting languages like python do have merits.

As a C++ programmer of 20+ years, I have sadly found that there seem to be no scenarios where C++ plays very well. Compared to C, C++ projects are much harder to maintain. Whenever C++ could be used, C is always a better option.


Well, I think people forget history a bit sometimes. For 20 years or so the most common exploit in any system was a buffer overflow, often in string manipulation. I think we're finally getting past that, but C's lack of abstraction over string pointers was a major, major problem and even today if I was writing a secure system I would choose C++ if only to use the std::string class.


This is not exactly the C language problem, but the design flaw of the C standard library. Also, there are folks out there that still use unprotected buffers in C++ instead of safe types.
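
A minimal sketch of what a bounded copy can look like in plain C, using only `snprintf` from the standard library. The helper name `copy_string` and its return convention are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: copies src into dst, which has room for
 * dst_size bytes, and always NUL-terminates the result. Returns 1 if
 * the whole string fit, 0 if it was truncated or dst_size is 0. */
static int copy_string(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL);
    if (dst_size == 0) {
        return 0;
    }
    /* snprintf never writes more than dst_size bytes and reports the
     * length the full output would have needed. */
    int n = snprintf(dst, dst_size, "%s", src);
    return n >= 0 && (size_t)n < dst_size;
}
```

Unlike `strcpy`, the caller has to pass the buffer size, and truncation is reported instead of silently running past the end of the buffer.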


I believe it depends much more on how the code is written, rather than on C vs C++. Believe me, I am able to write code as obfuscated and impossible to maintain in C as in C++ :) (Hopefully) my code in C++ is easy to maintain because of clear, concise architecture, simple interfaces, good naming and the use of complicated things such as template metaprogramming only where it is necessary. I would prefer to maintain a C++ codebase written with a concise architecture and the KISS principle in mind than a C codebase written by unqualified programmers without a clear understanding of the task. Also, if you have 20+ years of experience, then you started long before C++ became a stable, developed and seasoned tool. I think a good enough state of C++ across platforms was finally achieved in the mid-2000s at the earliest, although it is still in process with the new standard. Early C++ code was a mess (even Boost still contains tons of ugly workarounds for the early compilers).


"C is a weak, statically typed language"

Wouldn't that imply that a variable of one type could be coerced into another type?

I would say C is strongly typed...


    int i;
    float f = *(float*)&i;
I'm not saying you should do this, but you can.

Or to use an example the author didn't regret:

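    #include <stdio.h>

    struct foo {
      int a;
      int b;
      int c;
    } __attribute__((packed));  /* GCC/Clang extension */

    void print(struct foo *f) {
      int *p = (int *)f;  /* view the struct as an array of ints */
      size_t i;
      for (i = 0; i < sizeof(struct foo) / sizeof(int); i++) {
        printf("%d,", p[i]);
      }
    }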


Here's one I used in a GC implementation a while back. That last 'uint8_t obj[]' is used to hold the object that was actually allocated.

   struct meta_obj {
       meta_obj_type *next; // next object in our list
       mark_type mark;
       size_t size;
       gc_type_def type_def;
       uint8_t obj[]; // contained object
   };
Or another (contrived) example:

   struct obj_type {
       obj_type_enum type;
   };

   struct string_obj_type {
       obj_type_enum type;
       char *c;
   };
You start with a collection of obj_type pointers and cast them to the appropriate pointer type when you have identified the actual contained struct. Useful if you need to have a heterogeneous list of things.
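
A hedged sketch of how that cast can look end to end. It differs slightly from the structs above: instead of repeating the `type` field, each concrete struct embeds the common header as its first member, which is one way to keep the pointer conversion well-defined. The enum values, `int_obj_type`, and `obj_as_string` are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { OBJ_INT, OBJ_STRING } obj_type_enum;

/* Common header shared by every object. */
struct obj_type {
    obj_type_enum type;
};

struct int_obj_type {
    struct obj_type base;   /* must stay the first member */
    int value;
};

struct string_obj_type {
    struct obj_type base;   /* must stay the first member */
    const char *c;
};

/* Returns the string payload, or NULL if o is not a string object.
 * The cast is valid because a pointer to a struct and a pointer to
 * its first member can be converted to each other. */
static const char *obj_as_string(const struct obj_type *o)
{
    assert(o != NULL);
    if (o->type != OBJ_STRING) {
        return NULL;
    }
    return ((const struct string_obj_type *)o)->c;
}
```

A heterogeneous list is then just an array (or linked list) of `struct obj_type *`, with the tag checked before every downcast.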


Is this standard? I mean the unknown size field 'obj'.


It was added to the C99 standard.
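
For anyone who hasn't seen one used, a minimal sketch of allocating a struct with a flexible array member. The `blob` type and `blob_new` are hypothetical names, not taken from the GC code above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct blob {
    size_t size;            /* number of payload bytes */
    unsigned char data[];   /* flexible array member: must come last */
};

/* Allocates header and payload in one block and copies n bytes in.
 * Returns NULL on allocation failure or if the size would overflow. */
static struct blob *blob_new(const void *src, size_t n)
{
    assert(src != NULL || n == 0);
    if (n > SIZE_MAX - sizeof(struct blob)) {
        return NULL;
    }
    struct blob *b = malloc(sizeof *b + n);
    if (b == NULL) {
        return NULL;
    }
    b->size = n;
    if (n > 0) {
        memcpy(b->data, src, n);
    }
    return b;
}
```

`sizeof(struct blob)` does not count the array, so the payload length is added by hand, and a single `free(b)` releases both header and payload.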


C is weakly-typed. You can cast anything to anything else and the compiler won't stop you.


That's right! Completely forgot about the cast.


The worst thing is that you can address a memory location and interpret it as another type (aliasing).


> Wouldn't that imply that a variable of one type could be coerced into another type?

Yup! It's quite common to cast things from a void * to a more specific one. You do it every time you call malloc.


"bolted-on support for concurrency"--just what are your options when it comes to concurrency and C?


If you agree with Hans Boehm, "Threads Cannot be Implemented as a Library": http://www.hpl.hp.com/techreports/2004/HPL-2004-209.pdf


The C11 standard added concurrency primitives to the language.
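
A small, hedged taste of what that looks like. C11's `<threads.h>` is optional and not every libc ships it, so this sketch sticks to `<stdatomic.h>` and leaves thread creation out; `hits` and `record_hit` are made-up names.

```c
#include <assert.h>
#include <stdatomic.h>

/* Shared counter. Static atomic objects start out zero-initialized. */
static atomic_int hits;

/* Atomically increments the counter and returns the new value. The
 * read-modify-write is a single indivisible operation, so calling
 * this from several threads at once would not lose updates. */
static int record_hit(void)
{
    int previous = atomic_fetch_add(&hits, 1);
    assert(previous >= 0);   /* sanity check: counter has not wrapped */
    return previous + 1;
}
```

With a plain `int` and `hits++`, concurrent callers would be a data race; the atomic type is what makes the increment safe without a lock.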


Interesting. Do you know what the tool support is like? What compilers implement it, if debuggers support it, etc.


Tool support is pretty much non-existent. Expect it to have the same amount of tooling and support as pthreads.


So, with your kind of experience, how often did you prove Greenspun's tenth rule in practice? =)


I'm a language design buff, so you might be able to guess my biases: FP is good, OO is bad, every language should have higher-order functions, yadda yadda.

I finally decided to get a deeper knowledge of C, something I've been saying I "should do" for years. I'm learning it via Zed Shaw's Learn C The Hard Way and I'm very impressed by the language. It does what it does very well. I wouldn't use it for a complex web app, but it's a fine language for things that are small but in which quality (which includes performance) is extremely important.

I'm growing to like C a lot, and I think people who don't bother to learn it are short-changing themselves in a major way. That said, there are a lot of great languages out there that are better-suited to high-level projects... so long as the concerns remain high-level.


Interesting, I would have considered deep knowledge of C a prerequisite to being a language design buff. Did you have experience implementing languages?


No, most of my experience was with high-level languages like Haskell, Python, and Clojure, with my original interest being more in design than implementation. One of the reasons I'm learning C is to have a better grasp of implementation, and eventually (if needed) be able to competently implement languages.

For most of my time, I focused more on user experience (syntax, type system, workflow) and its effects on development culture. Watching language choices make or break companies made me very opinionated. That said, as I get older, I'm finding it harder to call specific languages "good" or "bad". They all have their niches. Enterprise Java is horrible, but that's not James Gosling's fault.


I'd be interested to hear what you think of D.


I don't know much about it.

What's its niche? C's major win is the ability to explicitly manage (and, thereby, reason about) memory. Garbage collection is great but I tend to associate it with high-level languages like Python, Haskell and Clojure.

I'm sure there is a niche for GC'd mid-level languages (e.g. Go) but I'm not yet experienced with them enough to know what it is.


What do you think about Factor?


Stack languages look really cool to me. I have no idea whether they're practical.

For language beauty, the ones with simple syntax (e.g. Lisps, Forth, q) win. However, for a large production system, I'd be more inclined to use something with static typing.

Static typing falls down in a different way with large systems-- compile speed-- but if the thing's not being changed on a regular basis, that's not a huge issue.


Regarding the performance of C, it is interesting to note that although hardly anyone does it, Java, for example, can still be very very low-level if you want.

You can still allocate a gigantic primitive array (say, of Java 32-bit integers) and work on it directly, including bit-trickery, and that is amazingly fast.

When people want the utmost speed in Java, they avoid objects as much as they can to keep GC activity down.

For example, that is how the people behind the amazing LMAX disruptor "pattern" (hardly a "pattern" in the OO sense) manage, on a single thread, to process about 12 million events... per second. In Java.

"Thankfully" Java still lets you create a gigantic primitive array and work directly inside that array.
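
A minimal sketch of that style, for illustration only: this is not the LMAX disruptor's actual code, and the class name PrimitiveRing and its methods are made up here. It assumes a single thread and a power-of-two capacity, keeps every event in one preallocated long[], and uses a bit mask instead of a modulo to wrap the index, so the hot path allocates nothing.

```java
// Hypothetical sketch: a fixed-size ring buffer backed by one primitive long[].
// Not the LMAX disruptor implementation; single-threaded use only.
final class PrimitiveRing {
    private final long[] slots; // allocated once, never resized
    private final int mask;     // capacity - 1; valid because capacity is a power of two
    private long head = 0;      // sequence number of the next slot to read
    private long tail = 0;      // sequence number of the next slot to write

    PrimitiveRing(int capacityPowerOfTwo) {
        if (capacityPowerOfTwo <= 0 || Integer.bitCount(capacityPowerOfTwo) != 1) {
            throw new IllegalArgumentException("capacity must be a positive power of two");
        }
        slots = new long[capacityPowerOfTwo];
        mask = capacityPowerOfTwo - 1;
    }

    // Stores a value; returns false if the buffer is full. No allocation happens here.
    boolean offer(long value) {
        if (tail - head == slots.length) {
            return false;
        }
        slots[(int) (tail & mask)] = value; // bit mask replaces a modulo
        tail++;
        return true;
    }

    // Removes and returns the oldest value; callers should check isEmpty() first.
    long poll() {
        if (head == tail) {
            throw new IllegalStateException("ring is empty");
        }
        long value = slots[(int) (head & mask)];
        head++;
        return value;
    }

    boolean isEmpty() {
        return head == tail;
    }

    // Pushes 1..n through a 1024-slot ring and returns the sum of everything read back.
    static long sumThrough(int n) {
        PrimitiveRing ring = new PrimitiveRing(1024);
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            if (!ring.offer(i)) {
                // Buffer full: drain it, then retry the write.
                while (!ring.isEmpty()) {
                    sum += ring.poll();
                }
                ring.offer(i);
            }
        }
        while (!ring.isEmpty()) {
            sum += ring.poll();
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumThrough(1_000_000));
    }
}
```

Because the values live in a long[] rather than in boxed objects, the loop in sumThrough creates no garbage after the constructor returns, which is the property the comment above is pointing at. A real multi-threaded design would additionally need memory-ordering guarantees that this sketch leaves out.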


Why not just stay with the definition that C is a portable assembler to write OSes and Lisps?)


C is the language someone would create if he had just come out of a 5-year-long attempt to build an operating system such as Multics and wanted to do it right this time ;-)


The C language is wonderful because you can look at Quake I, II, III, and instantly see the intent. In C++ you can't.


Quake3 is written in C++


Although Quake 3 contains some C++ code (in splines/ iirc), it's probably 99% pure C code, so I don't think it can be labeled as being written in C++.

https://github.com/id-Software/Quake-III-Arena



