Actually, this is how arrays are handled in C.
A C array is a set of consecutive memory addresses.
The first value is pointed to by a pointer.
int a[] = {1, 2, 3}; // create an array
// a is just a pointer to the first element...
No, "a" is an array, not a pointer. Defining an array object does not create a pointer. The expression "a" is implicitly converted to a pointer to the array's first element in most but not all contexts.
If "a" were nothing more than a pointer to the first element of the array, then "sizeof a" would yield the size of a pointer rather than the size of the array object.
This is all explained very well in section 6 of the comp.lang.c FAQ, http://www.c-faq.com/.
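A quick way to see the difference (a minimal sketch; the exact sizes depend on the platform):

#include <stdio.h>

int main(void) {
    int a[] = {1, 2, 3};
    int *p = a;                /* here "a" does convert to a pointer */
    printf("%zu\n", sizeof a); /* size of the whole array, e.g. 12 */
    printf("%zu\n", sizeof p); /* size of a pointer, e.g. 8 */
    return 0;
}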
It's also worth pointing out that there are no array parameters. If you write
void foo (char bar [42]) {
// ...
}
then within the scope of foo, bar will be a pointer to char, not an array of char. You can, on the other hand, write
void foo (char (*bar) [42]) {
// ...
}
in which case bar will be a pointer to array of 42 char (and notably, not a pointer to a pointer to char!).
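A small sketch of the difference (the sizes printed are typical for a 64-bit platform, not guaranteed):

#include <stdio.h>

void foo(char bar[42]) {           /* really: char *bar */
    printf("%zu\n", sizeof bar);   /* sizeof(char *), e.g. 8 */
}

void foo2(char (*bar)[42]) {       /* pointer to array of 42 char */
    printf("%zu\n", sizeof *bar);  /* 42 */
}

int main(void) {
    char buf[42];
    foo(buf);    /* buf converts to char * */
    foo2(&buf);  /* &buf has type char (*)[42] */
    return 0;
}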
C does some very strange things. There usually is (or at least was) a good reason for it to do so, but especially if you're a beginner, you need to be on your toes if you really want to use it well.
I think the problem with learning C is that you need to learn stuff like make, autoconf, how the compiler + preprocessors work (what do all those flags even mean!?), how making "cross-platform" stuff works, how to pull in and use "libraries", C-isms, how to test, etc. C itself is a very small and simple language, but the tooling and patterns are old and mysterious.
In college I learned how to program embedded systems with C (and ASM). Once I knew how to debug, build, and push it onto the device, it was surprisingly easy and beautiful. There's no confusing and magical abstractions, it's just you and the hardware. You pull up the microcontroller's manual, and as long as you have an idea of what the lingo means, you can make it dance to your will.
Something I disliked was that with both microcontrollers I used, there was dark magic involved in building and uploading your binary. One of the devices provided examples for a specific IDE, so I imported that example and used it as a base. I could keep modifying it and keep adding files, but I never managed to set up a "new" project. The other device was similar, but instead of using an IDE it provided a Makefile, which was nicer.
Does anyone know any good resources on learning and writing real-world C? I've looked at C projects here and there over the years, with the most interesting being GNU Coreutils... But even if I can eventually understand what some code does, how do I learn why it does it that way?
I'd love for a guide that showed things like: "do X and Y because of this and that", "test X and Y by running the following", "write tests using XYZ", "debug X using Tool A, and Y using Tool B", "pull in this library by copying these files into these places, and use it by adding the following lines to the following files", "generate docs by pulling in this tool and set it up by doing the following", etc.
In the web world there's a massive set of problems, but it tends to be easy to find "boilerplate" generators that will get you up and running. And after you've tried out a few of them, you can usually pick up how people are mixing and matching different tools.
EDIT: Another comment linked to "Learn C The Hard Way" [0], and after browsing through the chapters it seems to cover a lot of the topics I'm interested in.
Javascript has the same problem, but I think it's worse. At least there are platform standards in C (autotools in GNU, msbuild on Windows).
Trying to figure out how Grunt/Gulp/Broccoli, LESS/SASS/Stylus/Jade, Coffeescript, Uglify, Bower, Browserify, Require.js, AMD/CommonJS, NPM etc all work together is a nightmare.
It's all too hard, so people added Yeoman, Brunch, or other things to generate application configs - but now you need to decide which skeleton/generator you want, which takes you down the JS and CSS framework rabbithole.
Of course the only sane response to all this is to write another build system that doesn't repeat everybody else's mistakes.
I'm a C guy, and I never "worry" that if I have to recompile a project I worked on 1 or 2 years ago that I'm going to have to fight the toolchain to get it going again. Make will be make, and it will work.
I have a huge concern when I do anything in Javascript about what happens in 1 or 2 years from now when I have to modify a project I built using some Yeoman scaffolding. Is it still going to work? Will those node modules still be there? Can I update them without it breaking everything?
It's really weird for me, coming from the world of C, CVS/Git, Make, Linux, etc. where it's almost _unthinkable_ to introduce changes that would break older versions. Hell, even the word "old" generally means 5+ or 10+ years.
I don't understand. You can configure your node module to use specific versions of libraries just like a C project. Surely you've heard of problems people have with dynamic linkage and their programs not working? This is why distros spend so much time getting things to work with each other - I think you're just sweeping that under the rug.
The tools you're working with aren't different in this regard - it's not like your tool will magically recode itself to work differently.
Yep. I'm more comfortable with Make than I am with JSPM/NPM and System.js. The process to minify my javascript and CSS and then replace the paths in the HTML so my pages actually work seems to be needlessly complex.
I'm seriously considering whether I can use Make for my production website builds - the only issue is Windows support.
Try out webpack. Webpack makes handling all of your assets trivial. I converted around 3,000 lines of Gulp files into a 200 line webpack config. (Most of which ended up being aliases, and this is a reasonably big project.)
For an example of what makes it so awesome:
var img = require('./foo.png')
// "/output-path/0dcbbaa701328a3c262cfd45869e351f.png"
Webpack will copy this file (foo.png) to your output folder, and rename it using the file hash, so it does cache-busting.
You don't need to use other build tools, you can just use the webpack CLI. I personally use npm scripts.
Webpack also allows you to setup aliases for modules, as well as load pretty much anything you can imagine. You wanna pull in a module that doesn't use CommonJS and instead exports a global? Webpack has global-loader. AMD is supported as well.
Oh, and this includes a sane development environment, with reload on save, as well as hot loading assets that support it. (Check out react-hot-loader: http://gaearon.github.io/react-hot-loader/, but it works with css as well.)
And you can require css files in your components, and then add the extract-text-webpack-plugin so it'll rip the css from the generated JS bundle!
Aaaaand it handles SourceMaps, so you don't have to worry about some plugins (looking at you gulp) not playing well together.
Finally, it also handles minification, either through a CLI option or in the config.
You have a couple of options with make under windows.
There are native versions of make, where you use DOS shell commands to do stuff. MinGW provides POSIX-compatible native commands (you can use ls, cat, etc.). You can use MSYS, which gives you a POSIX-compatible build environment. Finally, something like Cygwin provides both a POSIX-compatible build environment and a POSIX-compatible runtime.
I never found C's transition from source code to hardware to be confusing. I struggle to comprehend why so many people, some longtime professional programmers, have trouble understanding what a linker does.
On the other hand, I downloaded the CUDA SDK. I couldn't even figure out where the GPU compiled code even resided. I suppose I was just supposed to take it as it "just works" (and it did), but it all left me highly uncomfortable.
The OS is big and scary. When you're working directly with the metal, it's SUPER simple, there's no magical abstractions. But when you're using libraries and doing system calls you don't really know what's going on. Sure, you can dig into em sometimes, but it's more than a bit daunting.
I don't think you need to learn the build system until after you're comfortable with single-file programs, where "gcc yourfile.c" is enough. Then add compiler options, and get to know Makefiles after your programs grow large enough to require multiple files.
GNU coreutils (and in general, a lot of the GNU projects) are rather excessively complex and certainly not what I'd advocate "learning by example" from. Take a look at the BSDs' standard utilities for simpler, more straightforward code.
> But even if I can eventually understand what some code does, how do I learn why it does it that way?
I believe that the best way to learn "why" is to ask "why not". You will see that a lot of programmers, unfortunately (IMHO), really don't know why and are just doing what they were taught to. If you don't do X, then either [1] it doesn't matter and you don't actually need to, or [2] it does, and you realise the reason why when you see how X makes things simpler/more efficient for either the programmer or the machine, or both.
Definitely. Not only is it worth doing the wrong thing first to understand why it's wrong, it's also often worth revisiting after you have more experience with the other alternatives.
Especially when it comes to programming paradigms and stuff like 'best practices'. There's a lot of cargo-culting in programming culture, and you really shouldn't take it as dogma.
Personally, the hardest part for me was keeping track of the size of everything. Coming from a higher level language, keeping track of the bits and bytes takes some getting used to. Working with arrays in C is much tougher especially when the compiler will compile almost anything you give it, and even a small mistake is catastrophic.
Before C, I was pampered and took everything for granted.
Now I appreciate my life more after C and feel blessed every time my IDE gives me a warning.
My pet hate is #include files. The whole way that C handles multiple source files just seems archaic to me, having worked in higher-level languages. I wish C had a proper package system that was standard, so I don't have to mess around with things like include file path order (or my favorite, the C++ template definitions having to be in the header files thing I only recently learned about).
Interestingly, I first started with C (although I haven't written a line of C code for a long time), and when I first moved to higher-level languages, I disliked the fact that I have no idea where the file I just imported is. More so when I'm playing with an obscure/new language: if I could just import whatever files I wanted (rather than importing at the package level), it seems it would be much easier to hack on the language/stdlib itself.
A sufficiently long include path can give you this problem anyway. I recently tripped over this when I created a "reason.h" and discovered that Windows had a file of the same name deep inside MFC.
I don't quite get this lament about being pampered in higher level languages. To me, it feels like someone saying they feel pampered for having indoor plumbing or running water.
Frankly, I don't use a lot of crazy libs or IDEs when writing C code. Most of my projects consist of one or two external libs and a few simple makefiles. I use Vim and clang/gcc for compiling and lldb/gdb for debugging.
As for compiler flags, the only ones I ever worry about are `-O`, `-g`, `-c`, CFLAGS and LDFLAGS.
What I've learned is that the way C includes other files/libs is extremely simple. The header files are just a mapping of the code in the `.so` or `.a`. If it's an `.so`, you can't make it a static executable, and if it's an `.a`, you have to.
I've never worked directly with low-level microcontroller programming, and the extent of my electronics knowledge is some fiddling with arduino. When I did that I used ino (http://inotool.org/) to compile and upload code to the board.
EDIT:
Projects that have good C code include: http://suckless.org/ and the linux kernel (and other stuff by Linus Torvalds). Avoid anything with GNU (most of it's over-engineered)
I recommend "The UNIX Programming Environment" by Kernighan and Pike. Partly because it was written so soon after UNIX and C themselves, it has very little of the modern 'cruft' in it. It's at the level of "cc program.c -o program".
Make is fairly easy to learn, at least in its basic form. Autoconf is horrendous.
> Make is fairly easy to learn, at least in its basic form. Autoconf is horrendous.
I completely agree. Make is extremely flexible on its own. I don't understand the need to abstract the build system to generate thousand-line makefiles that are impossible to hand edit.
As benwaffle says in adjacent comment, make is the 80% solution that works most of the time. Autoconf is the 100% solution that's supposed to work everywhere, no matter how weird or long-dead your UNIX is. In order to do that it does a vast number of compatibility tests. The result is complex enough that simple substitution of makefiles doesn't quite cut it.
Of course, that imposes the cost of 100% compatibility on every developer, when most would be happy to just build on today's Linux and call it a day.
> Autoconf is the 100% solution that's supposed to work everywhere
> Of course, that imposes the cost of 100% compatibility on every developer, when most would be happy to just build on today's Linux and call it a day.
The reality is that when you use programs built with autotools on systems that aren't mainstream, you'll have troubles. Because the scripts aren't right and were only tested by Linux developers on Linux.
And much of the time, I find it faster and easier to fix a broken makefile than to fix broken autohell.
> And much of the time, I find it faster and easier to fix a broken makefile than to fix broken autohell.
Exactly. 99% of the time, a broken makefile simply has an incorrect linker path or cflag. When it's not a path, the Makefile is structured in a way that makes sense and is easy to fix. If an autoconf project is broken, I just scrap it and don't even bother trying to build it.
The other issue with autoconf is that it's not standardized. So many of the autoconf projects I've seen have shell scripts (to install it or download deps) mixed in that only add more confusion. Some of them have a configure script. Some have a configure.in, so you have to generate the configure yourself.
100% of the makefiles I've seen have a build, install and clean task. Sure, it's not required, but everybody does it. You can't say the same for autoconf.
The idea is to make it work on all platforms, only requiring minimal POSIX compatibility. It also checks for any requirements you specify, and sets up stuff like make install, make distcheck, make check. It handles compiling your code into a library regardless of the OS.
Unit tests are more effective than using a debugger. I use a debugger a couple of times a month; the rest of the time it's unit tests with sprinkled asserts and debug prints in the code.
BTW: just want to say that you have a fascinating blog! I've just spent last hour only skimming through some of the articles and bookmarking them for later. (Link for the lazy: https://nickdesaulniers.github.io/)
I switched to CMake for all of my C and C++ projects and i never looked back. Takes care of a bunch of makefile issues for you once you get used to it.
C is quirky, flawed, and an enormous success. While accidents of history surely helped, it evidently satisfied a need for a system implementation language efficient enough to displace assembly language, yet sufficiently abstract and fluent to describe algorithms and interactions in a wide variety of environments.
-- Dennis M. Ritchie (in "The Development of the C Language" [1])
These slides were full of dangerous, elementary errors when they were submitted to /r/c_programming a couple of days ago. The author doesn't know C enough to have worthwhile views on the language.
I was quite impressed how many important concepts are covered in such a small presentation. He explicitly doesn't cover the things in C that programmers from other domains will understand easily (functions, conditionals, loops etc.) and goes straight to the things that make C different.
Hey author of the slides here, glad you liked it :D
I used http://remarkjs.com/ to make the slides. All you have to do is include the script, add a textfield with your content in markdown and it automatically converts to a slide show.
I was hoping this would actually tell me how to accomplish something in C. I know all about pointers and memory, but I don't know anything about the current state of C development. What libraries do people use? What are common memory management strategies? Etc.
Libraries depend on what you need to do. A lot of work is done with just the std lib.
For domain specific stuff you use domain specific stuff. For generic stuff, well, you tend not to want the generic data structure libs that come with other languages' stdlibs, because it's hard to do those both efficiently and cleanly (you can only pick one). It's easy enough to hack up a dynamic array or a hash table with a fixed capacity that can only insert and get, maybe delete, so you tend to do this when you need it. (Also, a sorted array usually does very well in place of a hash table, with not much code.)
Memory management depends on what you're doing. I use a lot of arenas and pools, and as a result very rarely have to worry too much about memory management. Some things become harder here, like dynamically sized arrays, but you can do this with chunks of fixed length arrays. (or whatever)
I have to do very little string processing most of the time (and when I do, the strings usually have small, known maximum lengths), and I imagine this memory management technique would work less well for that.
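For what it's worth, here is a minimal sketch of the kind of arena I mean (the names and the alignment choice are made up for illustration; a real one would handle growth and alignment more carefully):

#include <stdlib.h>

typedef struct {
    char  *base;
    size_t used;
    size_t cap;
} Arena;

int arena_init(Arena *a, size_t cap) {
    a->base = malloc(cap);
    a->used = 0;
    a->cap  = cap;
    return a->base != NULL;
}

void *arena_alloc(Arena *a, size_t n) {
    n = (n + 15) & ~(size_t)15;            /* crude 16-byte alignment */
    if (n > a->cap - a->used) return NULL; /* out of space */
    void *p = a->base + a->used;
    a->used += n;
    return p;
}

void arena_reset(Arena *a)   { a->used = 0; }   /* "free" everything at once */
void arena_destroy(Arena *a) { free(a->base); }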
The issue with undefined behavior is it's kind of hard to... well... define. It's an odd combination of unsafe memory, float/integer rollover, and rules with memory allocation. I wasn't quite sure how to clearly state it.
All you really need to say is: There are some rules in C which neither the compiler nor the runtime is required to check if you've broken. On the contrary, the implementation is allowed to assume that you haven't broken them - which means that if you do break these rules, all bets are off as far as the behaviour of your program goes, which may lead to all sorts of strange and apparently inconsistent results.
Yeah I tend not to get hung up on the different ways it can trigger. It's more like, if you trigger it your program might run fine for now and then suddenly act up one day. It might die before the undefined behavior. It might launch nukes. Still sure you want to get into this? No? Want to go back to Ruby? Well, Ruby is built atop C. That's the five states of grief I try to get them through.
Unfortunately, UB extends far beyond that. To the point of being (very) logically inconsistent.
People not familiar with C would naively expect that, for instance, `x[10] == x[10]` is always true even if 10 is out-of-bounds for x (or rather: it may crash, or it may be true). But compilers can - and will - treat it as false if that makes the code faster.
This sort of thing is my major pain point with C - you quite literally have to know the entire code to make any judgements about any one piece of code, even trivial ones. You end up having to fight the compiler at every turn.
Not only can the compiler make it false. It can assume that the code would never make the comparison in the first place, and optimize away any cases where the undefined behavior is guaranteed to be triggered.
So in a sense undefined behavior can travel back in time :O
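A hedged illustration of what that can look like (whether a given compiler actually does this depends on the optimizer, but it is allowed to):

int bump(int *p) {
    int x = *p;      /* if p is NULL this is undefined behaviour, so the */
    if (p == NULL)   /* compiler may assume p != NULL and delete this    */
        return -1;   /* check (and the early return) entirely            */
    return x + 1;
}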
What exactly do you mean by modular code in your context?
edit: Oh, you probably meant modules integrated directly in the language. I was surprised to read that since modular design is highly successful and useful in C.
command line utility development, writing a replacement to systemd
Those don't require direct use of C, necessarily - but rather some form of FFI to POSIX and the syscall interface. Hell, if one is writing a systemd replacement, I'd encourage use of OCaml. It'd spare you lots of boilerplate because you're writing relatively high level userspace logic, anyway.
Red Hat's Richard Jones (who is also often commenting on Hacker News) does a lot of system programming in OCaml (mostly VM management-related I think):
Why do presentations always get interpreted vs compiled wrong? It's not a property of a language; it's a property of the runtime. For example, Java is interpreted on old versions, JIT compiled on the desktop, and compiled AoT on Android...
Compiled vs. interpreted is intended there to mean something a little different than runtime behaviors.
Compiled - source files are run through a compiler which produces some output file that's then run. With C or Java you run the compiler on the .c or .java file. C compiles to an executable, Java compiles to byte code.
Interpreted - source is fed to an interpreter which typically turns it into some kind of representation that's immediately run with no intermediary step on the part of the user/programmer. Perl, PHP, Ruby, Python, JS, TCL, etc. are typically run this way.
The terms are inherently a little muddy, since you can compile some interpreted languages to bytecode (common in PHP, and how JRuby/Jython are often run), and in theory could write a dynamic loader for C / Java to run them as interpreted - no idea if someone has been so perverse as to do it.
The behavior of a JRE or other runtime in loading bytecode/whatever to execute - JIT, AOT, or whatever has to do with how a runtime handles bytecode that's already been compiled from source and turns it to machine code to execute. The terms show up there also, but have a different meaning.
That's not what people are typically discussing when they talk about a compiled vs. interpreted language, and the author was correctly using the terms when referring to languages to draw the distinction in what a developer does (rather than what a runtime does, which is basically irrelevant to C).
In your explanation you even state "an interpreter which typically...". You talk about compilers and interpreters, which aren't part of the language.
The way you describe it, compiled or interpreted is a transitive property of a language, based on a popular way to utilize it.
If the definitions are "inherently a little muddy", then they're not very useful. Javascript and compile-to-JS languages have a ton of compilers written for them. Does that make them compiled or interpreted languages? How popular does compiling a language have to be for it to become compiled instead of interpreted?
It's no wonder that the parent prefers the strict definitions which lack this ambiguity.
I think the slide was correct in expressing the idea it wanted to express using terms that are commonly used in the sense they used, while the parent was being pointlessly pedantic in wishing the terms had some strict sense that was the only true definition.
Sometimes terms do have very strict definitions that are only correctly used in a limited sense. In the case of compiled vs. interpreted that's not the way things are and criticizing an introductory doc. for not adhering to an arbitrarily selected strict definition is pedantic and counter-productive.
No, seriously. All the lecture had to say to be correct and at least as useful is that C requires a build step unlike a lot of high level languages which don't.
Nothing about compilers vs interpreters; none of that is relevant, and the fact that it's incorrect is just icing.
This is not a terminology debating society, you're welcome to look up the definitions on wikipedia or take an introductory CS course if you're not sure what a compiler is or the difference between a language and a compiler/interpreter for it.
You're right that if the slide said C requires a build step unlike a lot of high level languages it would be correct. The terms "interpreted language" and "compiled language" have a sense in normal usage that I described, and I just verified that Wikipedia doesn't agree with you. There are no standards bodies that define formal definitions for those terms, common usage is how they are defined. All this is pointless pedantic quibbling, though, so it wasn't really worth my time, nor is it worth yours, really.
Agreed, interpreted/compiled shouldn't necessarily be seen as part of a language definition. However, I see two counter arguments.
When using preprocessor macros in C/C++, these only make sense with a compilation step. But they are part of the language, are they not?
In interpreted languages, you generally have an eval function/command that lets the interpreter execute any dynamically constructed code. That eval function is arguably perceived as part of the language, but only works in an interpreted environment.
int my_var = 3; // It's an int!
my_var = "abc"; // COMPILER ERROR! You clearly stated that it was an int!
"abc" gets interned and has a memory address. The statement my_var = "abc" tries to assign that address to my_var, but since it's not a pointer to char it gets cast to int instead, possibly truncating the value.
The program still compiles, just printing a warning.
As someone who just reread "The C Programming Language", in chapter 1 they teach you how to count occurrences of characters, so I'm not sure I understand the sarcastic tone of "you're not going to count foo in a file". C can be used for many things; I don't think the author of the slides did a great job of describing "when" C is the right tool for the job.
I think he is just saying that C isn't worth learning if you just need to do really simple things that a ton of modern languages can do in one or two lines.
As someone who occasionally dabbles in C code, I am interested in knowing what is the modern take on `goto`s.
I know they are "harmful" but I often come across code riddled with goto statements [1] and I personally feel that as long as it makes the code readable without significantly obfuscating the logic, goto is a perfectly fine way of doing things (although popular opinion and consideration for best practices have more or less forced me to remove goto from my list of C tools). Also, the fact that it maps almost directly to asm makes it easier to reason about the generated machine code (although if that's a significant reason for using goto is questionable).
Goto is fine. The 'harmful' style was using them in favor of ifs and loops.
The things I've seen people do to avoid a goto are pretty awful though. If you ever use a 'do {} while (0)' just to break out of it, you should feel bad. Goto is much clearer and cleaner than nonsense like that.
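For comparison, the two shapes side by side (a sketch; do_a, do_b and do_c are hypothetical helpers, with do_a/do_b returning 0 on failure):

int do_a(void);
int do_b(void);
void do_c(void);

void with_do_while(void) {     /* the do {} while (0) trick */
    do {
        if (!do_a()) break;
        if (!do_b()) break;
        do_c();
    } while (0);
}

void with_goto(void) {         /* the same thing with goto */
    if (!do_a()) goto out;
    if (!do_b()) goto out;
    do_c();
out:
    return;
}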
I think one uses gotos for two, sometimes three, reasons.
1. Sometimes one feels the need to abuse exceptions. But C doesn't have exceptions. So one abuses the old goto instead.
2. Sometimes the code is far more readable if you use a goto to short-circuit a complex block of code, which would become insufferably more complicated if it had to, say, keep track of a trivial case: every branch ends up sprouting an extra "and not the trivial case" condition.
3. You can use long jumps to do exception handling stuff. I've never had to actually do this.
One comment. I remember trying to read ancient code that abused goto's mostly because the programmer was desperately trying to fit everything into 4k of prom in languages that didn't support structured code. That was the kind of stuff Dijkstra was bitching about, not uses 1, 2, and 3. And actually since C has always had modern control structures goto just is not abused much in practice. Probably the opposite.
Side note.
int my_var = 3;
my_var = "abc";
Just usually generates a warning when compiled. If run my_var will usually get loaded with the address of "abc". If you follow it with the statement
printf("my_var=%s\n", my_var); // this will throw a warning
The convention of the project is that every function that has to deal with an error condition has an "error" label. These macros are used to jump to that label to clean up the function before returning. Here's an example:
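(A sketch of the shape of it - the CHECK macro and the resources here are illustrative stand-ins, not the project's actual macros:)

#include <stdio.h>
#include <stdlib.h>

#define CHECK(cond) do { if (!(cond)) goto error; } while (0)

int load_widget(const char *path) {
    FILE *f = NULL;
    char *buf = NULL;

    f = fopen(path, "rb");
    CHECK(f != NULL);

    buf = malloc(4096);
    CHECK(buf != NULL);

    CHECK(fread(buf, 1, 4096, f) > 0);

    /* ... use buf ... */

    free(buf);
    fclose(f);
    return 0;

error:
    free(buf);           /* free(NULL) is a no-op */
    if (f) fclose(f);
    return -1;
}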
I think that this use of gotos is more clear than the alternative of a lot of nested if statements checking for success or having the clean up logic duplicated in a lot of places.
Personally, though, I prefer using C++ and destructors (and, where appropriate, exceptions):
result_t foo ()
{
// Allow exceptions to propagate
auto r1 = get_r1 ();
auto r2 = get_r2 ();
auto r3 = get_r3 ();
return get_result (r1, r2, r3);
}
Exceptions aren't always appropriate, but the C++ still usually ends up a little cleaner. I would love to see something like Haskell's Maybe monad that lets you write code like the above, but returning status information instead of throwing exceptions.
Why is writing a compiler in C a premature optimization? Ignoring performance is just a bad approach to writing software. And the "premature optimization" phrase is overused, even when it's not relevant to the discussion.
I misread the title as "C for high level programming" and was a bit confused. Of course, long time ago, C was considered a HLL.
The presentation is quite good. I might point people to it in the future so they might at least have an idea in what dangerous place they want to venture.
So as someone who has an unhealthy love of C I have to say this is amazing and will be forwarding it on to some friends trying to learn C. The humor is just fantastic IMHO.
May I suggest that you also send them a link to Zed Shaw's "Learn C The Hard Way" online book [0], which presents lessons in modern C programming, including the use of tools like Valgrind, etc.?
I just wrote a comment [0] about my problems with learning C, would you say this book covers the issues that I raised and is worth reading?
EDIT: Well, I just looked over the chapters in this book and I'm now extremely excited to give it a read. It seems to cover most of the topics that I'm interested in, so thank you SO MUCH for sharing!
The thing about pointers that I don't get is why do you need the memory address of the variable? Is that the only way to get the value when you want it? Like, every variable has to have a pointer in order to make use of the variable?
This comes from a restriction of almost all existing computer architectures. You have a small number (16 on amd64) of 'variables' called registers that you can directly work with. Additional variables have to be loaded from and stored in memory, which is slower and requires you to know that variable's address (pointer) -- which is just an integer with special meaning. In C, variables you never take a pointer to might live exclusively in a register, and all local variables are stored at a fixed offset to a special 'stack pointer' which is kept in a dedicated register.
Some architectures try to have a more sophisticated approach and have some sort of 'fat pointer', in which pointer values have a special tag and are subject to special rules so they can only point to valid objects. The exact rules used and what constitutes 'valid' are specific to the architecture. Intel has introduced MPX on newer processors to check array bounds with such a scheme, and older architectures such as Lisp machines had much stronger (but less efficient) schemes.
If you want arrays of variables, you need pointers. It's not enough to know the value of an int (the first element of the array), but you need the pointer to an int (pointer to the first element of the array). Now you can increment the pointer to go to the next element. You couldn't do this with a simple value.
Also, suppose you want to pass a variable of a large data type, like an image, to a function. Instead of copying the entire variable, just pass the cheap pointer. The analogy is giving someone a URL vs the source code of the site for them to paste into the browser (you can't fit the latter into a QR code, for instance, but you can the former).
Of course the problem is, if you're passing a pointer to a data structure to a function, the function doesn't know the size of the data structure unless you pass that as another argument.
You meant to say, "if you're passing a pointer to an array to a function, the function doesn't know the size of the array unless you pass that as another argument".
When passing a (pointer to a) data structure to a function, in 99.99% of cases there's only one data structure you'd pass, and you build this into the function's prototype, e.g.,
int myfunction( struct my_structure *x )
instead of
int myfunction( void *x )
and so, yes, the function does know the size of the structure. And in the case of arrays, often it's enough to mark the end of the array (with '\0' in the case of char arrays or NULL in the case of pointer arrays), I'd only roll my sleeves up and worry about minimizing length calculations if I had actually done some profiling and determined that such nitty-gritty optimization was needed (it rarely is).
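For example, walking a NULL-terminated array of pointers needs no separate length argument (a minimal sketch):

#include <stdio.h>

void print_all(const char **names) {  /* the end is marked by NULL */
    for (; *names != NULL; names++)
        printf("%s\n", *names);
}

int main(void) {
    const char *names[] = { "alice", "bob", "carol", NULL };
    print_all(names);
    return 0;
}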
You don't need the address of a variable: you can use the plain variable just fine, and most C code does a lot of that.
What a pointer does is add a level of indirection: so instead of having a value "an integer" you can have a value which is "the location of an integer". A variable holding such a value can be assigned the location of any integer variable, and importantly can also be reassigned the location of a different integer variable.
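Concretely, a trivial sketch:

#include <stdio.h>

int main(void) {
    int a = 1, b = 2;
    int *p = &a;             /* p holds the location of a */
    *p = 10;                 /* writes through p: a is now 10 */
    p = &b;                  /* reassign p to point at b instead */
    *p = 20;                 /* b is now 20, a is untouched */
    printf("%d %d\n", a, b); /* prints "10 20" */
    return 0;
}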
The additional indirection also means that you can link one data structure to another without including one as an integral part of the other.
And why is this useful? Well, for high-level folks, pointers are used for roughly the same thing as reference variables in other languages.
For low-level folks, sometimes you need to be able to read from / write to a specific address in memory. So if you have, for instance, a system clock device that always gives you the current time if you read address 0x1234, you might do something like this:
uint64_t system_time;
volatile uint64_t *system_time_device = (volatile uint64_t *) 0x1234; // A pointer to the system time device...
system_time = *system_time_device; // read the contents of the memory at address 0x1234 to get the time
Speaking from a C++ perspective, the item pointed to may not be a simple type (like an integer) but a complex object.
Copying an object around all over the place (into functions, out of them) would be expensive. It would be like copying an entire ledger every time you wanted to make a change to the ledger. Far better would be to hand the ledger around, or when looking for it ask "where is it?" and be pointed to where it is now.
It also makes it simpler to make sure all your data is in one place, which is a good thing for program design.
If you don't have a copy of the value, you need some way to find it, right? That's a pointer, or a reference, or a handle. (These are all approximate synonyms for some way to "address" the data.) In the olden days, there was a fixed mapping from a number to a physical location in storage, but now there are many levels of indirection, such as virtual memory, pools, etc.
These slides are great, I was thinking of presenting something similar to a bioinformatics lab group I am a part of. Could I adapt some of these slides?
I remember someone using LINQ to do fancy things on the results of an SQL query. It was stupid. Far better would be to have written the SQL properly in the first place, instead of grabbing loads of data and doing cartwheels client-side.
Teaching C and not checking malloc is a bad idea. Using realloc the way it's used in this "tutorial" can cause memory leaks: realloc should be checked before being reassigned, because it can return NULL, and that would overwrite the previous valid memory address.
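The safe pattern is to assign realloc's result to a temporary first; a minimal sketch (grow and its parameters are hypothetical names):

#include <stdlib.h>

/* grow the buffer behind *items; returns 0 on success, -1 on failure */
int grow(int **items, size_t *cap) {
    size_t new_cap = *cap ? *cap * 2 : 16;
    int *tmp = realloc(*items, new_cap * sizeof **items);
    if (tmp == NULL)
        return -1;   /* *items is still valid and still owned by the caller */
    *items = tmp;
    *cap = new_cap;
    return 0;
}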
Not checking malloc's return value can easily lead to security vulnerabilities, particularly in bytecode interpreters and things like that.
The basic plan is simple: Trick the target program into allocating an impossible huge block of memory (e.g. 3 gigabyte on a 32 bit system). Malloc will return NULL but the program blindly assumes the allocation has succeeded. Now use carefully chosen indices to read and write whatever memory you want.
An issue which is not Linux-specific: when calculating the size of an array allocation, you can overflow size_t. So when mallocing arrays, you're supposed to check the computed size for overflow (or use calloc(3) or OpenBSD/libbsd's reallocarray(3) instead).
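The usual check looks something like this (a sketch; calloc(3) and reallocarray(3) do the equivalent for you):

#include <stdint.h>
#include <stdlib.h>

void *alloc_array(size_t nmemb, size_t size) {
    if (size != 0 && nmemb > SIZE_MAX / size)
        return NULL;             /* nmemb * size would overflow size_t */
    return malloc(nmemb * size);
}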
If "a" were nothing more than a pointer to the first element of the array, then "sizeof a" would yield the size of a pointer rather than the size of the array object.
This is all explained very well in section 6 of the comp.lang.c FAQ, http://www.c-faq.com/.