
The fact that one must watch out for integer overflow on a buffer size calculation for malloc, realloc, etc. should be no surprise to a thoughtful C programmer. It sounds like somebody just did an audit of an old code base looking for bad patterns and this is what came out of it. That's great work on the part of the people doing the audits/fixes, but I'm not sure it's freakout material ("massive" seems a bit sensationalist).



The only reason this isn't worth a freakout is that nobody running Linux in a graphical desktop configuration could ever have assumed that the machine was multiuser-safe to begin with. The reality is probably that no Linux (or FreeBSD) system in any configuration is secure from an attacker that can run native code in an unprivileged process. But with X11 running, you never had a chance.

I'm always amused by the dismissive tone programmers take to findings like this --- "oh, any thoughtful C programmer could avoid these problems". No, they can't. How many billions of dollars does it take to prove this point? The very best secure programmers in the world, building C (and C++) codebases from scratch specifically to address these problems, almost invariably fail to do so: memory corruption bugs get found in these systems just like they do in '90s-vintage X11 code.

C is my very favorite language and I'm not saying people shouldn't use it (well... they should avoid it, actually), but people should work in it with their eyes open.


At no time has there ever been a Linux kernel release with no local root exploits. There have been periods of uncertainty when we didn't know the exact exploit but we always find out eventually. Keep this in mind whenever you think about Linux security.


This is true for virtually all software.

That said, Linux users do have an unwarranted confidence in the security of the operating system. The kernel developers actually don't care that much about security--but because it was designed with some level of sanity, most installations are headless, and the non-headless installations aren't used by very many people, it has been able to tout some apparently obvious superiority to e.g. Windows (which isn't really the case post-Vista, when Microsoft had enough of being a punching bag.) Same thing goes for Mac--a whole lot of confidence, but not a lot to back it up. Curiously, a lot more Mac malware has shown up after they gained considerable market share.

There is no security in this world, only opportunity.


Careful. I'm not making a "Linux vs. Windows" argument here. I'm making a Linux desktop vs. any other computing environment argument. Not all computing environments are equally exposed to attacks. It is for instance much more annoying to exploit iOS vulnerabilities than it is to pivot from user "nobody" to root on an Ubuntu desktop system.

That these two environments aren't comparable --- one is a heavily-locked down and simplified computing environment and one is a general purpose desktop operating system --- is exactly my point.


I totally agree. It wouldn't be fair to compare Ubuntu to Chrome OS either. (My response was aimed more at thrownaway2424's comment, and I didn't mean to give the impression that everything is equally vulnerable, just that most software has exploitable bugs that may or may not be known.)


Did I say avoid the problems? I'm pretty sure I said that the problems are not surprising. And, to add to that: I think it's especially not surprising in an older code base from an era in which this was less common knowledge. New code bases are of course not infallible but awareness of this is a higher priority than it used to be.

I think it's also interesting that issues like this sometimes remain in libraries because people think the caller should figure out overflow. From a casual read of the announcement it looks like they took a better approach in some areas, i.e. the library assumes the worst of the caller. I'm reminded of something I read a few years back about the OpenBSD folks adding more overflow checks to calloc, which multiplies its two size arguments. It's nice when you can catch it in a library rather than pushing decisions to the caller.
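Roughly the shape of that kind of check, as a sketch in the spirit of what OpenBSD did with calloc/reallocarray (the wrapper name here is made up; this is not the actual libc code):

    #include <errno.h>
    #include <stdint.h>
    #include <stdlib.h>

    /* Illustrative wrapper: allocate nmemb objects of `size` bytes each,
     * failing cleanly instead of letting the product wrap.  A sketch of
     * the kind of check OpenBSD added to calloc()/reallocarray(), not
     * the real implementation. */
    void *xallocarray(size_t nmemb, size_t size)
    {
        /* nmemb * size overflows exactly when nmemb > SIZE_MAX / size */
        if (size != 0 && nmemb > SIZE_MAX / size) {
            errno = ENOMEM;
            return NULL;
        }
        return malloc(nmemb * size);
    }

A caller asking for xallocarray(n, sizeof(struct foo)) then gets NULL back instead of a too-small buffer, which is exactly the "assume the worst of the caller" approach.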


> C is my very favorite language and I'm not saying people shouldn't use it (well... they should avoid it, actually),

Could you elaborate on this? Particularly "you should avoid it, but I'm not saying you shouldn't use it" (what exactly do you mean?), although I'm also curious why it's your favorite. This is coming from someone who is spending more and more time with C, so genuinely interested.


I suspect it's something like "don't use it unless you need to".


Yep. Although, "I just want to" is a fine reason, as long as you know what you're getting yourself into.


The reality is probably that no Linux (or FreeBSD) system in any configuration is secure from an attacker that can run native code in an unprivileged process.

There are a lot of web hosting companies that use Linux on their servers. Are you saying they're all doing it wrong? Or by "an unprivileged process" do you mean "not specifically sandboxed process"?


A thoughtful programmer can avoid these problems by using proper process separation. For example, OpenSSH works this way. In this setup, compromising the untrusted part of the application does not give you full control. If you can keep the trusted part small, you can avoid a lot of problems.
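A minimal sketch of the pattern, not OpenSSH's actual code (the "nobody" uid/gid of 65534, the chroot directory, and the worker function are all placeholders): the privileged parent stays tiny, while the code that touches untrusted input runs in a child that has already dropped its privileges.

    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    /* Hypothetical worker that handles untrusted input. */
    static int handle_untrusted_input(int fd)
    {
        char buf[4096];
        ssize_t n = read(fd, buf, sizeof buf);
        /* ... parse buf here; a bug in this code only compromises an
         * unprivileged, chroot'ed process, not the whole machine ... */
        return n < 0 ? 1 : 0;
    }

    int main(void)
    {
        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            return 1;
        }
        if (pid == 0) {
            /* Child: confine it, then drop root before touching input.
             * 65534 stands in for the "nobody" uid/gid. */
            if (chroot("/var/empty") != 0 || chdir("/") != 0 ||
                setgid(65534) != 0 || setuid(65534) != 0)
                _exit(1);
            _exit(handle_untrusted_input(STDIN_FILENO));
        }
        /* Parent: the small, trusted part.  It keeps its privileges but
         * never parses attacker-controlled data itself. */
        int status;
        waitpid(pid, &status, 0);
        return 0;
    }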

X11 violates this principle big-time. Basically, it's a giant application that runs entirely as root. For most of its lifetime on Linux, it included a huge chunk of what we would normally expect to be video driver code that lives in the kernel. It's always had a lot of trivial denial-of-service attacks, keystroke grabbing attacks, and so on.

X11 was a great system, especially for the time, but let's not pretend that it was "the very best secure programmers in the world, building C (and C++) codebases from scratch... specifically for security." We are building better systems, like Wayland, the Android stack, and so on, and it's not taking "billions of dollars." Maybe millions.


Do you have any data, or even theoretical reasons you can give as to why Wayland, Android or anything else will be engineered in a more secure way?


Developers are much more security aware these days. Wayland will not require root privileges, which means a buffer overflow isn't as critical. It also will not be network transparent, which reduces the attack surface enormously. Remember, too, that it was created because Xorg had accumulated so much cruft over the years, obliged to support dozens of outdated X11 extensions that nothing (except for emacs :)) uses. Hopefully that means Wayland's devs will keep it lean and clean and easy to secure.


Two things.

(1) Developers tend to be more aware of secure programming issues than they were in the 1990s, when you could coerce a shell out of a lot of SUID programs and network daemons with a well-placed pipe filter or semicolon character. But that doesn't make them better at implementing more securely. Case in point: Nginx, a codebase that was designed from the start for modern C secure programming, had an integer overflow just a couple days ago; a couple years back, our team found another serious Nginx bug (that time, an info leak) stemming from their attempts to be extra-cautious. So simply being aware of secure programming issues isn't an assurance that code is actually more secure.

(2) More importantly: it doesn't matter so much if there are fewer memory corruption vulnerabilities in C programs. Like I said before: what matters is if there are any of them, or at least any that can be discovered in a reasonable amount of time. In (say) C# or Java serverside code that doesn't call out to C code, we can say with some assurance that there aren't any such flaws; we don't have to stare at the same 10,000 lines of code for a month to prove that to ourselves. You simply can't say that about C codebases. There could be a memory corruption flaw in Dovecot right now, or in Postfix, or (less likely, but still...) tinydns.


Well, we're going to have to agree to disagree. I do think security has improved since the 1990s. I think that this kind of "we're all doomed, it doesn't matter what you know" attitude that you are displaying is exactly the problem. It does matter what you know and how you architect the system. Security has improved a lot, even on platforms like Windows that are heavily C/C++ based.

Garbage collected languages are nice, but they're really just an extension of the general principle of minimizing the trusted code base. They are not a silver bullet (witness the recent Java, Ruby, etc exploits). I am all in favor of these languages, and especially Golang, which shows a lot of promise.

What I'm not in favor of is using a really crappy program written in language X as a stick to beat the language designers with, regardless of what value of X we're talking about.


I gave specific examples of modern C codebases for which lots of attention had been paid to security that subsequently suffered vulnerabilities stemming from memory corruption. You respond with "garbage collected languages are nice but they're really just an extension of the general principle of minimizing the trusted codebase", which is nonresponsive to my point. We agree: we disagree.

I don't think it's negligent to write things in C. I like C, a lot; it's my first language. But stop kidding yourself that with just a couple modern development practices you'll produce code in C that is comparably secure to Python or serverside Java. You won't.


In some ways, C is more secure than Python because it doesn't support eval. Things like pickle can easily be abused.

Also, I notice you completely snipped the point about using process separation. Sigh. Mistakes of the past, doomed to repeat, and all that.


C does support eval. C calls it "trying to copy strings".


I'm really annoyed by your refusal to stay on topic. I pointed out a specific problem with security in many higher level languages-- the presence of eval and eval-like constructs in the language. You changed the topic. I pointed out that minimizing the size of the trusted code base, and reducing the privileges with which code runs is the foundation of any successful attempt to make secure coding easier. You changed the topic.

I'm really tired of the ideology that everything written in C is insecure, and everything written in higher-level languages smells like roses. Should we be surprised that things like Wordpress, Ruby on Rails, and even client-side Java are riddled with security vulnerabilities? Well, when programmers refuse to learn from the mistakes of the past, and think using a higher-level language is a magic elixir for achieving security, it's not a surprise.


"The fact that one must watch out for integer overflow on a buffer size calculation for malloc, realloc, etc."

This should be considered evidence that C is not a good language to use in code that needs to be secure or reliable.


I feel like this statement does not reflect an understanding of the problem.

Machines do not have arbitrary-precision integers. If you need to allocate X units of Y bytes each, and your integers are 32 bits, and X is big enough that X * Y won't fit in 32 bits... then this problem has to exist somewhere: the machine will have to truncate the product into a 32 bit entity, the bit pattern of which is a legitimate quantity in its own right. The allocator doesn't and can't know whether you really meant "the lower 32 bits of X * Y" or "the product of X and Y", because the former is a totally valid input.

You could very well make the same mistake in a higher level language since AFAIK most of them don't have arbitrary precision integers by default: create an array of a size based on an expression with an integer overflow. The difference is that it will blow up in a "safer" manner when you index beyond the end of the array. You could argue that this is the real benefit of the higher level language and perhaps you're right, but somebody at some layer of the system must implement that check; the machine does not have it as an inherent feature.
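To make that concrete, here's a toy example (the struct and the numbers are made up): the multiplication wraps, and malloc happily hands back a buffer far smaller than the caller believes it has.

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    struct pixel { uint8_t r, g, b, a; };   /* 4 bytes on common ABIs */

    int main(void)
    {
        /* Imagine this count comes from an untrusted file header. */
        size_t count = SIZE_MAX / sizeof(struct pixel) + 2;

        /* The product wraps around to a handful of bytes.  malloc()
         * cannot tell "the low bits of count * 4" apart from a
         * genuinely small request. */
        size_t bytes = count * sizeof(struct pixel);
        printf("asked for %zu elements, allocating only %zu bytes\n",
               count, bytes);

        struct pixel *p = malloc(bytes);
        (void)p;   /* writing p[0..count-1] would now run far past the end */
        return 0;
    }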


> Machines do not have arbitrary-precision integers.

Machines don't have concepts like "integers" at all. They have concrete functions that produce defined outputs with defined inputs, which can be interpreted as fixed size integers, components of arbitrary precision integers, or specifications of glyphs, or numerous other things. But none of those interpretations are inherent in the machine.


By integer I mean machine word. Those do exist.

I suppose we can get philosophical and say that there is no such thing as a machine word, that when you operate on the memory using different instructions, or jump into the memory, or whatever, this "machine word" becomes something else... But IMO this is a pointless exercise. The fixed-size machine word is a reality in the machines we're talking about. (How that gets translated into a higher level language like C will vary from machine to machine, compiler to compiler, but the analogy still stands and I do not believe it is meaningful to argue such points unless engaged in purposeful misinterpretation.)


"Machines do not have arbitrary-precision integers"

Nor do they typically have multidimensional arrays. The reason we use compilers and high-level languages is so that we can think in terms of an abstract machine that is easier to work with. Languages exist for the benefit of programmers, and the less mental effort programmers have to spend on nonsense (like remembering that arithmetic is done modulo 2^32 -- or is it 2^64, or 2^29 [yes, really], etc.?) the better able they are to create useful programs.

"the machine will have to truncate the product into a 32 bit entity, the bit pattern of which is a legitimate quantity in its own right"

Right, and that is why the compiler should translate "x * y" into an arbitrary-precision multiplication -- because programmers assume that "int" refers to "integers," which do not get truncated when they are multiplied. The only reason a compiler should not do such a thing is when it can deduce bounds on the operands that ensure no truncation will occur. Languages that use arbitrary precision arithmetic by default are better and make programmers more productive.

"You could very well make the same mistake in a higher level language since AFAIK most of them don't have arbitrary precision integers by default"

Yeah, but integer overflows can be and should be reported as errors in languages that do not support arbitrary precision arithmetic. It is very unusual for an overflow to be the desired behavior of a program, and more often than not, overflows are bugs. There was an attack on an electronic voting machine (shown in Hacking Democracy) that exploited an integer overflow -- had the software simply entered an error state, the attack would have been stopped regardless.

"You could argue that this is the real benefit of the higher level language and perhaps you're right"

Yes, that is basically the point I was making. Let's ignore the part about arbitrary precision arithmetic, and simply assume that overflows will go unreported. Why should it be possible to read or write arbitrary locations in memory as a result? If you try to access an array index that is out of bounds, an exception should be thrown. If the compiler can guarantee that you'll never be out of bounds, i.e. that the exception would never be thrown, it can remove the bounds check as an optimization -- but only when there is such a guarantee.

"somebody at some layer of the system must implement that check"

That is why we have compilers: they can insert checks for us, automatically, everywhere the checks are needed. It is not different from having your compiler ensure that all functions use the same argument passing convention. Again, this is why we use compilers -- compilers are better at doing mundane, tedious tasks like setting up stack frames or checking array boundaries. Programmers tend to make mistakes with these things, and those mistakes usually lead to disastrous problems later on.


> Why should it be possible to read or write arbitrary locations in memory as a result?

Because that is the point of an allocator. Your favorite language cannot have allocators in the same sense that C can. Instead of going on a crusade against the language, why not simply accept that some problem domains have to care about that?

In this discussion I am not here griping about my favorite criticisms of various higher level languages (some examples: lots of code out there that assumes memory allocation is free, data structures with lots of gratuitous pointers [oops sorry, "object references"!] that slow down access, over-reliance on exceptions i.e. thousands of gotos that cross stack frames) because I admit that these languages are good for certain problem sets.


The issue is not with the allocator; the issue is with the ability to read or write beyond the bounds of an array. The fact that arrays are represented with pointers only worsens the problem. Even for low-level code, it never makes sense to step out of array bounds. Worse, though, is the fact that most C code is high-level, with no particular need or benefit from pointer acrobatics, and exposed to potentially malicious inputs.

A common refrain in these discussions is that C is good for some problem domains, and high level languages are good for others. The problem is that C is not being used only for one domain or some set of domains; it is being used everywhere. Web browsers and web servers are written in C. Email clients are written in C. Instant messaging programs. It is hard to see how my email program benefits from pointer acrobatics, unchecked array access, or a pile of undefined behavior -- so why was it written in C?

You could say this is my main issue. I can just assume that C is good in some domain, even if I never come across it; what I cannot understand is why high-level software is being written in a language that bogs programmers down with low-level concerns. Even implementing a small core in C, and the rest in a high-level language (e.g. Emacs) makes more sense than what we see today.


> The issue is not with the allocator; the issue is with the ability to read or write beyond the bounds of an array. ... Even for low-level code, it never makes sense to step out of array bounds.

You must understand that from the low level perspective, the array and its bounds do not exist. This is an abstraction. An allocator needs to chop up and slice a larger block of memory and give different segments to different callers. You cannot do this if something is enforcing bounds.

What you can do is build higher level abstractions that track bounds and do runtime checks at every access. In C there is nothing to stop you from doing that as a library. In C++, some STL implementations even have bounds checking as a compile time option. But the languages don't force you into paying that cost.
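For example, a bounds-tracking wrapper in plain C might look like this (a toy sketch, not any particular library): the length travels with the pointer and every access goes through a checking accessor, but nothing in the language makes you use it.

    #include <stdio.h>
    #include <stdlib.h>

    /* Toy bounds-checked array type; illustrative only. */
    struct int_array {
        size_t len;
        int *data;
    };

    static int checked_get(const struct int_array *a, size_t i)
    {
        if (i >= a->len) {
            fprintf(stderr, "index %zu out of bounds (len %zu)\n", i, a->len);
            abort();              /* fail loudly instead of reading garbage */
        }
        return a->data[i];
    }

    int main(void)
    {
        int storage[4] = {1, 2, 3, 4};
        struct int_array a = { 4, storage };
        printf("%d\n", checked_get(&a, 3));   /* fine */
        printf("%d\n", checked_get(&a, 4));   /* aborts instead of overrunning */
        return 0;
    }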


"An allocator needs to chop up and slice a larger block of memory and give different segments to different callers. You cannot do this if something is enforcing bounds."

Sure you can -- if the enforcement of array boundaries is based on types, and is not applied when dealing with a generic, low-level pointer type. Take a look at the implementation of SBCL (a Common Lisp compiler) to see this sort of thing in action.

Yes, building abstractions is the right thing to do, but libraries are the wrong way to do it. The problem with libraries is that the programmer needs to expend their mental energy on using the library, and needs to remember to not just use what the language provides them. If anything, the programmer should be forced to use a library to avoid bounds checking -- extra effort should be required to do dangerous things, rather than to be safe.

"the languages don't force you into paying that cost."

Neither do high-level languages, if you can guarantee that the cost does not need to be paid (and if you are using a half decent compiler). If your compiler can deduce that your array index cannot be out of bounds, it should generate code without a bounds check. That is why Lisp has type hints, and it is one of the arguments you hear in favor of static type checking.


After all these years, why do we still have CPU architectures that don't trap on integer overflow?


We kind of already do. For example, there is the INTO (interrupt on overflow) x86 instruction, which calls an interrupt if the overflow flag is set. Nobody ever uses it, though. Probably because it saves you nothing over just using JO (jump on overflow) after every arithmetic instruction, which itself has so little cost (overflow doesn't usually happen, so the branch will almost always be predicted) that many languages have compilers that already do this by default.

But C is a terribly primitive language. If I remember correctly, unsigned integers don't really have overflow; the result of any arithmetic operation is reduced modulo one more than the maximum, so UINT_MAX + 1 is required to be 0. Signed overflow is supposed to be undefined, but I suppose C programmers care more about saving one instruction plus a usually-predicted branch than they care about making programs more reliable. Of course, a lot of programs depend on signed integers behaving just like unsigned integers, so fixing this would be painful.
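For what it's worth, GCC and Clang expose the overflow flag through builtins such as __builtin_add_overflow, so a compiler (or a careful programmer) can get exactly the JO-after-every-add scheme described above without new hardware. A small sketch (the helper name is made up):

    #include <limits.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Checked addition via the GCC/Clang __builtin_add_overflow
     * intrinsic, which compiles to an add plus a branch on the
     * overflow flag -- almost always correctly predicted. */
    static int checked_add(int a, int b)
    {
        int result;
        if (__builtin_add_overflow(a, b, &result)) {
            fprintf(stderr, "integer overflow: %d + %d\n", a, b);
            abort();
        }
        return result;
    }

    int main(void)
    {
        printf("%d\n", checked_add(1, 2));        /* prints 3 */
        printf("%d\n", checked_add(INT_MAX, 1));  /* aborts */
        return 0;
    }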


Because in the vast majority of cases you don't need to and it would unnecessarily slow down a lot of legitimate computation? You can already detect integer overflow extremely easily, why would the cpu need to trap?


Because if

     0x7FFFFFFF =                   2^31 - 1
     0x7FFFFFFF + 1 = 0x80000000 = -2^31
then

     2^31 - 1 + 1 = -2^31
     2^31 + 1 = -2^31 + 1   (adding 1)
     2^31 = -2^31           (subtracting 1)
     1 = -1                 (dividing by 2^31)


I think you may not have understood my question. It is _trivial_ to detect integer overflow in virtually every major cpu I've ever seen. x86, ARM, PPC, MIPS, and many, many more all support this in a single instruction. Trapping, on the other hand, is extremely expensive in modern processors, on the order of tens of thousands of cycles wasted. Furthermore, there are many cases where integer overflow is actually desired. My question is: why would a cpu trap be a better solution than what we already have?


For the same reason we trap on integer division by zero.

> It is _trivial_ to detect integer overflow

I think if we counted the number of + operations in a typical C program and multiplied that by the instruction(s) needed to implement a conditional branch path (often never taken dead code), it would be nontrivial. This overhead should be easily measurable on benchmarks.

> Furthermore, there are many cases where integer overflow is actually desired.

I'm not arguing against the ability to do that, but I think those cases are far less common.

> My question is: why would a cpu trap be a better solution than what we already have?

To allow compilers (not just C) to emit safer code by default by removing the small but measurable instruction penalty.


> For the same reason we trap on integer division by zero.

The thing is I can hardly imagine a situation where a division by zero would be normal and expected, while having integers overflow as a normal part of a calculation is business as usual in many cases. Having to handle a trap (or disabling/enabling it dynamically) would slow things down way too much. Many crypto ciphers for instance specify operations "modulo 2^32" in order to take advantage of this "feature".

> To allow compilers (not just C) to emit safer code by default by removing the small but measurable instruction penalty.

Detecting overflow in asm is just testing the carry bit of the processor. A trap would be orders of magnitude slower on any modern architecture. You seem to be assuming that overflows are rare in C programs. I'm not sure this is true, but I have no hard data on that so who knows...


> Many crypto ciphers for instance specify operations "modulo 2^32" in order to take advantage of this "feature".

Which is one reason I'm specifically not advocating for doing away with nonsignaling modulo-2^2^n addition.

However, of all the addition operations in the gigabytes of object code on my hard drive, what percentage of them do you think actually encode the true desire of an application developer to perform wrapping arithmetic?


> For the same reason we trap on integer division by zero.

We trap on division because it is objectively a programmer error that has no obvious right answer. Division is also a relatively expensive and much less common operation for the cpu, so the relative overhead of adding trapping to it is much smaller than it would be for addition. Furthermore, integer overflow has an easy solution: widen the computation. Divide by 0 has no quick answer.

> I think if we counted the number of + operations in a typical C program and multiplied that by the instruction(s) needed to implement a conditional branch path (often never taken dead code), it would be nontrivial. This overhead should be easily measurable on benchmarks.

A consistently not taken branch has very, very close to 0 overhead in a modern processor. Not to mention that adding trapping to addition would hinder things like instruction reordering and speculative execution, which adds an overhead that can't be disabled when overflow can provably not occur or not matter. The cpu would also have to treat unsigned and signed addition differently, which means an additional opcode, which adds complexity to the control logic and decreases code density. Finally, there are plenty of languages and libraries which do almost exactly what you propose and automatically widen variables when they overflow and do not experience significant slowdown.

>I'm not arguing against the ability to do that, but I think those cases are far less common.

You are wrong. Consider the pointer arithmetic expression "ptr += x;" on a cpu that has two trapping add instructions (add unsigned and add signed) and ptr is a pointer and x is a signed value. Note that this is extremely common for things like array access. Suppose the compiler generated an unsigned add. If the ptr value was 0x1 and x was -1 that would generate an overflow because 0x00000001 + 0xFFFFFFFF results in an unsigned overflow. Suppose it instead generated a signed add. If the ptr value was 0x7FFFFFFF and x was 1 this would also generate an overflow because 0x00000001 + 0x7FFFFFFF results in signed overflow. To resolve this you would need 4 different instructions, which would only get worse as you add various operand lengths to the mix. Now I wouldn't say this is a problem for a majority of add instructions, but adding a signed and unsigned number is a very common operation. In addition, if you actually look at generated assembly code, many of the add instructions do not correspond to a '+' but instead do things like stack management or pointer offset adjustment which do not care about overflow because they can be proved to not occur.

> To allow compilers (not just C) to emit safer code by default by removing the small but measurable instruction penalty.

Again, there is virtually no overhead on a modern out of order, branch predicting processor. And adding trapping does have an overhead, a significant one, that cannot be removed when overflow can be proved to not occur or isn't important.

As a final argument, consider this: of all the major cpu architectures I've studied (MIPS, x86, ARM) not a single one has a trapping add instruction. Very smart people have also considered this problem and have come to the same conclusion: trapping is bad and should only be used in truly exceptional circumstances where no recovery is obvious. I agree with you that integer overflow is a big problem in languages like C that do not have a good way to detect it. However, this is truly a language only problem NOT a cpu one.


> We trap on division because it is objectively a programmer error that has no obvious right answer.

Defining (2^31 - 1) + 1 = -2^31 is no more of an "obvious right answer" than defining N/0 = 42 for all N. It's just one that we computer engineers have been trained to accept because it is occasionally useful.

But it's still madness which discourages us from using tools that emit more correct code.

> A consistently not taken branch has very, very close to 0 overhead in a modern processor.

Except the half a dozen or so bytes it consumes in the cache.

Nevertheless, I concede that others have thought about this a lot more than I have and done so with hard data in front of them.


(Assuming you meant 2^31)

The first equation in that second block (and so the rest of the block) doesn't really make sense in 32 bit two's complement arithmetic - there is no 2^31. The real weirdness is that -(-2^31) = -2^31.

The fundamental issue here is that computer representations of numbers aren't accurate and don't follow the usual rules, in more ways than just overflow.


> Assuming you meant 2^31

Where?

> The first equation in that second block (and so the rest of the block) doesn't really make sense in 32 bit two's complement arithmetic

My point exactly.

> computer representations of numbers aren't accurate

The numbers which can be represented in a computer are perfectly accurate.

> and don't follow the usual rules, in more ways than just overflow.

At a low level, we program computers to perform bit operations on registers. Some operations are more straightforward and efficient than others, but ultimately it's how we interpret the bit patterns that makes them "numbers".

So if they "don't follow the usual rules", it's our own fault.

Security of low-level code has been enough of a royal fuckup for long enough now that perhaps we should start to reevaluate some of our traditional assumptions about how CPUs should behave.


>> The first equation in that second block (and so the rest of the block) doesn't really make sense in 32 bit two's complement arithmetic

> My point exactly.

It doesn't make sense because there's no such number as 2^31 in 32-bit two's complement arithmetic, and I don't think the finite range of finite-length number representations is really a solvable problem. This inevitably leads to a lot of mathematical weirdness, because our math is built on the set of integers being infinite (which is the only kind of set of integers in which closure on addition makes any sense at all).

>> and don't follow the usual rules, in more ways than just overflow.

> At a low level, we program computers to perfom bit operations on registers. Some operations are more straightforward and efficient than others, but ultimately it's how we interpret the bit patterns that makes them "numbers".

Any finite representation of numbers is going to have to break the familiar mathematical rules somehow; I'm pretty satisfied with the tradeoffs made in our current integer representations (unsigned more than signed, but that's in some ways an easier problem).


"Any finite representation of numbers is going to have to break the familiar mathematical rules somehow;"

How is that? The only numbers you would have trouble with are non-computable numbers, and I doubt you will find many software systems that deal with those. The only limiting factor should be the memory that is available, but that is beyond the scope of software; the software should work up to the limits of available memory, and increasing the amount of memory should increase the range of numbers the program can deal with.

"I'm pretty satisfied with the tradeoffs made in our current integer representations"

I'm not, because (a) the semantics are counter-intuitive and (b) not a day goes by without people finding bugs related to integer overflows or weird, unexpected behavior. We have security problems. We have reliability problems. Voting machines get hacked and report negative vote totals.

At the very least, integer overflows should trigger an exception unless the programmer explicitly requests arithmetic modulo 2^32 (or 2^64, or 2^29, or whatever weird bit width the hardware supports).


> It doesn't make sense because there's no such number as 2^31 in 32-bit two's complement arithmetic

I started out with the definition:

    0x7FFFFFFF = 2^31 - 1
Adding one to that

    0x7FFFFFFF + 1 = 2^31 - 1 + 1
should be 2^31, but it isn't.


C is fundamentally difficult when it comes to mis-allocation of memory.

intp = malloc(int); looks almost exactly the same as int*p = malloc(int); but will work on a 32 bit system. Now imagine working with variable-sized structs over a 20,000 line code base and you start to see the issue.


If your compiler doesn't warn you about that, try upgrading to an alternative. Anything from the last twenty years should do.


I see an awful lot of C and C++ projects that generate large numbers of warnings during a build. It would be pretty easy to miss one like that.


Which is why

    gcc -Werror
is a good idea


So you are going to take the time to correct hundreds or sometimes thousands of warnings in some open source program you need to use at your job?


No one said that... -Werror is good when you are developing, not when you are compiling someone else's code.


Every good programmer I met enables as many warnings as possible and strives for zero warnings. This is just professional practice like keeping your code in a version control system.


gcc/clang warns about this particular problem without any extra command-line options. You need to go out of your way to stop it complaining.


When I hear people say this I think "there's someone without a lot of exposure". Yes the mistakes do happen. Aside from the fact that bugs will happen with your high level languages too, when you get on a good stride with C you will find the mistakes not as frequent, and you will become good at debugging and fixing them when they do happen. As mentioned on this thread they won't go to zero, and I'll admit they'll sometimes be a different set of problems than with other languages. But it's not totally unmanageable or unapproachable.

PS: your comment would hold more weight if you had written C in your example. It looks like HN's italic formatter got in your way in the first snippet, but p = malloc(int) will not compile.


It's just not true that experienced C programmers don't generate integer overflows. They do so routinely. Practically every large C codebase has had problems with them. SQLite has had exploitable integer overflows. Even qmail had an LP64 bug. The grain of truth that makes your comment sound plausible is that bad programmers generate more integer overflows. But that's not the problem with C --- it's not that there are too many integer overflows, it's the very high likelihood that any given codebase harbors any of them.


Again, I don't think I ever said that people will stop making mistakes, just that the frequency of mistakes and ability to deal with them when discovered (or even ability to discover them) will change with experience. I feel like too many people dismiss the problem as scary and intractable and that's IMO not a good reason to never write any C.


If you read what I wrote up there again, I think you'll see that it actually is an argument for avoiding C.


"bugs will happen with your high level languages too"

Sure, but the nature of the bugs is different. In C and C++, you have low-level bugs and high-level bugs to deal with; in a high-level language, you almost never have to deal with low-level bugs. Pointers are an inherently low-level construct, necessary in low-level code but completely inappropriate in high-level code.

"it's not totally unmanageable or unapproachable"

It is, however, a waste of time and mental effort, and it leads to less reliable software. The majority of C and C++ programs are high level programs that neither require nor benefit from the low-level features of those languages; unfortunately, writing a non-trivial C/C++ program without using low-level features (like pointers) is impossible.


This is a case when the int* p = malloc(sizeof *p); syntax is extremely useful: there's no risk of making a typo for complicated types.
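A quick illustration of why that idiom helps (the struct name is invented): the element type appears exactly once, and sizeof *p always tracks whatever p points to, so there's no second place for a typo or a later type change to go wrong.

    #include <stdlib.h>

    /* Made-up struct; the point is that the name is long and easy to mistype. */
    struct very_long_config_record { char name[64]; int flags; double weights[16]; };

    int main(void)
    {
        /* Allocate 20 records without ever repeating the type name. */
        struct very_long_config_record *p = malloc(20 * sizeof *p);
        free(p);
        return 0;
    }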


> ... should be no surprise to a thoughtful C programmer...

Those are in shortage nowadays...



