
I've recently tried to fix JOE (a portable UNIX program) so that it compiles without warnings and with a minimum of casting under -Wconversion. This is what I've found:

Chars: I hate that they are signed, because I like the convention of promoting them to int and then using -1 as an error code. It's easy to forget to convert to unsigned first, and the compiler will not complain. In the past I've used 'unsigned char' everywhere, but it's a mess because strings are plain char and all the library functions expect plain char. My new strategy is to use 256 as the error code instead of -1. The only problem is that getchar() uses -1, so it's inconsistent. IMHO, it's a C-standard mistake that char is signed.
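
Roughly like this (the helper name here is made up, just to illustrate the idea): every valid byte comes back as 0..255 and the error marker is 256, so nothing ever collides with a sign-extended char, and the one place that still sees getchar()'s -1 translates it immediately.

    #include <stdio.h>

    #define NO_MORE 256   /* error marker outside the 0..255 byte range */

    static int next_byte(FILE *f)
    {
        int c = getc(f);                  /* the library still says EOF (-1)... */
        return (c == EOF) ? NO_MORE : c;  /* ...so translate it once, right here */
    }

    int main(void)
    {
        int c;
        while ((c = next_byte(stdin)) != NO_MORE)
            putchar(c);
        return 0;
    }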

I used to use int for indexes and long for file offsets. But these days, int is too small for indexes on 64-bit systems and long is not large enough for file offsets on 32-bit systems.

ptrdiff_t is the new int. I've switched to ptrdiff_t in place of int and off_t in place of long. Ptrdiff_t is correct on every system except maybe 16-bit MS-DOS (where it's 32 bits, but I think it should be 16 bits). Off_t is a long long if you have '#define _FILE_OFFSET_BITS 64'. Ptrdiff_t is ugly and is defined in an odd include file: stddef.h. It's not used much by the C library.
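
For example (assuming a POSIX system; the _FILE_OFFSET_BITS define has to come before any system header is pulled in for the 64-bit off_t to take effect):

    #define _FILE_OFFSET_BITS 64   /* must precede every #include */

    #include <stddef.h>      /* ptrdiff_t */
    #include <sys/types.h>   /* off_t */
    #include <stdio.h>

    int main(void)
    {
        char buf[128];
        char *p = buf + 100;
        ptrdiff_t idx = p - buf;     /* pointer differences already have this type */
        off_t pos = (off_t)1 << 40;  /* > 2^32, needs the 64-bit off_t */

        printf("index %td, sizeof(off_t) = %zu\n", idx, sizeof(off_t));
        return (pos > 0) ? 0 : 1;
    }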

The C library likes to use size_t and ssize_t. The definition of ssize_t is just crazy: it should simply be the signed version of size_t, but it isn't (POSIX only requires it to hold values from -1 up to SSIZE_MAX).

I understand why size_t is unsigned, but I kind of wish it were signed. It's rare to have objects larger than 2^(word size - 1), so signed is OK. You are guaranteed to get -Wconversion warnings if you use size_t, because ptrdiff_t is signed (and even if you don't use ptrdiff_t, pointer differences still produce a signed result, so you will have warnings). Anyway, to limit the damage I make versions of malloc, strlen and sizeof which return or take ptrdiff_t. They complain if the result is ever negative. Yes, this is weird, but I think it's better than scattering explicit casts around to silence the warnings. Casts are always dangerous.
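
Roughly like this (the names jmalloc, zlen and SIZEOF are illustrative, not necessarily what JOE actually uses): everything traffics in a signed ptrdiff_t, and anything negative or too large to fit aborts loudly instead of silently wrapping.

    #include <stddef.h>
    #include <stdint.h>    /* PTRDIFF_MAX */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define SIZEOF(x) ((ptrdiff_t)sizeof(x))   /* signed sizeof */

    static void *jmalloc(ptrdiff_t n)
    {
        if (n < 0) {
            fprintf(stderr, "jmalloc: negative size %td\n", n);
            abort();
        }
        return malloc((size_t)n);
    }

    static ptrdiff_t zlen(const char *s)
    {
        size_t n = strlen(s);
        if (n > (size_t)PTRDIFF_MAX) {   /* would turn negative if converted */
            fprintf(stderr, "zlen: string too long for ptrdiff_t\n");
            abort();
        }
        return (ptrdiff_t)n;
    }

    int main(void)
    {
        int *a = jmalloc(SIZEOF(int) * 10);   /* no -Wconversion noise: all signed */
        printf("%td\n", zlen("hello"));
        free(a);
        return 0;
    }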




> IMHO, it's a C-standard mistake that char is signed.

The standard doesn't specify whether char is signed or unsigned, it's left to the implementation.
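
You can check what a given implementation picked with CHAR_MIN from limits.h: it's 0 when plain char is unsigned and negative when it's signed. A trivial sketch:

    #include <limits.h>
    #include <stdio.h>

    int main(void)
    {
        /* CHAR_MIN is 0 if plain char is unsigned, SCHAR_MIN if it's signed */
        printf("plain char is %s\n", CHAR_MIN < 0 ? "signed" : "unsigned");
        return 0;
    }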


chars are only signed on some platforms (x86, for one). On others they're unsigned (ARM, for one).

One knock-on effect of this is that strcmp() will return different values on the two different platforms for UTF-8 strings (because 0xff > 32, but -1 < 32)...
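
Here is the effect in isolation (worth noting that a conforming strcmp is specified to compare as unsigned char, so it's hand-rolled comparison loops over plain char where this really bites):

    #include <stdio.h>

    int main(void)
    {
        char c = (char)0xff;   /* a lead byte from a UTF-8 multibyte sequence */

        /* promoted to int: 255 where char is unsigned, -1 where it's signed */
        if (c > ' ')
            printf("unsigned char: 0xff sorts above space\n");
        else
            printf("signed char: 0xff becomes -1 and sorts below space\n");
        return 0;
    }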

Incidentally, I don't know if you know about intptr_t; it's an integer type large enough to hold a pointer losslessly. It's dead handy. (My current project involves a system with 16-bit ints, 32-bit long longs, and 20-bit pointers...)
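
For example (intptr_t comes from stdint.h and is a C99 addition, technically optional in the standard, so it may not suit a very conservative portability target):

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        int x = 42;
        intptr_t n = (intptr_t)&x;   /* pointer -> integer, no information lost */
        int *p = (int *)n;           /* integer -> pointer, round-trips back to &x */

        printf("%d\n", *p);
        return 0;
    }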


I did not know that chars were unsigned on ARM - interesting.

I try to be conservative with the definitions I use, so I'm worried that intptr_t might be too new.



