
htons etc. are very 1980s, and I’ll make a fairly strong claim: they should never be used in new code, with a single exception. The reason is that an int with network endianness simply should not exist. In other words, when someone sends you a four-byte big-endian integer, they sent four bytes, not an int. You can turn it into an int by shifting each byte by the relevant amount and ORing them together. And a modern compiler will generate good code.

The sole exception is legacy APIs like inet_aton() that actually require these nonsensical conversions.




> You can turn it into an int by shifting each byte by the relevant amount and oring them together.

Yeah! And you can even write a function to do that for you! Maybe call it "ntohl".


You could also use the functions with explicit bit widths, like bswap64 and bswap32.


No, you should not. By the time you have a wrong-endian uint64_t or whatever, you’ve already done it wrong.


After re-reading your comment above, I'm actually confused. You think you should never store a big-endian int? That is ridiculous. Some architectures are big-endian. You should not be using custom bitswapping as part of application code, because you cannot know the endianness of your architecture.

The ntoh* functions are the right approach, and your claim is not only strong, it's wrong. The ntoh* functions exist to transform network byte-order to host byte-order. Depending on your architecture endianness, their functionality will change.


Let me try saying it differently. The following code is poorly written:

    char *buf = ...;
    uint32_t word = *(uint32_t *)buf;
    uint32_t host_word = ntohl(word);
Because you just type-punned the read from buf. (In fact, this code is UB.) You could write it a little better like:

    char *buf = ...;
    uint32_t word;
    memcpy(&word, buf, 4);
    uint32_t host_word = ntohl(word);
This version is well-defined: copying the bytes into a uint32_t with memcpy is the sanctioned way to reinterpret them in both C and C++. A union also works in C, though reading an inactive union member is UB in C++.

But none of these variants are sensible, and, in fact, they don't even translate to most languages safer than C. The correct way to write this code is:

    const unsigned char *buf = ...;  /* unsigned, so bytes >= 0x80 don't sign-extend */
    uint32_t host_word = ((uint32_t)buf[0] << 24) |
        ((uint32_t)buf[1] << 16) | ((uint32_t)buf[2] << 8) |
        (uint32_t)buf[3];
On any recent compiler, this will generate as good or better code, and it doesn't make pointless assumptions about the representation of uint32_t on the platform you're using.
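
For anyone who wants to reuse the shift-and-OR approach, here is a minimal sketch of it wrapped in a pair of helpers. The names load_be32 and store_be32 are made up for illustration and are not from any standard header. The byte pointer is unsigned char so that bytes with the high bit set are not sign-extended.

```c
#include <stdint.h>

/* Decode a 32-bit big-endian ("network order") value from 4 raw bytes.
   Hypothetical helper name. The result does not depend on the host's
   byte order, because only shifts on values are used. */
static inline uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
            (uint32_t)p[3];
}

/* Encode a host integer as 4 big-endian bytes (inverse of load_be32).
   Also a hypothetical helper name. */
static inline void store_be32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)((v >> 24) & 0xffu);
    p[1] = (unsigned char)((v >> 16) & 0xffu);
    p[2] = (unsigned char)((v >> 8) & 0xffu);
    p[3] = (unsigned char)(v & 0xffu);
}
```

With these, the rest of the program only ever sees byte buffers and ordinary host integers; no variable ever holds a "network-order" value.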

So I stand by my claim: well-written modern C code should not contain any "network-order" values. They should contain bytes, vectors of bytes, and numbers.


My C code didn't include mention of ints though, so I'm wondering where you got that from.

Your first example is UB and again, is not something my example depended on.

Your final claims are overly cautious. It is perfectly fine to use uint32_t in this way. uint32_t is defined as a 32-bit unsigned integer. There is a bijection between network-order 32-bit unsigned integers and host-order integers, and ntohl is that bijection. It is no different from storing any other value. It is certainly not wrong.
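
A rough sketch of the bijection point, assuming a POSIX system that provides htonl and ntohl in <arpa/inet.h>. The helper names roundtrips and decode_with_ntohl are hypothetical, chosen only for this example.

```c
#include <arpa/inet.h>  /* POSIX: htonl, ntohl */
#include <stdint.h>
#include <string.h>

/* Returns 1 if ntohl undoes htonl for x, which holds for every
   32-bit value on either kind of host. Hypothetical helper name. */
static inline int roundtrips(uint32_t x)
{
    return ntohl(htonl(x)) == x;
}

/* Decode 4 big-endian bytes by copying them into a uint32_t and
   converting with ntohl. memcpy avoids the aliasing problem of
   casting the buffer pointer. Hypothetical helper name. */
static inline uint32_t decode_with_ntohl(const unsigned char *p)
{
    uint32_t word;
    memcpy(&word, p, sizeof word);
    return ntohl(word);
}
```

On a big-endian host ntohl is the identity and on a little-endian host it swaps the bytes, so decode_with_ntohl returns the same number in both cases.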



