How Unreliable is UDP? (openmymind.net)
221 points by thefreeman on Oct 16, 2014 | 110 comments



UDP is just about as reliable or unreliable as IP. It's a shim on top of IP to let unprivileged users send IP datagrams that can be multiplexed back to the right user. (Hence the name: "user" datagram protocol.)

Lots of people talk about a "TCP or UDP" design decision, but that's usually not the relevant question for an application developer. Most UDP applications wouldn't just blat their data directly into UDP datagrams, just like almost nobody blats their data directly into IP datagrams.

Typical applications generally want some sort of transport protocol on top of the datagrams (whether UDP-in-IP or just plain IP) that cares about reliability and often about fairness.

That could be TCP-over-IP, but some (like MixApp, which wanted TCP but not the kernel's default implementation) use TCP-over-UDP-over-IP, BitTorrent uses LEDBAT-over-UDP, Mosh uses SSP-over-UDP, Chrome uses QUIC-over-UDP and falls back to SPDY-over-TCP and HTTP-over-TCP, etc.

The idea that using UDP somewhere in the stack means "designing and implementing a reliable transport protocol from scratch" is silly. It doesn't mean that any more than using IP does.

The question of SOCK_STREAM vs. SOCK_DGRAM is more typically, "Does the application want to use the one transport protocol that the kernel implements, namely a particular flavor of TCP, or does it want to use some other transport protocol implemented in a userspace library?"
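
To make that concrete, here is a minimal Python sketch of the two socket types (the endpoints are placeholders): SOCK_STREAM hands you the kernel's TCP, while SOCK_DGRAM hands you bare datagrams on which a userspace library can build whatever transport it likes.

    import socket

    # SOCK_STREAM: the kernel's TCP handles the handshake, retransmission and ordering.
    tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    tcp.connect(("example.com", 80))
    tcp.sendall(b"GET / HTTP/1.0\r\nHost: example.com\r\n\r\n")

    # SOCK_DGRAM: bare UDP datagrams; any reliability, ordering or congestion
    # control has to come from a userspace transport layered on top.
    udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    udp.sendto(b"hello", ("192.0.2.1", 9999))  # fire and forget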


> UDP is just about as reliable or unreliable as IP

That's in theory.

In practice ISPs routinely classify, prioritize and shape application traffic (based on either TCP/UDP ports or the results of deep packet inspection), so running SSL over IP will yield a different IP loss profile than running SSL over UDP.


Funny how this is more or less the opposite of what I think, so I'll add my point of view as a counter-argument. IMHO if you need a transport layer, in 99.9% of cases you are well served by TCP and should not consider UDP at all, except as a last resort for implementing your own transport protocol. Instead, UDP is the way to go when you don't need a transport layer at all and can get by with a trivial ack/resend strategy (DNS is the obvious example), or with best-effort transfers where packet loss is not an issue at all (think a thermometer sending readings every second). In those cases the TCP three-way handshake would be overkill, and the benefits of TCP are minimal because of the use case.
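
A minimal Python sketch of that trivial ack/resend strategy, assuming a request/response exchange where the reply itself acts as the ack (roughly what a stub DNS resolver does); the function and parameters are made up for illustration.

    import socket

    def query_with_retry(payload, addr, retries=3, timeout=1.0):
        # Send one datagram and wait for a single reply, resending on timeout.
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.settimeout(timeout)
        for _ in range(retries):
            sock.sendto(payload, addr)
            try:
                reply, _addr = sock.recvfrom(4096)
                return reply
            except socket.timeout:
                continue  # lost request or lost reply: just send it again
        raise TimeoutError("no reply after %d attempts" % retries)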


> Instead, UDP is the way to go when you don't need a transport layer at all and can get by with a trivial ack/resend strategy (DNS is the obvious example), or with best-effort transfers where packet loss is not an issue at all (think a thermometer sending readings every second)

That depends I guess. As somebody who works mostly with cloud telephony, SIP (signaling) and RTP (audio) are almost always carried over UDP. The funny thing is that you'll see more occurrences of one-way audio or dropped calls because people implement an application-level protocol like SIP wrongly than because the transport layer let them down.

That said, I would any day prefer a private MPLS network rather than the public internet.

Though I can understand why somebody who writes a fast nosql database prefers TCP. Also, thanks for redis, we use it a lot and love it! :)


SIP signalling over UDP was a bad idea, especially with the mandatory TCP failover. SIP's actually a spectacular clusterfuck of a design overall anyway. It's effectively impossible to implement SIP unambiguously on today's networks because even the parsing rules are so bad that multiple popular stacks have incompatible ways of dealing with things.

UDP for RTP makes sense because transmitting packets is pointless. In case of packetloss, the end user has to interpolate the result either via software or in their head.

This private MPLS network... it wouldn't happen to run over the exact same equipment that'd be handling your IP traffic, would it?


Just a small correction for anyone else following along. This:

> UDP for RTP makes sense because transmitting packets is pointless

Should be this:

> UDP for RTP makes sense because retransmitting packets is pointless

TCP will retransmit if no ACK is received; UDP won't (there being no ACK and all).


At the same time, beware of "helpful" lower layer protocols.

Mobile wireless stuff (EDGE/3G) goes to great lengths to avoid dropping packets, so if the conditions are right (a moving train in an area with poor coverage is one place where you can easily see this), you can get packets "reliably delivered" in 20 seconds or more.


This really bugs me. It seems like they should be leaving the retransmission to TCP.


Well, yes and no.

TCP has been designed in such a way that it interprets packet drops as a sign of "congestion" (which was typically true in ye olden days of purely wired networking), and it will start sending less data in response.

Whereas in wireless networking, occasional packet drops are just a fact of life and are not indicative of competing flows trying to share the channel. So it actually makes [some] sense that wireless protocols try to compensate for the behaviour of the transport protocol used by 90% of all data: TCP.


> That said, I would any day prefer a private MPLS network rather than the public internet.

Thanks to your comment, a short Wikipedia trip later, I now know that "penultimate hop popping" is a thing and has a fantastic name.


The one case where you don't really want TCP semantics, is when you want SCTP/RTP semantics: basically, user datagrams decomposed into frames and then priority-multiplexed over a reliable circuit, then buffered back into in-order messages. This is, it turns out, what web browsers, game clients, VoIP services, and a lot of other things want—to the point that if we're going to have one kernel-implemented protocol and everything else has to live on top of UDP, I wish the one kernel protocol was SCTP. (Because one-channel SCTP really looks a lot like TCP.)


The middle ground is to improve the latency of TCP along the lines of this: https://tools.ietf.org/html/draft-ietf-tcpm-fastopen-10

I read that Chrome, Linux 3.6 and Android all support this.
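
For reference, a rough Python-on-Linux sketch of what using TCP Fast Open looks like; the constants (with their Linux fallback values) and the addresses are assumptions, and the kernel's net.ipv4.tcp_fastopen sysctl has to permit it.

    import socket

    # Linux-only constants; fall back to the raw Linux values if this build lacks them.
    TCP_FASTOPEN = getattr(socket, "TCP_FASTOPEN", 23)
    MSG_FASTOPEN = getattr(socket, "MSG_FASTOPEN", 0x20000000)

    # Server side: allow a small queue of pending Fast Open requests.
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.IPPROTO_TCP, TCP_FASTOPEN, 16)
    srv.bind(("0.0.0.0", 8080))
    srv.listen()

    # Client side: sendto() with MSG_FASTOPEN carries data in the SYN,
    # saving a round trip on repeat connections to the same server.
    cli = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    cli.sendto(b"GET / HTTP/1.0\r\n\r\n", MSG_FASTOPEN, ("127.0.0.1", 8080))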


Cheaper SCTP isn't quite enough—one of the major problems with lots of little TCP connections is that they all have their own backpressure, so you get things like worker processes that share a remote flapping their window sizes up and down in response to one another's requests. Ideally, backpressure would occur at the IP layer (host-to-host rather than port-to-port), but SCTP gives you the ability to have a set of streams that share a backpressure "group" on both the local and remote ends.


> UDP is just about as reliable or unreliable as IP.

Ok.

> It's a shim on top of IP to let unprivileged users send IP datagrams that can be multiplexed back to the right user. (Hence the name: "user" datagram protocol.)

It's mostly a way to send data without the full 'call-setup-data-transmission-and-terminate' sequence that virtual-circuit-based protocols such as TCP require, for when you don't need all that luxury. It's a good fit for protocols that carry small amounts of data where retrying is not a problem and the loss of a packet is not an immediate disaster. For the same reason it's also more suitable for real-time applications than TCP (especially true for the first packet). And because there is no virtual circuit, a single listener can handle data from multiple senders.

The 'USER' does not refer to unprivileged users but simply to user traffic as opposed to system packets (such as ICMP and other datagram-like packets that are not usually sent out directly by applications). So it's not a privilege matter but a matter of user space vs. system modules elsewhere in the stack.

> Lots of people talk about a "TCP or UDP" design decision, but that's usually not the relevant question for an application developer.

It absolutely is.

> Most UDP applications wouldn't just blat their data directly into UDP datagrams,

They usually do exactly that.

> just like almost nobody blats their data directly into IP datagrams.

You're comparing apples with oranges, IP is one layer and TCP and UDP are on another. So you'd have to compare UDP with TCP and then you're back to that design decision again.

> Typical applications generally want some sort of transport protocol on top of the datagrams (whether UDP-in-IP or just plain IP), that cares about reliability and often about fairness.

Fairness is something that is usually not under the control of the endpoints of a conversation but something that the routers in between influence. They can decide to let a packet through or drop it (this goes for TCP as well as UDP); if a line is congested, your UDP packets will usually (rules can be set to configure this) be dropped before your TCP packets, in spite of the fact that TCP will retry any lost packets. UDP packets can also be duplicated and routed in such a way that they arrive out of order.

> That could be TCP-over-IP, but some (like MixApp, which wanted TCP but not the kernel's default implementation) use TCP-over-UDP-over-IP, BitTorrent uses LEDBAT-over-UDP, Mosh uses SSP-over-UDP, Chrome uses QUIC-over-UDP and falls back to SPDY-over-TCP and HTTP-over-TCP, etc.

Running alternative protocols packaged inside other protocols is a time honored practice. See also: TCP over carrier pigeons and tunneling HTTP over DNS traffic (effectively using UDP). This is not in any way special, it's just a means to an end.

> The idea that using UDP somewhere in the stack means "designing and implementing a reliable transport protocol from scratch" is silly. It doesn't mean that any more than using IP does.

It actually comes down to exactly that. If you use UDP as your base and your application requires reliable transmission of data then you're going to have to deal with loss/duplication/sequencing at some other point in your application or put another (pre existing) protocol on top of it in order to mitigate these.

If your application can tolerate those errors (or if they are not considered errors) then a naive implementation will do.

> The question of SOCK_STREAM vs. SOCK_DGRAM is more typically, "Does the application want to use the one transport protocol that the kernel implements, namely a particular flavor of TCP, or does it want to use some other transport protocol implemented in a userspace library?"

TCP is the default for anything requiring virtual circuits. If you have demands that are not well described by that model and/or need real-time, low-overhead behaviour, and you're willing to do the work required to deal with UDP's inherent issues (if those are a problem for you), then you're totally free to use UDP.

But the question is usually not 'do I need TCP', it usually is 'how do I avoid re-implementing TCP if I need its features'.

It's a tough choice because at a minimum it means that you're going to have to write software for both endpoints.

This is one of the reasons why we see HTTP over TCP in so many places where it wasn't originally intended: it is more or less guaranteed to be well tested and there are tons of tools available to use this protocol combination, especially browsers, fetchers and servers in all kinds of flavors. For UDP that situation is much less rosy and using UDP always translates into having to do a bunch of plumbing yourself.


The thing that bites a lot of naive protocol designers who use UDP is that it doesn't guarantee ordering either. So you can get delivery out of order, and that gets more likely the more IP subnets your packet crosses. In part because some folks do traffic shaping, in part because UDP traffic is considered "less important" by a lot of ISPs, and in part because new switches, like the latest compilers, have enough CPU in the control plane to play games on packets flying about.


> You're comparing apples with oranges, IP is one layer and TCP and UDP are on another.

That is what the books say, but I don't think it is right. When you consider that raw IPv4 doesn't work in practice because of NAT, UDP is the de facto minimum internet layer in practice.


NAT is a kludge. The existence of NAT does not remove the fact that both TCP and UDP are layered on top of it and there is plenty of stuff happening that is layered directly on IP besides UDP, for instance ICMP.


> It's mostly a way to send data in such a manner that you don't need to do the full 'call-setup-data-transmission-and-terminate' that would be required for other virtual circuit based protocols such as TCP when you don't need all that luxury. So for protocols that carry small amounts of data in a manner where retrying is not a problem and where loss of a packet is not an immediate disaster. It's also more suitable for real-time applications because of this than TCP (especially true for the first packet). Because of the fact that there is no virtual circuit a single listener can handle data from multiple senders.

It is possible that someone wants a virtual circuit but can do better than TCP for their application. I think the parent's explanation was more apt - it's a small layer on top of IP for you to implement your own protocol logic.


Right.

One usually wants to reimplement a reliable protocol (e.g. similar to TCP) on top of UDP when dealing with peer-to-peer capabilities.

Indeed, NAT traversal is easier to implement and more reliable when dealing with UDP techniques (e.g. UDP hole punching).


I'll share here the comment that I posted on the article page.

OP, you should have done latency tests as well. Whether out-of-order packets are acceptable to an application depends on the time distribution of packet delays. If, for example, 99.9% of packets arrive in order within, say, 10 ms, then that is perfectly acceptable for video streaming. Even if a packet is 1000 packets late, this just requires a bigger buffer on the receiving end. As long as it's below a certain duration, out-of-order packets aren't necessarily a problem even for live content.

The grouping of 5 packets together, and thus checking whether a packet arrives out of order by more than six places, seems arbitrary. If, for example, a packet arrives out of order 50 packets late but is actually only 1 ms late, then it's not a problem. If it comes along 1 second too late, then it is a problem.
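
One way to quantify that without needing synchronized clocks is to measure, on the receiver alone, how long after a higher-numbered packet each out-of-order packet shows up; that delta is roughly the extra jitter buffer you'd need. A hypothetical Python sketch:

    import time

    def lateness_tracker():
        # Returns a callback: feed it sequence numbers as packets arrive and it
        # reports, in seconds, how late each out-of-order packet is relative to
        # the first higher-numbered packet already seen (0.0 means in order).
        max_seq = -1
        max_seq_time = 0.0

        def on_packet(seq):
            nonlocal max_seq, max_seq_time
            now = time.monotonic()
            if seq > max_seq:
                max_seq, max_seq_time = seq, now
                return 0.0
            return now - max_seq_time

        return on_packet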


Yes, you're right. That was obvious once I got the data. I still thought what I had was worth writing about :)

As you say, that grouping is arbitrary and, if anything, more likely to include packets that shouldn't have been counted (ones arriving multiple seconds later). Though that's more to do with the fact that I sent so few packets (5-10) in each burst.


"I would tell you a joke about UDP, but you probably wouldn't get it."


"Hi, I'd like to hear a TCP joke."

"Hello, would you like to hear a TCP joke?"

"Yes, I'd like to hear a TCP joke."

"Ok, I'll tell you a TCP joke."

"Ok, I will hear a TCP joke."

"Are you ready to hear a TCP joke?"

"Yes, I am ready to hear a TCP joke."

"Ok, I am about to send the TCP joke. It will last 10 seconds, it has two characters, it does not have a setting, it ends with a punchline."

"Ok, I am ready to get your TCP joke that will last 10 seconds, has two characters, does not have an explicit setting, and ends with a punchline."

"I'm sorry, your connection has timed out. Hello, would you like to hear a TCP joke?"


"No"


The canonical collection of five hundred jokes of this genre is at http://attrition.org/misc/ee/protolol.txt:

@mckeay The sad thing about IPv6 jokes is that almost no one understands them and no one is using them yet.

@dildog What's up with the jokes... Give it a REST, guys...

@ChrisJohnRiley: The worst thing about #protolol is that you get the broadcast even if you really don't give a shit!

@mdreid: The best thing about proprietary protocol jokes is REDACTED.

@maradydd: The bad thing about Turing machine jokes is you never can tell when they're over #protolol

...


A UDP packet walks into a bar.

A UDP packet walks into a bar.


A walks UDP packet bar a into.


I think this is definitely a more accurate representation of the joke. Although maybe drop one of the words. :)


A walks UDP packet bar into bar.

?


Might make more sense as "A TCP Packet walks into a bar" , since TCP supports retransmissions


A UDP packet may arrive as two copies. That's the joke.


Duplication is a form of forward error correction.


Knock knock.

A UDP packet.

Who's there?


The problem with UDP jokes is that I don't get half of them!


I would counter with an HTTP joke but I am feeling slightly insecure.


How 'bout you tell us an HTTPs joke, then?


Well, based on these results, you probably will.


wouldn't it but sense make.


Knock knock. Who's there? Google.com. whois Google.com? 8.8.8.8 / 8.8.4.4


"conKnk k.kco"


I've seen UDP fail drastically (north of 50% packet loss) on controlled networks. Once you hit a certain level of congestion, or a network buffer fills up, things suddenly go a bit sideways. I'm not sure if switches and OSes consider UDP lower priority than TCP or if it's simply that TCP has mechanisms to recover from packet storms, but TCP continues to "function" where UDP starts hitting the floor en masse.

Conversely, I've pondered using UDP packets as a "canary in the coal mine" for networks, to monitor their health.


This is how congestion works: queues are not infinitely deep, so the only way to deal with congestion is to (at some point) start dropping packets.

TCP (that is, most TCP flow control algorithms) specifically uses packet loss as an indicator of network congestion / how much bandwidth is available, and will back off and retry dropped packets at a lower rate; hence why TCP continues to function (albeit at degraded performance). UDP has no such mechanism, hence you see firsthand the dropped packets. A well-written application should back off in such a scenario so as not to flood the network, but of course many won't.

(Of course TCP may very well be prioritized over UDP! It's just not necessary to explain your observations.)


It's often the opposite. In order to prevent the Internet from melting, TCP does an elaborate dance with all the TCP speakers to back off and avoid congesting the network. UDP does no such dance, and so can monopolize traffic at the expense of TCP.

Obviously this presumes that the UDP protocol in question has some mechanism for handling lost packets. But, for instance, if you're doing lossy video or forward error correction, end users do not deal directly with lost packets.


Also priority may be a function of the network configuration (QoS etc) not biased against UDP by default.


Very much this.

It's a bad idea to measure UDP reliability on a quiet day and then make decisions based on those results. On a different day, everything will be happening at once - everyone trying to message their Mom, everyone trying to sell stock, everyone trying to cast a spell on the big monster - and that is when the most packets will be dropped.


Switches dropping TCP packets means slower TCP rather than lost information. This is what TCP is designed to do when it loses packets, it's not related to UDP vs. TCP priority.

You can use TCP as this canary as well by monitoring a counter of how many retransmissions have been performed.


For a fuller picture:

• Timestamps are necessary: UDP loss is bursty, happening whenever any component along the path (including the endpoints) is momentarily too burdened (buffers, CPU, wires) to forward every packet.

• Try packets larger than the 'path MTU' between the two endpoints: any packet larger than that is fragmented somewhere en route, and then loss of any one fragment causes the full UDP packet to be lost.

• Try alongside other traffic on the endpoints, and note that TCP streams can't find their achievable rates (or split the available bandwidth amongst themselves) without hitting the packet-dropping pressure that also affects UDP. So perhaps try with 1, 2, or more TCP streams trying to max wire-speed, between the same machines at the same time.

Note also you can create arbitrarily-high loss rates by simply choosing to send more than the path and endpoints can handle. (Let one side send X thousand in a tight loop; on the other side only check for 1 per second for X thousand seconds. Many will be lost, depending on how much you've exceeded the available bandwidth/buffering between the send and receive.)
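
A sketch of such a probe sender in Python; the packet layout, size and pacing interval are all assumed knobs, and a matching receiver that logs (sequence number, arrival time) gives you loss, reordering and burstiness data to analyse offline.

    import socket
    import struct
    import time

    def send_probes(addr, count=1000, payload_size=1400, interval=0.01):
        # Each probe carries a sequence number and a send timestamp (12 bytes),
        # padded to payload_size. A payload larger than the path MTU (minus
        # IP/UDP headers) will exercise fragmentation as described above.
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        pad = b"\x00" * max(0, payload_size - 12)
        for seq in range(count):
            pkt = struct.pack("!Id", seq, time.time()) + pad
            sock.sendto(pkt, addr)
            time.sleep(interval)  # pacing; drop this to deliberately exceed capacity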


Avoid timestamps: computers are not perfect; their clocks tend to drift in unpredictable ways (even with NTP). I have seen quite real bugs due to clock drift (and to Linux's non-monotonic timer, which impacted Python for a while, though that problem is now solved) resulting in packet drops when I was using pyzmq, because network stacks don't like to see packets arriving from the future.

http://lists.zeromq.org/pipermail/zeromq-dev/2013-October/02...

My intuition is we are close to relativistic problems. (Can't prove it, I have to code a kikoo lol form for tomorrow).


> • try alongside other traffic on the endpoints, and note that TCP streams can't find their achievable rates (or split the available bandwidth amongst themselves) without hitting the packet-dropping pressure that also affects UDP. So perhaps try with 1, 2, or more TCP streams doing a max-speed copy at the same time.

I agree. Having (or not having) other traffic on the network will impact UDP throughput dramatically. As anyone who does VoIP QoS will soon find out.


For the best of both before SCTP is widely available, if ever, RUDP (Reliable UDP) [1], applied where needed, is what most game developers/engines use at the networking layer, e.g. RakNet [2] (recently bought and open sourced by Oculus), enet [3], and the many libraries based on those.

RUDP or similar systems let network messages be acked when needed while staying unreliable by default, which is fine for positional updates where missing data can be covered by interpolation and extrapolation. Global game state and determining the winner or ending the level might need a reliable call.

With UDP and only some reliable calls you drastically improve real-time performance with less queueing. TCP is ok for turn-based or near-real-time games, but high-action real-time games almost all use UDP with a RUDP twist. Using a mix can also be harmful [4], so the best option is RUDP where needed, defaulting to UDP.

UDP is great because it works almost like a TV/radio broadcast: you show the data you receive and smooth over the rest. TV/radio needs every data frame to prevent static/lag, but imagine the broadcaster having to error-check all connections; it would quickly saturate. UDP pushes that saturation point much further out. Games can predict movement and smooth out missing data for most game actions: you only need a few points for a bezier to be smooth, or you might have variables for speed/direction/forces that you can predict with. Pretty good reliability and lack of ordering are not really an issue; if a message is out of order (timestamp older than the last), discard it and use the next one, or predict until the next valid message arrives (too much of this leads to lag, but for normal UDP operation it is enough and actually smoother).

[1] https://tools.ietf.org/html/draft-ietf-sigtran-reliable-udp-...

[2] https://github.com/OculusVR/RakNet

[3] http://enet.bespin.org/

[4] http://www.isoc.org/INET97/proceedings/F3/F3_1.HTM


Two other reasons to use UDP for low-latency use cases, even if you cannot handle actual missed packets and thus are fully reliable:

- If your bandwidth use is small, you can just spam multiple copies of each packet to decrease the chance that a laggy retransmission will be needed. If you're sending packets at a constant high rate, you can instead include copies of the last N messages in each packet, rather than just the new data (see the sketch after this list).

- You can send packets directly between two NATted hosts using STUN, rather than having to rely on UPnP or manual port forwarding. Pretty obvious, but I only see one other mention of this fact in the thread, and it vastly increases the likelihood of being able to make a direct connection between two random consumer devices.
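
To make the first point concrete, here is a hypothetical Python sketch of piggybacking the last N messages on every datagram (JSON and the class name are just for illustration; a real protocol would use a compact binary encoding):

    import json
    import socket

    class RedundantSender:
        # Include the last n messages in every datagram so a single drop rarely
        # loses data; this trades extra bandwidth for lower effective latency.
        def __init__(self, addr, n=3):
            self.addr = addr
            self.n = n
            self.recent = []   # list of (seq, message) pairs, newest last
            self.seq = 0
            self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

        def send(self, message):
            # message is assumed to be JSON-serializable for this sketch.
            self.recent.append((self.seq, message))
            self.recent = self.recent[-self.n:]
            self.seq += 1
            self.sock.sendto(json.dumps(self.recent).encode(), self.addr)
            # The receiver keeps the highest seq it has applied and ignores
            # older entries, so duplicates and single drops are both harmless.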


RUDP isn't really the same thing as RakNet; they are two distinct protocols.

Most game networking libraries will let the developer choose, when sending a datagram, whether it's reliable, ordered, both or neither. The reason is that you rarely actually need both, but in the rare cases you do (e.g. if the user issues a "fire" command) you can upgrade datagrams to that state.

RUDP is simply a message-based connection that is both reliable and ordered. You might be able to get away with using it for games, but why would you? RakNet is open source and is mostly the industry standard.

Also don't forget the one final advantage of UDP is that STUN/TURN work great. NAT punching with TCP is walking on thin ice.


This is interesting, but in my experience, you can't use this data to usefully extrapolate anything. Slightly different hardware, topology, OSes, drivers, system load, network "weather conditions", etc, can radically change the results.

My biggest UDP "surprise" was on a system using UDP but treating it as perfect, because the system designers knew that the hardware bus they were using was "100% reliable". And they were right--the hardware wasn't dropping anything. Too bad that the OSes on either end would discard UDP packets when they faced buffer pressure.


Also keep in mind this note: http://technet.microsoft.com/en-us/library/cc940021.aspx

Basically, if you send() two or more UDP datagrams in quick succession and the OS has to resolve the destination with ARP, all but the first packet are dropped until the ARP reply arrives. (This behavior isn't entirely unique to Windows, btw.)


> Too bad that the OSes on either end would discard UDP packets when they faced buffer pressure.

Which unfortunately is all they can do, in the limit. Otherwise, if applications (or the OS itself) can't keep up with the incoming packet/data rate, buffers would grow without bound. Not good for a production system.


This is really testing "how unreliable are the network routers, cables and network adapters (and drivers!) that I'm pushing traffic through".

How unreliable is UDP as a protocol? Unreliable. This is really a binary state, not a percentage-measurable value.

If you need reliability (in ordering or delivery) you need to layer on top of it and unless your network usage has very specific constraints (eg. low latency is far more important than strict ordering, as with most fast action games) you should almost certainly just use TCP rather than end up badly reimplementing it over UDP.


> low latency is far more important than strict ordering, as with most fast action games

On the other hand, in that situation TCP is the worst choice you can make, as, especially on bad lines - like, say, Whistler in the Canadian mountains - it's easy to get TCP into a state where it builds up massive latency because it insists on trying to push every single packet through.


Agreed. TCP stands for "Transmission Control Protocol", which started out as file transfer. These days it is exquisitely unsuitable for most things it gets used for, even fetching web pages. The delays, retries and congestion controls are set arbitrarily and rarely adjusted. In this modern world of wireless roaming and streaming media, TCP has little or nothing to offer, except that it's there.


I've long wished for a reliable protocol that was negotiated on a per-link basis. I mean: we have the processing power now. (So, effectively, packets aren't removed from the router's buffer until it knows the next link has the packet in buffer. Lots of implementation details to be gone over, though.)

It seems a mite... silly... to resend a packet from the other side of the world just because your ISP couldn't shove the packet down your internet connection fast enough.


When packets are being sent too fast the sender needs to be slowed down. Otherwise the buffers would just fill up. And in traditional TCP, the only thing that tells the sender this is dropped packets.

The smart solution you're looking for is TCP ECN (Explicit Congestion Notification) - a way for the routers to say "I'm buffering it for now, but you'd better slow down". If you're running Linux it's a kernel setting you can enable (it's disabled by default as some routers mishandle ECN-marked packets).
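
On Linux that setting is the net.ipv4.tcp_ecn sysctl; a tiny, hypothetical Python helper for flipping it (root required):

    def set_tcp_ecn(mode: int) -> None:
        # 0 = off, 1 = also request ECN on outgoing connections,
        # 2 = only honour ECN when the peer requests it.
        with open("/proc/sys/net/ipv4/tcp_ecn", "w") as f:
            f.write(str(mode))

    set_tcp_ecn(1)  # enable ECN negotiation for outgoing connections too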


Sorry, should have specified. One of the things this suggests is explicit cache management for individual links. I.e. "I have space for <x> packets before my next ack".

Don't treat links as end-to-end. Treat them as a bucket chain - each link in the chain negotiates with its immediate neighbors only. Currently we do a game of "toss the bucket at the next guy and hope he catches it".


That would cause abysmal performance because of the head of line blocking of this approach: If one of the outbound links becomes congested, sooner or later, all of the memory would be occupied by packets destined for that congested link, as all of the packets destined for other links would be transmitted, making room for some more inbound packets, of which again all the packets destined for the uncongested links would be transmitted right away, leaving in the buffer the ones destined for the congested link, ... repeat until the buffer is filled with packets destined for the congested link, thus preventing the flow of any more packets to the uncongested link that might be queued somewhere in the network.

The only solution to avoid that problem would be to have every router keep state for each connection/flow it sees, and managing separate buffers for each flow - but that would mean keeping state for hundreds of millions of flows on backbone routers, all of which essentially would need to be in level 1 cache in order to be able to move packets at line speed, and even that probably would be too slow - there is a reason why those routers use CAMs to be able to keep up even just doing routing table lookups in only about 500000 routes.


Carnegie Mellon has a blank-slate project to manage data center traffic similarly to this. I understand it involves bandwidth/buffer credits passed around between endpoints and the subnet. The idea is that buffer backup at the source is inevitable if subnet bandwidth is insufficient; you have to accept that. What you CAN do is avoid pointlessly sending packets to where they can't be kept, because that wastes bandwidth; even worse, the resulting signaling and retries use even more.


It seems like the routes would have to be a lot more static for that to work, negating the big advantage of the internet over traditional circuit switching. Right now each end-to-end link negotiates its own window size and can accept that many packets before acking, and it doesn't matter whether half of those packets go by one route and half of them by another, they just have to all arrive at the end.


Not particularly. You can still do all the fancy and not-so-fancy tricks regarding packet routing. As long as each router knows a "closer" router to the destination, you're fine. This is identical to the current setup in that regard.

As a matter of fact, it would probably be easier to make dynamic. (Router A gets a packet for router Z - router A wants to send it to router B, but router B is currently congested, and router A knows that router C is an alternate route, so it sends it to router C instead.)

Now, there are circumstances where this approach is not particularly valid. In particular, on wireless networks. However, TCP over wireless networks isn't exactly great either. (TCP and this approach both make the same assumption: namely that most packet loss is congestion as opposed to actual packet loss.) This approach is for the segment of the network that's wired routers with little to no packet loss disregarding packets being dropped due to no cache space. I.e. this approach is for the reliable segment of modern networks - wireless should probably have an entirely different underlying protocol.


Router A knows that router B is congested - but this is actually due to congestion in the link between router K and router L. How does it know which of router C or D would be using the same link? It has to have a global understanding of all the routing paths, no?

Routing the packet to Z and telling you that the path to Z is congested are mirror images of each other; it makes sense to use the same mechanism for both.


Wrong. TCP has a lot of behaviour you can't work around. UDP is what you write it to be. When experiencing packet loss not due to bandwidth, TCP will delay recovery compared to what UDP can achieve.

For games etc., you need to be able to hide 1-2 seconds of lag with TCP.


I thought it was pretty clear in the post you're responding to that there are situations in which UDP is a much better choice; I even called out games specifically. The question is, do you need reliability (as defined by guaranteed delivery and sequencing) or not. If you do, UDP is not enough even if it seems "mostly reliable" in some tests. "Mostly reliable" is not useful if you need "actually reliable".

If you need the full reliability TCP offers, UDP isn't the answer unless you layer your own protocol on top of it, and while it is possible to layer your own protocol on top of it that will beat TCP for specific use cases, most people (who aren't expert network programmers and don't even know about [let alone understand] the minefields of issues you can run into such as dealing with NATing, etc) are much better off just using TCP, warts and all.


The writer is testing over links between data centers. Those tend to have good bandwidth and no "deep packet" examination and munging. Try testing over consumer-grade links, and see what DOCSIS and Comcast are doing to your packets.


So UDP is great. I love the simplicity of it, and the fact that it adds very little overhead while being sufficiently low level means you can do all the fun networking things yourself.

What would be great is a "third half" of this equation: a popular protocol a la SCTP. Imagine not having to re-invent message size headers for every damn protocol: the size of your packet is embedded in the datagram just like UDP, but the transmission is reliable just like TCP. HTTP would suddenly get so much better!


CoAP, the Constrained Application Protocol for IoT/M2M being defined by the IETF, sits on top of UDP. One of the reasons is UDP's simplicity, which makes it easy to implement on microcontrollers. I guess there should be more love for UDP coming.


There are currently several proposed RFCs for adding TCP to CoAP. The next IETF meeting should (hopefully) standardize one. One is a pretty naive implementation that adds TCP semantics almost as a wrapper; another removes the application-level ACKs because of the transport-level ACK that TCP already provides, plus several other "well, TCP takes care of this for us" sorts of changes. ARM's Sensinode library currently implements the first RFC I mentioned, so it's being used in the wild.

Once that standardization happens, the OMA LWM2M spec is going to need some revisions, because it's very much written with "CoAP over UDP" in mind. This will ultimately be very good, because it will allow the possibility of other protocols like MQTT, HTTP, etc.


There are several alternative transports being proposed (including the 'unusual' CoAP over websockets), but I wonder if they can be implemented within the original constraints of 8-bit micros as servers. Maybe it's possible, but I'm not sure there is a need, hence the new TCP proposals.


WebRTC actually uses SCTP, but the problem is that SCTP gets filtered out a lot by firewalls and NAT that don't know how to handle it. The solution is to stuff SCTP inside UDP, which works OK because UDP is pretty minimal itself.


Sadly, not just SCTP. Anything not-tcp, not-udp, not-icmp (are there others in common use?) tends to get nuked by firewalls. Great fun if you try to run IPSEC with GRE for example. Really sad to have a protocol number in the IP header, only to have firewalls dropping anything but a few whitelisted ones... :-/

On another note... does anyone know if RDP[1] has seen any use lately? I'm guessing it's pretty redundant vis-a-vis TCP in practice?

[1] http://tools.ietf.org/html/rfc908


Did not know that WebRTC used it, but I did know about the NAT and firewall problem. That's the part I find unfortunate: a reliable general-purpose datagram-based protocol is a shoo-in for so many application protocols, yet vendors tend to do bad things like this for no really good reason.


Don't forget that SCTP is also multihomed, providing the extra redundancy (though not extra performance) of MPTCP as well.


The question may have been better phrased as "How unreliable is the internet?". TCP and UDP packets will be dropped at just about the same rate across any network; the only difference is that TCP keeps track and re-sends any packets that did not arrive.

Each protocol has its own place (and uses)...


Almost. The Internet does that. But home routers are notoriously buggy, and UDP often gets bad treatment there. E.g. your UDP stream may have large gaps when one of the kids upstairs is uploading to Facebook - their TCP traffic gets priority over UDP, which is often just dropped. This is because you can generate traffic much faster than the router can dump it to your ISP, and the little buffers fill up.

Also there was an AT&T home router that dropped every other UDP packet! Really! It's a matter of: if you don't test it, it doesn't work. And home appliances get tested on delivering web pages etc., not on UDP streaming.


If you want to know how reliable UDP is, ask a VoIP engineer. The audio component of VoIP calls is called RTP (also referred to as media). Media is passed using UDP, because there's no point in retransmitting voice packets. The time sensitivity of voice communication makes it a frivolous act. By the time the retransmitted packets arrive, the audio stream has already been reassembled and played back to the end user with defects.

I used to run a pretty sizable VoIP network over a DS0-based Carrier Ethernet network. When you have tight control over the network, UDP is extremely reliable. We would go long periods of time with 0% packet loss (as reported by AdTran VQM). When we did see loss, it was normally because of a problem we were already aware of. That's the beauty of DS0. When a DS0-based circuit drops frames, alarm bells sound.


The "unreliable" in UDP doesn't mean "expect this to fail", it just means "things are not guaranteed to arrive, and if they do they might not arrive in the right order".

So "make sure you can handle (where "handle" could simply be "safely ignore") failed/unordered transmission" rather than "expect failed/unordered transmission".

If you can't cope with or ignore missed/unordered reception, wrap your comms in a stream based protocol (TCP, one of the other existing ones, or some contraption of your own) that manages reordering and retransmission as needed.

Maybe there is a better word to cover that than "unreliable", which sounds quite definite, but I can't think of one off the top of my head.


Arguably "expect this to fail X% of the time" is a more robust approach - because in-order delivery or zero-loss situations are then simply best cases that "just work".

But you can not make work what you can't test.

So anyone embarking on making their own transport protocol should test how their app works with various amounts of packet loss (1%, 5% and 20% are three reasonable marks to try, all of which can happen in the real world - especially over wireless links), as well as with jitter and variable delay.

To do such testing, it's handy to use a Linux module called "netem", which lets you simulate delay, jitter and loss:

   http://www.linuxfoundation.org/collaborate/workgroups/networking/netem
   https://calomel.org/network_loss_emulation.html
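
For example, a small Python wrapper around the tc/netem commands (the interface name and numbers are placeholders; root required):

    import subprocess

    def apply_netem(dev="eth0", delay_ms=100, jitter_ms=20, loss_pct=5):
        # Add (or update) a netem qdisc that delays, jitters and drops packets.
        subprocess.run(
            ["tc", "qdisc", "replace", "dev", dev, "root", "netem",
             "delay", f"{delay_ms}ms", f"{jitter_ms}ms",
             "loss", f"{loss_pct}%"],
            check=True)

    def clear_netem(dev="eth0"):
        # Remove the shaping again once testing is done.
        subprocess.run(["tc", "qdisc", "del", "dev", dev, "root"], check=True)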


You also need to vary the pattern of packet loss as well as the overall frequency: you could lose 20% of your packets over a short period in a roughly even distribution (perhaps due to "constant" background noise), or you could lose a sequence of packets amounting to that much (due to a sudden burst of interference on a particular hop of a route).


At a customer's request we implemented a UDP-based application protocol on a wireless network that had lots of low-coverage areas. We thought they were crazy, but it turned out that the overhead of keeping a TCP connection going in an area of poor coverage can be overwhelming. The app retried a few times when responses didn't arrive (and of course 802.11 retries on its own), but in the end the user was the real retry mechanism. To our surprise (not the customer's), everyone was happy in the end.

Packet ordering wasn't an issue in our app, though, so I'm not sure what my point is other than sometimes UDP is reliable enough.


The reliability of UDP is mainly going to depend on the end nodes you're trying to connect. The backbone of the internet is very reliable and if you're connecting over ethernet or through some other wired connection you'll see very few dropped packets and thus UDP will be very reliable.

If you're connecting over wireless, expect to see more dropped packets, and if you're connecting over Bluetooth expect to see even more (because of its low power, Bluetooth is more likely to drop packets than either a wifi or a wired connection).


Wireless (802.11) has built-in retransmission on a per-link basis, so in practice the drop rate isn't as big as you'd expect - the main problem is packet jitter, not loss.


Ah interesting, I wasn't aware of that. I think the Bluetooth part is still valid however, as is my point about the loss rate being determined largely by how the end hosts are connected to the network core.


UDP, the protocol, is entirely unreliable. The article is testing the reliability of the internet to deliver datagrams.


Have two machines (A & B) send UDP packets to a third machine (C). Give them all the same networking hardware, e.g. 1 Gb/sec NICs. Have A & B send at full speed, so a total of 2 Gb headed to C every second, which can only receive 1 Gb. You'll get a lot of dropped packets.


It's interesting that a lot of the arguments against UDP are because networks deprioritize the packets or that routers don't handle them. If you know a message is going to be lost if you don't handle it, perhaps it should receive better or at least equal priority. Since TCP is chattier, anything that encourages people to use a lighter weight protocol would then benefit everyone, since there would be more advantages to using the less resource-intensive option. It's an interesting conundrum.


This is relevant for server-to-server communication over a fiber or copper link.

I'm not sure it's that reliable for an end user on a wireless network.

Still, it's good to know.


Nice post. iperf is a pretty good tool for playing around with this stuff. In my limited testing with EC2 instances, packet size and send rate both have very large influences on packet loss. Usually loss will be stable up until some threshold and then it will go nuts with 40-70% dropped, probably from some buffers somewhere filling up. I'd be curious to see more tests with those as variables.


I thought UDP is to TCP as C++ is to Java with regards to memory management. Meaning, TCP will do whatever it can to ensure packet integrity, while UDP pretty much leaves packet integrity up to you.

Is that correct? If so, then regardless of UDP actually appearing somewhat reliable, the spec makes no guarantees that it will be consistently reliable.


There's a triplet that loosely describes protocols: { Reliable, Stream, Connection} Each can be true or false. So TCP is { True, True/False, True } and UDP is { False, False, False }. (TCP can be stream or datagram)

There are (were) other protocols possible but these are about all that's supported/tested these days.

Also there are fragmentation/reassembly issues with UDP. If you send a 30 KB UDP datagram, it gets sent as many Ethernet-sized chunks. The receiver is supposed to put them back together. If one is lost, the entire thing is lost. And to boot, Linux didn't do UDP reassembly at all until recently (last year?)


Linux has always supported UDP reassembly. Quoting a manpage: "By default, Linux UDP does path MTU (Maximum Transmission Unit) discovery. This means the kernel will keep track of the MTU to a specific target IP address and return EMSGSIZE when a UDP packet write exceeds it. When this happens, the application should decrease the packet size. Path MTU discovery can be also turned off using the IP_MTU_DISCOVER socket option or the /proc/sys/net/ipv4/ip_no_pmtu_disc file; see ip(7) for details. When turned off, UDP will fragment outgoing UDP packets that exceed the interface MTU. However, disabling it is not recommended for performance and reliability reasons."

And they ain't kidding, because there are plenty of layer 3 devices out there that play fast and loose with fragments of UDP packets; it used to be that you could assume they'd drop anything that didn't have a full UDP header, so by default UDP on Linux doesn't encourage fragmentation.

It could be that distributions have been bundling sysctl.conf files with ip_no_pmtu_disc set to true, being that most modern layer 3 devices no longer mistreat UDP so badly; this may be what you're experiencing in the last year.
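
For reference, turning path MTU discovery off per-socket (so the kernel fragments oversized datagrams instead of returning EMSGSIZE) looks roughly like the Python sketch below; the numeric fallbacks are the Linux values and are assumptions in case your build doesn't export the constants.

    import socket

    IP_MTU_DISCOVER = getattr(socket, "IP_MTU_DISCOVER", 10)   # Linux value
    IP_PMTUDISC_DONT = getattr(socket, "IP_PMTUDISC_DONT", 0)  # never set DF, allow fragmentation

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER, IP_PMTUDISC_DONT)
    # An oversized datagram is now fragmented by the kernel rather than
    # rejected with EMSGSIZE -- with all the reassembly caveats above.
    sock.sendto(b"\x00" * 30000, ("192.0.2.1", 9999))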


It was on Hacker News; UDP reassembly had never worked in Linux. Maybe I'm hallucinating that? { edit } It might have had something to do with UDP fragment reordering in Linux?


Maybe what it was (quoting from memory) is that the timeout for re-ordering/assembly started at 30 seconds (not 120 per the RFC) and only increased up to a limit of 180 when the kernel thought it wasn't something bursty like DNS; some kind of heuristic probably introduced to make NFS pre v4 work better. Maybe the queues were too small when MTUs could be 1500 bytes or less?


Here's something I found: https://lists.openswan.org/pipermail/users/2005-March/004037... But that's very old


> (TCP can be stream or datagram)

Not at all! There's no way to ask for delivery of a datagram of unknown size (which is pretty much the definition of a datagram-oriented protocol), as in UDP, RDP, SCTP, and other datagram-oriented protocols.

(No, not even the PSH bit guarantees this!)

This has implications when implementing application-layer protocols. In stream-oriented protocols like HTTP, which intersperse headers with payload, it is impossible to read the header from the OS's socket without also (possibly) reading part of the payload (unless you're reading a single octet at a time). Your application is thus forced to implement buffering on top of what the OS provides (if the OS does not itself provide a general pushback mechanism).

With an (ordered) datagram-oriented protocol, this is not a problem, as you can ask the OS to return a single datagram corresponding exactly to the header, and process it before retrieving any of the payload.
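
A Python sketch of the difference, assuming a 4-byte length prefix as the framing on the TCP side (the helper names are made up):

    import socket
    import struct

    # Over UDP, one recvfrom() returns exactly one datagram:
    # message boundaries come for free.
    def recv_datagram(sock: socket.socket) -> bytes:
        data, _addr = sock.recvfrom(65535)
        return data

    # Over TCP you must impose your own framing and loop,
    # because recv() returns an arbitrary slice of the byte stream.
    def recv_framed(sock: socket.socket) -> bytes:
        def read_exact(n):
            buf = b""
            while len(buf) < n:
                chunk = sock.recv(n - len(buf))
                if not chunk:
                    raise ConnectionError("peer closed mid-message")
                buf += chunk
            return buf
        (length,) = struct.unpack("!I", read_exact(4))
        return read_exact(length)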

> And to boot, Linux didn't do UDP reassembly at all until recently (last year?)

Wow, that surprises me. Good to know.


"Packet integrity" usually means something different than how you're using it: generally, it refers to whether the data in the packet arrives unchanged. By this definition, UDP has identical packet integrity as TCP (they use the same checksum, albeit optionally in the case of UDP), assuming the packet gets there at all.

What UDP doesn't guarantee is ordering and arrival of packets. Usually "channel reliability/integrity" or simply "reliability" is used to refer to this.


One thing to consider - how quickly do you need to know whether the node on the other end has gone off to see the wizard?

With TCP, it might be a while - you may not get notification that he's gone until he comes back.

So you end up with keepalives, and... might as well have used UDP if your media can stand it.


Very interesting analysis. By virtue of the crazy business we've decided to be in (creating a mobile-optimized, reliable protocol over UDP), we've had some experience with this... over mobile networks. I can confirm that by sending packets at a slow and steady rate, with proper packet pacing (this is all being done by our user-level protocol stack using UDP as the underlying datagram transport), we commonly see 0% packet loss.

Of course, some of the actual packet loss at the media level is often masked by the media protocol itself. That's why you'll often see cases where a packet meanders its way to the destination after 4-5x the normally observed ping latency. This sort of thing plays havoc with TCP congestion control algorithms (because by then the TCP send side has already decided that the packet is lost, and that the loss can only be due to congestion... so it backs the heck off). A lot of our win comes from doing these things more in tune with how mobile networks actually behave.
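
Packet pacing itself is simple to sketch; this hypothetical Python sender spaces datagrams evenly instead of bursting them, with the rate an application-chosen assumption:

    import socket
    import time

    def paced_send(sock: socket.socket, addr, packets, rate_pps=50):
        # Send datagrams at a steady rate rather than in bursts; bursts are
        # what tend to overflow shallow buffers on mobile paths.
        interval = 1.0 / rate_pps
        next_send = time.monotonic()
        for pkt in packets:
            now = time.monotonic()
            if now < next_send:
                time.sleep(next_send - now)
            sock.sendto(pkt, addr)
            next_send += interval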


I've been quite happy with UDP lately. I set up a prototype log analysis system and just to get testing quickly I just sent the CSV log data as is over UDP between servers. It works so well that I still haven't bothered to swap it out with something else.


Instead of calling UDP an "unreliable protocol", we called it "best effort."


I wonder which vendors' routers are re-ordering packets. I know this was a big no-no in the core routers I worked on because of the bad effect it has on TCP. It was one of the things that customers checked in their product evaluations.


The whole point of TCP's design is to make delivery reliable and in-order on stressed/lossy networks.

These tests reveal exactly what you'd expect, as they are run between servers under prime networking conditions.


The title should read: How Unreliable is the network?

The answer is: Very.


Doesn't the "U" in UDP stand for unreliable? Joking!


"How unreliable is UDP? Let's test with these 5 network paths that I have no control over, and have no idea what their design constraints are...that should be totally authoritive, right?"

Pretty bad test conditions on this, tbh.


Finally, an informative discussion on HN..



