I have done a lot of work with IoT-ish data, e.g. sending location and telemetry data from vehicles and remote sensors over unreliable cellular networks.
I need the ability to queue data on the device to deal with patchy connectivity and some policy for deleting messages in the queue. For example, I might throw away unsent location updates that are "stale" to save space. I might need to prioritize some messages, e.g. "lithium-ion battery pack overheating".
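A rough sketch of that kind of queue policy, in Python for readability (a device too small for Linux would more likely do this in C). The class and field names, the per-message `max_age`, and the "lower number is more urgent" convention are all assumptions made for illustration, not any particular library's API.

```python
import heapq
import itertools
from dataclasses import dataclass, field
from typing import Optional


@dataclass(order=True)
class Queued:
    priority: int                                    # lower number = sent first
    seq: int                                         # keeps FIFO order within a priority
    created: float = field(compare=False)            # time the message was queued
    max_age: Optional[float] = field(compare=False)  # None = never goes stale
    payload: bytes = field(compare=False)


class OutboundQueue:
    """Hypothetical on-device queue: priority ordering, stale expiry, size cap."""

    def __init__(self, max_items=100):
        self._heap = []
        self._seq = itertools.count()
        self._max_items = max_items

    def put(self, payload, now, priority=10, max_age=None):
        heapq.heappush(
            self._heap, Queued(priority, next(self._seq), now, max_age, payload)
        )
        if len(self._heap) > self._max_items:
            # Over capacity: drop the least urgent message, oldest first.
            victim = max(self._heap, key=lambda m: (m.priority, -m.seq))
            self._heap.remove(victim)
            heapq.heapify(self._heap)

    def drop_stale(self, now):
        # Discard anything that has outlived its max_age.
        self._heap = [
            m for m in self._heap
            if m.max_age is None or now - m.created <= m.max_age
        ]
        heapq.heapify(self._heap)

    def pop(self, now):
        """Return the most urgent non-stale payload, or None if the queue is empty."""
        self.drop_stale(now)
        return heapq.heappop(self._heap).payload if self._heap else None
```

Location updates would go in with a short `max_age` and a default priority, while something like an overheating alarm would go in with `priority=0` and no `max_age` so that it is sent first and never expires.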
I may be running on embedded hardware that is too small to run Linux. The connection to the modem might be serial.
Data size and bandwidth usage can make a difference. I might get 2MB/month for $3. Bytes count if I want to send frequent updates to get more precision on the location.
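To make the trade-off concrete, here is a back-of-the-envelope calculation. It assumes decimal megabytes, a 30-day month, and that the per-update size already includes protocol overhead; the 50-byte and 200-byte figures are made-up examples, not measurements.

```python
BUDGET_BYTES = 2 * 1000 * 1000        # 2 MB/month, assuming decimal megabytes
SECONDS_PER_MONTH = 30 * 24 * 3600    # assuming a 30-day month


def min_interval_seconds(bytes_per_update):
    """Shortest average gap between updates that still fits the monthly budget."""
    updates_per_month = BUDGET_BYTES // bytes_per_update
    return SECONDS_PER_MONTH / updates_per_month


# Hypothetical sizes: a compact binary update vs. a verbose text one.
for size in (50, 200):
    print(size, "bytes per update ->", min_interval_seconds(size), "seconds apart")
```

Under these assumptions, a 50-byte update can go out roughly once a minute, while a 200-byte update can only go out about every four and a half minutes.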
I generally know exactly what messages I am sending, so a compact, schema-based binary encoding such as Protocol Buffers (the format gRPC uses) or a lightweight protocol like CoAP (https://en.wikipedia.org/wiki/Constrained_Application_Protoc...) can be better than JSON.
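As an illustration of the size difference, here is a minimal sketch that packs a location update into a fixed binary layout with Python's standard `struct` module and compares it to the same data as JSON. The field layout (timestamp, scaled latitude and longitude, speed, flags) is a hypothetical example, not a format from any real protocol.

```python
import json
import struct

# Hypothetical layout, little-endian with no padding:
#   uint32 Unix timestamp, int32 lat * 1e7, int32 lon * 1e7,
#   uint16 speed in cm/s, uint8 flags
LOCATION = struct.Struct("<IiiHB")


def encode_location(ts, lat, lon, speed_mps, flags=0):
    # Scale floats to integers so every field has a fixed width.
    return LOCATION.pack(
        ts, round(lat * 1e7), round(lon * 1e7), round(speed_mps * 100), flags
    )


def decode_location(data):
    ts, lat, lon, speed, flags = LOCATION.unpack(data)
    return {
        "ts": ts,
        "lat": lat / 1e7,
        "lon": lon / 1e7,
        "speed": speed / 100,
        "flags": flags,
    }


packed = encode_location(1700000000, 37.7749, -122.4194, 13.5)
as_json = json.dumps(decode_location(packed)).encode()
print(len(packed), "bytes packed vs", len(as_json), "bytes as JSON")
```

The packed form is always 15 bytes because both ends agree on the layout ahead of time, so no field names or delimiters travel over the air.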
I may need to get through multiple layers of network address translation, making it difficult for the server to send messages back. So keeping a persistent TCP connection can help. But using TCP for one-off connections wastes bytes, so UDP can be useful if I don't care about losing packets.
I may want to encrypt or digitally sign my messages.
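For the authentication part, one cheap option is to append a keyed MAC to each message. The sketch below uses HMAC-SHA256 from Python's standard library. Note that this is a shared-key MAC, not a true public-key digital signature, and it does not encrypt anything; the function names and framing (message followed by tag) are assumptions for illustration.

```python
import hashlib
import hmac

TAG_LEN = 32  # length of a full SHA-256 HMAC tag in bytes


def seal(key, message):
    """Append an HMAC-SHA256 tag so the server can detect tampering."""
    return message + hmac.new(key, message, hashlib.sha256).digest()


def open_sealed(key, frame):
    """Return the message if the tag verifies, otherwise None."""
    if len(frame) < TAG_LEN:
        return None
    message, tag = frame[:-TAG_LEN], frame[-TAG_LEN:]
    expected = hmac.new(key, message, hashlib.sha256).digest()
    # compare_digest avoids leaking where the mismatch occurred via timing.
    return message if hmac.compare_digest(tag, expected) else None
```

The tag adds 32 bytes per message, which matters on a tight data plan, so in practice this cost has to be weighed against how often messages are sent.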
So, I end up doing a lot of work to handle queueing locally, but using a pretty simple message-oriented binary protocol to send to the server. The server can do whatever it wants, e.g. write it to a Kafka queue.
Amazon's IoT framework nails a lot of these points. https://aws.amazon.com/blogs/compute/building-an-aws-iot-cor...