We were using protobufs at Spotify and ditched them for simple JSON calls on the client side. No one complained, and I'm never going back to having anything like that on the client side if I can help it.
Just too many drawbacks.
For server-to-server, they might be fine, but for clients just stick with JSON (which, when compressed, is pretty efficient).
One could combine JSON and a serialization-less library: your JSON would be blown up with whitespace, but reads and updates could be O(1), serialization would be a memcpy, and you could probably canonicalize the JSON during the memcpy using Lemire's SIMD techniques.
I did this once for reading JSON on the fast path: the sending system laid out the arrays in a periodic pattern in memory that enabled parseless retrieval of individual values.
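As a rough sketch of that kind of layout (my own illustration, not the actual system; the element width and helper names are invented), the sender pads every array element to a fixed width so a reader can slice out element i at a computed offset without parsing anything before it:

    // Sketch of a "periodic" JSON array layout: each element is padded to the
    // same width, so element i can be sliced out at a computed offset without
    // parsing anything before it. Widths and names here are invented.
    const enc = new TextEncoder();
    const dec = new TextDecoder();

    const ELEM_WIDTH = 6; // every element rendered as exactly 6 characters

    function encodeSamples(samples: number[]): Uint8Array {
      const body = samples.map((s) => String(s).padStart(ELEM_WIDTH, " ")).join(",");
      return enc.encode(`[${body}]`);
    }

    // Parseless retrieval: element i starts at byte 1 + i * (ELEM_WIDTH + 1).
    function readSample(doc: Uint8Array, i: number): number {
      const start = 1 + i * (ELEM_WIDTH + 1);
      return Number(dec.decode(doc.subarray(start, start + ELEM_WIDTH)));
    }

    const doc = encodeSamples([3, 1400, -27, 90210]);
    console.log(readSample(doc, 3)); // 90210, without touching elements 0..2; still valid JSON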
That's an intriguing idea but limits you to strings for your internal representation. Every time you wanted to pull a number out of it you'd be reparsing it.
Also I assume you'd have to have some sort of binary portion bundled with it to hold the field offsets, no?
It sounds like the approach is to set aside e.g. 11 bytes for an i32 field and write or read all 11 of them on each access, and to do similar things for strings, such that their lengths must be bounded up-front. It's interesting, but it requires a bit of work from the end user, and to be polite one may want to remove all the extra spaces before sending the data over the wire to a program that isn't using these techniques.
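A minimal sketch of that padded-field idea, under my own assumptions (the 11-byte width is just the widest decimal i32; the layout, offsets, and helper names are hypothetical):

    // Sketch: an i32 field padded to 11 bytes (enough for "-2147483648"),
    // at a fixed offset agreed on out of band. Reads and writes touch only
    // that range; the document stays valid JSON for ordinary parsers.
    const enc = new TextEncoder();
    const dec = new TextDecoder();
    const I32_WIDTH = 11;
    const ID_OFFSET = '{"id":'.length; // constant once the layout is fixed

    function buildDoc(id: number, name: string): Uint8Array {
      const idText = String(id).padStart(I32_WIDTH, " ");
      return enc.encode(`{"id":${idText},"name":${JSON.stringify(name)}}`);
    }

    // O(1) read: decode just the reserved bytes.
    function readId(doc: Uint8Array): number {
      return Number(dec.decode(doc.subarray(ID_OFFSET, ID_OFFSET + I32_WIDTH)));
    }

    // O(1) update: overwrite the reserved bytes in place, no re-serialization.
    function writeId(doc: Uint8Array, id: number): void {
      doc.set(enc.encode(String(id).padStart(I32_WIDTH, " ")), ID_OFFSET);
    }

    const doc = buildDoc(42, "example");
    writeId(doc, 1234567);
    console.log(dec.decode(doc)); // {"id":    1234567,"name":"example"} (still parseable JSON)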
I think I'd take a different approach and send along an "offset map" index blob which maps field IDs (statically known in advance, based on a schema that both client and server would need to agree on) to byte offsets and lengths into a standard JSON file.
Then you have readable JSON, but also a fast, O(1) way to access the fields in a zero-copy environment.
Done right, the blob could even fit in an HTTP response header, so standard clients could use the message as-is while 'smart' clients could use the index map for optimized access.
But as I said, it would suffer from numeric values being text-encoded, and actually 'compiling' the blob would be an extra step. It wouldn't have all the benefits of flatbuffers or capnproto, but it could be an interesting compromise.
I'd be surprised if this isn't already being done somewhere.
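For what it's worth, here's a minimal sketch of the offset-map idea as I understand it (the field IDs, offsets, and helper names are all invented; both ends would have to agree on the IDs via a shared schema):

    // The JSON body stays plain and readable; a small side blob (small enough
    // for a response header) maps schema-assigned field IDs to byte offsets
    // and lengths of the values inside it.
    const enc = new TextEncoder();
    const dec = new TextDecoder();

    type OffsetMap = Record<number, [offset: number, length: number]>;

    // Server side: build the JSON and record where each value landed.
    // (ASCII-only here, so character offsets equal byte offsets.)
    function encodeWithIndex(payload: { userId: number; name: string }) {
      const prefix = '{"userId":';
      const idText = String(payload.userId);
      const middle = ',"name":';
      const nameText = JSON.stringify(payload.name);
      const body = enc.encode(prefix + idText + middle + nameText + "}");
      const map: OffsetMap = {
        1: [prefix.length, idText.length],                                    // field 1 = userId
        2: [prefix.length + idText.length + middle.length, nameText.length], // field 2 = name
      };
      return { body, indexHeader: JSON.stringify(map) };
    }

    // "Smart" client: O(1) access to one field by ID, straight out of the buffer.
    function readField(body: Uint8Array, indexHeader: string, fieldId: number): string {
      const [off, len] = (JSON.parse(indexHeader) as OffsetMap)[fieldId];
      return dec.decode(body.subarray(off, off + len));
    }

    const { body, indexHeader } = encodeWithIndex({ userId: 7, name: "Ada" });
    console.log(readField(body, indexHeader, 1));   // "7" (numbers are still text-encoded)
    console.log(JSON.parse(dec.decode(body)));      // plain clients just parse the body as usual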
When the library decoding the data is failing with weird errors, and you open the devtools in the browser and the data being transmitted is all binary, you have a very hard time debugging things.
We moved to flatbuffers and back to JSON because, at the end of the day, for our data, JSON+gzip was similar in size to the original (which had some other fields that we were not using) and 10-20 times faster to decode.
That said, the use case for flatbuffers and capnproto isn't really about data size, it's about avoiding unnecessary copies in the processing pipeline. "Zero copy" really does pay dividends where performance is a concern if you write your code the right way.
Most people working on typical "web stack" type applications won't hit these concerns. But there are classes of applications where what flatbuffers (and other zero-copy payload formats) offer is important.
The difference in computation time between operating on something sitting in L1 cache vs not-in-cache is orders of magnitude. And memory bandwidth is a bottleneck in some applications and on some machines (particularly embedded).
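As a toy illustration (not flatbuffers' or capnproto's actual generated API; the offset and field name are invented), a zero-copy read is just an offset calculation into the buffer the network handed you, whereas the JSON path has to decode, parse, and allocate an object graph before a single field can be touched:

    // Zero-copy style: the "schema" says a little-endian i32 lives at byte 8,
    // so the read touches only those four bytes. No allocation, no copying.
    function readTemperatureZeroCopy(wire: ArrayBuffer): number {
      return new DataView(wire).getInt32(8, /* littleEndian */ true);
    }

    // JSON style: the whole payload is decoded to text, parsed, and turned
    // into heap objects before the one field we care about can be read.
    function readTemperatureJson(wire: ArrayBuffer): number {
      return JSON.parse(new TextDecoder().decode(wire)).temperature;
    }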
Not OP, but I'm going to guess it's because it's an added dependency in your client library, and even worse, it includes some code generation in each client's build.
How is it annoying? To be fair, we’re fronting our gRPC service with an AWS LB that terminates TLS (so our gRPC is plaintext), so we don’t deal with certs as direct dependencies of our server.