
One could combine JSON with a serialization-free library. Your JSON would be blown up with whitespace, but reads and updates could be O(1) and serialization would be a memcpy. You could probably canonicalize the JSON during the memcpy using Lemire's SIMD techniques.

I did this once for reading JSON on the fast path: the sending system laid out the arrays in a periodic pattern in memory, which enabled parseless retrieval of individual values.

https://github.com/simdjson/simdjson




That's an intriguing idea but limits you to strings for your internal representation. Every time you wanted to pull a number out of it you'd be reparsing it.

Also I assume you'd have to have some sort of binary portion bundled with it to hold the field offsets, no?


It sounds like the approach is to set aside e.g. 11 bytes for an i32 field and write or read all 11 of them on each access, and to do similar things for strings, such that their lengths must be bounded up-front. It's interesting, but it requires a bit of work from the end user, and to be polite one may want to remove all the extra spaces before sending the data over the wire to a program that isn't using these techniques.
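
A rough sketch of that fixed-width idea in Python, to make it concrete. This is my reading of the approach, not the original poster's code; `make_record`, `write_i32` and `read_i32` are made-up names, and the only hard fact is that the longest decimal i32, "-2147483648", is 11 characters.

```python
import json

WIDTH = 11  # "-2147483648" is the longest i32 in decimal: 11 characters


def make_record(field_names):
    """Build a padded JSON buffer plus a name -> byte offset table (hypothetical helper)."""
    buf = bytearray(b"{")
    offsets = {}
    for i, name in enumerate(field_names):
        if i:
            buf += b","
        buf += json.dumps(name).encode("ascii") + b":"
        offsets[name] = len(buf)       # the value slot starts here
        buf += b"0".rjust(WIDTH)       # leading spaces are legal JSON whitespace
    buf += b"}"
    return buf, offsets


def write_i32(buf, offset, value):
    """Overwrite all WIDTH bytes of the slot; the buffer length never changes."""
    if not -2**31 <= value < 2**31:
        raise ValueError("value does not fit in an i32")
    buf[offset:offset + WIDTH] = str(value).encode("ascii").rjust(WIDTH)


def read_i32(buf, offset):
    """Read the slot back; note this still parses decimal text on every access."""
    return int(bytes(buf[offset:offset + WIDTH]).decode("ascii"))


buf, offsets = make_record(["a", "b"])
write_i32(buf, offsets["b"], -2147483648)
```

The buffer stays valid JSON after every write, so a plain `json.loads(buf)` still works on the receiving end, and "serialization" is just copying `buf`.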


ah, I see.

I think I'd take a different approach and send along an "offset map" index blob that maps field IDs (statically known in advance from a schema that both client and server agree on) to memory offsets and lengths within a standard JSON file.

Then you have readable JSON, but also a fast O(1) way to access the fields in a zero-copy environment.

Done right the blob could even fit in an HTTP response header, so standard clients could use the msg as is while 'smart' clients could use the index map for optimized access.

But as I said, it would still suffer from numeric values being text-encoded, and actually 'compiling' the blob would be an extra step. It wouldn't have all the benefits of flatbuffers or capnproto, but could be an interesting compromise.

I'd be surprised if this isn't already being done somewhere.
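
To make the offset-map idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `SCHEMA` table, the dense field IDs starting at 0, the blob layout (two little-endian u32s per field), and the names `encode_with_index` and `get_field`.

```python
import json
import struct

# Hypothetical schema both sides agree on in advance: field ID -> top-level key.
# IDs are assumed dense (0..n-1) so the blob can be indexed by ID directly.
SCHEMA = {0: "id", 1: "name", 2: "score"}


def encode_with_index(record):
    """Return (json_bytes, index_blob). The blob holds (offset, length) per field ID."""
    body = bytearray(b"{")
    spans = []
    for field_id in sorted(SCHEMA):
        key = SCHEMA[field_id]
        if spans:
            body += b","
        body += json.dumps(key).encode("ascii") + b":"
        value = json.dumps(record[key]).encode("ascii")
        spans.append((len(body), len(value)))  # where the value's text lives
        body += value
    body += b"}"
    blob = b"".join(struct.pack("<II", off, n) for off, n in spans)
    return bytes(body), blob


def get_field(body, blob, field_id):
    """O(1) lookup of one field: jump to its span and parse only that text."""
    off, n = struct.unpack_from("<II", blob, field_id * 8)
    # The slice copies in Python; a C or Rust reader could point into the buffer.
    return json.loads(body[off:off + n])


body, blob = encode_with_index({"id": 7, "name": "ada", "score": 1.5})
name = get_field(body, blob, 1)
```

A client that ignores the blob just calls `json.loads(body)`. The value is still decoded from text on each access, which is the downside already mentioned.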


Take a look at Msgpack.


I have before, but that's very different from what I'm brainstorming here.


Yeah, you all get it.


Is the serializer public?



