Hacker News

Do protocol buffers count as a 'new auto magic framework'?

https://developers.google.com/protocol-buffers/




The problem is, protobuf isn't just a serialization protocol, it's also a bunch of generated model code you have to start using in your application. You don't just serialize your domain model to protobuf, you tell protoc to build you some classes that become your model.

Which means if you aren't careful, you can't easily move to anything else for serialization, ever again. Your code now uses protobuf-specific objects everywhere, because that's what protobuf encourages. I'm currently in a codebase where countless method signatures (which should be serialization-agnostic) take or return `Message`-derived objects, because that's what we get when we read in a request or emit a response, and using those types everywhere was just so tempting.

And now, we have new requirements that introduce some dynamism to our data model, in a way protobuf doesn't provide, so we're trying to move away from protobuf, and it's turning out to require a rewrite of practically everything because these protobuf classes are our data model, so everything depends on them.

What I've come to prefer is for serialization to be implemented at the boundaries of your service, with your models at least somewhat isolated from any given serialization technique. Protobuf is a foot-gun here because it blends these roles in a way that's hard to get away from.
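For illustration, that boundary layer can look roughly like this in Python. This is only a sketch: the `User` dataclass and the conversion functions are made-up names, and a `SimpleNamespace` stands in for the protoc-generated message class so the example is self-contained.

```python
from dataclasses import dataclass
from types import SimpleNamespace

# Domain model: a plain class with no serialization dependency.
@dataclass
class User:
    id: int
    name: str

# Boundary code: the only place that knows about the wire format.
# In real code `msg` would be a protoc-generated message (e.g. user_pb2.User);
# here a SimpleNamespace stands in so the sketch runs on its own.
def user_from_message(msg) -> User:
    return User(id=msg.id, name=msg.name)

def user_to_dict(user: User) -> dict:
    # Or: populate and return a fresh protobuf message here instead.
    return {"id": user.id, "name": user.name}

incoming = SimpleNamespace(id=7, name="Ada")  # stand-in for a parsed proto
user = user_from_message(incoming)
```

Everything past the boundary functions sees only `User`, so swapping the wire format later means rewriting two small functions rather than the whole codebase.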


> What I've come to prefer is for serialization to be implemented at the boundaries of your service, with your models at least somewhat isolated from any given serialization technique.

I think this is the right way to do it. Just like how UTF-8 decoding to a string type is kept at the borders. Inevitably, someone comes along with a requirement that implies the first iteration of the data modeling was not only wrong, but backwards-incompatibly wrong.

It's hard to convince coworkers that it isn't code duplication though.

> Protobuf is a foot-gun here because it blends these roles in a way that's hard to get away from.

I'm not sure; in many ways it is just trying to give you a way to supply it the data to serialize with those models. It'd be nice to not have the "foot gun", but I'm not sure what such a serialization framework would look like.


IMO the serializers should be their own standalone classes/modules which live separately from your application’s core types. You can invoke them when you need to do the serialization and keep parallel versions of them for legacy clients, etc.

ActiveModel::Serializers work like this in Rails, although I haven’t tried any similar approaches in statically-typed languages where protobuf is so commonly used.
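A plain-Python sketch of that idea, with serializers as standalone classes kept in versioned parallel for legacy clients (the `UserSerializerV1`/`V2` names are invented for illustration):

```python
from dataclasses import dataclass

# Core application type: knows nothing about serialization.
@dataclass
class User:
    id: int
    name: str
    email: str

# Serializers live apart from the model; old versions stay around
# for clients that still expect the previous shape.
class UserSerializerV1:
    @staticmethod
    def dump(user: User) -> dict:
        return {"id": user.id, "name": user.name}

class UserSerializerV2:
    @staticmethod
    def dump(user: User) -> dict:
        return {"id": user.id, "name": user.name, "email": user.email}

u = User(id=1, name="Ada", email="ada@example.com")
legacy = UserSerializerV1.dump(u)
current = UserSerializerV2.dump(u)
```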


For Python there's Marshmallow (https://github.com/marshmallow-code/marshmallow) and Django REST Framework if you're using Django (http://www.django-rest-framework.org/api-guide/serializers/). Both of these work as you described.


Serializers are just functions. Why do they need to be classes?
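In that spirit, a serializer can be nothing more than a function from domain object to wire format; a minimal sketch (names hypothetical):

```python
import json
from dataclasses import dataclass

@dataclass
class User:
    id: int
    name: str

def serialize_user(user: User) -> str:
    # Pure function: domain object in, JSON text out; no framework needed.
    return json.dumps({"id": user.id, "name": user.name})

payload = serialize_user(User(id=1, name="Ada"))
```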


> What I've come to prefer is for serialization to be implemented at the boundaries of your service, with your models at least somewhat isolated from any given serialization technique. Protobuf is a foot-gun here because it blends these roles in a way that's hard to get away from.

This is exactly what I think about using ORMs, too, and keep repeating it. Using ORM-generated model classes as your models is a semi-automatic footgun with a hair trigger.
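The same boundary discipline applies with an ORM: have the data-access layer hand back plain domain objects instead of live ORM rows. A stdlib-only sketch, with raw `sqlite3` standing in for the ORM (the table and names are made up):

```python
import sqlite3
from dataclasses import dataclass

# Domain model, independent of the database layer.
@dataclass
class User:
    id: int
    name: str

def find_user(conn: sqlite3.Connection, user_id: int) -> User:
    # The database row never escapes this function; callers only
    # ever see the plain User type.
    row = conn.execute(
        "SELECT id, name FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return User(id=row[0], name=row[1])

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'Ada')")
user = find_user(conn, 1)
```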


It seems like this might be fixed with a more flexible code generator? Perhaps one that merges app-specific definitions and the definitions in the .proto file.


Yeah, though it can be tempting to just pass some protos around, it's generally best to use some other abstraction for your code (even if it's just a wrapper around a proto!)


How about FlatBuffers, making the file format and in-memory format one and the same?

https://google.github.io/flatbuffers/


So, this is literally the whole point of this article: it turns out that that is a bad idea, in the long run.


It seems unrelated in a way. FlatBuffers and protobufs are both ways to serialize data. The fact that FlatBuffers happen to serialize such that the persistent format is the same as the in-memory representation is an optimization. It is just as easy to shoot yourself in the foot with one as with the other in regards to what the article talks about. That is, you could choose to dump your objects as vertices only, instead of including edge information, with protobuf, JSON, FlatBuffers, XML, s-expressions, etc.

The main point was that serialization needs to be thought about carefully, because it will involve compatibility issues. It shouldn't be an automatic streaming of your current object structures to disk.



