Fixed schemas aren't just about validation but performance and space usage as well.

Consider this long-open, high-priority enhancement request for MongoDB: https://jira.mongodb.org/browse/SERVER-863

Without a fixed schema, the schema must be stored with each item. Every "row" not only has the data but the metadata as well, including field names.

If you have a lot of repetitive data, this can consume a very significant amount of space (regularly on the order of 4x when comparing PG rows to BSON docs). This isn't just a disk space issue: it means less data fits in cache and more pages have to be scanned for any operation. Compression only helps with the disk space issue, and it adds CPU overhead.

Relational DBs like PostgreSQL, having a fixed schema, need to store it only once. Each row then carries minimal overhead (about 23 bytes in PG) plus only its own unique data.
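
To make the difference concrete, here is a rough sketch comparing the two layouts. It is not a benchmark of either database: JSON stands in for BSON (the exact byte counts differ, but the repeated field names are the point), `struct` stands in for a fixed row layout, the records and field names are made up, and the per-row header overhead mentioned above is ignored.

```python
import json
import struct

# Hypothetical records: the same three fields repeated in every item.
records = [
    {"customer_id": i, "order_total": i * 1.5, "is_shipped": i % 2 == 0}
    for i in range(1000)
]

# Self-describing layout: the field names travel with every single record.
# JSON is only a stand-in for BSON here.
self_describing = sum(
    len(json.dumps(r, separators=(",", ":")).encode("utf-8")) for r in records
)

# Fixed schema: the layout is declared once, each row is just packed values.
# "<qd?" = little-endian int64, float64, bool, with no padding.
ROW = struct.Struct("<qd?")
fixed = sum(
    len(ROW.pack(r["customer_id"], r["order_total"], r["is_shipped"]))
    for r in records
)

ratio = self_describing / fixed
print(f"self-describing: {self_describing} bytes")
print(f"fixed schema:    {fixed} bytes")
print(f"ratio:           {ratio:.1f}x")
```

The exact ratio depends heavily on how long the field names are relative to the values, so treat the number it prints as an illustration, not a measurement.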

Perhaps Mongo could optimize behind the scenes, detecting a repeating schema and building a fixed-schema data structure for it, similar to how the V8 JavaScript engine creates hidden classes for JavaScript objects so that it's not a key-value lookup everywhere limiting performance. I haven't seen any NoSQL DB attempt that, and I don't think any even attempt simple string interning as suggested in Mongo's SERVER-863 issue.
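
For what it's worth, the simple string interning idea could look something like the sketch below. This is only my illustration of the general technique, not what SERVER-863 specifies and not how MongoDB or any other database does it; `FieldNameTable`, `encode` and `decode` are hypothetical names.

```python
class FieldNameTable:
    """Stores each distinct field name once and hands out small integer IDs."""

    def __init__(self):
        self._ids = {}    # field name -> ID
        self._names = []  # ID -> field name

    def intern(self, name):
        # Assign the next free ID the first time a name is seen.
        if name not in self._ids:
            self._ids[name] = len(self._names)
            self._names.append(name)
        return self._ids[name]

    def name_of(self, field_id):
        return self._names[field_id]

    def __len__(self):
        return len(self._names)


def encode(doc, table):
    """Replace each field name in a flat document with its interned ID."""
    return [(table.intern(key), value) for key, value in doc.items()]


def decode(pairs, table):
    """Rebuild the original document from (ID, value) pairs."""
    return {table.name_of(field_id): value for field_id, value in pairs}


table = FieldNameTable()
docs = [
    {"customer_id": 1, "order_total": 9.5},
    {"customer_id": 2, "order_total": 12.0},
]
encoded = [encode(d, table) for d in docs]
```

Each stored document then carries small integers instead of repeated strings, and the names live in one shared table. A real implementation would also have to handle nested documents and persist the table, which this sketch skips.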

PG gives me a choice: I can go nearly schemaless with the JSON type, or I can create a fixed schema when that makes sense, with many optimized data types to choose from. PG's array support gives you a nice middle ground between the two as well.