Yuk, that's just manually doing what ORMs give you automatically; that's not a g...

sagichmal · on April 22, 2014

    > you're being a human compiler.

The transformation from "type instance in my language" to "relational data in my storage engine" is absolutely not "compilation". It's a subtle translation of data and grammar, and as the type relationships grow more complex, ORMs fail. The lesson of ActiveRecord, from the perspective of Go, is that ORMs are fundamentally broken abstractions.

gnaritas · on April 22, 2014

Don't be pedantic, my point was clear and you obviously understood it.

As for the lessons of active record, says who?

Jweb_Guru · on April 22, 2014

I do, for one, but don't take my word for it... lots of very intelligent people have problems with ORMs. http://blogs.tedneward.com/2006/06/26/The+Vietnam+Of+Compute... presents a good summary of the problems involved (forgive the inflammatory title).

Despite its many problems, SQL remains the best way of interacting with a relational database.

gnaritas · on April 22, 2014

> lots of very intelligent people have problems with ORMs.

And lots of very intelligent people like them. Both statements are meaningless appeals to authority.

> Despite its many problems, SQL remains the best way of interacting with a relational database.

That's your opinion, it's certainly not a fact, and just as many disagree as would agree. Secondly, ORM's don't stop you from using SQL where it's beneficial, so ruling out the ORM because you don't like it for some cases is throwing out the baby with the bath water. For standard CRUD operations, ORM's are the best approach by far. Your hand written CRUD operations gain you nothing but extra work.

Jweb_Guru · on April 22, 2014

> And lots of very intelligent people like them. Both statements are meaningless appeals to authority.

That was meant as a preface to a link that goes into much more detail. There are technical reasons why object-oriented programming and the relational model do not map well to each other, the infamous object-relational mismatch.

> That's your opinion, it's certainly not a fact, and just as many disagree as would agree.

I suspect rather more would disagree than would agree with me, if we're appealing to popularity. But given that the wire protocols for most relational databases (yes, there are exceptions) primarily accept SQL, and all their optimizers are SQL-based, I don't think this is actually as contentious a point as you believe it is. It would be different if ORMs, like optimizing compilers, were able to take high-level information about your application and query structure and transform it into better SQL--but that is in fact what modern RDBMSes do themselves. The more information you provide them, the better they are generally able to optimize your query. In my experience, ORMs usually have the opposite effect--they have less specific information about how you want your data to be used than an SQL query would, so they generate SQL in smaller chunks which can't be optimized as effectively. ORMs also rarely support all the features of any but the simplest RDBMSes, which means that you end up having to drop down into SQL in many cases to take advantage of them.
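
To make the "smaller chunks" point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The schema (authors, books) and the function names are hypothetical, made up for illustration. The first function imitates the per-row lazy loading that some ORMs emit by default; the second hands the database one JOIN that it can plan as a whole. It is not a claim about what any particular ORM generates.

```python
import sqlite3

def make_db():
    # Hypothetical schema, for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE books (
            id INTEGER PRIMARY KEY,
            author_id INTEGER NOT NULL REFERENCES authors(id),
            title TEXT NOT NULL
        );
        INSERT INTO authors (id, name) VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
        INSERT INTO books (author_id, title) VALUES
            (1, 'A1'), (1, 'A2'), (2, 'B1'), (3, 'C1');
    """)
    return conn

def titles_lazy(conn):
    """One query for the authors, then one more query per author."""
    queries = 0
    result = []
    authors = conn.execute("SELECT id, name FROM authors ORDER BY id").fetchall()
    queries += 1
    for author_id, name in authors:
        rows = conn.execute(
            "SELECT title FROM books WHERE author_id = ? ORDER BY id",
            (author_id,),
        ).fetchall()
        queries += 1
        result.extend((name, title) for (title,) in rows)
    return result, queries

def titles_joined(conn):
    """The same data in a single statement."""
    rows = conn.execute("""
        SELECT a.name, b.title
        FROM authors AS a
        JOIN books AS b ON b.author_id = a.id
        ORDER BY a.id, b.id
    """).fetchall()
    return rows, 1
```

Both functions return the same rows; the lazy version just needs one round trip per author on top of the first query, and the database never sees the whole access pattern at once.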

> Secondly, ORM's don't stop you from using SQL where it's beneficial, so ruling out the ORM because you don't like it for some cases is throwing out the baby with the bath water.

Sure. To continue the extremely apt compiler analogy, I can also use inline assembly in any C program (not standard C, perhaps :)) for things I can't do in C. In both cases, every time I have to do that, it is a failure of the original language to allow me to do the things I want to do. Any time I have to do that, I also have to independently verify safety, correctness, and platform independence, along with whatever guarantees the compiler would normally give me (such as there are any in C). And it requires domain-specific knowledge in an area that is dramatically outside my comfort zone.

If people were always having to drop in and out of inline assembly in C, it would have failed as a language long ago. The fact that people are still using it is largely testament to the fact that most C instructions (on low optimization levels, anyway) translate fairly straightforwardly to assembly instructions, and can be optimized to incredible extents by the compiler using knowledge of entire functions (or even the whole program). In performance-critical situations it is sometimes possible to do better than the compiler by dropping down into assembly, but the situations where you want to do that are very rare. By contrast, ORMs can't do any of those things--and I find myself dropping down into "inline SQL" so often that it's become my default approach. In sharp contrast to an optimizing compiler, the odds that even an SQL newcomer can produce better-optimized code than the ORM are shockingly high.

And none of this even gets into the real reason I've given up on ORMs, which is that they are terrible at guaranteeing consistency of your data in concurrent situations. In part out of deference to less-able databases, many ORMs will use transactions only begrudgingly and are often "unsafe by default" (using isolation levels weaker than SERIALIZABLE, like REPEATABLE READ), which makes dropping into SQL not just a performance concern, but one of data integrity. And if you don't use a very strong, serializable isolation level, you have to worry about dealing properly with deadlocks, livelocks, and other sorts of concurrency failures, which are nearly impossible to reason about unless you explicitly acquire the locks. I can't stress enough how nightmarishly difficult this can be in a large application even without an ORM. Adding an ORM to the equation basically means you spend half your time debugging the SQL, and the other half debugging the ORM. Turning on serializable isolation would help a lot, if you can take the performance hit, but the reality is that for similar reasons to why ORMs can't optimize queries very well, they're also not able to reason effectively about transaction lifetimes. Holding onto transactions longer than necessary is a great way to kill performance without substantially improving reliability. In the meantime, the majority of people who assume ORMs are protecting them from concurrent data access issues are very likely to have subtle, nearly undetectable data races that lead to extremely problematic bugs down the line.
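
A minimal sketch of one such data race, the classic lost update, again with Python's built-in sqlite3. The accounts table and function names are hypothetical. To keep it deterministic, the interleaving is simulated on a single connection (every "request" reads before any of them writes) rather than produced by real concurrent sessions, so treat it as an illustration of the failure mode, not a reproduction of any ORM's behavior.

```python
import sqlite3

def make_db():
    # Hypothetical single-row accounts table, for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO accounts (id, balance) VALUES (1, 100)")
    conn.commit()
    return conn

def read_balance(conn):
    return conn.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]

def withdraw_read_modify_write(conn, amounts):
    """Load the row, do the arithmetic in the application, save it back.

    All reads happen before any write, which is the interleaving that can
    occur when overlapping requests run without sufficient isolation.
    """
    snapshots = [read_balance(conn) for _ in amounts]  # every request sees 100
    for seen, amount in zip(snapshots, amounts):
        conn.execute(
            "UPDATE accounts SET balance = ? WHERE id = 1", (seen - amount,)
        )
    conn.commit()
    return read_balance(conn)

def withdraw_in_sql(conn, amounts):
    """Push the arithmetic into the UPDATE so it applies to the current value."""
    for amount in amounts:
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = 1", (amount,)
        )
    conn.commit()
    return read_balance(conn)
```

Withdrawing 10 and 20 from a balance of 100 should leave 70. The read-modify-write version leaves 80 because the second write is computed from a stale read and silently overwrites the first; the version that does the arithmetic inside the UPDATE leaves 70.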

So compared with standard SQL, ORMs provide neither safety nor speed, nor (IMO) do they reduce the amount of information you have to keep in your head to reason effectively about your program's behavior (since you have to know SQL anyway). But, you say,

> For standard CRUD operations, ORM's are the best approach by far. Your hand written CRUD operations gain you nothing but extra work.

To me, this is the strangest idea of all. Standard CRUD operations aren't very verbose in SQL, either, and in a properly normalized database, it's unlikely that most of the time all you want is to deal with a single row from a single table in a single transaction (more often a view, perhaps, but if you're using views you're already well outside of familiar ORM territory). I've personally found ORM behavior to be precisely what I actually wanted a pretty low percentage of the time, since even the smarter ORMs have a tendency to acquire data I don't need (especially if you have foreign keys defined), make repeated unnecessary query requests, and insist that you jump through elaborate hoops to enable common query patterns (try referencing a join table in Django that doesn't have a unique ID--or worse, using a table that has a multicolumn primary key). Your ORM only saves you work if you treat the database as a dumb store--a faster filesystem, basically. There are certainly situations that call for dumb storage like that, of course, but in my experience there are not a whole lot of them.
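
For a sense of how little code plain-SQL CRUD takes against a join table with a multicolumn primary key, here is a rough sketch with Python's built-in sqlite3. The enrollments table and the helper names are hypothetical, invented for this example.

```python
import sqlite3

def make_db():
    # Hypothetical join table keyed by (student_id, course_id); no surrogate ID.
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE enrollments (
            student_id INTEGER NOT NULL,
            course_id  INTEGER NOT NULL,
            grade      TEXT,
            PRIMARY KEY (student_id, course_id)
        )
    """)
    return conn

def enroll(conn, student_id, course_id):
    # Create: the composite key rejects duplicate enrollments.
    conn.execute(
        "INSERT INTO enrollments (student_id, course_id) VALUES (?, ?)",
        (student_id, course_id),
    )

def grade_of(conn, student_id, course_id):
    # Read: returns None if the row is missing or the grade is still NULL.
    row = conn.execute(
        "SELECT grade FROM enrollments WHERE student_id = ? AND course_id = ?",
        (student_id, course_id),
    ).fetchone()
    return row[0] if row else None

def set_grade(conn, student_id, course_id, grade):
    # Update: returns the number of rows changed (0 or 1).
    return conn.execute(
        "UPDATE enrollments SET grade = ? WHERE student_id = ? AND course_id = ?",
        (grade, student_id, course_id),
    ).rowcount

def drop(conn, student_id, course_id):
    # Delete: returns the number of rows removed (0 or 1).
    return conn.execute(
        "DELETE FROM enrollments WHERE student_id = ? AND course_id = ?",
        (student_id, course_id),
    ).rowcount
```

The composite key needs no special handling: it is just two columns in the WHERE clause, and the database itself enforces uniqueness of the pair.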

What's funny is that I used to completely agree with you. I saw monstrous SQL queries eating up valuable database and developer resources, surrounded by custom-built frameworks (different ones depending on which developer was working on a portion of the application), with hand-written migrations[1] that were supposedly rerunnable but somehow never were in practice, and my reaction was, "this is insane! None of this logic should be in the database! We should just use an ORM, and rely on the work of much smarter people who have surely reasoned about these problems much longer than we have, figured out best practices for data access, and encoded them into libraries. To do otherwise is the worst kind of not-invented-here syndrome and is clearly the result of DBAs clinging desperately to jobs that became irrelevant ten years ago."

The more I learned about databases, though, the more my tune started to change. It turns out that (and yes, this is persistently my opinion :)) ORMs aren't that at all. The best practices for handling and accessing data are encoded into the RDBMSes themselves, and for technical reasons that I've since uncovered (for starters: catalog access), it is nearly impossible for the application to do better without itself becoming the main source of information about the data and metadata of the application. I'm not saying that situations like the above are good--far from it--only that the solution isn't an ORM.

SQL is an ugly, ugly language. It's insanely complex to parse and has baffling type coercion, it repeats the mistakes of past languages in including NULL and handles it in the worst way possible, it lacks (by default) many convenient structural types (but advanced systems like PostgreSQL get around this issue), and the promise of logical independence is often more myth than reality. I would love for someone to come up with a better alternative. But ORMs sure aren't it. They don't solve any of SQL's real problems, or even attempt to, and introduce a whole new set. Their singular virtue is that syntactically they play nicely with your language of choice (if it happens to be object-oriented[2]), and that alone isn't very compelling.
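
One commonly cited example of the NULL complaint is three-valued logic, sketched here with Python's built-in sqlite3 (SQLite follows standard SQL on these points). Whether this is exactly what the comment above has in mind is an assumption; the variable names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def scalar(sql):
    # Run a query that yields one row with one column and return that value.
    return conn.execute(sql).fetchone()[0]

# Comparing NULL with "=" yields NULL (None in Python), not true or false.
null_equals_null = scalar("SELECT NULL = NULL")

# IS NULL is the operator that actually tests for NULL.
null_is_null = scalar("SELECT NULL IS NULL")

# A single NULL in the list makes NOT IN unknown for any non-matching value...
not_in_with_null = scalar("SELECT 3 NOT IN (1, 2, NULL)")

# ...so a WHERE clause using it silently filters out every such row.
rows_kept = scalar(
    "SELECT COUNT(*) FROM (SELECT 3 AS x) WHERE x NOT IN (1, 2, NULL)"
)
```

The last query is the one that tends to bite in practice: the row with x = 3 is dropped even though 3 is in neither 1 nor 2, because the comparison against NULL is unknown rather than false.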

[1] Hand-written migrations are still usually a terrible idea, though RDBMSes with proper transactions can mitigate this somewhat. This is an area where I still hold out hope that someone can come up with a much better solution. However, I think good migration frameworks are somewhat orthogonal to the ORM problem, and ORMs can't really help out much besides providing templates for schema diffs, since the hard part of a migration is actually migrating the data.

[2] I think a lot of the problems with ORMs are also problems with object-oriented programming in general. A declarative or functional ORM framework would actually be able to function as a real replacement for SQL, at least in theory. I've investigated such projects with interest, especially initiatives like SPARQL or the long-ongoing DataMapper2 project in Ruby (as well as abandoned good ideas like QUEL). But so far, none of them seem to be able to gain significant traction, which is disappointing but perhaps expected.

parasubvert · on April 23, 2014

ORMs are never the "best" approach to data access. They are a heavyweight, leaky bridge between two very different abstractions (objects and relations).

An individual framework may have the right balance of tradeoffs for one's needs to manage the dynamic complexity of certain object graph update scenarios. But a blind choice of ORM has usually ended in tears on most projects that reach substantial complexity in my experience. Explicit SQL is much more predictable and often just as productive as discovering the voodoo to make your ORM-du-jour dance. The lighter the ORM, the better, in my opinion.

For what it's worth, Golang does have an early ORM of sorts, called Gorp: https://github.com/coopernurse/gorp

sagichmal · on April 22, 2014

    > Your hand written CRUD operations gain you nothing but 
    > extra work.

And maintainability. (shrug)

Groostav · on April 23, 2014

So, getting fairly seriously tangential, what would you recommend for green field devs? Have you tried to skirt the ORM problem entirely and simply go for object stores / NoSQL?

parasubvert · on April 23, 2014

Learning direct SQL would be preferable to doing ORM or NoSQL for a greenfield dev, if the problems they're solving involve managing "transactional" data (like sales / orders).

It's a great fallback skill to have and if you're ever going to use an ORM in anger you'll need to know SQL well anyway.

If they're mostly managing technical data (like clickstreams or logstreams), then, use whatever store makes sense.

SQL will be around for decades to come and at least gives one hope they'll learn something about the relational model which is our only logically grounded approach to managing data integrity.

NoSQL is appropriate when dealing with problems that

(a) have a different natural requirement than general-purpose logical data management, e.g. your problem maps neatly into a log store, document store, search index, flat file, or K/V store and there's no gain in decomposing these structures into relations;

(b) involve data that no one is going to want to query or update arbitrarily (famous last words); or

(c) require massive scale and continuous availability and thus don't fit with most of today's SQL databases... though you'd still probably be better off looking around GitHub for the various frameworks to help you manage a sharded/replicated MySQL or PostgreSQL setup before jumping into NoSQL.

chc · on April 22, 2014

Well, Martin Fowler has gone on the record saying that Active Record (the design pattern, of which Ruby's ActiveRecord is one implementation) is not well-suited to sophisticated data models, and he tends to prefer the Data Mapper pattern, which is similar to what's being described for Go.

gnaritas · on April 22, 2014

That's not what the OP said; he said the lesson was "that ORMs are fundamentally broken abstractions". Fowler does not believe this, and even if he did, his opinion is certainly not a global lesson, as if it were now best practice to consider ORMs fundamentally broken.

camus2 · on April 22, 2014

Except it is quite hard to write a generic data mapper in Go as well. Something like Hibernate or Doctrine ORM would be nearly impossible to write in Go. I'm not talking about the basic features, but about the advanced features of both frameworks.

chc · on April 22, 2014

That is true, but genericity is actually not an inherent attribute of Data Mapper. The key idea is that there are fundamental incompatibilities between two data models (e.g. a structure in your program and a relation in your database), and the translation between the two models is a reified component of the program rather than just being absorbed into the behavior of the object itself as in Active Record.
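
A rough sketch of that separation, in Python with the built-in sqlite3 rather than Go, and not modeled on gorp's API. The User and UserMapper names, the table, and its column names are all hypothetical. The point is only that the domain object carries no persistence logic, while the mapper is a separate component that owns the translation to and from rows.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional

@dataclass
class User:
    # Plain domain object: knows nothing about tables or SQL.
    name: str
    email: str
    id: Optional[int] = None

class UserMapper:
    """Hypothetical mapper: the only place that knows how a User maps to a row."""

    def __init__(self, conn):
        self.conn = conn
        # Column names deliberately differ from the field names on User.
        self.conn.execute("""
            CREATE TABLE IF NOT EXISTS users (
                id INTEGER PRIMARY KEY,
                full_name TEXT NOT NULL,
                email_address TEXT NOT NULL UNIQUE
            )
        """)

    def insert(self, user):
        cur = self.conn.execute(
            "INSERT INTO users (full_name, email_address) VALUES (?, ?)",
            (user.name, user.email),
        )
        user.id = cur.lastrowid  # record the key the database assigned
        return user

    def find_by_email(self, email):
        row = self.conn.execute(
            "SELECT id, full_name, email_address FROM users"
            " WHERE email_address = ?",
            (email,),
        ).fetchone()
        if row is None:
            return None
        return User(id=row[0], name=row[1], email=row[2])
```

In an Active Record design, insert and find_by_email would instead live on User itself, so the object and its storage representation could not vary independently.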

There are some efforts along these lines, such as gorp (https://github.com/coopernurse/gorp), but I don't think they're all that widely used.

camus2 · on April 22, 2014

That's not the issue here; whether a pattern is good or not is irrelevant. The question is, is it possible to write something like ActiveRecord and keep its most advanced features in Go? Right now the lack of genericity in the language makes it a no go.