Simon Willison: Django | Multiple Databases (simonwillison.net)
34 points by mnemonik on Dec 22, 2009 | 9 comments



Reading through that documentation page I must say I'm extremely underwhelmed.

Is that doc-page outdated/incomplete or did it really take them 4 years and a 21kloc patch to tie a model to a database instance? I mean, I don't see anything about proper partitioning/sharding, not a glimpse of cross-db queries. Is this a joke?

SQLAlchemy has had proper multi-db support (vertical and horizontal) since around 2007. Perhaps the time would be better spent porting Django to a real ORM?

http://www.sqlalchemy.org/docs/06/session.html#partitioning-...


The original plan was to add higher-level support for certain types of sharding (e.g. tying a specific model to an individual database), and earlier versions of the code actually provided a method for doing so. This was removed because it wasn't judged to be a good enough solution: Django is very keen on the concept of reusable applications, and hard-coding into an application the decision of which database its models should use would conflict with that concept.

Obviously the solution is to specify a mapping somewhere, but there's no need to hold up the feature while waiting for a design for that particular bikeshed.
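
As a rough illustration of "a mapping somewhere", here is a minimal sketch in plain Python. The names `DATABASE_MAPPING`, `DEFAULT_ALIAS` and `database_for` are hypothetical and are not Django APIs; the only point is that the choice of database lives in project configuration rather than inside the reusable application.

```python
# Hypothetical project-level mapping from app label to database alias.
# None of these names come from Django; they only illustrate the idea.
DATABASE_MAPPING = {
    "blog": "content_db",
    "auth": "users_db",
}

DEFAULT_ALIAS = "default"


def database_for(app_label, mapping=DATABASE_MAPPING):
    """Return the database alias configured for an app, or the default."""
    return mapping.get(app_label, DEFAULT_ALIAS)


print(database_for("blog"))   # content_db
print(database_for("polls"))  # default
```

A different project could ship a different mapping without touching the application's code, which is the property the reusable-apps concern asks for.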

Instead, the plan is to release an initial version with low level primitives: the ability to configure multiple databases and specify which database should be used for a given query. This gives us a chance to see what kind of patterns are most widely used and figure out the correct way to support those at a higher level of the API.
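
The two primitives can be sketched with nothing but the standard library. This is a stand-in using `sqlite3`, not Django's implementation; the `connections` dict, the `execute` helper, the aliases and the `entry` table are all assumptions made for the example.

```python
import sqlite3

# Primitive 1: configure several named databases (in-memory for the sketch).
connections = {
    "default": sqlite3.connect(":memory:"),
    "archive": sqlite3.connect(":memory:"),
}

for conn in connections.values():
    conn.execute("CREATE TABLE entry (id INTEGER PRIMARY KEY, title TEXT)")


def execute(alias, sql, params=()):
    """Primitive 2: run one query against the database named by `alias`."""
    conn = connections[alias]
    rows = conn.execute(sql, params).fetchall()
    conn.commit()
    return rows


# The caller names the database explicitly for every query.
execute("default", "INSERT INTO entry (title) VALUES (?)", ("new post",))
execute("archive", "INSERT INTO entry (title) VALUES (?)", ("old post",))

default_titles = execute("default", "SELECT title FROM entry")
archive_titles = execute("archive", "SELECT title FROM entry")
```

Each database only sees the rows written to it; anything higher level (routing, sharding, replication) would be built on top of these two operations.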

In my opinion, the SQLAlchemy example demonstrates why handling sharding automatically is a distinctly non-trivial problem: http://www.sqlalchemy.org/trac/browser/sqlalchemy/trunk/exam... - it's not clear to me that that's a better solution than just manually picking the database to execute a query on at the point of execution.
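
For comparison, manually picking the database at the point of execution can look like the sketch below. It is not the SQLAlchemy example and not Django code; it uses `sqlite3` in-memory databases, and the modulo rule, the fixed `SHARD_COUNT` and the `profile` table are assumptions chosen to keep it short.

```python
import sqlite3

SHARD_COUNT = 2  # assumption: a small, fixed number of shards

shards = [sqlite3.connect(":memory:") for _ in range(SHARD_COUNT)]
for conn in shards:
    conn.execute("CREATE TABLE profile (user_id INTEGER PRIMARY KEY, name TEXT)")


def shard_for(user_id):
    """Pick a shard by taking the user id modulo the shard count."""
    return shards[user_id % SHARD_COUNT]


def save_profile(user_id, name):
    """Write a profile to the one shard that owns this user id."""
    conn = shard_for(user_id)
    conn.execute(
        "INSERT INTO profile (user_id, name) VALUES (?, ?)", (user_id, name)
    )
    conn.commit()


def load_profile(user_id):
    """Read a profile name from the owning shard, or None if absent."""
    row = shard_for(user_id).execute(
        "SELECT name FROM profile WHERE user_id = ?", (user_id,)
    ).fetchone()
    return row[0] if row else None


def count_profiles():
    """A cross-shard query has to visit every shard and combine the results."""
    return sum(
        conn.execute("SELECT COUNT(*) FROM profile").fetchone()[0]
        for conn in shards
    )
```

Single-key reads and writes stay simple, while anything that spans shards (counts, joins, ordering) has to be assembled by hand, which is roughly where the automatic approaches get complicated.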


Well, SA provides all the primitives required for setting proper sharding up - admittedly it could use some more documentation and an out-of-the-box anonymous sharding lib. The example is fully functional, though, and once you've wrapped your head around it, it all makes sense and does indeed work (I've used it in production).

Sorry, but "seeing what patterns are most widely used" is a lame excuse. You could just look at what real-world sites do today (and did 4 years ago), or simply jump right to implementing the only anonymous sharding model that can possibly work (see how MongoDB does it).

I don't mean to bash Django as a whole here, but I had a true WTF moment looking at that page. It seems the home-grown ORM is becoming a ball and chain if something like that takes this long...

Anyway, at least it's progress in the right direction. Looking forward to real partitioning support, hopefully in less than another 4 years.


There are plenty of reasons one may prefer Django's ORM over SQLAlchemy. For example, SQLAlchemy doesn't have robust support for the most popular spatial database implementations: PostGIS, MySQL, Oracle, and SpatiaLite. Django's ORM does.

Moreover, it did not take four years to develop multi-db support. This was a GSoC 2009 project that was merged in 2009.


I submitted a direct link to the documentation over here, which is more interesting than my blog post:

http://news.ycombinator.com/item?id=1010373


21,000 line changeset... does that sound right? I mean, we live in the era of distributed version control, in times when merging is cheap, shouldn't there have been regular 2-way synchronizations?


There were. The work was carried out in an svn branch (with actual development in Git, but the branch was updated so svn users could still see what was going on). The svn branch was frequently updated to trunk using git-svn. Once the feature was ready, an svn merge was performed, resulting in the monster changeset.

Django uses svn for the core repository, but the majority of the actual work takes place in git or mercurial. Django has committers using both.


Actually the merge wasn't done using SVN. Russ didn't want to deal with svnmerge, so he just took the diff from git and applied and committed it.


I stand corrected.



