Python Web Frameworks' Complexity (mindref.blogspot.com)
94 points by mariuz on Nov 20, 2012 | hide | past | favorite | 63 comments



Note that this article is by the author of wheezy, a framework that just happens to do quite well on this particular metric.

Yes, I'm a maintainer of Django, so I'm biased too. But look: there's a good argument to be made that parts of (even of all of) Django is too complex. This, however, is not that argument; this is just a made-up metric.

ETA: I misspoke above: McCabe isn't made up. I meant to write that this was a cherry-picked metric. Leaving the above as is because people have already replied to it.


I happen to agree that this kind of measurement is pretty much useless here and that I don't pick my web framework via cyclomatic complexity, but let's not be too disingenuous. He's not making up a metric, it's a thing, and an occasionally useful one IMO.


Sorry, I didn't mean to suggest he was making up a metric.

Yes, McCabe is a real thing. I haven't really found it that useful myself -- I'm not sure that you can adequately measure something fuzzy like "complexity" through static control flow analysis -- but sure, I'm willing to accept that it could be useful in certain circumstances.

What I am saying is that this guy is cherry-picking a specific metric (out of many that he could choose) that makes his tool look good. I could just as easily come up with a metric that sorts Django to the top of an arbitrary list. That wouldn't be useful, either.


Ah OK. Sorry, misread. Agreed. Oh and maybe the metric you're looking for that puts django on top is "# of users" ;) :trollface:.


"# of users" is a metric I use to choose which framework to use. For one, it makes it much more likely that a problem I'm having can be solved with a google or forum or stackoverflow search.

I think that alone makes it a much better metric to judge the experience you'll have with a framework than its cyclomatic complexity (at least for a user of a framework, maybe not as a developer)


A bit off topic here, but I feel that this is a terrible metric.

The reason it's impossible to find answers for how to fix problem X in Windows (besides it being terribly designed) is that there are a ton of users who are trying to be helpful and providing bad advice.

Googling for solutions to an Ubuntu problem is starting to look grim too. Read some bug reports on Launchpad and then read some on the Debian bug tracker; there's a clear difference in signal/noise.

Circling back to the original point: if a web framework is simple, you can figure out the problem by just reading the code rather than looking online for solutions (many of which are in the form "oh, framework Y automatically does Z for you so you have to...").


I don't care how simple a framework is. There are always dusty corners that are hard to understand. There will always be bugs. And the ability to get help from fellow users when you hit those is, in my experience, crucial.


By "problem" the OP didn't just mean a bug or ambiguous situation, they meant plugins, addons, etc, that the community has contributed.

EDIT: Reading his post again, maybe he didn't mean that, but the point is still valid.


Don't forget ease of finding and/or training other developers to work on a project


If it's a made up metric, it was made up a long time ago:

"Cyclomatic complexity (or conditional complexity) is a software metric (measurement). It was developed by Thomas J. McCabe, Sr. in 1976 ..."

https://en.wikipedia.org/wiki/McCabe_complexity


The question isn't whether cyclomatic complexity is made up. It is what his "excessive complexity" measure is actually measuring. Is it the sum of all the measures? The max? I would argue that neither of those are useful on their own.
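
To make the sum-versus-max question concrete, here is a rough sketch using Python's `ast` module. The counting rule (1 plus the number of `if`/`for`/`while`/`except`/conditional-expression nodes, plus extra boolean operands) is my own simplification of McCabe, and `summarize` and its `threshold` parameter are hypothetical names. I have no idea whether this matches what the blog author's tool does.

```python
import ast

# Node types that each add one decision point (a simplification of McCabe).
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def complexity(func_node):
    """Approximate cyclomatic complexity: 1 + number of decision points."""
    score = 1
    for node in ast.walk(func_node):
        if isinstance(node, _BRANCH_NODES):
            score += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` has three operands, i.e. two extra paths.
            score += len(node.values) - 1
    return score


def per_function(source):
    """Map each function name in `source` to its approximate complexity."""
    tree = ast.parse(source)
    return {
        node.name: complexity(node)
        for node in ast.walk(tree)
        if isinstance(node, ast.FunctionDef)
    }


def summarize(scores, threshold=10):
    """Three different ways to collapse per-function scores into one number."""
    values = list(scores.values())
    return {
        "sum": sum(values),
        "max": max(values, default=0),
        "over_threshold": sum(1 for v in values if v > threshold),
    }


SAMPLE = '''
def simple(x):
    return x + 1

def branchy(x):
    if x > 0 and x < 10:
        return "small"
    for i in range(x):
        if i % 2:
            return i
    return None
'''

scores = per_function(SAMPLE)
summary = summarize(scores, threshold=3)
```

The three aggregates answer different questions: the sum grows with the size of the codebase, the max only reports the single worst function, and the over-threshold count ignores how far over the threshold a function is.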


> there's a good argument to be made that parts of (even of all of) Django is too complex.

As long as those complex parts are encapsulated well enough that I don't have to peek 99% of the time, I don't think it matters. I am more concerned about how complex the code I write has to be. If by making the framework complex you can simplify the code I have to write, I am all for it.


For some reason CherryPy's score doesn't quite correspond to its actual simplicity.


The wikipedia entry at http://en.wikipedia.org/wiki/Cyclomatic_complexity notes 10 as a module threshold.

This is a bit like comparing apples and oranges. Usually you'd use 10 as a rough limit for a single method or function, not for a whole framework. The whole point of modularization/refactoring is to have small parts, and I've never heard any claim about cyclomatic complexity that says you cannot use a multitude of small modules.

So to sum it up, it's nice that someone tried to compare python web frameworks, but the method of comparison is misleading at best and wrong at worst.


I'm not familiar with this complexity metric, but I immediately wondered if the results had been normalized with regard to the number of lines of code or modules in each framework (The wikipedia article doesn't say anything about normalization). Django does a lot more stuff than bottle or Flask. So if you just sum up complexity of their modules, it's pretty obvious which one will come out ahead.
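
Here is a tiny sketch of why normalization matters. The project names and numbers below are invented placeholders, not measurements of any real framework, and `per_kloc` is a hypothetical helper.

```python
def per_kloc(total_complexity, lines_of_code):
    """Complexity points per 1000 lines of code."""
    return 1000.0 * total_complexity / lines_of_code


# Hypothetical (total complexity, lines of code) pairs, made up for illustration.
projects = {
    "big_framework": (4000, 80000),
    "micro_framework": (300, 3000),
}

# Rank from "least complex" to "most complex" under each view.
by_total = sorted(projects, key=lambda name: projects[name][0])
by_density = sorted(projects, key=lambda name: per_kloc(*projects[name]))
```

With these made-up inputs the ordering flips: the small project wins on raw totals, while the large one wins per thousand lines. That is exactly why it matters whether the article's numbers were normalized.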


It's not clear to me whether this "excessive complexity" has an actual performance impact or is just a subjective measure of complexity.

Django, one of the worst offenders according to your chart, also happens to be a mature project in wide use that works really well and handles most edge cases where, perhaps, some "less complex" frameworks would fall down.

So what does this measurement mean in practical terms? How should I balance this score against the benefit of the framework?


The other frameworks would "fall down" or you would need to write a small amount of glue code?

While Django is mature, I'm using it right now in fact, I find that I spend a hell of a lot of time trying to figure out why Django is doing a specific thing or how to convince it to do the exact thing I want.

I'm convinced, after a few years of Django work and a year or so of Flask (or just werkzeug libraries), that Django is the wrong level of complexity for a web framework.

For every minute I've saved by having django handle the edge case or having some great feature built in I feel like I've lost a few minutes to fighting against the framework when my use case doesn't fit into the expected usage and I'm fighting against a fairly rigid framework.

This feeling is what I think the OP is trying to get at with the "excessive complexity" measurement, although using a single metric like this is probably not the answer. It does fit decently well with my impressions of the frameworks listed (among the ones I've used anyway) but this is a messy and subjective problem.

I've found myself firmly in the "libraries > frameworks" camp because of this issue so I'm certainly biased towards liking this metric, even with its faults.


Same here. I tried valiantly to do some class based views the other day and my spidey sense started going off. Maybe I'll find time later to get those the way I want them but I suspect they will suck up more time than they will save.

It's a general problem with hierarchies of inheritance. Too many separate places where the actual functionality is, so you have to learn all of them and hold them in your head to understand the few pithy lines of code in front of you.

And that's the issue with the admin and forms. Hard to keep track


> It's a general problem with hierarchies of inheritance.

And what do you suggest to make it easier for you to read, spaghetti code? Django makes great use of mixins, mixins you can re-use to make your own views if you don't like the default generic views.


That's what I was doing, using mixins. In a way it's no different than stacking decorators, but soon enough you get helper methods being called and cooperation between helper methods, and you can't remember which class something is implemented in.

An IDE can help but that's an indicator that the solution has a readability problem.
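
Short of an IDE, you can ask Python itself by walking the method resolution order. A minimal sketch follows. The mixin and view classes are hypothetical stand-ins, not Django's real ones, and `defining_class` is just a helper name I made up.

```python
def defining_class(cls, attr_name):
    """Return the first class in cls's MRO whose own namespace defines attr_name."""
    for klass in cls.__mro__:
        if attr_name in vars(klass):
            return klass
    return None


# Hypothetical mixins standing in for a class-based view hierarchy.
class ContextDataMixin(object):
    def get_context_data(self):
        return {}


class RenderMixin(object):
    template_name = None

    def render(self):
        # Relies on get_context_data() coming from some other class.
        return "%s %r" % (self.template_name, self.get_context_data())


class AuthMixin(object):
    def dispatch(self):
        # Relies on render() coming from some other class.
        return self.render()


class PostView(AuthMixin, RenderMixin, ContextDataMixin):
    template_name = "post.html"


where_render = defining_class(PostView, "render")
where_dispatch = defining_class(PostView, "dispatch")
```

It works, but needing a helper like this to read four short classes rather supports the readability complaint.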


How about loosely coupled components and code that makes the logical flow explicit? Tightly coupled frameworks with a lot of magic are a poor solution to annoyance at boilerplate/glue code in most professional situations.

That's not saying Django should do this, they want to create an easy to use, well integrated framework. That's saying why it might be a good idea to avoid tightly coupled frameworks when designing an application.


So which framework comes out on top with the least developer friction metric? Not just convenient features like an ORM but also, your ability to do stuff without the framework getting in the way.


I find the best solution is as minimal a framework as possible with full featured libraries. Avoid the temptation to let a big framework choose your orm, forms library, routing, etc. just because decisions are stressful or complicated.

Loose coupling, loose coupling, loose coupling!


It directly correlates to the number of hours you will spend with the debugger stepping through the labyrinthine admin trying to figure out why your widget ended up without a whatsit.


Isn't that dependent on actually hitting a bug in the framework where you need to do that?


I recall reading somewhere that the distinction between a framework and a library was that in a library, you[r code] calls the functions it needs, whilst in a framework, it calls you.
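
A toy illustration of that distinction, as I understand it. `TinyFramework` and everything else here is made up for the example and does not correspond to any real framework's API.

```python
# Library style: your code is in charge and calls the helper when it wants to.
def slugify(title):
    return "-".join(title.lower().split())


def my_script():
    return slugify("Hello Web World")


# Framework style: you register callbacks and the framework decides when to call them.
class TinyFramework(object):
    def __init__(self):
        self.routes = {}

    def route(self, path):
        def register(handler):
            self.routes[path] = handler
            return handler
        return register

    def handle(self, path):
        # The framework owns the control flow; user code runs only if it matches.
        handler = self.routes.get(path)
        if handler is None:
            return "404 Not Found"
        return handler()


app = TinyFramework()


@app.route("/hello")
def hello():
    return "hello from user code"
```

In the second half, `hello` is never called by name anywhere in user code. A stack trace from inside it always passes through `handle` first, which is where the hard-to-read traces come from once the framework is larger than a dozen lines.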

Often, bugs in your code will manifest as the framework throwing exceptions somewhere internally, often with difficult to interpret stack traces until you're familiar with how things work and what parts are safe to ignore.

I've not had much success with interactive debugging in django (recommendations/howtos would be hugely appreciated), but I can imagine a similar situation in which you have to learn what is safe to skip through (and maybe how to automate that to break only at points of interest, especially those points which are in some internal django setup methods)


Most of my most time-consuming Django hours involve forms or admin. It's not that there is a bug in my code, it's just very twisted and thick in there.

Eg. Form classes are created dynamically per request and then instantiated. Lots of currying, many places that params can be specified.
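
Roughly the shape of what I mean, as a stripped-down sketch. This is not Django's actual code. `BaseForm`, `form_factory` and `handle_request` are hypothetical names.

```python
from functools import partial


class BaseForm(object):
    fields = ()

    def __init__(self, data, user=None):
        self.data = data
        self.user = user

    def is_valid(self):
        return all(name in self.data for name in self.fields)


def form_factory(name, fields):
    """Build a new form class at runtime via type(name, bases, namespace)."""
    return type(name, (BaseForm,), {"fields": tuple(fields)})


def handle_request(data, user):
    # Step 1: the class itself is created per request...
    form_class = form_factory("PostForm", ["name", "title"])
    # Step 2: ...then partially applied ("curried") with request-specific arguments...
    make_form = partial(form_class, user=user)
    # Step 3: ...and only then instantiated.
    return make_form(data)


form = handle_request({"name": "a", "title": "b"}, user="alice")
```

Each of the three steps is a place where parameters can be specified, so when a value is wrong you have three layers to check instead of one constructor call.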


Yep, forms (and presently, Tastypie) are the biggest hassle for me as well. Digging through the internals of CBVs in order to figure out annoying little mixin-related quirks probably comes second.

Even in spite of all that, it's still the nicest (and by far the most well-documented) framework I've used.

If I had to pick one complaint, I'd say the "define fields as class-variables" for forms and models is a bit too magical, and picking through the metaclass implementation to understand exactly what things happen when can be a pain, and I'm not 100% sold that 'reduced boilerplate' is a good enough reason for it (there may be other good reasons. I'm just whining :)).
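
For anyone who hasn't looked, the core of the trick is small. Below is a minimal sketch of the declarative-fields pattern in Python 3 syntax. It is not Django's implementation, which does far more. `Field`, `DeclarativeMeta` and `declared_fields` are illustrative names.

```python
class Field(object):
    def __init__(self, required=True):
        self.required = required


class DeclarativeMeta(type):
    def __new__(mcs, name, bases, namespace):
        fields = {}
        # Start with fields collected on parent classes.
        for base in bases:
            fields.update(getattr(base, "declared_fields", {}))
        # Move Field instances out of the class body into one mapping.
        for key, value in list(namespace.items()):
            if isinstance(value, Field):
                fields[key] = namespace.pop(key)
        namespace["declared_fields"] = fields
        return super().__new__(mcs, name, bases, namespace)


class Form(metaclass=DeclarativeMeta):
    pass


class PostForm(Form):
    name = Field()
    title = Field()
    content = Field(required=False)


field_names = sorted(PostForm.declared_fields)
```

The surprising part is that `PostForm.name` no longer exists as a class attribute after the metaclass runs. That is the "too magical" bit: what you wrote in the class body is not what the class ends up containing.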


A little off topic, but forms are a big PITA using Flask and extensions, too. Forms are definitely not a solved problem in the python world.


What are you using? They are pretty convenient with WTForms.

    # models.py
    class Post(db.Model):
        id = db.Column(db.Integer, primary_key=True)
        name = db.Column(db.String(80))
        title = db.Column(db.String(200))
        content = db.Column(db.Text)
                
        def __init__(self, name, title, content):
            self.name = name
            self.title = title
            self.content = content

    # forms.py
    import flaskext.wtf as wtf
    from flaskext.wtf import Form, validators
    from wtforms.ext.sqlalchemy.orm import model_form
    import models


    PostForm = model_form(models.Post, Form, field_args = {
        'name': {'validators': [validators.required()]},
        'title': {'validators': [validators.required(), validators.length(min=5)]},
        
    })

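    # views.py
    from flask import redirect, render_template, url_for

    import forms
    import models
    from models import db  # assumes models.py exposes the SQLAlchemy instance as `db`

    def post_new():
        form = forms.PostForm()
        if form.validate_on_submit():
            post = models.Post(form.name.data, form.title.data, form.content.data)
            db.session.add(post)
            db.session.commit()
            return redirect(url_for('post.index'))
        return render_template('post/new.slim', form=form)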

    # _post_form.slim
    - from 'helpers.slim' import render_field

    form method="POST" class="well"
      = form.hidden_tag()
      = render_field(form.name, class\="span4")
      = render_field(form.title, class\="span4")
      = render_field(form.content, class\="span4")
      
      .field
        input type="submit" class="btn btn-success"

    # post/new.slim
    - extends 'layout.slim'

    - block content
      .span6
        h2 Creating new post
        - include 'post/_post_form.slim'
        a href="=url_for('.index')" class="btn btn-primary btn-small" Back

    # helpers.slim
    - macro render_field(field)
      div class="{{ 'field-with-errors' if field.errors else 'field' }}"
        - if field.errors
          - for error in field.errors
            p class="error" =  error
        = field.label
        = (field(**kwargs) | safe)

Don't bother about the slim templates. It's a small jinja2 extension which I use to give me slim syntax.

Now this probably looks like a lot. But consider that you write the helper to render the form only once. Also, you can write a helper to generate models and form scaffolding for you. That's what I do.

    python manage.py create_model post -f 'name:String(80) title:String(200) content:Text'

It generates the model, form and form template for me (models.py, forms.py, _post_form.py). The form will of course have empty validators, which I then fill in.


I disagree. I found WTForms quite easy to use and straightforward. I believe there's even a Flask-WTForms module. It's very natural. I was just thinking the other day how nice forms are in Flask, now that I have to write my own forms stuff in Scala.

Edit: stupid auto-correct.


Not necessarily a bug, more often it's just something not working the way you expect. Documentation is always ambiguous when you get down to the fine details and even in a very well documented project like Django there is still a whole lot of code surface area that is not documented other than an autogenerated API listing.


Yes, the "excessive complexity" is a measure I've not heard of before. Surely summing up functions over 10 doesn't make sense.


Beyond the question of how meaningful these metrics are, wouldn't it be a more useful metric to look at how complex the code you write when you use each framework is, rather than the code that implements each framework?

I suspect -- but don't know -- that these might even turn out to be inversely correlated for some types of frameworks. DSL-based frameworks often require some interesting tricks to get right.

So which is better? A complex framework that lets you write simple, concise code? Or a simple framework that requires a lot of boilerplate and complexity to use?


I think the answer to your question, and probably the point of asking it, is that It Depends. If you're writing a very simple web service that will be accessed by millions of clients for a small set of straightforward operations, you'll probably want to pick the framework with the smallest footprint and write a very little bit of custom code. If you're writing a vastly complex system to mingle support for new and legacy data and integrate with many disparate third party services, you will want a framework that does a lot behind the scenes to make your own code as painless as possible.

For everything else, aside from Mastercard, there's room for argument and discussion... but I think the only universal takeaway is that most or all of the frameworks mentioned have solid use cases, and none of them work for everyone in every circumstance.


I visited KyivPy, where the author of Wheezy.Web described his project, so I want to share my own thoughts about it.

It has all the basic things you would want: you can work with whichever template engine you like, forms, middleware, etc.

When we speak about Django, it is not just Django. It is also hundreds of Django apps like django-social-auth, South, django-celery, django-mptt, etc.

This new framework has no such huge community for now. On the other hand, the author did good work. You get a pretty complete framework on Python 3, which most of the others can't offer. And it is not 2to3: if I remember the code correctly, it was written for Python 3 and then ported to Python 2.

So, I like it. You can use classes or functions for handling requests. It is split into modules, so you can install just wheezy.http without many of the other parts. I believe we should choose the right tool to solve the problem: in one case it can be Django, in others we can take Wheezy.web.


I agree that a lot of python web frameworks are excessively complex. However, sometimes complexity is warranted.

I use web.py, flask and bottlepy depending on the required complexity of a project. They intuitively feel less complex. If I need a step up from web.py, I'd be inclined to go with Pyramid (reddit is built on Pylons, which is a precursor to Pyramid and before that, they used web.py)

To me it's a tradeoff between coding speed and project size/speed/performance. For example, if you use Bottlepy to write a typical webapp, you'd have to write your own session handlers (or use Beaker), and you'd have to write your own utils to transform and sanitize data etc.

However, using Bottle.py for a cookie dropping service? 30 lines of code and you have a very robust and scalable cookie dropping service. Ditto to using bottle for adserving and ad decisioning off redis.

Flask however, in my opinion works well for public API endpoints, while web.py works very very well for web frontends


There's a pretty strong correlation between "frameworks that are actually used" and "frameworks with high complexity" here. That makes sense to me: most of the software systems I've worked with that are actually used have a lot of warts to them. It comes from having users, who never seem to want to do the simple thing.


There appears to be only a loose correlation between McCabe complexity and "excessive complexity." It's not clear how the author converts from one to the other.

One would hope that the difference in ordering comes from accounting for the amount of functionality in each project (and even if you prefer bottle to django, you have to admit that django builds in more functionality.)

But, again, it's unclear how that conversion happened. I'm wondering if the author ranked "excessive complexity" however he wanted, and the McCabe complexity is just there to give this the veneer of scientific authenticity.


This article would have been a lot better with examples of where that complexity is (particular functions, where the complexity score comes from, etc), and what suggested refactorings help to reduce that complexity. That would make it possible to judge if this is a reasonable metric, or if this is just some overly-strict tool that dings perfectly reasonable code, which some frameworks happen to trigger more than others.


I was going to rant about django, but I think my issue is applicable to all wsgi applications. I develop APIs that often involve long polling as well as streaming, and while I can do this with most frameworks, the choice of the server in front can make or break my implementation.

Thus I've settled on Tornado as my framework and server.


This doesn't make great sense - why is this particular kind of complexity a bad thing?

Most Python frameworks, regardless, seem to score quite well (Flask, Turbogears, Bottle), in contrast to Django, which is to be expected as it doesn't try to be simple (it is, however, well documented and easy to learn).


According to the Wikipedia article increased cyclomatic complexity leads to harder test code and the increased likelihood of more defects.

I'm curious how the analysis in the blog post works. In languages like Python it's common to hide branches inside of data structure lookups like the following:

  def f():
      ...

  def g():
      ...

  def h(x):
      return {
          True: f,
          False: g}[not x]()


Slightly off-topic but why would you build up a dictionary in h(), then index it, then call the function being returned? The dictionary must be garbage collected later too. To me this seems rather inefficient compared to a simple if/then/else. I think readability also suffers.


It's likely the parent wrote it this way for brevity, while still trying to show the potential for complexity, which I believe is the key point here, and I would agree with it.

The dict-as-a-switch expression is common to python, but it can be masked as the parent has shown in larger examples.

I have no idea how we could apply static analysis techniques to a construct like this. To be honest I tend to avoid this approach on readability grounds. It could be cleaned up with a dict subclass but I think inlining the dict call mechanics is simpler to read, even if it produces larger code, leads to repetition and couples you to this approach (which would be a strong consideration in the case of writing a library - where this type of code is most prevalent).


Due to the dynamic nature of python, static analysis to determine CC is hard.

CC is useful for determining number of tests for complete branch/path coverage if you don't pull out the switch on dictionary tricks (among others).
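
A quick sketch of how the dictionary trick slips past a naive counter. `decision_points` is my own simplified stand-in for a complexity tool, not any real tool's algorithm, and `run` is just a hypothetical harness to show the two versions behave the same.

```python
import ast


def decision_points(source):
    """Count if/for/while/conditional-expression nodes in `source`."""
    tree = ast.parse(source)
    branch_types = (ast.If, ast.For, ast.While, ast.IfExp)
    return sum(isinstance(node, branch_types) for node in ast.walk(tree))


WITH_IF = '''
def h(x):
    if not x:
        return f()
    else:
        return g()
'''

WITH_DICT = '''
def h(x):
    return {True: f, False: g}[not x]()
'''


def run(source, x):
    """Execute one variant of h() with stub f and g, to compare behaviour."""
    namespace = {"f": lambda: "f", "g": lambda: "g"}
    exec(source, namespace)
    return namespace["h"](x)


if_count = decision_points(WITH_IF)
dict_count = decision_points(WITH_DICT)
```

Both versions of `h` have two behaviours and need two tests for full coverage, yet the counter sees one branch in the first and none in the second. So under this kind of counting the dict version gets a score of 1 instead of 2.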


I find django's level of complexity just about right. It has a lot of built-in things that you might not need in every case, but they might come in handy some of the time. But what makes me stick with django is that it is low level enough to allow me to build whatever I want while smoothing the process.


I would call Django everything but low-level. Tornado, Twisted.Web and (possibly) CherryPy are low-level, Flask, Bottle, Web.py and other micro-frameworks I consider mid-level and Django, Pyramid and other "full-stack" frameworks are definitely high-level.

And Django's complexity is really good in its class. "Complex is better than complicated" - and Django tries hard not to be the latter :)


I usually just route a json request to a python function and send a response. And I've found that the ORM, which would be the most high-level part of it, does everything I've needed without tweaking anything.


And you're doing this utilizing Django's urlconf machinery?! With all its support for deeply nested hierarchies of urls, passing keyword and other arguments extracted from url (and/or just supplied), supplying views as strings instead of function objects etc, not to mention setting settings.py up?

Look, I know that bare-bones Django looks simple, but in reality it isn't. You don't even have "bare-bones" unless you manually disable quite some functionality in settings.py. It's still good if you plan to grow your app. You can always turn the features back on and plug reusable apps with ease.

But for something as simple as getting JSON requests, hitting db and returning response I'd choose something else, some microframework and simplified ORM, for example Bottle and Storm[1]. I had a great success with using CorePost[2] on top of Twisted, but that's only if you can live without an ORM and like Twisted's brand of async.
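
To give a sense of how little machinery the "JSON in, function call, JSON out" case needs, here is a sketch that uses only the standard library and the plain WSGI calling convention rather than Bottle or CorePost. The route table, the `add` function and the `call` helper are hypothetical, and there is deliberately no handling of malformed JSON.

```python
import io
import json


def add(payload):
    """Plain Python function holding the actual logic (hypothetical example)."""
    return {"sum": payload["a"] + payload["b"]}


ROUTES = {"/add": add}


def app(environ, start_response):
    """WSGI application: look up the path, decode JSON, call the function."""
    handler = ROUTES.get(environ.get("PATH_INFO", ""))
    if handler is None:
        start_response("404 Not Found", [("Content-Type", "application/json")])
        return [b'{"error": "not found"}']
    length = int(environ.get("CONTENT_LENGTH") or 0)
    payload = json.loads(environ["wsgi.input"].read(length).decode("utf-8"))
    body = json.dumps(handler(payload)).encode("utf-8")
    start_response("200 OK", [("Content-Type", "application/json"),
                              ("Content-Length", str(len(body)))])
    return [body]


def call(path, payload):
    """Invoke the app in-process, without a server, for demonstration."""
    raw = json.dumps(payload).encode("utf-8")
    environ = {
        "REQUEST_METHOD": "POST",
        "PATH_INFO": path,
        "CONTENT_LENGTH": str(len(raw)),
        "wsgi.input": io.BytesIO(raw),
    }
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    chunks = app(environ, start_response)
    return captured["status"], json.loads(b"".join(chunks).decode("utf-8"))


ok = call("/add", {"a": 2, "b": 3})
missing = call("/missing", {})
```

Any WSGI server can host `app` as-is. A microframework mostly adds nicer routing and request objects on top of this shape.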

Long story short: there is no need to employ Django for every little webservice. You can, of course, but you should think about why you're using a full-stack framework when something simpler, faster and smaller would do just as well.

[1] https://storm.canonical.com/ [2] https://github.com/jacek99/corepost


I understand what you're saying. It's that kind of magic that makes building these simple things effortless, and that's what's expected of a framework. But you are right: compared to other, smaller python frameworks, django turns out to be more complex. Still, compared to other frameworks like .NET for example, django stays at a lower level, which I like, and I don't need more.


Django is a whole different beast though. It comes with the kitchen sink and its own cheeseshop. It was designed to cater for every single edge case, but in my opinion, is the very definition of excessive complexity.

Most of the things aren't needed by most people, but it gets installed anyway.

EDIT: I am an idiot who clearly doesn't know much about Django and spake without prior research.


A) The kind of complexity you're referring to is different from what this blog post is talking about. (Specifically cyclomatic complexity.)

B) Django doesn't have its own cheeseshop. Django developers use PyPI and pip just like everyone else. (Maybe you're thinking of Crate.io, which is a PyPI mirror that happens to use Django in its implementation.)


Oh. Thanks for setting me straight on the cheeseshop. (I was actually thinking that since it has its own package management system, it would have its own cheeseshop.) Sorry.

As for complexity, I still think it's excessive. It's good to know the internals of a module/package/whatever, and django is really really complex, which I believe is what cyclomatic complexity quantifies. I mean, if you have code that branches off everywhere, it'd be hard to follow the logic when inspecting the internals.


So Django doesn't have its own package management system, either.

I'm curious: where did you read/see that?


Maybe the 'django [reusable] app' terminology?

One of the first few pages in the tutorial includes:

"Django apps are "pluggable": You can use an app in multiple projects, and you can distribute apps, because they don't have to be tied to a given Django installation."

and I guess it might be possible to mistake 'manage.py startapp ...' as some sort of internal packaging thing if you don't look too deeply.


Yah. Obviously I was mistaken. But your points exactly


Holy ... jacobian replied to me.

To answer your question: I am an idiot and I didn't know much about Django. I did once find djangopackages or something like that and simply jumped to the conclusion. Big and Unwieldy is the brand perception I had.


The overwhelming majority of functionality in Django is useful for day to day web development. An argument could be made that a few bits of contrib and stuff like databrowse should be culled but they are completely separate from the rest of the framework and don't add to the conceptual complexity in any meaningful way.

And catering to 'every single edge case' actually seems to contradict the philosophy I've observed on the django-developers list.


This test doesn't look right. For instance, Django comes with its own ORM, and a bunch of stuff in contrib. Does the test count the code in contrib as well? To be fair, you would need to take for instance Flask, and add SQLAlchemy and a bunch of extensions to get a meaningful score.


Isn't the McCabe metric simply a measure of software complexity?

This method is designed for individual modules. Here you've applied it to the complete system. What sense does this make? The resulting metric does not convey any meaningful information. And a higher number doesn't necessarily mean worse; it just means the program is more complex, which may be due to more functionality.

The right way to compare is to level the scope of the frameworks (e.g. add SQLAlchemy to Pyramid) and compare again. Bring them all to equal levels.

https://en.wikipedia.org/wiki/McCabe_Metric


I.e. wheezy.web doesn't use middleware + tags for csrf. It is Python3 without 2to3 crap. And I agree with Aaron Swartz that a framework must handle input, output and how to work with the server. Logic, templates and databases must be free to choose. Wheezy.web is a set of modules, so we don't have to install a bunch of stuff like with Django for something simple. And we can use all of it for something difficult.


...what if having this complexity "packaged" in the framework prevents it from cropping up in your application code? (yeah, this usually happens when you do things "the framework way", but most web apps are pretty unoriginal so picking a framework that "fits" and walking "its way" is a good idea most of the time...)

...though it would be interesting to see this metric applied to a Ruby framework like Rails



