
I love Django / Django Rest Framework and have used it for a long time, but we recently dumped it from a project in favor of FastAPI.

There are so many layers of magic in Django that it was becoming impossible for us to improve performance to an acceptable level. We isolated the problem to serialization / deserialization: going from DB -> Python object -> JSON response was taking far more time than anything else, and just moving over to FastAPI got us a ~5x improvement in response time.

I am excited to see where Django async goes, though. It's something I've been looking forward to for a while now.




We ended up with 2 python layers:

-- Boring code - Business logic, CRUD, management, security, ...: django

-- Perf: JWT services on another stack (GPU, Arrow streaming, ...)

So stuff is either Boring Code or Performance Code. Async is great b/c now Boring Code can simply await Performance Code :) Boring Code gets predictability & the general ecosystem, and Performance Code does wilder stuff where we don't worry about the non-perf ecosystem, just perf ecosystem oddballs. We've been systematically dropping node from our backend, where we tried to have it all; IMO that's too much lift for most teams.
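A rough sketch of the shape, assuming a hypothetical /score endpoint on the perf stack and Django's async views (3.1+):

    import httpx
    from django.http import JsonResponse

    # Boring Code: an async Django view that awaits the Performance Code
    # service. The perf-service host and /score endpoint are hypothetical.
    async def score_view(request):
        async with httpx.AsyncClient() as client:
            resp = await client.post(
                "http://perf-service/score",
                json={"user_id": request.GET.get("user")},
            )
        return JsonResponse(resp.json())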


Similarly, we ended up doing the same. Boring CRUD/CMS stuff is all in Django. That's 90% of our codebase and by far the most important. Our "user scale" endpoints are all implemented in Lua in NGINX and just read/write to Redis; data changes go into SQS and are processed by Celery back in the Django app. It scales phenomenally well, and we don't lose any of the great things about developing all of our core biz-critical stuff in Django.
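A minimal sketch of the Celery side of that write-behind path, assuming a hypothetical Counter model and leaving the SQS-to-Celery plumbing aside:

    from celery import Celery

    # Celery with SQS as the broker; the edge layer produces the events
    app = Celery("events", broker="sqs://")

    @app.task
    def apply_event(payload):
        # Hypothetical Django model acting as the source of truth
        from myapp.models import Counter
        Counter.objects.update_or_create(
            key=payload["key"],
            defaults={"value": payload["value"]},
        )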


"Async is great b/c now Boring Code can now simply await Performance Code" - that's really smart, I like that philosophy.


I like this idea. Also I am looking at a separate GraphQL stack alongside Django for flexible access points.



That is quite interesting. There are a lot of things, like management, tests, and such, that I love and miss from Django. Going to have to really think about what I make of this.

Edit: Although, now that I think about it a little more, it's not that surprising. Our initial tests did literally just define FastAPI schemas on top of our existing DB. The co-mingling while actually running is an interesting concept, though.


Just FYI, for anyone reading this and having the same problem: I suggest trying Serpy, which is a near drop-in replacement for the default DRF serializers. It might solve your performance problem without having to switch to a completely different API framework.
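For reference, a minimal Serpy serializer (field names made up) that you can point at querysets much like a DRF serializer:

    import serpy

    class CommentSerializer(serpy.Serializer):
        # Serpy reads plain attributes, so ORM instances work directly
        id = serpy.IntField()
        body = serpy.StrField()
        created = serpy.MethodField()

        def get_created(self, obj):
            return obj.created.isoformat()

    # data = CommentSerializer(Comment.objects.all(), many=True).data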


Thanks! I'll check it out. I'm also seeing pretty bad performance when deserializing complex objects.


We did look into it, but didn't end up going that route.


Django is obviously surpassed in raw performance for more basic applications and APIs, but there definitely isn’t a lot of “magic” in Django.


I recently wrote a Python API for work and used FastAPI. I want to like it, but it was doing so much magic behind the scenes that it ended up being frustrating and just got in my way, so I ended up dropping it in favour of using Starlette directly.


What didn't you like about it? Curious where our pain points will be. Only been using it about 6 weeks.


The way it tries to construct the return values kept getting in the way.

I'd define the class and add it as the return value. I was manually instantiating the class and returning that, but it didn't like that and would constantly throw errors about it. I think Pydantic was the root cause there.
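For illustration, the pattern that kept fighting me, with a made-up Item model; FastAPI re-validates whatever you return against the response_model:

    from fastapi import FastAPI
    from pydantic import BaseModel

    app = FastAPI()

    class Item(BaseModel):
        name: str
        price: float

    @app.get("/item", response_model=Item)
    def read_item():
        # Even a manually built instance gets re-validated against
        # response_model; any field mismatch raises at runtime.
        return Item(name="widget", price=9.99)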

The Depends functionality refused to inject my classes as well, but I was probably doing something wrong there...

Dropping back to Starlette was good because it gave me everything I needed and got out of my way. I’ve still got everything fully typed and passing MyPy.


If your DB is Postgres and you can do everything you need to fetch the data in SQL, Postgres can output JSON directly. It’s pretty fast at it. Usually it’s not too hard to do this on a few performance-sensitive endpoints in a framework web project.
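For example, from a Django view (table and columns made up), casting to text so the driver doesn't re-parse the JSON:

    from django.db import connection
    from django.http import HttpResponse

    def items_json(request):
        # Postgres builds the JSON array itself; Python never
        # materializes row objects. "app_item" is a hypothetical table.
        with connection.cursor() as cur:
            cur.execute(
                "SELECT coalesce(json_agg(t), '[]')::text "
                "FROM (SELECT id, name, price FROM app_item) t"
            )
            body = cur.fetchone()[0]
        return HttpResponse(body, content_type="application/json")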


For performance-sensitive endpoints, Django just isn't the right tool. You can do a lot of optimizations in Django but in reality the WSGI/ASGI overhead and Django's request routing through middleware and view functions or CBVs is extremely slow. Is anyone handling 1,000 requests/second in their Django app without having to run 50 servers? The answer is no. If you're getting to the point where you're trying to figure out how to emit JSON from your database directly, then you've already lost. Django is exceptionally well suited to exactly what it was originally designed for: a content management system and "source-of-truth" for all of the business data in your application. High-velocity "user-scale" is better done in another service.


Not just Django, but Python.


Interesting.

Have any interest in expanding this into a blog post? I've been working on a similar post. Maybe we can compare notes. I'm at michael at testdriven dot io, if interested.


Django Rest Framework has really slow serialization. After seeing it in action, I wrote my own simple serializer that I have been using quite a bit. Deserialization isn't even really needed: just feed the submitted JSON into vanilla Django forms. It works better anyway.
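i.e. something like this (form fields made up):

    import json
    from django import forms
    from django.http import JsonResponse

    class CommentForm(forms.Form):
        body = forms.CharField()
        score = forms.IntegerField(required=False)

    def create_comment(request):
        # Forms accept any dict-like "data", so parsed JSON slots in
        # where request.POST would normally go
        form = CommentForm(data=json.loads(request.body))
        if not form.is_valid():
            return JsonResponse(form.errors.get_json_data(), status=400)
        return JsonResponse(form.cleaned_data, status=201)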


What did you use in place of the DRF serialization to get from DB -> json response?


FastAPI uses Pydantic underneath for the Python objects. And we have been tinkering with orjson for the actual JSON serialization, since it appears to be the winner in JSON serialization at the moment.
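One way to wire that up (FastAPI ships an ORJSONResponse; orjson has to be installed):

    from fastapi import FastAPI
    from fastapi.responses import ORJSONResponse

    # Every endpoint serializes through orjson by default
    app = FastAPI(default_response_class=ORJSONResponse)

    @app.get("/items")
    def items():
        return [{"id": 1, "name": "widget"}]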


Why didn't you use Pydantic with Django if the DRF serializers were too slow?

You can also skip the object instantiation from the ORM and work with Python dicts directly to significantly improve serialization performance from the database.
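A rough sketch with a made-up model:

    from django.http import JsonResponse
    from myapp.models import Item  # hypothetical model

    def items(request):
        # .values() skips model instantiation and yields plain dicts,
        # which go straight into the JSON encoder
        rows = list(Item.objects.values("id", "name", "price"))
        return JsonResponse(rows, safe=False)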


Was there still a significant speedup using the standard library json module?

For the DB requests, are you writing SQL directly, using a different ORM, or using something like SQLAlchemy Core that makes SQL Pythonic without being an ORM?


Yeah, the main improvement was seen even before playing with orjson. I think it helped too, but I only started with it yesterday, so I haven't actually profiled the two side by side to get real numbers.

And it uses SQLAlchemy under the hood; you can use all of it. But if you want full async all the way down, you can just use Core and something like encode/databases for the DB access.


How detailed was the profiling on this? Reason I ask is I’ve faced this myself and had to spend a lot of time on both query and serializer optimization.


We used `silk` a lot to profile the app, and basically all the time was being spent inside Django somewhere between getting the data from the DB and spitting out the response. We would see things like 15ms in the DB but 250ms to actually create the response, on simple things. Some of our responses ran into multiple seconds (large amounts of data) but still only spent maybe 150ms in the DB. At least two weeks were spent on and off trying to improve it before we finally decided we had to go somewhere else, and that's after having to redo some of our queries by hand because the ORM was doing something like 15 left joins.


You might be interested in https://hakibenita.com/django-rest-framework-slow. If you weren't able to update to a 3.x version that has https://github.com/django/django/commit/a2c31e12da272acc76f3..., this might have bit you pretty hard.


I'd be curious to know more about those 15 joins. Why do you think the ORM was doing those? And what DB are you using?


Basically just a complex permission model based on relationships, much better handled with a subquery. Mostly on us. I don't blame the ORM entirely, but it was doing more joins than necessary too.
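Roughly the shape we ended up with (models renamed, purely illustrative):

    from django.db.models import Exists, OuterRef
    from myapp.models import Membership, Project  # hypothetical models

    def visible_projects(user):
        # One correlated EXISTS per row instead of a pile of joins
        allowed = Membership.objects.filter(project=OuterRef("pk"), user=user)
        return Project.objects.annotate(
            has_access=Exists(allowed)
        ).filter(has_access=True)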


I see. Would love to find an elegant way to use the PostgreSQL permissions system from Django; that would result in a great perf boost, no doubt.


So how does it get from DB --> JSON response? SQLAlchemy or dbapi?


Yeah, FastAPI uses SQLAlchemy under it, along with Pydantic to define schemas with typing. And then I just started tinkering with orjson for the JSON serialization. It seems to be the fastest library at the moment.

I have also been experimenting with encode/databases for async DB access. It still uses the SA Core functions, which is nice, but that means it doesn't do the nice relationship handling that SA has built in when you use it end to end. At least not that I have found. However, it does allow for things like gets without relationships, updates of single records, and stuff like that quite nicely.
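A minimal sketch of that usage (DSN and table made up; connect/disconnect are typically hooked into app startup/shutdown):

    import databases
    import sqlalchemy as sa

    database = databases.Database("postgresql://localhost/app")

    metadata = sa.MetaData()
    notes = sa.Table(
        "notes", metadata,
        sa.Column("id", sa.Integer, primary_key=True),
        sa.Column("text", sa.String),
    )

    async def get_note(note_id: int):
        # SQLAlchemy Core expression, executed asynchronously
        query = notes.select().where(notes.c.id == note_id)
        return await database.fetch_one(query)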


I see, thanks. Is it required to define models twice as this page seems to recommend?

https://fastapi.tiangolo.com/tutorial/sql-databases/


FastAPI is database agnostic, although the tutorials talk about using SQLAlchemy (probably because it's the most popular).

I am using asyncpg[1] (much more performant, and it provides a close mapping to PostgreSQL, making it much easier to use its advanced features) through raw SQL statements without problems.

[1] https://github.com/MagicStack/asyncpg
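For example (DSN and table made up):

    import asyncio
    import asyncpg

    async def main():
        conn = await asyncpg.connect("postgresql://localhost/app")
        try:
            # Plain SQL in, asyncpg.Record objects out -- no ORM layer
            rows = await conn.fetch(
                "SELECT id, name FROM items WHERE price > $1", 10
            )
            print([dict(r) for r in rows])
        finally:
            await conn.close()

    asyncio.run(main())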


Yeah, you do end up defining everything more than once: once for SA and then again for Pydantic. Create, Read, and Update may all be different Pydantic models as well; they define what comes in and out of the actual API. Your create request may not have the id field yet and may have some optional fields, the response has everything, and an update may have everything optional except the id. I've only been using it a few weeks now, but I'm liking it a lot so far.

https://fastapi.tiangolo.com/tutorial/sql-databases/#create-...
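A sketch of that split with made-up fields:

    from typing import Optional
    from pydantic import BaseModel

    class ItemCreate(BaseModel):      # no id yet, some fields optional
        name: str
        notes: Optional[str] = None

    class ItemRead(BaseModel):        # the full shape the API returns
        id: int
        name: str
        notes: Optional[str] = None

        class Config:
            orm_mode = True           # lets Pydantic read SA objects

    class ItemUpdate(BaseModel):      # everything optional except the
        name: Optional[str] = None    # id, which comes from the URL
        notes: Optional[str] = None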


> FastAPI uses SQLAlchemy under it

This is somewhat inaccurate. They use SQLAlchemy in the tutorial, but FastAPI is in no way tied to SQLAlchemy.


Valid point. Should have said "can use".



