Rachel presumably wrote her server in a reasonable language like C++ (though I don't see a link to her source), but when I wrote httpdito⁰ ¹ ² I wrote it in assembly, and it can handle 2048 concurrent connections on similarly outdated hardware despite spawning an OS process per connection, more than one concurrent connection per byte of executable†. (It could handle more, but I had to set a limit somewhere.) It just serves files from the filesystem. It of course doesn't use epoll, but maybe it should — instead of Rachel's 50k requests per second, it can only handle about 20k or 30k on my old laptop. IIRC I wrote it in one night.
It might sound like I'm trying to steal her thunder, but mostly what I'm trying to say is she is right. Listen to her. Here is further evidence that she is right.
As I wrote in https://gitlab.com/kragen/derctuo/blob/master/vector-vm.md, single-threaded nonvectorized C wastes on the order of 97% of your computer's computational power, and typical interpreted languages like Python waste about 99.9% of it. There's a huge amount of potential that's going untapped.
I feel like with modern technologies like LuaJIT, LevelDB, ØMQ, FlatBuffers, ISPC, seL4, and of course modern Linux, we ought to be able to do a lot of things that we couldn't even imagine doing in 2005, because they would have been far too inefficient. But our imaginations are still too limited, and industry is not doing a very good job of imagining things.
† It's actually bloated up to 2060 bytes now because I added PDF and CSS content-types to it, but you can git clone the .git subdirectory and check out the older versions that were under 2000 bytes.
> I feel like with modern technologies like ... we ought to be able to do a lot of things that we couldn't even imagine doing in 2005...
As a self-taught programmer I would say that what all these less efficient but easier to learn technologies have done is enable people like me, who evidently are not geniuses like yourself, to write software. Should programming always be an ivory tower thing?
It does take geniuses to design and operate the incredibly complex cloud-native distributed systems we all insist on building now.
It’s fair to point out how far you can get just programming one computer using traditional and well understood concepts like sockets and threads. And how weird it is that we live in a world where Kubernetes is mainstream and fun but threads are esoteric.
I'm not a genius. I started programming in BASIC. I haven't had a lot of schooling: I haven't had a computer science class since I was 12, and I didn't finish high school. I just didn't stop learning.
Adding unnecessary complexity doesn't always make things easier to learn.
Very well put. It's also a matter of use cases. These people seemingly implement servers for big companies. I just make shitty websites. We aren't their target audience yet it comes off as 'this is what everyone should be doing'.
I think part of the point of this is that this approach isn't particularly complicated (writing in assembly is unnecessary, but everything else described in both the article and the comment above is basically the simplest way to make a webserver).
> single-threaded nonvectorized C wastes on the order of 97% of your computer's computational power
Can you elaborate on what this means exactly? For example, is there some reasonable C code that runs 33 times slower than some other ideal code? In what sense are we wasting 97% of our computer's computational power?
Roughly that 3000× is 18× from multithreading, 3× from SIMD instructions, 15× from tuning access patterns for locality of reference, and 3× for turning on compiler optimization options. This is a really great slide deck!
I was assuming "single-threaded nonvectorized C" already had compiler optimization turned on and locality of reference taken into account. As the slide deck notes, you can get some vectorization out of your compiler — but usually it requires thinking like a FORTRAN programmer.
So I think in this case reasonable C code runs about 54× slower than Leiserson's final code. However, you could probably get a bigger speedup in this particular case with GPGPU. Other cases may be more difficult to get a GPU speedup, but get a bigger SIMD speedup. So I think my 97% is generally in the ballpark.
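Spelled out, since there are a few different numbers floating around here (my own back-of-the-envelope arithmetic, not taken from the slide deck):

```
# Rough sanity check of the factors above.
threads, simd, locality, compiler = 18, 3, 15, 3
print(threads * simd * locality * compiler)  # 2430, i.e. "on the order of 3000x"
print(threads * simd)                        # 54x: reasonable C (already tuned for locality,
                                             # compiled with -O) vs. the fully optimized code
print(1 - 1 / (8 * 4))                       # 0.969: ~97% wasted with 8 cores x 4 SIMD lanes idle
```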
A big problem is that we can't apply this level of human effort to optimizing every subroutine. We need better languages.
This is great! Which of these do you think could be extended to general-purpose programming without the HPC expert? Taichi and DAPP seem to be aimed at that goal, but you seem to be implying they don't reach it yet?
You can use them without the HPC expert; Halide, for example, has a good autotuner and has been used by Google and Adobe to create image filters for mobile devices.
I'm still kind of a newb myself but from what I understand these are special CPU instructions that allow you to execute the same instruction in parallel against multiple data points. This allows you to eke out a lot more performance. It's how simdjson[1] is able to outperform all other C++ json parsers.
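You can see the same idea at the Python level with numpy, which dispatches to C loops that compilers can map onto SIMD instructions. This is just an illustration of "same instruction, many data points", not how simdjson itself works:

```
import numpy as np

a = np.arange(1_000_000, dtype=np.float32)
b = np.arange(1_000_000, dtype=np.float32)

# One element at a time: every add pays interpreter overhead.
slow = [x + y for x, y in zip(a, b)]

# One call into numpy's C loop, where the same add is applied across
# many elements at once (and can compile down to SIMD instructions).
fast = a + b
```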
8 cores times 4 SIMD lanes is a 32× speedup; that's where "97%" comes from, as explained in the note I linked to.
It's pretty variable: some things we haven't figured out how to speed up with SIMD, sometimes we have a GPU, sometimes we can get 8 or 16 SIMD lanes out of SSE3 or AVX128 or 32 of them out of AVX256, sometimes you only have four cores, sometimes make -j is enough parallelism to win you back the 8× factor from the cores (though not SIMD and GPGPU). But I think 97% is a good ballpark estimate in general.
Just to clarify, I was only estimating a speedup of 4× from vectorization, while the other 8× comes from multithreading.
Fifteen years ago we thought regular expression matching and compilers were unlikely to benefit from vectorization, but now we have Hyperscan and Co-dfns, so they did. So I think it's likely that we will figure out ways to do a wider variety of computations in vectorized ways, now that the rewards are so great.
As an example, if you are checking a boolean flag (1 bit) on an object, and it ends up being a cache miss (and x86_64 cache line size is 64 bytes), then your computer just went through all the expense of pulling in 512 bits from RAM yet it only used 1 of them. You are achieving 0.2% of the machine's possible throughput.
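Here's that arithmetic spelled out, plus one hypothetical mitigation (packing the flags densely so a single cache line carries 512 of them; this is my own sketch, not something from the comment above):

```
import numpy as np

# 1 useful bit out of a 64-byte cache line.
print(1 / (64 * 8))            # 0.00195..., i.e. ~0.2% of the fetched bits

# Hypothetical mitigation: store the flags densely instead of one per object.
flags = np.zeros(1_000_000, dtype=bool)   # 1 byte per flag
packed = np.packbits(flags)               # 1 bit per flag; a cache line now holds 512 flags
```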
I imagine if you could make the most of the vector instruction set in your code (where it can operate on a vector of data at once instead of one element at a time), you'd get a huge performance boost for "free". GP seems to be working on a vm that lets you do that (a lot of it was flying over my head though, need some coffee).
Despite all my rants here about C, on my travel netbook I switched to XFCE as I could not stand the performance impact of all JavaScript and Python based extensions on GNOME.
I would expect so, but did you mean 500k or something? "50k rps on a beefy machine" sounds like about the same as, or maybe even a bit slower than, 20k–30k on this 2011 laptop, which was how fast httpdito was last time I measured it.
Went back and looked at it. A webserver written in asm for the lols is okay, but my point was that you probably want a proven, battle-ready web server (along the lines of nginx or apache) if running something in production. So, 50k rps on a vanilla, well-used/well-maintained server > 50k rps on an experiment (and don't get this wrong, it's pretty impressive for what it is).
Yeah, I definitely wouldn't advise anyone to run httpdito in production. It's so lacking in observability that it doesn't even log hits, it doesn't do timeouts, and its MIME types are configured by writing in assembly language. But it shows that some surprising things are possible. And it can be handy to serve up some static pages from your laptop to your phone or whatever.
Whether intended or not, there's an undercurrent of "you're all so dumb for using Python" (or Ruby, or PHP, or other similarly performant language) here. I want to surface that and question it a bit.
It's totally reasonable for a company to choose the Python/Gunicorn option if they already have a bunch of people who know Python and they don't need to serve tons of requests per second.
Even if they do need to serve tons of requests per second, it's totally reasonable for them to still choose Python/Gunicorn if the cost of the additional servers is less than the cost of having to support multiple languages. Or if they get a lot of value from libraries that are unique to the Python ecosystem. Or if they care more about quickly iterating on features than driving down server costs.
I agree that there's a point where it stops making sense, and there are plenty of engineers who don't recognize when they're past that point because they keep doubling down on sunk costs and things they're familiar with. But let's not be too quick to assume people are in that camp when we don't know all the tradeoffs they're facing.
I don't really get what the point of this post is. Is it really a dig at python? Python can handle thousands of connections in a single thread no problem with basic enough stuff.
Is what the author did supposed to be impressive? Is it supposed to make python look bad?
I don't get it. Seems like run of the mill stuff. Python might struggle at the same level of concurrency (was it like 15k?) but you can still do 10k connections easy enough iirc.
I made a post a while back about maintaining 64k connections in a Ruby process: https://www.wjwh.eu/posts/2018-10-29-double-hijack.html . It only stopped at 64k because I could not be bothered to rig up multiple IPs, so it eventually ran out of ports.
Just having a lot of connections that do trivial stuff is not very difficult. It becomes way more interesting when all of those connections need to access shared data structures and whatnot.
Which I read as "I have no experience whatsoever with modern python, and async is something that catches water".
Async is typically less prone to error and complexity than threaded code, and also typically faster/lighter for io (in python). I can't see a reason for this not to be the case in other languages too.
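For what it's worth, the basic shape in stdlib asyncio is small. A minimal sketch (the fetch() here is a made-up stand-in that sleeps instead of hitting the network):

```
import asyncio

async def fetch(name: str) -> str:
    # Stand-in for a network call; awaiting asyncio.sleep yields to the
    # event loop the same way awaiting a socket read would.
    await asyncio.sleep(0.1)
    return f"{name}: done"

async def main() -> None:
    # All three "requests" overlap on one thread, so this takes ~0.1s, not ~0.3s.
    results = await asyncio.gather(fetch("a"), fetch("b"), fetch("c"))
    print(results)

asyncio.run(main())
```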
In terms of error-proneness, I would say the hierarchy is this:
Actors < CSP ~ Async << Threads
In terms of "getting started" difficulty, I would put the order like this:
Async < Actors ~ CSP < Threads
In terms of first-order maintainability in large projects I would put the order like this:
Actors >> CSP ~ Threads > Async
Async is on its face easy to grok, and saves you a ton of problems with locking and the like, and you can run it on a single-threaded system with the right type of dispatcher. However, small amounts of complexity rapidly devolve into towers of async calls that cannot be untangled from a spaghetti pile, and there aren't really satisfying ways of figuring out how to unwind error-handling in async, except for the basic case of "I truly don't care if this async fails".
I will say though, it's spectacularly easy to write messy and garbage code in all four of these concurrency models (I certainly have). My preference for actors comes from four opinions:
- some systems (if you want to be flippant, microservices architecture) use actors to encapsulate failure domains, which is really fantastic, and truly the #1 reason to use actors
- 99.8% of the time there's no need to write mutexes, and good actor systems basically won't deadlock unless you really try hard. (also true for async systems btw)
- gives you an organizational framework to write well-designed and well-engineered systems.
- with a small amount of discipline "not having a spaghetti ball" scales with complexity (I find it takes a lot of discipline to not have a spaghetti ball with async in the more complex cases)
The async stack traces I've come across in Python are extremely simple to grok. I've never had something I couldn't figure out. (Using aiohttp, aiopg, asyncpg).
I can't even say the same for synchronous Django. Sometimes it's just the quality of the tool you're using, not the higher-level concept it implements.
You can trivially implement async on top of actors (like Elixir's Task module does), and generally most async in elixir should go through that model. But if you're in python-land you really don't know what you're missing. In elixir, my tests autopartition the state so that each test exists in its own "universe in the multiverse", so I can run concurrent, async integration tests with stubs, database transactions, even http requests that exit the vm (through chromedriver) and come back, finding their own correct partition of the test mockset and database state, and I have a few custom extensions to the multiverse system like process registries and global pubsub channels. We're talking hundreds of highly concurrent integration tests that run to completion in seconds.
It seems like doubling down is the standard thing to do. Here's a video about how Instagram bugs engineers to do fewer string manipulations in Python instead of using a faster language https://youtu.be/hnpzNAPiC0E
Django and Python somehow handled websites for entire newspapers on far wimpier hardware just fine--which makes me wonder if "cloud" (ie. non-deterministic memory and I/O accesses due to sharing with other tenants) isn't the problem rather than Python.
You need a market. You need paying customers. You need cashflow. You need features. You need a business plan.
You don't need scaling. Ever. To first, second and third order approximations.
Your company has a higher probability of bankruptcy than needing scaling.
Newspapers run everything behind a CDN, so Django isn’t a bottleneck. The things they do that can’t just be CDN buffered (like ad targeting and paywalls and comments) end up being supplied by outside vendors. I think Django fits well for a news org, but it’s good to understand why it fits well.
I just got roped into a project with the promise of Django, but the project lead ended up deciding to go with Laravel, so I've been looking into it for a few days.
It's a resounding no from me. If you're stuck with PHP, sure, this beats the WordPress style of...well...I can't really find words to describe how bad it is..., but it's still miles behind anything else. It's full of strings and other things that you just have to memorize, and the IDE integration, even with specialised extensions, is still worse than other languages with standard IDEs.
How much of the old "A fractal of bad design" post would you say still applies? That one post showcased so many footguns right at the language level that it spooked me away from it forever.
A lot of those still apply, but these days you can ignore the bad parts and only use the new good parts, just like how you would use javascript. Still, I only use php occasionally, but I wish it weren't as eager in treating strings as numbers in many cases.
If you have an hour to spare, this (https://www.youtube.com/watch?v=wCZ5TJCBWMg) is a great talk by Rasmus Lerdorf, the creator of PHP, on the design of the language, the reasons it ended up the way it did, and the ways it is evolving for the better.
> Even if they do need to serve tons of requests per second, it's totally reasonable for them to still choose Python/Gunicorn if the cost of the additional servers is less than the cost of having to support multiple languages.
How hard is it to get up to speed on any other tech stack? ASP.NET Core is extremely fast and the learning curve is close to none, for example.
If someone was able to wrap his head around backend development with Python I'm pretty sure they have the mental fortitude to onboard a tech stack that doesn't suffer from major performance problems.
That's because I'm more productive with Python (been using it for ten years to implement many backend services), have never hit any of its performance limitations yet (I mostly use it to develop b2b apps with medium traffic at most, and always leverage a distributed task queue for anything that might excessively block the webserver process), and also didn't want to maintain a fleet of windows servers (at least in the past, when .net was windows only).
Now you have to figure out how to run a CI on a build server. Deployment on your production systems, be it containers or even not. Monitoring, alerting, profiling, tuning under load. You have to support a new database connection library with new shenanigans. In general, you need to integrate the new language into the existing ecosystem. The latter may even be impossible depending on the stack and solution chosen. Just look up those weird Java only caching servers.
All of that is possible, sure. Due to company acquisitions and specialized teams in some areas, we're kinda running the full bingo card of languages.
But there's little denying: We overall spent less time handling language runtimes and language-specific monitoring back when we were java+mysql and that's it.
> Now you have to figure out how to run a CI on a build server. Deployment on your production systems, be it containers or even not. Monitoring, alerting, profiling, tuning under load. You have to support a new database connection library with new shenanigans.
Most if not all of those items are either trivial or non-issues.
In fact, I would argue that deploying a Python app is a far more convoluted process than getting an ASP.NET Core app up and running.
With Docker, the problem simply disappears.
Getting it to build and test on a CICD pipeline is as hard as typing $ dotnet build, or $ dotnet test.
> All of that is possible, sure.
Not only it is possible, it's laughably easy.
We are supposed to avoid our problems, not perpetuate and aggravate them forever because we are too lazy to look for ways to make our lives easier.
Learning curve close to none?
I like that many times there's "the way" of doing things, but when I have to do something slightly outside the recommended way, I feel that the effort required outweighs all the benefits.
“The way” of doing things depends on the language, and for example for me Python is very hard. It’s not my main stack but I use it regularly, and after years I still can’t find "the way" of doing things by myself; I feel like I end up on stackoverflow way too often because of anxiety about not being idiomatic. It’s like there is an enormous amount of meta-knowledge very specific to the language. That’s not something I experience with other stacks.
I'll find out myself. Just got handed a codebase that is a mix of Python/Gunicorn/node when this sprint started off. I've seen the word gunicorn in the repo... but still don't know what it does yet. First time for this old back end Java/C++/XQuery programmer, so how hard can it be?
(On the plus side, seems my Jetbrain kit includes PyCharm, so I've even got an IDE!)
Gunicorn is just a WSGI server, basically used to spawn a pool of webserver processes for your python backend. A python webserver is more optimal when used in multiprocess configuration (as opposed to a multithreaded configuration, which python sorely sucks at), and gunicorn will do that for you automatically, routing each http request to an available worker process in the pool.
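If it helps to see it concretely, here's a minimal sketch (the module name hello.py and the callable name app are my own invention; gunicorn just needs any WSGI callable):

```
# hello.py -- a minimal WSGI app; gunicorn imports this as "hello:app".
def app(environ, start_response):
    body = b"hello\n"
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]
```

Then "gunicorn -w 4 hello:app" starts a master process that forks four workers and spreads requests across them; "-k gevent" swaps in green-thread workers instead, as the sibling comment describes.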
> Gunicorn is just a WSGI server, basically used to spawn a pool of webserver processes for your python backend.
Or a pool of asynchronous "green threads" using any of a number of libraries (gevent is the one I'm most familiar with). The thing to avoid is mixing the two (multiple processes and green threads).
> A python webserver is more optimal when used in multiprocess configuration
For CPU bound worker tasks, yes, this is true. For I/O bound applications, not so much; asynchronous I/O can handle the same I/O load with far fewer resources (particularly as forking Python processes uses a lot more memory, because so much of the memory in the Python interpreter is dynamic, so you don't get much benefit from what in a compiled language would be shared read-only code that doesn't need to be copied for every fork).
> (as opposed to a multithreaded configuration, which python sorely sucks at)
Yes, the limitation of the GIL is one of Python's worst warts. (It looks like there are finally efforts to remove it, but it's taken a long, long time.)
Ah yes, I haven't used gevent for years and now associate gunicorn with wsgi. I have a far easier time with nodejs and golang when I need to do async parts on the backend (usually for websocket stuff), then use some message queue/passing system with celery or zeromq to communicate back and forth with the python backend.
You don't need to master the new stack to get up to speed. For that investment to pay off you only need to be almost as good as you were with your old stack, which isn't hard if you weren't all that proficient to begin with.
This is where my head always goes as someone who does mostly Asp.Net Core. Why is it always between something like C++ or something like Python? Nowadays with middleware and endpoint routing, asp.net core can be almost as simple as flask (even if that’s not idiomatic or what the docs show you).
I’m most comfortable in .net core but I recently learned Django because I kept hearing how productive it was from HN and all the startups around me use it, but I’m starting to feel like I chose to move the wrong way. I will say I like how Postgres is the default for python, and the library situation is much better over there as well.
I've tried working with Django and it's always seemed much too heavyweight to me. Flask is my preferred web framework for Python; much easier for me to use and feels like it's helping me where needed and getting out of my way where needed, not weighing me down.
Yeah, I actually wrote a prototype in Django and flask and just felt like I was reinventing the wheel too much with flask, e.g. marshmallow = serializers from DRF, sqlalchemy = django orm, etc. I wrote password hashing with bcrypt for user logins and then realized I should salt the passwords, and just went back to django since it already has stuff for everything I was doing.
I actually don't mind the heavy framework thing because .net core mvc is the same deal. The developer experience in python was just way worse for me. Autogenerating swagger docs didn't seem possible without manually adding annotations to all my code (for flask at least), the battle-tested python libraries for web dev aren't async, mypy is obviously way worse than an actual type system, model validation (the thing forms/serializers handle) was more tedious, and vscode was worse than visual studio.
Yeah I've heard that before, but from my (admittedly brief) time it seemed like you could just use the utilities you want and ignore the stuff that forces you into a CMS'y box. Feels like if you really wanted to you could just use a form/serializer to validate input data and to return a domain object, pass the domain object to a service layer, and then use the django orm (or honestly anything if you really want to stray from the django way), to handle database interactions. I understand flask might be nicer because you get to feel like you dont have a bunch of wasted code bloat in your deployable, but there are some pluggable systems that come with django that are nice (user system, pluggable authn/authz, caching framework, etc.) I didn't really mind the bloat because a lean deployable isn't that important to me and it all felt sluggish to me compared to .net core anyway.
Yes you can definitely pick and choose the parts of Django you want. Sometimes they have interdependencies but you don't need to use it all. Or for example, use Jinja instead of its templating system
Yeah Django feels terrible to me compared to .net core. I already wrote a similar answer to someone else so I will just paste it here
"The developer experience in python was just way worse for me. Autogenerating swagger docs didn't seem possible without manually adding annotations to all my code (for flask at least), the battle-tested python libraries like django and sqlalchemy aren't built for async, mypy is obviously way worse than an actual type system and having static types really helps with understanding a large new code base, model validation (the thing forms/serializers handle) was more tedious, vscode was worse than visual studio (code navigation was very hard to do in python. I could jump 1 layer into library code, but when I tried to jump deeper vscode couldn't find anything.)"
I've seen people say you should only choose rails or django for a startup, but .net core mvc provides the same batteries included approach and is built with the modern web ecosystem in mind. It also runs on linux and easily integrates with postgres so its just as cheap now as well. I think people that write off C# haven't really worked with it in its present form. I didn't find python significantly more terse than C# (other than having to define properties on types, which I already stated I prefer), just less feature rich.
Or Azure Functions if you want to go the serverless route. Coupled with Visual Studio, you can build high quality, large scale apps at about the same effort of blinking a LED on Arduino.
Are we using the same Azure Functions? I spend more time messing with functions.json and determining binding types and waiting five minutes for error messages to come in than I ever did when I simply provisioned an app service and deployed code.
> I tried to learn F# a few months ago and the experience was horrible.
Switching to a programming language based on an entirely different programming paradigm is not comparable to switching to a language based on the exact same programming paradigm to develop the exact same application using the exact same design patterns.
Of course it’s gonna be hard if you are trying to learn a language plus a framework at the same time, especially given that F# is treated as a second class citizen by Microsoft. An easier path is to break that learning into more manageable tasks: on one hand getting familiar with F# (which is already some work, as it has both a functional side and a CLR one; you mostly need the OOP part for ASP.net), and on the other ASP.net, for which the golden road is C#.
Otherwise the doc is quite good for ASP.net Core, and it’s rare I get stuck on a problem for too long with it.
I can't speak from recent experience, but on point 1) when I was teaching myself to code, I focused on .NET due to its prevalence in my local market. I took part in the .NET user group (called DUG, natch), attended the meetups etc.
And this was at the time when MS would announce a new blessed way to do things on a reasonably frequent basis. When I started, the blessed way to access data was DAOs, then it was ADO.NET (note, those could be around the wrong way, I have trouble figuring it out now in hindsight), then it was Linq2Sql, then that was deprecated for Entity Framework (which, I'll give credit, they seem to have stuck with, even if it does feel like a half-cribbed NHibernate).
I was frantically trying to learn the blessed thing, because the MS shops I was familiar with only used the blessed thing - the server was IIS, the database was SQL Server, the language was C# (I was the only member of the DUG who coded in F#, and few others knew of it), and you used the blessed patterns and the blessed frameworks.
Incidentally, this is why FOSS has had such a hard time in .NET, as soon as MS releases something that is reasonably feature complete, a lot of single-vendor minded companies switch to it.
And I met a lot of developers in their early 40s who were quietly terrified of getting left behind on the MS technology treadmill, and trying just as frantically to learn the new blessed thing as I was.
And then I got hired by a Java shop, and faced a paradigm where shit code cough java.util.Calendar, java.util.Date cough stuck around for yonks because it was good enough and replacing it had to be done very thoughtfully and gently.
Point #2 isn't super relevant in the age of IDEs, but I agree wholeheartedly that F# deserves a lot more love than it gets.
ASP.NET Core 2.1 was released in 2018 and will be supported until late 2021.
ASP.NET Core 3.1 was released a few months ago and there is no end of support in sight. Moreover, the changes between 2.1 and 3.1 were not that many. I've migrated a whole ASP.NET Core 2.1 web service to 3.1 in less than 1 hour.
> 2) C# is a very verbose language, that requires a lot of typing.
Nonsense. The only added verbosity to C# when compared with Python are the type declarations, which arguably are a problem plaguing Python. The first class support for events and async programming and properties in C# more than make up for it.
> 3) F#, the best language in .NET, is largely ignored by the .NET community.
> ASP.NET Core 2.1 was released in 2018 and will be supported until late 2021.
Which is way too unstable, especially for the kind of corporate environment c# has typically been used in. Getting those places to upgrade to stable supported versions of the framework has always been a battle even when backwards compatibility was great, if they have to deal with breaking changes every few years they will never upgrade.
This is why so many companies stick with their ancient COBOL systems, most modern alternatives don't offer the stability they need.
ASP.NET Core 2.1 is the LTS release of ASP.NET Core 2, which was released in 2017. I fail to see how a first class framework with a LTS that was released years ago can be described with a straight face as "way too unstable".
> Getting those places to upgrade to stable supported versions of the framework has always been a battle
ASP.NET Core 2 is stable since at least 2 or 3 years ago, depending on how you decide to count.
> This is why so many companies stick with their ancient COBOL systems, most modern alternatives don't offer the stability they need.
This assertion is simply wrong on so many levels. Don't mistake "why waste money maintaining working software" lines of reasoning for a sign of respect for stability.
More importantly, it's disingenuous to even think of the technical debt that keeps cobol on the map as relevant to the world of web services.
> I fail to see how a first class framework with a LTS that was released years ago can be described with a straight face as "way too unstable".
I fail to see how you can call 2 years of support an LTS with a straight face, it's taking the piss out of the term. The LTS of the OS I'm likely to run it on is supported for 8 years. 2 years isn't even enough time to finish many projects on the same LTS it started on.
At work we've got 30 year old c/c++ code bases that still run, they'll probably run for another 20 at least, we've got 20 year old python code that still runs (for now) and we've got 20 year old c# projects that still run. That last one will never get rewritten in .net core, in part because they've pissed away the stability the framework had. It would be crazy to use tools with 2 years of support for any of those projects.
> Don't mistake "why waste money maintaining working software" lines of reasoning for a sign of respect for stability.
Why should they waste money maintaining working software when there are stable options available? What does upgrading to asp.net core get them? Why should tens of thousands of companies waste money modifying working software just because someone on the core team thought the existing API was inelegant or too hard to maintain compatibility?
The strengths of any programming language can also be their weaknesses, so there isn't really a "best" language.
Notice a language can't be everything. They're either too verbose, too terse/cryptic, or trade ease of development for lack of control/performance/efficiency.
IMO, C# strikes a good middle ground on such matters...
It does not take that much time to get up to speed on another tech stack, but it does take some time, and servers are really cheap. Even if the learning curve is close to none, it's just cheaper to rent a dozen machines than have an engineering team spend a day or two looking into a new ecosystem.
> How hard is it to get up to speed on any other tech stack?
If I find myself debugging python tools, I usually just add debug statements to figure out WTF it is trying to do, and reimplement it in bash. It invariably is less than 10% as many lines of code, and also more debuggable / readable than the original.
Granted, most of the python scripts I see these days are build processes or cluster coordinators.
Given how badly bash goes wrong if spaces in filenames sneak in, or if you want to do error handling, at several places I've had a policy of rewriting scripts from bash into python. It has the advantage that it's usually a better cross-platform solution than running bash on Windows.
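Concretely, the filename problem mostly disappears in Python because arguments never get re-parsed by a shell. A small sketch (the "build logs" directory and the gzip invocation are just made-up examples):

```
import subprocess
from pathlib import Path

# Filenames with spaces (or newlines) are safe here: each path is handed
# to the child process as a single argv entry, no shell quoting involved.
for path in Path("build logs").glob("*.log"):
    subprocess.run(["gzip", "--keep", str(path)], check=True)
```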
> It's totally reasonable for a company to choose the Python/Gunicorn option
Yes, if done properly. The issues with the particular Python/Gunicorn setup the author described in an earlier article (linked to in this one) were not so much Python/Gunicorn issues as "not understanding how to properly use Python/Gunicorn" issues, or more generally "not understanding how the tool you are trying to use actually works" issues. (I actually shudder to think what such a group would have done trying to program the same application in C.)
Missing an order of magnitude? Twitter was claiming to be still running Rails in the mid-2010s, when they apparently had 200+ million active users. The service was finally sunsetted during the big Scala rewrite.
They abandoned their message queue (Starling), written in Ruby, early on. That abandonment was often misattributed to them abandoning Rails, I guess because of the shared usage of Ruby. The confusion was compounded back when someone at Twitter (it may have been Odeo at the time?) posted a message board rant about how Rails does not scale because ActiveRecord did not allow connecting to multiple databases out of the box. That rant seems to be the origin of the "Rails does not scale" mantra that swept the internet for a time. But, humorously, someone replied with a solution a few minutes later.
But how many instances did they have running that app, and at what cost? Did they have to build a ridiculous amount of caching in? And wasn't that the time period where they were incredibly unreliable to the point that their "fail whale" server error page became a running gag?
Twitter's architecture was a bad joke. Many-to-many communication in a scalable manner has been a solved problem for decades: you federate. You assign users to buckets, you assign buckets to servers. You route messages like you'd route e-mail. Been there, done that. The federation does not need to be outwardly visible.
Twitter's problem was a too centralized architecture, not Rails.
And I say that as someone who at the time hated Rails and who still hates Rails. I find it bloated and over-complicated. It may even have led them to make bad architectural choices because of how it was structured.
But they still did make bad architectural choices, and they fixed those choices at the same time they moved off Rails.
Not to mention that python is plenty fast compared to the time it takes to write stuff to the network. Of course heavyweight frameworks like django don't help the equation, but writing fast network code in python isn't exactly hard either.
This is something a lot of people don't get with most higher level languages.
My first commercial use of Ruby was in 2005. Not web facing, but messaging middleware. As in a pub-sub type passing of messages between various endpoints.
We had a C version. It was about 7k lines to support the bare minimum we needed. As an experiment to teach myself Ruby I wrote a Ruby implementation. With the usual caveats (it's often easy to make a rewrite better in all kinds of ways, including size), it was ~700 lines, far easier to read, and supported far more functionality, so I put it in production.
Was it slower? As usual that depends what you mean by "slower". It consumed 10x more CPU, but it also did much more work (e.g. supporting more flexible routing of messages etc.). The throughput, however was the same, and 10x more CPU means that maxing out the network connection took 10% of a single core instead of 1% of a single core.
For some types of tasks CPU is the most important thing, but for a lot of tasks you'll be IO limited. And a lot of tasks that people think are CPU limited are really down to poor IO handling (excessive context switches caused by lots of small reads is a common one).
And other people don't get that it is possible to have high level languages with almost C-like performance.
You don't need to give up on JIT and AOT compilation to use high level languages, and this is where current tooling for Ruby and Python ends up losing.
Fair enough, I've been playing with D a lot lately, which is basically "what if python was a C dialect instead". It's an incredibly simple language to learn but it's no less easy to write in than python. For most things I still reach for python though, probably because I've grown comfortable with duck typing.
Side note when starting with D: make sure to install dub. It's the package manager and basically eliminates makefiles from the compilation process. Just "dub init" and "dub run" and you're off to the races.
> python is plenty fast compared to the time it takes to write stuff to the network
With the proliferation of microservices, I find this increasingly not true. Sure, python definitely is plenty fast when you need to send something to a user many miles away. But with microservices, writing to the network might mean writing to a machine in the same data center, or even the same host in a different container. That's as fast as a few memory copies and a few context switches.
>python is plenty fast compared to the time it takes to write stuff to the network.
Give us some numbers.
Inside the data centre you have 10G, 40G, 100G ethernet connections. I know for a fact that you will struggle to soak a 10G connection using a single thread so I know you can't do this in Python without multiple processes using SO_REUSEPORT.
So use multiple processes with SO_REUSEPORT then. Or find yourself a wsgi server that does, because it's not exactly an unsolved problem.
That said by far most programs don't need to worry about saturating a 10G connection. I'm not writing a file server in python, I'll leave that to nginx or S3. I'm writing business logic in python which tends to be bottlenecked by a database in any case.
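For reference, the socket-level piece is tiny. A minimal sketch (Linux-only, port 8000 and the make_listener name are arbitrary choices of mine):

```
import socket

def make_listener(port: int = 8000) -> socket.socket:
    # Each worker process calls this and gets its own listening socket bound
    # to the same port; the kernel then spreads incoming connections across
    # the workers (requires Linux 3.9+ for SO_REUSEPORT).
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind(("0.0.0.0", port))
    sock.listen(128)
    return sock
```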
Python is great for plumbing together other functionality, which it turns out means most backends you'd be writing anyways. Python is less great at handling a large quantity of data, though most of the time you can get away with handing the data handling to some library (e.g. numpy or libuv or any one of thousands of libraries).
Worst case you can easily plumb in some C-calling-convention code into python. With FFI it's a matter of copying the header definition and you're off to the races. That way you can still write the bulk of the program in python, delegating the bulk data wrangling to C or D or rust or go or whatever you prefer.
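As a sketch of how little plumbing that FFI call takes (assuming a Linux box where the math library is libm.so.6; cffi also has an API mode that compiles a real extension module):

```
from cffi import FFI   # pip install cffi

ffi = FFI()
# Paste the declaration straight out of the C header:
ffi.cdef("double cos(double x);")
libm = ffi.dlopen("libm.so.6")   # ABI mode: no compiler needed at runtime

print(libm.cos(0.0))   # 1.0
```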
The only reason to use Python for anything more than a few hundred lines worth of utility is if you're working on a codebase that's already in Python, and even then it's debatable.
There simply isn't an excuse for using Python for any infrastructure. It does nothing particularly well - or even right - other than very purpose-specific scripting. It can tie things together well enough. And your codebase becomes a liability rather than an asset.
I always say this opinion when it comes to Python discussion in HN and I always get downvoted but hey, "all it takes for evil to triumph...".
So you always get downvoted to hell whenever you post this and you've decided everyone else is the problem? Can you even understand how you sound? I cringed with sympathetic embarrassment just from reading this. Seriously, rethink your life choices man.
This coming from someone who has literally never written a line of python in his life.
"Rethink your life choices" — because they have opinions that are unpopular and they express them anyway? I might not agree with those opinions, but I fail to see how discouraging them from expressing them is in any way good.
People expressing unpopular opinions and defending them with evidence is how we find out when the popular opinion is wrong, and how, when our discourse is functioning properly, we can gradually change the popular opinion to be less wrong. You, and the people downvoting the comment, are throwing a monkey wrench in those works.
I love Python and use it most of the time when I need to get things done quickly, but its runtime efficiency cost is becoming increasingly unappealing for three reasons: the end of Moore's Law, the concurrent rise of manycore and SIMD, and the steep rise in Python's footgun count. And now there are better alternatives.
— ⁂ —
In 2000 or 2005 Python was a simple, consistent, practical language with a policy of strict error handling that was very useful for producing reliable code: "Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it." Its runtime cost was significant but bearable, about a factor of 20–40: if you wrote your code in C instead of Python, it would run 20 to 40 times faster, and that was all the machine could do.
In 2020 Python is an overcomplicated, inconsistent, slow, unreliable language with a persistent schism resulting from the core developers' poor choice to make backwards-incompatible changes to simplify the language. It has metaclasses, superclass method resolution order linearization to enable mixins (two different ones, in Python 2), a lazy sequence construct that has been gradually Frankensteined into a general coroutine construct (with an additional lazy sequence construct added on top), two different incompatible language constructs to compensate for the lack of block arguments or full-fledged lambdas (decorators and context managers — I'm excluding generators here since they're more powerful than block arguments), and on and on. The reference documentation for "import" alone is 20 pages, and that's in Python 3, the simplified version of Python.
Python's performance cost has not increased in absolute terms — in fact, it's even improved a bit — but it's increasingly painful. In 2000 we could rest easy knowing that whatever we wrote in Python would be sped up by Moore's Law and Dennard scaling, roughly a doubling in speed every 18 months, so in three years it would be four times as fast, and in three more years it would be 16 times as fast. That, together with a little judicious implementation of inner loops in C, was a small price to pay for getting things done sooner and not having to open core files in a debugger.
But then Dennard scaling slammed into a wall around 2006 and Moore's Law sank into a swamp around 2016. Meanwhile, manycore meant that without multithreading, or at least multiprocessing, your program suffered an additional order of magnitude slowdown. Even US$40 hand computers now feature quad-core CPUs. Today, the gap between what the machine can do in absolute terms and what it can do when saddled with Python is a gap of 1000 or 10,000, not 20. If you can cope with the limitations of PyPy (it supports Numpy now! Since 2017) then you can get up to the speed of single-threaded C, which is about 3% of what your computer is capable of. But it's not going to get faster just because hardware progressed: computers will maybe be twice as fast in five years, at best, and maybe not. If it's too slow today, it'll probably be too slow then too.
But that's not the worst part. Python's completely botched Unicode handling introduces bugs into most Python programs that handle strings from the outside world, latent bugs that only surface once those strings contain non-ASCII characters — similar to the situation with bash scripts and filenames containing spaces, although that can be detected by purely local analysis (missing doublequotes around a $var, red alert!). Plan 9 had already demonstrated one correct way to handle the situation (the one used in Golang and Rust) and Markus Kuhn's UTF-8B proposed another, one which was eventually partially implemented in Python as PEP 383 ("surrogateescape") but turned off by default. I've had bugs in on-orbit satellite control software that I couldn't track down because Python generated a UnicodeDecodeError when it tried to log the stack trace.
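For anyone who hasn't run into it, the PEP 383 surrogateescape trick looks like this (a small illustration; the bytes are made up):

```
raw = b"caf\xe9 log entry"   # Latin-1 bytes, not valid UTF-8

# Strict decoding (the default) raises UnicodeDecodeError:
#   raw.decode("utf-8")

# PEP 383: smuggle the bad byte through as a lone surrogate instead of dying...
text = raw.decode("utf-8", errors="surrogateescape")   # 'caf\udce9 log entry'

# ...and get the exact original bytes back on the way out.
assert text.encode("utf-8", errors="surrogateescape") == raw
```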
— ⁂ —
At the same time, other alternatives got a lot better. Java grew into a mildly reasonable language, and Kotlin and Clojure are outstanding ones. Haskell, defying everyone's expectations, became practical. Microsoft started trying to embrace and extend free software, so now we have F# on Mono, which is almost OCaml — almost as convenient and concise as Python, but enormously less bug-prone. Mike Pall, a superhuman intelligence from the future, wrote LuaJIT, which gives you performance on par with C in a language as friendly as Python — not modern overcomplicated Python, old Computer Programming For Everybody Python. 100 million people, including little kids, program computer games in Roblox using Lua. (It's bug-prone as hell, though. Lua has a footgun for each toe, as Sean Palmer says.)
Even C++ has been tamed somewhat. And of course we have Rust and Golang. Golang is only a little bit uglier to program in than Python, and both of these new systems-programming languages make it a lot easier to take advantage of manycore, though not SIMD.
Switching from Python to Golang is as easy as switching from Perl to Python, but with a lot more benefits. And that's a big reason why a lot of the important infrastructure software written over the last decade has been written in Golang.
On the horizon, we have things like arcfide's Co-dfns APL compiler, Matt Pharr's ISPC, and GLSL to show us how massively parallel programming, including SIMD, can become accessible to mere mortals. They aren't yet practical options as alternatives to Python (except that GLSL is practical for its original purpose, of making nice graphics), but they might be pointing the way to something that is.
— ⁂ —
So I think it's eminently defensible that Python should now be consigned to "a few hundred lines worth of utility". Python is great for scripting TensorFlow, and it's a far superior substitute for MATLAB. But writing infrastructure in Python in 2020 is like writing infrastructure in Perl in 2005.
...which I was also doing. I think I probably owe an apology to a lot of folks at Aruba Networks who are maintaining that code today.
I primarily do all sorts of systems programming, but the few web backends I did were C# on Azure Functions and it was spectacular. I’m a very big proponent of Microsoft’s tooling and language development.
The difference in productivity between C# and Python is such that they might as well have been developed by different civilizations.
(I have occasionally used Python since about 2003 and C# since 2018)
> The only reason to use Python for anything more than a few hundred lines worth of utility is if you're working on a codebase that's already in Python, and even then it's debatable.
One important reason for using Python is that it almost always forces people and companies to release the source code.
I will take a shitty Python script over shitty C/Rust/Go/Java code any day for that reason ALONE.
With source code, I can fix your shitty program (and all programs are shitty--even mine). If it's compiled, that path is blocked.
I think going back to basics would be a really good idea for a lot of people in software. It seems like modern developers are more disconnected than ever from the reality of the hardware situation sitting right next to them. I believe there was a post on the front page today detailing a certain 22ms Hello World execution...
> First of all, it does not take "that much machine" to serve a fair number of clients. Ever since I wrote about the whole Python/Gunicorn/Gevent mess a couple of months back, people have been asking me "if not that, then what". It got me thinking about alternatives, and finally I just started writing code.
Another day, another questionable premise for a blog post. No, @rachelbythebay, the question is not "if not that, then what", and the answer is not reinventing the wheel. The question is just "what ___" -- what do you plan on doing, what does your software need to do, what does it need to support? Use the right tool for the right job. If you have a language you're proficient in and with an ecosystem that supports you developing something rapidly, it's borderline malpractice not to start there. When you need to optimize, optimize then. Maybe that means you carve out a subcomponent into a new service, and you choose a language purpose built for speedily doing what you need. Maybe it means a lot of things, but it doesn't mean you throwing out the baby with the bathwater and setting out to recreate the baby, the bathtub, and the bathwater from scratch to answer the question of why your tub is overflowing.
I wish this blog post was about solving real engineering problems instead of writing code to provide mediocre answers to poor questions.
> If you have a language you're proficient in and with an ecosystem that supports you developing something rapidly, it's borderline malpractice not to start there.
I think the point of this post is that most new programmers do not know things can be faster than their monstrous JS blob. The solution to slow requests is more servers instead of fixing the code.
> I think the point of this post is that most new programmers do not know things can be faster
Related to this: programmers who know an ORM or two but never learn SQL proper.
At my job I refactored a giant, slow, memory hungry reporting task into a single sql query. It used to take 10s of minutes to collate 1000s of datapoints. Now it takes 10s of milliseconds to collate a factor of 10 more data, never mind after I added some indexes to speed the thing up.
Knowing about the layer below the abstraction you're working at can be rather useful at times.
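A toy version of that kind of collapse, using sqlite3 from the stdlib (the table and column names are invented for the example):

```
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("north", 10.0), ("north", 5.0), ("south", 7.5)])

# ORM-ish approach: pull every row into Python and aggregate in a loop.
totals = {}
for region, amount in conn.execute("SELECT region, amount FROM sales"):
    totals[region] = totals.get(region, 0.0) + amount

# The refactor: let the database do the collation in one query
# (and an index on region makes it faster still).
totals = dict(conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region"))
print(totals)   # {'north': 15.0, 'south': 7.5}
```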
What's telling here is that you refactored the individual task into a single query, and nothing outside of that. That's fantastic! You scoped your work into a small unit and derived a significant impact. You maximized your impact to effort ratio.
Notably, you did NOT decide to write your own dialect of SQL to "scratch an itch". Knowing the layer below the abstraction you're working at can be useful at times, but only in relation to understanding the context of how all the layers fit together and making effective, pragmatic decisions. Pointless rewrites are anything but. Good on you for avoiding that impulse and doing the right thing.
It's fascinating that you see the misguided impulse being prematurely adding more servers rather than fixing the code. I would agree with that, but I would also see the premise of this blog post as a similar misguided impulse -- rather than fixing the code, writing a completely new part of infrastructure from scratch in a new language because the old one wasn't good enough. Unless you've exhausted reasonable attempts to optimize existing code inside the language the original code was written in, I believe that following this impulse would be (as I alluded to in my original comment) borderline malpractice.
It’s not aimed at introductory programmers. It’s aimed at more experienced programmers to suggest to them that they can show new programmers what’s possible.
People don't need to explicitly audience-tag every single thing they write. I don't know if "entitled" is the right word, but it's really an annoying demand. Most times, if you can understand something, or are willing to Google the jargon, then you're part of the audience, and that's plenty of precision.
I can Google medical terms until I pass out, but that doesn't make me the intended audience for a medical research paper.
Besides, I'm not suggesting (or "demanding") that all articles explicitly state their target audience. Just that this one does a poor job of indicating it.
So I think it's a poor argument when the parent comment (to which I was responding) assumes an exclusive target audience: "not introductory programmers... more experienced programmers".
You could argue that one was supposed to assume that from context. However, I didn't make that assumption whilst reading, so I especially wouldn't expect "introductory programmers" to pick up on that either.
You can probably understand most medical research papers after reading a few dozen medical Wikipedia articles and taking an introductory stats class. Math and physics research is a little more difficult.
What part of "I think some of us who have been doing this a while have been doing a terrible job of showing what's possible to those folks who are somewhat newer." didn't you understand?
Compare this to the Enterprise platform I am dealing with on a current project, which has 4 x ec2 nodes with 8 CPUs and 32 GB RAM.
I can DoS it with a single java client running 50 threads. If I use 100 the p95 shoots up to 30 - 40 seconds.
But the kicker is that no one (other than me) really cares. 50 concurrent threads is probably around the peak load it will get in prod, and various people involved think why bother trying to fix it?
Oh man, don’t remind me. We have a bunch of GraphQL proxies in ECS that somehow cannot handle more than 5 connections each, so naturally the solution is to just spin up 19 more of them to get to 100 concurrent connections...
Honestly, I think that's what makes working at a large "web-scale" company so attractive to me. When you're running at that sort of scale you can't afford to be as apathetic about performance so there's a lot more engineering effort put into efficiency, because it makes financial sense to do so. OTOH, in a lot of enterprise type companies, you can be nothing more than a "feature monkey".
The only endpoint to ever do anywhere near as little work as this example would be the heartbeat. Most of the cost comes from a combination of a bunch of things other than simply fetching some data from a local device:
- Web-anything is fetched off of a DB nowadays. That's another crapton of latency, because a) it's just easier to understand its characteristics if it runs on a separate system and b) most companies IME have either no DBAs at all or the DBAs have no time to look in depth at every system being built. So the DB resources are vastly underutilized, and the DB itself is badly understood.
- Cache invalidation is still the hardest problem, and every caching framework I've looked at seems to gloss over that part. Just cache all the things and hope people retry enough times to get the latest update. I would love someday to work on a system where things are aggressively cached at every level and invalidated at every level and with perfect granularity.
- Building for web scale from the beginning is premature optimization for the vast majority of companies. In the vanishingly unlikely scenario that the company actually grows 100-fold or more it makes sense to start investing heavily in performance. Of course, this also has the knock-on effect that the vast majority of software developers never get anywhere near a web scale system. OTOH it creates jobs for millions of developers, some of whom might end up building at scale someday.
Another elephant in the room is that building anything at web scale is just not something anybody straight into the workforce is anywhere near qualified for. We desperately need more focused learning (mentoring, pairing, etc.) across the board to bring everybody up to speed faster.
The typical response to these types of posts is "oh, your /toy/ server doesn't account for x, y, z in my use case, like ddos, network issues, etc." But how many people actually handle those cases in their production application? I can say that for the majority of the applications I've written at large companies handling significant traffic, API compatibility was far higher on the priority list than the cases people often bring up.
IMO she's right. I wish we hadn't messed with wrapping gunicorn and gevent around some of our services. It certainly would've made my life easier and the services faster.
What WSGI server do people recommend for python? I've been using gunicorn but this made me think of alternatives. A quick google search found this benchmark [0]; is bjoern really that much quicker? It seems all the other WSGI servers are roughly equivalent.
I've been using gunicorn quite happily without the gevent stuff that Rachel ran into issues with. Running 2-3 workers per core is enough for my application to make full use of the CPU. It is a "waste" of memory, but memory is so cheap and plentiful I've never had an issue with it.
bjoern is fast because it has a minimal feature set. No threads, no multiprocessing, no nothing. If it works for you, great, but it's never satisfied my requirements.
I'd avoid uWSGI: its performance is good, but it's so complicated, with so many features, that I never felt confident using it.
Never used waitress except in development, but people seem to have had success in production.
Do you have a performance issue that you have tracked back to gunicorn? If not, then continue using gunicorn. There are almost certainly more valuable things to spend your time on than replacing something that is already working.
I don't know about mod_wsgi; although there seems to be a ton of documentation, it's way harder to install than a virtual environment, Python, and Django. I love doing backend stuff in Python and Jinja, but I fucking hate setting up WSGI on Apache. It really makes me love PHP again.
Is there any reason you need mod_wsgi and Apache? gunicorn behind nginx is an ideal setup for me (nginx handles static files and reverse proxying while gunicorn handles the Python backend). I wrote a Dockerfile that combines all of those in a single package; I've been using it with very few tweaks for many of my smaller projects over the years, and it makes deployment very easy.
I've been using uWSGI behind NGINX (using uwsgi_pass) for years. It might not be the absolute fastest option around but it's faster than a lot of the other options, rock solid and extremely configurable.
Some fun extra features that uWSGI provides:
- Daemon management: It can manage other daemons, for example Celery, for you, so you can put your whole environment into a single uWSGI.conf.
- A cron-like interface for generating events on a schedule.
- Emperor - hosting of multiple apps. This is extremely configurable: you can have the server look at the filesystem for config files for the simplest setup, but it also supports AMQP, a Postgres database, MongoDB, a shell command, or a bunch of other things. It has a bunch of interesting isolation options, like running each app in its own Linux namespace. It also has (optional) socket activation, so apps won't be started until the first request.
- Auto-scaling
- Soon, multiple event-loop/IO subsystems to choose from, including asyncio
I've been meaning to have a look at nginx-unit though.
The thing to note is that waitress is not built to be the fastest server around, or anything along those lines.
Its primary use case is that it is pure Python, doesn't rely on any specific libraries or compilers to run/build, and is a threaded WSGI implementation, so it uses Python threads to run a WSGI app.
It works well for what it needs to do, and hopefully it is fairly robust. I've personally run waitress directly facing the internet, but will readily admit that in most cases running it behind a load balancer is a good idea, especially since it doesn't support SSL out of the box (yet, I should say; it's on my roadmap).
It won't win any speed contests, but it holds its own.
I use uWSGI at work; we've found a lot of bugs in the more interesting features of the uwsgi module, and we've had to fix some signal and memory-leak issues (I think we're still trying to get them upstreamed). That said, it's reasonably fast, and the WSGI interface in general is pretty pleasant to work with.
Always uWSGI. It's the hidden gem of web serving. It's even useful for things other than Python, and it has the operational features you would expect (such as graceful restarts). The more exotic features are a bit less tested, however, and it shows.
The general attitude here reminds me a bit of the following post from the architect of the Varnish proxy. I think the attitude comes down to the fact that modern kernels, and in general the foundations of network programming, are pretty strong. We should trust them more.
I would argue you can say the same about the foundations of RDBMSes as well. People build similarly elaborate caches around those (for example, Rails has a "Russian doll" cache layer built in), not realizing how much time has gone into developing the well-tuned caches within the database system itself, which simply needs to be allocated sufficiently large RAM.
Honest question: why go through the hassle of multiplexing all the waiting in a single thread, only to dispatch to a thread per client anyway? Simply using blocking I/O for the clients in those threads should be much simpler, right?
If you get stuck in read(), you can't do neat things like waking up when it's time to kick a client for being idle, doing other housekeeping, or cleanly shutting down the whole thing in a timely fashion. When I ^C the server, it sends the same wake condvar-poke but it twiddles the flags so the worker shuts down instead.
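Roughly, the worker side of that looks something like the sketch below. This is illustrative only, not the real code; the struct and flag names are invented here.

    #include <pthread.h>
    #include <stdbool.h>

    /* One of these per client connection; the listener thread pokes it. */
    struct client {
        pthread_mutex_t lock;
        pthread_cond_t  wake;
        bool readable;   /* listener saw the fd become readable      */
        bool kick;       /* idle too long: worker should drop client */
        bool shutdown;   /* ^C: worker should exit                   */
        int  fd;
    };

    /* Worker blocks here instead of in read(). */
    static void wait_for_work(struct client *c) {
        pthread_mutex_lock(&c->lock);
        while (!c->readable && !c->kick && !c->shutdown)
            pthread_cond_wait(&c->wake, &c->lock);
        pthread_mutex_unlock(&c->lock);
        /* caller checks the flags: read(), close(), or clean exit */
    }

    /* Listener does the same poke for every case, just a different flag. */
    static void poke(struct client *c, bool *flag) {
        pthread_mutex_lock(&c->lock);
        *flag = true;
        pthread_cond_signal(&c->wake);
        pthread_mutex_unlock(&c->lock);
    }

So kicking an idle client is poke(c, &c->kick), and shutdown is the same poke with a different flag twiddled, which is the part that can't happen while a worker is stuck inside read().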
Yes, using epoll with nonblocking I/O is better than blocking I/O on each worker thread. But that basically means that you are doing asynchronous programming--i.e., the exact same thing that the wonky Python/Gunicorn stack you described is doing! You're just doing it with better attention to important details.
Here, to me, is the key item:
The "listener" thread owns all of the file descriptors (listeners and clients both), and manages a single epoll set to watch over them.
This is exactly what any async server does: it centralizes all the file descriptor management and handling in one place, and only uses workers (whether they are threads or "green threads" or whatever) to read from/write to fd's that are marked as ready in the epoll set.
For your case, unless I'm misreading something, what the workers are doing in between the read/write is CPU intensive (or at least it's CPU work and not I/O work, even though it's not very "intensive" CPU work), so actual OS threads are a better choice for the workers since you can't rely on cooperative scheduling.
If what the workers were doing was I/O work (for example, sending a request to a remote database and waiting for a response), "green threads" would work fine (since their only real purpose would be to organize the I/O--the actual fd's are going to be managed by the central server that manages all the fd's and checks which ones are ready for read/write). And one definitely should not try to run "green threads" for the same server in multiple O/S threads (or worse still, multiple OS processes). For an I/O bound server, one shouldn't need to anyway.
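For concreteness, the central loop being described has roughly this shape. It's a bare sketch under my own assumptions, not Rachel's code; hand_off_to_worker() is an invented placeholder for however you queue the ready fd to its worker (for example, the condvar poke sketched earlier in the thread).

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/epoll.h>
    #include <sys/socket.h>

    /* Placeholder: tell the worker that owns fd that it may now read. */
    static void hand_off_to_worker(int fd) { (void)fd; }

    static void run_listener(int listen_fd) {
        int ep = epoll_create1(0);
        if (ep < 0) { perror("epoll_create1"); exit(1); }

        struct epoll_event ev = { .events = EPOLLIN, .data.fd = listen_fd };
        epoll_ctl(ep, EPOLL_CTL_ADD, listen_fd, &ev);

        for (;;) {
            struct epoll_event ready[64];
            int n = epoll_wait(ep, ready, 64, -1);
            for (int i = 0; i < n; i++) {
                int fd = ready[i].data.fd;
                if (fd == listen_fd) {
                    /* New connection: register it one-shot, so it stays quiet
                       while a worker owns it and gets re-armed later with
                       EPOLL_CTL_MOD. */
                    int client = accept(listen_fd, NULL, NULL);
                    struct epoll_event cev = {
                        .events  = EPOLLIN | EPOLLONESHOT,
                        .data.fd = client
                    };
                    epoll_ctl(ep, EPOLL_CTL_ADD, client, &cev);
                } else {
                    /* Readable client: don't read() here, just wake its worker. */
                    hand_off_to_worker(fd);
                }
            }
        }
    }

The workers never touch the epoll set; they only read or write the fd they've been handed, which is exactly the centralization described above.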
> If you get stuck in read(), you can't do neat things like waking up when it's time to kick a client for being idle
Totally possible with another thread acting as a watchdog timer and sending a signal, which causes the read to return with EINTR; the caller can then check a flag to decide whether it should retry or abort. And that's for file I/O. For socket I/O you can just set the socket to non-blocking.
File I/O is usually not interruptible with signals.
An alternative to putting the watchdog timer in another thread is to use alarm(2), the kernel's built-in watchdog timer; the default behavior for SIGALRM is probably adequate. This might be easier than non-blocking I/O.
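For what it's worth, the EINTR variant of this is only a few lines. A hedged sketch (the helper name is mine), assuming you install a handler without SA_RESTART so the read actually gets interrupted rather than silently restarted:

    #include <errno.h>
    #include <signal.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    static void on_alarm(int sig) { (void)sig; /* exists only to interrupt the syscall */ }

    /* read() that gives up after `seconds`, using alarm(2). */
    static ssize_t read_with_timeout(int fd, void *buf, size_t len, unsigned seconds) {
        struct sigaction sa;
        memset(&sa, 0, sizeof sa);
        sa.sa_handler = on_alarm;          /* note: no SA_RESTART */
        sigemptyset(&sa.sa_mask);
        sigaction(SIGALRM, &sa, NULL);

        alarm(seconds);                    /* arm the kernel watchdog    */
        ssize_t n = read(fd, buf, len);    /* may fail with errno==EINTR */
        alarm(0);                          /* disarm                     */

        if (n < 0 && errno == EINTR)
            fprintf(stderr, "read timed out after %u seconds\n", seconds);
        return n;
    }

With no handler at all, the default SIGALRM disposition just kills the process, which is the blunter watchdog behavior. And as the sibling comment notes, this mainly helps for sockets and pipes; a read stuck in uninterruptible disk I/O won't come back early.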
Good point. It would still be possible with the threads blocked in read(), but that would require signals and more logic in the threads, so a central multiplexing and coordinating thread seems like a cleaner solution.
I think the answer is in the previous article about Python linked in this article: to show that you can serve more requests with less resources if you avoid the Python/Gunicorn way and do it her way instead.
When was this time? In BSD, which introduced select(2), most network servers ran from inetd. I got a stern talking-to from the computer security folks for running a process in my .cshrc that would repeatedly finger someone at another university, where their fingerd ran from inetd, because I was making their shared VAX run unacceptably slowly. Early versions of httpd included instructions on how to run it from inetd, along with a note that you would probably regret it.
Are you thinking of, like, CICS systems from the 1970s connected to SNA or something? I mean they didn't have select(2) but they did serve many clients in a single process.
IRC servers were indeed written using select(2), because interaction between the clients is the whole point of IRC; that gets a great deal more difficult if you spawn off a separate process for each client! I think ICB/FNet predates IRC slightly and was also written using select(2). But IRC wasn't written until 1988, at which point select(2) and inetd were already about ten years old, and things like (most) FTP servers continued to run one process per concurrent client throughout the 1990s. (Walnut Creek CDROM wrote their own high-performance event-driven FTP server, IIRC.) Typically on the machine where you were running the IRC server you would also be running about a dozen or two other servers from inetd, all written with the one-process-per-client model.
If there was a time when all network servers were written using select(2) or similar event-driven APIs, it wasn't 1988 or later.
No, it isn't; it's doing the same thing a select(2) server would do, except it's using epoll to avoid scaling issues when you have a lot of file descriptors in the polling set. The only difference is that the workers are doing something that requires CPU, not I/O, so OS threads are being used for them (a single-threaded server would be fine if the workers were just doing more I/O, like sending a request to a remote database and waiting for a response). But the worker threads are not doing any I/O management at all; they read from or write to an fd only when the central server that is calling epoll tells them to. In the listen/accept/fork model, by contrast, the central server forgets about an fd once it has passed it to a handler process, and the handler process uses blocking I/O.
> Ever since I wrote about the whole Python/Gunicorn/Gevent mess a couple of months back, people have been asking me "if not that, then what". It got me thinking about alternatives, and finally I just started writing code.
I actually want to know: then what? As a web developer who usually reaches for Django or Flask with Gunicorn because I just don't know any better, is there a better stack that doesn't face these problems? Or is this a 'call to action' for somebody to build a better web server that follows this advice?
Every time I read one of her posts, I think, wow, she must either work for literally dirt cheap or her code must be expected to run on several thousand machines. Most projects just don't reach the scale or steady state where developer time is cheaper than machine time. It's fun trying to squeeze the last drop of blood from a stone, but it rarely makes economic sense.
As we've seen in the cloud costs thread, going from Python on AWS to Rust on dedicated hardware can push your bills from $30,000/month down to $300/month, which more than pays for the dev salaries (especially if you're not paying the excessive and unnecessary salaries of the Bay Area).
How many hours would you expect that rewrite to take when done by Python developers learning Rust as they do it?
And what's the opportunity cost of the new features they can't create because they're rewriting existing apps?
If you're arguing that Python has more runtime overhead than Rust, I don't disagree.
But there's a reason people invented higher level languages than C. Rust is a far nicer systems language than C, but is it faster to develop in than Python or Kotlin? It really depends.
I think the argument of the article is to pick the right language in the beginning so you save both developer time and runtime and don't have to rewrite in a better language later. For example, Python is a lot slower than threaded C, but quicker in terms of developer time and is memory-safe. If you pick one of the newer JVM languages though, you can get far closer to C than Python in performance once the JVM is running and still keep most of the expressiveness.
The reason I'm mentioning Rust is that you can still build the fun part of your application in Python, and easily use the FFI to build any performance-critical part in Rust, C, or C++.
That said, with Rust or C++ you can, depending on the situation, be just as effective as with Python or even more so, and you gain an enormous performance benefit (which in turn saves you money, which in turn means you can hire even more developers).
It is not clear that Rust can be as fast to develop in as Python or Kotlin, for the typical run of coders--people using Rust now are mostly well above average--but it is abundantly clear that, supported by good libraries, modern C++ can. Faster would be a tall order, but is not necessary.
Put in the same effort, and get 10x-1000x faster code or lower resource needs. Why not?
I've gone down this rabbit hole recently. I was using Apache because it comes with a lot of nice-to-haves besides HTTP handling, like authorization, session, and cookie modules, but I wanted to see if C and CGI were viable for modern web development, and the results were fantastic: response times of 1 ms while hitting an SQLite database on every request. I didn't go much further with the performance testing, but it could easily handle more than most of the enterprise crap I build ever needs.
C and CGI are a great simple combo. When you get down to it, 99% of web development is shuffling data from SQL to HTML and vice versa. It's so stupidly simple that the main code doesn't even hit the rough edges of C; there are almost no allocations, for instance, so no memory management to worry about, and everything that could possibly leak memory or cause security issues is in a handful of utility functions.
No MVC, no DTOs, no view models, no service layers, no templating, no client-side JavaScript, just reading from an SQL connection and printf'ing HTML to stdout. Without all the useless complications I ended up with far less code than the typical equivalent in a high-level language with a framework. I'm now fairly convinced that most of the projects I've worked on would have been better off this way.
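To make the shape concrete, a toy CGI program in this style might look like the following. This is a sketch, not the commenter's code; the SQL part is elided.

    #include <stdio.h>
    #include <stdlib.h>

    int main(void) {
        /* The web server execs this once per request and passes the request
           details through environment variables. */
        const char *q = getenv("QUERY_STRING");

        /* CGI response: headers, a blank line, then the body. */
        printf("Content-Type: text/html\r\n\r\n");
        printf("<!DOCTYPE html>\n<html><body>\n");
        printf("<h1>Hello from C + CGI</h1>\n");
        /* In real code you'd HTML-escape anything user-supplied. */
        printf("<p>Query string: %s</p>\n", q ? q : "(none)");
        /* ...open your SQL connection here and printf a row of HTML per
           result row... */
        printf("</body></html>\n");
        return 0;
    }

Any server that speaks CGI (Apache with mod_cgi, for instance) runs it; there's no framework and no resident process, just a binary that reads its environment and writes to stdout.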
As a fellow Python user, honestly I think you're fine. Rachel's issue with gunicorn stems from the gevent stuff, which you probably don't need and I would avoid unless you do.
I'm a bit mystified by Rachel's use case: a green thread hogging the process for so long that another request times out on the client? That means requests are doing large amounts of processing and latency requirements are super tight, which sounds like a very specialized use case.
If you're shy about managing your own memory and want to stay in a higher-level language, you could use the Quarkus or Micronaut frameworks, which are modern Java frameworks on the JVM. Kotlin is another JVM language that interoperates with Java and adds a lot more syntactic sugar; I don't have experience with Kotlin, so YMMV. The introduction of lambdas and the choice of modern post-EE Java frameworks really take a lot of the grind out of programming in Java. With the upcoming data classes it's going to get even better.
Actually, now that I'm re-reading this, it's a disaster. She claims to write this to help newcomers, yet everything is so cryptic.
`it kicks off a "serviceworker" thread to handle it.`
It doesn't even tell me how I should do this. I don't know what 'serviceworker' code looks like. I don't know what 'kicking off' means.
This post reads like it was made for maybe 3 or 4 people in the world who truly 'get it' and if you aren't in that elite club you're a terrible engineer, apparently. There's not even a lick of example code to get an idea of what is going on. Engineering is a big word that shouldn't be used in this blog post.
The problem with this kind of "benchmark" is that it doesn't measure anything relevant for realistic situations, where thousands of connections will behave in arbitrary, unexpected ways, like just stalling, and where real web applications have abysmal worst-case behaviour due to I/O, cron jobs running in the background, network hiccups, etc. You don't provision your systems with only best-case situations in mind. But sure, if you want to serve useless random numbers without ever even hitting the disk, a Raspberry Pi is able to saturate its network connection.
The C10k problem was challenging around the turn of the century. I suppose it's now not. I wonder how much CPU would be saved using an event-based architecture.
C10K was solved by switching to an event system like epoll or kqueue. This decreased the big O complexity of kernel<->user information flow so that you don't pay more for listening on more sockets.
C10M seems to be solved by colocating the network stack and app stack data structures by running the driver and the app in the same context. This can be achieved via DPDK-like schemes or pushing more into the kernel like Netflix's work to push TLS into kernel sockets that you can sendfile to.
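As a concrete taste of that "push more into the kernel" direction, sendfile(2) already copies a file to a socket without the bytes ever visiting userspace, and with kernel TLS configured on the socket (the Netflix work mentioned above) the same call works for encrypted connections. A hedged sketch, with error handling abbreviated and the helper name mine:

    #include <fcntl.h>
    #include <sys/sendfile.h>
    #include <sys/stat.h>
    #include <sys/types.h>
    #include <unistd.h>

    /* Send the whole file at `path` down an already-connected socket. */
    static ssize_t send_whole_file(int sock_fd, const char *path) {
        int file_fd = open(path, O_RDONLY);
        if (file_fd < 0)
            return -1;

        struct stat st;
        if (fstat(file_fd, &st) < 0) { close(file_fd); return -1; }

        off_t offset = 0;
        /* The kernel moves the data; no read()/write() copy loop in userspace. */
        ssize_t sent = sendfile(sock_fd, file_fd, &offset, st.st_size);

        close(file_fd);
        return sent;   /* may be a short count; real code would loop */
    }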
> By the early 2010s millions of connections on a single commodity 1U server became possible: over 2 million connections (WhatsApp, 24 cores, using Erlang on FreeBSD), 10–12 million connections (MigratoryData, 12 cores, using Java on Linux).
Even in 2007 when I was starting to cut my teeth on larger web traffic there was a lot of discussion around serving 10k concurrent connections. I remember being blown away by a graph showing high throughput at 70k concurrent by a YAWS server, and started following Erlang as a result.
Oh, wow, I vaguely remember running across eDonkey at one point. I don't think I ever realized it could handle that kind of load. I was in a position to mostly stick with Apache for various non technical reason basically for long enough that eventually Apache got to the point that it could handle the traffic I needed to deal with especially with a CDN in front of it.
You must know a lot in order to write a slow performing program. If you just do a simple program like in the article, it will be fast, even if you do stupid things like spawning a new thread and file descriptor for each connection.
Now in order to make it slow you must know enough stuff to introduce complexity to the program. To make a slow program you probably have to learn about frameworks, php, micro services, cloud databases, etc.
I know pcwalton has been banging the "just use threads" drum for a while now. Rust used to use a green thread approach, but if I recall correctly, they ripped all that out for just native threads with the idea that in most cases it's fast and efficient enough.
I remember when C10K was the big challenge, but even the naive approach of spawning a thread per connection now can handle that.
I've written a "just use threads" approach to a web server in Rust, and if you're just juggling IO and not doing any real work like most CRUD webapps, just about every other async-capable language scaled more gracefully and used less resources.
And tomorrow there will be an article describing an exploit in a web service written in C/C++.
Does she even see the size of the batteries included in a framework like Django? Here is some brand new information. Everyone knows Python/Django is slow and inefficient (same for rails) but they still use it for those batteries.
It is easy to write some web service in a low level language that just crunches some numbers for your benchmark. It quickly gets complicated when you start thinking about accounts, security, passwords, databases, orders, carts etc.
The premise is that people don't understand their system, or software, and need to be told how it works. So you have an admittedly ignorant person telling other ignorant persons that there are possibilities out there. That's... not really useful.
What is useful is to spend 2 days researching massively concurrent network applications, and find the ones that were already written, and see how they evolved over time. For the most part, it's just understanding your operating system and network protocols. Once you learn how they all work, the answers come quickly. This problem has been solved many times.
But for the most part, none of you need to know how this stuff works. You can write the crappiest network server in the world, and there are still so many other components you can wrap around it to make it scale that you never even need to get close to network optimization. So you may have to run 10 instances of your app; who cares? We have virtually unlimited everything these days. The answer to a poorly performing app is "just throw more cloud at it".
Also, I feel the need to remind everyone that single app instance handling 100K+ concurrent connections is a terrible fucking idea. What happens when the app crashes? What happens when the hardware dies? What happens when you need to, like, upgrade/restart the app? Several million SYN packets in 100ms, and default kernel TIME_WAIT settings, do not make great bedfellows.
Stuff like this works great until someone tries to DDoS you, or finds some buffer overflow that takes over the box. I don't like the new world either, but it was built this way for a reason.
She said it has no real purpose yet. That's key. No doubt she knows what she is writing about, and she also picked a favorable language. But the gunicorn folks have also been doing this server thing forever now; they probably have their share of stories.
Someone should try this in Erlang/Elixir. I'm gonna bet you it could handle hundreds of thousands of connections on a beefy machine (and millions of connections with optimizations).
Why, though? In real life, if you need to handle millions of users per second, I bet you're already part of FAANG, at which point you simply open offices in each country and deploy local servers.
Doesn't the WhatsApp story contradict what you're saying? They handled millions of users per second, weren't part of FAANG, and this helped them get acquired by FAANG at a high valuation for being a very nimble team with a nimble architecture.
How does that contradict me? I see it as exactly the opposite: they became part of FAANG exactly because they managed to handle those users. And let's be real, they handled millions of users per second only after becoming part of Facebook. Like it or not, FB is the no. 1 social network.
They got acquired in 2014. They hit 400M users in 2013. By then it was already THE messaging app for several European countries.
I know the number of users doesn't equal users per second. But we shouldn't pretend they weren't able to handle high traffic before Facebook acquired them.
400M users globally in 2013 and millions per second are not the same thing. Simple math says 4×10^8 / 24 / 60 / 60 ≈ 4.7k users per second (assuming each user shows up roughly once a day), which is about three orders of magnitude lower. Even allowing that users are not spread evenly over those 24 meridians, the Earth being mostly water, it still doesn't get anywhere near millions per second.
For how much time? Lab results and real-life results are different. Love it or hate it, currently there is only one company that can deal with millions of users per second, and that's Google. No one else does it: not Amazon, not Facebook including their WhatsApp (scaling my calculation above, the number only goes from ~5k to ~50k even if you say WhatsApp has 4 billion daily users, which I doubt it has), not Microsoft.
Each time the World Cup is on, Twitter goes down. Same for plenty of big names when they launch a hyped service (Blizzard, for example, is another one). Scaling up from the lab to real life, dealing 24/7 with those millions per second, is an entirely different beast.
Yeah, no. A lot of people can do this. I don't know why you think Google is special, but it's not. Maybe you can share your reasoning so that I can understand your angle.
Yeah. To be fair, those are WebSockets, which are sort of the same thing but not really (and performance optimization can be done more easily, since the server [mostly] decides when to push stuff).
Unfortunately this approach precludes GIL languages like Python. If you're able to use a language/runtime that is amenable to multithreading, then using a thread per connection works fine for most use cases (and it's probably easier than using whatever async/await interface your language has).
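For reference, the whole thread-per-connection shape fits in a screenful of C/pthreads. A minimal sketch; the port number, buffer size, and canned response are arbitrary, and error handling is mostly omitted:

    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <pthread.h>
    #include <stdint.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    static void *handle_client(void *arg) {
        int fd = (int)(intptr_t)arg;
        char buf[4096];
        /* Plain blocking read: this thread has nothing else to do anyway. */
        if (read(fd, buf, sizeof buf) > 0) {
            const char *resp = "HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok";
            write(fd, resp, strlen(resp));
        }
        close(fd);
        return NULL;
    }

    int main(void) {
        int lfd = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in addr = { .sin_family = AF_INET,
                                    .sin_port   = htons(8080),
                                    .sin_addr.s_addr = htonl(INADDR_ANY) };
        bind(lfd, (struct sockaddr *)&addr, sizeof addr);
        listen(lfd, 128);

        for (;;) {
            int cfd = accept(lfd, NULL, NULL);
            if (cfd < 0)
                continue;
            pthread_t t;
            pthread_create(&t, NULL, handle_client, (void *)(intptr_t)cfd);
            pthread_detach(t);   /* fire and forget; no join needed */
        }
    }

Build with cc -pthread. The GIL caveat above is exactly why the same shape in CPython only buys concurrency for the I/O, not for the CPU work in each handler.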
> probably easier than using whatever async/await interface your language has
Is it? I haven't used threads directly in a while, but I remember dealing with synchronization issues, problems that just don't exist in single-threaded Node with async/await.
I find the async Promise or Task to be a more useful abstraction than the thread. Although you do need threads, or a task dispatcher with a pool of threads, if you need to run CPU-intensive stuff.
> If you’re just handling requests there isn’t much for shared state
No, but if your single listener/server thread is managing epoll for all your file descriptors, you do have to have a way of synchronizing the worker threads with it, so they know when and when not to read from/write to their fd's. I assume Rachel is using some kind of semaphore or other threading synchronization mechanism for this.
It's true that the GIL prevents the specific pattern you're describing, but many people have been able to use multiple threads and processes with Python itself, and also with things like gunicorn or uWSGI.
But this isn’t engineering. This is the IT equivalent of building a bridge and driving successively larger trucks over it. In real engineering fields, you can do predictive analyses based on prior empiricism. There’s none of that in our fields until you’re talking about very small systems where, for example, the stack consumption can be determined in advance and the scheduler can give you guarantees about worst-case performance.
And this is to the detriment of all of us. That’s why you’ll never hear me call myself an engineer.
If “real engineering” worked the way you described, we would test planes by filling them full of passengers and flying them around the world.
We don't. We build wind tunnel models, we taxi them around at higher and higher speeds, we put them in machines that wiggle the wings at high loads. The first flight is a little hop and then right back down. Months later there might be a big ceremony with VIPs where the new plane takes a lap around the airport as its "first flight".
And there are mistakes along the way, giant ones that add years to the schedule and tiny ones that engineers argue over even telling their boss about.
Sure there’s planning and experience, just like she knew to use epoll and not select, and that Linux can handle thousands of threads per process. But there’s no magic to it, just lots and lots of human attention and testing along the way.
She developed a prototype based on a hunch and a whim. Great work, no doubt, but engineering isn't based on intuition and some experience. Real engineering has a goal or specification in mind, and then proves through modeling, analysis, and _finally_ testing that it meets those specs.
Whipping something up and then seeing what it's capable of doesn't qualify.
"Whipping something up and then seeing what it's capable of" in simulation or prototype is also known as "modeling and analysis". Prototypes and testing are definitely a valid, even central, part of engineering.
The gist of it was:
Other engineering disciplines use the techniques you've mentioned because of the costs, both time and money, associated with getting it wrong.
Software engineering lends itself to different methods of development and construction, as the costs associated with getting it wrong or making changes after the fact are much lower. (For most applications, anyways).
As such, (this definition would be another sticking point) these less rigid methods should still be considered engineering, with engineering being a balancing of resources with outcomes, not fixation on mathematical models.
We don't really know what is actually happening. IT is dramatically changing everything. It's not all bad, but it isn't all good either. I hate to give examples, since it is really vague what goes in and what comes out, and people tend to mistake an example for full coverage, but... for example, we have no idea what drives suicide rates.
> for example, we have no idea what drives suicide rates.
This is a good point but I wanted to make it a bit more clear.
Across a population we have a good idea of the risk factors and of the things that increase rates of suicide: deprivation, abuse, substance misuse[1], previous self harm.
The bit we have no idea about is how to apply these to an individual person to see if they're high or low risk of suicide.
There are a load of different tools that input lots of different information and put out a risk rating, and none of them are as good as just asking the person "what do you think your risk is?"
What is stress/load testing, then? Most large companies do perform something like this...
Most large companies do some stress/load testing first for any new system. Then they release to a few percent of the users (less than 10%) and gradually roll out to the rest of the user base.
I worked at Spotify when Tidal was launched. They had failed to do proper capacity testing, and the service failed under the load in the first week, and it showed. But most mature large companies tend to be really good at this.
It is remarkable how many tech companies have managed to stay mostly up with very few outages, given this whole pandemic situation, where everybody is online.
I'd say working on large-scale deployments is proper engineering... creating a simple webpage, maybe not. Deploying that webpage to millions of users, though, is.
Also, don't forget that bridges have been made since the dawn of time, while tech and the internet are very very young in human terms.
> It is remarkable how many tech companies have managed to stay mostly up with very few outages, given this whole pandemic situation, where everybody is online.
I must admit that it was very gratifying watching all our engineering decisions pay off when our system started seeing record high traffic every day and scaled effortlessly and with no outages - the traffic that came with lockdown is, even on our slower days, twice our planned for and tested for high water mark.
But it took us about 4 years of consistent effort in changing our organisational mindset all the way through, devs, testers, product managers, c-suite members, to get here. 4 years ago, we would've been waking everyone up and going without sleep for a couple of days trying to get something back online, then fixing the next system down the line that failed because of the load, then the next one, and then writing long post-mortems for our business team.
I think the analogy to mechanical engineering is something like this:
A seasoned engineer notices that everyone is only building suspension bridges all of a sudden.
They point out that you can span a stream with some bricks or rocks and a bit of mortar, and are ridiculed.
Next, they build a highway overpass out of concrete pylons, and stress test it to 10x the necessary engineering load, and point out it cost 10% as much as a typical contemporary suspension bridge. That’s this article.
Or vice versa: they notice that everyone is only building concrete bridges all of a sudden, disregarding the traditional and much cheaper approach of hanging boards under some rope handrails; upon being ridiculed, they build a suspension bridge across a river for 10% as much as a concrete bridge.
When you hear "Engineer", you often think of calculus, statistics, formal testing, requirements gathering, documentation, repeatable results, etc... along with a fundamental understanding of the problem space and possible solutions.
I think this is akin to NASA working on the Apollo program vs. someone in their garage attempting to build a go-cart for the first time.
When you just slap things together and see if they work - are you really engineering? Can you exactly repeat the process and achieve exactly the same result every time?
I think we often cross "research and development" with "engineering". Exploring a problem space and tinkering with concepts isn't engineering. Taking what you've learned, planning out and executing a solution to a precise set of requirements, and being able to repeat your steps and achieve those results again and again - is engineering.
>When you just slap things together and see if they work - are you really engineering?
Why not? That's basically what testing is, which was one of the attributes you attributed to "Engineer": formal testing.
>I think we often cross "research and development" with "engineering"
My general take:
Scientists primarily focus on learning and proving new knowledge & ideas. (i.e. they research)
Engineers focus in using proven knowledge and applying it to design and create things or solve problems. When things are not perfectly certain, they can prototype and do tests similar to how scientists do experiments (e.g. aerodynamics in wind tunnels). (i.e. development)
Good points, but usually when we talk about testing in the realm of engineering, it's a means of verifying something is within the bounds it was designed for... not just to see what happens.
We don't run test suites on our software to see what it does. We run test suites to validate it operates as it is supposed to.
I think the way you described testing is more in line with tinkering and research rather than engineering. It's experimentation, not testing.
When the outcome is unknown and unreliably unpredictable, it's research (tinkering). When it's predictable and has a known, repeatable outcome, it's engineering.
>Good points, but usually when we talk about testing in the realm of engineering, it's a means of verifying something is within the bounds it was designed for... not just to see what happens.
Yeah that's fair.
I had originally skimmed the article, but after re-reading it I see that the author apparently admits they didn't put any careful thought into what they made. No real goal; they just slapped stuff together, so to speak. Which, I'd agree, doesn't quite sit as engineering to me...
When saying "this is engineering" the article refers to all the engineering effort at the OS level that went into making this possible.
As the title says, this is dumb code. But it makes use of years of engineering effort to deliver a result which can get you very far without thinking about the low level.