How fast is ASP.NET Core? (dusted.codes)
93 points by dustedcodes on Nov 15, 2022 | 29 comments



Really frustrating to see the amount of effort that went in to gaming those benchmarks when catastrophic performance issues in core Microsoft-authored libraries go unfixed for years. https://github.com/dotnet/SqlClient/issues/593


This is my favorite issue at the moment.

It's something that is completely unintuitive: MS wants you to use as much async as possible and provides an async API, but hasn't fixed the damn issue for years. Most senior devs won't know about this; they'll use the methods and probably won't even notice it in most production scenarios. At least not until it hits them like a ton of bricks.
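
For anyone who hasn't clicked through: assuming it's still the "async reads of large values are extremely slow" problem described in the issue, the code that hits it looks completely innocent. A rough sketch (table and column names are made up):

    // Reading a large NVARCHAR(MAX)/VARBINARY(MAX) value through the async API
    // is where it bites; the same query run synchronously is fine.
    using var cmd = new SqlCommand(
        "SELECT Payload FROM Documents WHERE Id = @id", connection);
    cmd.Parameters.AddWithValue("@id", id);

    using var reader = await cmd.ExecuteReaderAsync();
    while (await reader.ReadAsync())
    {
        var payload = reader.GetString(0); // a big value spanning many network packets
    }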

Another favorite .NET tidbit of mine is the existence of Microsoft.VisualStudio.Threading. If you use a lot of async code with UI you will eventually run into deadlocks. MS noticed that, too, and wrote this library for Visual Studio (hence the name) to work around that and then released it to the public.
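
The classic deadlock it's built around looks something like this (sketch; WPF-style names are illustrative):

    // UI event handler blocking on async work.
    private void OnButtonClick(object sender, RoutedEventArgs e)
    {
        // .Result blocks the UI thread, but LoadAsync's continuation was captured
        // on the UI SynchronizationContext and needs that same thread to resume.
        // Neither side can make progress: deadlock.
        var text = LoadAsync().Result;
        TitleLabel.Content = text;
    }

    private async Task<string> LoadAsync()
    {
        await Task.Delay(100); // stand-in for real async I/O
        return "done";
    }

The library's JoinableTaskFactory exists precisely so you can block on async work (joinableTaskFactory.Run(...)) without that hang.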

Or:

  enumerable.Single(predicate) // slow
  enumerable.Where(predicate).Single() // faster
Also fun (though it's understandable why they don't fix it): VS2019 shipped a new "code fix"/analyzer that very annoyingly tells you to convert the latter into the former for "code readability", making your code slower.

Don't get me wrong, I like .NET a lot but it has quite a few footguns.


Why is the latter faster than the former?


The latter has a fast path for arrays and lists, the former doesn't.

For Entity Framework (Core) and LINQ to SQL it doesn't matter, though.
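
A quick way to see the difference for yourself (rough sketch; use BenchmarkDotNet for real numbers, and expect the gap to vary by runtime version):

    using System;
    using System.Diagnostics;
    using System.Linq;

    var list = Enumerable.Range(0, 10_000_000).ToList();

    var sw = Stopwatch.StartNew();
    var a = list.Single(x => x == 5_000_000);          // no specialized path
    Console.WriteLine($"Single(predicate):         {sw.ElapsedMilliseconds} ms");

    sw.Restart();
    var b = list.Where(x => x == 5_000_000).Single();  // Where() has list/array-specialized iterators
    Console.WriteLine($"Where(predicate).Single(): {sw.ElapsedMilliseconds} ms");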


> It does pretty much what you say: it repeatedly copies data (2 bytes at a time though), without materializing a string. So if the string you try to read is 10 packets long, you end up copying 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10 = 55 packets. If you need 100 packets, you end up copying them 5050 times. So to read N bytes you copy O(N^2) bytes: it's impossible to scale this way. Whether bytes are copied 2 at a time or using a more optimal algorithm doesn't really matter.

Good lord. It took them about a year to find this obvious O(n^2) hot loop, and they're unable to fix it.

To me it seems like the people writing that parser are just doing it piecemeal, adding little spot fixes instead of having a consistent and robust stream parser that won’t crash or slow down to molasses if you look at it wrong.

In several places they mention that the parser was never tested with partial buffer fills because replayed packet captures from disk don’t do that. Real network sockets do.

Suddenly I can see how the security bug happened where data was getting received by the wrong SqlConnection under high load.

I bet they have a bunch of concurrency bugs in that spaghetti code…
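
To make the quoted math concrete, the shape of the problem is roughly this (not the actual SqlClient code, just the pattern):

    using System;
    using System.Collections.Generic;
    using System.IO;

    // Quadratic: every new packet triggers a re-copy of everything received so far.
    static byte[] AccumulateQuadratic(IEnumerable<byte[]> packets)
    {
        var buffer = Array.Empty<byte>();
        foreach (var packet in packets)
        {
            var next = new byte[buffer.Length + packet.Length];
            Buffer.BlockCopy(buffer, 0, next, 0, buffer.Length);   // old data copied again every time
            Buffer.BlockCopy(packet, 0, next, buffer.Length, packet.Length);
            buffer = next;
        }
        return buffer; // N packets => roughly N^2/2 packet copies in total
    }

    // Linear: append as you go, materialize once at the end.
    static byte[] AccumulateLinear(IEnumerable<byte[]> packets)
    {
        using var ms = new MemoryStream();
        foreach (var packet in packets)
            ms.Write(packet, 0, packet.Length);
        return ms.ToArray();
    }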


I whipped up a realistic version of the Fortunes benchmark as would be written by a typical .NET developer. I.e.: Someone writing normal code instead of trying to game a benchmark.

I used the ASP.NET default project template and settings for .NET 7, using Dapper.NET, Microsoft.Data.SqlClient, and SQL Server 2019. I disabled the default layout to match the HTML expected by the Techempower benchmark, but I used a Razor template.

On an 8-core laptop[1] this yields about 33K requests per second, which is a far cry from the ~200K numbers being reported on Techempower.

I suspect I could nearly double that by optimising the test (e.g.: running the test on a separate machine), turning off various background processes, etc... none of which is "gaming" the application code itself. For example, using HTTP instead of HTTPS bumps the numbers up to 40K rps all by itself.

On the other hand, simply enabling Session State causes that to drop by about 4K rps.

I would like to see something akin to Techempower, but for a more realistic app that has JWT-based auth, sessions, multiple database queries, and reasonably complex markup. HTTPS would have to be enabled, compression on, and HTTP request logging enabled. Basically, the framework should be configured in the same way it would be in production.
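
For the curious, "configured the same way it would be in production" means roughly this shape of Program.cs (a sketch; names and ordering are illustrative, not the exact code I ran):

    var builder = WebApplication.CreateBuilder(args);

    builder.Services.AddRazorPages();
    builder.Services.AddResponseCompression();
    builder.Services.AddDistributedMemoryCache();   // backing store for sessions
    builder.Services.AddSession();
    builder.Services.AddHttpLogging(o => { });      // request logging on

    var app = builder.Build();

    app.UseHttpsRedirection();                      // HTTPS stays enabled
    app.UseResponseCompression();
    app.UseHttpLogging();
    app.UseSession();
    app.MapRazorPages();

    app.Run();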

Query code:

    // "fortunes" is a List<FortuneEntry> property on the page model class (declaration not shown).
    public async Task OnGetAsync()
    {
        fortunes = (await _connection.QueryAsync<FortuneEntry>("SELECT id, message from Fortune")).ToList();
        fortunes.Add(new FortuneEntry(0, "Additional fortune added at request time."));
        fortunes.Sort((a, b) => a.message.CompareTo(b.message));
    }
    } // end of the page model class

Razor code for the table:

    <table>
    <tr><th>id</th><th>message</th></tr>
    @foreach (var f in Model.fortunes )
    {
        <tr><td>@f.id</td><td>@f.message</td></tr>
    }
    </table>

[1] https://ark.intel.com/content/www/us/en/ark/products/213803/...


> HTTPS would have to be enabled, compression on, and HTTP request logging enabled. Basically, the framework should be configured in the same way it would be in production.

Kestrel can be directly internet-facing, but that's not its wheelhouse. Instead, in a typical high-traffic scenario, it sits behind a reverse proxy like nginx or Apache, which in turn handles HTTPS, compression, etc.[1]

[1] https://learn.microsoft.com/en-us/aspnet/core/host-and-deplo...


The concept of "offloading" TLS and compression belongs firmly to the 1990s. As you can see from my benchmark, the percentage difference is small, while keeping it in-process means both the complexity and the latency are lower.

To correctly handle HTTPS offload, web frameworks have to "pick up" the X-Forwarded-For and X-Forwarded-Proto headers. This needs additional config or code in many frameworks, including ASP.NET Core. I.e.: https://learn.microsoft.com/en-us/aspnet/core/host-and-deplo...

If you forget, the result is a redirect loop. By "you" I mean a developer working for a company that isn't the one trying to deploy the code behind NGINX. This happens to me every few months: some Random Product(tm) doesn't work properly because it insists on HTTPS despite already sitting behind an HTTPS ingress solution.

No big deal you say, just add the setting and move on? Bzzt... now you've allowed headers to be injected into your applications by random end-users out on the internet, spoofing source IP addresses in your logs, etc...

So now your web app code must be aware of your authorised reverse proxy servers. This also has to be wired up, managed, and set in a config file somewhere.
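
In ASP.NET Core that wiring looks roughly like this (the proxy IP address is obviously illustrative):

    using Microsoft.AspNetCore.HttpOverrides;
    using System.Net;

    builder.Services.Configure<ForwardedHeadersOptions>(options =>
    {
        options.ForwardedHeaders =
            ForwardedHeaders.XForwardedFor | ForwardedHeaders.XForwardedProto;
        // Without this, any client on the internet can spoof these headers.
        options.KnownProxies.Add(IPAddress.Parse("10.0.0.5"));
    });

    // ...

    app.UseForwardedHeaders();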

You now also have a new point of failure, a new location that needs performance tuning, scaling, etc...

Fundamentally, a web server ought to be able to stream static content from memory cache about as fast as the wire can handle it. In which case every "hop" you add also has to have the same throughput! If your web server farm scales to 10 servers of 1 Gbps each, then your reverse proxy must scale to 10 Gbps, or equivalent.

For 'n' layers, the usable fraction of your total available bandwidth drops to 1/n!

Take a typical cloud-hosted Kubernetes solution with a web front end talking to an API tier (god help me I've seen too many of these!), and you could end up with 10+ layers, for 10% efficiency. E.g.:

Cloud load balancer -> Kubernetes Ingress -> Kubernetes Service -> Kubernetes NAT -> NGINX pod -> ... 3x ...

If you've ever wondered why modern apps "feel slow" despite theoretically great throughput... now you know.


I'm curious what this would be like without Dapper. Dapper's performance was awful compared to rolling our own mapping that generated a static collection of delegates. This was quite a while ago with .NET Framework though.


As far as I know, Dapper performance is within a few percent of "hand rolled" data reader code, because it dynamically compiles the query reader code and then caches the delegate.

Update: a quick test with hand-rolled async query code got me to 41.5K rps, up 3.8% from 40K with Dapper, which doesn't seem worth it. Using non-async produces 43.2K rps for 8% higher perf. That goes to show that using async doesn't necessarily improve performance even under high load!
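
For reference, "hand-rolled" here means roughly this shape, versus the Dapper one-liner (same FortuneEntry type and _connection field as in my snippet above; a sketch, not the exact benchmark code):

    // using Dapper; using Microsoft.Data.SqlClient;

    // Dapper:
    var fortunes = (await _connection.QueryAsync<FortuneEntry>(
        "SELECT id, message FROM Fortune")).ToList();

    // Hand-rolled data reader:
    var result = new List<FortuneEntry>();
    using var cmd = new SqlCommand("SELECT id, message FROM Fortune", _connection);
    using var reader = await cmd.ExecuteReaderAsync();
    while (await reader.ReadAsync())
    {
        result.Add(new FortuneEntry(reader.GetInt32(0), reader.GetString(1)));
    }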

Entity Framework used to have atrocious performance, but it has been rewritten to work more like Dapper and is quite fast now.

The general issue is that any form of database query in .NET will produce a lot of garbage on the heap. Reading a bunch of "string" values out of a database will... allocate a bunch of strings.

It would be theoretically possible to convert this kind of code to use "ref" types, "Span<char>", etc... but then the Razor templating engine wouldn't be able to use it. Similarly, it would be pointless if the output is something like JSON, produced by serializing an object graph. Garbage, garbage, and more garbage for the GC to clean up.

This is why Techempower is so unrealistic. Real developers never write code with "hand rolled delegates", and they shouldn't. They should be writing clear, concise, high-level code using ORMs and template languages like Razor. Not emitting byte arrays to shave nanoseconds off.

Ideally, languages and frameworks should let developers have their cake and eat it too. Rust, for example, generally allows high-level code to be written while minimising the amount of heap usage.


A redditor wrote an excellent writeup on why this specific set of benchmarks has many issues due to how easily it can be gamed.

https://www.reddit.com/r/dotnet/comments/yuxkk7/comment/iwca...


> An untrained eye might be thinking that builder.UseHttpApplication<T>() is a method that comes with Kestrel, but that is not the case either.

As a regular code reviewer, I've grown to dislike extension methods quite a bit, as they violate the principle of least surprise.

For some reason, and unfortunately, the ASP.NET Core team seems to promote this feature quite a bit.


> As a regular code reviewer, I've grown to dislike extension methods quite a bit, as they violate the principle of least surprise.

I disagree. The common use case for extension methods in ASP.NET Core apps is to extend the builder to apply configuration, just like in every other project. This pattern is so pervasive that the surprising outcome would be if a builder method were not an extension method.


Yes, this is a use case, but I would hardly call it the common one. I agree with GP: it really makes things harder to read not knowing whether something is an actual method on a class or just a static helper function with syntactic sugar on top.

I have begun writing all static method calls as static method calls, even when they are defined as extension methods. It's a little less terse, but that's worth the decreased surprise. You can do this with the ASP.NET Core configuration calls as well.
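
Concretely, picking up the UseHttpApplication example from the article (the static class and type names here are made up for illustration):

    // Hypothetical extension method living in some static class:
    //   public static class HttpApplicationExtensions
    //   {
    //       public static IApplicationBuilder UseHttpApplication<T>(this IApplicationBuilder builder) { ... }
    //   }

    // Extension-method syntax: reads like a built-in member of the builder.
    builder.UseHttpApplication<MyHttpApplication>();

    // Same call written as an explicit static method call: the owning class is visible,
    // so a reviewer can tell at a glance that it is not framework code.
    HttpApplicationExtensions.UseHttpApplication<MyHttpApplication>(builder);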


I have the opposite feeling about them. I think they make the code more readable.


Yes, done well it adds that benefit, and if the extension method is truly universal to the type then it's fine to proceed with some caution. LINQ is a great example of this.

Where it falls apart is when it turns out that slightly different behavior is needed based on sub-type.

The right thing is to then move past the extension method, but it's more compelling to take the short-term gain of bolting more logic onto the extension method.


Nice article, which actually goes behind the numbers to see which version of the .NET benchmarks is comparable to the other languages. Summary: .NET is fast enough. Java is faster and Node is slower. C++ and Go are a little bit faster. Use .NET if you like the language and ecosystem.


.NET is not "fast enough", it is 2x slower than Java.

"Fast enough" is subjective: for some people NodeJS is "fast enough", for some other people Python is "fast enough".

Cheating in benchmarks, I guess, is part of the job at these big companies; gotta find ways to justify your huge salary.


All depends on context.

One could say that:

In the "fortunes" category, the giraffe F# benchmark, which is a single 150 line file used 100% as advertised in the documentation performs better than many java frameworks, some CPP frameworks, etc: https://github.com/TechEmpower/FrameworkBenchmarks/blob/mast...

However, that's not accurate. It's an incomplete take based on a fragment of the data. It's a matter of data presentation and sympathy/empathy with your audience (IMHO).


Fast enough, since most of the time is spent waiting for IO (database, disk). For most situations it is better to choose the tech stack you know and then optimize it afterwards.


Very foolish to use benchmarking as marketing. Yes, it is very convenient (as a .NET fanboy I appreciate having something where .NET outperforms others), but heck, the next framework can outperform you while you're still giving your presentation, people can complain about your implementation strategies, or about whether you have or haven't cheated enough.

Write "fast". That is good enough. No comparison needed. The platform benchmark hits hardware limits, the fortunes hits numbers no one of us can code meaningful database/resource interactions against and no one from us wants to use barebone middleware for responses.



FWIW I asked a similar question in the discussion section of the repo, specifically since there is a column in the output to help with this sort of sorting (Implementation Type) and received an answer: https://github.com/TechEmpower/FrameworkBenchmarks/discussio...

The issue here seems to be of data-display and intent of information.

From what David Fowler has to say: https://twitter.com/spetzu/status/1592255871199096833, it does seem as though the engineering side is using the benchmarks in the expected way: multiple implementations to see "what perf impact is there to removing X part of the code", piece by piece.

The benchmarks display is a comparative-between-frameworks visualization. For that to be realistic, the types of implementations would need to be the same (hence my questioning the "Implementation Approach" column). Instead, you can find various quotes online of people comparing frameworks based on the techempower scores (usually just the fortunes or plaintext), which is disingenuous at best.

I think that a more varied and utilized implementation-approach column could help alleviate this aspect of it.

The other valuable data that doesn't have the best UX for access is: how is a single framework test doing over time? You can head to tfb-status and download the data for each run, but then you'll need to correlate commit IDs with your own changes to build a chart.

Either way, I can say that I do like the idea of the benchmarks, and I've learned some interesting things diving into some of the more stripped-down framework implementations found within.


As of .NET 6/7, I'm pretty sure the implementation could be refactored to be much simpler and less reliant on hacks: minimal APIs give good performance, and outputting raw bytes as "Hello World!"u8 seems innocent enough, which many other benchmark implementations do anyway.
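
Something like this (a sketch; caching the payload as a byte[] is just the obvious move, not anything clever):

    var app = WebApplication.CreateBuilder(args).Build();

    // C# 11 UTF-8 literal; materialize it once so each request only writes bytes.
    byte[] payload = "Hello, World!"u8.ToArray();

    app.MapGet("/plaintext", () => Results.Bytes(payload, "text/plain"));

    app.Run();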


> minimal APIs give good performance

I thought the minimal APIs were just a bit of syntactic sugar over the same underlying architecture.

Are you saying they are actually different in a more fundamental way?


You are correct, they serve as a convenience and conciseness feature.

However, the way they are implemented lends itself well to ASP.NET Core's request-handling pipeline, which means the traditional controller-based pattern is actually more expensive.

The fastest option when using minimal APIs is to take HttpContext directly, which is convenient anyway when you are outputting raw bytes.
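
I.e. something along these lines (sketch only):

    byte[] payload = "Hello, World!"u8.ToArray();

    app.MapGet("/plaintext", (HttpContext ctx) =>
    {
        ctx.Response.ContentType = "text/plain";
        ctx.Response.ContentLength = payload.Length;
        // Write the cached bytes straight to the response body, no serializer involved.
        return ctx.Response.Body.WriteAsync(payload).AsTask();
    });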


I believe ASP.NET is vastly better than most other web application frameworks and relatively simple to use.

Also, I love the built-in OutputCache support, which is absolutely critical in reducing latency and load on a web server.
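
In .NET 7 that's roughly this (the endpoint and the policy values are just for illustration):

    builder.Services.AddOutputCache();

    var app = builder.Build();
    app.UseOutputCache();

    // Cache the rendered response for a short window instead of re-rendering it per request.
    app.MapGet("/fortunes", () => "...")
       .CacheOutput(p => p.Expire(TimeSpan.FromSeconds(10)));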


Wow, surprisingly good read, thank you! I never bothered to check the code of the .NET apps on TechEmpower, goes to show I should have.


This reminds me of the Volkswagen scandal in a certain way.





