Hacker News
Haskell Snap Framework templating 3000x faster with new release (snapframework.com)
114 points by LukeHoersten on Dec 10, 2012 | 32 comments



5 microseconds translates to 5000 nanoseconds. Assuming a ~1 GHz clock and an average CPI of 0.5, we have about 10,000 instructions to render the template. Or 30-50 main-memory fetches. This isn't too shabby.

But the numbers before the improvement. Ouch. Those were bad.


I'm not sure measuring the number of CPU instructions makes much sense anymore.

In my experience micro-optimizing code, the majority of time is spent waiting for memory (bandwidth waits or even worse: latency costs).


Precisely. This is why I mentioned the 30-50 mems. A typical main memory fetch is somewhere in the 100-200ns range on modern hardware. I tend to measure the effectiveness of an algorithm based on, roughly, how it accesses memory.

Caches play a role as well, of course. An L1 hit is around 1 ns, sometimes even less; an L2 hit is in the vicinity of 5-10 ns, which easily translates to some 30-40 instructions on modern hardware.

It also tells you that hunting for faster execution by compressing the instruction count is not going to buy you much extra speed nowadays. The key to fast programs is data representation. Good data representation.
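Putting these numbers together as a sketch (assuming a ~1 GHz clock, i.e. one cycle per nanosecond; the 150 ns fetch cost is a midpoint of the 100-200 ns range above):

```haskell
-- Back-of-envelope budget for a 5-microsecond template render.
renderNs, cpi, memFetchNs :: Double
renderNs   = 5 * 1000   -- 5 microseconds, in nanoseconds
cpi        = 0.5        -- cycles per instruction
memFetchNs = 150        -- one main-memory fetch, ~100-200 ns

-- At 1 cycle/ns, cycles = renderNs; instructions = cycles / CPI.
instructions :: Double
instructions = renderNs / cpi          -- 10,000 instructions

-- Or the same budget spent entirely on main-memory fetches:
fetchBudget :: Double
fetchBudget = renderNs / memFetchNs    -- ~33 fetches, i.e. the "30-50" range
```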


I wanted to ask how that is even possible, then read:

> However, we realized that a lot of the transformations could be done at load time and preprocessed to an intermediate representation.

Lazy computation has its disadvantages. Still, impressive gain.


This isn't about deferring evaluation within a phase; it's about compilation over interpretation, and the phase distinction.


> Lazy computation has its disadvantages. Still, impressive gain.

After reading the article I don't see anything about lazy evaluation. What am I missing?


I don't think you ARE missing anything. From my reading, the gains are obtained by pre-computing the string concatenation a single time (rather than while rendering) for all cases in which it can be.

It's similar to moving something from runtime to compile time (although those distinctions don't quite apply here).
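The idea can be sketched with a toy template type (hypothetical types, not Heist's actual API): collapse adjacent static text once at load time, so the per-request render only fills in the dynamic holes.

```haskell
-- Toy template AST; hypothetical, not Heist's actual types.
data Node = Text String | Var String
  deriving (Eq, Show)

data Chunk = Static String | Dynamic String
  deriving (Eq, Show)

-- Naive render: walks the tree and re-concatenates on every request.
renderNaive :: [(String, String)] -> [Node] -> String
renderNaive env = concatMap step
  where
    step (Text s) = s
    step (Var v)  = maybe "" id (lookup v env)

-- Load-time pass: merge runs of static text once, up front.
compile :: [Node] -> [Chunk]
compile = merge . map toChunk
  where
    toChunk (Text s) = Static s
    toChunk (Var v)  = Dynamic v
    merge (Static a : Static b : rest) = merge (Static (a ++ b) : rest)
    merge (c : rest)                   = c : merge rest
    merge []                           = []

-- Per-request render now only splices in the dynamic parts.
renderCompiled :: [(String, String)] -> [Chunk] -> String
renderCompiled env = concatMap step
  where
    step (Static s)  = s
    step (Dynamic v) = maybe "" id (lookup v env)
```

Both renders produce the same string; only the second amortizes the static concatenation across all requests.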


Ah, that was exactly how I read it. Basically what Yesod uses Template Haskell for in a variety of cases.


>I don't think you ARE missing anything. From my reading, the gains are obtained by pre-computing the string concatenation a single time (rather than while rendering) for all cases in which it can be.

Lazy evaluation: waiting until a computation needs to be done to perform it.

Problem here: it was inefficient to do a particular computation at the very moment before it was needed.

So, how is this not a lazy evaluation problem?


Laziness is a specific property of how variable binding and application works in a language, which is not at issue here.

Unless I am mistaken, they didn't change the template language's evaluation strategy from call-by-need to call-by-value.

They did change the implementation from an interpreter to a compiler, though.


I had the same line of thinking as SilasX. Conceptually, the change is that instead of deferring work to the last moment available, it is now done immediately. This is the very difference between lazy and eager computation. As I'm not a Haskeller, I didn't immediately realize the strong connotation with language features. Sorry for the confusion.


Lazy evaluation implies that you keep the result for re-use later if it is shared between multiple users. So if laziness causes performance problems, it is generally:

* A space leak: The deferred computation holds lots of data alive for when it will be needed, whereas computing it would reduce all that data into a small result.

* A latency problem (sometimes called a "time leak") where we may idle around for a while, and only when some value is desperately needed, start computing it. We could preemptively compute the result to hide its latency.

Neither was the problem here. The problem was that part of the computation of the result was redone on every render.

Sharing that work between invocations was not trivial, and to get it they found it easier to move the computation to template loading time ("compile time").
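The space-leak bullet can be illustrated with the classic foldl example (unrelated to Heist; it just shows the general phenomenon):

```haskell
import Data.List (foldl')

-- Lazy foldl defers every (+), building a chain of thunks that keeps
-- the whole traversal alive; strict foldl' reduces to a small Int as
-- it goes. Same result, very different memory behavior on large input.
lazySum, strictSum :: [Int] -> Int
lazySum   = foldl  (+) 0   -- O(n) thunks before the first addition runs
strictSum = foldl' (+) 0   -- constant-space accumulator
```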


Thanks. I agree that it does seem too good to be true. I was also quite surprised when I saw how much of a difference it was the first time I ran the benchmarks. The great thing is that the bigger your page is, the bigger the improvement will likely be.


I find it interesting that Heist looks to be inspired by or based on XSLT, yet XSLT is not mentioned anywhere in the documentation. Is it just a happy coincidence?


My inspiration for Heist came from Lift's template system. That and FBML. Heist is essentially a generalized system for building domain specific markup languages.


Independent of Haskell, Heist is one of my favorite HTML/XML template engines so it's great to see such huge advancements.


Could you expand on why it is that you favor it?

I ask because I like the idea of a stateless xmlish template language, but I wonder what this offers over the zillion existing solutions.

"Separates view and business logic" and "enables DRY design" are valuable goals, but most template languages have them.


Good question. A few reasons:

1. Heist allows you to define your own HTML/XML tags in the host language (Haskell in this case). This means you're only dealing with (an extended) XML document when doing layout and design so all the normal XML tools still work.

2. Some popular template engines try to separate logic and design but end up letting you cheat a little. Any time you want/have to cheat and put logic in the template, it really was a shortcoming of the template engine. In Heist, you can't cheat but you never want to.

3. The reason #1 and #2 work is because Heist's "recursively applied splices" is just the right abstraction. My HTML templates end up looking just as pretty as well factored Haskell code. Heist makes the perfectionist in me happy.

In short, I would say you're right here: '"Separates view and business logic" and "enables DRY design" are valuable goals, but most template languages have them.' But just because most template languages have these goals doesn't mean they've achieved their goals. Heist, in my experience and opinion, does achieve these goals.
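Point 1 and the "recursively applied splices" idea can be sketched like this (hypothetical simplified types; real Heist splices are monadic and operate on xmlhtml's node type):

```haskell
-- Hypothetical simplified model, not Heist's actual API.
data Node = Element String [Node] | TextNode String
  deriving (Eq, Show)

-- A splice: the custom tag's children in, replacement nodes out.
type Splice = [Node] -> [Node]

-- A <currentUser/> tag implemented in the host language:
currentUser :: String -> Splice
currentUser name _children = [TextNode name]

-- Recursively apply bound splices: splice output is itself re-processed,
-- so splices compose with ordinary markup and with each other.
apply :: [(String, Splice)] -> [Node] -> [Node]
apply splices = concatMap go
  where
    go (Element tag kids) =
      case lookup tag splices of
        Just sp -> apply splices (sp kids)
        Nothing -> [Element tag (apply splices kids)]
    go t = [t]
```

Because the document stays (extended) XML, everything that isn't a bound tag passes through untouched, which is why standard XML tooling keeps working.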


From their compiled heist docs [1]:

There are two things that compiled Heist loses: the ability to bind new splices on the fly at runtime and splice recursion/composability.

I haven't checked or read the doc thoroughly, but if it means what I think it means - all we get is hierarchical splices. Which is still a lot, but it's not quite as magical.

[1]: http://snapframework.com/docs/tutorials/compiled-splices


We still keep some of the magic by allowing you to run the old "interpreted" style splices at load time. These don't have access to dynamic data, but they do have recursion/composability. This combination retains most of the power while allowing a huge speed increase. It just means that to take advantage of both you have to structure things in a certain way.

At this point it seems to me that this structure also ends up being a desirable one for organizational reasons. But the jury is still out as far as whether there will still be reason to want more. We're aware that there might be good reasons to support this extra power and I have a pretty good idea of how it would be implemented. But I want to get more people using it in the real world before we address that issue.


I'm not him, but personally I like it because html is actually a pretty good language for markup. I find any custom syntax to be worse than just html, and you lose the ability to use standard html tools, syntax highlighting, etc.

And as much as it may be possible to separate logic from presentation in a typical PHP/ASP/JSP style template, I've never actually seen it done. When something is made awkward, people tend to choose the more convenient approach, so you see an unfortunate amount of nested loops and conditionals in most templates. Being able to have designers write templates by simply telling them "anywhere you want dynamic content, just make up an appropriately named tag for it and pretend it is part of html" is really nice.


Right on here. Exactly how I feel.

Generating HTML is really the whole point of a web framework so it better be awesome at doing so. Heist does this well.


thank you both, but I think I failed to express myself: what I meant to ask is: how is this better than other xml based templates, such as wicket, TAL, Kid, Genshi etc.

I would be led to understand, given your comments, that Heist does not allow control structures in the templates, but looking at some snap code[0] it would seem iteration is right there. Which makes sense I guess.

Or am I missing something, and there is a fundamental difference between Heist's

    <posts:reverseChronological>
      <a href="${post:url}">stuff</a>
    </posts:reverseChronological>

and, say, Genshi's

    <a py:for="post in reverseChronological" href="${post.url}">stuff</a>


Is the difference, and thus your preference, in the fact that posts:reverseChronological works more like a function call taking the content as argument, rather than a "classic" loop?

[0] https://github.com/snapframework/snap-website/blob/master/bl...


I think the difference is that Genshi's py:for appears to be a construct provided by the template system. In Heist, posts:reverseChronological, like you say, is just a function call. Looping doesn't happen in the template; it happens on the Haskell side. Genshi has to have a different construct, py:if, for conditionals. Heist doesn't need another construct: a function call serves both purposes. This has several positive effects. First, it means Heist's core is simpler, since it only has one fundamental abstraction with enough expressive power to implement all of Genshi's special-case keywords. And second, I think it gives a clearer separation between logic and view.
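Concretely, the host-side loop might look like this (a sketch with hypothetical simplified types, not the real splice from the snap-website code):

```haskell
-- Hypothetical simplified types; not the actual Heist API.
data Node = Element String [Node] | TextNode String
  deriving (Eq, Show)

-- Fill a <post:url/>-style hole in the tag body (simplified to a tag
-- per variable rather than ${...} attribute syntax).
fillUrl :: String -> Node -> Node
fillUrl url (Element "post:url" _) = TextNode url
fillUrl url (Element tag kids)     = Element tag (map (fillUrl url) kids)
fillUrl _   t                      = t

-- The loop lives in Haskell: the splice repeats its children once per
-- post. The template itself contains no iteration construct.
reverseChronological :: [String] -> [Node] -> [Node]
reverseChronological urls body = concat [ map (fillUrl u) body | u <- urls ]
```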


>how is this better than other xml based templates, such as wicket, TAL, Kid, Genshi etc.

I don't think it is. To me it is just the Haskell template engine in that style (which is the style I prefer). That style of template engine is in the minority, so most comparisons are vs. either mixed styles (PHP/ASP/JSP/Rails) or custom syntaxes (Mustache, Haml, etc.). What is good about Heist certainly applies to similar template engines like Lift's, Zope's, etc.


"Built for speed from the bottom up. Check out some benchmarks." from the frontpage of snapframework.com

"When we originally wrote Heist, speed was not our goal." from the link submitted here

Feels like a contradiction; couldn't they just say that it turned out fast enough on the first try?


I think that comment is referring to the framework / server, which is quite fast. Heist is a templating system authored by the same people that was not intended to be fast (though now it is). It is somewhat confusing, but I don't think the work they did on the framework/server (which is totally separate from Heist) should be discounted as "fast enough on the first try."


Correct. We always marketed Heist as a more experimental part of the framework as a whole. The server and associated API was initially our primary focus.


so why did the api have to change? couldn't the compilation be done on first use? is the api change simply a change of the encapsulating monad (guessing wildly)?

not trying to bash haskell, but i think there's an interesting q about how well it (or any other language) can hide changing implementation details (particularly major ones like a compilation phase) behind an unchanging api.

or maybe that would have been possible, but the api changed for other reasons (the general cleanup)?

really interesting article btw. would have loved more detail... an explanation of introducing compilation in haskell with example would be pretty cool (pretty sure either pg or norvig has written one - with lisp - that i vaguely remember reading years ago).


The changes were significant because Heist isn't just an API. It's an inversion of control where you provide routines that get run for various parts of your DOM. They used to be functions that took a node and returned a list of nodes. In order to do the optimizations that we wanted to do we had to change the type signature of the callbacks that the users write to something that took a node and returned a special data structure.

We did actually preserve the old API, so you can migrate without making significant changes to your code. Most of those changes are because of the general cleanup. So maybe my statement about big breaking changes was misleading: they're big breaking changes IF you want the performance increase; otherwise things still work the way they did before. In fact, the process of implementing this refactoring impressed upon me that the old paradigm was even more important than I initially realized.
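The shape of that signature change can be sketched like this (hypothetical simplified types, not the exact new Heist API): instead of returning nodes at render time, the callbacks yield a chunk list at load time, where static chunks are already-finished text and dynamic chunks are actions run per request.

```haskell
import Data.IORef

-- Hypothetical simplified model of the compiled output. Static chunks
-- were rendered once at load time; Runtime chunks fetch dynamic data
-- on every request.
data Chunk = Static String | Runtime (IO String)

-- Per-request rendering just runs down the chunk list in order.
render :: [Chunk] -> IO String
render = fmap concat . mapM run
  where
    run (Static s)  = pure s
    run (Runtime a) = a
```

The payoff is that all the purely static structure collapses into a handful of Static chunks, so each request only pays for the genuinely dynamic parts.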

If you're interested in more detail, check out the rest of the docs linked at the end. They describe the concepts in more detail with a focus on how to use them. In January I will also be giving a presentation to the New York Haskell Users Group (http://www.meetup.com/NY-Haskell/) about some of the things I learned while implementing this new approach and merging it back into the original Heist code base.


But this doesn't apply to splices that need data at runtime, like say pulled from a database right? Isn't that typically going to be 95% of your splices? The performance increase seems a bit overstated if it only applies to splices that are just simple substitutions.


This might apply to 95% of splices, but not to 95% of your template. This particular benchmark does show the best case, but the typical case of a few dynamic splices will not affect things much, because the page is still getting converted into a concatenative style and a ton of the splice processing of things like <bind>, <apply>, etc. is happening at load time.



