I think the real problem with Rails and similar frameworks is the culture of not giving a shit about performance. I would never ever release a new version of anything that is significantly slower than the previous one. It's simply a bug in my view.
But there are people who have convinced themselves that as long as it scales out, it doesn't matter how many servers you have to run. And even if the reason is that something known statically gets re-checked billions of times at runtime, they don't care.
For a framework, the concept of premature optimization makes no sense.
Yes. On top of that the ruby ecosystem also seems to attract a certain class of programmers who simply don't know how their magic ruby code translates to CPU, I/O and memory operations.
My pet example is a certain popular rails auth framework that hits the database for every single request, to look up tokens that could simply be baked into the cookie. But there are plenty more - reviewing rails gems for performance is generally an exercise in frustration.
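Roughly what I mean, as a sketch (names and details are mine, not the gem's): sign a self-contained token into the cookie so the per-request check never has to touch the database.

<pre>
require 'openssl'

# Minimal sketch: put a signed, self-contained token in the cookie instead of
# a random token that has to be looked up in the database on every request.
# SECRET stands in for the application's secret key; names are illustrative.
SECRET = 'replace-with-app-secret'

def sign(data)
  OpenSSL::HMAC.hexdigest(OpenSSL::Digest::SHA256.new, SECRET, data)
end

def auth_cookie_for(user_id, expires_at)
  data = "#{user_id}:#{expires_at.to_i}"
  "#{data}:#{sign(data)}"
end

def user_id_from(cookie)
  user_id, expires_at, signature = cookie.to_s.split(':')
  data = "#{user_id}:#{expires_at}"
  return nil unless signature == sign(data)          # tampered or malformed
  return nil if Time.at(expires_at.to_i) < Time.now  # expired
  user_id.to_i                                       # no DB round trip needed
end
</pre>

The trade-off is that revoking a single session needs some other mechanism, which is the flexibility argument made elsewhere in this thread.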
It's a pity. Moore's Law mostly mitigates the low performance of MRI and we could be fine that way. But it cannot mitigate a culture of incompetent design.
Well, this comment just comes across as sour grapes to me. You could be criticising any memory managed programming language. Old timers have been complaining about the kids not knowing about assembly, registers, punch cards, valves, etc since computers were invented.
As for "incompetent design" - I don't know what you are on about. I think that Rails 3 is a frigging triumph of elegant, modular design. There are very many gems in the ruby ecosystem that I believe any unbiased person would say demonstrate exemplary design. In fact, I would say that the average rubyist is much more concerned with good design than, say, your average java dev. The whole reason I switched to ruby was because of the beauty of the language. And what is beauty but good design applied?
Of course there are always exceptions but on the whole I think this comment is quite unfair.
You could be criticising any memory managed programming language.
No, I'm explicitly criticizing Ruby. I've spent years in Python-land and the average code quality (and btw documentation standard) is much higher over there. Which is not to say Python doesn't have its own problems, but less so in this particular area.
say that the average rubyist is much more concerned with good design than, say, your average java dev.
Good design is a relative term. In ruby I frequently see it interpreted as: Pack as many layers of magic as possible, and then test that with as many layers of testing-frameworks as possible.
That's not too different from the common architecture-astronauting in java. One could argue it's just a different flavor. In either ecosystem the implementation of the actual business concerns ("What does this code actually do?") too often feels like an afterthought.
Disclaimer: I'm not condemning either language. I like Ruby and use it a lot, but this is one aspect that I can't ignore.
> Good design is a relative term. In ruby I frequently see it interpreted as: Pack as many layers of magic as possible, and then test that with as many layers of testing-frameworks as possible.
I have the same feeling sometimes. I wish the Ruby community would not be so concerned with magic and aesthetics, and target clarity.
I'm referring really to functional clarity. Fancy ruby DSLs that invoke method_missing and parse function names make for aesthetically pleasing end user code, but obfuscate the functionality for anyone unfamiliar with the code base.
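A contrived example of what I mean (not from any particular gem): the call site reads nicely, but to find out what actually runs you have to know about the method_missing hook.

<pre>
class Finder
  # "find_by_name_and_age" is never defined anywhere; method_missing parses
  # the method name at call time and builds the query from it.
  def method_missing(name, *args)
    if name.to_s =~ /\Afind_by_(.+)\z/
      columns    = $1.split('_and_')
      conditions = Hash[columns.zip(args)]
      puts "querying with #{conditions.inspect}"   # stand-in for the real query
    else
      super
    end
  end
end

Finder.new.find_by_name_and_age('Alice', 30)
# Reads well, but grep the codebase for find_by_name_and_age and you find nothing.
</pre>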
Aesthetics and clarity are most certainly different in other ways (think pretty sites vs simple clear sites) but that's another discussion.
I've never ever gotten why people get so mad that the database gets hit on every request. It's going to get hit anyway on every request for something else. I've seen how easy it is to break into systems that don't re-auth the user on each request. I guess I just feel like people spend an inordinate amount of time worrying about database performance. A single select from a single table shouldn't be a performance bottleneck. It seems to me that a good database will have that query cached in memory. It may sound "icky" but if you look at the times, I bet your Rails app spends way more time in the renderer than it does for the "select users.* from users" that it uses to populate current_user.
I'm happy to be proven wrong, but I've written a lot of apps before I moved to Rails, and we used to avoid re-authing the user / looking it up on each request, and it's just never made a bit of difference to us.
Query execution time is only part of the equation. But even that gets ugly fast when you have locking going on. E.g., try adding an index to a non-trivially sized table in MySQL and see how fast that (unnecessary) SELECT performs.
We're a postgres shop and the problem we've run into lately is just the sheer number of open connections to the DB. Using something like PgBouncer helps tremendously, but it still is contingent on there being idle connections. Certainly read slaves could be thrown at the problem, too. But it's mounting complexity for something that isn't all that necessary shrug
> On top of that the ruby ecosystem also seems to attract a certain class of programmers who simply don't know how their magic ruby code translates to CPU, I/O and memory operations.
Isn't the entire goal of computer science to abstract away complexity?
The goal of abstraction is not to prevent you from knowing what's going on underneath the covers. It's generally to allow you to work on the whole with higher level constructs.
It makes a great deal of sense to understand the complexity you're abstracting away, first. Otherwise, you never really know what you're doing. At the end of the day, you're executing on a computer and you're doing yourself a serious disservice if you're ignoring that fact and hoping either Moore's Law or horizontal scaling are the answer.
I'm not convinced analogies such as this are particularly useful.
I'm pretty sure you could design a perfectly good car by treating an engine as a black box with a clearly defined set of inputs and outputs, without caring how those inputs are translated into the outputs. Which is the goal of a framework such as rails.
Clearance probably. It's really not a big deal with the index in place and it gives you more flexibility for revoking access to specific users without blowing out everyone's cookie. In practice, it's not a big deal performance wise. In a system where it would matter, you'd probably be past the point of an off-the-shelf gem being a good solve.
What was originally a performance optimization on 1.9.2 caused a terrible slowdown on 1.8.7. It was reported and fixed within two days.
Does that still fit into your "not giving a shit about performance" critique? I think the "real problem" is that there are two major releases of MRI Ruby (not to mention JRuby) that have very different performance characteristics.
I think your example might actually support the parent's POV more than it refutes his stance. Exception handling isn't just slow in 1.8.7. It may be slower in 1.8.7 than 1.9.2, but exception handling is universally slow. There are a variety of reasons for this, but unrolling an exception is fundamentally slow and until there's native support for them on the die, that's likely to remain the case.
So, creating a situation where you're guaranteeing an exception is raised during non-exceptional execution flow is generally a bad idea. It may be cleaner or more concise code-wise, but performance-wise it almost certainly will always be slower.
For the most part I believe the parent is correct in that there is a much stronger emphasis placed on code conciseness/clarity than there is on avoiding performance anti-patterns. It's hard to measure whether the gains in "code agility" (making it easier to surface other performance problems) outweigh the local performance hit. When there's less code involved it usually follows that it's easier to spot and fix problems. But I worry because my own experience suggests that death of a thousand cuts is a considerably harder situation to get out of than dealing with a handful of gnarlier bottlenecks.
Except that exception handling is fast enough in 1.9.2 that this change actually was a performance optimization there. Look lower in this topic for benchmarks showing that Rails 3 is faster in 1.9.2 than rails2 is under 1.8.7.
>performance-wise it almost certainly will always be slower
Except, it wasn't slower in the Ruby that most people running Rails 3 should be running (1.9.2).
I also point out, again, that they immediately fixed this problem when someone pointed out how slow it was in 1.8.7.
I don't really know what your background is, so apologies if I come off as patronizing. If you aren't familiar with how exception handling works under the covers, I encourage you to spend some time looking into it. It's really quite interesting and will likely impact your future design decisions in some way. Learning about longjmp and setjmp clarified a lot of things for me.
Now, I could be completely off-base and 1.9.2 may be the only system out there that has managed to make exception handling a cheap operation. But I'm highly skeptical of that. I think the more likely scenario is that exception handling is much cheaper in 1.9.2 than 1.8.7, but still not cheap enough to favor over virtually anything else. I haven't read anything in the linked issue or in the following comments that suggests using respond_to? was any slower than handling a NoMethodError exception. If it is, that largely points at a failure of respond_to? more than it trumpets the speed of ruby exception handling. And pretending that exception handling is fast enough for flow control, and thus other problems needn't be fixed, can land a community in serious trouble.
Now obviously there are exceptions to every rule. But the default position of any programmer really should be to reserve exceptions for exceptional circumstances because they are almost axiomatically slow, regardless of language or platform.
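If you want to check this on your own interpreter, here's a rough benchmark of the two styles under discussion (a respond_to? guard vs rescuing NoMethodError); the absolute numbers will differ between 1.8.7 and 1.9.2, which is the whole argument here.

<pre>
require 'benchmark'

obj = Object.new
n   = 100_000

Benchmark.bm(12) do |bm|
  bm.report('respond_to?') do
    n.times { obj.some_method if obj.respond_to?(:some_method) }
  end

  bm.report('rescue') do
    n.times do
      begin
        obj.some_method
      rescue NoMethodError
        # swallow and carry on, i.e. the exception is used as flow control
      end
    end
  end
end
</pre>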
Ruby 1.9.2 has its own problems, especially with application startup. My application takes something like 45 seconds to boot on 1.9.2, on 1.8.7 REE it took 20 or so (which still sucks).
If the performance degradation is considered a bug, that changes the picture a little bit. But I agree with nirvdrum's comment on using exceptions as control structures.
I hear what you are saying about exceptions, but you also have to live in reality. Look at danielparks' benchmarks in this thread; rails3 under 1.9.2 is faster than rails2 under 1.8.7.
It's easy enough to spout platitudes like "don't use exceptions for flow control", except remember two basic facts about the commit that I linked:
1. It was using exceptions for flow control as an optimization
2. That optimization was indeed faster on 1.9.2
On top of that, when the rails team was notified about how bad this "optimization" was in 1.8.7, they immediately fixed it.
I just don't see how any of this fits into "don't give a shit about performance".
Obviously, I didn't form my opinion on that attitude based on this particular issue. But maybe you're right and the Rails leopard has actually changed its spots. Apart from that, it's unclear whether the slowdown in NGIN is caused by this one line of code alone.
To be clear, I don't think that the NGIN slowdown is caused by this exception stuff, because that has been taken out and was really a different performance problem altogether. I only raised it because it was similar -- a problem only on 1.8.7 that was fixed as soon as someone mentioned it.
I think it is very interesting that the specific test produced in the original post is actually faster under Rails 3 and Ruby 1.9.2 than it is under Rails 2 and 1.8.7.
My only grief with you, fauigerzigerk, is that you claim the Rails team doesn't "give a shit about performance", when the performance problems seem to be related to people using an older version of Ruby. I think you are making an extraordinary claim -- that the rails team doesn't "give a shit" -- when in fact they clearly do give a shit.
I'm sure there are some individuals who do care. But the design of Rails and similar frameworks is consistent with what the benchmarks show. They have to be dog slow and they are. I want to assume that this is out of choice, not out of incompetence, hence my conclusion that they don't give a shit. I'm astonished (and actually pleased) to meet so much resistance. Most of the time I just get to hear something about how speed doesn't matter.
If I'm writing a framework on which others depend in ways I cannot entirely foresee, then the old features _must_ perform as well or better than before. No compromises.
That position of "no compromises" is probably why you're not writing a framework on which others depend. Complex software always involves compromises, especially general purpose frameworks used by a large population.
Even if you're saying that only in performance there can be no compromises (but generally people who say things like no compromises don't bound those statements) it will cause large compromises in other areas, such as future feature development or long term maintainability and readability.
Perhaps you should leave comments about my personality to those who actually know me. But if you insist on talking about character traits, let me say this to you: People who don't know where to compromise and where not to compromise generally make bad software.
That's a fair call - I don't know you and I am speaking about character traits.
I don't feel your original comment reflected a position on knowing where to (or not to) compromise; you made a strong assertion to take one option off the table in all situations.
In my experience, developers who make blanket rules up front about what can and can't be changed in the future development of a system don't end up making much of value.
That's just my perspective, and I apologise if I've misread your character.
What I really wanted to express is a dissatisfaction with a general attitude among some framework makers towards performance. Of course there are exceptions to every rule. If the slowdown of ActiveRecord was down to fixing a dangerous security bug or possible loss of data, that would be such an exception. But piling on new features in a way that degrades performance needs to stop somewhere. Hard constraints are good to focus the mind even if you accept rare exceptions.
All I can do is state my own point of view as clearly as I can. And my point of view is already a compromise. I didn't say make it as fast as possible even if you have to rewrite it in assembly. I didn't even say, don't write it in Ruby. All I said is don't make it significantly slower than it was before. That doesn't seem like such a big ask.
Out of interest, what language is your framework for? I'm seriously thinking of stopping using PHP because of the weight of the frameworks -- at least on Python there are microframeworks in which I am not left with a massive bottleneck on the simplest Hello World page. Sure, you can scale, but it's becoming ridiculous that I am forced to be so frugal with processor-intensive tasks in the rest of the web application just because of the framework.
My own framework and ORM writing days are long past and the software I write nowadays has very different constraints (data/text mining).
I totally share your sentiment about having to be frugal because of wasteful frameworks. My approach is to prefer libraries over frameworks. I start out using the most bare-bones configuration possible (no frameworks, no ORM, no CMS, etc) and then I selectively add well maintained libraries created by people whose attitude I understand.
I think code reuse is generally overrated, particularly when it comes to the rather trivial things that web frameworks do.
This is probably not very helpful to you. I apologize.
I don't have the links now, but both Google and Amazon have concluded that faster page loads result in more sales/conversions. If you want to render the whole page in the browser in a couple of seconds (Google webmaster tools indicates the "fast" threshold at 1.5 seconds), spending over half a second on the server-side seems like a big penalty.
Here's the way I look at it: computing is becoming exponentially cheaper, but good developers are incredibly expensive. Anything that makes the latter more productive is overwhelmingly likely to be worth an increase in hardware cost (to cover up lost performance).
The problem is that performance is an academically interesting problem to us geeks. We love to optimize, to make things quicker and oh-so-clever.
I think this sharp antagonism between productivity and performance is a fallacy. It only works in the extremes. Making something slow doesn't automatically make it more productive to use. For framework writers to give some thought to performance issues does not make users of that framework less productive.
Also, if you look at what just a few selective type hints in clojure can do to performance or how terse Scala code is, you have to come to the conclusion that performance and productivity can go hand in hand.
I guess you could take that viewpoint if you're planning on releasing your product 3 years from now (for example, that might be okay for a game developer with long product cycles), but 500ms to render a page is slow today. So, is the idea to wait 5 years so your page renders in a reasonable amount of time?
Some people do: it's called mod_perl. You can compile a stripped-down Apache (written in C) with mod_perl (also written in C) built in. This basically runs as a C http server that understands Perl, running the script within Apache process space. It's pretty darn close to the metal.
The interpreter is the sticky part here and something for which Perl gets some criticism in this arena, but the Perl interpreter is also written in C and is quite fast, if not small.
Bag on Perl all you like as a language, I know I do (and I use it, as well as Rails), but if you can get past the language hump mod_perl will knock you sideways. You can easily pump out a network card's worth of static content with a leftover desktop machine, and DBI is no slouch when it comes to integrating dynamic content.
Yea, mod_perl isn't the same thing as writing your application in C, but anyway, my point was in reference to the parent's comment:
"I would never ever release a new version of anything that is significantly slower than the previous one. It's simply a bug in my view."
I was trying to get across that we give up some performance for convenience now and again and it's fine to do so. I worked on the BBC iPlayer API, which was written on mod_perl, by the way.
There clearly still are performance problems in Rails 3.0.x that aren't fixed. I am just now in the process of switching an application to Rails 3 and I have recently spent 5 days of hard work trying to investigate a big performance drop after the migration (under a different environment: Ruby 1.9.2 and PostgreSQL). I ended up finding two big performance issues in Rails PostgreSQLAdapter:
What they and everyone else experiencing performance problems should do is find one action that has the biggest performance difference between running under Rails 2 and Rails 3 and then profile it - the easiest way is by wrapping the action in an around filter and using RubyProf. If there is an obvious performance problem somewhere, you will most often see a method that has a really huge number of calls and %self time in comparison to all the others. Then you have to figure out what in detail happens - I ended up doing this by putting "begin; raise StandardError.new; rescue => e; puts e.backtrace; end;" into the "hot" methods to get call stacks and then went on to read the code involved.
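For reference, the around filter part looks roughly like this (assuming the ruby-prof gem; controller and action names are placeholders):

<pre>
require 'ruby-prof'

class PostsController < ApplicationController
  around_filter :profile_action, :only => :show   # Rails 2/3 filter syntax

  private

  # Profile just this action and dump a flat report, sorted so methods with a
  # huge call count and %self time stand out.
  def profile_action
    RubyProf.start
    yield
  ensure
    result = RubyProf.stop
    RubyProf::FlatPrinter.new(result).print(STDOUT, :min_percent => 1)
  end
end
</pre>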
I can't believe that in a modern web app, most of the time wouldn't be spent in the database, those graphs floored me. Other than expanding templates, what in the world is Rails doing with that time??
In my experience, Rails and frameworks in PHP are quite slow, and typical uses of databases are quite fast.
I think it boils down to generality. A database is a very specific thing, and the typical usage of the database (e.g. SELECT * FROM products WHERE id=?) can be very highly optimized.
A programming language or framework is much more general, and it is much harder to optimize common cases without losing layers of abstraction.
For example, it would be much faster to return records as arrays instead of ActiveRecord objects. That would probably be much better for certain uses — say, processing of lots and lots of records. It would be worse for most cases, though, and would make Rails that much harder to use.
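For the "lots and lots of records" case you can already drop below full ActiveRecord instantiation by hand; a small example (table and column names assumed):

<pre>
# Instantiates one ActiveRecord object (plus attribute tracking) per row:
Product.all.each { |p| puts p.name }

# Returns plain hashes straight from the driver, skipping model instantiation;
# noticeably cheaper when you only need a couple of columns from many rows.
rows = ActiveRecord::Base.connection.select_all("SELECT id, name FROM products")
rows.each { |row| puts row['name'] }
</pre>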
This is probably exacerbated by the tendency to do everything in the application instead of in the DB (e.g. not using hand built SQL to get aggregate information). That said, it seems like there's been some movement toward shifting some processing back into the DB (e.g. joins).
Yeah, 400 ms spent on average just in Ruby on a modern server is really absurd, though. Think about that - nearly half a second, before anything to do with the database. I try to shoot for sub-100ms avg. total; this would probably drive me insane.
I don't think I've ever managed to write a PHP page that took more than 100ms on average for the PHP alone on a decent server, and it's not from a lack of abuse.
I have no idea where you get 400ms on average. Here's the log for a blog post page of my site. It's a heavy and big page, it's on ruby 1.8.7 and rails 3, and I haven't done anything to make it faster. I could cache most of it rather easily.
Completed 200 OK in 205ms (Views: 168.9ms | ActiveRecord: 23.2ms | Sphinx: 9.6ms)
edit: a couple of notes. This is running on a cheap vps and would be faster on a real server. It would be faster if I used ruby 1.9 as well. If you want to see how fast a bare ruby webserver is, check this out: http://torquebox.org/news/2011/02/23/benchmarking-torquebox/
Performance depends on the application. You are spending 168ms rendering. If the page had a few more partials or some more data, you could be seeing 400ms for that page easily.
Rails view/partial rendering is pretty slow in general.
I know, but as I said, that page is pretty heavy and it's not close to 400ms. It's on a cheap vps and using 1.8.7, and I could easily cache most of it.
I think some people could have read the comment I replied to and believed that 400ms is somehow normal. It's not.
It sounded to me as if you believed this was typical for a rails site. Maybe I read it wrong, but it isn't typical for a rails app that I've ever done.
Rails has supported joins since forever. However the problem with moving things to the DB is that the DB is very hard to scale while scaling the web app is almost trivial.
The read only parts are not that difficult to scale in the DB. It's the concurrent write load that is really difficult to scale. As long as a single DB server can cope with all the write load, everything is fine. That's like 99.9% of all sites and applications.
While not related to these graphs, in general ActiveRecord only performs very rudimentary caching. By default it lasts for the life of a single request. There is no shared cache across requests or processes, so if you read in data that hardly ever changes, you're hitting the DB for that data on each request. This was a rather big shock to me when coming over from Java, where I used to use Cayenne* (http://cayenne.apache.org/) for the ORM. Cayenne has some pretty nifty tunable caching features out of the box and I just came to expect those as being a requirement of any serious ORM these days. That was three years ago.
That's not to say there aren't caching options for ActiveRecord or Rails. You just have to go out of your way to employ them. There's no out-of-the-box write-through cache of the DB.
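To be concrete, this is the kind of thing you end up writing yourself for rarely-changing data (a minimal sketch, assuming a configured Rails.cache store; names are illustrative):

<pre>
# Nothing in ActiveRecord does this for you: cache the lookup across
# requests/processes in whatever store Rails.cache is configured with,
# and expire it manually (or by TTL) when the table changes.
def all_countries
  Rails.cache.fetch('countries/all', :expires_in => 12.hours) do
    Country.all.map { |c| [c.id, c.name] }
  end
end
</pre>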
* - I became a committer for the Cayenne project but haven't been active over the past couple years.
If the author is really concerned about rails3 performance, has he opened a ticket with his specific case? tenderlove has been addressing AR performance issues in rails3 for quite some time.
Interesting, though I no longer use ActiveRecord in my largest app because of how slow it is (even in Rails 2). Try checking the overhead of instantiating (no save or db lookup) a simple model instance with 2 fields for example.
If you manage to avoid that, you've beaten the slowest parts. The view engine is also slow but I tend to do rendering in the browser these days so that barely matters to me.
If you are careful, it is very realistic to have response overhead from rails hover around 1-2ms for simple things and maybe up to 10-20ms for the most complex business logic (expensive validations). It's kind of odd that people who care about performance assume they have to take the whole package as is.
It depends on the data. For one app I've written my own CouchDB driver which, as slow as CouchDB is known to be, seems to kill most AR equivalent code. That's pending release as soon as it's stable.
Generally speaking, I think AR's problem isn't really an API one - it's a great interface. It's that its implementation is far too clever and would benefit quite a bit from narrowing supported driver styles and removal of some old APIs that really hurt more than help. There could also be better control over opt-in on some behaviors that start adding up in overhead (e.g. change tracking keeps a lot of garbage around).
For our application that meant bringing 1 of 3 application servers down and slowing things down in general. So we're stuck with 3.0.5.
From what I understand the Rails community has declared 1.8.x a thing of the past (tbf it IS a thing of the past, but in a non-perfect world migrations are not always easy). This may be a good thing (it forced us to march towards 1.9.2) but I'm sure it has caused a lot of problems.
The commit mostly fixes that particular issue (and is included in 3.0.7), but as the article points out, Rails 3 AR performance still needs quite a bit of work.
Judging by the comments on that commit and the comments in this HN post, I would say that many of the performance regressions are only happening under ruby 1.8.x.
By this time, I wonder what the rails developers should spend their time with: making the framework better for people running the current version of the language or fixing performance problems due to shortcomings of the old version?
If you can't upgrade to 1.9 and the performance of rails 3 isn't good enough for you to upgrade rails, why don't you just stay with 2.3 and upgrade once you are ready to move to ruby 1.9?
That way your unwillingness to upgrade ruby doesn't "waste" the time of the rails developers by causing them to fix issues that don't even apply to more and more people over time.
Note: I'm talking about regressions that only affect 1.8.
I'm finding it's not that people can't upgrade to 1.9.2, it's that many don't want to.
I have a pretty decent-sized application running on REE and Rails 2.3.x. I've made my code fully 1.9.2 compatible. My specs take on average 3.5 times longer to run in 1.9.2. Profiling that shows that close to 60% of the time is spent in Kernel#require. Now, it's pretty common for people to say only profile in the production environment, but I need to be able to run my specs. And it's not acceptable to shell that out to a CI server or to run REE in dev. mode and deploy with 1.9.2. Running in spork is a gross approximation, too; I've seen way too many issues there.
My suspicion is that if you have a small enough project, the speed difference is negligible. If you have a large project, it becomes much more pronounced. So, it'd really be nice for me to continue to upgrade and improve my Rails-based app without having to drag along an environment that's going to hobble me.
FWIW, I do know people that have started on 1.9.2 and rolled back to REE because speed was such a problem.
I see. My two projects are indeed quite small yet (but running the tests already takes quite long).
But: the solution to this problem isn't not upgrading and then demanding support for your outdated environment. The solution is to fix whatever is slow and try to get a patch upstream.
Now I don't know how likely the ruby developers are to accept a patch, but once this becomes too painful for me, I will try and have a look. I might not be able to fix it, but at least I know that I'm not stuck in the past, terrified and unable to move.
Well, the solution isn't upgrading for the sake of upgrading either. 1.9.2 buys me absolutely nothing and has some major costs associated with it. REE is chugging along like a champ.
I don't really understand why the release of 1.9.2 meant all else had to be dropped. Most other communities continue to support their stable releases. It's not as if 1.9.2 has even displaced 1.8.7 with virtually any of the linux distros either. If your policy is to use security-audited / supported packages, as is the case in many environments, moving to 1.9.2 is a dealbreaker.
Anyway, supporting 1.8.7 and 1.9.2 is trivial in most cases. I'm not demanding support for my environment, which I prefer to think of more as stable and battle-tested than "outdated." But I don't understand actively dropping support for it either.
With Rails 3 in development mode, one of my pages loads in 2 seconds. On 2.3, it loads in about 0.6 seconds.
:( Rails 3 is awesome, except for the performance issues. I'm not sure how to go about looking into it. When I profile, the bulk of the time is spent garbage collecting apparently.
I just ran a simple test on every version of ActiveRecord that I could get running (standalone), using both Ruby 1.8.7 and 1.9.2, and graphed the numbers.
Why don't you measure it? His microbenchmark basically measures the time to do 10000 database round trips to extract a field, so long as you don't over-optimise and request all the data in a single query.
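In other words, the shape of the benchmark matters a lot; roughly (model and column names assumed):

<pre>
# ~10000 round trips: one query per record, which is what the microbenchmark does.
ids.each { |id| Item.find(id).name }

# One round trip: fetch everything in a single query (Rails 3 / ARel style).
Item.where(:id => ids).map(&:name)
</pre>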
I've been a Rails dev for 5 years. I'm frequently considering leaving for another framework because of one thing: bootstrap time. Starting a test or server on my dev machine takes 20-30 seconds. Particularly with tests, this is a huge problem for doing proper TDD, especially when you are trying to use tests to track down a bug. In that 20-30 seconds, I often console myself that we no longer need to print punch cards and wait a whole day, but that's little consolation when node.js starts up in under a second.
There are hacks and fixes to this like spork, but none of this should be necessary. The performance issues are not debated in the open enough. I don't understand how those startup times are acceptable to anyone, but I rarely get responses when I ask questions about it.
UPDATE: I have the latest top of the line Macbook Air - 2.13 Ghz Core 2 Duo with SSD.
I have other issues with Rails, but this is the only potential deal breaker. (I tend toward functional clarity instead of "human" readability in the magic debate, while most "Rubyists" would rather pollute the namespace (e.g. metawhere) and create 100 line method_missing calls to make one command slightly prettier aesthetically. And Cucumber - ugh what a ridiculous contraption that is.)
I have investigated this in detail and the reason for this is really ridiculous - 80% of this time is spent on executing "require" statements, and this is due to really bad design of RubyGems and/or Bundler. They both do their job by augmenting $LOAD_PATH to include all directories containing gem contents. If you then look into how Ruby deals with the $LOAD_PATH, it turns out that each time you do a "require" it will go through all of those directories and their subdirectories in search of a file matching what you have required. I used strace to see how this impacts doing "rails console" on an application with around 30 gems - it ended up doing 35000 open calls that ended up with ENOENT. It is beyond my mind how this can remain unfixed for so long.

If RubyGems instead maintained a cache from all the gem directories, it could map requires to files without touching the filesystem and it would work many times as fast. Even creating a cache at runtime and then using it for the lookup would be many times as fast. Unfortunately, it is hard to determine a strategy for building this cache that would map the requires to files in exactly the same way as RubyGems, because RubyGems has some pretty weird strategy for determining which files take priority over which files when you do an ambiguous require. Because of this, I haven't yet succeeded in implementing a fix myself and I'm not 100% sure it is possible. I also tried to contact one of the Bundler guys, but so far haven't had any reply about this.
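To make the idea concrete, here is a very rough sketch of the kind of cache I mean; it deliberately ignores RubyGems' real priority rules, which is exactly the part that makes a correct fix hard.

<pre>
# Hypothetical sketch only: scan every $LOAD_PATH entry once, remember which
# file each "require" name maps to, and only fall back to the normal (slow)
# lookup on a cache miss. "First hit wins" below is NOT how RubyGems resolves
# ambiguous requires, which is why this is illustrative rather than a fix.
REQUIRE_CACHE = {}

$LOAD_PATH.each do |dir|
  Dir.glob(File.join(dir, "**", "*.rb")).each do |path|
    feature = path.sub("#{dir}/", "").sub(/\.rb\z/, "")
    REQUIRE_CACHE[feature] ||= path   # first hit wins in this sketch
  end
end

module Kernel
  alias_method :require_without_cache, :require

  def require(name)
    if (path = REQUIRE_CACHE[name])
      require_without_cache(path)     # skip the $LOAD_PATH scan entirely
    else
      require_without_cache(name)     # unknown feature: normal lookup
    end
  end
end
</pre>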
Ok so first, this is not exactly an empty Rails project. The Gemfile.lock has 221 lines; that amount of dependencies is not normal. To compare: the application code for http://www.getharvest.com/, a 5+ year old rails project with customers and all, has 245 gem dependencies (including some that are our own). The 3+ year old http://www.coopapp.com has 181 dependencies. Both of these start up within 3-4 seconds on my desktop, and due to their age & size they admittedly have too many dependencies. Every time one is removed I rejoice.
Please don't call something with 221 dependencies a blank Rails project. Rather, the amount of conflicting half-baked gems you've added makes it an unfair complaint about Rails. I assure you that there is no language / framework in the universe which does not take a hit when adding too many dependencies - sometimes execution-wise, but more often it just breaks your spirit.
Anyway, back to numbers: on a 3 year old desktop under Linux, executing tests on the "blank" project yielded:
<pre>
real 0m8.698s
user 0m5.996s
sys 0m2.532s
</pre>
For some reason ruby is much slower on OSX than linux; that is an interesting project to investigate. I think the platform difference explains the 10 second difference, despite the slower CPU. It would be quite interesting to understand why ruby is slow on OSX. Each to his own itch to scratch.
Why are you equating lines in Gemfile.lock with the number of dependencies?
There are 28 gems specified in the Gemfile. That's not an unreasonable amount of gems. With dependencies, the total amount of gems being required is 76.
Hell, only using Rails and sqlite-ruby will require 26 gems to be installed.
I'm not that deep into Ruby/RubyGems/Bundler internals to be 100% sure, but on the other hand I really can't see how it could work reasonably fast while using the strategy for looking up libraries it uses.
Great work! Yehuda is brilliant, although I'm sure he's busy now and I don't know how much bandwidth he would have to address that. I think he's full bore on Sproutcore.
If I calculate (20 seconds) x (number of times I bootstrap rails), it might even be worth my time to have a go at it.
There is a bug that's supposed to be fixed in ruby 1.9.3 that addresses something to do with the load path and bootstrap performance, but I have no idea when that's due to land. If it's truly the cause, it seems like this should be a patch to 1.9.2 instead.
I haven't been using Rails for nearly as long as you have. As a result, I thought the bootstrap time was something Rails developers just "put up with". I would have thought that someone else would have noticed this problem and done something about it...surprisingly I don't think many have.
Django, by comparison, bootstraps the environment almost instantly.
I have also been looking for an answer for how to deal with this issue, so I just started a bounty on your question for 200 reputation. Looking forward to a good answer; it may very well convince me to go back to Rails.
You can slow down the startup time for Django if you have lots of installed apps and nested modules but it is still in the order of seconds rather than tens of seconds.
That's not the dev server, which is able to start and reload pretty fast. That post is complaining about the startup time for loading several pre-forked Python VMs on the first request.
That's pretty much a production problem. It's somewhat common to see "warmup" scripts that make sure all the VMs start.
"That's not the dev server, which is able to start and reload pretty fast...That post is complaining about...a production problem."
The issue is actually that it takes time to load the application code and dependencies, something that applies to Django, Rails and just about any similar framework, regardless of whether it's the production or development environment. In general, the production environment will actually be faster at this.
Just to clarify the context for future readers, I deleted my post because I wasn't interested in discussing django, but since you responded I'm addressing it. My original post pointed out that in my experience it indeed does take time to load django and linked to this post: http://stackoverflow.com/questions/1702562/speeding-up-the-f...
Yeah, I tried Django 2 years ago and sort of wish I had stuck with it primarily for this reason. (Additionally I much prefer the Python "no magic" philosophy, although I've gotten used to tracing through Ruby craziness.)
The Rails community (at least at the time) was so much larger and more open that it seemed like a better bet. Rails 3 was coming up and I knew Yehuda was doing brilliant things with the Rails 3.0 architecture. Plus, Moore's law right? - not helping!
I'd love for some hard core Rails guys to get in here and offer some real solutions! As much as I have issues with some things about Rails, it is largely a great tool for me, and I know it so well I'd like to be able to continue using it without tearing my hair out every time I need to run a simple test.
I dunno, I've been at this for 6 years with Rails, am running a 4 year old mbp that dogs it with Java stuff but hangs in just fine with Ruby. Maybe my apps aren't nearly as huge as the ones you're loading, but I've done some large ones.
But I share your frustrations re: "metawhere" :) Solution is just to hang around with responsible devs.
"Starting a test or server on my dev machine takes 20-30 seconds. Particularly with tests, this is a huge problem for doing proper TDD"
Long startup times or not, sounds like you should probably be using autotest or watchr. 20-30 seconds is not normal. Unless the app has a very large amount of code and dependencies, that sounds like an environment issue.
"while most "Rubyists" would rather pollute the namespace"
That's unfair, and I say that as someone who is very adamant about clarity, explicitness and simplicity in my apps. In my experience, most (not all, but most) of the libraries where people are doing weird things are libraries that are far from necessary, so avoiding that is trivial.
For instance, you cite metawhere. First, I haven't investigated the implementation in depth, but, FYI, it doesn't appear to use method_missing at all. Secondly, I'd be very hesitant to include a library like this in an app, regardless of the implementation, since it's too invasive of a dependency.
autotest or watchr doesn't help startup times -- I'm not sure why you mentioned that here.
I don't think the performance problems are affecting ALL applications - some people don't see any problems. I do think there is something wrong in Ruby or Rails that doesn't happen in all code paths.
I'm not sure why you think that. Unless you use spork or something that preloads the rails environment then forks it on every test run, autotest/watchr will boot the whole rails environment from scratch on each run.
You're right. I'm distracted by the reported 20-30 second environment load time for each test when it only takes about 7 seconds for my tests to start with autotest in my current pretty large app on an older MBP.
Autotest doesn't help startup times, but it can help mitigate the problem somewhat since the tests will start almost instantly when you change a file, saving a few seconds.
It's a good suggestion, but doesn't solve the problem.
"Long startup times or not, sounds like you should probably be using autotest or watchr. 20-30 seconds is not normal. Unless the app has a massive amount of code and dependencies, that sounds like an environment issue."
Autotest helps some, but 20-30 seconds is pretty standard fare. Maybe I use too many gems, but I don't think I should forgo reusing community code to reduce start times.
Maybe it's unfair to make a generalization about "rubyists," but it seems people in the community get overly excited about something that saves 5 characters in their codebase or makes something readable to non-devs (for all those non-devs who will read your code).
Another example - the ubiquity of DelayedJob. The standard usage is to insert delay into your call chain, so foo.delay.send_emails instead of foo.send_emails. This pollutes the ENTIRE object space with the method delay, in addition to the fact that it doesn't work well on many objects. I complained about this practice and the response was that it was clearly better to write foo.delay.send_emails instead of Delayed.add_job(foo, :send_emails) or something. For me, I find any pollution of namespace where conflict is possible highly questionable, although sometimes it's ok.
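What I was suggesting looks roughly like this (Deferred and call_later are names I just made up, not DelayedJob's API; only Delayed::Job.enqueue is real):

<pre>
# Hypothetical non-polluting wrapper around DelayedJob, in the spirit of the
# Delayed.add_job(foo, :send_emails) suggestion above. Nothing is added to
# Object, so no method leaks into every class in the system.
module Deferred
  # A plain job object; Delayed::Job only needs it to respond to #perform.
  Call = Struct.new(:record, :method_name) do
    def perform
      record.send(method_name)
    end
  end

  def self.call_later(record, method_name)
    Delayed::Job.enqueue(Call.new(record, method_name))
  end
end

# Usage:
Deferred.call_later(foo, :send_emails)
</pre>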
Metawhere adds methods like .eq and .lt to Symbol (a symbol is something like an interned string, e.g. :foo in Ruby or 'foo in Lisp).
Polluting the namespace like metawhere does is a significant problem.
We use datamapper on our app, and were looking at doing some of our analytics on mongo at one point, but since mongomapper and datamapper both define 'helper' methods on Symbol, they can't both be in use at the same time.
If we _really_ wanted to use mongo it would need to be from a completely separate codebase either talking to the db via Sequel or something, or talk to our main app via REST calls - neither being a particularly pleasant solution.
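Boiled down, the mechanics of the clash look like this (the method bodies are made up, not the gems' real code):

<pre>
# Two gems monkey-patch the same method onto Symbol; whichever loads last
# silently wins, and the other gem's queries start misbehaving.
class Symbol
  def gt(value)                 # pretend this comes from the SQL-side gem
    "#{self} > #{value}"
  end
end

class Symbol
  def gt(value)                 # pretend this comes from the Mongo-side gem
    { self => { '$gt' => value } }
  end
end

puts :age.gt(30).inspect        # => {:age=>{"$gt"=>30}} -- the first definition is gone
</pre>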
"I don't think I should forgo reusing community code to reduce start times."
Using too many gems is a problem for a lot of reasons, though. You are introducing dependencies that you then need to manage and update, they often do more than you need to while being harder to maintain (you might end up with a bunch of forks in your gemfile as things get outdated), and they are sometimes invasive and will require big rewrites to switch from.
"it seems people in the community get overly excited about something that saves 5 characters"
I agree (though I'd modify it to say "some people in the community"), I just find that I don't encounter it much when I'm following best practices myself.
True that gems often do more than I need, but should I?
1) Write my own authentication code
2) Write my own ical feed code (just added this in about 2 hours using ri_cal, which has way more features than I need)
3) Write my own code to inline css in email
4) Write my own Facebook API library
5) Write my own admin interface
6) Write my own SCSS and HAML processors
7) Write my own file attachment and processing code
8) Write my own background processing library
9) Write my own CSS/JS compression library
You get the point.
I use gems that save me tons of time. If not for gems, I would not be using Rails.
That's only 9 libraries, and that shouldn't be enough to cause your load times to reach 20-30 seconds.
Also, since you asked :-) I would question #5. Auto admin libraries (and this applies to multiple languages and frameworks) are generally heavy, too general and have code hidden away in the library. In contrast, banging out CRUD views and controllers in a locked-down namespace is easy, relies only on the same dependencies as the rest of the app, can be customized for the specific business needs, and all the code is in one obvious place.
It would take me many days to create and test an admin with multiple objects, even a simple one, and then I have to maintain it. I install rails_admin and I've got what I need. This is for internal testing and maintenance. I don't care if it's heavy.
I suppose your logic makes sense if you're gainfully employed, but I bang this stuff out for myself and clients. A 90% solution with one line of code sure beats weeks of work.
I was at the BBC working on a large Perl codebase which was a fairly typical Catalyst/DBIx::Class stack. It had thousands of tests and, thanks to the foolish way we loaded reference data and pre-populated memcache etc., the test suite ran for 3 hours. As you can imagine, it was a nightmare for TDD and indeed for large merges (by the time you'd run the tests, trunk would have inevitably changed). We never fixed it thanks to a combination of laziness and a case of "it's legacy code, won't be around much longer" syndrome.
http://qwerly.com runs a tight and fast Node.js stack at the front-end with a test suite that takes less than a second to run (a few hundred assertions). I'll dedicate much time to the framework around its tests and make sure it's never slow.
I had a similar issue using Django+PostgreSQL on Ubuntu 11. Running the test suite was taking 10s of seconds in that combination which wasn't too bad but hurt my test->fix->test cycle when debugging PostgreSQL related issues (during normal dev I run the test suite in a in memory SQLite database).
Took a little time to work it out as I am new to PostgreSQL but it was worth it to avoid a pause in the problem solving cycle.
One thing I've been investigating is using RabbitMQ to separate the logic out of the monolithic rails application.
Beetle looks promising. http://xing.github.com/beetle/ You can do both RPC and async messaging. (Might want to wait a few days before using it; there are issues with some libraries that it depends on that will be fixed soon.)
I don't understand how that helps. This is a problem for pretty simple Rails applications.
Besides, why should I have to set up and learn RabbitMQ to have a decently performing web framework? This is 2011 and this shit is slower than my old Java framework running on an ancient processor in 1995.
I just tested this on one of my apps (yakkstr.com) and 'rails c' and 'rails s' both startup in about 5 seconds. I'm on 1.8.7 and rails 3, not sure if that makes a difference, but 20-30 seconds doesn't mesh with my experience at all.
For me on 1.8.7, it takes about 7-8 seconds. Annoying because the application is empty, it's just 28 gems being loaded. On 1.9.2, it's 20 seconds or so.
Re: your knock on Cucumber – isn't Cucumber explicitly designed to enable human readability so that non-programmers can collaborate on features in quasi-colloquial English? Can you expand on what you dislike about it?
I'm not trolling; I'm genuinely curious about this.
Keep the console open and use reload! when you change your models. For starting a server I don't have much other than keeping it running in the background. For running a test, use spork. For migrating your db and running rake tasks, try to take advantage of the fact that you can specify multiple rake tasks at a time: "rake db:migrate db:test:prepare"
More good suggestions. I'd love to get spork working, but rspec syntax makes me cringe and I haven't gotten it to work with test-unit. I found a plugin and a fork, neither of which I could get working.