So we just finished a new round of optimizations as part of our ongoing quest to prove that with sufficient caching you can serve arbitrarily large numbers of requests with arbitrarily slow languages.
No, I'm sure I can squeeze out a lot more performance. But I have a lot of other stuff going on at the moment (Demo Day is this week, among other things) so it is nice to be able to buy my way out of trouble for a change.
with sufficient caching you can serve arbitrarily large numbers of requests
Note the corollary: with a sufficient business model you can earn arbitrarily large amounts of money relative to hardware costs.
Which always makes me wonder why the clooooooooooooud gets so much press for cost reduction. If your application doesn't value data by weight (or worse, values data at 0), your hosting costs are going to be rounding error. Reducing the rounding error by half doesn't strike me as the number one best use of time for the majority of businesses.
True, though reduced hardware cost is only part of the promise of cloud computing.
Elastic scalability is a major advantage in some cases, allowing you to rapidly scale up (and back down) while minimizing waste. "The cloud" absorbs the fluctuations.
Easy deployment and thus reduced administrative costs might be another. With EC2, etc. that might not be the case, since you still have to set up the OS images and other infrastructure yourself, but things like Heroku might help.
Looks like everything went well. Handy having Robert Morris as your sysadmin. The new server seems to be about 2x as fast. The frontpage renders for me in about 50 msec. But the site should seem more than 2x faster (for logged-in users) because many requests will terminate before being interrupted.
There's now enough memory that we can fit all the links and comments in memory at once again. We should be good for another year or so.
news.ycombinator.com already points to the new ip address. The message with that ip addr in it will only be seen by people who still have the old ip cached in their browser. (Obviously for them we have to refer to the new server by its ip addr.)
You were running FreeBSD 5.3 until recently? You know it reached its EoL at the end of October 2006, right? There have been lots of security issues over the past 2.5 years which weren't fixed in FreeBSD 5.3.
And by "FreeBSD 7.1" you mean 7.1-RELEASE-p3, right? :-)
So, if you've got all those in-memory data structures, there are lots of lists of pointers to objects. How does it compare on a 32-bit OS, with 32-bit pointers, vs a 64-bit with 64-bit pointers?
You mean compare in terms of memory usage? It will certainly eat more memory for the same data structures.
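To make that concrete, here's a minimal C sketch (the struct and field names are made up for illustration, not taken from the actual code) of how a record that is mostly pointers roughly doubles when pointers go from 4 to 8 bytes:

    #include <stdio.h>

    /* Hypothetical item record, mostly pointers, like a linked structure
     * of links and comments held in memory. */
    struct item {
        struct item *parent;
        struct item *kids;
        char        *title;
        char        *url;
    };

    int main(void) {
        /* ILP32: 4 pointers x 4 bytes = 16 bytes.
         * LP64:  4 pointers x 8 bytes = 32 bytes. */
        printf("pointer:     %zu bytes\n", sizeof(void *));
        printf("struct item: %zu bytes\n", sizeof(struct item));
        return 0;
    }

Built with gcc -m32 it reports 16 bytes for the struct; as a native 64-bit build, 32 bytes. Pointer-heavy structures are the worst case; ones dominated by text buffers barely change.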
We've got a simple functional language here - being very unscientific, and looking at the key data structures, for my current workload, straight pointers are about 5-10% of the allocation. In practice I do see a 10-25% memory usage increase going from 32bit to 64bit (Linux).
(Anecdotally -- I've read 20% is a rough rule of thumb for 64bit anyway).
There is also an increase in data structure size due to data alignment, but this is going to vary quite a bit depending on the structure and the compiler; if you're doing dynamic memory allocation you may even find it makes no difference at all, as malloc may have been delivering an oversize allocation anyway.
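A quick sketch of both effects, assuming a typical LP64 ABI and (as an assumption, not a measurement) an allocator with 16-byte size classes:

    #include <stdio.h>
    #include <stdlib.h>

    /* Mixing a pointer with small integers forces padding on 64-bit,
     * because the pointer has to land on an 8-byte boundary. */
    struct entry {
        int   id;       /* 4 bytes */
        char *payload;  /* 4 bytes on 32-bit; 8 bytes on 64-bit, preceded
                           by 4 bytes of padding to keep it 8-byte aligned */
        int   votes;    /* 4 bytes, plus 4 bytes of tail padding on 64-bit */
    };

    int main(void) {
        /* 32-bit: 4 + 4 + 4 = 12.  64-bit: 4 + 4(pad) + 8 + 4 + 4(pad) = 24. */
        printf("sizeof(struct entry) = %zu\n", sizeof(struct entry));

        /* Whether that growth costs anything real can depend on the
         * allocator: a malloc with 16-byte size classes rounds the 12-byte
         * request up anyway, so some of the increase was already paid for. */
        void *p = malloc(sizeof(struct entry));
        free(p);
        return 0;
    }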
So, it's certainly not a trivial increase... Especially as we're currently running on a 256mb slice (I sort of wish they had a supported 32bit option)... but it's not massive.
However, for us, 64 bit still has a lot of advantages. For example, you can do more expansive memory mapping and the like.
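For instance (a sketch assuming a Unix-like system; "items.dat" is a made-up file name): one mapping needs one contiguous range of virtual addresses, which is trivial to find in a 64-bit process and often impossible for a multi-gigabyte file in a 32-bit one.

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void) {
        /* "items.dat" is purely illustrative. */
        int fd = open("items.dat", O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }

        struct stat st;
        if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

        /* Map the whole file at once; with 64-bit pointers the address
         * space is effectively never the limiting factor. */
        void *base = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (base == MAP_FAILED) { perror("mmap"); return 1; }

        printf("mapped %lld bytes at %p\n", (long long)st.st_size, base);
        munmap(base, st.st_size);
        close(fd);
        return 0;
    }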
The FreeBSD binaries on this page are i386; if that's what the new server is using then it won't make a difference. The source is available as well; if pg compiled it in 64-bit mode it would increase memory usage.
But using a 64 bit OS doesn't necessitate using a 64 bit address space.
I'd guess that most of the memory is used by blocks of text which are much larger than the pointers which reference them, so probably the size_t expansion has a minimal impact on the total memory usage.
That's true, but 64-bit will affect other things as well: pointer size goes up, your stack will be bigger, and the packing (alignment) of your structures changes.
If your compiler is unfriendly, your structure size can change significantly -- afaik, GCC is pretty clever about structure packing, but I'd imagine it varies a lot depending on the architecture.
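As a small sketch of what the packing differences look like: the compiler keeps fields in the order you declared them and pads for alignment, so the same three fields in a different order can give a different size (sizes below assume a typical LP64 ABI):

    #include <stdio.h>

    struct sloppy {
        char  flag;   /* 1 byte + 7 bytes of padding before the pointer */
        void *ptr;    /* 8 bytes */
        char  flag2;  /* 1 byte + 7 bytes of tail padding */
    };                /* usually 24 bytes */

    struct tidy {
        void *ptr;    /* 8 bytes */
        char  flag;   /* 1 byte */
        char  flag2;  /* 1 byte + 6 bytes of tail padding */
    };                /* usually 16 bytes */

    int main(void) {
        printf("sloppy: %zu bytes, tidy: %zu bytes\n",
               sizeof(struct sloppy), sizeof(struct tidy));
        return 0;
    }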
Perhaps if they had been running Erlang on the server you wouldn't have had to take the site down while migrating to a new server. Sorry, just had to point this out.
Out of curiosity, does Erlang really support that kind of hot migration of running processes along with their data?
Until now I only knew that Erlang can span multiple hosts, but only if you write your application that way?
What has really helped us in terms of machine migration (but at the cost of additional overhead) is running stuff inside Xen and OpenVZ virtual machines. Those can really be migrated seamlessly, often without noticeable downtime.
Erlang supports hot code loading, which lets you switch each process over to new code after you load the new code in. Processes that were already running when you loaded it will keep using the old code until they finish executing. This isn't really that big of an issue if you write your code the Erlang way, which is to spread data across as many processes as you can and pass messages between them. In that situation, each existing HTTP session would probably still be running the old code, but each new one created would be running the new code.
Typically, sites with multiple servers put their web and database server software on different machines. Hacker News doesn't use a database and stores everything in memory, so this wouldn't make any sense for this codebase.
http://ycombinator.com/newsnews.html#15jan09
Is that quest over, or is upgrading the server fair game? :)