The x86-64 architecture presently allows 256 TB (48-bit) physical addressing, and Itanium and at least some of its operating systems (HP-UX, OpenVMS) have 50-bit physical addressing available.
The big HP NUMA boxes are Superdome and Superdome 2, and those have global memory access, though access to local memory is faster than access to remote memory; the requisite non-uniform memory access speeds, hence the name.
Last I checked, the current Superdome and Superdome 2 series maxed out around two and four terabytes of physical memory.
If the current four-terabyte maximum memory configurations (and possibly the architectural 50-bit addressing) aren't enough, and you're willing and able to buy that much physical memory, then HP would probably be interested in chatting with you.
For some reason, I thought that Intel had abandoned the Itanium platform. I was surprised to see that HP was still marketing machines based on that processor line.
I'd be more interested in a writeup of what data you actually store. I've used Basecamp at a customer's before, and it certainly didn't seem to justify a requirement of 1 TB of RAM, let alone 100 KB of RAM, to serve quickly.
This sounds more like Reddit's problem where some architectural simplifications might net a giant win versus piling yet more gunk on top (Reddit is still perceptibly doing random IO for every comment in a thread during page load, or perhaps some insanely slow sorting, I have NFC how they haven't fixed this yet).
I'd love to hear how Reddit is doing it wrong and can be improved. As far as I can tell, they want users to see the freshest content possible (latest comments, up/down vote counts, etc.). Is it really that easy?
I worked on a Mac Pro once that someone had installed 64 GB of RAM in... it took the better part of a minute to get to the startup gong.
They ended up splitting the memory between a few machines once it became obvious that the applications being used (Adobe CS stuff) weren't going to use all that RAM in their use case, and other machines needed it more.
Although the new Final Cut Pro X eats RAM for breakfast. Our editing Mac Pro at work regularly uses 16GB+ when working in FCPX with multiple projects and events. Frustrating at times, to say the least.
Are you defining dependencies between the keys somehow, so that invalidating a key further down in the hierarchy propagates invalidations all the way up the stack?
Whoa—I can’t believe this #cache_key and :touch stuff was introduced years ago and I somehow completely missed it. This is the most useful and elegant thing ever!
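For anyone else catching up on it, here's a minimal sketch of how :touch and #cache_key fit together (the model names are made up, not anything from Basecamp):

    # Updating a Todo touches its parent Project, bumping the project's
    # updated_at timestamp.
    class Project < ActiveRecord::Base
      has_many :todos
    end

    class Todo < ActiveRecord::Base
      belongs_to :project, touch: true
    end

    # cache_key embeds the updated_at timestamp (e.g.
    # "projects/42-20120530123000"), so anything cached under it is
    # implicitly invalidated when a child record changes; no explicit
    # deletes needed. render_project is a hypothetical helper.
    Rails.cache.fetch(project.cache_key) { render_project(project) }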
We do this to invalidate multiple keys at once. All related keys are created based on a master key (well, the value of that key). Once that is changed, all keys are invalidated.
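Roughly like this, as a runnable sketch using ActiveSupport's in-memory store (all the key names here are made up):

    require "active_support"
    require "active_support/cache"

    cache = ActiveSupport::Cache::MemoryStore.new

    # The master key holds a generation number; every related key embeds
    # its current value.
    def generation(cache, account_id)
      cache.fetch("account/#{account_id}/gen") { 1 }
    end

    def derived_key(cache, account_id, name)
      "account/#{account_id}/v#{generation(cache, account_id)}/#{name}"
    end

    cache.write(derived_key(cache, 1, "dashboard"), "...rendered html...")

    # Bumping the generation invalidates every derived key in one write;
    # the stale entries just age out of the cache.
    cache.write("account/1/gen", generation(cache, 1) + 1)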
Only if you don't have good enough software engineers.
The CPU is organized that way, it works very well, and that caching scheme is implemented directly on the metal. Any decent software engineer should be able to design a working implementation with a clean architecture.
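For instance, the core of an LRU scheme like the hardware's fits in a few lines of plain Ruby. This is only a sketch, leaning on Hash's insertion ordering:

    class LRUCache
      def initialize(capacity)
        @capacity = capacity
        @store = {} # Ruby hashes preserve insertion order
      end

      def get(key)
        return nil unless @store.key?(key)
        @store[key] = @store.delete(key) # re-insert to mark as recently used
      end

      def put(key, value)
        @store.delete(key)
        @store[key] = value
        @store.shift if @store.size > @capacity # evict the least recently used
      end
    end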
One of the bright spots of my career was building a distributed system that spanned 64 different nodes and regularly did over 100 Mb/sec... built on machines holding (at the time) a top-end config of 2 GB of RAM per machine...
This is what 864 GB of RAM looks like after you've laid it all out on a table without using ESD (electrostatic discharge) protection. A few of these DIMMs are probably bad now, but you won't know which ones, because you're going to stuff them all in a Linux box that you built yourself... this is such a bad idea for a production system.
In my decades of building and taking apart computers, I have never had static discharge ruin a component. I can't remember the last time I was shocked by static discharge for that matter.
Who are these people who need to be grounded all the time else they build up to component destroying levels of static?
Handle enough DIMMs and the benefits of ESD protection start to become obvious. I've seen a very high correlation between DIMM errors and whether or not I was using a wrist strap while I was fiddling with the computer.
You don't need a noticeable spark or shock to make the DIMMs unhappy.
You don't need to feel the shock for it to be enough to destroy a component.
ICs are more robust when they're built into a circuit, but it's still a good idea to use sensible ESD protection when handling electronics, especially if you're spending a lot on production systems.
It depends on the humidity where you live, what kind of clothes you wear, whether you're on carpet or tile, and probably what you had for breakfast, etc. I've had ESD fry a hard drive, a RAM module, and maybe an SSD (or maybe it was bad firmware).
Ahhh. I've never lived very far from the ocean, so it's always pretty humid. Plus, I've always had wood or concrete floors and wood or metal tables, which limit static buildup.
I suppose if you're in Arizona, static buildup is a much bigger problem.
I'm in Arizona right now. I cuss out the static electricity about every other day, so, yes, around here I'd use an ESD wrist strap or something if I were going to handle DIMMs.
The static fears came from the early unbuffered CMOS chips, which were insanely sensitive to it. Yes, in theory you can still wreck a chip that way, but for the most part it isn't going to happen unless you're in an environment that's unusually prone to static buildup.
ICs are like cats: both have nine lives, and each time you handle one outside a protective environment it loses a life (the IC, not the cat :-)
You may not see a failure right now but you will see it later when the life-counter reaches 0.
You are probably more right than you know... I've always suspected something like this; although mainly that's just from my personal experience when I haven't used ESD, sometimes I've given a board a bad shock and it still miraculously worked afterwards. I've even seen people plug in power supplies backwards, smoke a component on a board, and have the computer still miraculously boot afterwards. No idea if the computer was stable long term, but it's surprising sometimes how much damage these components can take. Then again, on the other hand, sometimes they stop working when they are in a clean room where everyone wears Intel bunny suits and is grounded 24/7.
I generally do 24-72 hour burn-in tests on my computers every time I put them together and I only take them apart when they need upgrading or something breaks.
Maybe I'm just incredibly lucky. Then again, I've watched Dell and IBM service techs take apart laptops we had on-site support for without ESD protection, so maybe everyone else is simply paranoid.
> Then again, I've watched Dell and IBM service techs take apart laptops we had on-site support for without ESD protection, so maybe everyone else is simply paranoid.
This is the correct answer. Either that or every single Apple, Dell or UofT tech ever needs to be fired because I've had frequent repair experience with all of them and not once have they ever worn a wrist strap or mat. In fact, most techs mock the idea in my experience.
I would consider it if I were handling literally hundreds of modules like this, but for regular old desktop PC support it's complete overkill. The time it takes you to set up an anti-static mat and wrist strap every time you need to swap something out costs you WAY more overall than the occasional lost module, which rarely if ever happens. If you're really so concerned, set aside $100 to cover any potential dead module. $20 says you'll never use it and you'll make more money because you saved time.
Every field engineer is trained on the proper use of a static mat and wrist strap. It takes only 30 seconds to put one on. Just because the field engineers are lazy and are ignoring their training doesn't make the risk of ESD any lower than it is. In cold winter climates with forced-air heating and humidity under 10%, how likely do you think an ESD event is?
Inexplicable component failures; random errors weeks or even months down the road. These are all caused by ESD. Just because you want to ignore the laws of physics doesn't mean they don't apply to you.
> Every field engineer is trained on the proper use of a static mat and wrist strap.
Right, and they still don't use it and mock it as they are being trained.
> It takes only 30 seconds to put one on.
Digging out the mat and wrist strap, folding it out, putting the machine on it, and clipping in properly takes longer than 30 seconds, even if you have it around already. On top of that, you have to do it every time you want in and out of the case, on every call or event, every time. That adds up quickly.
> Just because the field engineers are lazy and are ignoring their training doesn't make the risk of ESD any lower than it is.
If ALL field techs without exception don't use it and mock it because in practice the risk is so ridiculously remote that it's not worth dealing with, then yes, it does make it lower than people like you make it out to be. I'm actually going to trust the guy who does it multiple times a day every day for years and years over the guy who insists the risk is greater than has ever been proven.
Either way, why should you care? I'm the one who will supposedly be paying for all these random dead components. I mean it hasn't happened once in years of being a field tech and never once using a strap and neither has it happened to anyone I know and the people I know are mostly techs and never use straps, but who knows what the future will bring?
> Inexplicable component failures; random errors weeks or even months down the road. These are all caused by ESD.
How convenient. We get to blame all potential future failures on ESD too. Nothing is a bad part or wear over time. It was all caused by you touching it without wearing the magic strap.
> Just because you want to ignore the laws of physics doesn't mean they don't apply to you.
Or Apple, Dell or UofT either. I'm just going to have to assume there is a magical anti-static field over Canada then.
I used to be a Sun field engineer. I had a foldable static mat with wrist straps in the pockets. It literally took me less than 30 seconds to unfold the mat, slide on the strap, and throw whatever system board I was repairing on the mat. The mat was padded as well, so I didn't have to worry about finding a safe surface to drop a 10 pound system board on.
The reason why the Dell techs are not using wrist straps is because they are lazy, and they know that even if ESD causes a component failure, some other poor tech will get the follow up service call and just come replace another part, or replace the same part again.
Have you ever wondered why a lot of replacement parts, after being installed without ESD protection, somehow are DOA (dead on arrival)? The factory that manufactures them surely tests them before shipping.
Are you seriously arguing that static charges of several thousand volts can't damage integrated circuits?
Some places get very dry and staticky in the winter. Grounding yourself can be as easy as pressing a bare ankle against a metal table leg; better safe than sorry.
The DC took the photo for us; I'd have to check with them what precautions they took, but it looks like the build station, if my memory serves. The RAM went into Dell boxes.
We've been playing with a Kove xpd (http://kove.com/xpress) with 2 TB RAM lately. 2 TB RAM and four Infiniband cards make for a very fast DB server... :-D
Ok, it may have been tongue-in-cheek, but I was serious. I love that there are so many great cloud hosting solutions now. I honestly don't want to run servers.
I question the architecture and approach of this much caching. Because this is memory, I assume it's for a read-only cache?
That said: most read-heavy services need at most 10% of their working set cached in memory. The hardest problem in caching is figuring out what that set is and how to control consistency. Not how to store it.
So, having this much cache seems to imply that you think you're going to have large amounts of read intensive data to cache.
Or else you intend to cache your entire working set? Either way, you'll have a single point of failure (either in a single server, single datacenter, or single geographic area).
Any large operation can tell you that the problem becomes 90% network, and stuff like a CDN becomes far more important than how fast and big your cache is.
Perhaps a nice technical writeup on your architecture would silence the pundit inside me?
Well, it's 256 GB of RAM each for 13 servers. I do wonder if swapping the second half of that RAM for SSDs would result in better performance, as your cache could be larger for the same budget. I think it'd really come down to your cache hit rate and cache size.
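Back-of-envelope, with made-up latency numbers (all three figures below are assumptions, not measurements):

    # Average fetch time: hit_rate * hit_latency + miss_rate * miss_latency.
    def avg_fetch_ms(hit_rate, hit_ms, miss_ms)
      hit_rate * hit_ms + (1 - hit_rate) * miss_ms
    end

    puts avg_fetch_ms(0.95, 1.0, 50.0) # all-RAM cache, misses go to the DB => 3.45
    puts avg_fetch_ms(0.85, 1.0, 5.0)  # smaller RAM cache backed by SSD   => 1.6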
memcached is standalone; no one server knows about any other server, so at a simple level, no. It also can't go out and fetch data from elsewhere: it's either got the data or it hasn't.
You could write a more complex caching layer on top of memcached that looked first in memcached, then fetched from solid-state storage if it wasn't there. That's kind of what we will be using it for by wrapping Rails around it.
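Something like this sketch, using the dalli memcached client (SsdStore is a made-up stand-in for whatever SSD-backed storage sits behind it):

    require "dalli"

    class TieredCache
      def initialize(memcached, ssd_store)
        @memcached = memcached # e.g. Dalli::Client.new("localhost:11211")
        @ssd = ssd_store       # hypothetical SSD-backed store
      end

      # Look in memcached first; on a miss, fall back to the SSD tier and
      # repopulate memcached on the way out.
      def fetch(key)
        value = @memcached.get(key)
        return value if value

        value = @ssd.read(key)
        @memcached.set(key, value) if value
        value
      end
    end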
> You could write a more complex caching layer on top of memcached that looked first in memcached, then fetched from solid-state storage if it wasn't there.
I wonder how that would compare to just setting up an SSD as swap space.
Memcached is designed to be fast. Anything that makes it slower that doesn't need to be there isn't there. For example: authentication (anyone who can access the port can fetch data), indexing (you need to know the key), deallocating memory (you configure memcached with an upper bound; it keeps allocating memory as needed until it gets there), etc.
Virtual memory (I assume you actually mean 'does memcached page data out to disk?') would make it much slower. Since memcached is just that, a memory cache, if you're out of memory it just expires the least-recently-used data. In your application, you fetch the key, and if that fails you fetch it from the primary data store (or wherever else you can find it).
In theory it probably could, but you'd lose all the benefit at that point. The big selling point of memcached is that it's VERY fast. Our memcached server (ours is only 8 GB, not nearly as impressive as some) averages under 1 ms object fetch times, even accounting for network overhead.
Is there any point in not brute-forcing it? While you twiddle bits, I can throw money at hardware and get my product out on the market. Then, when times are more stable, I can worry about optimizations.
Also, since I haven't done any twiddling yet, there is likely to be a lot of low hanging fruit. I was going to need that hardware anyway if I was to grow. Now I can have a period of time where my savings allow me to slow down on buying more hardware.
> I can throw money at hardware and get my product out on the market.
Up to a certain scale what you say is true - but it's not interesting at all. As in, anyone can drive 26 miles, but running a marathon is still impressive. Or anyone can order dinner in a restaurant, but not everyone can cook.
Running a marathon is impressive, yes. But if I'm in the business of delivering food (to stretch an already thin analogy ;), there's no point in training for a marathon. The constraints of my business make driving around in a car a better choice.
And unless I'm a professional chef, I'd rather not spend time on cooking dinners when I could do something to improve my business.
The point of building something (in a commercial environment) is not to impress, but to ship something.
If everyone else was saying that you should use an army of autonomous quadcopters to deliver stuff, then it would be worth pointing out that a simpler solution is available. 37S is challenging the cloud and scale-out cargo cults.
It's just plain cool for someone that's interested in computer technology. I like to see this type of picture for all of my interests: people's amazing home theaters, people's really nice guitars, etc. Sure, it's just money that pays for this stuff, so it's not necessarily impressive in an intellectual sense. It's just cool.
> Anyone can brute-force it.
Not exactly. It's tough to increase server hardware any faster than polynomially over time, but most networks tend to go through exponential phases of growth when they scale. Also, diminishing returns and bottlenecking can often kill attempts to scale a network by just throwing more of the same hardware at the problem.
Based on these prices, the options on this Thinkmate server configuration page (goes up to 2TB), and the reports in this thread, it looks like in 2012, 864GB is not necessarily a "brute" amount of RAM.
At this point, apparently, 864 GB is just a really big server, but not nearly the biggest. If the RAM is only $12,000, then that is way less than most of the employees' cars at Basecamp cost. The cars just drive individual employees to and from work, etc.
This server is going to handle how many thousands (millions?) of people's business? People will spend $50k on one car. The server probably costs less than that.
Anyway, 864GB of RAM is impressive to look at because most people have never seen that amount of RAM all at once before.
So I would say that knowing how to build and take advantage of surprisingly large amounts of RAM _is_ an impressive and useful skill these days. Probably more useful than one's skill in using very small amounts of RAM.
I remember the first time I saw a PC Unix server that had 32 MB of RAM. This was back when 640k was enough for everyone. The hard drive capacity on my home PC was 40 MB. It was also back in the day when the BIOS made a ticking sound as it checked your RAM on boot.
I was sitting at another workstation and I could hear this ticking from the other room. I got up, walked over, and got to see this machine ticking away for over a minute while it booted up. 32 MB of RAM ... amazing.
The impressive part is them understanding and taking advantage of "opportunity cost".
In business, nobody cares what you can do with "how little [computing] resources". Running a web service is not a circus act.
What matters is how effectively you do what you do, and that is not just a function of using little computing resources. Efficiency includes saving your team's time, saving engineering pay, not evolving your system into a messy architecture, and having peace of mind. When a bucketload of RAM costs 1/5th of a developer's yearly salary, use the RAM and not 2-3 developers for a year.
For that same reason, the fact that "anyone can brute-force it" doesn't mean a thing. It's like saying "anyone can save money". Actually it reminds me of this classic quote:
Gonzo: Well, I want to go to Bombay, India and become a movie star.
Fozzie: You don't go to Bombay to become a movie star! You go where we're going: Hollywood.
No, but a lot of "sysadmins" are being raised with no knowledge of hardware and some others like to cache knowledge for years and need a good shock to bring them into the present.
I'd bet that Facebook uses a fairly complex caching system though, so your point that using extensive caching says anything about the underlying stack's ability to scale still doesn't really work for me.
Yes, big companies put some things into a different perspective. Just today I heard a sysadmin saying they lost 50 GB or so of storage per server due to some partman issue. But that's OK... it adds up to only ~10+ TB, and only temporarily (a fix via LVM resizing is possible).
It's great that companies post articles describing their architectures and backend systems (and I'm certainly looking forward to your follow-up post detailing your caching system), but why show a picture of commodity hardware available to anyone with enough money? What's impressive is your skills, not how much money is in your bank account.
Because it's fun. Toys are fun. Upgrades are fun. That's all it is. And most of us never get to work with that kind of hardware. It's fun to see it all in one place.
They have lots of cores and are very NUMA.