Redis 2.2.0 RC1 is out (antirez.com)
109 points by antirez on Dec 15, 2010 | 39 comments



This is great news! I'm particularly interested in these changes:

* Sorted sets are now less memory hungry.

* Now write operations work against keys with an EXPIRE set! Imagine the possibilities.

I use Redis for many, many things. In fact, I realized the other day that without it, I probably wouldn't still be bootstrapping. Not because I couldn't use something else, but because I wouldn't enjoy the work nearly as much. Starting a company is a long, arduous road and finding joy in the work is really important.


Out of curiosity, what do you use it for? I use it for caching and session storage, but I still keep postgres for the heavy lifting. I love that combination, especially the fact that I can put session (and other non-essential) data into redis and have it persisted and available in milliseconds, yet if redis crashes I lose nothing.


I use sorted sets for a lot of stuff (timelines mostly). A few other examples are job queuing, error logs (I write all error messages to a fixed length list), and rate limiting. Actually that last one works pretty well. The key is something like:

  v1:limiter:UNIQUEID:TYPE:HHMM
where UNIQUEID is a username or IP address, etc., TYPE is the specific action to limit (login, signup, etc.), and HHMM is a quantized timestamp (e.g. 1205, 1210). You increment this key every time the action occurs and use an expire to keep it clean. It's not perfect, but it does the trick for most use cases.
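
Something like this, assuming the redis-py client (the bucket size, cap and example IP below are made-up illustrative values):

  import time
  import redis  # assuming the redis-py client; any client works the same way

  r = redis.Redis()

  def over_limit(unique_id, action, max_hits=10, window_secs=300):
      """Count actions in a quantized time bucket and compare against a cap."""
      # Quantize the timestamp into 5-minute buckets, e.g. "1205", "1210".
      bucket = time.strftime("%H%M", time.gmtime(time.time() // window_secs * window_secs))
      key = "v1:limiter:%s:%s:%s" % (unique_id, action, bucket)
      hits = r.incr(key)              # INCR creates the key at 1 if it's missing
      r.expire(key, window_secs * 2)  # EXPIRE keeps old buckets from piling up
      return hits > max_hits

  if over_limit("203.0.113.7", "login"):
      print("slow down")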

I also make pretty judicious use of databases to group different classes of data together. Primarily into 3 groups actually: transient data, persistent/customer data, and internal stuff like the error logs and job queues. That way I can dump data at different intervals for each group (I use a tool I wrote called redis-dump: https://github.com/delano/redis-dump).


Thanks, I'm just gathering use cases to see if people are using it as a replacement for a relational database. I don't think I would; it's not the best tool for that job.

Thanks again for the data point!


Redis is one of the best pieces of software I've used in the past few years. I use it so many ways in production it's loco. Having flexible data structures besides (string)k, (string)v is a huge boost.


Thank you! What is interesting about 2.2 is that we discovered that even operations against plain old strings can greatly enhance the power of Redis.

For instance, GETBIT/SETBIT turn a string into a large bitmap, while GETRANGE/SETRANGE let users treat strings as arrays of fixed-length data.

I think there will be big use cases for these new commands, as it is possible to store a lot of data in little space, with O(1) random access.
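
Something along these lines, using the redis-py client (the key names and record size are just examples):

  import redis  # assuming the redis-py client

  r = redis.Redis()

  # Bitmap: one bit per user id, e.g. "did user N log in today?"
  r.setbit("logins:2010-12-15", 4242, 1)      # SETBIT grows the string as needed
  print(r.getbit("logins:2010-12-15", 4242))  # -> 1, O(1) random access

  # Fixed-length records packed into one string: 8 bytes per record.
  r.setrange("records", 3 * 8, b"record#3")       # write record 3 in place
  print(r.getrange("records", 3 * 8, 3 * 8 + 7))  # read back just those 8 bytes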


These additions make compact data storage and access much simpler from our perspective. Instead of loading and parsing an entire entry, we have access to just the data we want.

Nice work, Salvatore. Seriously.


Re SETBIT, I was thinking "cool, redis is now also a bloom filter server!" Then I realized I've been out of the loop for a while and you've probably already implemented commands for that :)


I love Redis and the new features rock! I use Postgres and Redis in tandem and it is a great combo. Postgres for all the stuff SQL databases are good at and Redis for all the stuff they are not.


Really? What stuff is PostgreSQL not good at, exactly?


That tone is asking for some passionate replies ;-)

This isn't about PostgreSQL at all. There are things where relational databases are a good fit, and things that are a bad fit. As such, this isn't about what PostgreSQL does poorly, but about the cases where a key-value store with some pretty good data structures comes in handy. Would you really store temporary data in your relational database? Would you use it as a caching mechanism? Perhaps as a queue? Not unless you're crazy. So I guess this isn't about where PostgreSQL performs poorly, but about where Redis is a better solution.


I'm not sure how to efficiently implement redis-style sorted sets using SQL, but the fact that I have to think about it at all means it's easier to use redis.

A lot of the other things could probably be done with the right combination of stored procedures and clever SQL but again, redis makes it so much easier as to make things qualitatively different. It's just a bunch of C files with no real external dependencies (I don't even have to run ./configure before building it!).

Operationally, for a generic non-db-expert kind of person like me, redis is much simpler to manage than PostgreSQL. With redis I don't need to worry about vacuuming, write ahead logs, archiving, query tuning & statistics, lock management, etc (to name a few things from the config file).


trivial example: lists.

You have items that you want to retrieve (LRANGE) in the same order as, or in reverse order from, the one in which you push them (LPUSH / RPUSH). Usually you need the latest items (think of timelines), and all this should be FAST and efficient.

In Redis this is trivial to model. SQL is so far from modeling this the right way that you need an "ORDER BY" clause for every LRANGE-like query even when there is nothing to order; you just want to retrieve things in their natural insertion order.

Sorted sets can model a zillion use cases in the same way. They are ordered by score on insertion. You can ask for ranges, for the RANK of an element, and things like this, in a matter of microseconds per query. Completely impossible to model with SQL in a natural and fast way.
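
A quick sketch of both, with a recent redis-py client (the key names and the mapping-style ZADD call are just my example):

  import time
  import redis  # assuming a recent redis-py client

  r = redis.Redis()

  # List as a timeline: latest items first, no ORDER BY anywhere.
  for post_id in ("post:1", "post:2", "post:3"):
      r.lpush("timeline:alice", post_id)
  latest = r.lrange("timeline:alice", 0, 9)        # ten most recent items

  # Sorted set: members kept ordered by score (here, a timestamp) on insertion.
  r.zadd("activity", {"alice": time.time(), "bob": time.time() - 60})
  recent = r.zrangebyscore("activity", time.time() - 3600, "+inf")
  rank = r.zrevrank("activity", "alice")           # position when sorted newest-first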

Just to cite another one (but there are tens of cases like this): bloom filters, anyone? With Redis 2.2 you can manipulate a single bitfield at the bit level, even accessing individual bits within a byte if you wish, with very efficient operations.
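
A toy bloom filter on top of GETBIT/SETBIT might look like this; the hashing scheme, sizes and key name are my own simplification, not something Redis provides:

  import hashlib
  import redis  # assuming the redis-py client

  r = redis.Redis()
  NUM_BITS = 2 ** 20   # 1 Mbit bitmap -> roughly a 128 KB Redis string
  NUM_HASHES = 4

  def _offsets(item):
      # Derive a few bit offsets from independent hashes of the item (toy scheme).
      for i in range(NUM_HASHES):
          digest = hashlib.sha1(("%d:%s" % (i, item)).encode()).hexdigest()
          yield int(digest, 16) % NUM_BITS

  def bloom_add(key, item):
      for offset in _offsets(item):
          r.setbit(key, offset, 1)

  def bloom_maybe_contains(key, item):
      # False positives are possible, false negatives are not.
      return all(r.getbit(key, offset) for offset in _offsets(item))

  bloom_add("seen:urls", "http://example.com/")
  print(bloom_maybe_contains("seen:urls", "http://example.com/"))  # True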


I use sorted sets quite heavily for timelines. And it's awesome.

One use case I can't figure out is storing/querying IP address ranges. Is there a natural way with Redis to check whether a number is within a given range? (Without storing every value and without multiple calls.)


There is a very easy way to model this: just convert the IP address into a 32-bit integer! :) Then use ZRANGEBYSCORE to query.
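
Something like this with a recent redis-py client (the key name and the mapping-style ZADD are just my example):

  import socket
  import struct
  import redis  # assuming a recent redis-py client

  r = redis.Redis()

  def ip_to_int(ip):
      # Pack the dotted quad into a 32-bit unsigned integer.
      return struct.unpack("!I", socket.inet_aton(ip))[0]

  # Score each stored address by its integer value...
  r.zadd("seen:ips", {"159.18.3.7": ip_to_int("159.18.3.7")})

  # ...then a single ZRANGEBYSCORE returns everything inside a numeric range.
  in_range = r.zrangebyscore("seen:ips",
                             ip_to_int("159.18.0.0"),
                             ip_to_int("159.18.255.255"))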


What about the other way around though? Storing the range, say, 159.18.0.0 - 159.18.255.255, and then querying to check if an address is in that range.

The only way I can think to do this is to store the range as two integers, as you suggest, and query twice: first to find the nearest lower bound for an IP, then a second time to check the upper bound.
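
Or, if the stored ranges never overlap, maybe a single query would work: keep the lower bound as the score, encode the upper bound in the member, and fetch the nearest lower bound with ZREVRANGEBYSCORE ... LIMIT 0 1. A rough sketch with redis-py (the member encoding and key name are just my own convention):

  import socket
  import struct
  import redis  # assuming a recent redis-py client

  r = redis.Redis()

  def ip_to_int(ip):
      return struct.unpack("!I", socket.inet_aton(ip))[0]

  def add_range(key, start_ip, end_ip):
      # Score = lower bound; the member carries the upper bound (assumes non-overlapping ranges).
      r.zadd(key, {"%s-%s" % (start_ip, end_ip): ip_to_int(start_ip)})

  def range_containing(key, ip):
      # One ZREVRANGEBYSCORE: the range with the greatest lower bound <= ip...
      hits = r.zrevrangebyscore(key, ip_to_int(ip), 0, start=0, num=1)
      if not hits:
          return None
      member = hits[0].decode()
      start_ip, end_ip = member.split("-")
      # ...is a match only if its upper bound also covers the address.
      return member if ip_to_int(ip) <= ip_to_int(end_ip) else None

  add_range("ip:ranges", "159.18.0.0", "159.18.255.255")
  print(range_containing("ip:ranges", "159.18.42.1"))  # -> 159.18.0.0-159.18.255.255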


Writing things without doing a sync for every write.


Simple storage. Just compare the speed of storing key/value pairs. Redis is many times faster, particularly for inserts and updates.


I've been using Redis for 2 months now. It's a real pleasure to use.

I especially like the simple protocol. It's possible to write a simple client without any external libraries within days -- maybe hours if you're really good ;). Try to do that with SQL or MongoDB (Javascript parser anyone?).
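
To give an idea, here's roughly what a toy client looks like with nothing but Python's socket module; the unified request protocol is just length-prefixed arguments, and the reply type is keyed off the first byte (error handling kept to a bare minimum):

  import socket

  class TinyRedis:
      """A toy Redis client: just sockets and the unified request protocol."""

      def __init__(self, host="localhost", port=6379):
          self.sock = socket.create_connection((host, port))
          self.buf = self.sock.makefile("rb")

      def execute(self, *args):
          # Requests are multi-bulk: *<argc>, then $<len> + payload per argument.
          out = [b"*%d\r\n" % len(args)]
          for arg in args:
              arg = arg if isinstance(arg, bytes) else str(arg).encode()
              out.append(b"$%d\r\n%s\r\n" % (len(arg), arg))
          self.sock.sendall(b"".join(out))
          return self._read_reply()

      def _read_reply(self):
          line = self.buf.readline().rstrip(b"\r\n")
          prefix, rest = line[:1], line[1:]
          if prefix == b"+":            # status reply, e.g. +OK
              return rest
          if prefix == b"-":            # error reply
              raise Exception(rest.decode())
          if prefix == b":":            # integer reply
              return int(rest)
          if prefix == b"$":            # bulk reply ($-1 means nil)
              length = int(rest)
              if length == -1:
                  return None
              data = self.buf.read(length + 2)  # payload plus trailing \r\n
              return data[:-2]
          if prefix == b"*":            # multi-bulk: n nested replies
              return [self._read_reply() for _ in range(int(rest))]
          raise Exception("unexpected reply: %r" % line)

  r = TinyRedis()
  r.execute("SET", "greeting", "hello")
  print(r.execute("GET", "greeting"))   # -> b'hello'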


i think mongodb fills a need that isn't covered by redis or sql -- people still need semi-relational data that can scale beyond 100M rows.

redis is a nice memcached or scalable data structures replacement. we use it as a simple rabbitmq replacement.


Yes. I didn't want to criticize MongoDB or SQL based databases. Just point out that Redis' protocol is small and easy to implement.


Why did you need to replace rabbitmq at all? (I love redis, but I'd expect that using something designed specifically for X is better than using something else that also kind of does X.)


RabbitMQ is a complex beast and was, last time I used it at scale, riddled with problems under load. We had extremely serious issues like spontaneous lockups.

We eventually abandoned it when we realized our queueing needs could be modeled in Redis. In a way that we fully control, understand and can debug. And it wasn't even much work!

I'd argue this is highly preferable unless your project really needs complex routing of the kind that only AMQP can provide. Most projects don't.


I use redis to replace RabbitMQ as well, but it's all abstracted away in Celery. A question: Don't you need polling when you use redis in this way? Does it have some sort of notification functionality, apart from the recently added pubsub?


Our queues are implemented using BLPOP which blocks until items arrive, so no polling is needed.
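
The basic shape of such a worker, sketched with redis-py (the queue name and handler are made up):

  import json
  import redis  # assuming the redis-py client

  r = redis.Redis()

  def produce(task):
      # Producers append jobs to the tail of the list.
      r.rpush("jobs", json.dumps(task))

  def consume():
      # Consumers block on BLPOP until an item arrives -- no polling loop.
      while True:
          _key, payload = r.blpop("jobs")
          handle(json.loads(payload))

  def handle(task):          # hypothetical job handler
      print("processing", task)

  produce({"action": "resize", "image_id": 42})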

The newer pub/sub stuff promises to be very useful, too, but we didn't have a use-case for that, yet. Our app generally needs messages to persist until they are consumed.

What I can say however is that none of the concerns we initially had about performance/scalability held any water.

We are still running on a single redis instance (plus a slave) on a moderately sized server and it happily processes 100 messages/sec on average between 4 producers and a varying number of consumers (20-100).

Better still, our monitoring metrics show that this isn't even a worthwhile load for redis. The server is nowhere near breaking a sweat: the CPUs barely drop below 90% idle, there's no disk i/o to speak of, and memory usage is more than reasonable (plenty of headroom for our purposes).

Thus "just throw it at redis" has long become a common stopgap meme in this particular project. And so far we didn't have to replace any of these supposed stopgaps with something else.


Ah, very nice. It seems that I shouldn't have any qualms about throwing redis at most of my problems then, thanks!


in case you're wondering, we use a simple queueing thing called redpack:

https://github.com/luxdelux/redpack



Thanks for the reply. I can see how, if you make no use of functionality other than push/pop, redis would be just fine.

Could you be so kind as to quantify what load we're talking about, in terms of producers/consumers and message rate/size/persistence level?

I have used ActiveMQ with systems of about ten nodes and a message rate of <2k/sec and it worked fine, and I always believed rabbitmq was faster/more stable (erlang bias!). So if your volume was 500k/s I'm OK; otherwise I'd file this as another slightly worrying piece of information about rabbitmq (after the problems I heard about from reddit).


After doing some stability testing, I'm now running 2.2.0 RC1 in production on http://wasitup.com

Unwise, maybe. But thanks to the ease of doing this upgrade on several nodes with Puppet I couldn't help myself.


Jeremy Zawodny is also running 2.2.0 at Craigslist, so it's really a case of: you are in good company :)


Nice work! I'm also a big fan of Redis, and use it extensively as a smart caching layer in front of an RDBMS. Compared to DB calls, each operation is essentially a NoOp. Just like turning on nitro for your web app - instant speed-up!


And he (antirez) just said:

"oops, just found a bug on RC1 (setbit/getbit) I guess it's going to be RC2 soon ;)"


sorry false alarm, no bug :)


and then he said:

"Nevermind, no bug about RC1, just my mistake".

Twitter to HN gateway. :-)


Does anyone actually use the replication in Redis to improve durability?


What does LPUSHX/RPUSHX do differently?


Push only if the list exists. These are the operations at the base of the Twitter Redis caching implementation.

These operations are useful whenever you want to use Redis for caching timelines. The idea is: if this user is already in cache (at least a single item exists), then push against the cache; otherwise don't.
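
A sketch of that idea with redis-py (the key names, cache length and rebuild helper are my own illustration):

  import redis  # assuming the redis-py client

  r = redis.Redis()

  def push_to_timeline_cache(user_id, item):
      # LPUSHX only pushes when the key already exists, so users whose
      # timelines aren't cached are simply skipped (no stray one-item lists).
      key = "timeline:%s" % user_id
      if r.lpushx(key, item):
          r.ltrim(key, 0, 799)   # keep the cached timeline bounded

  def load_timeline(user_id, items):
      # On a cache miss, rebuild the whole list from primary storage first.
      key = "timeline:%s" % user_id
      r.delete(key)
      for item in items:
          r.rpush(key, item)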


Ah, thanks for the explanation. I'm actually exploring Redis for storing timelines!

Twitter's Redis-backed timeline storage is here, if anyone is interested: https://github.com/twitter/haplocheirus





