Redis入门教程
最近互联网公司中很流行key-value数据库,但这是大学数据库课上学不到的东西,所以我讲通俗的翻译篇这方面的教程
简短总结下Redis NOSQL世界的充满冒险的旅程
我们旅途的行程
就像其他的旅途一样, 我们的旅途也是有很多行程组成, 在开始之前我先将他们写下来:
- Redis? 这是什么?
- 可用的数据类型
- 表在哪里?
- 一个简单例子
- 回顾
行程1: Redis? 这是什么?
为了长话短说, Redis是一种加速的key-value数据库. 为什么加速? 简单说他非常快(所有的数据都加载在内存中和磁盘中) 和丰富的特色 (它支持多种数据类型和对其的复杂操作).
一个很重要的redis特色是它不是普通意义上的数据库;它是一种理解上的数据库(能为你储存和保存数据),他它不支持任何的sql语句(其他关系型数据库都支持sql语句). 不要害怕, Redis 不是一种容易丢失数据的数据库 。它只是不提供我们钟爱的SQL语句的支持,但是它有健壮和测试的能和我们交互的协议 。
在Redis中,不处理表和查询,连接,视图语句; 不提供整型和浮动字符串等。你不得不去操作低层的数据集结构和更多的原始类型。
行程2:可用的数据类型
让我们进一步观察下这种奇怪的数据库的工作原理。按照前面行程所说, Redis是基于key-value形式存储的,我们重点要观察他key的部分.
Key是一种简单的字符串(officially not binary safe. UPDATE: now, key strings are binary safe.) 例如“username" 或"password". 你不能在key中有空格,但对字符等支持的非常好如 ".", ":", "_" , 所以我们可以用以下类似的key "user_name", "user:123:age", "user:123:username" 他们是完全合法的。
我们必须注意key,因为它不像关系型数据库,依靠字段名来区分数据。 Redis 根本就不认识数据表,非常简单的语句"SELECT username from users WHERE user_id=123;" 不得不通过其他的方法来实现。 举个例子,为了实现前面的SQL查询,我们不得不通过"user:123:username"为key来得到你要的结果。通过ID 来达到关系型数据库中字段的效果。为了更深入了解key,所以我们来看下key可用的数据类型。
Strings(以下说的很清楚,就不翻译了)
Redis strings are the most basic available datatype; they are normal, binary-safe strings of maximum dimension of 1 Gb.
You can set a key to a string value using the SET command and get it using GET. If you need to store a numeric value, such as counter, you can safely save it on a string and use INCR or DECR to increment or decrement it.
Lists
Redis lists are collections of Redis strings sorted by insertion order. You can think of a list as a chain, you can add a new link on the left extremum (the head) or on the right (the tail); you can also add a link in the center, but you have to broke another link before.
You can add elements to a list using the LPUSH and RPUSH commands (L for left and R for right), get an element (removing it) with LPOP and RPOP and get a range of elements (without removing any) with LRANGE. You can also add an element on a given point with LSET, but, as for the chain example, this operation is much more slower than a simple push.
Hashes
At the time of writing, hashes are the newest datatype, implemented few days ago, it will be released shortly in the coming release. Hashes are a relatively long waited feature that helps storing data in a more coherent and neat way. Hashes implements a sort of key-value paradigm inside a key, so, for example an hash "user" may cointain some different fields to wich correspond different values, just like hashes do in programming languages such as ruby, javascript, etc…
Sets
Redis sets are just like the homonymous mathematical structure, collection of distinct objects; in this particular case the objects are Redis strings. As you could argue, sets are different from lists for two main reasons: elements are unordered and they are all distinct, you can’t have two equal elements.
You can add an element to a set with the command SADD, remove an element with SREM, get (and remove) a random element with SPOP and perform the set operations of union, intersection and difference with SUNION, SINTER and SDIFF respectively.
Ordered sets
Redis ordered sets are quite like normal sets, with the exception that every set element has an associated weight used to take this element in order with others.
You can work with sorted sets as with normal ones, what changes is just the names of the commands: ZADD to add an element, ZREM to remove it and so on. Two different commands are ZINCR and ZSCORE, the former is used to increment the score of an element and the latter to get its score.
行程3: 表在哪里呢?
So, working with Redis data is really different from what we’re used to with SQL tables, you haven’t a language to query the server, you just have some commands to manipulate the keys stored in the database. Commands are datatype-sensitive, you can’t use a set command on a list, otherwise you’ll get an error. Commands could be issued via the redis-cli or using one of the many wrappers for your preferred programming language. In the following, we’ll be agnostic about the way you’ll query Redis, we’ll just write the commands and you’ll issue them as you prefer.
Let’s think about a simple SQL table where we could save users for some application:
id | username | password | name | surname |
---|---|---|---|---|
1 | user1 | pass1 | Bob | Smith |
2 | user2 | pass2 | Mario | Rossi |
存储数据
We want now to save the same data in Redis, so we have to design our database to fit this situation. Maybe, it’s better to think about the problem from an application-centric point of view; in the SQL version, our application would get users’ data issuing a SELECT statement having as clause the users’ id; in other words, we need a way to distinguish between different data entries. Additionally, we need to get user’s data with the sole knowledge of an unique identifier. This situation could be solved solved if we use the user id as a part of the redis key, so the records of the previous table would translate as
Key | Value |
---|---|
user:1:username | user1 |
user:1:password | pass1 |
user:1:name | Bob |
user:1:surname | Smith |
user:2:username | user2 |
user:2:password | pass2 |
user:2:name | Mario |
user:2:surname | Rossi |
Well, given the id of an user, we could now get all its data reading the keys user:id:username, user:id:password, user:id:name and user:id:surname.
用户登录
Our data are suitable for a login process, so we have to provide a way to gets the user’s id, given the username; in other words, we have to establish a relation between usernames and user ids. This can be done if we add another redis key to our data design: user:username:id.
Key | Value |
---|---|
user:user1:id | 1 |
user:user2:id | 2 |
So, if Mario Rossi will try to login to our application, we can get his id knowing its username and so we’ll get all its data.
主键?
Another problem is the uniqueness of our ids. In the SQL world we say, for example, "id int primary key auto_increment", now we have to implement a similar behavior to ensure that we have a different id for each user. As in the previous case, Redis has a solution: we can simply create another key "user:next_id" and use it as a counter that we manipulate via the INCR command whenever we add a new user.
SELECT * FROM users;
The next problem is to get the user list. We could think that what we’ve already done is enough to get this list: we could get the current value "user:next_id" and get users’ data from 0 to this value in one or more steps. Well, now let’s think about an user that has cancelled it’s subscription (our next problem), when we’ll traverse all the integers from 0 to user:next_id we’ll find its old id that now should not has any data.
Althought this is not a serious problem, we wouldn’t waste our time trying to get the data of users that don’t exist anymore, so we’ll create a new key "user:list" of list (or set) type in wich we’ll add all new user ids, removing them when needed. We prefer using a list because this give us the possibility to do a sort of "pagination" (maybe you’re thinking about "LIMIT"?) using the LRANGE command.
删除用户
The last problem is a matter of "data integrity"; what would happen when we delete a user? We sould remove anything that references to it. In other words, we should delete all the keys user:id:*, user:username:id and the id in "user:list".
行程4: 一个简单实例
As an example of what we learned in this journey, let’s try to design an application that will work as a virtual library that let us group our books by topics. This example will be slightly more complex than the user’s table, but we’ll teach us how to handle relations with Redis.
In our application, we should be able to collect our books, storing informations such as title, author(s), topic(s), pages, price, ISBN and a description. Obviously, we could have a book with more than one author and that cover more than one topic at the same time (for example, a book that is about programming ad ruby). Additionally, one author could have written more than one book and a topic could be dealed with in more than one book. From what we’ve written it arises clearly that there’re a many-to-many relation between authors and books and between topics and books.
The SQL case
First of all, let’s try to design the SQL tables (with some data) that model this situation, in order to see how works the translation in the Redis idiom:
Books
id | title | pages | price | isbn | description |
---|---|---|---|---|---|
1 | Programming Ruby: The Pragmatic Programmers’ Guide | 829 | $26 | 0974514055 | The bible of the ruby programming language |
2 | Erlang Programming | 496 | $42 | 0596518188 | An interesting introduction to erlang |
Authors
id | name | surname |
---|---|---|
1 | Dave | Thomas |
2 | Chad | Fowler |
3 | Andy | Hunt |
4 | Francesco | Cesarini |
5 | Simon | Thompson |
Topics
id | name | description |
---|---|---|
1 | programming | Books about programming |
2 | ruby | Books about ruby |
3 | erlang | Books about erlang |
Books-Authors
book_id | author_id |
---|---|
1 | 1 |
1 | 2 |
1 | 3 |
2 | 4 |
2 | 5 |
Books-Topics
book_id | topic_id |
---|---|
1 | 1 |
1 | 2 |
2 | 1 |
2 | 3 |
The Redis case
It should be already clear that the three tables Books, Authors and Topics are not a problem; we’ve already learned how to save this data in Redis in the previous leg (see the next image). The problem arises when considering the link tables Books-Authors and Books-Topics that implement the two many-to-many relations. How to implement them? Let’s think about topics only, the problem for authors is merely a twin (with a different name) of the former and we’ll use the same strategy.
For every book we have to know the topics that belongs to it and, in the same way, for every topic we have to know every book that deals with it. In other words, for each book we need a collection of the related topic ids and for each topic a collection of the related book ids. This is a problem where sets fit perfectly! We’re going to create two sets, book:id:topic and topic:id:books, in the former we’ll save the topic ids that match the book and in the latter we’ll save the ids of the books that match this topic. For example, referring to the data shown in the previous sub-leg, the book "Programming Erlang" (book id 2) will have a key book:2:topics of set type, with elements (1, 3); while the topic programming will have a set topic:1:books with elements (1, 2).
As said early, the same works for authors, so we could say that a SQL’s many-to-many relation translates in Redis as two sets. This is really useful, because it give us the ability to obtain an important feature for free: we could easily find a book that covers more than one topic doing a set intersection between all topic:id:books sets related to the topics we are interested on. So, for example, the intersection between topic:1:books (books about programming) and topic:2:books (books about ruby) will return the set (1), i.e. only the book "Programming Ruby" with id 1.
A special care should be used whenever we remove something from our database. As shown in the previous figure, for example, topics have a reference to books and books have a reference to topics, so, what to remove? Well, surely we should remove every book:id:* keys, but before doing this we have to traverse all topics to remove the book id from the set topic:id:books. Additionally, we should remove the book id from the book:list list. If we want to remove a topic, we have to procede in a very similar way: before removing each topic:id:* key we have to traverse all books referenced by the topic we want to delete and remove the topic id from the set book:id:topics. The same apply for authors.
We like fun and programming, so to lear by doing consider taking a look to the companion code of our virtual library. It is implemented in ruby, using ezmobius’ redis-rb. Both the code and the look’n’feel are very rough, so feel free to make a pull request if you have improvements.
行程5: 回顾
我们的旅途结束了, 让我们回顾下整理下旅途中的行李和纪念品。
在我们行李里放满了Redis相关内容: 数据类型, 命令等,
纪念品是三类redis使用模式:
- handling the unique id problem with a string key and the INCR command;
- handling a user login via an user:username:id type key;
- handling many-to-many relations with sets.