a bitwise operation 广告投放监控
将随着时间不断增大的数字N个依次编号为1到N的N个球,颜色每次随机为红黑蓝,时间上先后逐个放入篮子中,
计算离现在最近的24个球的红、黑、蓝颜色数
广告投放监控
a bitwise operation
http://redis.io/commands/bitop
http://blog.getspool.com/2011/11/29/fast-easy-realtime-metrics-using-redis-bitmaps/
BITOP operation destkey key [key ...]
Perform a bitwise operation between multiple keys (containing string values) and store the result in the destination key.
The BITOP command supports four bitwise operations: AND, OR, XOR and NOT, thus the valid forms to call the command are:
BITOP AND destkey srckey1 srckey2 srckey3 ... srckeyN
BITOP OR destkey srckey1 srckey2 srckey3 ... srckeyN
BITOP XOR destkey srckey1 srckey2 srckey3 ... srckeyN
BITOP NOT destkey srckey
As you can see NOT is special as it only takes an input key, because it performs inversion of bits so it only makes sense as an unary operator.
The result of the operation is always stored at destkey
.
Handling of strings with different lengths
When an operation is performed between strings having different lengths, all the strings shorter than the longest string in the set are treated as if they were zero-padded up to the length of the longest string.
The same holds true for non-existent keys, that are considered as a stream of zero bytes up to the length of the longest string.
Return value
The size of the string stored in the destination key, that is equal to the size of the longest input string.
Redis bitmaps – Fast, easy, realtime metrics
At Spool, we calculate our key metrics in real time. Traditionally, metrics are performed by a batch job (running hourly, daily, etc.). Redis backed bitmaps allow us to perform such calculations in realtime and are extremely space efficient. In a simulation of 128 million users, a typical metric such as “daily unique users” takes less than 50 ms on a MacBook Pro and only takes 16 MB of memory. Spool doesn’t have 128 million users yet but it’s nice to know our approach will scale. We thought we’d share how we do it, in case other startups find our approach useful.
Crash Course on Bitmap and Redis Bitmaps
Bitmap (aka Bitset)
A Bitmap or bitset is an array of zeros and ones. A bit in a bitset can be set to either 0
or 1
, and each position in the array is referred to as an offset. Operations such as logical AND, OR, XOR, etc. and other bitwise operations are fair game for Bitmaps.
Population Count
The population count of a Bitmap is the number of bits set to 1
. There are efficient algorithms for calculating population count. For instance, the population count of a 90% filled bitset containing 1 billion bits took 21.1 ms on a MacBook Pro. There is even a hardware instruction in SSE4 for the population count of an integer.
Bitmaps in Redis
Redis allows binary keys and binary values. Bitmaps are nothing but binary values. The setbit(key, offset, value)
operation, which takes O(1)
time, sets the value of a bit to 0
or 1
at the specified offset for a given key.
A simple example: Daily Active Users
To count unique users that logged in today, we set up a bitmap where each user is identified by an offset value. When a user visits a page or performs an action, which warrants it to be counted, set the bit to 1
at the offset representing user id. The key for the bitmap is a function of the name of the action user performed and the timestamp.
In this simple example, every time a user logs in we perform a redis.setbit(daily_active_users, user_id, 1)
. This flips the appropriate offset in the daily_active_users bitmap to 1. This is an O(1) operation. Doing a population count on this results in 9 unique users that logged in today. The key is daily_active_users
and the value is 1011110100100101
.
Of course, since the daily active users will change every day we need a way to create a new bitmap every day. We do this by simply appending the date to the bitmap key. For example, if we want to calculate the daily unique users who have played at least 1 song in a music app for a given day, we can set the key name to be play:yyyy-mm-dd.
If we want to calculate the number of unique users playing a song each hour, we can name the key name will be play:yyyy-mm-dd-hh
. For the rest of the discussion, we will stick with daily unique users that played a song. To collect daily metrics, we will simple set the user’s bit to 1 in the play:yyyy-mm-dd
key whenever a user plays a song. This is an O(1) operation.
redis.setbit(play:yyyy-mm-dd, user_id, 1)
The unique users that played a song today is the population count of the bitmap stored as the value for the play:yyyy-mm-dd
key.To calculate weekly or monthly metrics, we can simply compute the union of all the daily Bitmaps over the week or the month, and then calculate the population count of the resulting bitmap.
You can also extract more complex metrics very easily. For example, the premium account holders who played a song in November would be:
(play:2011-11-01 ∪ play:2011-11-02 ∪...∪play:2011-11-30) ∩ premium:2011-11
Performance comparison using 128 million users
The table below shows a comparison of daily unique action calculations calculated over 1 day, 7 days and 30 days for 128 million users. The 7 and 30 metrics are calculated by combining daily bitmaps.
Period | Time (ms) |
---|---|
Daily | 50.2 |
Weekly | 392.0 |
Monthly | 1624.8 |
Optimizations
In the above example, we can optimize the weekly and monthly computations by caching the calculated daily, weekly, monthly counts in Redis.
This is a very flexible approach. An added bonus of caching is that
it allows fast cohort analysis, such as weekly unique users who are also
mobile users — the intersection of a mobile users bitmap with a weekly
active users bitmap. Or, if we want to compute rolling unique users over
the last n
days, having cached daily unique counts makes this easy — simply grab the previous n-1
days from your cache and union it with the real time daily count, which only takes 50ms.
Sample Code
A Java code snippet below computes unique users for a given user action and date.
import redis.clients.jedis.Jedis; import java.util.BitSet; ... Jedis redis = new Jedis("localhost"); ... public int uniqueCount(String action, String date) { String key = action + ":" + date; BitSet users = BitSet.valueOf(redis.get(key.getBytes())); return users.cardinality(); }
The code snippet below computes the unique users for a given given user action and a list of dates.
import redis.clients.jedis.Jedis; import java.util.BitSet; ... Jedis redis = new Jedis("localhost"); ... public int uniqueCount(String action, String... dates) { BitSet all = new BitSet(); for (String date : dates) { String key = action + ":" + date; BitSet users = BitSet.valueOf(redis.get(key.getBytes())); all.or(users); } return all.cardinality(); }