Ray's playground

 

Aggregation (Chapter 6 of MongoDB: The Definitive Guide)

  MapReduce is the Uzi of aggregation tools. Everything described with count, distinct, and group can be done with MapReduce, and more. It is a method of aggregation that can be easily parallelized across multiple servers. It splits up a problem, sends chunks of it to different machines, and lets each machine solve its part of the problem. When all of the machines are finished, they merge all of the pieces of the solution back into a full solution.
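The split, solve, and merge idea can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not MongoDB's API: the helper names `split_into_chunks`, `count_chunk`, and `merge_counts` are hypothetical, the word-counting task is made up for the example, and a thread pool stands in for the separate machines.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_chunks):
    """Split a list into at most n_chunks roughly equal pieces (hypothetical helper)."""
    size = max(1, -(-len(items) // n_chunks))  # ceiling division, at least 1
    return [items[i:i + size] for i in range(0, len(items), size)]


def count_chunk(chunk):
    """Each worker solves its own part of the problem: count words in one chunk."""
    return Counter(chunk)


def merge_counts(partials):
    """Merge the partial solutions back into one full solution."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts together
    return total


words = ["a", "b", "a", "c", "b", "a"]
chunks = split_into_chunks(words, 3)

# Threads stand in for the different machines (an assumption of this sketch).
with ThreadPoolExecutor(max_workers=3) as pool:
    partials = list(pool.map(count_chunk, chunks))

result = merge_counts(partials)
print(dict(result))
```

The merge works here because counting is additive: partial counts from any chunk can be summed in any order and still give the same total.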
  MapReduce has a couple of steps. It starts with the map step, which maps an operation onto every document in a collection. That operation could be either “do nothing” or “emit these keys with X values.” There is then an intermediary stage called the shuffle step: keys are grouped, and a list of emitted values is created for each key. The reduce step takes each list of values and reduces it to a single element. This element is returned to the shuffle step until each key has a list containing a single value: the result.
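The map, shuffle, and reduce steps described above can be simulated in plain Python. This is a rough sketch, not how MongoDB implements it: the function names, the `batch_size` parameter, and the sample documents are all assumptions made for illustration. The example counts how many documents contain each key.

```python
from collections import defaultdict


def map_step(doc):
    """Map: emit a (key, 1) pair for every key in a document.
    A document with no keys emits nothing, which is the "do nothing" case."""
    for key in doc:
        yield key, 1


def shuffle_step(pairs):
    """Shuffle: group emitted values into one list per key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_step(key, values):
    """Reduce: collapse a list of values into a single element."""
    return sum(values)


def map_reduce(docs, batch_size=2):
    """Run map, then alternate reduce and shuffle until one value per key is left.
    batch_size is a hypothetical knob that forces several reduce rounds."""
    if batch_size < 2:
        raise ValueError("batch_size must be at least 2 so the lists shrink")
    grouped = shuffle_step(pair for doc in docs for pair in map_step(doc))
    while any(len(values) > 1 for values in grouped.values()):
        partial = []
        for key, values in grouped.items():
            for i in range(0, len(values), batch_size):
                # Each reduced element is sent back to the shuffle step.
                partial.append((key, reduce_step(key, values[i:i + batch_size])))
        grouped = shuffle_step(partial)
    return {key: values[0] for key, values in grouped.items()}


docs = [{"x": 1, "y": 2}, {"x": 3}, {"z": 4, "x": 5, "y": 6}]
key_counts = map_reduce(docs)
print(key_counts)
```

Because reduced elements are fed back into later reduce calls, the reduce function has to accept its own output as input. Summation satisfies this; a reduce that returned a different shape than the emitted values would not.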

posted on 2010-10-12 21:17  Ray Z
