Implementing a Sorting Algorithm with map and reduce

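The MapReduce programming framework proposed by Google handles sorting of large-scale data well, which is probably why Hadoop, an open-source implementation of MapReduce, ships TeraSort as a test case in its releases. Yahoo! also entered the TeraSort competition with it and took a 'championship'.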

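The reason the MapReduce framework can be used for sorting is simple: both the Map phase and the Reduce phase sort automatically, because the framework itself orders the intermediate key/value pairs by key. Obviously, though, the map and reduce primitives of functional programming perform no sorting at all. So in this post we will try to explore sorting algorithms using the functional-programming semantics of map and reduce.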

From Wikipedia:

"In many programming languages, map is the name of a higher-order function that applies a given function to each element of a list, returning a list of results."

"In functional programming, fold, also known variously as reduce, accumulate, compress, or inject, is a family of higher-order functions that iterate an arbitrary function over a data structure in some order and build up a return value."

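As the Wikipedia definitions show, map and reduce are both higher-order functions. map takes a function as an argument, applies it independently to every element of a list, and returns the list of the results of each application. reduce, also known as fold, accumulate, or compress, traverses a data structure in some order, applies the given function along the way, and finally produces a single return value.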
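To make the two definitions concrete, here is a minimal sketch in Erlang using the standard lists:map/2 and lists:foldl/3. The module name map_reduce_demo and the function names squares, sum, and maximum are made up for this illustration.

```erlang
-module(map_reduce_demo).
-export([squares/1, sum/1, maximum/1]).

%% map: apply a function to every element and collect the results.
squares(List) ->
    lists:map(fun(X) -> X * X end, List).

%% reduce (fold): walk the list left to right, carrying an accumulator.
sum(List) ->
    lists:foldl(fun(X, Acc) -> X + Acc end, 0, List).

%% reduce can also pick out a single element,
%% e.g. the maximum of a non-empty list.
maximum([H | T]) ->
    lists:foldl(fun(X, Max) -> erlang:max(X, Max) end, H, T).
```

Note that none of the three functions compares neighbouring elements or reorders anything, which is why sorting needs something beyond a single map or reduce call.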

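The common general-purpose sorting algorithms are insertion sort, quicksort, heapsort, and merge sort. Insertion sort is the least efficient, at O(n^2), so we set it aside. The other three have similar complexity, O(n log n) on average. We pick one of them to implement, say merge sort.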

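def sort (List)
    # a list of zero or one elements is already sorted
    if (List.length <= 1)
        return List
    Avg = List.length/2
    # sort each half separately, then merge the two sorted halves
    sort_helper(sort(List[0...Avg]), sort(List[Avg...]))

def sort_helper (L1, L2)
    # when one side is exhausted, the other side is the rest of the result
    if (L1.empty) return L2
    if (L2.empty) return L1
    l1 = L1.car
    l2 = L2.car
    if (l1 < l2)
        l1 + sort_helper(L1.cdr, L2)
    else
        l2 + sort_helper(L1, L2.cdr)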

Following the pseudocode above, I wrote an Erlang version:

-module(sort).
-export([sort/1]).

%% Merge sort: split the list in half, sort each half, merge the results.
sort([]) ->
    [];
sort([_] = List) ->
    %% A single-element list is already sorted.
    List;
sort(List) when is_list(List) ->
    Avg = length(List) div 2,
    {L, R} = lists:split(Avg, List),
    sort_helper(sort(L), sort(R)).

%% Merge two already-sorted lists into one sorted list.
sort_helper([], R) ->
    R;
sort_helper(L, []) ->
    L;
sort_helper([H1 | T1] = L, [H2 | T2] = R) ->
    case H1 < H2 of
        true ->
            [H1 | sort_helper(T1, R)];
        false ->
            [H2 | sort_helper(L, T2)]
    end.

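Why didn't we use map and reduce to describe this merge sort? Because merge sort does not need map or reduce; recursion alone is enough. Quicksort and insertion sort, on the other hand, are fairly easy to implement with map and reduce plus recursion, for example: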

"Insert Sort"
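Insort([]) = []
Insort(L) = Insort(filter(max(L), L)) + max(L) ;; max(L) has to be implemented with reduce; filter(max(L), L) is L with one occurrence of max(L) removed. Strictly speaking, repeatedly pulling out the maximum is closer to selection sort.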
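The formula above can be sketched in Erlang roughly as follows. This is an illustration under assumptions, not the author's code: the module name insort and the helper max_of are made up, max(L) is written as a lists:foldl/3 reduce, and the filter step is assumed to remove a single occurrence of the maximum, done here with lists:delete/2 so that duplicates survive.

```erlang
-module(insort).
-export([insort/1]).

%% max(L) expressed as a reduce over a non-empty list.
max_of([H | T]) ->
    lists:foldl(fun(X, Max) -> erlang:max(X, Max) end, H, T).

%% Insort(L) = Insort(L without one max(L)) + max(L)
insort([]) ->
    [];
insort(List) ->
    Max = max_of(List),
    %% lists:delete/2 removes only the first matching element.
    insort(lists:delete(Max, List)) ++ [Max].
```

Each step scans the whole list once for the maximum and once more to delete it, so the sketch runs in O(n^2), in line with the earlier remark that this family of sorts is the least efficient.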

"Quick Sort"
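Qsort([]) = []
Qsort(L) = Qsort(filter(less< hd(L), tl(L))) + hd(L) + Qsort(filter(more>= hd(L), tl(L))) ;; a plain map cannot drop elements, so the selection step is a filter, which can itself be written with reduce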
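Here is a hedged Erlang sketch of the quicksort formula. A plain map returns exactly one result per element and so cannot discard anything, so this sketch assumes the selection step is a filter, written here as a reduce with lists:foldr/3. The module name qsort and the local filter helper are made up for the illustration, and elements equal to the pivot are sent to the right-hand side so that duplicates are kept.

```erlang
-module(qsort).
-export([qsort/1]).

%% filter written as a reduce; foldr keeps the original element order.
filter(Pred, List) ->
    lists:foldr(fun(X, Acc) ->
                        case Pred(X) of
                            true -> [X | Acc];
                            false -> Acc
                        end
                end, [], List).

%% Qsort(L) = Qsort(smaller than pivot) + pivot + Qsort(the rest)
qsort([]) ->
    [];
qsort([Pivot | Rest]) ->
    Smaller = filter(fun(X) -> X < Pivot end, Rest),
    Larger = filter(fun(X) -> X >= Pivot end, Rest),
    qsort(Smaller) ++ [Pivot] ++ qsort(Larger).
```

Using the head of the list as the pivot keeps the code close to the formula, at the cost of O(n^2) behaviour on input that is already sorted.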

posted @ 2011-10-17 17:38 MMJX
