[Translation] Erlang List Handling (Efficiency Guide)

 

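Summary: My translation is rough, and I hope more experienced readers will correct it. I will gradually add my own notes and questions. There are parts I do not fully understand myself, so anyone who knows the answer or is interested is welcome to point things out and discuss them.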

=====================================================================================================

List handling

1  Creating a list

Lists can only be built starting from the end and attaching list elements at the beginning. If you use the ++ operator like this

 

List1 ++ List2

 

you will create a new list which is a copy of the elements in List1, followed by List2. Looking at how lists:append/2 or ++ would be implemented in plain Erlang, it can be seen clearly that the first list is copied:

(%% Translator's note: each step takes one element from the input list and adds it to the new list. %%)

 

append([H|T], Tail) ->
    [H|append(T, Tail)];
append([], Tail) ->
    Tail.

 

So the important thing when recursing and building a list is to make sure that you attach the new elements to the beginning of the list, so that you build a list, and not hundreds or thousands of copies of the growing result list.

Let us first look at how it should not be done:

DO NOT

 

bad_fib(N) ->
    bad_fib(N, 0, 1, []).

bad_fib(0, _Current, _Next, Fibs) ->
    Fibs;
bad_fib(N, Current, Next, Fibs) -> 
    bad_fib(N - 1, Next, Current + Next, Fibs ++ [Current]).

 

Here we are not building a list; in each iteration step we create a new list that is one element longer than the previous list.

To avoid copying the result in each iteration, we must build the list in reverse order and reverse the list when we are done:

DO

 

tail_recursive_fib(N) ->
    tail_recursive_fib(N, 0, 1, []).

tail_recursive_fib(0, _Current, _Next, Fibs) ->
    lists:reverse(Fibs);
tail_recursive_fib(N, Current, Next, Fibs) -> 
    tail_recursive_fib(N - 1, Next, Current + Next, [Current|Fibs]).
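
The cost difference between the two versions can be measured rather than assumed. The following is a minimal sketch, not part of the original guide: the module name fib_bench and the helper compare/1 are hypothetical, and good_fib/1 is simply the tail_recursive_fib/1 function above under a shorter name. It uses timer:tc/1 from the standard library, and the timings it reports will vary by machine and release.

```erlang
-module(fib_bench).
-export([bad_fib/1, good_fib/1, compare/1]).

%% Quadratic: every step copies the whole accumulated list via ++.
bad_fib(N) -> bad_fib(N, 0, 1, []).

bad_fib(0, _Current, _Next, Fibs) -> Fibs;
bad_fib(N, Current, Next, Fibs) ->
    bad_fib(N - 1, Next, Current + Next, Fibs ++ [Current]).

%% Linear: cons onto the head, reverse once at the end.
good_fib(N) -> good_fib(N, 0, 1, []).

good_fib(0, _Current, _Next, Fibs) -> lists:reverse(Fibs);
good_fib(N, Current, Next, Fibs) ->
    good_fib(N - 1, Next, Current + Next, [Current | Fibs]).

%% Hypothetical helper: returns {BadMicros, GoodMicros, SameResult}.
compare(N) ->
    {BadUs, Bad} = timer:tc(fun() -> bad_fib(N) end),
    {GoodUs, Good} = timer:tc(fun() -> good_fib(N) end),
    {BadUs, GoodUs, Bad =:= Good}.
```

Calling fib_bench:compare(N) with a few increasing values of N should show the gap widening as the list grows, while the third element of the tuple confirms that both functions return the same list.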


 

2  List comprehensions

List comprehensions still have a reputation for being slow. They used to be implemented using funs, which used to be slow.

In recent Erlang/OTP releases (including R12B), a list comprehension

 

[Expr(E) || E <- List]

 

is basically translated to a local function

 

'lc^0'([E|Tail], Expr) ->
    [Expr(E)|'lc^0'(Tail, Expr)];
'lc^0'([], _Expr) -> [].
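
To make the translation concrete, here is a small sketch that places a comprehension next to a hand-written function of the same shape. The module name lc_demo and the function names are hypothetical; lc_map/2 stands in for the compiler-generated 'lc^0', whose real name and exact form are internal to the compiler.

```erlang
-module(lc_demo).
-export([with_lc/2, with_fun/2]).

%% The comprehension as a programmer writes it.
with_lc(Expr, List) -> [Expr(E) || E <- List].

%% Roughly what the compiler produces: a body-recursive local function.
with_fun(Expr, List) -> lc_map(List, Expr).

lc_map([E | Tail], Expr) -> [Expr(E) | lc_map(Tail, Expr)];
lc_map([], _Expr) -> [].
```

Both functions return the same list for the same arguments, for instance when called with fun(X) -> 2 * X end and [1,2,3].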

 

In R12B, if the result of the list comprehension will obviously not be used, a list will not be constructed. For instance, in this code

 

[io:put_chars(E) || E <- List],
ok.

 

or in this code

 

.
.
.
case Var of
    ... ->
        [io:put_chars(E) || E <- List];
    ... ->
end,
some_function(...),
.
.
.

 

the value is neither assigned to a variable, nor passed to another function, nor returned, so there is no need to construct a list and the compiler will simplify the code for the list comprehension to

(%% Translator's note: two things change here. No list structure is built, and the function becomes tail-recursive. %%)

 

'lc^0'([E|Tail], Expr) ->
    Expr(E),
    'lc^0'(Tail, Expr);
'lc^0'([], _Expr) -> [].
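
When a comprehension is used only for its side effects, lists:foreach/2 from the standard library expresses the same intent explicitly. The sketch below is an illustration under that assumption, not something the guide prescribes; the module name foreach_demo and both function names are hypothetical.

```erlang
-module(foreach_demo).
-export([print_lc/1, print_foreach/1]).

%% Comprehension whose value is discarded, as in the first example above.
print_lc(List) ->
    [io:put_chars(E) || E <- List],
    ok.

%% Same side effects; lists:foreach/2 itself returns ok.
print_foreach(List) ->
    lists:foreach(fun(E) -> io:put_chars(E) end, List).
```

Which form to use is a matter of taste; the text above implies that the comprehension form carries no list-building cost in this position.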


 

3  Deep and flat lists

lists:flatten/1 builds an entirely new list. Therefore, it is expensive, and even more expensive than the ++ operator (which copies its left argument, but not its right argument).

(%% Translator's note: flatten turns a deep list such as [[1], [2], [3]] into [1, 2, 3]. If I understand correctly, it copies every element of the original list. %%)

In the following situations, you can easily avoid calling lists:flatten/1:

  • When sending data to a port. Ports understand deep lists so there is no reason to flatten the list before sending it to the port. (%% Translator's note: "understand" here means the port can parse deep lists. %%)

  • When calling BIFs that accept deep lists, such as list_to_binary/1 or iolist_to_binary/1.

  • When you know that your list is only one level deep, you can use lists:append/1. (%% Translator's note: "one level deep" presumably means something like [[1], [2], [3]], where append alone is enough to flatten the list. %%)

Port example

DO

 

      ...
      port_command(Port, DeepList)
      ...

 

DO NOT

 

      ...
      port_command(Port, lists:flatten(DeepList))
      ...

 

A common way to send a zero-terminated string to a port is the following:

(%% Translator's note: this common approach is not the recommended one. Compare the two ways of adding the terminator; the resulting lists differ in depth. %%)

DO NOT

 

      ...
      TerminatedStr = String ++ [0], % String="foo" => [$f, $o, $o, 0]
      port_command(Port, TerminatedStr)
      ...

 

Instead, do it like this:

DO

 

      ...
      TerminatedStr = [String, 0], % String="foo" => [[$f, $o, $o], 0]
      port_command(Port, TerminatedStr) 
      ...
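
The deep form can be checked without a port, because iolist_to_binary/1 accepts the same kind of deep list. The following sketch assumes String contains only byte-sized characters (0 to 255); the module name iolist_demo and its functions are hypothetical.

```erlang
-module(iolist_demo).
-export([terminated/1, same_bytes/1]).

%% Builds the deep list [String, 0]; String itself is not copied.
terminated(String) -> [String, 0].

%% True when the deep form and the flat ++ form encode the same bytes.
same_bytes(String) ->
    iolist_to_binary(terminated(String)) =:= list_to_binary(String ++ [0]).
```

For "foo", iolist_size(iolist_demo:terminated("foo")) is 4, which is three characters plus the terminating zero, so the receiver sees the same bytes either way.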

 

Append example

DO

 

      > lists:append([[1], [2], [3]]).
      [1,2,3]
      >

 

DO NOT

 

      > lists:flatten([[1], [2], [3]]).
      [1,2,3]
      >
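
The two calls agree only because the input is exactly one level deep. The sketch below shows where they diverge; the module name flatten_demo and its function names are hypothetical wrappers around the two standard library calls.

```erlang
-module(flatten_demo).
-export([one_level/1, all_levels/1]).

%% Removes exactly one level of nesting; every element must be a list.
one_level(ListOfLists) -> lists:append(ListOfLists).

%% Removes all levels of nesting, at the cost of rebuilding everything.
all_levels(DeepList) -> lists:flatten(DeepList).
```

For the input [[1,[2]],[3]], one_level/1 returns [1,[2],3] while all_levels/1 returns [1,2,3], so lists:append/1 is a drop-in replacement only when the nesting depth is known to be one.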


 

4  Why you should not worry about recursive list functions

In the performance myth chapter, the following myth was exposed: Tail-recursive functions are MUCH faster than recursive functions.

To summarize, in R12B there is usually not much difference between a body-recursive list function and a tail-recursive function that reverses the list at the end. Therefore, concentrate on writing beautiful code and forget about the performance of your list functions. In the time-critical parts of your code (and only there), measure before rewriting your code.
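
For reference, the two styles being compared look like this for a function that doubles every element. This is an illustrative sketch only; the module name recursion_demo and both function names are hypothetical.

```erlang
-module(recursion_demo).
-export([double_body/1, double_tail/1]).

%% Body-recursive: the cons happens after the recursive call returns.
double_body([H | T]) -> [2 * H | double_body(T)];
double_body([]) -> [].

%% Tail-recursive: accumulate in reverse, then reverse once at the end.
double_tail(List) -> double_tail(List, []).

double_tail([H | T], Acc) -> double_tail(T, [2 * H | Acc]);
double_tail([], Acc) -> lists:reverse(Acc).
```

Both return the same result, and the body-recursive version is arguably easier to read, which is the trade-off the paragraph above asks you to weigh before optimizing.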

Important note: This section talks about list functions that construct lists. A tail-recursive function that does not construct a list runs in constant space, while the corresponding body-recursive function uses stack space proportional to the length of the list. For instance, a function that sums a list of integers should not be written like this

DO NOT

 

recursive_sum([H|T]) -> H+recursive_sum(T);
recursive_sum([])    -> 0.

 

but like this

DO

sum(L) -> sum(L, 0).

sum([H|T], Sum) -> sum(T, Sum + H);
sum([], Sum)    -> Sum.
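
The same accumulator pattern is available ready-made in the standard library. The sketch below restates sum/1 next to an equivalent lists:foldl/3 call; the module name sum_demo and the name sum_foldl/1 are hypothetical.

```erlang
-module(sum_demo).
-export([sum/1, sum_foldl/1]).

%% Hand-written tail-recursive sum, as in the DO example above.
sum(L) -> sum(L, 0).

sum([H | T], Acc) -> sum(T, Acc + H);
sum([], Acc) -> Acc.

%% Same accumulator pattern expressed with lists:foldl/3.
sum_foldl(L) -> lists:foldl(fun(X, Acc) -> X + Acc end, 0, L).
```

For plain numeric lists, lists:sum/1 gives the same result directly.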



posted @ 2012-10-22 17:01  CJ_Ruan