[Original translation] Erlang List Handling (Efficiency Guide)

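Summary: My translation is rough, so corrections from experts are welcome. Over time I will add my own notes and questions. There are parts I do not fully understand myself, so anyone who knows the answer or is interested is welcome to point things out and discuss them.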

=====================================================================================================

List handling

1 Creating a list

Lists can only be built starting from the end and attaching list elements at the beginning. If you use the ++ operator like this

List1 ++ List2

you will create a new list which is a copy of the elements in List1, followed by List2. Looking at how lists:append/2 or ++ would be implemented in plain Erlang, it can be seen clearly that the first list is copied: (%% Each element of the input list is taken and added to the new list %%)

append([H|T], Tail) ->
    [H|append(T, Tail)];
append([], Tail) ->
    Tail.

So the important thing when recursing and building a list is to make sure that you attach the new elements to the beginning of the list, so that you build one list, and not hundreds or thousands of copies of the growing result list.

Let us first look at how it should not be done:

DO NOT

bad_fib(N) ->
    bad_fib(N, 0, 1, []).

bad_fib(0, _Current, _Next, Fibs) ->
    Fibs;
bad_fib(N, Current, Next, Fibs) -> 
    bad_fib(N - 1, Next, Current + Next, Fibs ++ [Current]).

Here we are not building a list; in each iteration step we create a new list that is one element longer than the previous list.

To avoid copying the result in each iteration, we must build the list in reverse order and reverse the list when we are done:

tail_recursive_fib(N) ->
    tail_recursive_fib(N, 0, 1, []).

tail_recursive_fib(0, _Current, _Next, Fibs) ->
    lists:reverse(Fibs);
tail_recursive_fib(N, Current, Next, Fibs) -> 
    tail_recursive_fib(N - 1, Next, Current + Next, [Current|Fibs]).
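
To see the difference in practice, the two functions above can be placed in one module and timed. This is only a rough sketch: the module name fib_demo and the helper compare/1 are made up for illustration, and timer:tc/1 gives a coarse wall-clock reading rather than a proper benchmark.

```erlang
-module(fib_demo).
-export([bad_fib/1, tail_recursive_fib/1, compare/1]).

%% Appends with ++, so every step copies the whole accumulated list.
bad_fib(N) ->
    bad_fib(N, 0, 1, []).

bad_fib(0, _Current, _Next, Fibs) ->
    Fibs;
bad_fib(N, Current, Next, Fibs) ->
    bad_fib(N - 1, Next, Current + Next, Fibs ++ [Current]).

%% Prepends each element and reverses once at the end.
tail_recursive_fib(N) ->
    tail_recursive_fib(N, 0, 1, []).

tail_recursive_fib(0, _Current, _Next, Fibs) ->
    lists:reverse(Fibs);
tail_recursive_fib(N, Current, Next, Fibs) ->
    tail_recursive_fib(N - 1, Next, Current + Next, [Current|Fibs]).

%% Hypothetical helper: returns {BadMicros, GoodMicros, SameResult}.
compare(N) ->
    {BadMicros, Bad} = timer:tc(fun() -> bad_fib(N) end),
    {GoodMicros, Good} = timer:tc(fun() -> tail_recursive_fib(N) end),
    {BadMicros, GoodMicros, Bad =:= Good}.
```

Calling fib_demo:compare(N) for increasing N should show the two versions returning the same list while the ++ version falls further behind as the list grows. The exact timings depend on the machine and the release.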

2 List comprehensions

List comprehensions still have a reputation for being slow. They used to be implemented using funs, which used to be slow.

In recent Erlang/OTP releases (including R12B), a list comprehension

[Expr(E) || E <- List]

is basically translated to a local function

'lc^0'([E|Tail], Expr) ->
    [Expr(E)|'lc^0'(Tail, Expr)];
'lc^0'([], _Expr) -> [].
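
The equivalence can be checked by writing the local function by hand. In the sketch below, the module name lc_demo and both function names are invented for illustration; the compiler's own generated name ('lc^0') is internal and not something you would call yourself.

```erlang
-module(lc_demo).
-export([with_comprehension/2, with_recursion/2]).

%% The comprehension form.
with_comprehension(Expr, List) ->
    [Expr(E) || E <- List].

%% A hand-written body-recursive function with the same shape as the
%% local function the text describes.
with_recursion(Expr, [E|Tail]) ->
    [Expr(E)|with_recursion(Expr, Tail)];
with_recursion(_Expr, []) ->
    [].
```

Both functions return the same list for the same fun and input, for example with fun(X) -> 2 * X end applied to [1,2,3].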

In R12B, if the result of the list comprehension will obviously not be used, a list will not be constructed. For instance, in this code

[io:put_chars(E) || E <- List],
ok.

or in this code

.
.
.
case Var of
    ... ->
        [io:put_chars(E) || E <- List];
    ... ->
end,
some_function(...),
.
.
.

the value is neither assigned to a variable, nor passed to another function, nor returned, so there is no need to construct a list and the compiler will simplify the code for the list comprehension to the following (%% First, no list structure is built; second, the function is now tail-recursive %%)

'lc^0'([E|Tail], Expr) ->
    Expr(E),
    'lc^0'(Tail, Expr);
'lc^0'([], _Expr) -> [].
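
If the intent is purely side effects, lists:foreach/2 states that intent explicitly. The sketch below puts the two styles side by side; the module and function names are hypothetical, and it makes no claim about which form the compiler handles better.

```erlang
-module(effects_demo).
-export([print_all_lc/1, print_all_foreach/1]).

%% Comprehension used only for its side effects; the result is discarded
%% and the function returns ok.
print_all_lc(List) ->
    [io:put_chars(E) || E <- List],
    ok.

%% lists:foreach/2 applies the fun to each element and returns ok.
print_all_foreach(List) ->
    lists:foreach(fun(E) -> io:put_chars(E) end, List).
```

Both functions print each element in order and return ok.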

3 Deep and flat lists

lists:flatten/1 builds an entirely new list. Therefore, it is expensive, and even more expensive than the ++ operator (which copies its left argument, but not its right argument).

(%% The effect of flatten is to turn a deep list such as [[1], [2], [3]] into [1, 2, 3], copying every element of the original list, if I have understood correctly %%)

In the following situations, you can easily avoid calling lists:flatten/1:

When sending data to a port. Ports understand (%% or rather, can parse %%) deep lists, so there is no reason to flatten the list before sending it to the port.
When calling BIFs that accept deep lists, such as list_to_binary/1 or iolist_to_binary/1.
When you know that your list is only one level deep, you can use lists:append/1. (%% "One level deep" presumably means something like [[1], [2], [3]], which only needs append to be flattened %%)

Port example

      ...
      port_command(Port, DeepList)
      ...

DO NOT

      ...
      port_command(Port, lists:flatten(DeepList))
      ...

A common (but not recommended) way to send a zero-terminated string to a port is the following:

(%% Compare the two ways of attaching the terminator: the resulting lists differ in depth %%)

DO NOT

      ...
      TerminatedStr = String ++ [0], % String="foo" => [$f, $o, $o, 0]
      port_command(Port, TerminatedStr)
      ...

Instead, do it like this:

      ...
      TerminatedStr = [String, 0], % String="foo" => [[$f, $o, $o], 0]
      port_command(Port, TerminatedStr) 
      ...
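
Without a real port at hand, iolist_to_binary/1 (one of the BIFs mentioned earlier as accepting deep lists) can stand in to show that both forms carry the same bytes. This is a sketch under that assumption; the module name iolist_demo and the function names flat/1 and deep/1 are invented for illustration.

```erlang
-module(iolist_demo).
-export([flat/1, deep/1]).

%% String ++ [0] copies String in order to attach the terminator.
flat(String) ->
    iolist_to_binary(String ++ [0]).

%% [String, 0] builds a two-element deep list that refers to String
%% without copying it.
deep(String) ->
    iolist_to_binary([String, 0]).
```

For the same input, flat/1 and deep/1 return identical binaries, so the consumer sees no difference; only the cost of building the argument changes.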

Append example

      > lists:append([[1], [2], [3]]).
      [1,2,3]
      >

DO NOT

      > lists:flatten([[1], [2], [3]]).
      [1,2,3]
      >
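
Note that lists:append/1 only removes one level of nesting, which is why it is a safe replacement only when the list is known to be one level deep. A small sketch (the module name depth_demo and its functions are made up for illustration):

```erlang
-module(depth_demo).
-export([one_level/0, two_levels/0]).

%% One level deep: append/1 and flatten/1 give the same result.
one_level() ->
    L = [[1], [2], [3]],
    {lists:append(L), lists:flatten(L)}.

%% Two levels deep: append/1 leaves the inner list in place,
%% while flatten/1 removes all nesting.
two_levels() ->
    L = [[1, [2]], [3]],
    {lists:append(L), lists:flatten(L)}.
```

Here one_level() returns {[1,2,3],[1,2,3]}, while two_levels() returns {[1,[2],3],[1,2,3]}.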

4 Why you should not worry about recursive lists functions

In the performance myth chapter, the following myth was exposed: Tail-recursive functions are MUCH faster than recursive functions.

To summarize, in R12B there is usually not much difference between a body-recursive list function and a tail-recursive function that reverses the list at the end. Therefore, concentrate on writing beautiful code and forget about the performance of your list functions. In the time-critical parts of your code (and only there), measure before rewriting your code.
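
One way to follow the "measure first" advice is to write both variants and time them on realistic input. The sketch below is only an illustration: the module name map_demo, the doubling functions, and measure/1 are hypothetical, and timer:tc/1 gives a rough reading rather than a rigorous benchmark.

```erlang
-module(map_demo).
-export([double_body/1, double_tail/1, measure/1]).

%% Body-recursive: builds the result as the calls return.
double_body([H|T]) -> [2 * H|double_body(T)];
double_body([])    -> [].

%% Tail-recursive: accumulates in reverse, then reverses once.
double_tail(L) -> double_tail(L, []).

double_tail([H|T], Acc) -> double_tail(T, [2 * H|Acc]);
double_tail([], Acc)    -> lists:reverse(Acc).

%% Hypothetical helper: returns {BodyMicros, TailMicros, SameResult}
%% for a list of the integers 1..N.
measure(N) ->
    L = lists:seq(1, N),
    {BodyMicros, R1} = timer:tc(fun() -> double_body(L) end),
    {TailMicros, R2} = timer:tc(fun() -> double_tail(L) end),
    {BodyMicros, TailMicros, R1 =:= R2}.
```

Run measure/1 several times with sizes typical for your application before deciding that a rewrite is worth it.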

Important note: This section talks about list functions that construct lists. A tail-recursive function that does not construct a list runs in constant space, while the corresponding body-recursive function uses stack space proportional to the length of the list. For instance, a function that sums a list of integers should not be written like this

DO NOT

recursive_sum([H|T]) -> H+recursive_sum(T);
recursive_sum([])    -> 0.

but like this

sum(L) -> sum(L, 0).

sum([H|T], Sum) -> sum(T, Sum + H);
sum([], Sum)    -> Sum.
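
The same accumulator pattern can also be expressed with lists:foldl/3, and the standard library already provides lists:sum/1. The sketch below wraps the tail-recursive version in a module next to a fold-based variant; the module name sum_demo and the function sum_foldl/1 are invented for illustration.

```erlang
-module(sum_demo).
-export([sum/1, sum_foldl/1]).

%% Tail-recursive sum with an explicit accumulator.
sum(L) -> sum(L, 0).

sum([H|T], Sum) -> sum(T, Sum + H);
sum([], Sum)    -> Sum.

%% The same accumulation written as a left fold.
sum_foldl(L) ->
    lists:foldl(fun(X, Acc) -> X + Acc end, 0, L).
```

Both functions agree with lists:sum/1 on any list of numbers.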

posted @ 2012-10-22 17:01 CJ_Ruan
