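[Original translation] Erlang List Handling (Efficiency Guide)
Summary: My translation is rough, so corrections from more experienced readers are welcome. I will gradually add my own notes and questions. There are parts I do not fully understand myself; anyone who knows the answers or is interested is welcome to point them out and discuss.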
=====================================================================================================
List handling
1 Creating a list
Lists can only be built starting from the end and attaching list elements at the beginning. If you use the ++ operator like this:
List1 ++ List2
you will create a new list which is a copy of the elements in List1, followed by List2. Looking at how lists:append/2 or ++ would be implemented in plain Erlang, it can be seen clearly that the first list is copied:
(%% Translator's note: each element of the input list is taken in turn and added to the new list. %%)
append([H|T], Tail) ->
    [H|append(T, Tail)];
append([], Tail) ->
    Tail.
So the important thing when recursing and building a list is to make sure that you attach the new elements to the beginning of the list, so that you build one list, and not hundreds or thousands of copies of the growing result list.
Let us first look at how it should not be done:
DO NOT
bad_fib(N) ->
    bad_fib(N, 0, 1, []).

bad_fib(0, _Current, _Next, Fibs) ->
    Fibs;
bad_fib(N, Current, Next, Fibs) ->
    bad_fib(N - 1, Next, Current + Next, Fibs ++ [Current]).
Here we are not building a list; in each iteration step we create a new list that is one element longer than the previous list.
To avoid copying the result in each iteration, we must build the list in reverse order and reverse the list when we are done:
DO
tail_recursive_fib(N) ->
    tail_recursive_fib(N, 0, 1, []).

tail_recursive_fib(0, _Current, _Next, Fibs) ->
    lists:reverse(Fibs);
tail_recursive_fib(N, Current, Next, Fibs) ->
    tail_recursive_fib(N - 1, Next, Current + Next, [Current|Fibs]).
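To see the difference between the two versions, one option is to time them with timer:tc/1. The following is only a rough sketch: the module name fib_timing and the helper compare/1 are made up for illustration, and the measured times also include the cost of bignum additions, which both versions share.

```erlang
-module(fib_timing).
-export([compare/1, bad_fib/1, tail_recursive_fib/1]).

%% Version from the text that appends with ++:
%% every step copies the whole accumulated list.
bad_fib(N) ->
    bad_fib(N, 0, 1, []).

bad_fib(0, _Current, _Next, Fibs) ->
    Fibs;
bad_fib(N, Current, Next, Fibs) ->
    bad_fib(N - 1, Next, Current + Next, Fibs ++ [Current]).

%% Version from the text that prepends and reverses once at the end.
tail_recursive_fib(N) ->
    tail_recursive_fib(N, 0, 1, []).

tail_recursive_fib(0, _Current, _Next, Fibs) ->
    lists:reverse(Fibs);
tail_recursive_fib(N, Current, Next, Fibs) ->
    tail_recursive_fib(N - 1, Next, Current + Next, [Current|Fibs]).

%% Hypothetical helper: returns {BadMicros, GoodMicros, SameResult},
%% where the times are in microseconds as reported by timer:tc/1.
compare(N) ->
    {BadMicros, Bad} = timer:tc(fun() -> bad_fib(N) end),
    {GoodMicros, Good} = timer:tc(fun() -> tail_recursive_fib(N) end),
    {BadMicros, GoodMicros, Bad =:= Good}.
```

Calling fib_timing:compare(N) for growing N should show the ++ version falling behind as the list gets longer, while the third tuple element confirms that both versions return the same list. Actual numbers depend on the machine and the Erlang/OTP release.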
2 List comprehensions
List comprehensions still have a reputation for being slow. They used to be implemented using funs, which used to be slow.
In recent Erlang/OTP releases (including R12B), a list comprehension
[Expr(E) || E <- List]
is basically translated to a local function
'lc^0'([E|Tail], Expr) ->
    [Expr(E)|'lc^0'(Tail, Expr)];
'lc^0'([], _Expr) -> [].
In R12B, if the result of the list comprehension will obviously not be used, a list will not be constructed. For instance, in this code
[io:put_chars(E) || E <- List],
ok.
or in this code
.
.
.
case Var of
    ... ->
        [io:put_chars(E) || E <- List];
    ... ->
end,
some_function(...),
.
.
.
the value is neither assigned to a variable, nor passed to another function, nor returned, so there is no need to construct a list and the compiler will simplify the code for the list comprehension to
(%% Translator's note: first, the list structure is gone; second, the function is now tail-recursive. %%)
'lc^0'([E|Tail], Expr) ->
    Expr(E),
    'lc^0'(Tail, Expr);
'lc^0'([], _Expr) -> [].
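As a small illustration of the case described above, the sketch below runs a list comprehension purely for its side effects and sets it next to the equivalent lists:foreach/2 call. The module and function names are made up for this example.

```erlang
-module(lc_side_effects).
-export([print_lc/1, print_foreach/1]).

%% The value of the comprehension is discarded, so according to the
%% text the compiler does not need to build the result list.
print_lc(List) ->
    [io:put_chars(E) || E <- List],
    ok.

%% The same traversal written with lists:foreach/2, which returns ok.
print_foreach(List) ->
    lists:foreach(fun(E) -> io:put_chars(E) end, List).
```

Both functions print every element and return ok, so the choice between them is largely a matter of taste.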
3 Deep and flat lists
lists:flatten/1 builds an entirely new list. Therefore, it is expensive, and even more expensive than the ++ operator (which copies its left argument, but not its right argument).
(%% Translator's note: flatten turns a deep list such as [[1], [2], [3]] into [1, 2, 3], copying every element of the original list, if I have understood this correctly. %%)
In the following situations, you can easily avoid calling lists:flatten/1:
- When sending data to a port. Ports understand deep lists, so there is no reason to flatten the list before sending it to the port. (%% Translator's note: "understand" here means the port is able to parse them. %%)
- When calling BIFs that accept deep lists, such as list_to_binary/1 or iolist_to_binary/1.
- When you know that your list is only one level deep, you can use lists:append/1. (%% Translator's note: "one level deep" presumably means something like [[1], [2], [3]], which append alone is enough to flatten. %%)
Port example
DO
...
port_command(Port, DeepList)
...
DO NOT
...
port_command(Port, lists:flatten(DeepList))
...
A common way to send a zero-terminated string to a port is the following:
(%% Translator's note: compare the two ways of attaching the terminator; the depth of the resulting list differs. %%)
DO NOT
...
TerminatedStr = String ++ [0], % String="foo" => [$f, $o, $o, 0]
port_command(Port, TerminatedStr)
...
Instead, do it like this:
DO
...
TerminatedStr = [String, 0], % String="foo" => [[$f, $o, $o], 0]
port_command(Port, TerminatedStr)
...
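The two forms differ only in the shape of the list, not in the bytes they stand for. Without opening a real port, one way to check this is to convert both with iolist_to_binary/1, which accepts deep lists. This is a minimal sketch; the module name iolist_demo and the function same_bytes/1 are invented for the example, and it assumes String contains only byte values (0..255).

```erlang
-module(iolist_demo).
-export([same_bytes/1]).

%% Returns true if the flat and the deep form yield the same binary.
same_bytes(String) ->
    Flat = String ++ [0],   % copies String to attach the terminator
    Deep = [String, 0],     % two-element list, String is not copied
    iolist_to_binary(Flat) =:= iolist_to_binary(Deep).
```

For example, iolist_demo:same_bytes("foo") returns true, since both forms convert to the bytes of "foo" followed by a zero byte.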
Append example
DO
> lists:append([[1], [2], [3]]).
[1,2,3]
>
DO NOT
> lists:flatten([[1], [2], [3]]).
[1,2,3]
>
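Note that lists:append/1 is only a substitute for lists:flatten/1 when the list really is one level deep. The sketch below (module and function names are made up) applies both to a list with a deeper element to show the difference.

```erlang
-module(append_vs_flatten).
-export([demo/0]).

%% Returns {Appended, Flattened} for a list that is two levels deep
%% in one place.
demo() ->
    Nested = [[1], [2, [3]]],
    {lists:append(Nested), lists:flatten(Nested)}.
```

Here demo() returns {[1,2,[3]], [1,2,3]}: append removes exactly one level of nesting and leaves the inner [3] untouched, while flatten removes all levels.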
4 Why you should not worry about recursive list functions
In the performance myth chapter, the following myth was exposed: Tail-recursive functions are MUCH faster than recursive functions.
To summarize, in R12B there is usually not much difference between a body-recursive list function and a tail-recursive function that reverses the list at the end. Therefore, concentrate on writing beautiful code and forget about the performance of your list functions. In the time-critical parts of your code (and only there), measure before rewriting your code.
Important note: This section talks about list functions that construct lists. A tail-recursive function that does not construct a list runs in constant space, while the corresponding body-recursive function uses stack space proportional to the length of the list. For instance, a function that sums a list of integers should not be written like this
DO NOT
recursive_sum([H|T]) -> H+recursive_sum(T);
recursive_sum([]) -> 0.
but like this
DO
sum(L) -> sum(L, 0).
sum([H|T], Sum) -> sum(T, Sum + H);
sum([], Sum) -> Sum.
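For functions that do construct a list, the two styles compared in this section look as follows. This is an illustrative sketch only; the module double_demo and both function names are invented, and the example makes no claim about which version is faster on a given release.

```erlang
-module(double_demo).
-export([double_body/1, double_tail/1]).

%% Body-recursive: the result list is built as the calls return.
double_body([H|T]) -> [2 * H | double_body(T)];
double_body([]) -> [].

%% Tail-recursive: accumulate in reverse order, then reverse once.
double_tail(L) -> double_tail(L, []).

double_tail([H|T], Acc) -> double_tail(T, [2 * H | Acc]);
double_tail([], Acc) -> lists:reverse(Acc).
```

Both return the same list, for example [2,4,6] for the input [1,2,3]. According to the text, the body-recursive form is usually not much slower in R12B, so the shorter version is a reasonable default until measurements say otherwise.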