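[Original translation] Erlang Functions (Efficiency Guide)
My translation is rough, and I hope more experienced readers will correct it. I will gradually add my own notes and questions. There are parts I do not fully understand myself, so anyone who knows the answer or is interested is welcome to point things out and discuss them.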
===============================================================================================
Functions
1 Pattern matching
Pattern matching in function heads and in case and receive clauses is optimized by the compiler. With a few exceptions, there is nothing to gain by rearranging clauses.
One exception is pattern matching of binaries. The compiler will not rearrange clauses that match binaries. Placing the clause that matches against the empty binary last will usually be slightly faster than placing it first.
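As a small illustration of the binary case, here is a minimal sketch that sums the bytes of a binary, with the clause for the empty binary placed last. The module name bin_sum and the function sum_bytes are hypothetical names chosen for this example, not taken from the guide.

```erlang
-module(bin_sum).
-export([sum_bytes/1]).

%% Sum the bytes of a binary (hypothetical example function).
sum_bytes(Bin) when is_binary(Bin) ->
    sum_bytes(Bin, 0).

%% The clause that consumes one byte comes first. The clause for the
%% empty binary, which matches only once at the very end, comes last.
sum_bytes(<<Byte, Rest/binary>>, Acc) ->
    sum_bytes(Rest, Acc + Byte);
sum_bytes(<<>>, Acc) ->
    Acc.
```

Both clause orders give the same result; only the order in which the patterns are tried differs.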
Here is a rather contrived example to show another exception:
DO NOT
atom_map1(one) -> 1;
atom_map1(two) -> 2;
atom_map1(three) -> 3;
atom_map1(Int) when is_integer(Int) -> Int;
atom_map1(four) -> 4;
atom_map1(five) -> 5;
atom_map1(six) -> 6.
The problem is the clause with the variable Int. Since a variable can match anything, including the atoms four, five, and six that the following clauses also match, the compiler must generate sub-optimal code that executes as follows:
- First, the input value is compared to one, two, and three (using a single instruction that does a binary search, so it is quite efficient even if there are many values) to select which of the first three clauses to execute, if any.
- If none of the first three clauses matched, the fourth clause matches, since a variable always matches. If the guard test is_integer(Int) succeeds, the fourth clause is executed.
- If the guard test fails, the input value is compared to four, five, and six, and the appropriate clause is selected. (There will be a function_clause exception if none of the values matched.)
Rewriting to either
DO
atom_map2(one) -> 1;
atom_map2(two) -> 2;
atom_map2(three) -> 3;
atom_map2(four) -> 4;
atom_map2(five) -> 5;
atom_map2(six) -> 6;
atom_map2(Int) when is_integer(Int) -> Int.
or
DO
atom_map3(Int) when is_integer(Int) -> Int;
atom_map3(one) -> 1;
atom_map3(two) -> 2;
atom_map3(three) -> 3;
atom_map3(four) -> 4;
atom_map3(five) -> 5;
atom_map3(six) -> 6.
will give slightly more efficient matching code.
Here is a less contrived example:
DO NOT
map_pairs1(_Map, [], Ys) ->
    Ys;
map_pairs1(_Map, Xs, [] ) ->
    Xs;
map_pairs1(Map, [X|Xs], [Y|Ys]) ->
    [Map(X, Y)|map_pairs1(Map, Xs, Ys)].
The first argument is not a problem. It is a variable, but it is a variable in all clauses. The problem is the variable in the second argument, Xs, in the middle clause. Because the variable can match anything, the compiler is not allowed to rearrange the clauses, but must generate code that matches them in the order written.
If the function is rewritten like this
DO
map_pairs2(_Map, [], Ys) ->
    Ys;
map_pairs2(_Map, [_|_]=Xs, [] ) ->
    Xs;
map_pairs2(Map, [X|Xs], [Y|Ys]) ->
    [Map(X, Y)|map_pairs2(Map, Xs, Ys)].
the compiler is free to rearrange the clauses. It will generate code similar to this
DO NOT (already done by the compiler)
explicit_map_pairs(Map, Xs0, Ys0) ->
    case Xs0 of
        [X|Xs] ->
            case Ys0 of
                [Y|Ys] ->
                    [Map(X, Y)|explicit_map_pairs(Map, Xs, Ys)];
                [] ->
                    Xs0
            end;
        [] ->
            Ys0
    end.
which should be slightly faster for presumably the most common case, in which the input lists are neither empty nor very short. (Another advantage is that Dialyzer is able to deduce a better type for the variable Xs.)
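To make the difference between the two versions concrete, here is a minimal sketch that puts both in one module and compares them. The module name map_pairs_demo and the helper functions same_on_lists/0 and differs_on_non_list/0 are hypothetical names introduced for this example. It shows that the two versions agree on lists, and that the [_|_] pattern also narrows the accepted input: a second argument that is not a list no longer slips through the middle clause.

```erlang
-module(map_pairs_demo).
-export([map_pairs1/3, map_pairs2/3,
         same_on_lists/0, differs_on_non_list/0]).

%% Version with a bare variable in the middle clause.
map_pairs1(_Map, [], Ys) ->
    Ys;
map_pairs1(_Map, Xs, []) ->
    Xs;
map_pairs1(Map, [X|Xs], [Y|Ys]) ->
    [Map(X, Y)|map_pairs1(Map, Xs, Ys)].

%% Version where the middle clause only accepts a non-empty list.
map_pairs2(_Map, [], Ys) ->
    Ys;
map_pairs2(_Map, [_|_]=Xs, []) ->
    Xs;
map_pairs2(Map, [X|Xs], [Y|Ys]) ->
    [Map(X, Y)|map_pairs2(Map, Xs, Ys)].

%% Both versions return the same result for list arguments.
same_on_lists() ->
    Add = fun(X, Y) -> X + Y end,
    Result = map_pairs1(Add, [1, 2, 3], [10, 20]),
    Result = map_pairs2(Add, [1, 2, 3], [10, 20]),
    Result.

%% map_pairs1 returns a non-list second argument unchanged,
%% while map_pairs2 fails with function_clause.
differs_on_non_list() ->
    Add = fun(X, Y) -> X + Y end,
    not_a_list = map_pairs1(Add, not_a_list, []),
    try map_pairs2(Add, not_a_list, []) of
        _ -> unexpected
    catch
        error:function_clause -> function_clause
    end.
```

Here same_on_lists() returns [11, 22, 3]: the leftover tail [3] of the longer list is returned unchanged once the shorter list runs out.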
2 Function Calls
Here is an intentionally rough guide to the relative costs of different kinds of calls. It is based on benchmark figures run on Solaris/Sparc:
- Calls to local or external functions (foo(), m:foo()) are the fastest kind of calls.
- Calling or applying a fun (Fun(), apply(Fun, [])) is about three times as expensive as calling a local function.
- Applying an exported function (Mod:Name(), apply(Mod, Name, [])) is about twice as expensive as calling a fun, or about six times as expensive as calling a local function.
(%% Translator's note: a "fun" here and below is a function object, created with fun ... end or with fun Module:Function/Arity. The Module:Function(Arguments) form, where the module and function are variables whose values are not fixed at compile time, is the "applying an exported function" case in the third item. %%)
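The call kinds in the list above can be placed side by side in a minimal sketch. The module name call_kinds and the functions double/1, target/0 and demo/0 are hypothetical names for this example. It only shows what each kind of call looks like in source code and makes no attempt to measure the relative costs.

```erlang
-module(call_kinds).
-export([double/1, demo/0]).

double(X) -> 2 * X.

%% Returns the module and function name as run-time values.
target() -> {?MODULE, double}.

demo() ->
    A = double(1),                  % call to a local function
    B = lists:reverse([1, 2]),      % call to an external function
    F = fun(X) -> 2 * X end,
    C = F(2),                       % calling a fun
    D = apply(F, [3]),              % applying a fun
    {Mod, Name} = target(),
    E = Mod:Name(4),                % applying an exported function
    G = apply(Mod, Name, [5]),      % same, written with apply/3
    {A, B, C, D, E, G}.
```

Note that double/1 has to be exported for the Mod:Name(4) and apply/3 calls to work, because those calls go through the module's export table rather than calling the function directly.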
Notes and implementation details
Calling and applying a fun does not involve any hash-table lookup. A fun contains an (indirect) pointer to the function that implements the fun.
Tuples are not funs. A "tuple fun", {Module,Function}, is not a fun. The cost of calling a "tuple fun" is similar to that of apply/3 or worse. Using "tuple funs" is strongly discouraged, as they may not be supported in a future release, and because a superior alternative has existed since the R10B release, namely the fun Module:Function/Arity syntax.
(%% Translator's note: a tuple fun is simply {Module, Function}. I have not come across one in practice. %%)
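The fun Module:Function/Arity syntax mentioned above can be sketched as follows. The module name ext_fun and the function demo/0 are hypothetical; lists:reverse/1, lists:map/2 and erlang:abs/1 are standard functions.

```erlang
-module(ext_fun).
-export([demo/0]).

demo() ->
    %% A real fun that refers to an exported function, used where
    %% older code might have passed the tuple {lists, reverse}.
    Rev = fun lists:reverse/1,
    true = is_function(Rev, 1),
    Reversed = Rev([1, 2, 3]),
    %% Such a fun can be passed to higher-order functions as usual.
    Abs = lists:map(fun erlang:abs/1, [-1, 2, -3]),
    {Reversed, Abs}.
```

Unlike a tuple, the value bound to Rev passes the is_function/2 test with arity 1, so code that receives it can check that it really got a fun of the expected arity.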
apply/3 must look up the code for the function to execute in a hash table. Therefore, it will always be slower than a direct call or a fun call.
(%% Translator's note, open question: how does apply/3 differ from Module:Function(Argument)? %%)
It no longer matters (from a performance point of view) whether you write
Module:Function(Arg1, Arg2)
or
apply(Module, Function, [Arg1,Arg2])
(The compiler internally rewrites the latter code into the former.)
The following code
apply(Module, Function, Arguments)
is slightly slower because the shape of the list of arguments is not known at compile time.
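The two shapes of apply/3 discussed above can be sketched in one module. The module name apply_shapes and the functions fixed/4 and dynamic/3 are hypothetical names for this example.

```erlang
-module(apply_shapes).
-export([fixed/4, dynamic/3]).

%% The argument list is written out literally, so the number of
%% arguments is visible at compile time. According to the text
%% above, this is treated like Mod:Fun(A, B).
fixed(Mod, Fun, A, B) ->
    apply(Mod, Fun, [A, B]).

%% The argument list is an ordinary variable, so its length is
%% only known at run time.
dynamic(Mod, Fun, Args) when is_list(Args) ->
    apply(Mod, Fun, Args).
```

For example, apply_shapes:fixed(erlang, max, 2, 3) and apply_shapes:dynamic(erlang, max, [2, 3]) both return 3. The second form is the one the text describes as slightly slower.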
3 Memory usage in recursion
When writing recursive functions, it is preferable to make them tail-recursive so that they can execute in constant memory space.
DO
list_length(List) ->
    list_length(List, 0).

list_length([], AccLen) ->
    AccLen; % Base case
list_length([_|Tail], AccLen) ->
    list_length(Tail, AccLen + 1). % Tail-recursive
DO NOT
list_length([]) ->
    0; % Base case
list_length([_ | Tail]) ->
    list_length(Tail) + 1. % Not tail-recursive