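[Original translation] Erlang Common Caveats (Efficiency Guide)
I have noticed that many experts are also translating parts of the Erlang/OTP documentation, and I salute them. Some of this may accidentally overlap with their work, but it is only while translating that I run into, and learn, things that differ from what I had assumed. My translation is rough, so corrections from the experts are welcome.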
=====================================================================================================
Common Caveats
Erlang/OTP R15B02
Here we list a few modules and BIFs to watch out for, and not only from a performance point of view.
1 The timer module
Creating timers using erlang:send_after/3 and erlang:start_timer/3 is much more efficient than using the timers provided by the timer module. The timer module uses a separate process to manage the timers, and that process can easily become overloaded if many processes create and cancel timers frequently (especially when using the SMP emulator).
The functions in the timer module that do not manage timers (such as timer:tc/3 or timer:sleep/1) do not call the timer-server process and are therefore harmless.
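As a rough illustration of the two BIFs mentioned above, here is a minimal sketch. The module name timer_sketch, the message atoms, and the timeout values are made up for the example and are not part of the guide.

```erlang
-module(timer_sketch).
-export([demo/0]).

%% Hypothetical demo: module name, messages and timeouts are illustrative only.
demo() ->
    %% send_after/3 delivers the message itself when the timer fires.
    _Ref1 = erlang:send_after(10, self(), ping),
    %% start_timer/3 delivers {timeout, TimerRef, Msg} instead.
    Ref2 = erlang:start_timer(20, self(), pong),
    %% A long timer that is cancelled before it fires; cancel_timer/1
    %% returns the remaining time in milliseconds for a live timer.
    Ref3 = erlang:send_after(60000, self(), never),
    Remaining = erlang:cancel_timer(Ref3),
    true = is_integer(Remaining),
    First = receive ping -> ping after 1000 -> timeout end,
    Second = receive {timeout, Ref2, pong} -> pong after 1000 -> timeout end,
    {First, Second}.
```

Neither BIF goes through the timer-server process, which is the point of the recommendation above.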
2 list_to_atom/1
Atoms are not garbage-collected. Once an atom is created, it will never be removed. The emulator will terminate if the limit for the number of atoms (1048576 by default) is reached.
Therefore, converting arbitrary input strings to atoms could be dangerous in a system that will run continuously. If only certain well-defined atoms are allowed as input, you can use list_to_existing_atom/1 to guard against a denial-of-service attack. (All atoms that are allowed must have been created earlier, for instance by simply using all of them in a module and loading that module.)
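The guard described above might look like the following sketch. The module atom_sketch, the function parse_command/1, and the command atoms are hypothetical. Listing the allowed atoms in the module is what ensures they exist once the module is loaded.

```erlang
-module(atom_sketch).
-export([parse_command/1]).

%% Hypothetical whitelist. Because these atoms appear literally in the
%% module, they are created when the module is loaded.
allowed() -> [start, stop, status].

%% Convert an input string to an atom without ever creating a new atom.
parse_command(String) when is_list(String) ->
    try list_to_existing_atom(String) of
        Atom ->
            case lists:member(Atom, allowed()) of
                true -> {ok, Atom};
                false -> {error, not_allowed}
            end
    catch
        %% list_to_existing_atom/1 raises badarg for unknown atoms.
        error:badarg -> {error, unknown_command}
    end.
```

The extra lists:member/2 check is a design choice of this sketch: list_to_existing_atom/1 alone accepts any atom already known to the system, not only the intended ones.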
Using list_to_atom/1 to construct an atom that is passed to apply/3 like this
apply(list_to_atom("some_prefix"++Var), foo, Args)
is quite expensive and is not recommended in time-critical code.
3 length/1
The time for calculating the length of a list is proportional to the length of the list, as opposed to tuple_size/1, byte_size/1, and bit_size/1, which all execute in constant time.
Normally you don't have to worry about the speed of length/1, because it is efficiently implemented in C. In time-critical code, though, you might want to avoid it if the input list could potentially be very long.
Some uses of length/1 can be replaced by matching. For instance, this code
foo(L) when length(L) >= 3 ->
...
can be rewritten to
foo([_,_,_|_]=L) ->
...
(One slight difference is that length(L) will fail if L is an improper list, while the pattern in the second code fragment will accept an improper list.)
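To make the difference concrete, here is a small sketch that wraps both variants in boolean functions. The module and function names are invented for the example.

```erlang
-module(length_sketch).
-export([at_least_three_guard/1, at_least_three_match/1]).

%% Guard version: walks the whole list. For an improper list, length/1
%% fails, so the guard fails and the second clause is chosen.
at_least_three_guard(L) when length(L) >= 3 -> true;
at_least_three_guard(_) -> false.

%% Pattern version: inspects only the first three cells, so it never
%% looks at the tail and also accepts an improper list.
at_least_three_match([_, _, _ | _]) -> true;
at_least_three_match(_) -> false.
```

For the improper list [a, b, c | d], the guard version returns false while the pattern version returns true.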
4 setelement/3
setelement/3 copies the tuple it modifies. Therefore, updating a tuple in a loop using setelement/3 will create a new copy of the tuple every time.
There is one exception to the rule that the tuple is copied. If the compiler clearly can see that destructively updating the tuple would give exactly the same result as if the tuple was copied, the call to setelement/3 will be replaced with a special destructive setelement instruction. In the following code sequence
multiple_setelement(T0) ->
T1 = setelement(9, T0, bar),
T2 = setelement(7, T1, foobar),
setelement(5, T2, new_value).
the first setelement/3 call will copy the tuple and modify the ninth element. The two following setelement/3 calls will modify the tuple in place.
For the optimization to be applied, all of the following conditions must be true:
- The indices must be integer literals, not variables or expressions.
- The indices must be given in descending order.
- There must be no calls to other functions in between the calls to setelement/3.
- The tuple returned from one setelement/3 call must only be used in the subsequent call to setelement/3.
If it is not possible to structure the code as in the multiple_setelement/1 example, the best way to modify multiple elements in a large tuple is to convert the tuple to a list, modify the list, and convert the list back to a tuple.
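The tuple-to-list round trip described above could be sketched as follows. The module tuple_sketch and the helper update_many/2 are hypothetical, and the representation of the updates as {Index, Value} pairs is an assumption of this example.

```erlang
-module(tuple_sketch).
-export([update_many/2]).

%% Hypothetical helper: Updates is a list of {Index, Value} pairs.
%% The tuple is copied once in total instead of once per update.
update_many(Tuple, Updates) when is_tuple(Tuple) ->
    %% Sort by index and keep only the first pair for each index.
    Sorted = lists:ukeysort(1, Updates),
    List = tuple_to_list(Tuple),
    list_to_tuple(replace(List, 1, Sorted)).

%% Walk the list once, substituting values at the requested positions.
%% Indices outside the tuple are silently ignored in this sketch.
replace([_ | T], I, [{I, V} | Us]) -> [V | replace(T, I + 1, Us)];
replace([H | T], I, Us) -> [H | replace(T, I + 1, Us)];
replace([], _I, _Us) -> [].
```

For example, tuple_sketch:update_many({a, b, c, d}, [{4, y}, {2, x}]) returns {a, x, c, y}.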
5 size/1
size/1 returns the size for both tuples and binaries.
Using the new BIFs tuple_size/1 and byte_size/1 introduced in R12B gives the compiler and run-time system more opportunities for optimization. A further advantage is that the new BIFs could help Dialyzer find more bugs in your program.
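A small sketch of the type-specific BIFs used in guards follows. The module size_sketch and the function describe/1 are invented for the example.

```erlang
-module(size_sketch).
-export([describe/1]).

%% Each clause uses the BIF that matches the type being tested, so the
%% intent is explicit to the compiler and to Dialyzer.
describe(T) when is_tuple(T), tuple_size(T) =:= 2 -> pair;
describe(T) when is_tuple(T) -> {tuple, tuple_size(T)};
describe(B) when is_binary(B) -> {binary, byte_size(B)};
describe(B) when is_bitstring(B) -> {bitstring, bit_size(B)}.
```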
6 split_binary/2
It is usually more efficient to split a binary using matching instead of calling the split_binary/2 function. Furthermore, mixing bit syntax matching and split_binary/2 may prevent some optimizations of bit syntax matching.
DO
<<Bin1:Num/binary,Bin2/binary>> = Bin,
DO NOT
{Bin1,Bin2} = split_binary(Bin, Num)
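As a slightly larger illustration of splitting by matching, the sketch below parses a made-up frame format: one length byte followed by a payload of that many bytes. The module split_sketch, the function split_frame/1, and the frame layout are all assumptions of this example.

```erlang
-module(split_sketch).
-export([split_frame/1]).

%% Hypothetical frame: <<Len:8, Payload:Len/binary, Rest/binary>>.
%% The length is read and used to split the binary in a single match,
%% with no call to split_binary/2.
split_frame(<<Len:8, Payload:Len/binary, Rest/binary>>) ->
    {ok, Payload, Rest};
split_frame(_) ->
    %% Not enough bytes yet for a complete frame.
    incomplete.
```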
7 The '--' operator
Note that the '--' operator has a complexity proportional to the product of the lengths of its operands, meaning that it will be very slow if both of its operands are long lists:
DO NOT
HugeList1 -- HugeList2
Instead use the ordsets module:
DO
HugeSet1 = ordsets:from_list(HugeList1),
HugeSet2 = ordsets:from_list(HugeList2),
ordsets:subtract(HugeSet1, HugeSet2)
Obviously, that code will not work if the original order of the list is important. If the order of the list must be preserved, do it like this:
DO
Set = gb_sets:from_list(HugeList2),
[E || E <- HugeList1, not gb_sets:is_element(E, Set)]
Subtle note 1: This code behaves differently from '--' if the lists contain duplicate elements. (One occurrence of an element in HugeList2 will remove all occurrences in HugeList1.)
Subtle note 2: This code compares list elements using the '==' operator, while '--' uses '=:='. If that difference is important, sets can be used instead of gb_sets, but note that sets:from_list/1 is much slower than gb_sets:from_list/1 for long lists.
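The two subtle notes can be seen on small inputs. The sketch below wraps the gb_sets variant in a function; the module name subtract_sketch and the function subtract_keep_order/2 are made up for the example.

```erlang
-module(subtract_sketch).
-export([subtract_keep_order/2]).

%% Order-preserving subtraction using gb_sets, as in the fragment above.
%% Unlike '--', it removes every occurrence of a matching element, and
%% it treats 1 and 1.0 as the same element.
subtract_keep_order(L1, L2) ->
    Set = gb_sets:from_list(L2),
    [E || E <- L1, not gb_sets:is_element(E, Set)].
```

With duplicates, [a, b, a] -- [a] leaves [b, a], while subtract_keep_order([a, b, a], [a]) leaves only [b]. With mixed numeric types, [1, 2.0] -- [1.0] removes nothing, while subtract_keep_order([1, 2.0], [1.0]) removes the integer 1.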
Using the '--' operator to delete an element from a list is not a performance problem:
OK
HugeList1 -- [Element]