Converting Boolean-Logic Decision Trees to Finite State Machines: how to convert boolean expressions to finite state machines for simpler, high-performance detection of cybersecurity events

Problem

  • Source address: 123.123.123.123:14001
  • Destination address: 192.168.0.1:19001
  • URL: https://myservice.com/path1

Boolean expressions

and(
    or(
        tcp:80, tcp:8080
    ),
    ipv4:10.0.0.1,
    or(
        url:http://www.example.com/malware.dat,
        url:http://example.com/malware.dat
    )
)

Input

ipv4:123.123.123.123
tcp:14001
ipv4:192.168.0.1
tcp:19001
url:https://myservice.com/path1
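
As a rough illustration of how the event in the Problem section maps onto the type:value input above, here is a minimal sketch. The function name event_attributes is hypothetical, and the code assumes IPv4 addresses written as host:port over TCP, which is what the example event uses.

```python
def event_attributes(source, destination, url):
    """Flatten one observed event into a list of type:value strings."""
    attributes = []
    for address in (source, destination):
        host, port = address.rsplit(":", 1)   # split "a.b.c.d:port"
        attributes.append("ipv4:" + host)
        attributes.append("tcp:" + port)      # assumes the transport is TCP
    attributes.append("url:" + url)
    return attributes

for attribute in event_attributes("123.123.123.123:14001",
                                  "192.168.0.1:19001",
                                  "https://myservice.com/path1"):
    print(attribute)
```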

A basic evaluation algorithm

  • For each type:value attribute, see if there is a corresponding type:value term in the boolean tree. If there is, set the term node to true and evaluate its parent node.
  • When the parent is an or node: as soon as any child is true, the or node is true, and its parent node is evaluated.
  • When the parent is an and node: once ALL children are true, the and node is true, and its parent node is evaluated.
  • When the parent is a not node: if the child node is true, the not node is false. Once all attributes have been evaluated, any not node that has not been deemed false (because its child never became true) is evaluated as true, and its parent node is evaluated.
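
The steps above can be sketched in Python as follows. This is a minimal illustration rather than the cyberprobe implementation: the Node class, the match helper and the function names are all hypothetical, and not nodes are resolved in post-order at the end of input so that nested not nodes settle from the inside out.

```python
class Node:
    """One tree node; op is 'and', 'or', 'not' or 'match' (hypothetical)."""
    def __init__(self, op, children=(), term=None):
        self.op = op
        self.term = term
        self.children = list(children)
        self.parent = None
        self.value = False
        for child in self.children:
            child.parent = self

def match(term):
    return Node("match", term=term)

def mark_true(node):
    """Set a node true, then evaluate its parent as the bullets describe."""
    if node.value:
        return
    node.value = True
    parent = node.parent
    if parent is None or parent.op == "not":
        return                      # a not node with a true child stays false
    if parent.op == "or" or all(c.value for c in parent.children):
        mark_true(parent)

def walk(node):
    """Yield nodes in post-order, so children come before their parent."""
    for child in node.children:
        yield from walk(child)
    yield node

def evaluate(root, attributes):
    nodes = list(walk(root))
    for attribute in attributes:
        for node in nodes:
            if node.op == "match" and node.term == attribute:
                mark_true(node)
    # End of input: a not node whose child never became true is true.
    for node in nodes:
        if node.op == "not" and not node.children[0].value:
            mark_true(node)
    return root.value

def example_tree():
    return Node("and", [
        Node("or", [match("tcp:80"), match("tcp:8080")]),
        match("ipv4:10.0.0.1"),
        Node("or", [match("url:http://www.example.com/malware.dat"),
                    match("url:http://example.com/malware.dat")]),
    ])

print(evaluate(example_tree(), ["ipv4:123.123.123.123", "tcp:14001"]))
print(evaluate(example_tree(), ["tcp:80", "ipv4:10.0.0.1",
                                "url:http://example.com/malware.dat"]))
```

Node objects hold their truth value, so a fresh tree is built for each event. That per-event state, and the tree walk it requires, is the cost the FSM conversion is meant to remove.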

Converting to an FSM

Step 1: Identify the ‘basic states’

  1. The root of a tree is inherently the hit state, meaning the boolean expression is true. This is a basic state.
  2. A not node is never a basic state.
  3. A child of an and node is a basic state unless it is a not node. This covers sub-expressions as well as match terms: in the example above, both or nodes are children of the root and node, so both are basic states.
  4. A child of a not node is a basic state unless it is itself a not node.
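
The four rules can be sketched as a single tree walk. The tuple encoding (an operator name with a list of children, and plain strings for match terms) and the path-style labels such as root.0 are assumptions made for this illustration; they do not reproduce the s3, s4, s7 numbering used in the article. The sketch also assumes the root is not itself a not node.

```python
# Hypothetical encoding: (op, [children]) for and/or/not, a string for a term.
EXPR = ("and", [
    ("or", ["tcp:80", "tcp:8080"]),
    "ipv4:10.0.0.1",
    ("or", ["url:http://www.example.com/malware.dat",
            "url:http://example.com/malware.dat"]),
])

def basic_states(node, parent_op=None, path="root", found=None):
    """Collect (path, label) for every node that the rules make basic."""
    found = [] if found is None else found
    is_term = isinstance(node, str)
    op = None if is_term else node[0]
    # Rule 2: never a not node. Rules 1, 3, 4: root, child of and, child of not.
    if op != "not" and parent_op in (None, "and", "not"):
        found.append((path, node if is_term else op))
    if not is_term:
        for index, child in enumerate(node[1]):
            basic_states(child, op, f"{path}.{index}", found)
    return found

for path, label in basic_states(EXPR):
    print(path, label)
```

For the first example this finds the root, the two or nodes and the ipv4:10.0.0.1 term. The match terms under each or node are not basic states, because their truth is fully captured by the or node above them.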

Step 2: Identify the ‘combination states’

  • init: The empty set
  • s3: The first or node
  • s4: The ipv4:10.0.0.1 node
  • s7: The second or node
  • s3-4: The first or node and the ipv4:10.0.0.1 node
  • s4-7: The ipv4:10.0.0.1 node and the second or node
  • s3-7: The first and second or nodes
  • hit: The root node
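
The list above is what you get by enumerating every subset of the non-root basic states. A minimal sketch of that enumeration follows. The naming function is hypothetical, and treating the full set as hit is an assumption that holds for this example only because the root is an and node over exactly these three states.

```python
from itertools import combinations

BASIC = ["s3", "s4", "s7"]      # non-root basic states of the first example

def state_name(combo, basic):
    """Name a combination state after the basic states that are true in it."""
    if not combo:
        return "init"
    if len(combo) == len(basic):
        return "hit"            # assumption: all children of the root and are true
    numbers = sorted(int(name[1:]) for name in combo)
    return "s" + "-".join(str(n) for n in numbers)

def combination_states(basic):
    """Return the names of all subsets of the basic states, smallest first."""
    names = []
    for size in range(len(basic) + 1):
        for combo in combinations(basic, size):
            names.append(state_name(combo, basic))
    return names

print(combination_states(BASIC))
```

The number of combination states doubles with every basic state added, which is why the later pruning step matters for larger expressions.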

Step 3: Find all match terms

Step 4: Find all transitions

For every combination state:
    Work out the state name of that 'input' combination state
    For every match term:
        Given the input state
        What state results from evaluating that term as true?
        Work out the state name of that 'output' combination state
        Record a transition (input, match term, output)
    Given the input state
    What state results from evaluating end: as true?
    Work out the state name of that 'output' combination state
    Record a transition (input, end:, output)

Step 5: Remove invalid transitions

First, find the states from which hit can be reached:

  1. Create a set containing only the combination state hit.
  2. Iterate over the FSM, adding the source state of every transition that navigates to a state in the set.
  3. Repeat step 2 until no new states are added.

Then, find the states that can be reached from init:

  1. Construct a set containing only init.
  2. Iterate over the FSM, adding the destination state of every transition that navigates from a state in the set.
  3. Repeat step 2 until no new states are added.
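
Both searches are the same fixed-point iteration run in opposite directions, which the sketch below makes explicit. The transition list is hypothetical sample data loosely modelled on the second example, and redirecting dead-end destinations to fail is an assumption based on the description of the fail state elsewhere in this article.

```python
def closure(start, edges):
    """States reachable from start by repeatedly following (src, dst) edges."""
    seen = {start}
    changed = True
    while changed:                      # repeat until no new states are added
        changed = False
        for src, dst in edges:
            if src in seen and dst not in seen:
                seen.add(dst)
                changed = True
    return seen

def prune(transitions):
    """Keep transitions that lie on some path from init towards hit."""
    forward = closure("init", [(s, d) for s, _, d in transitions])
    backward = closure("hit", [(d, s) for s, _, d in transitions])
    kept = []
    for src, term, dst in transitions:
        if src in forward and src in backward:
            # Assumption: a destination that cannot reach hit becomes fail.
            kept.append((src, term, dst if dst in backward else "fail"))
    return kept

# Hypothetical sample data, not output from any real tool.
SAMPLE = [
    ("init", "tcp:80", "s5"),
    ("s5", "url:x", "s5-8-9"),
    ("s5-8-9", "end:", "hit"),
    ("init", "tcp:8081", "s3"),
    ("s3", "tcp:80", "s3-5"),
    ("s9", "end:", "hit"),
    ("s5-8", "end:", "hit"),
]

for transition in prune(SAMPLE):
    print(transition)
```

In this sample the transitions out of s9 and s5-8 are dropped because nothing leads to those states, and the transition into s3 is redirected to fail because no path leads from s3 to hit.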

Resultant FSM

Using the FSM

  • The FSM starts in the init state.
  • As attributes are discovered, each type:value is compared to the transitions from the current state. If a matching transition exists, the FSM moves to the new state.
  • Reaching the hit state is equivalent to the boolean expression evaluating to true.
  • When the fail state is reached, no further attribute discovery is needed, and the evaluation can fail fast.
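
A minimal sketch of that run loop is shown here. The transition table is copied from the indicators-show-fsm listing in the implementation section of this article; the dictionary layout and the run_fsm function are hypothetical, and applying an end: transition once the attributes run out is an assumption carried over from Step 4.

```python
# (current state, attribute) -> next state, from the case1.json listing.
TRANSITIONS = {
    ("init", "tcp:8080"): "s6",
    ("init", "tcp:80"): "s6",
    ("init", "url:http://example.org/malware.dat"): "s3",
    ("init", "url:http://www.example.org/malware.dat"): "s3",
    ("s3", "tcp:8080"): "hit",
    ("s3", "tcp:80"): "hit",
    ("s6", "url:http://example.org/malware.dat"): "hit",
    ("s6", "url:http://www.example.org/malware.dat"): "hit",
}

def run_fsm(transitions, attributes):
    """Feed attributes to the FSM and return the state it ends in."""
    state = "init"
    for attribute in attributes:
        # Stay in the current state when no transition matches.
        state = transitions.get((state, attribute), state)
        if state in ("hit", "fail"):
            return state                # nothing later can change the outcome
    # Assumption: end: is evaluated once all attributes have been seen.
    return transitions.get((state, "end:"), state)

print(run_fsm(TRANSITIONS, ["ipv4:10.0.0.1", "tcp:80",
                            "url:http://example.org/malware.dat"]))
print(run_fsm(TRANSITIONS, ["ipv4:123.123.123.123", "tcp:14001"]))
```

Each attribute costs one dictionary lookup, and the only per-event state is the current state name, which is where the simplification over walking the boolean tree comes from.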

Example 2: Using not

and(
    not(
        or(
            tcp:8081, tcp:8082
        )
    ),
    and(
        tcp:80,
        or(
            url:http://www.example.com/malware.dat,
            url:http://example.com/malware.dat
        )
    )
)

  • There is no transition to the combination state consisting only of s9, because it is not possible to arrive at s9 without first evaluating both s5 and s8 as true. There are transitions leading from s9 to hit, but no path leads to s9, so they can never be taken.
  • State s5-8 is similarly invalid: if s5 and s8 are both evaluated as true, s9 is also true. In both cases, the valid state for this condition is s5-8-9. This leaves an unreachable two-node part of the FSM that is not connected to the rest of it.
  • Any state in which s3 is true cannot lead to hit, because the root and node is then necessarily false. All states that include s3 can be replaced by the fail state.

Example 3: More state

and(
    or(
        url:http://www.example.com/malware.dat,
        url:http://example.com/malware.dat
    ),
    ipv4:10.0.0.1,
    not(
        and(
            or(
                tcp:8081,
                tcp:8082
            ),
            ipv4:10.0.0.2
        )
    )
)

Evaluating many rules concurrently

Implementation: cyberprobe indicators

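The cyberprobe project includes a way of writing rules in JSON format, along with a number of utilities that parse the rule format and output FSM information. For example, indicators-show-fsm takes a rules/indicators file and dumps the FSM for each rule in the file, showing the state transitions in human-readable form: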

[indicators]$ indicators-show-fsm case1.json
3ce77704-abe4-4527-84e6-ed6a745aebcf: URL of a page serving malware
init -- tcp:8080 -> s6
init -- tcp:80 -> s6
init -- url:http://example.org/malware.dat -> s3
init -- url:http://www.example.org/malware.dat -> s3
s3 -- tcp:8080 -> hit
s3 -- tcp:80 -> hit
s6 -- url:http://example.org/malware.dat -> hit
s6 -- url:http://www.example.org/malware.dat -> hit

[indicators]$ indicators-graph-fsm case1.json \
    3ce77704-abe4-4527-84e6-ed6a745aebcf > graph.dot
[indicators]$ dot -Tpng graph.dot > graph.png

#!/usr/bin/env python3

import sys
import cyberprobe.fsm_extract as fsme
from cyberprobe.logictree import And, Or, Not, Match

# Build the boolean expression from the first example
expression = And([
    Or([
        Match("tcp", "80"), Match("tcp", "8080")
    ]),
    Match("ipv4", "10.0.0.1"),
    Or([
        Match("url", "http://www.example.com/malware.dat"),
        Match("url", "http://example.com/malware.dat")
    ])
])

fsm = fsme.extract(expression)

# Dump out FSM
for v in fsm:
    for w in v[1]:
        print(" %s -- %s:%s -> %s" % (v[0], w[0], w[1], v[2]))

A quick word on performance

Conclusion

  • How boolean expressions can be represented as trees
  • Mapping boolean expressions to finite state machines
  • How the use of an FSM simplifies detection logic and enables performance advantages
  • How to use multiple FSMs for evaluation of concurrent boolean expressions
  • That the FSM approach is implemented in the open source cyberprobe project
posted @ 2023-02-03 16:51  bonelee