Reading SketchVisor: Robust Network Measurement for Software Packet Processing
SIGCOMM17
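Abstract
Network measurement tasks include traffic monitoring, data collection, and the detection of a range of network attacks. Existing sketch-based measurement algorithms suffer from severe performance degradation, heavy computational overhead, and insufficient measurement accuracy, and hardware-based optimizations are not a good fit for sketches. To address this, the paper designs a network measurement framework for pure software packet forwarding and proposes two algorithms that improve on existing ones. The framework offers high performance (line rate), high accuracy, generality (it supports many sketch algorithms), and automation (it adapts to load automatically). It consists of data planes and a control plane: each software switch has its own data plane, and each data plane contains a normal path and a fast path. When traffic overloads the normal path, SketchVisor redirects the excess traffic to the fast path to maintain high performance and high accuracy (with a slight loss); the fast path runs a top-k algorithm. A single centralized control plane uses a compressive sensing algorithm to merge the data reported by the distributed switches and recover network-wide statistics. Experiments show that SketchVisor lets a range of sketch-based measurement methods meet high performance and high accuracy requirements.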
Background
Terminology
- Epochs: one or multiple time periods.
- Traffic statistics can be either flow-based (identified by 5-tuples) or host-based (identified by IP addresses); or either volume-based (measured by byte counts) or connectivity-based (measured by distinct flow/host counts).
- Sketch: At a high level, a sketch is a compact data structure comprising a set of buckets, each of which is associated with one or multiple counters. It maps each packet to a subset of buckets with independent hash functions, and updates the counters of those buckets. Network operators can query the counter values to recover traffic statistics.
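The sketch definition above can be made concrete with a small example. The following is a minimal sketch of a Count-Min sketch, one well-known instance of the general idea. It is chosen here only for illustration, not because the notes name it. The class name, the default sizes, and the use of salted BLAKE2b as the per-row hash function are all assumptions.

```python
import hashlib


class CountMinSketch:
    """A small Count-Min sketch: `depth` rows of `width` counters each."""

    def __init__(self, width=1024, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _bucket(self, row, key):
        # One hash function per row, derived by salting the hash with the row index.
        digest = hashlib.blake2b(
            key.encode(), digest_size=8, salt=row.to_bytes(8, "little")
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def update(self, key, value=1):
        # Map the packet's key to one bucket per row and add its byte count.
        for row in range(self.depth):
            self.rows[row][self._bucket(row, key)] += value

    def query(self, key):
        # Collisions can only inflate counters, so the minimum is the tightest estimate.
        return min(
            self.rows[row][self._bucket(row, key)] for row in range(self.depth)
        )


# Hypothetical flow keys; any string that identifies a flow or host would do.
cms = CountMinSketch()
cms.update("10.0.0.1:1234->10.0.0.2:80/tcp", 1500)
cms.update("10.0.0.1:1234->10.0.0.2:80/tcp", 500)
cms.update("10.0.0.3:5555->10.0.0.2:443/tcp", 64)
estimate = cms.query("10.0.0.1:1234->10.0.0.2:80/tcp")
```

The estimate never falls below the true byte count (2000 here); hash collisions can only push it higher, which is the bounded error that sketches trade for fixed-size memory.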
Measurement Tasks
- Network measurement tasks include monitoring traffic, collecting traffic statistics, and detecting some network attacks.
- Tasks: heavy hitter, heavy changer, DDoS, superspreader, cardinality, flow size distribution, entropy.
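To make a few of these tasks concrete, the snippet below computes them from exact per-flow byte counts, which is the ground truth that a sketch approximates. It is a minimal illustration under assumed definitions: a heavy hitter exceeds a threshold within one epoch, a heavy changer changes by more than a threshold across two epochs, cardinality is the number of distinct flows, and entropy is the Shannon entropy of each flow's share of traffic. The function names and the sample data are hypothetical.

```python
import math


def heavy_hitters(counts, threshold):
    """Flows whose byte count in one epoch exceeds the threshold."""
    return {flow for flow, c in counts.items() if c > threshold}


def heavy_changers(prev, curr, threshold):
    """Flows whose byte count changes by more than the threshold across two epochs."""
    flows = set(prev) | set(curr)
    return {f for f in flows if abs(curr.get(f, 0) - prev.get(f, 0)) > threshold}


def cardinality(counts):
    """Number of distinct flows seen in the epoch."""
    return len(counts)


def entropy(counts):
    """Shannon entropy (in bits) of the per-flow share of total traffic."""
    total = sum(counts.values())
    return -sum(
        (c / total) * math.log2(c / total) for c in counts.values() if c > 0
    )


# Made-up byte counts for two consecutive epochs.
prev_epoch = {"a": 100, "b": 900}
curr_epoch = {"a": 800, "b": 100, "c": 100}

hitters = heavy_hitters(curr_epoch, 500)                 # {"a"}
changers = heavy_changers(prev_epoch, curr_epoch, 500)   # {"a", "b"}
distinct = cardinality(curr_epoch)                       # 3
```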
Performance flaws
Observation
- Sketches are only primitives that cannot be directly used for network measurement.
- In order to collect meaningful traffic statistics, we must add extensions to sketches to make them reversible, meaning that sketches not only store traffic statistics, but also efficiently answer queries on the statistics.
- Although sketches are efficiently designed, applying them in network measurement inevitably incurs heavy computational overhead.
- Sketches are compact data structures that can summarize traffic statistics of all packets with fixed-size memory, while incurring only bounded errors
Microbenchmark
- Provides a comparison of existing methods.
Problem
- Existing sketch-based measurement solutions suffer from severe performance drops under high traffic load.
- Heavy computational overhead: existing representative sketch-based solutions in software actually consume substantial CPU resources, because their additional extensions or components often incur heavy computations.
- Optimizing specific functions (e.g., using hardware-based hash computations) may not work well for all sketch-based solutions.
Design goals
- Performance: It processes packets at high speed and aims to fulfill the line-rate requirement of the underlying packet processing pipeline.
- Resource efficiency: It efficiently utilizes CPU for packet processing and memory for data structures.
- Accuracy: It preserves high measurement accuracy of sketches.
- Generality: It supports a wide range of sketch-based measurement tasks.
- Simplicity: It automatically mitigates the processing burdens of sketch-based measurement tasks under high traffic load, without requiring manual per-host configurations and result aggregations by network operators.
Solution
- SketchVisor: a robust network measurement framework for software packet processing.
- Load balancing: a distributed data plane runs on each host, processes packets based on the sketch-based measurement tasks assigned by network operators, and redirects excessive packets to the fast path if the tasks are overloaded and cannot process those packets at high speed.
- Track large flows: a new top-k algorithm for the fast path.
- Track small flows: a global counter tracks the traffic entering the fast path so as to capture the aggregate characteristics of small flows as well.
- Merge results: a centralized control plane merges the local measurement results.
- The goal of the work is to mitigate the computational overhead of sketch-based measurement, while preserving the theoretical guarantees of sketches.
Implementation
Architecture
- SketchVisor comprises a data plane and a control plane.
Data plane
- Each host has a data plane. A data plane can choose to monitor either ingress or egress traffic, to avoid counting the same packets twice.
- The data plane has two paths, a normal path and a fast path. When the buffer is full, SketchVisor instructs the software switch to redirect overflowed packets to the fast path.
- The authors do not consider any proactive approach that examines packets and decides whether each packet should be dispatched to the normal path or the fast path, as it would incur non-trivial overhead.
- The fast path is less accurate than the normal path.
- The fast path should be: fast enough to absorb all redirected traffic; highly accurate, although slightly degraded from the original sketch-based measurement; and general across traffic statistics, because traffic for any statistic may be redirected to the fast path.
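The buffer-driven redirection described above can be sketched as a toy model. This is a minimal illustration, not SketchVisor's implementation: the class name `DataPlane`, the `drain` method, and the callback interface are all hypothetical, and a real software switch would run the two paths concurrently rather than in one thread.

```python
from collections import deque


class DataPlane:
    """Toy model: a bounded buffer feeds the normal path; overflow goes to the fast path."""

    def __init__(self, buffer_size, normal_path, fast_path):
        self.buffer = deque()
        self.buffer_size = buffer_size
        self.normal_path = normal_path  # callable(flow, size), accurate but slow
        self.fast_path = fast_path      # callable(flow, size), cheap per packet

    def receive(self, flow, size):
        # No per-packet inspection: the only signal is whether the buffer is full.
        if len(self.buffer) < self.buffer_size:
            self.buffer.append((flow, size))
        else:
            self.fast_path(flow, size)

    def drain(self, budget):
        """Let the normal path process up to `budget` buffered packets."""
        for _ in range(min(budget, len(self.buffer))):
            flow, size = self.buffer.popleft()
            self.normal_path(flow, size)


normal_seen, fast_seen = [], []
dp = DataPlane(
    buffer_size=2,
    normal_path=lambda flow, size: normal_seen.append((flow, size)),
    fast_path=lambda flow, size: fast_seen.append((flow, size)),
)
for packet in [("f1", 100), ("f2", 200), ("f3", 300)]:
    dp.receive(*packet)   # the third packet finds the buffer full
dp.drain(budget=2)
```

After this run, `normal_seen` holds the first two packets and `fast_seen` holds the third, mirroring the reactive (rather than proactive) dispatch that the notes describe.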
Control plane
- The control plane collects each switch's results and merges them to provide network-wide measurement.
- The control plane should: eliminate the extra errors due to the fast path (the errors should only come from the sketches themselves); and be general enough to accommodate various measurement tasks.
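One part of the merge step is easy to illustrate for counter-based sketches. If every host builds its sketch with the same dimensions and the same hash functions, the per-host counter matrices can be added element-wise to obtain the sketch of the combined traffic. The function below is a minimal sketch under that assumption; the name `merge_counters` is hypothetical, and it does not cover how fast-path results are folded back in.

```python
def merge_counters(sketches):
    """Element-wise sum of equally shaped counter matrices (lists of rows)."""
    depth, width = len(sketches[0]), len(sketches[0][0])
    merged = [[0] * width for _ in range(depth)]
    for sketch in sketches:
        if len(sketch) != depth or any(len(row) != width for row in sketch):
            raise ValueError("sketches must share the same shape")
        for r, row in enumerate(sketch):
            for c, value in enumerate(row):
                merged[r][c] += value
    return merged


# Two made-up 2x2 counter matrices reported by two hosts.
host_a = [[1, 0], [0, 2]]
host_b = [[3, 1], [0, 0]]
network_wide = merge_counters([host_a, host_b])  # [[4, 1], [0, 2]]
```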
SketchVisor
- Two algorithmic solutions: the first builds on counter-based algorithms to design the fast path, while the second builds on compressive sensing to design a network-wide recovery algorithm.
Fast Path
- To avoid measurement failure and preserve accuracy, SketchVisor redirects overflow traffic to the fast path.
- The fast path uses a top-k algorithm that builds on Misra-Gries's top-k algorithm, which has two limitations.
- First, in order to kick out a small flow and add a (potentially) large flow, it performs O(k) operations to update k counters in a hash table; the overhead becomes significant when there are many small flows to kick out.
- Second, it has loose bounds on the estimated values of the top-k flows.
- To overcome both limitations, the authors combine the idea of probabilistic lossy counting (PLC), a probabilistic algorithm that improves accuracy for tracking skewed data, with Misra-Gries's algorithm.
- Specifically, the algorithm kicks out multiple small flows each time, obviating the need to perform O(k) counter update operations for kicking out each flow (i.e., the operations are amortized over multiple kick-outs).
- Also, instead of using one counter per flow, it associates three counters with each flow to provide tight per-flow lower and upper bounds.
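For reference, the baseline that the fast path improves on can be sketched as follows. This is a byte-weighted variant of the classic Misra-Gries algorithm, not SketchVisor's modified algorithm: it uses one counter per flow and has no batch kick-out. The function name and the sample packets are hypothetical. The loop over every counter when the table is full is the O(k) cost mentioned above.

```python
def misra_gries(packets, k):
    """Track at most k candidate heavy flows from (flow, size) pairs."""
    counters = {}
    for flow, size in packets:
        if flow in counters:
            counters[flow] += size
        elif len(counters) < k:
            counters[flow] = size
        else:
            # Table is full: decrement every counter by the smallest amount that
            # either frees a slot or absorbs the new packet. This touches all k entries.
            dec = min(size, min(counters.values()))
            for tracked in list(counters):
                counters[tracked] -= dec
                if counters[tracked] == 0:
                    del counters[tracked]
            if size > dec:
                # At least one entry was evicted, so there is room for the new flow.
                counters[flow] = size - dec
    return counters


packets = [("a", 5), ("b", 3), ("c", 1), ("a", 4), ("d", 2)]
top = misra_gries(packets, k=2)  # {"a": 6}
```

Flow `a` really sent 9 bytes but is reported as 6: each stored value is only a lower bound, off by at most the total amount decremented (3 here). This is the loose-bound limitation that the three-counter design targets.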
Compressive Sensing
- Use compressive sensing to recover network-wide statistics.
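As background, the textbook form of compressive sensing recovery is shown below. This is the generic basis-pursuit formulation, not necessarily the objective used in the paper; the symbols are assumptions introduced here. $y$ is the vector of observed measurements, $A$ is the known measurement matrix, and $x$ is the unknown signal, which is assumed to be sparse.

```latex
\hat{x} = \arg\min_{x} \; \|x\|_1 \quad \text{subject to} \quad A x = y
```

Minimizing the $\ell_1$ norm favors sparse solutions. That suits this setting if the information missing from the normal-path sketches is dominated by a small number of large flows.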
Related Work
- Sampling: widely used in software-defined measurement for low measurement overhead, but inherently misses information and supports only coarse-grained measurement.
- Sketches: many architectures employ sketches as primitives to achieve fine-grained measurement for various measurement tasks, but incur high computational overhead.
- TCAM: can be used to achieve high-performance network measurement.
- Rule matching: selectively processes only packets of interest, thereby reducing measurement overhead, but hash tables incur much higher memory overhead than sketch-based approaches.
- Recovering missing information: formulates a matrix interpolation problem to enable the control plane to recover missing information via compressive sensing.
Advantages
- high throughput and high accuracy
- fine grained
- accurately reasons about behavior under high traffic load
- resource-efficient
- recovers network-wide statistics
Conclusion
- Design and implement SketchVisor, a robust network-wide measurement architecture for software packet processing, with a primary goal of preserving performance and accuracy guarantees even under high traffic load. SketchVisor employs sketches as basic measurement primitives, and achieves high data plane performance with a fast path to offload sketch-based measurement under high traffic load. It further leverages compressive sensing to achieve accurate network-wide measurement. Experiments demonstrate that SketchVisor achieves high performance and high accuracy for a rich set of sketch-based solutions.