
Introduction to Metron

2020-09-16 10:45  宋海宾

I. Introduction:

http://metron.apache.org/current-book/index.html

 

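Metron is a versatile platform for security telemetry capture, stream analytics, and threat response. It grew out of OpenSOC, Cisco's open-source big data security framework project. Its capabilities include log aggregation, indexing and storage of full packet captures, advanced behavioral analytics, and data enrichment, and it can apply current threat intelligence to security telemetry. Conceptually it divides into four components: data capture and ingestion, real-time data processing, guaranteed data persistence and storage, and machine learning models that drive monitoring and risk-alerting services.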

II. Logical components:

[Figure: Metron logical components (image not included in the source)]

III. Logical architecture:

[Figure: Metron logical architecture (image not included in the source)]

  1. Telemetry Event Buffer: the ingest buffer for telemetry events; data from the sensors is written to Kafka message queues. (Official description: All raw events from each telemetry security data source captured by Apache Nifi or custom Metron probe will be pushed into its own Kafka topic. The arrival of a telemetry event into the ingest buffer marks the start of where the Metron processing begins.)
  2. Process: normalizes the sensor data so that downstream modules can handle it. (Official description: Each raw event will be parsed and normalized into a standardized flat JSON structure. Every event will be standardized into at least a 7-tuple JSON structure. This is done so the topology correlation engine further downstream can correlate messages from different topologies by these fields. The standard field names are as follows: ip_src_addr: layer 3 source IP; ip_dst_addr: layer 3 dest IP; ip_src_port: layer 4 source port; ip_dst_port: layer 4 dest port; protocol: layer 4 protocol; timestamp (epoch); original_string: A human friendly string representation of the message.)
  3. Enrich: enriches the normalized data; for example, geographic information (such as the city) can be derived from an IP address. (Official description: Once the raw security telemetry event has been parsed and normalized, the next step is to enrich different data elements of the normalized event. Examples of enrichment are GEO where an external IP address is enriched with GeoIP information (lat/long coordinates + City/State/Country) or HOST enrichment where an IP gets enriched with Host details (e.g: IP corresponds to Host X which is part of a web server farm for an e-commerce application).)
  4. Label: labels the enriched data, for example marking whether it is a threat and, if so, what kind. (Official description: After enrichment, the telemetry event goes through the labeling process. Actions done within this phase include threat intel cross reference checks where elements within the telemetry event can be used to do look ups against threat intel feed data sources like Soltra produced Stix/Taxii feeds or other threat intel aggregator services. These threat intel services will then “label” the telemetry event with threat intel metadata when a hit occurs. Other types of services include executing/scoring analytical models using model as a service pattern with the telemetry events that are flowing in.)
  5. Alert and Persist: some telemetry events can trigger alerts; Metron persists and indexes this data to make later processing easier. (Official description: During this phase, certain telemetry events can initiate alerts. These types of telemetry events are then indexed in an alert index store. A telemetry event can spawn an alert triggered by a number of factors including: 1) The event type - The raw telemetry event itself is an alert. For example, any event generated by Snort is an alert so it will automatically be indexed as an alert. 2) Threat intel hit - If raw telemetry event has a threat intel hit, it will be marked as an alert. Also during this step, all enriched and labeled telemetry events are indexed and persisted in Hadoop for long term storage. The storage of these events in Hadoop produces a security data vault within the enterprise that enables next generation analytics to be performed.)
  6. UI Portal and Data & Integration Services: visualization of data and threats.
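
The stages above can be sketched end to end. What follows is a minimal illustrative Python sketch, not Metron code. The raw log format, the `GEO_TABLE` and `THREAT_INTEL` lookup tables, the enrichment field names, and all function names are assumptions made up for this example; only the seven normalized field names come from the official description.

```python
import json
import re

# Hypothetical raw log format, invented for this sketch:
#   "<epoch_ms> <proto> <src_ip>:<src_port> -> <dst_ip>:<dst_port>"
RAW_PATTERN = re.compile(
    r"(?P<ts>\d+) (?P<proto>\w+) "
    r"(?P<src>[\d.]+):(?P<sport>\d+) -> (?P<dst>[\d.]+):(?P<dport>\d+)"
)

# Toy lookup tables standing in for a GeoIP database and a threat intel feed.
GEO_TABLE = {"203.0.113.7": {"city": "Exampleville", "country": "EX"}}
THREAT_INTEL = {"203.0.113.7": "made-up feed entry"}

ADDRESS_FIELDS = ("ip_src_addr", "ip_dst_addr")


def parse(raw: str) -> dict:
    """Process: normalize a raw line into the flat 7-field structure."""
    line = raw.strip()
    m = RAW_PATTERN.fullmatch(line)
    if m is None:
        raise ValueError(f"unparseable event: {raw!r}")
    return {
        "ip_src_addr": m["src"],
        "ip_dst_addr": m["dst"],
        "ip_src_port": int(m["sport"]),
        "ip_dst_port": int(m["dport"]),
        "protocol": m["proto"],
        "timestamp": int(m["ts"]),
        "original_string": line,
    }


def enrich(event: dict) -> dict:
    """Enrich: attach geo details for addresses found in the toy table."""
    out = dict(event)
    for field in ADDRESS_FIELDS:
        for key, value in GEO_TABLE.get(event[field], {}).items():
            out[f"{field}_geo_{key}"] = value  # hypothetical field naming
    return out


def label(event: dict) -> dict:
    """Label: record which address fields hit the toy threat intel table."""
    out = dict(event)
    out["threat_intel_hits"] = [f for f in ADDRESS_FIELDS if event[f] in THREAT_INTEL]
    return out


def run_pipeline(raw: str) -> dict:
    """Parse, enrich, label, then mark the event as an alert on any hit."""
    event = label(enrich(parse(raw)))
    event["is_alert"] = bool(event["threat_intel_hits"])
    return event


if __name__ == "__main__":
    sample = "1600224300000 TCP 203.0.113.7:51234 -> 10.0.0.5:443"
    print(json.dumps(run_pipeline(sample), indent=2))
```

Each stage takes a flat dictionary and returns a new one, which mirrors the idea that every stage adds fields to the same flat JSON message rather than restructuring it.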

 

Apache Metron

Metron integrates a variety of open source big data technologies in order to offer a centralized tool for security monitoring and analysis. Metron provides capabilities for log aggregation, full packet capture indexing, storage, advanced behavioral analytics and data enrichment, while applying the most current threat intelligence information to security telemetry within a single platform.

For the latest information, please visit our website at http://metron.apache.org/

Metron can be divided into 4 areas:

  1. A mechanism to capture, store, and normalize any type of security telemetry at extremely high rates. Because security telemetry is constantly being generated, it requires a method for ingesting the data at high speeds and pushing it to various processing units for advanced computation and analytics.

  2. Real time processing and application of enrichments such as threat intelligence, geolocation, and DNS information to telemetry being collected. The immediate application of this information to incoming telemetry provides the context and situational awareness, as well as the who and where information critical for investigation.

  3. Efficient information storage based on how the information will be used:

    • Logs and telemetry are stored such that they can be efficiently mined and analyzed for concise security visibility.
    • The ability to extract and reconstruct full packets helps an analyst answer questions such as who the true attacker was, what data was leaked, and where that data was sent.
    • Long-term storage not only increases visibility over time, but also enables advanced analytics such as machine learning techniques to be used to create models on the information. Incoming data can then be scored against these stored models for advanced anomaly detection.
  4. An interface that gives a security investigator a centralized view of data and alerts passed through the system. Metron’s interface presents alert summaries with threat intelligence and enrichment data specific to that alert on one single page. Furthermore, advanced search capabilities and full packet extraction tools are presented to the analyst for investigation without the need to pivot into additional tools.

Big data is a natural fit for powerful security analytics. The Metron framework integrates a number of elements from the Hadoop ecosystem to provide a scalable platform for security analytics, incorporating such functionality as full-packet capture, stream processing, batch processing, real-time search, and telemetry aggregation. With Metron, our goal is to tie big data into security analytics and drive towards an extensible centralized platform to effectively enable rapid detection and rapid response for advanced security threats.

Obtaining Metron

To obtain a release of Metron, please visit http://metron.apache.org/documentation/#releases

For convenience, this repository is a collection of submodules that is regularly updated to point to the latest versions. GitHub provides multiple ways to obtain Metron’s code:

  1. git clone --recursive https://github.com/apache/metron
  2. Download ZIP
  3. Clone or download each repository individually

Option 3 is more likely to have the latest code.

Getting Started

To start exploring the capabilities of Apache Metron, follow these instructions to launch Metron in a single-node VM on your own hardware.

Building Metron

Build the full project and run tests:

$ mvn clean install

Build without tests:

$ mvn clean install -DskipTests

Build with the HDP profile:

$ mvn clean install -PHDP-2.5.0.0

You can swap “install” for “package” in the commands above if you don’t want to deploy the artifacts to your local .m2 repo.

Build Metron Reporting

To build and run reporting with code coverage:

$ mvn clean install
$ mvn site site:stage-deploy site:deploy

Code coverage can be skipped by skipping tests:

$ mvn clean install -DskipTests site site:stage-deploy site:deploy

The staged site is deployed to /tmp/metron/site/index.html, and can be viewed by opening the file in a browser.

Building with Docker

A Docker container with all the required software, in the proper versions, is also available. See ansible-docker.

Navigating the Architecture

Metron is at its core a Kappa architecture with Apache Storm as the processing component and Apache Kafka as the unified data bus.
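
To illustrate what a Kappa architecture means here, the following is a toy in-memory sketch in Python. It assumes nothing about the Kafka or Storm APIs: `Log`, `StreamJob`, and the two fold functions are hypothetical names invented for this example. The point it tries to show is that there is a single append-only log, and that reprocessing with new logic is done by replaying that same log through a new job rather than through a separate batch layer.

```python
from collections import Counter
from typing import Callable, List


class Log:
    """Append-only in-memory log standing in for a Kafka topic (toy, not Kafka)."""

    def __init__(self) -> None:
        self._records: List[dict] = []

    def append(self, record: dict) -> int:
        """Append a record and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def read_from(self, offset: int) -> List[dict]:
        """Return all records at or after the given offset."""
        return self._records[offset:]


class StreamJob:
    """A stream processor that folds log records into a derived view."""

    def __init__(self, log: Log, fold: Callable[[Counter, dict], None]) -> None:
        self.log = log
        self.fold = fold
        self.offset = 0  # next offset to read
        self.view: Counter = Counter()

    def poll(self) -> None:
        """Process every record appended since the last poll."""
        batch = self.log.read_from(self.offset)
        for record in batch:
            self.fold(self.view, record)
        self.offset += len(batch)


def count_by_source(view: Counter, record: dict) -> None:
    view[record["ip_src_addr"]] += 1


def count_by_protocol(view: Counter, record: dict) -> None:
    view[record["protocol"]] += 1


if __name__ == "__main__":
    log = Log()
    log.append({"ip_src_addr": "10.0.0.1", "protocol": "TCP"})
    log.append({"ip_src_addr": "10.0.0.2", "protocol": "UDP"})

    live = StreamJob(log, count_by_source)
    live.poll()

    # New logic is deployed as a new job that replays the log from offset 0.
    replay = StreamJob(log, count_by_protocol)
    replay.poll()
    print(live.view, replay.view)
```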

For more information, here are some high-level links to the relevant subparts of the architecture:

  • Parsers : Parsing data from Kafka into the Metron data model and passing it downstream to Enrichment.
  • Enrichment : Enriching data post-parsing and providing the ability to tag a message as an alert and assign a risk triage level via a custom rule language.
  • Indexing : Indexing the data post-enrichment into HDFS, Elasticsearch or Solr.
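
The idea of tagging a message as an alert and assigning it a risk triage level can be sketched as follows. This is an illustrative Python sketch only: Metron expresses such rules in its own rule language, and the `TriageRule` class, the two example rules, their scores, the output field names, and the choice of `max` as the default aggregator are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class TriageRule:
    """A hypothetical triage rule: a predicate over an event plus a score."""
    name: str
    applies: Callable[[dict], bool]
    score: float


# Made-up rules and scores, for illustration only.
RULES: List[TriageRule] = [
    TriageRule("threat intel hit", lambda e: bool(e.get("threat_intel_hits")), 10.0),
    TriageRule("telnet destination port", lambda e: e.get("ip_dst_port") == 23, 5.0),
]


def triage(
    event: dict,
    rules: List[TriageRule],
    aggregate: Callable[[Iterable[float]], float] = max,
) -> dict:
    """Return a copy of the event tagged with alert status and a triage score."""
    matched = [rule for rule in rules if rule.applies(event)]
    out = dict(event)
    out["is_alert"] = bool(matched)
    # With no matching rule there is nothing to aggregate, so the score is 0.
    out["triage_score"] = aggregate(r.score for r in matched) if matched else 0.0
    out["triage_rules"] = [r.name for r in matched]
    return out


if __name__ == "__main__":
    event = {"ip_dst_port": 23, "threat_intel_hits": ["ip_src_addr"]}
    print(triage(event, RULES))
    print(triage(event, RULES, aggregate=sum))
```

Passing the aggregator as a parameter keeps the rule set separate from the policy for combining scores, so the same rules can yield either the worst single score or a cumulative one.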

Some useful utilities that cross all of these parts of the architecture:

  • Stellar : A custom data transformation language that is used throughout Metron, from simple field transformation to expressing triage rules.
  • Model as a Service : A Yarn application which can deploy machine learning and statistical models onto the cluster along with the associated Stellar functions to be able to call out to them in a scalable manner.
  • Data management : A set of data management utilities aimed at getting data into HBase in a format which will allow data flowing through Metron to be enriched with the results. Contains integrations with threat intelligence feeds exposed via TAXII as well as simple flat file structures.
  • Profiler : A feature extraction mechanism that can generate a profile describing the behavior of an entity. An entity might be a server, user, subnet or application. Once a profile has been generated defining what normal behavior looks like, models can be built that identify anomalous behavior.
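
The profiling idea in the last item can be illustrated with a small Python sketch. It is not the Metron Profiler: the `Profile` class, the z-score test, and the threshold of 3.0 are assumptions chosen for this example. It keeps a history of one numeric measurement per entity (for instance, connections per time window) and flags a new value that lies far from that entity's usual range.

```python
from collections import defaultdict
from statistics import mean, pstdev
from typing import DefaultDict, List


class Profile:
    """Per-entity history of one numeric feature (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self._history: DefaultDict[str, List[float]] = defaultdict(list)

    def record(self, entity: str, value: float) -> None:
        """Add one observation, e.g. the event count for one time window."""
        self._history[entity].append(value)

    def zscore(self, entity: str, value: float) -> float:
        """Distance of value from the entity's mean, in standard deviations."""
        history = self._history[entity]
        if len(history) < 2:
            return 0.0  # too little data to define normal behavior
        sigma = pstdev(history)
        if sigma == 0:
            # All past values are identical: any different value is an outlier.
            return 0.0 if value == history[0] else float("inf")
        return (value - mean(history)) / sigma

    def is_anomalous(self, entity: str, value: float, threshold: float = 3.0) -> bool:
        return abs(self.zscore(entity, value)) > threshold


if __name__ == "__main__":
    profile = Profile()
    for count in (100, 110, 90, 105, 95):
        profile.record("server-a", count)
    print(profile.is_anomalous("server-a", 102))
    print(profile.is_anomalous("server-a", 500))
```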

Notes on Adding a New Sensor

In order to allow meta alerts to be queried alongside regular alerts in Elasticsearch 2.x, it is necessary to add an additional field to the templates and mapping for existing sensors.

For a description of the steps necessary to make this change, please see "Using Metron with Elasticsearch 2.x" in metron-elasticsearch.