Public Network Security Datasets
from: http://users.cis.fiu.edu/~lpeng/Datasets_detail.html
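DARPA Intrusion Detection Datasets
The DARPA datasets have long been the standard datasets in the field of network intrusion detection. They comprise three datasets: DARPA 1998, DARPA 1999, and DARPA 2000.
DARPA 1998 collected 9 weeks of tcpdump network connection data and system audit data, 7 weeks for training and 2 weeks for testing, covering four major attack categories: Probe, DoS, R2L, and U2R.
DARPA 1999 covers 58 typical attacks in 5 major categories (Probe, DoS, R2L, U2R, and Data). It has been regarded as the most comprehensive attack test dataset and is widely accepted and used as a benchmark in the research community. The DARPA 1999 evaluation provides 5 weeks of simulated data. The first three weeks are training data given to evaluation participants: weeks 1 and 3 are normal data containing no attacks, and week 2 contains 43 attack instances belonging to 18 attack types. Weeks 4 and 5 are used for testing.
DARPA 2000 builds on DARPA 1999 by adding DDoS (Distributed Denial of Service) attacks to the attack data, along with insider attacks, inside sniffing data, and Windows NT traffic and attacks.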
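KDD Cup 99 Dataset
Professor Sal Stolfo of Columbia University and Professor Wenke Lee of North Carolina State University applied data mining and related techniques to the DARPA 98 and DARPA 99 datasets for feature analysis and preprocessing, producing a new dataset. It was used in the KDD CUP competition held in 1999 and became the well-known KDD CUP 99 dataset. Although it is fairly old, KDD CUP 99 remains the de facto benchmark in network intrusion detection and laid the groundwork for intrusion detection research based on computational intelligence.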
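The four attack categories inherited from DARPA (Probe, DoS, R2L, U2R) are usually recovered from the attack label on each record. The following is a minimal sketch, assuming the attack label names found in the KDD CUP 99 training file, where each label carries a trailing period. The function name `attack_category` is hypothetical and the mapping covers only training-set labels.

```python
# Attack labels from the KDD CUP 99 training set, grouped by category.
# Labels that appear only in the test set are not covered here.
CATEGORY_BY_LABEL = {
    **dict.fromkeys(
        ["back", "land", "neptune", "pod", "smurf", "teardrop"], "DoS"),
    **dict.fromkeys(
        ["ipsweep", "nmap", "portsweep", "satan"], "Probe"),
    **dict.fromkeys(
        ["ftp_write", "guess_passwd", "imap", "multihop",
         "phf", "spy", "warezclient", "warezmaster"], "R2L"),
    **dict.fromkeys(
        ["buffer_overflow", "loadmodule", "perl", "rootkit"], "U2R"),
}


def attack_category(label: str) -> str:
    """Map a raw label such as 'smurf.' to one of the four categories."""
    name = label.strip().rstrip(".").lower()
    if name == "normal":
        return "Normal"
    return CATEGORY_BY_LABEL.get(name, "Unknown")


print(attack_category("smurf."))   # a DoS attack label
print(attack_category("normal."))  # benign traffic
```

Returning "Unknown" for unmapped labels, rather than raising, keeps the sketch usable on test data that contains attack types absent from training.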
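NSL-KDD Dataset
To address the shortcomings of the KDD CUP 99 dataset, NSL-KDD removes its redundant records. This avoids classifiers being biased toward frequently repeated records, which degrades the performance of learning methods. In addition, the ratio of normal to anomalous records was chosen appropriately and the sizes of the training and test sets are more reasonable, so the dataset is better suited to effective and accurate comparison of different machine learning techniques.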
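To see why redundant records matter, the sketch below removes exact duplicates from a toy record list and compares the label distribution before and after. It illustrates only the duplicate-removal idea, not the full NSL-KDD construction procedure. The records and the helper name `deduplicate` are made up for illustration.

```python
from collections import Counter


def deduplicate(records):
    """Return records with exact duplicates removed, keeping first-seen order."""
    seen = set()
    unique = []
    for rec in records:
        key = tuple(rec)
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


# Toy records of the form (protocol, service, label); values are illustrative.
records = (
    [("icmp", "ecr_i", "smurf")] * 5
    + [("tcp", "http", "normal")] * 2
    + [("tcp", "ftp", "guess_passwd")]
)

before = Counter(rec[-1] for rec in records)
after = Counter(rec[-1] for rec in deduplicate(records))
print("before:", dict(before))
print("after: ", dict(after))
```

In the toy data the repeated `smurf` record dominates the label counts before deduplication, which is the kind of skew that pulls a classifier toward frequent records.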
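Honeynet Dataset
The Honeynet dataset is a collection of attack data gathered by the HoneyNet project and reflects attacker behavior patterns fairly well. It contains 11 months of Snort alert data, from April 2000 to February 2001, with roughly 60 to more than 3000 Snort alert records per month. The network consists of 8 IP addresses connected to an ISP over ISDN, which closely matches the network environment of most home and business users. The operating systems in use include Solaris Sparc, WinNT, Win98, and Linux Red Hat.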
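Challenge 2013 Dataset
Challenge 2013 is the competition dataset for network security visual analytics from VAST Challenge 2013, the visual analytics challenge organized by IEEE Visualization. The dataset provides two weeks of operational logs from the internal network of a fictitious multinational company. There are three log types: Netflow network traffic logs, Big Brother network health and status data, and intrusion prevention system (IPS) logs. The logs cover Netflow and Big Brother data for weeks one and two, and IPS data for week two. Analyzing the logs reveals anomalies in the network. The network contains about 1100 hosts and servers, the raw logs total nearly 10 GB, and there are more than 90 million records. Downloading requires entering an email address first.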
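Adult Dataset
This dataset comes from UCI and is also known as the Census Income dataset. It is drawn from the 1994 US census database and contains 48842 records in plain-text format with 14 attributes: Age, workclass, fnlwgt, education, education-num, marital-status, occupation, relationship, race, sex, capital-gain, capital-loss, hours-per-week, and native-country. It is suitable for machine learning, data mining, privacy protection, and similar work.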
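As a rough illustration of the text format, the sketch below parses comma-separated records into dictionaries keyed by the 14 attribute names listed above. It assumes each line ends with an extra income label field, here called `income`, which is not part of the attribute list. The sample record is illustrative, and the helper name `parse_adult` is hypothetical.

```python
import csv
import io

# The 14 attributes named in the text, plus an assumed trailing label column.
COLUMNS = [
    "age", "workclass", "fnlwgt", "education", "education-num",
    "marital-status", "occupation", "relationship", "race", "sex",
    "capital-gain", "capital-loss", "hours-per-week", "native-country",
    "income",  # assumption: class label stored as the last field
]

# Illustrative record in the comma-plus-space layout assumed here.
SAMPLE = (
    "39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, "
    "Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K\n"
)


def parse_adult(text):
    """Parse comma-separated records into a list of dicts keyed by COLUMNS."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    return [dict(zip(COLUMNS, row)) for row in reader if row]


rows = parse_adult(SAMPLE)
print(rows[0]["age"], rows[0]["education"], rows[0]["income"])
```

All values are kept as strings. Converting numeric fields such as `age` or `hours-per-week` is left to the caller.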
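Malware Datasets
These datasets are provided by Yanfang Ye of West Virginia University and come in two parts. The first is for malware detection and contains 50000 instances, half of them features extracted from malware and the other half features extracted from benign files. With this dataset, malware detection can be performed on feature sets extracted from Win API calls, building on data mining and big-data modeling techniques.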
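The second is for malware clustering based on file descriptions and contains 69,165 file samples, of which 3,095 are malware, 22,583 are benign files, and the remaining 43,487 are unknown files.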
Some Open Network Security Datasets: Trojans, Worms, Botnets, and More
ISCX-2016-SlowDos
Name | Type |
---|---|
slowbody2 | |
slowread | |
ddossim | DoS GET |
goldeneye | DoS improved GET |
slowheaders | |
rudy | slow send body |
hulk | DoS GET |
slowloris | slow-send headers |
Slowhttptest | slow-read, slow-send headers, slow send body |
Address: https://www.unb.ca/cic/datasets/dos-dataset.html
ISCX-Bot-2014
To keep the botnet traffic consistent with real-world conditions, the dataset mixes subsets of the ISOT dataset, the ISCX 2012 IDS dataset, and botnet traffic generated by the Malware Capture Facility Project.
Name | Type |
---|---|
Neris | IRC |
Rbot | IRC |
Menti | IRC |
Sogou | HTTP |
Murlo | IRC |
Virut | HTTP |
NSIS | P2P |
Zeus | P2P |
SMTP Spam | P2P |
UDP Storm | P2P |
Tbot | IRC |
Zero Access | P2P |
Weasel | P2P |
Smoke Bot | P2P |
Zeus Control (C&C) | P2P |
ISCX IRC bot | P2P |
Address: https://www.unb.ca/cic/datasets/botnet.html
isot_app_and_botnet_dataset
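Category: HTTP botnet
Scope: DNS
Composition: a botnet dataset of malicious DNS traffic generated by different botnets, and a benign dataset of DNS traffic generated by different known software applications.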
https://www.uvic.ca/engineering/ece/isot/datasets/
Alenazi A., Traore I., Ganame K., Woungang I. (2017) Holistic Model for HTTP Botnet Detection Based on DNS Traffic Analysis. In: Traore I., Woungang I., Awad A. (eds) Intelligent, Secure, and Dependable Systems in Distributed and Cloud Environments. ISDDC 2017. Lecture Notes in Computer Science, vol 10618. Springer, Cham
CTU-13 DATASET
Contains botnet traffic data captured in 13 scenarios.
Address: https://mcfp.weebly.com/the-ctu-13-dataset-a-labeled-dataset-with-botnet-normal-and-background-traffic.html
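MSNBC.com Anonymous Web Data
The data describes page visits by users who visited msnbc.com on September 28, 1999. Visits are recorded at the level of URL category (see the dataset description) and in chronological order.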
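Per-user visit sequences of this kind can be summarized with a few lines of code. The sketch below assumes a layout of one line per user, containing space-separated integer category ids in visit order. That layout, the sample text, and the helper name `parse_sequences` are assumptions for illustration, so check the dataset description before relying on them.

```python
from collections import Counter


def parse_sequences(text):
    """Parse one visit sequence per line; skip lines that are not all integers."""
    sequences = []
    for line in text.splitlines():
        parts = line.split()
        if parts and all(p.isdigit() for p in parts):
            sequences.append([int(p) for p in parts])
    return sequences


# Illustrative input: three users, each with an ordered list of category ids.
sample = "1 1\n2\n3 2 2 4 2 2 2 3 3\n"

seqs = parse_sequences(sample)
visits = Counter(cat for seq in seqs for cat in seq)
print("users:", len(seqs))
print("visits per category:", dict(visits))
```

Skipping non-numeric lines lets the same function tolerate header or comment lines if the file has any.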
https://kdd.ics.uci.edu/databases/msnbc/msnbc.html
UNINA traffic traces
Traffic traces and time series from real networks.
http://traffic.comics.unina.it/Traces/ttraces.php
USC ISI web server
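Traffic types include TCP, IP, DNS, HTTP, and ICMP.
Includes datasets for normal traffic, anomaly detection, Trojans, worms, botnets, and more.
The application process is rather cumbersome, though.
Datasets: http://www.isi.edu/ant/traces/dataset_list.html
Request form: https://ant.isi.edu/datasets/requests.html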
Original article: https://blog.csdn.net/qq_29857719/article/details/89211420
Datasets
Canadian Institute for Cybersecurity datasets are used around the world by universities, private industry, and independent researchers.
The following datasets are currently available:
- CCCS-CIC-AndMal2020
- DNS over HTTPS (CIRA-CIC-DoHBrw2020)
- CICMalDroid 2020
- Darknet 2020
- Investigation of the Android Malware (CIC-InvesAndMal2019)
- DDoS Evaluation Dataset (CIC-DDoS2019)
- IPS/IDS dataset on AWS (CSE-CIC-IDS2018)
- Intrusion Detection Evaluation Dataset (CIC-IDS2017)
- Android Malware Dataset (CIC-AndMal2017)
- Android Adware and General Malware Dataset (CIC-AAGM2017)
- DoS dataset (application-layer) 2017
- VPN-nonVPN traffic dataset (ISCXVPN2016)
- Tor-nonTor dataset (ISCXTor2016)
- URL dataset (ISCX-URL2016)
- ISCX Android Botnet dataset 2015
- ISCX Botnet dataset 2014
- ISCX Android Validation dataset 2014
- ISCX IDS dataset 2012
- ISCX NSL-KDD dataset 2009
Example: DDoS Evaluation Dataset (CIC-DDoS2019)
2. Dataset
CICDDoS2019 contains benign traffic and the most up-to-date common DDoS attacks, and resembles true real-world data (PCAPs). It also includes the results of network traffic analysis using CICFlowMeter-V3, with flows labeled based on the timestamp, source and destination IPs, source and destination ports, protocols, and attack (CSV files).
Generating realistic background traffic was our top priority in building this dataset. We used our proposed B-Profile system (Sharafaldin, et al. 2016) to profile the abstract behavior of human interactions and generate naturalistic benign background traffic in the proposed testbed (Figure 2). For this dataset, we built the abstract behavior of 25 users based on the HTTP, HTTPS, FTP, SSH, and email protocols.
Machine | OS | IPs |
---|---|---|
Server | Ubuntu 16.04 (Web Server) | 192.168.50.1 (first day), 192.168.50.4 (second day) |
Firewall | Fortinet | 205.174.165.81 |
PCs (first day) | Win 7, Win Vista, Win 8.1, Win 10 | 192.168.50.8, 192.168.50.5, 192.168.50.6, 192.168.50.7 |
PCs (second day) | Win 7, Win Vista, Win 8.1, Win 10 | 192.168.50.9, 192.168.50.6, 192.168.50.7, 192.168.50.8 |
In this dataset, we have different modern reflective DDoS attacks such as PortMap, NetBIOS, LDAP, MSSQL, UDP, UDP-Lag, SYN, NTP, DNS, and SNMP. Attacks were subsequently executed during this period. As Table III shows, we executed 12 DDoS attacks, including NTP, DNS, LDAP, MSSQL, NetBIOS, SNMP, SSDP, UDP, UDP-Lag, WebDDoS, SYN, and TFTP, on the training day, and 7 attacks, including PortScan, NetBIOS, LDAP, MSSQL, UDP, UDP-Lag, and SYN, on the testing day. The traffic volume for WebDDoS was very low, and PortScan was executed only on the testing day, so it is unknown to the proposed model during evaluation.
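Days | Attacks | Attack Time |
---|---|---|
First Day | PortMap | 9:43 - 9:51 |
First Day | NetBIOS | 10:00 - 10:09 |
First Day | LDAP | 10:21 - 10:30 |
First Day | MSSQL | 10:33 - 10:42 |
First Day | UDP | 10:53 - 11:03 |
First Day | UDP-Lag | 11:14 - 11:24 |
First Day | SYN | 11:28 - 17:35 |
Second Day | NTP | 10:35 - 10:45 |
Second Day | DNS | 10:52 - 11:05 |
Second Day | LDAP | 11:22 - 11:32 |
Second Day | MSSQL | 11:36 - 11:45 |
Second Day | NetBIOS | 11:50 - 12:00 |
Second Day | SNMP | 12:12 - 12:23 |
Second Day | SSDP | 12:27 - 12:37 |
Second Day | UDP | 12:45 - 13:09 |
Second Day | UDP-Lag | 13:11 - 13:15 |
Second Day | WebDDoS | 13:18 - 13:29 |
Second Day | SYN | 13:29 - 13:34 |
Second Day | TFTP | 13:35 - 17:15 |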
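The attack schedule above can be used to look up which attack, if any, was running at a given time of day. The sketch below encodes only the first-day windows and treats both endpoints as inclusive. The dataset's own labels also take IPs, ports, and protocols into account, so this covers only the time-window part, and `attack_at` is a hypothetical helper name.

```python
from datetime import time

# First-day attack windows as listed in the schedule (start, end inclusive).
FIRST_DAY_SCHEDULE = [
    ("PortMap", time(9, 43), time(9, 51)),
    ("NetBIOS", time(10, 0), time(10, 9)),
    ("LDAP", time(10, 21), time(10, 30)),
    ("MSSQL", time(10, 33), time(10, 42)),
    ("UDP", time(10, 53), time(11, 3)),
    ("UDP-Lag", time(11, 14), time(11, 24)),
    ("SYN", time(11, 28), time(17, 35)),
]


def attack_at(t, schedule=FIRST_DAY_SCHEDULE):
    """Return the name of the attack whose window contains t, or None."""
    for name, start, end in schedule:
        if start <= t <= end:
            return name
    return None


print(attack_at(time(10, 5)))   # inside the NetBIOS window
print(attack_at(time(9, 55)))   # gap between PortMap and NetBIOS
```

A flow whose timestamp falls in a gap between windows returns `None`, which a caller might treat as a candidate for benign traffic.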
3. Using the dataset
The dataset is organized per day. For each day, we recorded the raw data, including the network traffic (PCAPs) and event logs (Windows and Ubuntu event logs), per machine. For feature extraction from the raw data, we used CICFlowMeter-V3 to extract more than 80 traffic features and saved them as a CSV file per machine.
If you want to use AI techniques for analysis, you can download our generated data (CSV) files and analyze the network traffic.
If you want to use a new feature extractor, you can use the raw captured files (PCAP) to extract your own features, and then apply data mining techniques to analyze the generated data.
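A typical first step with the generated CSV files is to check how many flows carry each label. The sketch below uses only the standard library and an inline sample. The column names, label values, and addresses in the sample are illustrative assumptions rather than the dataset's actual schema. It also strips whitespace from header names in case the real files pad them.

```python
import csv
import io
from collections import Counter

# Illustrative sample; real files have more than 80 feature columns.
SAMPLE = """ Source IP, Destination IP, Protocol, Label
203.0.113.7,192.168.50.1,17,NTP
192.168.50.8,192.168.50.1,6,BENIGN
203.0.113.7,192.168.50.1,17,NTP
"""


def count_labels(csv_text, label_column="Label"):
    """Count flows per label, tolerating padded header names and values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Normalize header names so ' Label' and 'Label' are treated alike.
    reader.fieldnames = [name.strip() for name in reader.fieldnames]
    return Counter(row[label_column].strip() for row in reader)


counts = count_labels(SAMPLE)
print(dict(counts))
```

For the full per-day files, the same function can be fed the text of a file opened with `open(path, newline="")`, although a dataframe library may be more practical at that size.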
4. License
You may redistribute, republish, and mirror the CICDDoS2019 dataset in any form. However, any use or redistribution of the data must include a citation to the CICDDoS2019 dataset and the related published paper. A research paper outlining the details of analyzing a similar IDS/IPS dataset and related principles:
- Iman Sharafaldin, Arash Habibi Lashkari, Saqib Hakak, and Ali A. Ghorbani, "Developing Realistic Distributed Denial of Service (DDoS) Attack Dataset and Taxonomy", IEEE 53rd International Carnahan Conference on Security Technology, Chennai, India, 2019