[Reverse Engineering] Automatically Generating YARA Rules with yarGen

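Introduction

yarGen is a YARA rule generator. It builds YARA rules from suspicious strings extracted from malware samples, while filtering out strings that also appear in legitimate software.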

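Download and Installation

yarGen is an open-source project. The source code is available as zip and tar.gz archives from its GitHub page.

Install all dependencies with the following command: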

sudo pip install scandir lxml naiveBayesClassifier pefile
# If this fails, try the following commands instead:
sudo pip install pefile
sudo pip install scandir lxml naiveBayesClassifier
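Before running yarGen, it can help to confirm that the dependencies are actually importable. The following is a minimal sketch, assuming the import names match the pip package names used above. The helper name missing_modules is hypothetical and is not part of yarGen.

```python
import importlib.util

# Import names are assumed to match the pip package names listed above.
YARGEN_DEPS = ["scandir", "lxml", "naiveBayesClassifier", "pefile"]


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(YARGEN_DEPS)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All yarGen dependencies appear to be installed.")
```

Any module reported as missing can then be installed individually with pip.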

Use the following command to download the built-in databases, which are saved to the "./dbs" subfolder:

python yarGen.py --update

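Alternatively, download the databases from Baidu Cloud and extract them into the "./dbs" subfolder.

Usage Examples

Once installation is complete, run "python yarGen.py -h" to see the full list of command-line options.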

usage: yarGen.py [-h] [-m M] [-y min-size] [-z min-score] [-x high-scoring]
                 [-w superrule-overlap] [-s max-size] [-rc maxstrings]
                 [--excludegood] [-o output_rule_file] [-e output_dir_strings]
                 [-a author] [-r ref] [-l lic] [-p prefix] [-b identifier]
                 [--score] [--strings] [--nosimple] [--nomagic] [--nofilesize]
                 [-fm FM] [--globalrule] [--nosuper] [--update] [-g G] [-u]
                 [-c] [-i I] [--dropzone] [--nr] [--oe] [-fs size-in-MB]
                 [--noextras] [--debug] [--trace] [--opcodes] [-n opcode-num]

yarGen

optional arguments:
  -h, --help            show this help message and exit

Rule Creation:
  -m M                  Path to scan for malware
  -y min-size           Minimum string length to consider (default=8)
  -z min-score          Minimum score to consider (default=0)
  -x high-scoring       Score required to set string as 'highly specific
                        string' (default: 30)
  -w superrule-overlap  Minimum number of strings that overlap to create a
                        super rule (default: 5)
  -s max-size           Maximum length to consider (default=128)
  -rc maxstrings        Maximum number of strings per rule (default=20,
                        intelligent filtering will be applied)
  --excludegood         Force the exclude all goodware strings

Rule Output:
  -o output_rule_file   Output rule file
  -e output_dir_strings
                        Output directory for string exports
  -a author             Author Name
  -r ref                Reference (can be string or text file)
  -l lic                License
  -p prefix             Prefix for the rule description
  -b identifier         Text file from which the identifier is read (default:
                        last folder name in the full path, e.g. "myRAT" if -m
                        points to /mnt/mal/myRAT)
  --score               Show the string scores as comments in the rules
  --strings             Show the string scores as comments in the rules
  --nosimple            Skip simple rule creation for files included in super
                        rules
  --nomagic             Don't include the magic header condition statement
  --nofilesize          Don't include the filesize condition statement
  -fm FM                Multiplier for the maximum 'filesize' condition value
                        (default: 3)
  --globalrule          Create global rules (improved rule set speed)
  --nosuper             Don't try to create super rules that match against
                        various files

Database Operations:
  --update              Update the local strings and opcodes dbs from the
                        online repository
  -g G                  Path to scan for goodware (dont use the database
                        shipped with yaraGen)
  -u                    Update local standard goodware database with a new
                        analysis result (used with -g)
  -c                    Create new local goodware database (use with -g and
                        optionally -i "identifier")
  -i I                  Specify an identifier for the newly created databases
                        (good-strings-identifier.db, good-opcodes-
                        identifier.db)

General Options:
  --dropzone            Dropzone mode - monitors a directory [-m] for new
                        samples to processWARNING: Processed files will be
                        deleted!
  --nr                  Do not recursively scan directories
  --oe                  Only scan executable extensions EXE, DLL, ASP, JSP,
                        PHP, BIN, INFECTED
  -fs size-in-MB        Max file size in MB to analyze (default=10)
  --noextras            Don't use extras like Imphash or PE header specifics
  --debug               Debug output
  --trace               Trace output

Other Features:
  --opcodes             Do use the OpCode feature (use this if not enough high
                        scoring strings can be found)
  -n opcode-num         Number of opcodes to add if not enough high scoring
                        string could be found (default=3)

Use the "-m" option to automatically generate YARA rules for the samples in the "vir" folder:

# With no other options, a rule file named "yarGen_rules.yar" is written to the current directory
python yarGen.py -m vir
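When yarGen is run repeatedly over different sample folders, the command line can be assembled in a small script. The sketch below uses only flags that appear in the help output above (-m, -o, -a, --score, --opcodes). The function names, the output file name "vir_rules.yar" and the author value "analyst" are hypothetical, and the script assumes it is started from the directory that contains yarGen.py and the "./dbs" subfolder.

```python
import subprocess
import sys


def build_yargen_command(sample_dir, output_file=None, author=None,
                         show_scores=False, use_opcodes=False,
                         script="yarGen.py"):
    """Assemble a yarGen command line from flags listed in its help text."""
    cmd = [sys.executable, script, "-m", sample_dir]
    if output_file:
        cmd += ["-o", output_file]   # output rule file
    if author:
        cmd += ["-a", author]        # author name written into the rules
    if show_scores:
        cmd.append("--score")        # show string scores as comments
    if use_opcodes:
        cmd.append("--opcodes")      # enable the opcode feature
    return cmd


def run_yargen(cmd):
    """Run yarGen; raises CalledProcessError if it exits with an error."""
    return subprocess.run(cmd, check=True)


if __name__ == "__main__":
    cmd = build_yargen_command("vir", output_file="vir_rules.yar",
                               author="analyst", show_scores=True)
    # Print the arguments only; call run_yargen(cmd) to actually execute.
    print(" ".join(cmd[1:]))
```

Building the command as a list rather than a single string avoids shell quoting problems when a sample path contains spaces.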

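Rule Explanation

yarGen scores every string in a rule and sorts the strings into classes by score. Each class is marked with a different prefix:
Strings starting with "$x" are "highly specific strings"; they are very unlikely to appear in legitimate software.
Strings starting with "$s" are "specific strings"; they may appear in both malware and legitimate software.
Strings starting with "$z" are "common strings"; they may be quite common but have not yet been collected into the goodware string database.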
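When reviewing a generated rule file, it can be convenient to list the strings of each prefix class separately. The following is a minimal sketch that groups string definitions by their prefix letters with a regular expression. The rule text in SAMPLE_RULE is made up for illustration and is not real yarGen output, and the function name group_strings_by_prefix is hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical rule text for illustration only; not real yarGen output.
SAMPLE_RULE = r'''
rule demo_sample {
   strings:
      $x1 = "demo_string_one" fullword ascii
      $s1 = "demo_string_two" fullword ascii
      $s2 = "demo_string_three" fullword wide
      $z1 = "demo_string_four" fullword ascii
   condition:
      any of them
}
'''

# Matches string definitions such as: $x1 = "..."
STRING_DEF = re.compile(r'^\s*\$([a-z]+)\d*\s*=\s*(".*")', re.MULTILINE)


def group_strings_by_prefix(rule_text):
    """Map each prefix (e.g. 'x', 's', 'z') to its quoted string values."""
    groups = defaultdict(list)
    for prefix, value in STRING_DEF.findall(rule_text):
        groups[prefix].append(value)
    return dict(groups)


if __name__ == "__main__":
    for prefix, values in sorted(group_strings_by_prefix(SAMPLE_RULE).items()):
        print("$" + prefix, len(values), values)
```

The regular expression only handles simple quoted text strings on a single line; hex strings and regular-expression strings would need additional patterns.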

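Summary

For more on using yarGen, see the "-h" output or the blog posts the author lists on the GitHub page; they are not repeated here.
Rules generated automatically by yarGen are only a starting point. They need to be revised according to the string prefixes and the findings of the actual analysis before they can be used in real work.

References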

https://github.com/Neo23x0/yarGen
https://securityonline.info/yargen-generator-yara-rules/
https://medium.com/bugbountywriteup/diving-into-yargen-9e8c00e18b65

posted @ 2020-04-17 21:46  SunsetR