Basic Concepts of Fuzz Testing (FuzzTest)

fuzz test

1. What is fuzz testing?

Fuzz testing is an automated software testing technique, first developed by Barton Miller at the University of Wisconsin in the late 1980s, that is typically used to uncover potential vulnerabilities in programs. Its core idea is to generate random data automatically or semi-automatically, feed it to the application, and monitor the program for abnormal behavior such as crashes and failed assertions, in order to find defects such as memory leaks.

Fuzzing refers to the automatic generation and execution of tests. The random data fed to the program during fuzz testing is called "fuzz". Typical kinds of random data include: very long strings; random numbers such as negative numbers, floating-point numbers, and very large numbers; special characters such as ~ ! @ # $ %, which carry special meaning and may trigger errors when used as input; and Unicode-encoded text, because some programs do not support Unicode.

Fuzzers fall into two categories:

  • Mutation-based fuzzers create test cases by mutating existing data samples.
  • Generation-based fuzzers model the protocol or file format used by the system under test and generate inputs, and therefore test cases, from that model.
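
The difference between the two categories can be illustrated with a minimal sketch. The record format used here (a 2-byte magic value `FZ`, a 1-byte payload length, then the payload) and the function names `mutate` and `generate` are assumptions made up for illustration; they do not come from any particular fuzzer.

```python
import random


def mutate(sample: bytes, rng: random.Random, n_changes: int = 3) -> bytes:
    """Mutation-based: copy an existing sample and overwrite a few random bytes."""
    data = bytearray(sample)
    for _ in range(n_changes):
        pos = rng.randrange(len(data))
        data[pos] = rng.randrange(256)
    return bytes(data)


def generate(rng: random.Random, max_payload: int = 16) -> bytes:
    """Generation-based: build an input from a model of the format.

    Hypothetical format: magic b"FZ", one length byte, then the payload.
    """
    size = rng.randrange(max_payload + 1)
    payload = bytes(rng.randrange(256) for _ in range(size))
    return b"FZ" + bytes([len(payload)]) + payload


rng = random.Random(0)              # fixed seed so the run can be repeated
seed_sample = b"FZ\x05hello"        # an existing valid sample
mutated = mutate(seed_sample, rng)  # same size as the sample, a few bytes changed
generated = generate(rng)           # always structurally valid per the model
print(mutated.hex(), generated.hex())
```

The mutated input stays close to a known-good sample but may break the format anywhere, while the generated input always respects the modeled structure.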

Fuzz testing process🌟🌟🌟

Fuzz testing usually includes the following basic stages:

  1. Identify the test target: determine the nature, functionality, operating conditions, and environment of the target program, the language it is written in, vulnerabilities previously found in the software, its external interfaces, and so on.
  2. Identify the input vectors, such as file data, network data, and environment variables.
  3. Generate fuzz test data: once the input vectors are known, design the fuzzing method, the test data generation algorithm, and so on.
  4. Execute the fuzz test data: automatically send large volumes of test data to the target, which includes starting the target process, sending test data, opening files, and so on.
  5. Monitor for exceptions: watch whether the target program raises exceptions, and record the test data that triggered each exception together with the related exception information.
  6. Determine whether discovered vulnerabilities are exploitable: resend the exception-triggering data to the target program, trace the program's processing flow before and after the exception, analyze the cause, and decide whether it is exploitable.
  7. Output the test log.
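
Stages 3 to 7 can be sketched as a small in-process loop. This is only an illustration under stated assumptions: `parse_record` is a made-up toy target with a deliberately seeded bug, and `run_fuzz` is a hypothetical helper, not part of any real fuzzing tool.

```python
import random


def parse_record(data: bytes) -> int:
    """Toy target (hypothetical): first byte is a length, the rest is payload."""
    if not data:
        raise ValueError("empty input")  # graceful rejection, not a bug
    length = data[0]
    payload = data[1:]
    # Seeded bug: the declared length is trusted without a bounds check.
    return payload[length - 1] if length else 0


def run_fuzz(target, iterations: int, seed: int):
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    findings = []
    for i in range(iterations):
        # Stage 3: generate test data (0 to 7 random bytes)
        data = bytes(rng.randrange(256) for _ in range(rng.randrange(8)))
        try:
            target(data)                 # Stage 4: execute
        except ValueError:
            pass                         # expected rejection of bad input
        except Exception as exc:         # Stage 5: monitor and record
            findings.append((i, data, type(exc).__name__))
    return findings


findings = run_fuzz(parse_record, iterations=200, seed=1)

# Stage 6: replay each recorded input to confirm the exception reproduces
for _, data, name in findings:
    try:
        parse_record(data)
    except Exception as exc:
        assert type(exc).__name__ == name

# Stage 7: output the test log
for i, data, name in findings[:3]:
    print(f"iteration={i} input={data.hex()} exception={name}")
```

A real setup would usually run the target in a separate process so that hard crashes do not take the fuzzer down with them.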


Basic Requirements

Efficient fuzz testing usually has to meet the following requirements:

  1. Reproducibility: the tester must be able to tell which test data caused a given state change in the target program. If the test results cannot be reproduced, the whole process loses its value. One way to achieve reproducibility is to record both the test data and the state of the target program while the test data is being sent.

  2. Reusability: develop the fuzzer in a modular way, so that a new fuzzer does not have to be written for every new target program.

  3. Code coverage: the amount of code and the number of process states that the fuzzer can cause the target program to reach or execute.

  4. Exception monitoring: it is essential to determine accurately whether the target program has raised an exception.
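
As a rough illustration of the code coverage requirement, the sketch below counts which source lines of a toy function are executed for each input, using Python's standard `sys.settrace` hook. The function `classify` and the helper `lines_hit` are hypothetical names introduced here; real fuzzers typically rely on compiler or binary instrumentation instead.

```python
import sys


def classify(data: bytes) -> str:
    """Toy target with three branches."""
    if not data:
        return "empty"
    if data[0] == 0x7F:
        return "magic"
    return "other"


def lines_hit(func, arg) -> set:
    """Return the line numbers of func that run for one input."""
    hit = set()
    code = func.__code__

    def tracer(frame, event, _arg):
        if frame.f_code is code:
            if event == "line":
                hit.add(frame.f_lineno)
            return tracer      # keep tracing lines inside func
        return None            # ignore every other frame

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(arg)
    finally:
        sys.settrace(previous)  # always restore the previous hook
    return hit


covered = set()
for sample in (b"", b"A", b"\x7f"):
    new_lines = lines_hit(classify, sample) - covered
    covered |= new_lines
    print(f"input={sample!r} new_lines={len(new_lines)} total={len(covered)}")
```

An input that adds nothing to `covered` exercised only paths already seen, which is one simple signal for discarding redundant test cases.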

Existing problems

  1. Strong blindness: even when the protocol format is known, the problem of test cases repeatedly exercising the same paths remains unsolved, which makes testing inefficient.
  2. High test-case redundancy: many test cases are produced by random strategies, so duplicated or near-identical cases are common.
  3. Weak targeting of associated fields: in most cases data is merely generated or mutated at random for individual elements, without taking the relationships between associated protocol fields into account.

Method implementation

Association analysis of input data

Applications normally validate the format of their input data objects. An important way to raise the success rate of fuzz testing is therefore to analyze the structure of the data objects fed to the program and the dependencies among their constituent elements, and then construct test cases that satisfy the format requirements so that they pass the program's format checks.

The input data of an application usually follows certain specifications and has a fixed structure. For example, network packets usually conform to a specific network protocol specification, and file data usually conforms to a specific file format specification. Input data structure analysis examines the structure of these network packets or file formats, identifies the specific fields that may cause parsing errors in the application, and builds test cases by mutation or generation. The fields that usually deserve attention are: length fields, offset fields, fields that may cause the application to take different code paths, variable-length data, and so on.

The data objects that an application can handle are very complex. For example, MS Office files are compound files stored using object linking and embedding. They can not only embed files in other formats but also contain various types of metadata. Because of this complexity, the vast majority of test data generated during fuzz testing is rejected by the application. The data block association model is an effective way to address this problem. The model takes data blocks as its basic elements and uses the associations between data blocks as the thread for generating malformed test data. The data block is the foundation of the model: a data object can usually be divided into several data blocks, and the dependencies between data blocks are called data associations.

The division of data blocks generally follows three basic principles:

  1. Keep the correlation between data blocks as small as possible
  2. Put data that has a specific meaning into a single data block
  3. Put contiguous, fixed data into the same data block

Classification in the data block association model:

  • Association type

    • Internal association: an association between different data blocks within the same data object.

      • Length association: one or more data blocks in a data object give the length of another data block. This is the most common form of data association when fuzzing file formats, network protocols, and ActiveX controls.
    • External association: an association between data blocks that belong to different data objects.

      • Content association: a data block in one data object carries the value of a data block in another (or the same) data object. It often appears in network protocol applications that require user authentication.
  • Association strength

    • Strong association: the number of associated data blocks is greater than or equal to the number of non-associated data blocks.
    • Weak association: the number of associated data blocks is less than the number of non-associated data blocks.
  • Evaluation criterion
    Effective data object efficiency: the ratio of the number of malformed data objects constructed to the number of data objects that the application can accept and process.
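
A length association can be made concrete with a short sketch. The record layout (a 2-byte big-endian length block followed by a payload block) and the names `build_record`, `check_record`, and `accepted` are assumptions chosen for illustration. The point is that a malformed payload only gets past the format check when the associated length block is recomputed along with it.

```python
import struct


def build_record(payload: bytes) -> bytes:
    """Hypothetical format: 2-byte big-endian length block, then the payload block."""
    return struct.pack(">H", len(payload)) + payload


def check_record(record: bytes) -> bytes:
    """Format check a target might perform: the length block must match the payload."""
    if len(record) < 2:
        raise ValueError("truncated header")
    (declared,) = struct.unpack(">H", record[:2])
    payload = record[2:]
    if declared != len(payload):
        raise ValueError("length mismatch")
    return payload


def accepted(record: bytes) -> bool:
    try:
        check_record(record)
        return True
    except ValueError:
        return False


malformed_payload = b"A" * 300  # oversized payload block

# Mutating the payload alone leaves a stale length block: rejected up front.
naive = struct.pack(">H", 5) + malformed_payload
# Updating the associated length block lets the malformed data through.
aware = build_record(malformed_payload)

print(accepted(naive), accepted(aware))
```

Only the second record reaches the code behind the format check, which is where the oversized payload can actually exercise the target.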

Construction method of test case set

Common construction methods are as follows:

  1. Random method: simply generate large amounts of pseudo-random data and send it to the target program.
  2. Brute-force testing: the fuzzer starts from a valid protocol or data format sample and repeatedly corrupts every byte, word, doubleword, or string in the packet or file.
  3. Pre-generated test cases: study a specific specification to understand all supported data formats and the acceptable value range of each, then generate hard-coded packets or files that test boundary conditions or deliberately violate the specification.
  4. Genetic algorithm: test case generation is recast as a numerical optimization problem and solved with a genetic algorithm. The search space is the input domain of the software under test, and the optimal solution is a test case that meets the test objective. First, data is generated from the initial data and seeds; the data is then tested and evaluated while the test process is monitored. If the termination condition is met, the test results are output; otherwise new data is produced through selection, crossover, and mutation.
  5. Fault injection and fuzz heuristics
    • Fault injection: faults are created deliberately according to specific fault models and applied to the software system under test in order to accelerate the occurrence of system errors and failures.
      • Fault types that can typically be injected: memory faults, processor faults, communication faults, process faults, message faults, network faults, program code faults, and so on.
    • Fuzz heuristics: the specific, potentially dangerous values contained in a list of fuzz strings or fuzz values are called fuzz heuristics.
      • Boundary integer values: integer overflow, underflow, sign overflow, and so on.
      • Repeated strings: can trigger stack overflows and similar problems.
      • Field delimiters: randomly insert non-alphanumeric characters such as spaces and tabs into the fuzz string.
      • Format strings: tokens such as "%s" and "%n" are good choices to include in the string.
      • Character conversion and translation: pay special attention to how extended characters are handled.
      • Directory traversal: appending sequences such as "../" to a URL may let an attacker access unauthorized directories.
      • Command injection: passing unfiltered user data to API calls such as "exec()" and "system()".
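
The genetic-algorithm method in item 4 can be sketched as follows. This is a toy illustration under stated assumptions: the objective `fitness` (number of leading bytes matching a hypothetical 4-byte `TARGET`) stands in for a real measure such as coverage depth, and all function names and parameters are invented for the example.

```python
import random

TARGET = b"FUZZ"  # hypothetical input that reaches the deepest branch


def fitness(candidate: bytes) -> int:
    """Toy objective: count of leading bytes matching TARGET."""
    score = 0
    for got, want in zip(candidate, TARGET):
        if got != want:
            break
        score += 1
    return score


def crossover(a: bytes, b: bytes, rng: random.Random) -> bytes:
    cut = rng.randrange(1, len(a))  # single-point crossover
    return a[:cut] + b[cut:]


def mutate(candidate: bytes, rng: random.Random) -> bytes:
    data = bytearray(candidate)
    data[rng.randrange(len(data))] = rng.randrange(256)
    return bytes(data)


def evolve(seed: int = 0, pop_size: int = 50, max_gen: int = 5000):
    rng = random.Random(seed)
    population = [bytes(rng.randrange(256) for _ in range(len(TARGET)))
                  for _ in range(pop_size)]
    for gen in range(max_gen):
        population.sort(key=fitness, reverse=True)      # evaluate
        if fitness(population[0]) == len(TARGET):       # termination condition
            return population[0], gen
        parents = population[: pop_size // 2]           # selection
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))     # crossover + mutation
        ]
        population = parents + children
    return max(population, key=fitness), max_gen


best, generations = evolve()
print(best, generations)
```

Keeping the best half of each generation unchanged means progress is never lost, at the cost of less diversity than other selection schemes would give.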

Test exception analysis

During dynamic analysis of a program, relevant information can be obtained in several ways:

  • Through the normal output of the program
  • Through static code instrumentation
  • Through dynamic binary instrumentation
  • Through a virtual machine
  • Through a debugging interface or debugger
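
The first approach, reading the program's normal output, can be sketched with the standard `subprocess` module. The target here is a made-up child Python process defined in `TARGET_SOURCE`, and `run_case` is a hypothetical helper; a real target would be an arbitrary executable.

```python
import subprocess
import sys

# Hypothetical target: a child Python process that fails on inputs starting with "!".
TARGET_SOURCE = (
    "import sys\n"
    "data = sys.stdin.read()\n"
    "if data.startswith('!'):\n"
    "    raise RuntimeError('unexpected prefix')\n"
    "print(len(data))\n"
)


def run_case(data: str, timeout: float = 30.0) -> dict:
    """Run one test case and collect what the process exposes on its own."""
    proc = subprocess.run(
        [sys.executable, "-c", TARGET_SOURCE],
        input=data,
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return {
        "input": data,
        "returncode": proc.returncode,  # non-zero means abnormal termination
        "stdout": proc.stdout.strip(),
        "stderr_last": proc.stderr.strip().splitlines()[-1:],  # last traceback line
    }


ok = run_case("hello")
bad = run_case("!boom")
print(ok["returncode"], bad["returncode"], bad["stderr_last"])
```

The exit status and standard error are enough to tell a clean run from a failure, but they say little about where in the target the fault occurred, which is what the instrumentation and debugger approaches add.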

Fuzz testing framework

A fuzz testing framework is a general-purpose fuzzer that can fuzz different types of targets. It abstracts away repetitive work and keeps it to a minimum. A fuzz testing framework generally includes the following parts:

  • Fuzz test data generation module
    • Raw data generation module: reads manually constructed normal data directly, or automatically generates normal test data from a structure definition
    • Malformed data generation module: modifies and distorts the raw data to produce the final malformed data
  • Dynamic debugging module: uses the debugging interface provided by the operating system to implement dynamic debugging and capture the exception information produced by the program being debugged
  • Execution monitoring module: builds on the dynamic debugging module to monitor the execution state of the program while it runs, so as to decide when to terminate it
  • Automation script module: provides more complex monitoring functions on top of the execution monitoring module
  • Exception filtering module: builds on the dynamic debugging module to filter exception results in real time
  • Test result management module: the test result database stores not only the exception information but also the data that triggered the exception. Regression testing can be carried out using the test result database.
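
How these modules might fit together is sketched below in heavily simplified form. Everything here is an assumption for illustration: the functions `raw_data`, `malform`, `target`, and `monitor` are invented stand-ins, an in-process `try`/`except` replaces a real debugger, and an in-memory SQLite database plays the role of the test result database.

```python
import random
import sqlite3


def raw_data(rng):                 # raw data generation module
    return bytes(rng.randrange(32, 127) for _ in range(8))


def malform(data, rng):            # malformed data generation module
    out = bytearray(data)
    out[rng.randrange(len(out))] = rng.choice([0x00, 0xFF, 0x25])
    return bytes(out)


def target(data):                  # hypothetical program under test
    if b"%" in data:
        raise RuntimeError("format character not handled")


def monitor(func, data):           # stands in for debugging and execution monitoring
    try:
        func(data)
        return None
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"


db = sqlite3.connect(":memory:")   # test result management module
db.execute("CREATE TABLE results (input BLOB, exception TEXT)")

rng = random.Random(7)
seen = set()                       # exception filtering: one record per distinct exception
for _ in range(200):
    data = malform(raw_data(rng), rng)
    info = monitor(target, data)
    if info is not None and info not in seen:
        seen.add(info)
        db.execute("INSERT INTO results VALUES (?, ?)", (data, info))

# Regression testing: replay every saved input against the current target.
still_failing = [row[0] for row in db.execute("SELECT input FROM results")
                 if monitor(target, row[0]) is not None]
print(len(seen), len(still_failing))
```

Because the triggering input is stored next to the exception text, the same database can be replayed after a fix to confirm that the failure no longer occurs.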
posted @ 2022-10-03 16:07  ivanlee717