What is Fuzz Testing and How Does It Work?

Fuzzing as a Software Testing method

In this blog post you’ll discover what is fuzz testing, also known as fuzzing, and all types of fuzz testing that exist with examples.

Furthermore, you’ll understand the benefits of using fuzz test and also check out some projects in which it was used in Cheesecake Labs.

What is Fuzz Testing?

Fuzz Testing Process Phases
Fuzz Testing Process Phases

Fuzz testing is an automated testing method in which we provide random input values to discover unexpected bugs.

Fuzzing can be traced back to the University of Wisconsin in 1988. There, Professor Barton Miller gave a class project titled “Operating System Utility Program Reliability – The Fuzz Generator.”

It was the first – and simplest – form of fuzzing, and included sending a stream of random bits to UNIX programs with the use of a command line fuzzer.

Nowadays, there are some different types of Fuzz Testing: dumb fuzzing, smart fuzzing, feedback-based fuzzing, mutation-based fuzzing, and generation-based fuzzing, which are all described below.

Types of Fuzzers

Randomizing styles

There are two Fuzzing styles of generating randomized data:

  • Entirely random: dumb fuzzing test;
  • Partially randomized to match some criteria: smart fuzzing test.

Dumb Fuzzing Test

Dumb fuzzing requires less amount of work, it’s easier to maintain, but it has limited coverage. 

It is relatively easy to set up, but it is also very inefficient because it is not based on any criteria, it just generates the entirely random data.

Smart Fuzzing Test

Otherwise, Smart fuzzing has a greater code coverage, catching more bugs. Smart Fuzzing Test produces inputs that are based on valid input formats.

It is very useful since we need to match certain patterns. This style of fuzzing requires detailed knowledge about input format, so it is more costly to set up, which involves a longer development and bigger strategy.

Depending on the environment, we can use both types in the code (Smart and Dumb Fuzzing).

Smart is selected for parts where we need more strategy and match specific criteria. Dumb is for the simplest tests, where we just need to randomize without any criteria.

Fuzzing Types

Smart randomizing style can be divided into three types:

  • Mutation-based
  • Generation-based
  • Feedback-based.

Sometimes we can find different names or explanations for each fuzzing style/type. 

Here is a description of each one for a better understanding.

Mutation-based 

It changes the valid inputs generating a collection of mutated inputs, which then introduces small changes to existing inputs that may still keep the input valid with new behavior.

Sometimes it’s classified as “dumb-fuzzer” considering the mutation of the whole input data.

Bit Flipping for Fuzz Testing
Example: Bit Flipping (bits are flipped in some sequence or randomly)

Generation-based 

It creates entirely new data matching the valid input, according to the defined structure. It can be defined by data modeling or a state modeling, for example.

The Model is defined and the fuzzer randomizes the data according to the model structure.

Here is a sample Data Model for HTTP Protocol:

Data model for HTTP Protocol

Here is a sample State Model and Data for HTTP:

State Model and Data for HTTP

Feedback-based

Also called “Coverage-based”. It uses code coverage information when generating new inputs.

It is considered the smartest way of Fuzzing, but also the most costly one. This type measures the amount/share of code executed during a given test.

For example, it may read the number of statements in the code and compare it to the statements in the software source.

Same thing for the number of lines, the share of defined functions that have been called, the execution of code lines within basic blocks, the edges of the control graph that are being followed, the number of logical paths, etc.

Diagram of how Feedback-Base Fuzzing Test works
A diagram of how Feedback-Based Fuzzing works

Another important configuration

Fuzzing Tests can be highly configurable. Some important configurations we had to use in our projects are listed below.

Fuzztime

Considering we worked with blockchain, it was very important to determine a limit of time for a specific test. This way, we could also test performance and determine a time limit so we would not be sending infinite data.

Parallel Processes

We could configure Fuzzing to be working in several parallel processes. It also got useful for performance testing and we could test the behavior on the network receiving several parallel data at the same time.

Data Type definition

The Fuzzing can be configured to accept only specific types of data. Very useful for an environment where we need to randomize, but aiming only one or more specific types of data.

Why use it on blockchain projects?

Considering we need to pay special attention to security on blockchain projects, and need to test the widest range of accepted data possible, fuzzing tests are very interesting for the blockchain environment.

Fuzzing is very important to test security because it may test what kind of data the system is accepting or not accepting, and also test simultaneous parallel processes certifying one process is not overlapping another concurrent process using random inputs. 

It’s important to certify the system is working according to the real environment of a blockchain network, receiving several different data simultaneously, and being accepted by the network rules without losing important information.

Why did Cheesecake Labs decide to use it?

When facing specific scenarios, such as creating different varieties of crypto wallets and transferring values between them, it was necessary to test a large amount of random and valid inputs, simulating simultaneous transactions such as in a blockchain network.

A good code coverage was required for unit tests but in a randomized way. This helped us identify a lot of bugs we couldn’t find with approaches other than Fuzzing. 

There were many unexpected inputs that generated errors and other issues that emerged when simulating heavy usage. 

Which difficulties were found working with Fuzz Testing?

We had difficulty implementing the Fuzzing Test using a specific programming language because each language has its own patterns for the Fuzzing Test.

For example, we coded using Go Language, which had Fuzzing structure available only in version 1.8.

Depending on the project’s programming language, we would have to implement Fuzzing with another language that has a structure for it.

Another difficulty was performance. We used smart fuzzing, but couldn’t immediately find where the code was breaking.

Because of that, we had to study the Fuzzing structure for the specific programming language in our project environment, which changed the form of implementation.

In a blockchain project, while sending parallel fuzzing events, sometimes we had problems with events that couldn’t be sent in a specific sequence, so we had to find a way to verify the event condition (see example in the diagram below).

Which kind of bugs did we find using the Fuzzing Test?

When running parallel fuzzing to send multiple random transactions, we found out that some transactions were being generated without the proper sequencing management, which would cause transactions to fail during peak activity.

Once this scenario was identified, it was possible to improve the solution by applying a different submission strategy and scaling it up to handle a high volume of transactions with consistency.

Fuzzing allowed us to identify and work around some edge cases like the inability to handle some non-ASCII characters and other similar unforeseen scenarios.

Project structure and code examples of Fuzz Testing

Here is a project diagram example with Fuzzing Test applied in blockchain:

Project Diagram example using Fuzz Testing
This diagram shows an example of a smart Fuzzing test applied in a project.
  • First, it’s set up defining which are the valid inputs, so the Fuzzing starts simultaneous parallel processes until the defined Fuzz time (a time limit to finish sending events).
  • Each event response is analyzed by a structure, defining if it is successful or not.
  • If the response is successful, it will be validated both with our rules (validating balances) and with the Blockchain network rules.
  • If the response is not successful,  the possibility of it being really unsuccessful will be analyzed. Then, the error being about a specific error (wallet without funds for this example) is verified and there is a new try to get a wallet with funds.

Example of code (using Go Language 1.18) for fuzzing test in a HTTP handler that calculates the highest value amongst a slice of ints:

func FuzzTestHTTPHandler(f *testing.F) {
	// Create a new server hosting our calculate func
	srv := httptest.NewServer(http.HandlerFunc(CalculateHighest))
	defer srv.Close()

	// Create example values for the fuzzer
	testCases := []ValuesRequest{
		ValuesRequest{[]int{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}},
		ValuesRequest{[]int{10, 9, 8, 7, 6, 5, 4, 3, 2, 1}},
		ValuesRequest{[]int{-50, -9, -8, -7, -6, -5, -4, -3, -2, -1}},
		ValuesRequest{[]int{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20}},
		ValuesRequest{[]int{10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200}},
	}

	// Add Corpus Seeds
	for _, testCase := range testCases {
		// Skip error, very bad practice
		data, _ := json.Marshal(testCase)
		// Add JSON data as Corpus
		f.Add(data)

	}

	// Start fuzzing
	f.Fuzz(func(t *testing.T, data []byte) {
		// Validate that it is proper JSON before sending
		if !json.Valid(data) {
			t.Skip("invalid JSON")
		}
		// Make sure it is a real ValuesRequest formatted payload by marshalling into
		vr := ValuesRequest{}
		err := json.Unmarshal(data, &vr)
		if err != nil {
			t.Skip("Only correct requests are intresting")
		}
		// Create a new request
		// using DefaultClient, should not be done in prod
		resp, err := http.DefaultClient.Post(srv.URL, "application/json", bytes.NewBuffer(data))
		if err != nil {
			t.Errorf("Error: %v", err)
		}
		// Check status Code
		if resp.StatusCode != http.StatusOK {
			t.Errorf("Expected status code %d, got %d", http.StatusOK, resp.StatusCode)
		}

		// Umarshal data
		var response int
		err = json.NewDecoder(resp.Body).Decode(&response)
		if err != nil {
			t.Errorf("Error: %v", err)
		}

	})
}

Code font: towardsdatascience.com/fuzzing-tests-in-go

About the author.

Lucas da Silva Machado
Lucas da Silva Machado

A person who loves to see others happy and change the world for better. Lover of pop culture and fantasy world. In addition to technology, enjoys studying philosophy, psychology and magick.