Fuzz Testing in Go

In this article, I’ll cover the basics of fuzz testing and how to perform it with Go’s built-in tooling. It is the written English version of the presentation I gave at Gophers Ankara in 2023 and at GoConf Istanbul in February 2024.

If you’re interested, you can watch the full presentation from Istanbul GoConf via this link.

You can download the presentation from here.

You can find the code in demo here to try it yourself.

What is Fuzzing?

Let’s first look at the meaning of the word “fuzzy” in the Cambridge Dictionary.

The meaning in our context is very similar to the dictionary meaning. In conventional testing, the programmer determines the inputs and the corresponding expected outputs, so everything is clear and sharp-edged. In fuzz testing, randomness is involved and things are no longer that clear.

Fuzz testing is a method of finding vulnerabilities in a system by feeding it random or mutated inputs. The term dates back to the 1980s. In 1988, Professor Barton Miller and his team were working over a dial-up network when they noticed that their system malfunctioned due to a lightning strike. They thought that including such random signals in testing could be very beneficial, and the concept of fuzz testing has evolved from that day to today [1].

Now, you might wonder, why should I use fuzz testing when I have unit tests?

Fuzz testing is a complementary method that is very helpful for finding edge cases. Unit tests are great for testing basic functionality, but when it comes to edge cases, programmers can be biased or simply unable to think of every case. We are clever, but we are also only human :) Fuzz testing is also easy to use, so the real question is: why shouldn’t we use it?

In general, it is wise to add fuzz tests wherever array/map operations, mathematical calculations or string manipulations are performed, for example in:

Parser functions
Crypto, compression
Codecs for audio, video, image
Databases
Text editor processors
Browsers
Formatters, template engines [2]

Fuzzing in Go

Before Go 1.18, Go had no native fuzz testing feature, and programmers had to use third-party packages if they wanted to fuzz. With 1.18, testing.T and testing.B were joined by testing.F, and we can now perform fuzz testing using only Go’s standard library.

Go’s native fuzz testing is supported by OSS-Fuzz, Google’s fuzzing service for open-source software. The platform helps open-source projects run fuzz tests; it was established in 2016 and is actively used at Google. By August 2023, more than 36,000 bugs had been found in over 1,000 projects thanks to this platform, with 200 of them belonging to Go’s standard library and more than 15,000 to Chrome [3]. You can also follow the “Fuzz Trophy Case” page on the Go Wiki, which lists bugs found using Go’s native fuzz testing.

Types of Fuzz

There are three types of fuzzing: blackbox, greybox and whitebox.

Let’s start with something we’re familiar with. Professor Barton Miller’s experiment mentioned earlier was actually blackbox testing. As the name suggests, it was like a shot in the dark: they fed the system random inputs without knowing its internals or the expected outputs. At the opposite end, in whitebox testing the inputs are generated by an algorithm that analyzes the function’s code to produce more intelligent inputs, which consumes a lot of computational power. Greybox testing falls somewhere in between: inputs are neither generated as blindly as in blackbox testing, nor is as much computational power used as in whitebox testing. Go’s native support uses greybox testing, also known as coverage-guided fuzzing. The input generation algorithm tracks how much code each input covers and tries to generate inputs that increase code coverage [4] [5].

How does it work?

Let’s learn a few terms first:

Seed Corpus: User-specified set of inputs to a fuzz test which will be run by default with go test.

Mutator: The component that works with a generator to mutate bytes to be used as input to the fuzz test.

Fuzzing Engine: The main engine that maintains the corpus, invokes the mutator, identifies new coverage, and reports failures. While running, it also writes the values that expand coverage into $GOCACHE/fuzz.

Interesting value: An input created by the engine that increases coverage.

Here is how the pieces work together:

The first step is to check whether there is a seed corpus, i.e., whether the user has provided initial inputs for testing. If so, testing starts with those inputs; if not, it begins with inputs generated by the engine itself. The coverage is then recorded and the path is minimized. During minimization, the fuzzing engine tries to achieve the same path and coverage with a smaller input and converts it into a format that the user can understand [6].

Next, the engine modifies the input slightly, records how much coverage is achieved, and checks whether the path has changed compared to the previous one. If the path has changed, the input is considered an interesting value and is saved in the fuzz folder under $GOCACHE. If the path hasn’t changed, the engine minimizes the trace again, modifies the input, records the coverage again, and continues this loop until it finds a bug. This loop can be considered the template that every fuzz test in Go follows.
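To make the loop more concrete, here is a toy sketch of a coverage-guided loop. It is not Go’s actual engine: target, mutate and fuzzLoop are hypothetical names, coverage is simulated by having the target return the IDs of the branches it executed, and minimization and crash reporting are left out.

```go
package main

import "math/rand"

// target is a hypothetical function under test. It returns the set of
// branch IDs it executed, standing in for real coverage instrumentation.
func target(in []byte) map[int]bool {
	hit := map[int]bool{0: true}
	if len(in) > 0 && in[0] == 'G' {
		hit[1] = true
		if len(in) > 1 && in[1] == 'o' {
			hit[2] = true
		}
	}
	return hit
}

// mutate returns a copy of in with one random byte changed or appended.
func mutate(r *rand.Rand, in []byte) []byte {
	out := append([]byte(nil), in...)
	if len(out) == 0 || r.Intn(4) == 0 {
		return append(out, byte(r.Intn(256)))
	}
	out[r.Intn(len(out))] = byte(r.Intn(256))
	return out
}

// fuzzLoop runs a fixed number of iterations and returns the corpus:
// every input that reached at least one branch not seen before.
func fuzzLoop(seeds [][]byte, iterations int, seed int64) [][]byte {
	r := rand.New(rand.NewSource(seed))
	seen := map[int]bool{}
	var corpus [][]byte

	// keep stores an input only if it covers a new branch
	// (an "interesting value" in the terms used above).
	keep := func(in []byte) {
		isNew := false
		for id := range target(in) {
			if !seen[id] {
				seen[id] = true
				isNew = true
			}
		}
		if isNew {
			corpus = append(corpus, in)
		}
	}

	for _, s := range seeds {
		keep(s)
	}
	if len(corpus) == 0 {
		keep([]byte{}) // no seed corpus: start from an empty input
	}
	for i := 0; i < iterations; i++ {
		parent := corpus[r.Intn(len(corpus))]
		keep(mutate(r, parent))
	}
	return corpus
}
```

The point of the sketch is the keep step: an input survives only when it reaches code that no earlier input reached, which is what steers the random mutations toward new paths.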

How to Fuzz?

Fuzzing in Go is very similar to unit testing. A fuzz test consists of three main parts: the fuzz target, which is the part we are testing; the fuzzing arguments, which are the inputs that fuzzing will generate; and the seed corpus, which is the set of inputs provided by the user.

The seed corpus is added with the f.Add function. It can also be provided as files in the testdata/fuzz/FuzzFoo directory instead of calling f.Add().

Fuzzing arguments can only have the following types: string, bool, float32, float64, int, int8, int16, int32, int64, uint, uint8, uint16, uint32, uint64, []byte [4].
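As a minimal sketch of these three parts, the fuzz test below checks a simple round-trip property on the standard strconv package. The names FuzzRoundTrip and checkRoundTrip are my own choices for illustration; in a real project this would live in a file ending in _test.go.

```go
package main

import (
	"fmt"
	"strconv"
	"testing"
)

// checkRoundTrip is the property under test: formatting an int and
// parsing it back must yield the original value.
func checkRoundTrip(n int) error {
	s := strconv.Itoa(n)
	got, err := strconv.Atoi(s)
	if err != nil {
		return fmt.Errorf("Atoi(%q) failed: %v", s, err)
	}
	if got != n {
		return fmt.Errorf("round trip of %d gave %d", n, got)
	}
	return nil
}

// FuzzRoundTrip shows the three parts of a fuzz test.
func FuzzRoundTrip(f *testing.F) {
	// Seed corpus: one f.Add call per entry. The argument types must
	// match the fuzzing arguments of the fuzz target below.
	//
	// The same seed could instead live in a file such as
	// testdata/fuzz/FuzzRoundTrip/seed1 with the contents:
	//
	//	go test fuzz v1
	//	int(-42)
	f.Add(0)
	f.Add(-42)

	// Fuzz target: the function passed to f.Fuzz. Its first parameter
	// is always *testing.T; the rest are the fuzzing arguments.
	f.Fuzz(func(t *testing.T, n int) {
		if err := checkRoundTrip(n); err != nil {
			t.Error(err)
		}
	})
}
```

Keeping the property in a plain function such as checkRoundTrip is a design choice, not a requirement: it lets the same check be reused from ordinary unit tests.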

After writing the fuzz test, if you want to take it a step further, you can integrate tools like CIFuzz, ClusterFuzzLite or Fuzzit into your GitLab or GitHub pipeline and run fuzz tests periodically to find bugs more easily [7]. However, we won’t be diving into those topics today.

In summary, the steps are:

Programmer determines the target system; what are we going to test?
Programmer defines what the inputs will be.
Fuzz Engine starts generating data.
Fuzz Engine executes the function with this data and then analyzes the system’s behavior to identify any issues.

DEMO

The function below, called ShiftLeft(), shifts a string to the left. It rotates the string in a circular fashion by taking the first character and moving it to the end.

You can see an example of how the ShiftLeft function is used in the main function.

Typically, unit tests are enough to check the code. Here are the test cases I’ve written:

The test cases seem comprehensive and the test passes. This may seem sufficient, but I also want to run a fuzz test to uncover any unexpected bugs.

Below is the fuzz test function. It’s important to define the fuzzing arguments first, as the seed corpus will be based on these inputs. In this case, I specify the inputs I want the fuzz engine to generate, which are the “input” and the “shift” value. These values are then added to the seed corpus using the Add function.

At this point, it’s necessary to determine some logical rules. In unit testing, the input and expected output are known. However, with fuzz testing, this isn’t the case.

First rule: If the input is valid UTF-8, the output should also be valid UTF-8. Callers of this function are guaranteed to pass only valid UTF-8 strings, so the first step is to filter out any non-UTF-8 values generated by the fuzz engine.

Second rule: Input and output should have the same character count.

Third rule: If the input is shifted to the left by its character count, the original string should be returned.
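The three rules can be sketched as a fuzz test like the one below. This is not the original listing from the talk: the signature ShiftLeft(input string, shift int) string is an assumption, the ShiftLeft body is a stand-in included only so that the block compiles, and checkShift and the seed values are illustrative.

```go
package main

import (
	"fmt"
	"testing"
	"unicode/utf8"
)

// ShiftLeft is a stand-in implementation (an assumption, not the
// article's listing): it rotates the characters of input to the left
// and returns input unchanged for a negative shift.
func ShiftLeft(input string, shift int) string {
	r := []rune(input)
	if len(r) == 0 || shift < 0 {
		return input
	}
	shift %= len(r)
	out := make([]rune, 0, len(r))
	out = append(out, r[shift:]...)
	out = append(out, r[:shift]...)
	return string(out)
}

// checkShift verifies the three rules for one input.
func checkShift(input string, shift int) error {
	out := ShiftLeft(input, shift)
	// Rule 1: valid UTF-8 in, valid UTF-8 out.
	if !utf8.ValidString(out) {
		return fmt.Errorf("ShiftLeft(%q, %d) = %q is not valid UTF-8", input, shift, out)
	}
	// Rule 2: the character count must not change.
	n := utf8.RuneCountInString(input)
	if got := utf8.RuneCountInString(out); got != n {
		return fmt.Errorf("character count changed from %d to %d", n, got)
	}
	// Rule 3: shifting by the character count returns the original.
	if full := ShiftLeft(input, n); full != input {
		return fmt.Errorf("ShiftLeft(%q, %d) = %q, want the original string", input, n, full)
	}
	return nil
}

func FuzzShiftLeft(f *testing.F) {
	f.Add("abcde", 2) // illustrative seed values
	f.Add("héllo", 1)
	f.Fuzz(func(t *testing.T, input string, shift int) {
		if !utf8.ValidString(input) {
			t.Skip("only valid UTF-8 input is in scope")
		}
		if err := checkShift(input, shift); err != nil {
			t.Error(err)
		}
	})
}
```

Note that none of the rules states what the exact output should be. They only describe properties that must hold for any input, which is what makes them usable with generated data.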

Initially, I won’t activate the fuzz engine; I’ll only test it with the seed corpus with the command below. This executes my four test cases, and no issues are found.

After that, we can run the fuzz engine by adding the -fuzz flag. Initially, it runs all the unit tests before activating the fuzz engine to ensure they pass. Next, there’s a section called “gathering baseline coverage” where it calculates how much of the code is covered by the seed corpus.

Shortly after it starts generating inputs, the test fails with an “index out of range” error, which triggers a panic. The engine reports that it saved the failing input as a file in the testdata/fuzz/shiftLeft folder.

As you can see from the file content below, the shift amount was a negative value, which we hadn’t included in our test cases. This needs a fix: we can either support negative values or reject them.

I chose not to support them because the function shifts to the left. If the shift is less than zero, I return the input unchanged, since the function doesn’t return an error. (We could have added an error to the return values, but that would make things unnecessarily complicated for this subject.)

I run the test again with the -run flag, passing the filename as a parameter to test only the failing input, and it passes.

Before running the fuzz engine again, it’s important to add this case to the unit tests so that a similar bug is caught easily in the future.

After modifying the function, I run it again and encounter another error. This time, the error occurs on line 77, where I didn’t get the correct result when shifting the string by its own length. I see that this error is once again saved as a file in the testdata folder.

Let’s debug the code and try to address the error. When we check the debug variables on the left side below, we see that the input’s length is 2 whereas the output’s length is 3. My mistake was converting the string to a byte slice instead of a rune slice: special characters, and UTF-8 characters in general, don’t always consist of a single byte. To manipulate strings correctly in Go, I should have converted the string to a slice of runes instead of bytes.

After the fix, the test passes.

Then I’ll add this case to my unit tests to avoid encountering the same error in the future.

By the way, the input generated by the fuzz engine was a string with special characters plus a shift value. However, the failure surfaced in the check I had implemented for the third rule, which means I needed to perform a 2-character shift to reproduce it. I added this case to my unit tests as an extra test case, then ran the fuzz test again and saw no issues. Looking at the elapsed time, it ran for a minute without any failure.

Meanwhile, the fuzz engine continues running indefinitely unless stopped. Initially, our seed corpus consisted of 4 values, as you may recall, but it has now increased to 11 values.

The reason is that when the fuzz engine finds interesting values, it saves them in the fuzz folder under $GOCACHE even though they don’t cause errors, and it uses all of them as part of the seed corpus in the next run.

Now, let’s do a quick calculation. I navigate to the fuzz folder under $GOCACHE, find the directory for my fuzz test, and see that it contains 38 files. So we have 38 interesting values, plus the two faulty cases found earlier. Under the testdata folder, we initially added 4 values to the seed corpus, and I also found and added one extra value. That gives a total of 38 + 2 + 4 + 1 = 45 values.

In other words, the seed corpus grows on each run with the values found previously. We can look at the files under this folder, use them, and add them to our unit tests if we want.

Now, when it recalculates the baseline coverage, it does so considering that there are 45 values in the seed corpus and continues generating new inputs.

This process can continue indefinitely unless stopped. For that, we have a parameter called -fuzztime. You can specify seconds, hours, or any duration you want. And if no error is found when the time is up, the test will be considered passed.

The last thing I’d like to mention is workers. The ten workers reported every time fuzzing starts mean that fuzzing runs in 10 parallel worker processes. If you don’t want to put that much load on your computer and prefer fewer workers, or the other way around, you can use the -parallel flag to control how many fuzzing processes run at once.

Real life example

It helps us better understand the concepts when we see large companies applying the same practices as we do. There’s an example from the Datadog Agent on the “Fuzz Trophy Case” page of the Go Wiki. They found a bug in a normalization function. We don’t need to know the details of the function; let’s just follow their steps:

This is actually similar to what we’ve written in the demo. They call the function and then follow a set of logical rules:

If a string gets normalized twice, it returns the same output for both.
The resulting string should not be longer than the maximum allowed length.

They discovered that in a switch statement, the default case didn’t increase the character count when an illegal character was encountered. They fixed the problem and added the edge case to their unit tests.

Conclusion

Complementary to unit tests
Useful to detect bugs that are hard to predict
Not deterministic
No control over inputs so failures can be detected based on errors, panics or a property of the return value
Don’t forget to add the inputs found by fuzzing to your regression tests

REFERENCES
