Introduction to Mutation Testing

Developers probably spend a lot of time writing, running, and maintaining tests, and having them all pass is a great feeling! But what if there is something missing? In this case, developers and testers can be blissfully unaware that they're shipping time bombs to the customer. How would they know that? Who is testing the tests?

Aymen Maaoui


As developers, we have all at some point in our careers forgotten to write a test for a certain function, whatever the reason. This can lead to some issues:

These tiny cracks in test coverage become very difficult to identify as the app grows bigger and more complex.
Gaps smaller than skipping an entire function, like half-tested statements and/or half-mocked mocks, can also be a real headache to find.
These are holes we wouldn’t necessarily notice when reviewing unit tests, but they still mean that we can break our system and be lied to by our supposedly trusty test suite.

Unit testing is standard practice now, but we can only rely on our suite if we make sure it covers enough of the application’s functionality. A great way to check this is mutation testing, but in order to understand its true value, we need to think about why we actually need to write tests in the first place.

I. Why Do We Need Unit Testing?

There are lots of reasons to write tests, to name a few:

Enabling the development of production code: by defining a failing check, developers get executable feedback about the next production code change they should make.
Encouraging clean architecture and best practices: as untestable code is often bad code, writing tests pushes developers towards better design principles.
Regression validation: when a behavior changes, a unit check should fail. If developers change a behavior in the production code and all checks pass, it means the changed behavior of the application was not covered properly by the unit check suite.

TDD (Test-Driven Development) alone is not enough to deliver lean code that behaves exactly as expected. Mutation testing is a powerful step forward.

II. What Is Mutation Testing and Why Should We Care about It?

Mutation testing is a fault-based testing technique that investigates how a system behaves when something in its code changes. Small faults are deliberately introduced, and the effectiveness of the test set is judged by whether it detects these deviations. It sounds a little complicated, doesn’t it? Let’s break it down and try to understand it piece by piece.

A mutation test is a test for your tests: We can think of a mutation test as a test for our tests. Even when we practice TDD, where we start by writing a test and then write the code that makes that test pass, the tests are still a separate program that we write. People who practice automated testing often write that test code as simply as possible so that it contains no bugs, avoiding things like iteration or conditional logic. This makes the tests very easy to understand; however, we are still human beings, and errors can occur no matter how much effort we put in.

So, mutation tests are a way for us to validate that the tests we wrote behave the way we expect them to, and this solves the problem of who tests the tests.

A mutation test generates an efficacy score for your test suite, called the “mutation score”
A perfect mutation score is 100
The lower bound for mutation score is 0
A low mutation score means many changes are left undetected

A mutation score reflects how effective a test suite is at detecting logic changes and can tell us how well our tests perform by noticing whether or not a side effect occurred.

Mutation Testing Process

Briefly, the mutation testing process works as follows:

Iterate through the set of mutation operators and apply each one to the original program, which results in a set of mutated programs.
Run all tests against each mutated program.
Classify each mutated program as:

- Surviving mutant if all tests pass against it.
- Killed mutant if at least one test fails against it.
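
The steps above can be sketched as a toy, hand-rolled loop. This is only an illustration under simplifying assumptions: real tools mutate the source code itself, whereas here the mutants are written by hand as closures. All names (`isAdult`, `Mutant`, `testSuite`, `classify`) are hypothetical and chosen for this sketch.

```swift
// Toy sketch of the mutation testing loop; all names are hypothetical.
// Real tools rewrite source code; here each mutant is a hand-written closure.

typealias Implementation = (Int) -> Bool

// Original program: a person is an adult from age 18 onwards.
let original: Implementation = { age in age >= 18 }

struct Mutant {
    let description: String
    let implementation: Implementation
}

// Step 1: apply mutation operators to obtain a set of mutated programs.
let mutants = [
    Mutant(description: ">= replaced with >", implementation: { age in age > 18 }),
    Mutant(description: ">= replaced with <", implementation: { age in age < 18 }),
    Mutant(description: "18 replaced with 19", implementation: { age in age >= 19 }),
]

// The test suite returns true when every check passes for the given implementation.
// Note that it never checks the boundary value 18.
func testSuite(_ isAdult: Implementation) -> Bool {
    return isAdult(30) == true && isAdult(10) == false
}

enum Outcome { case killed, survived }

// Steps 2 and 3: run all tests against a mutant and classify it.
func classify(_ mutant: Mutant) -> Outcome {
    return testSuite(mutant.implementation) ? .survived : .killed
}

for mutant in mutants {
    print("\(mutant.description): \(classify(mutant))")
}
```

In this sketch the first and third mutants survive because the suite never exercises the boundary age of 18. Adding a check that `isAdult(18)` is true would kill both of them.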

A Deep Dive into How Mutation Testing Works

A really cool thing about mutation testing nowadays is that you do not write the mutation tests yourself! That’s handled for you by a tool, which introduces small changes into your app’s code.

A single change is called a mutant
One or more mutants may be inserted at a time:
inserting one mutant at a time is called first-order mutation testing
inserting more than one mutant at a time is called higher-order mutation testing
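
To make the difference concrete, here is a small sketch of a first-order and a higher-order mutant of the same function. The function `canCheckout` and its mutated copies are hypothetical and written by hand; a real tool would generate such variants automatically.

```swift
// Hypothetical function used only to illustrate mutation order.
func canCheckout(itemCount: Int, balance: Int, total: Int) -> Bool {
    return itemCount > 0 && balance >= total
}

// First-order mutant: exactly one change (> replaced with >=).
func canCheckoutFirstOrder(itemCount: Int, balance: Int, total: Int) -> Bool {
    return itemCount >= 0 && balance >= total
}

// Higher-order mutant: two changes at once (> to >=, and && to ||).
func canCheckoutHigherOrder(itemCount: Int, balance: Int, total: Int) -> Bool {
    return itemCount >= 0 || balance >= total
}
```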

The key thing is that these mutants always mimic realistic programming errors :).

A mutant is introduced by a mutation operator, and a mutation testing tool comes with a collection of mutation operators.
After a mutation operator is applied, the tests are run, and the results are recorded.
A test suite that fails in response to a mutant has killed the mutant. It might be a bit confusing for newcomers, but a passing mutation test means a failing test suite: killing the mutant means our test suite failed in response to the change (the inserted fault).
Once all possible mutants have been applied to the code, mutation testing is finished, and a mutation score is generated for the test suite and source files.

The existence of surviving mutants exposes which pieces of code need additional or stronger tests.

Mutation Score

The mutation score is defined as follows:

Mutation Score = (Killed Mutants / Total number of Mutants) * 100

If the mutation score is 100%, the test cases are mutation adequate. In practice this is usually not achievable, much like 100% code coverage.
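
As a quick worked example of the formula, the following sketch computes the score for a made-up run with 60 mutants, 45 of which were killed. The function name `mutationScore` is hypothetical, and returning 0 when no mutants were generated is an assumption of this sketch, not something the formula defines.

```swift
// Mutation score as a percentage; the function name is hypothetical.
func mutationScore(killed: Int, total: Int) -> Double {
    // Assumption: report 0 when no mutants were generated, to avoid dividing by zero.
    guard total > 0 else { return 0 }
    return Double(killed) / Double(total) * 100
}

let score = mutationScore(killed: 45, total: 60)
print(score) // 75.0
```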

To answer the question “How do I know it works, especially since over time the complexity of the app is going to grow, and it is not going to be only one developer who touches the code?” we can simply refer to the mutation score.

Mutation Testing Types & Examples

Note: All the following examples are in Swift.

Value Mutations: An attempt to change the values to detect errors in the programs. We usually change one value to a much larger value or one value to a much smaller value. The most common strategy is to change the constants.
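
A minimal sketch of a value mutation might look like this. The function `shippingCost` and its threshold are invented for illustration, and the mutant is written by hand to show what a tool could generate.

```swift
// Original (hypothetical): orders of 50 or more ship for free.
func shippingCost(orderTotal: Int) -> Int {
    return orderTotal >= 50 ? 0 : 5
}

// Value mutant: the constant 50 is replaced with a much larger value.
func shippingCostMutant(orderTotal: Int) -> Int {
    return orderTotal >= 5000 ? 0 : 5
}
```

A test asserting that an order of 60 ships for free would kill this mutant, while a suite that only checks an order of 10 would let it survive.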

Decision Mutations: In decision mutations logical or arithmetic operators are changed to detect errors in the program.
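
As an illustration, a decision mutation on a logical operator could look like the following sketch. The function `canAccessAdminPanel` is a hypothetical example, and the mutant is written by hand.

```swift
// Original (hypothetical): access requires being logged in AND being an admin.
func canAccessAdminPanel(isLoggedIn: Bool, isAdmin: Bool) -> Bool {
    return isLoggedIn && isAdmin
}

// Decision mutant: && is replaced with ||.
func canAccessAdminPanelMutant(isLoggedIn: Bool, isAdmin: Bool) -> Bool {
    return isLoggedIn || isAdmin
}
```

Only a test in which exactly one of the two flags is true can tell the original and the mutant apart.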

Statement Mutations: Statements are changed by deleting or duplicating a line, mimicking mistakes that may arise when a developer copies and pastes code from somewhere else.
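
The sketch below shows one deleted and one duplicated statement. The `Cart` type and both mutants are hypothetical and written by hand for illustration.

```swift
// Original (hypothetical): applies a discount and records that it was applied.
struct Cart {
    var total: Int
    var discountApplied = false

    mutating func applyDiscount(_ amount: Int) {
        total -= amount
        discountApplied = true
    }
}

// Statement-deletion mutant: the line that sets the flag is removed.
struct CartDeletionMutant {
    var total: Int
    var discountApplied = false

    mutating func applyDiscount(_ amount: Int) {
        total -= amount
    }
}

// Statement-duplication mutant: the subtraction is executed twice.
struct CartDuplicationMutant {
    var total: Int
    var discountApplied = false

    mutating func applyDiscount(_ amount: Int) {
        total -= amount
        total -= amount
        discountApplied = true
    }
}
```

A test that checks only `total` would miss the deletion mutant, and a test that checks only `discountApplied` would miss the duplication mutant, so both properties need an assertion.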

We have to mention that mutation testing relies on two things:

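Competent Programmer: Competent programmers tend to write programs that are close to being correct, so the faults that remain are mostly small slips. This is why mutants that make small changes are considered realistic.
Coupling effect: Test cases that detect simple faults tend to detect more complex faults as well, because complex faults are coupled to simple ones. On the subject of coupling, lots of developers say, “I don’t write tests like this, because they get coupled to that.” If you do not couple your tests to your production code, you are not testing your production code!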

My advice here is: Be very intentional about what you choose to couple to. Unit tests should couple to the method implementation; UI tests should couple to the navigation hierarchy.

Mutation Testing Tools

Lots of tools exist for nearly all modern programming languages. Below are the best-known ones in the mobile app development field:

Programming Language	Tool
JVM Languages	PITest
LLVM Languages	Mull
Swift	Muter

Dos and Don’ts of Mutation Testing

Dos

Start step by step: mutation testing can take a long time, so if you are planning to apply it to an existing project, it is wise to mutation test your project incrementally.

Hint: some tools allow you to explicitly specify which files to mutate.

Review test metrics: It is crucial to regularly review your test metrics. Use code coverage to see how much of your code you are testing and to figure out whether you need to write more unit tests, and use the mutation score to see where to focus your mutation testing effort.

Don’ts

Address all surviving mutants at once: This can lead to a lot of refactoring of your test suite as well as your production code. Instead, pick the logic that is the most important.
Mutation test when code coverage is low: Mutation tests tell us how effectively our tests interact with our app code, so with low code coverage we will get a false sense of confidence from them.

Advantages & Disadvantages of Mutation Testing

Advantages

Powerful approach to attain high coverage of the source code
Brings a good level of error detection into the program
Very good at detecting hidden defects, which might be impossible to identify using the conventional testing techniques
Discovers ambiguities in the source code
Increases customer satisfaction by providing a more reliable and stable product

Disadvantages

Extremely costly and time-consuming; in practice, it cannot be done without an automation tool.
Each mutant has to be run against the same test cases as the original program, so the number of test runs grows with the number of mutants.
Complex mutations are difficult to implement
Testers need to have programming skills to do mutation testing
Not applicable for Black Box Testing at all, as it involves a lot of source code changes

III. Conclusion

As human beings, we can’t always avoid making mistakes, but when it comes to software development, there are ways to catch a mistake before it goes to production. One way is mutation testing. Even though it is time-consuming and requires automation, mutation testing remains one of the most thorough ways to evaluate a test suite, and it highlights code that needs additional tests or better assertions.

I hope that I managed to give you a brief introduction to what mutation testing is about and how it works. It’s over to you now to look into how to leverage it and put it into practice. And if you are interested in digging deeper into the topic, we’ll put mutation testing to good use and tackle some everyday scenarios in a future article.