Mutation Testing with PITest in Java

What mutation testing reveals that code coverage misses, how to configure PITest in a Java Spring Boot project, and how to act on surviving mutants effectively.

Eighty percent code coverage feels reassuring until you look at what it’s actually measuring. Coverage tells you which lines were executed during your test run, not whether your tests would catch a bug if one were introduced. I’ve reviewed codebases with 80% coverage and trivially injectable bugs: a > flipped to >= in a boundary check, a && replaced by || in a validation rule, a + where there should be a -. The tests ran. Everything was green. The bug would have gone to production.

Mutation testing is the answer. It introduces small, deliberate faults — mutations — into your production code and checks whether your tests detect them. A test suite that doesn’t catch the mutation is demonstrably weaker than you thought.
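To make the idea concrete, here is a hand-rolled sketch of what a mutation testing tool automates. The MutationDemo class and its age rule are hypothetical, invented for illustration, and the mutant is written out by hand rather than generated from bytecode.

```java
import java.util.function.IntPredicate;

// Hypothetical example: a hand-written "mutant" of a boundary check,
// showing how a weak test lets it survive and a boundary test kills it.
final class MutationDemo {

    // Original production rule: adults are 18 or older.
    static boolean isAdult(int age) {
        return age >= 18;
    }

    // Hand-written mutant: >= changed to > (a boundary mutation).
    static boolean isAdultMutant(int age) {
        return age > 18;
    }

    // Weak test: executes the rule (full line coverage) but only with
    // values far away from the boundary.
    static boolean weakTestPasses(IntPredicate rule) {
        return rule.test(30) && !rule.test(5);
    }

    // Stronger test: also pins the behaviour at and just below the boundary.
    static boolean boundaryTestPasses(IntPredicate rule) {
        return weakTestPasses(rule) && rule.test(18) && !rule.test(17);
    }

    public static void main(String[] args) {
        // The weak test passes for both versions, so the mutant survives.
        System.out.println("weak test, original: " + weakTestPasses(MutationDemo::isAdult));
        System.out.println("weak test, mutant:   " + weakTestPasses(MutationDemo::isAdultMutant));
        // The boundary test fails for the mutant, so the mutant is killed.
        System.out.println("boundary test, mutant: " + boundaryTestPasses(MutationDemo::isAdultMutant));
    }
}
```

The weak test reports true for both the original and the mutant, even though it covers every line of the rule. Only the boundary test tells the two apart.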

What PITest Does

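PITest (PIT Mutation Testing) generates mutants by applying transformation operators, called mutators, to your compiled bytecode. Common operators in the default set include:

Conditionals boundary: < becomes <=, >= becomes >

Negate conditionals: == becomes !=, < becomes >=

Math: + becomes -, * becomes /

Increments: i++ becomes i--

Void method calls: a call to a void method is removed

Return values: the returned value is replaced with an empty value, null, true/false, or zero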

For each mutant, PITest runs the tests that exercise the mutated code. If at least one test fails because of the mutation, the mutant is killed: your tests caught the fault. If all tests pass despite the mutation, the mutant survives: the code was executed, but no assertion noticed the change.

The mutation score is killed / (killed + survived). A score of 100% means every mutation was caught. In practice, 85–90% is a realistic and meaningful target for domain logic.
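As a small worked example of the formula, the sketch below computes the score from raw counts. MutationSummary and its methods are illustrative names, not part of PITest, and the counts in the usage note are made up.

```java
// Minimal sketch: computing a mutation score from killed/survived counts.
// The record and method names are illustrative, not part of PITest.
record MutationSummary(int killed, int survived) {

    // Score as a percentage: killed / (killed + survived) * 100.
    double scorePercent() {
        int total = killed + survived;
        if (total == 0) {
            throw new IllegalStateException("No mutants were generated");
        }
        return 100.0 * killed / total;
    }

    // True when the score reaches the given target percentage.
    boolean meets(double thresholdPercent) {
        return scorePercent() >= thresholdPercent;
    }
}
```

With 34 killed and 6 surviving mutants, new MutationSummary(34, 6).scorePercent() gives 85.0, which meets an 85% target but not a 90% one.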

Configuring PITest in Maven

Add the PITest Maven plugin to your pom.xml:

<build>
    <plugins>
        <plugin>
            <groupId>org.pitest</groupId>
            <artifactId>pitest-maven</artifactId>
            <version>1.15.3</version>
            <dependencies>
                <!-- Required for JUnit 5 support -->
                <dependency>
                    <groupId>org.pitest</groupId>
                    <artifactId>pitest-junit5-plugin</artifactId>
                    <version>1.2.1</version>
                </dependency>
            </dependencies>
            <configuration>
                <targetClasses>
                    <param>com.trinitylogic.claims.domain.*</param>
                </targetClasses>
                <targetTests>
                    <param>com.trinitylogic.claims.domain.*Test</param>
                </targetTests>
                <mutators>
                    <mutator>DEFAULTS</mutator>
                </mutators>
                <outputFormats>
                    <outputFormat>HTML</outputFormat>
                    <outputFormat>XML</outputFormat>
                </outputFormats>
                <timestampedReports>false</timestampedReports>
                <threads>4</threads>
            </configuration>
        </plugin>
    </plugins>
</build>

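Run it with the following command. The test-compile phase comes first because PITest mutates compiled bytecode and needs up-to-date classes:

mvn test-compile org.pitest:pitest-maven:mutationCoverage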

The HTML report lands in target/pit-reports/index.html. Open it in a browser — it shows class-by-class mutation scores with the surviving mutants highlighted inline in the source.

timestampedReports=false keeps the output directory consistent (target/pit-reports rather than target/pit-reports/202503060934) which makes it easier to integrate with CI.

Reading the Report

The report has three levels of interest. At the summary level, you see each class with its line coverage and mutation score. At the class level, you see which mutants were killed and which survived. At the source level, the surviving mutants are highlighted inline — you see exactly which line and which transformation exposed a gap.

A typical surviving mutant looks like this. Given a StakeCalculator:

public BigDecimal calculateMaxStake(BigDecimal bankroll, double riskPercent) {
    if (riskPercent <= 0 || riskPercent > 1.0) {
        throw new IllegalArgumentException("Risk percent must be between 0 and 1");
    }
    return bankroll.multiply(BigDecimal.valueOf(riskPercent));
}

PITest generates a conditional boundary mutant: riskPercent <= 0 becomes riskPercent < 0. If your test only passes riskPercent = -0.1, the mutant survives — you haven’t tested the exact boundary at riskPercent = 0. The surviving mutant is telling you to add:

@Test
void throwsOnZeroRiskPercent() {
    assertThrows(IllegalArgumentException.class,
        () -> calculator.calculateMaxStake(new BigDecimal("1000"), 0.0));
}

That’s the value of mutation testing — it points you to the specific assertion your test is missing, rather than just telling you coverage is low.
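The same reasoning applies to the upper bound: a boundary mutant that turns riskPercent > 1.0 into riskPercent >= 1.0 only dies if some test passes exactly 1.0 and expects it to be accepted. The sketch below checks both boundaries in plain Java (no JUnit) so that it runs standalone. Wrapping the method in a StakeCalculator class with a default constructor is an assumption, and StakeBoundaryChecks is a hypothetical helper.

```java
import java.math.BigDecimal;

// StakeCalculator as shown above; the class wrapper is assumed.
final class StakeCalculator {
    BigDecimal calculateMaxStake(BigDecimal bankroll, double riskPercent) {
        if (riskPercent <= 0 || riskPercent > 1.0) {
            throw new IllegalArgumentException("Risk percent must be between 0 and 1");
        }
        return bankroll.multiply(BigDecimal.valueOf(riskPercent));
    }
}

// Hypothetical plain-Java checks standing in for JUnit tests.
final class StakeBoundaryChecks {

    // True if the call is rejected with IllegalArgumentException.
    static boolean rejects(StakeCalculator calc, double riskPercent) {
        try {
            calc.calculateMaxStake(new BigDecimal("1000"), riskPercent);
            return false;
        } catch (IllegalArgumentException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        StakeCalculator calc = new StakeCalculator();
        BigDecimal bankroll = new BigDecimal("1000");

        // Lower boundary: 0 must be rejected (kills "<= 0" -> "< 0").
        System.out.println("rejects 0.0: " + rejects(calc, 0.0));

        // Upper boundary: exactly 1.0 must be accepted (kills "> 1.0" -> ">= 1.0").
        BigDecimal full = calc.calculateMaxStake(bankroll, 1.0);
        // compareTo ignores scale; equals would treat 1000.0 and 1000 as different.
        System.out.println("1.0 stakes whole bankroll: " + (full.compareTo(bankroll) == 0));

        // Just past the upper boundary must be rejected.
        System.out.println("rejects 1.01: " + rejects(calc, 1.01));
    }
}
```

In a real project these would be three JUnit tests next to throwsOnZeroRiskPercent. The value comparison uses compareTo because multiplying by BigDecimal.valueOf(1.0) changes the scale of the result.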

What to Do When Mutants Survive

Not every surviving mutant requires a new test. There are three reasonable responses:

Kill it: write a test that exercises the specific condition. This is the right response for domain logic where correctness matters.

Accept it: some mutations survive because the code path genuinely doesn’t affect outcomes in a meaningful way. A mutation in a getter that’s only used for serialisation isn’t worth testing exhaustively. PITest already skips calls to common logging frameworks by default (the avoidCallsTo setting); for other cases, exclude the method with excludedMethods.

Exclude the class: infrastructure glue, generated code, and framework callbacks (Spring @Configuration classes, JPA entity mappings) are noise in a mutation report. Exclude them explicitly:

<configuration>
    <excludedClasses>
        <param>com.trinitylogic.claims.config.*</param>
        <param>com.trinitylogic.claims.infrastructure.persistence.entity.*</param>
    </excludedClasses>
</configuration>

The goal is signal, not a perfect score on code you don’t control.
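A related reason to accept a survivor is the equivalent mutant: a mutation that changes the code but not its observable behaviour, so no test can ever kill it. The sketch below uses a hypothetical max helper (not from the claims codebase) with a hand-written boundary mutant to show the effect.

```java
// Hypothetical helper illustrating an equivalent mutant: the boundary
// mutation changes the code but not its observable behaviour.
final class EquivalentMutantDemo {

    // Original: returns the larger of two ints.
    static int max(int a, int b) {
        return a > b ? a : b;
    }

    // Hand-written boundary mutant: > changed to >=.
    // When a == b both branches return the same value, so the result never differs.
    static int maxMutant(int a, int b) {
        return a >= b ? a : b;
    }

    // Exhaustively compares both versions over a small range of inputs.
    static boolean behaveIdentically(int from, int to) {
        for (int a = from; a <= to; a++) {
            for (int b = from; b <= to; b++) {
                if (max(a, b) != maxMutant(a, b)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(behaveIdentically(-50, 50));
    }
}
```

The two versions agree on every input pair in the range, so a survivor like this one is a candidate for accepting rather than for writing more tests.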

The Domain Model Focus Strategy

The highest-value place to run PITest is your domain model — the classes that contain your business rules. These are the classes where a survived mutant corresponds to a real business bug.

For a benefit claims system, that’s the eligibility rules, the state machine governing claim status transitions, the calculation logic for payment amounts. For a trading system, it’s the position sizing logic, the order validation rules, the P&L calculations.

// High-value mutation testing target: state transition guard
public class ClaimStateMachine {

    public ClaimStatus transition(ClaimStatus current, ClaimEvent event) {
        return switch (current) {
            case SUBMITTED -> switch (event) {
                case ASSIGN    -> ClaimStatus.ASSIGNED;
                case WITHDRAW  -> ClaimStatus.WITHDRAWN;
                default        -> throw new InvalidTransitionException(current, event);
            };
            case ASSIGNED -> switch (event) {
                case APPROVE   -> ClaimStatus.APPROVED;
                case REJECT    -> ClaimStatus.REJECTED;
                case WITHDRAW  -> ClaimStatus.WITHDRAWN;
                default        -> throw new InvalidTransitionException(current, event);
            };
            default -> throw new InvalidTransitionException(current, event);
        };
    }
}

Depending on the mutators enabled, PITest will generate mutants that return the wrong value (for example null instead of a status) or, with the optional switch mutators, redirect cases to the default branch. A comprehensive test for ClaimStateMachine needs to exercise every valid transition and verify that invalid transitions throw. Mutation testing will tell you precisely which ones you’ve missed.
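One way to exercise every pair is a table-driven check, sketched below in plain Java so that it runs standalone. The enum members and the exception body are assumptions reconstructed from how the state machine uses them, and TransitionTableCheck is a hypothetical helper. In a real project this would more likely be a parameterised JUnit test.

```java
import java.util.Map;

// Assumed definitions: the article shows only ClaimStateMachine, so these enums
// and the exception are hypothetical reconstructions from how it uses them.
enum ClaimStatus { SUBMITTED, ASSIGNED, APPROVED, REJECTED, WITHDRAWN }

enum ClaimEvent { ASSIGN, APPROVE, REJECT, WITHDRAW }

class InvalidTransitionException extends RuntimeException {
    InvalidTransitionException(ClaimStatus current, ClaimEvent event) {
        super("Cannot apply " + event + " in status " + current);
    }
}

// Same transition logic as the class above.
final class ClaimStateMachine {
    ClaimStatus transition(ClaimStatus current, ClaimEvent event) {
        return switch (current) {
            case SUBMITTED -> switch (event) {
                case ASSIGN   -> ClaimStatus.ASSIGNED;
                case WITHDRAW -> ClaimStatus.WITHDRAWN;
                default       -> throw new InvalidTransitionException(current, event);
            };
            case ASSIGNED -> switch (event) {
                case APPROVE  -> ClaimStatus.APPROVED;
                case REJECT   -> ClaimStatus.REJECTED;
                case WITHDRAW -> ClaimStatus.WITHDRAWN;
                default       -> throw new InvalidTransitionException(current, event);
            };
            default -> throw new InvalidTransitionException(current, event);
        };
    }
}

final class TransitionTableCheck {

    // Every valid (status, event) pair and its target; any pair not listed must throw.
    static final Map<ClaimStatus, Map<ClaimEvent, ClaimStatus>> EXPECTED = Map.of(
        ClaimStatus.SUBMITTED, Map.of(
            ClaimEvent.ASSIGN, ClaimStatus.ASSIGNED,
            ClaimEvent.WITHDRAW, ClaimStatus.WITHDRAWN),
        ClaimStatus.ASSIGNED, Map.of(
            ClaimEvent.APPROVE, ClaimStatus.APPROVED,
            ClaimEvent.REJECT, ClaimStatus.REJECTED,
            ClaimEvent.WITHDRAW, ClaimStatus.WITHDRAWN));

    // Counts the pairs where the machine disagrees with the table.
    static int countMismatches(ClaimStateMachine machine) {
        int mismatches = 0;
        for (ClaimStatus status : ClaimStatus.values()) {
            for (ClaimEvent event : ClaimEvent.values()) {
                ClaimStatus want = EXPECTED.getOrDefault(status, Map.of()).get(event);
                ClaimStatus got;
                try {
                    got = machine.transition(status, event);
                } catch (InvalidTransitionException rejected) {
                    got = null; // null stands for "transition rejected"
                }
                if (got != want) {
                    mismatches++;
                }
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + countMismatches(new ClaimStateMachine()));
    }
}
```

Because the check walks the full cross product of statuses and events, a mutant that returns null for a valid transition or lets an invalid one through shows up as a mismatch.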

Running PITest on a Spring Boot Project

PITest executes tests repeatedly: for every mutant it runs the tests that cover the mutated line. A Spring Boot integration test that starts the full application context per run will make mutation testing prohibitively slow. Configure PITest to target unit tests only:

<configuration>
    <targetTests>
        <param>com.trinitylogic.claims.domain.*Test</param>
    </targetTests>
    <!-- Explicitly exclude integration tests -->
    <excludedTestClasses>
        <param>com.trinitylogic.claims.*IT</param>
        <param>com.trinitylogic.claims.*IntegrationTest</param>
    </excludedTestClasses>
</configuration>

With four threads and a focused target package of ~20 domain classes, a PITest run typically completes in 60–90 seconds. Slow enough that you don’t run it on every compile, fast enough to run as a pre-push hook or as a separate CI stage.

Pro tip: Run PITest only on your domain model package. That’s where your business logic lives and where a surviving mutant represents a genuine risk. Targeting infrastructure, controller, or persistence layers adds noise and run time without proportionate value. A 90% mutation score on your domain model means far more than a 75% score across the entire codebase.

Integrating PITest into CI

Add a dedicated Maven profile and a separate CI step:

<profiles>
    <profile>
        <id>mutation-tests</id>
        <build>
            <plugins>
                <plugin>
                    <groupId>org.pitest</groupId>
                    <artifactId>pitest-maven</artifactId>
                    <configuration>
                        <mutationThreshold>85</mutationThreshold>
                        <coverageThreshold>90</coverageThreshold>
                    </configuration>
                </plugin>
            </plugins>
        </build>
    </profile>
</profiles>

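The mutationThreshold causes the build to fail if the mutation score drops below 85%, and coverageThreshold does the same when line coverage of the targeted classes falls below 90%. The profile only sets thresholds, so the CI step still invokes the goal explicitly, for example mvn -Pmutation-tests test-compile org.pitest:pitest-maven:mutationCoverage. This makes mutation score a gating quality metric, the same way you’d fail a build on line coverage below a threshold, but with a much stronger guarantee.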

Mutation testing won’t tell you your tests are good. It will tell you, precisely and specifically, where they aren’t. That’s a more useful thing to know.

If you’re working on a Java codebase where test quality matters and want an engineer with strong views on how to achieve it, get in touch.