How to build reliable, high-throughput batch jobs with Spring Batch — chunk processing, partitioned steps, skip/retry, and production monitoring patterns.
Batch processing is one of those workloads that sounds straightforward until you’re dealing with 50 million records, partial failures, restartability, and a Monday morning SLA. Spring Batch has been the go-to framework for this in the Java ecosystem for over a decade — and for good reason. It provides a well-designed model for chunk-oriented processing, job restartability, skip/retry semantics, and parallel execution that would take months to build reliably from scratch.
This post covers the architecture, the core abstractions, and the patterns that matter in production.
A Spring Batch Job is a sequence of Steps. Each Step is either a Tasklet (a single unit of work — useful for setup/teardown) or a chunk-oriented step with three components: ItemReader, ItemProcessor, and ItemWriter.
Job
├── Step 1 (chunk-oriented)
│   ├── ItemReader: reads one item at a time
│   ├── ItemProcessor: transforms/filters items
│   └── ItemWriter: writes a chunk at a time
└── Step 2 (tasklet)
    └── Tasklet: single operation
The JobRepository persists job and step metadata (execution state, parameters, exit status) to a database. This is what enables restartability — if a job fails at step 3, a restart skips steps 1 and 2 and picks up from where step 3 left off.
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-batch</artifactId>
</dependency>
spring:
  batch:
    job:
      enabled: false # don't auto-run jobs on startup
    jdbc:
      initialize-schema: always
Chunk processing reads items one at a time, accumulates them into a chunk, and writes the whole chunk atomically. If the write fails, only that chunk is rolled back.
@Configuration
public class SettlementBatchConfig {

    @Bean
    public Job settlementJob(JobRepository repo, Step settleStep) {
        return new JobBuilder("settlementJob", repo)
                .start(settleStep)
                .build();
    }

    @Bean
    public Step settleStep(JobRepository repo,
                           PlatformTransactionManager tx,
                           ItemReader<UnsettledBet> reader,
                           ItemProcessor<UnsettledBet, SettledBet> processor,
                           ItemWriter<SettledBet> writer) {
        return new StepBuilder("settleStep", repo)
                // commit interval: 500 items per transaction
                .<UnsettledBet, SettledBet>chunk(500, tx)
                .reader(reader)
                .processor(processor)
                .writer(writer)
                .build();
    }
}
The generic types on chunk() are <I, O> — input type from reader, output type to writer.
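To make the read/process/write cycle concrete, here is a minimal plain-Java sketch of the chunk-oriented flow described above. It is a simplified model, not Spring Batch's implementation: the class and method names (ChunkLoopSketch, run) are made up for illustration, and transactions, restart state, and fault tolerance are left out.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

/** Simplified model of a chunk-oriented step; not Spring Batch's actual implementation. */
final class ChunkLoopSketch {

    /**
     * Reads items one at a time, processes each, and hands each chunk to the writer.
     * Returns the number of chunks written (one "commit" per chunk).
     */
    static <I, O> int run(Iterator<I> reader,
                          Function<I, O> processor,
                          Consumer<List<O>> writer,
                          int chunkSize) {
        int chunksWritten = 0;
        while (reader.hasNext()) {
            // Read phase: pull up to chunkSize items from the reader.
            List<I> inputs = new ArrayList<>(chunkSize);
            while (inputs.size() < chunkSize && reader.hasNext()) {
                inputs.add(reader.next());
            }
            // Process phase: a null result filters the item out.
            List<O> outputs = new ArrayList<>(inputs.size());
            for (I input : inputs) {
                O output = processor.apply(input);
                if (output != null) {
                    outputs.add(output);
                }
            }
            // Write phase: the whole chunk goes to the writer in one call.
            writer.accept(outputs);
            chunksWritten++;
        }
        return chunksWritten;
    }
}
```

In the real framework the write phase runs inside a transaction, which is why a failed write rolls back only the chunk in flight.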
For large database tables, JdbcCursorItemReader keeps a JDBC cursor open and streams rows without loading everything into memory:
@Bean
@StepScope
public JdbcCursorItemReader<UnsettledBet> unsettledBetReader(DataSource ds) {
    return new JdbcCursorItemReaderBuilder<UnsettledBet>()
            .name("unsettledBetReader")
            .dataSource(ds)
            .sql("""
                SELECT id, market_id, selection_id, stake, price, side, placed_at
                FROM bets
                WHERE status = 'PLACED'
                  AND market_settled_at IS NOT NULL
                ORDER BY id
                """)
            .rowMapper(new BetRowMapper())
            .build();
}
@StepScope is required for readers that use job parameters — it creates a new reader instance per step execution rather than at application startup.
@Component
public class BetSettlementProcessor implements ItemProcessor<UnsettledBet, SettledBet> {

    private final MarketResultService results;

    public BetSettlementProcessor(MarketResultService results) {
        this.results = results;
    }

    @Override
    public SettledBet process(UnsettledBet bet) {
        var result = results.getResult(bet.marketId(), bet.selectionId());
        if (result == null) {
            return null; // returning null filters the item: it won't be written
        }
        // Back bets profit when the selection wins; lay bets profit when it loses.
        double pnl = switch (bet.side()) {
            case BACK -> result.won()
                    ? bet.stake() * (bet.price() - 1)
                    : -bet.stake();
            case LAY -> result.won()
                    ? -(bet.stake() * (bet.price() - 1))
                    : bet.stake();
        };
        return new SettledBet(bet.id(), pnl, result.won(), Instant.now());
    }
}
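Returning null from a processor filters the item: it is not passed to the writer, and it is counted in the step's filter count rather than as a skip or a failure. Use this for filtering rather than throwing an exception.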
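As a worked example of the P&L rule in the processor, the sketch below runs the same switch on a stake of 10 at a price of 3.0. It assumes price means decimal odds (which the price - 1 term suggests); the class name PnlExample and the standalone Side enum are illustrative, not part of the original code.

```java
/** Worked example of the back/lay P&L rule; PnlExample and Side are illustrative names. */
final class PnlExample {

    enum Side { BACK, LAY }

    /** Mirrors the processor's switch: profit on a win is stake * (price - 1). */
    static double pnl(Side side, double stake, double price, boolean selectionWon) {
        double profitIfWon = stake * (price - 1);
        return switch (side) {
            case BACK -> selectionWon ? profitIfWon : -stake;
            case LAY -> selectionWon ? -profitIfWon : stake;
        };
    }

    public static void main(String[] args) {
        // Stake 10 at decimal price 3.0: the backer risks 10 to win 20.
        System.out.println(pnl(Side.BACK, 10, 3.0, true));   // 20.0
        System.out.println(pnl(Side.BACK, 10, 3.0, false));  // -10.0
        System.out.println(pnl(Side.LAY, 10, 3.0, true));    // -20.0
        System.out.println(pnl(Side.LAY, 10, 3.0, false));   // 10.0
    }
}
```

The back and lay results are mirror images, as expected for the two sides of one bet. The sketch keeps double to match the processor; a production settlement system would more likely use BigDecimal for monetary amounts.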
@Bean
public JdbcBatchItemWriter<SettledBet> settledBetWriter(DataSource ds) {
    return new JdbcBatchItemWriterBuilder<SettledBet>()
            .dataSource(ds)
            .sql("""
                UPDATE bets
                SET status = 'SETTLED',
                    pnl = :pnl,
                    won = :won,
                    settled_at = :settledAt
                WHERE id = :id
                """)
            .beanMapped()
            .build();
}
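beanMapped() binds each named parameter to the matching bean property on the item (:settledAt maps to getSettledAt()). With a chunk size of 500, the writer sends the 500 UPDATE statements to the database as a single JDBC batch, which is far more efficient than executing them one at a time.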
In a real settlement job, some bets will have no result available yet, and some market result lookups might transiently fail. Spring Batch’s skip/retry policies handle this without aborting the entire job:
@Bean
public Step settleStep(/* ... */) {
    return new StepBuilder("settleStep", repo)
            .<UnsettledBet, SettledBet>chunk(500, tx)
            .reader(reader)
            .processor(processor)
            .writer(writer)
            .faultTolerant()
            .skip(MarketResultNotFoundException.class)
            .skipLimit(1000) // abort if >1000 skips
            .retry(TransientDataAccessException.class)
            .retryLimit(3)
            .build();
}
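Spring Batch does not persist skipped items by default, and its schema has no skip-log table. It records only the counts, in the READ_SKIP_COUNT, PROCESS_SKIP_COUNT, and WRITE_SKIP_COUNT columns of BATCH_STEP_EXECUTION. To reprocess skipped items later, register a SkipListener on the step and have it write each skipped item to a table of your own. After the job completes, check how many items were skipped:
SELECT STEP_NAME, READ_SKIP_COUNT, PROCESS_SKIP_COUNT, WRITE_SKIP_COUNT FROM BATCH_STEP_EXECUTION WHERE JOB_EXECUTION_ID = ?;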
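The retry and skip settings above can be pictured with a small plain-Java model. This is a deliberately simplified sketch, not how the framework is implemented: the class SkipRetrySketch and its two exception types are hypothetical stand-ins for the transient and not-found exceptions, and chunk rollback and re-processing are ignored. It assumes that the retry limit counts total attempts, and that the step fails once one more skip would exceed the skip limit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Simplified model of retry-then-skip handling; not Spring Batch's actual implementation. */
final class SkipRetrySketch {

    static class TransientFailure extends RuntimeException {}
    static class NotFound extends RuntimeException {}

    /** Items that were processed successfully and items that were skipped. */
    record Outcome<I, O>(List<O> processed, List<I> skipped) {}

    static <I, O> Outcome<I, O> run(List<I> items, Function<I, O> processor,
                                    int retryLimit, int skipLimit) {
        List<O> processed = new ArrayList<>();
        List<I> skipped = new ArrayList<>();
        for (I item : items) {
            try {
                processed.add(withRetry(item, processor, retryLimit));
            } catch (NotFound e) {
                // Skippable failure: record the item and carry on, unless the limit is used up.
                if (skipped.size() >= skipLimit) {
                    throw new IllegalStateException("skip limit of " + skipLimit + " exceeded", e);
                }
                skipped.add(item);
            }
        }
        return new Outcome<>(processed, skipped);
    }

    /** Attempts the call up to retryLimit times in total, retrying only transient failures. */
    private static <I, O> O withRetry(I item, Function<I, O> processor, int retryLimit) {
        for (int attempt = 1; ; attempt++) {
            try {
                return processor.apply(item);
            } catch (TransientFailure e) {
                if (attempt >= retryLimit) {
                    throw e; // retries exhausted: the failure propagates and the run fails
                }
            }
        }
    }
}
```

The skipped list in this model plays the role that a SkipListener-backed table would play in a real job: it is the record of what still needs reprocessing.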
A single-threaded reader scanning 50 million rows is slow. Partitioned steps split the data into ranges and process each range concurrently in separate step executions:
@Bean
public Step partitionedSettleStep(JobRepository repo,
                                  Step settleStep,
                                  Partitioner betIdPartitioner) {
    return new StepBuilder("partitionedSettleStep", repo)
            .partitioner("settleStep", betIdPartitioner)
            .step(settleStep)
            .gridSize(8) // 8 partitions
            .taskExecutor(new SimpleAsyncTaskExecutor())
            .build();
}

@Bean
public Partitioner betIdPartitioner(DataSource ds) {
    // ColumnRangePartitioner comes from the Spring Batch samples, not the core library
    var partitioner = new ColumnRangePartitioner();
    partitioner.setDataSource(ds);
    partitioner.setTable("bets");
    partitioner.setColumn("id");
    return partitioner;
}
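ColumnRangePartitioner is not part of the core library; it ships with the Spring Batch samples, so copy it into your project or write an equivalent. It queries MIN(id) and MAX(id), divides the range into gridSize equal slices, and passes each slice to a worker step in its ExecutionContext. Each worker reads the minValue and maxValue entries and uses them to scope its reader query: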
@Bean
@StepScope
public JdbcCursorItemReader<UnsettledBet> unsettledBetReader(
        DataSource ds,
        @Value("#{stepExecutionContext['minValue']}") Long minId,
        @Value("#{stepExecutionContext['maxValue']}") Long maxId) {
    return new JdbcCursorItemReaderBuilder<UnsettledBet>()
            .name("unsettledBetReader")
            .dataSource(ds)
            // JdbcCursorItemReader takes positional (?) placeholders, not named parameters
            .sql("""
                SELECT id, market_id, selection_id, stake, price, side, placed_at
                FROM bets
                WHERE status = 'PLACED'
                  AND market_settled_at IS NOT NULL
                  AND id BETWEEN ? AND ?
                ORDER BY id
                """)
            .queryArguments(minId, maxId)
            .rowMapper(new BetRowMapper())
            .build();
}
With 8 partitions across 50 million rows, each worker handles roughly 6.25 million rows, assuming ids are evenly distributed across the range. Wall-clock time drops roughly in proportion, less whatever is lost to I/O and database contention.
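The range-splitting arithmetic behind this kind of partitioner can be sketched in a few lines of plain Java. The class RangePartitionSketch, the Range record, and the partition names are assumptions made for illustration; a real partitioner would put minValue and maxValue into an ExecutionContext per partition rather than return a map of records.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch of equal-width id-range partitioning; all names here are illustrative. */
final class RangePartitionSketch {

    /** An inclusive id range handed to one worker. */
    record Range(long minValue, long maxValue) {}

    /** Splits [min, max] into at most gridSize contiguous, non-overlapping ranges. */
    static Map<String, Range> partition(long min, long max, int gridSize) {
        long targetSize = (max - min) / gridSize + 1;
        Map<String, Range> ranges = new LinkedHashMap<>();
        long start = min;
        for (int i = 0; start <= max; i++) {
            long end = Math.min(start + targetSize - 1, max);
            ranges.put("partition" + i, new Range(start, end));
            start += targetSize;
        }
        return ranges;
    }

    public static void main(String[] args) {
        // 50 million ids split 8 ways: each range spans 6,250,000 ids.
        partition(1, 50_000_000L, 8).forEach((name, range) ->
                System.out.println(name + " -> " + range));
    }
}
```

Note that the split is by id range, not by row count. If ids have gaps, or the rows matching the status filter cluster in one part of the range, some workers will finish well before others.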
@Service
public class SettlementJobLauncher {

    private final JobLauncher launcher;
    private final Job settlementJob;

    public SettlementJobLauncher(JobLauncher launcher, Job settlementJob) {
        this.launcher = launcher;
        this.settlementJob = settlementJob;
    }

    public JobExecution runSettlement(LocalDate settlementDate) throws JobExecutionException {
        var params = new JobParametersBuilder()
                .addLocalDate("settlementDate", settlementDate)
                .addLong("run.id", System.currentTimeMillis()) // ensures unique execution
                .toJobParameters();
        return launcher.run(settlementJob, params);
    }
}
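run.id makes every launch a new job instance, so the job runs even if the same settlementDate was processed before. Without it, relaunching a completed job with identical parameters fails with JobInstanceAlreadyCompleteException. The trade-off is that a failed run can only be restarted by relaunching with the same parameters, including the original run.id.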
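The identity rule behind this behaviour is that a job instance is the job name plus its identifying parameters. The toy model below illustrates that rule in plain Java; JobIdentitySketch and JobInstanceKey are hypothetical names, and the real JobRepository tracks considerably more state (executions, statuses, restarts).

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of job-instance identity; not Spring Batch's actual implementation. */
final class JobIdentitySketch {

    /** A job instance is identified by the job name plus its identifying parameters. */
    record JobInstanceKey(String jobName, Map<String, Object> parameters) {}

    private final Set<JobInstanceKey> completed = new HashSet<>();

    /** Records a completed run once per distinct key; a repeat of the same key is rejected. */
    void launch(String jobName, Map<String, Object> parameters) {
        JobInstanceKey key = new JobInstanceKey(jobName, Map.copyOf(parameters));
        if (!completed.add(key)) {
            throw new IllegalStateException("job instance already complete: " + key);
        }
    }
}
```

Launching twice with only a settlementDate entry is rejected the second time in this model, while adding a distinct run.id entry produces a different key and goes through.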
For scheduled runs, annotate the launcher method (this requires @EnableScheduling on a configuration class):
@Scheduled(cron = "0 0 2 * * *") // daily at 02:00
public void scheduledSettlement() throws JobExecutionException {
    runSettlement(LocalDate.now().minusDays(1));
}
Spring Boot Actuator has no dedicated batch endpoint. Spring Batch publishes job and step timings through Micrometer under the spring.batch prefix, and Actuator's metrics endpoint exposes them:
management:
  endpoints:
    web:
      exposure:
        include: metrics
GET /actuator/metrics/spring.batch.job returns the job timer, which can be filtered by tags such as job name and status. For production dashboards, query the BATCH_JOB_EXECUTION and BATCH_STEP_EXECUTION tables directly: they hold the read, write, skip, and commit counts plus the start and end times for each step.
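@Component
public class BatchMetricsListener implements JobExecutionListener {

    private final MeterRegistry registry;

    public BatchMetricsListener(MeterRegistry registry) {
        this.registry = registry;
    }

    @Override
    public void afterJob(JobExecution execution) {
        // Use the injected registry. A counter accumulates reads across runs, whereas a gauge
        // over the StepExecution would hold only a weak reference and go stale after GC.
        execution.getStepExecutions().forEach(step ->
                registry.counter("batch.step.read.count", "step", step.getStepName())
                        .increment(step.getReadCount()));
    }
}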
Chunk size is the main performance lever. Too small and you’re committing thousands of tiny transactions. Too large and a single failure rolls back too much work. 200–1000 is typical; benchmark against your actual write pattern.
Always use @StepScope for stateful beans. Readers that maintain cursor state are not thread-safe. @StepScope ensures each step execution gets its own instance.
Call saveState(false) on readers that don’t need restartability. By default the reader saves its position to the execution context at every chunk commit. For idempotent jobs this is unnecessary overhead.
JobRepository needs its schema created once. Set spring.batch.jdbc.initialize-schema to always in development and to embedded in tests (embedded, the default, initializes only embedded databases). In production, run the schema SQL as part of deployment and set it to never.
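To put numbers on the chunk-size trade-off from the first tip, the following sketch counts how many chunk transactions a given chunk size implies for 50 million rows. ChunkSizeMath is an illustrative name, and the count ignores retries, skips, and partitioning.

```java
/** Back-of-the-envelope commit counts for a given chunk size. */
final class ChunkSizeMath {

    /** Number of chunk transactions needed to process rowCount items (ceiling division). */
    static long commits(long rowCount, int chunkSize) {
        return (rowCount + chunkSize - 1) / chunkSize;
    }

    public static void main(String[] args) {
        long rows = 50_000_000L;
        for (int chunkSize : new int[] {10, 200, 500, 1000, 10_000}) {
            System.out.println("chunk " + chunkSize + " -> " + commits(rows, chunkSize) + " commits");
        }
    }
}
```

At a chunk size of 500, 50 million rows means 100,000 commits, and a failed write rolls back at most 500 items. At 10, the same job needs 5 million commits. At 10,000 it needs only 5,000, but each failure discards up to 10,000 items of work.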
Spring Batch handles the hard parts of production batch processing (restartability, atomicity, parallel execution, and audit trails) so you can focus on business logic.