How Kafka consumer offsets work, the trade-offs between auto and manual commit, and how to implement at-least-once and exactly-once semantics correctly in a Spring Boot consumer.
Kafka’s durability guarantee is only as strong as your offset management. An offset is a pointer to the last message your consumer processed — commit it at the wrong time and you either lose messages (skip past failures) or reprocess them endlessly (never advance past a persistent error). Getting offset management right requires understanding what Kafka promises, what Spring Kafka provides, and where you need to take manual control.
Every message in a Kafka topic partition has a monotonically increasing offset. When a consumer group processes a partition, it commits the offset of the last successfully processed message. If the consumer restarts, it resumes from the committed offset.
Committed offsets are stored per consumer group in the internal __consumer_offsets topic, separate from the partition’s message log. Committing is an explicit act: messages are not acknowledged simply by being delivered.
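You can inspect these stored offsets directly. A minimal sketch using the Kafka AdminClient, assuming a broker at localhost:9092 and the order-processor group used later in this post:
import java.util.Map;

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class ShowCommittedOffsets {

    public static void main(String[] args) throws Exception {
        try (Admin admin = Admin.create(
                Map.of(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"))) {
            // Fetch the group's committed offsets, i.e. the data backing __consumer_offsets
            Map<TopicPartition, OffsetAndMetadata> committed = admin
                    .listConsumerGroupOffsets("order-processor")
                    .partitionsToOffsetAndMetadata()
                    .get();
            committed.forEach((tp, om) ->
                    System.out.printf("%s -> committed offset %d%n", tp, om.offset()));
        }
    }
}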
The raw Kafka client defaults to enable.auto.commit=true: every auto.commit.interval.ms (default 5 seconds), the consumer commits the offsets returned by its most recent poll(), regardless of whether your code has processed those messages. (Spring Kafka has set enable.auto.commit to false by default since version 2.3 and commits from the listener container instead, but the hazard below applies whenever auto-commit is switched on.)
The lie: auto-commit ties commits to the clock, not to your code. Crash after processing a batch but before the next commit fires, and on restart the consumer resumes from the older committed offset and reprocesses those messages. Or worse: if auto-commit fires after a fetch but before processing completes, and the consumer crashes mid-processing, those messages are skipped permanently.
Auto-commit is acceptable only for non-critical, idempotent consumers where occasional reprocessing or loss is tolerable. For anything that writes to a database, places an order, or sends a notification, disable it.
spring:
  kafka:
    consumer:
      enable-auto-commit: false
    listener:
      ack-mode: MANUAL_IMMEDIATE
With MANUAL_IMMEDIATE, the listener receives an Acknowledgment object and must call acknowledgment.acknowledge() explicitly after successful processing. (MANUAL queues the commit until all records from the current poll are processed; MANUAL_IMMEDIATE commits as soon as acknowledge() is called.)
@KafkaListener(topics = "order-events", groupId = "order-processor")
public void onOrderEvent(OrderEvent event, Acknowledgment ack) {
try {
orderService.process(event);
ack.acknowledge();
} catch (RetryableException e) {
log.warn("Transient failure processing order {}, will retry", event.orderId());
// Do NOT acknowledge; the error handler seeks back and redelivers this record
throw e;
} catch (FatalException e) {
log.error("Permanent failure for order {}, sending to DLQ", event.orderId());
deadLetterPublisher.send(event);
ack.acknowledge(); // advance past the poison message
}
}
This gives at-least-once delivery: every message is processed at least once and acknowledged only after success. In the transient-failure case, the thrown exception reaches the container’s error handler; Spring Kafka’s DefaultErrorHandler (configured below) seeks back to the failed offset and retries it on the next poll, with backoff, within the same consumer and without forcing a rebalance or restart.
For consumers that process messages in bulk — writing to a database in batches, for example — commit after the batch completes:
@KafkaListener(topics = "market-prices", groupId = "price-processor",
containerFactory = "batchFactory")
public void onPriceBatch(List<MarketPrice> prices, Acknowledgment ack) {
priceRepository.saveAll(prices); // single DB write for the batch
ack.acknowledge(); // commit after the entire batch is persisted
}
Configure the batch factory for batch listening with manual acknowledgment; the fetch sizes themselves are ordinary consumer properties, shown after the factory:
@Bean
public ConcurrentKafkaListenerContainerFactory<String, MarketPrice> batchFactory(
ConsumerFactory<String, MarketPrice> cf) {
ConcurrentKafkaListenerContainerFactory<String, MarketPrice> factory =
new ConcurrentKafkaListenerContainerFactory<>();
factory.setConsumerFactory(cf);
factory.setBatchListener(true);
factory.getContainerProperties().setAckMode(ContainerProperties.AckMode.MANUAL_IMMEDIATE);
return factory;
}
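A sketch of the fetch sizing in application.yml; the values are illustrative starting points, not tuned recommendations:
spring:
  kafka:
    consumer:
      max-poll-records: 500    # upper bound on records returned per poll(), i.e. per batch
      fetch-min-size: 64KB     # maps to fetch.min.bytes: broker waits for this much data...
      fetch-max-wait: 500ms    # ...or fetch.max.wait.ms, whichever elapses first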
Spring Kafka 2.8+ provides DefaultErrorHandler as the replacement for SeekToCurrentErrorHandler. Configure exponential backoff with a dead-letter topic:
@Bean
public DefaultErrorHandler errorHandler(KafkaTemplate<String, Object> template) {
DeadLetterPublishingRecoverer recoverer =
new DeadLetterPublishingRecoverer(template,
(record, ex) -> new TopicPartition(record.topic() + ".DLT", record.partition()));
ExponentialBackOffWithMaxRetries backoff = new ExponentialBackOffWithMaxRetries(3);
backoff.setInitialInterval(1_000L);
backoff.setMultiplier(2.0);
backoff.setMaxInterval(10_000L);
return new DefaultErrorHandler(recoverer, backoff);
}
After 3 retries with exponential backoff, the message is published to the .DLT topic and the original offset is committed. The consumer advances — it does not get stuck on a permanently failing message.
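One wiring detail is easy to miss: a hand-built container factory such as batchFactory above does not pick up the error handler automatically (recent Spring Boot versions wire a CommonErrorHandler bean into the auto-configured factory, but not into custom ones). Register it inside the bean method:
factory.setCommonErrorHandler(errorHandler); // DefaultErrorHandler implements CommonErrorHandler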
At-least-once delivery means your consumer must be idempotent — processing the same message twice must produce the same result as processing it once. The simplest approach: check before acting.
public void process(OrderEvent event) {
    if (orderRepository.existsByIdempotencyKey(event.idempotencyKey())) {
        log.debug("Duplicate event {}, skipping", event.idempotencyKey());
        return;
    }
    // The existence check races with concurrent redelivery; back it with a
    // unique constraint on the idempotency key (see the sketch below)
    orderRepository.save(Order.from(event));
}
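Check-then-insert alone is not safe: two redeliveries can both pass the existence check before either row is written. A database unique constraint makes the duplicate insert fail loudly instead of creating a second row. A sketch assuming Spring Boot 3 / Jakarta Persistence; entity and column names are illustrative:
import jakarta.persistence.Column;
import jakarta.persistence.Entity;
import jakarta.persistence.GeneratedValue;
import jakarta.persistence.Id;
import jakarta.persistence.Table;
import jakarta.persistence.UniqueConstraint;

@Entity
@Table(name = "orders",
       uniqueConstraints = @UniqueConstraint(columnNames = "idempotency_key"))
public class Order {

    @Id
    @GeneratedValue
    private Long id;

    // Enforced by the database: a racing duplicate insert throws a
    // constraint violation instead of silently creating a second row
    @Column(name = "idempotency_key", nullable = false)
    private String idempotencyKey;

    // ... remaining fields and the Order.from(OrderEvent) factory
}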
True exactly-once semantics, with no duplicates even across crashes, requires Kafka transactions: a transactional producer that writes results and commits offsets atomically, and consumers reading with isolation.level=read_committed (Kafka Streams wraps this as processing.guarantee=exactly_once_v2). For most Spring Boot consumers writing to a database, idempotent at-least-once is the practical choice: simpler, sufficient, and far easier to reason about under failure.
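For reference, the transactional pieces in Spring Boot configuration terms; a sketch only (the prefix value is illustrative), and note that Kafka transactions span Kafka-in/Kafka-out pipelines, not writes to an external database:
spring:
  kafka:
    producer:
      transaction-id-prefix: order-tx-   # enables the transactional producer
    consumer:
      isolation-level: read-committed    # maps to isolation.level=read_committed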
Consumer lag, the gap between a partition’s latest offset and the consumer’s committed position, is the primary health metric for a Kafka consumer. Zero lag is ideal; growing lag means the consumer cannot keep up with the producer.
Expose lag via Micrometer (included in Spring Boot Actuator):
management:
  metrics:
    tags:
      application: order-processor
Spring Kafka registers the consumer’s fetch-manager metrics with Micrometer automatically, including records-lag-max; with recent Micrometer versions this reaches Prometheus as kafka_consumer_fetch_manager_records_lag_max. Wire it to a Prometheus alert:
alert: KafkaConsumerLagHigh
expr: kafka_consumer_fetch_manager_records_lag_max > 10000
for: 5m
labels:
  severity: warning
For debugging rebalance storms or uneven partition distribution:
@Bean
public ConsumerAwareRebalanceListener rebalanceListener() {
return new ConsumerAwareRebalanceListener() {
@Override
public void onPartitionsAssigned(Consumer<?, ?> consumer,
Collection<TopicPartition> partitions) {
log.info("Assigned partitions: {}", partitions);
}
@Override
public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
log.info("Revoked partitions: {}", partitions);
}
};
}
Frequent rebalances — visible as rapid assigned/revoked cycles in the logs — indicate a consumer that is too slow to process its batch within max.poll.interval.ms. Either reduce the batch size or increase the poll interval.
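Both knobs are ordinary consumer settings; a sketch with illustrative values:
spring:
  kafka:
    consumer:
      max-poll-records: 100                # smaller batches finish sooner
      properties:
        "[max.poll.interval.ms]": 600000   # allow up to 10 minutes between polls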
Offset management is not a configuration detail — it is a correctness boundary. The choice between auto-commit, manual acknowledge, and idempotent processing determines whether your consumer loses data, duplicates work, or handles failures cleanly.
If you’re working on Kafka consumer infrastructure in Spring Boot and want a review of your offset and error handling strategy, get in touch.