Practical SQS patterns for Spring Boot microservices — standard vs FIFO queues, @SqsListener, dead letter queues, idempotency, and visibility timeout tuning.
SQS is the bread-and-butter messaging service for AWS-based microservices. It’s simpler to operate than Kafka or RabbitMQ, integrates cleanly with IAM and other AWS services, and scales without you touching anything. But simple to operate does not mean simple to use correctly. At DWP Digital we handled high-volume benefit payment events over SQS, and the operational lessons from that work are what this post is about.
The first decision is queue type.
Standard queues offer unlimited throughput, but SQS guarantees at-least-once delivery — not exactly-once. Messages can arrive out of order and can be delivered more than once. Your consumer must be idempotent.
FIFO queues guarantee ordered delivery and exactly-once processing within a message group, but throughput is capped (300 TPS per queue, or 3,000 with batching). They require a MessageGroupId on every send, plus a MessageDeduplicationId unless content-based deduplication is enabled on the queue.
For most microservice use cases, standard queues are the right default. FIFO is worth the throughput constraint when strict ordering genuinely matters to the business domain — for example, processing customer account state changes in sequence.
Add spring-cloud-aws-sqs to your pom.xml (if you import the spring-cloud-aws-dependencies BOM, the version is managed for you):
<dependency>
    <groupId>io.awspring.cloud</groupId>
    <artifactId>spring-cloud-aws-sqs</artifactId>
</dependency>
With Spring Cloud AWS 3.x, the SqsAsyncClient is auto-configured from your AWS credentials and region. For local development, point at LocalStack:
spring:
  cloud:
    aws:
      sqs:
        endpoint: http://localhost:4566
      region:
        static: eu-west-2
      credentials:
        access-key: test
        secret-key: test
The @SqsListener annotation is the Spring Cloud AWS equivalent of @KafkaListener:
import io.awspring.cloud.sqs.annotation.SqsListener;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.messaging.handler.annotation.Header;
import org.springframework.messaging.handler.annotation.Payload;
import org.springframework.stereotype.Component;

@Component
public class PaymentEventConsumer {

    private static final Logger log = LoggerFactory.getLogger(PaymentEventConsumer.class);
    private final PaymentService paymentService;

    public PaymentEventConsumer(PaymentService paymentService) {
        this.paymentService = paymentService;
    }

    @SqsListener("payment-events")
    public void receive(
            @Payload PaymentEvent event,
            @Header("ApproximateReceiveCount") String receiveCount) {
        log.info("Processing payment {} (receive attempt {})",
                event.getPaymentId(), receiveCount);
        paymentService.process(event);
        // the message is deleted automatically on successful return
    }
}
When the method returns without throwing, Spring Cloud AWS deletes the message from the queue. When it throws, the message becomes visible again after the visibility timeout and will be redelivered. This is the at-least-once delivery contract.
SqsTemplate is the high-level send abstraction:
import io.awspring.cloud.sqs.operations.SqsTemplate;
import org.springframework.stereotype.Service;

@Service
public class PaymentEventPublisher {

    private final SqsTemplate sqsTemplate;

    public PaymentEventPublisher(SqsTemplate sqsTemplate) {
        this.sqsTemplate = sqsTemplate;
    }

    public void publish(PaymentEvent event) {
        sqsTemplate.send(to -> to
                .queue("payment-events")
                .payload(event)
                .header("source", "payment-service"));
    }
}
For FIFO queues, include messageGroupId and messageDeduplicationId:
sqsTemplate.send(to -> to
        .queue("payment-events.fifo")
        .payload(event)
        .messageGroupId(event.getAccountId())
        .messageDeduplicationId(event.getPaymentId()));
Configure a DLQ for every queue. Without one, a poison message that always fails will be redelivered until it expires (up to 14 days), consuming your visibility windows and polluting your metrics.
Set up the DLQ in your AWS infrastructure (CloudFormation, Terraform, or CDK):
"RedrivePolicy": {
"deadLetterTargetArn": "arn:aws:sqs:eu-west-2:123456789:payment-events-dlq",
"maxReceiveCount": 3
}
maxReceiveCount: 3 means that once a message has been received three times without being deleted, SQS routes it to the DLQ. Alert on DLQ depth — a non-zero value means something is wrong in the consumer and requires investigation.
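In production the usual answer is a CloudWatch alarm on the DLQ's ApproximateNumberOfMessagesVisible metric. If you also want a quick in-process check, here is a minimal sketch, assuming the auto-configured SqsAsyncClient, scheduling enabled via @EnableScheduling, a hypothetical app.dlq-url property, and a hypothetical AlertingService:

import org.springframework.beans.factory.annotation.Value;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;
import software.amazon.awssdk.services.sqs.SqsAsyncClient;
import software.amazon.awssdk.services.sqs.model.QueueAttributeName;

@Component
public class DlqDepthMonitor {

    private final SqsAsyncClient sqs;
    private final AlertingService alertingService; // hypothetical alerting abstraction

    @Value("${app.dlq-url}") // hypothetical property holding the DLQ URL
    private String dlqUrl;

    public DlqDepthMonitor(SqsAsyncClient sqs, AlertingService alertingService) {
        this.sqs = sqs;
        this.alertingService = alertingService;
    }

    @Scheduled(fixedDelay = 60_000) // check once a minute
    public void checkDlqDepth() {
        sqs.getQueueAttributes(req -> req
                .queueUrl(dlqUrl)
                .attributeNames(QueueAttributeName.APPROXIMATE_NUMBER_OF_MESSAGES))
            .thenAccept(resp -> {
                int depth = Integer.parseInt(
                        resp.attributes().get(QueueAttributeName.APPROXIMATE_NUMBER_OF_MESSAGES));
                if (depth > 0) {
                    alertingService.notifyDlqDepth("payment-events-dlq", depth); // hypothetical method
                }
            });
    }
}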
When a consumer picks up a message, SQS hides it from other consumers for the visibility timeout period. If your consumer doesn’t delete the message before the timeout expires, SQS makes it visible again and another consumer picks it up.
The default visibility timeout is 30 seconds. If your processing regularly takes longer than that, you’ll get spurious duplicates. Set the visibility timeout to at least 6x the expected processing time:
// Extend the visibility timeout when processing will take a while.
// Visibility and Acknowledgement are injected as listener method parameters.
@SqsListener(value = "batch-jobs", acknowledgementMode = SqsListenerAcknowledgementMode.MANUAL)
public void processBatch(BatchJob job, Visibility visibility, Acknowledgement ack) {
    // Buy another 60 seconds before starting the long-running work
    visibility.changeTo(60);
    doLongRunningWork(job);
    ack.acknowledge();
}
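If processing time is predictable across all your listeners, you can instead set the visibility at the container level. A minimal sketch, assuming Spring Cloud AWS 3.x and its messageVisibility container option (defining the bean replaces the auto-configured factory):

import java.time.Duration;
import io.awspring.cloud.sqs.config.SqsMessageListenerContainerFactory;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import software.amazon.awssdk.services.sqs.SqsAsyncClient;

@Configuration
public class SqsListenerConfig {

    @Bean
    public SqsMessageListenerContainerFactory<Object> defaultSqsListenerContainerFactory(
            SqsAsyncClient sqsAsyncClient) {
        return SqsMessageListenerContainerFactory.builder()
                .sqsAsyncClient(sqsAsyncClient)
                .configure(options -> options
                        // roughly 6x an expected ~30s processing time
                        .messageVisibility(Duration.ofMinutes(3)))
                .build();
    }
}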
Even with a correctly tuned visibility timeout, SQS standard queues can deliver the same message more than once. A consumer process that crashes after processing but before deleting the message will cause redelivery. Your consumer must tolerate this.
The standard pattern is to track processed message IDs in a durable store:
import java.time.Instant;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class IdempotentPaymentProcessor {

    private static final Logger log = LoggerFactory.getLogger(IdempotentPaymentProcessor.class);
    private final ProcessedMessageRepository processedMessages;
    private final PaymentRepository payments;

    public IdempotentPaymentProcessor(ProcessedMessageRepository processedMessages,
                                      PaymentRepository payments) {
        this.processedMessages = processedMessages;
        this.payments = payments;
    }

    @Transactional
    public void process(PaymentEvent event) {
        String messageId = event.getMessageId();
        if (processedMessages.existsById(messageId)) {
            log.info("Duplicate message {}, skipping", messageId);
            return;
        }
        payments.save(Payment.from(event));
        processedMessages.save(new ProcessedMessage(messageId, Instant.now()));
    }
}
The idempotency check and the business write must be in the same transaction. If your downstream is not transactional (e.g. a third-party HTTP call), use a conditional write or an outbox pattern instead.
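To make "conditional write" concrete, here is a sketch of the same processor using a raw insert via a Spring JdbcTemplate, assuming Postgres and a processed_message table keyed on message_id (hypothetical schema). ON CONFLICT DO NOTHING makes claiming the message atomic, so no read-then-check race is possible:

@Transactional
public void process(PaymentEvent event) {
    // Atomically claim the message; update count is 0 if it was already claimed
    int claimed = jdbcTemplate.update(
            "INSERT INTO processed_message (message_id, processed_at) " +
            "VALUES (?, ?) ON CONFLICT (message_id) DO NOTHING",
            event.getMessageId(), Timestamp.from(Instant.now()));
    if (claimed == 0) {
        return; // duplicate delivery: skip without error
    }
    payments.save(Payment.from(event));
    // a crash here rolls back both the claim and the write, so the
    // redelivered message is processed cleanly on the next attempt
}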
A poison message is one that always fails processing regardless of how many times it’s retried — usually because the payload is malformed or the system it references no longer exists.
With maxReceiveCount set, these route to the DLQ automatically. In your DLQ consumer, log the raw message body along with all headers before doing anything else:
@SqsListener("payment-events-dlq")
public void handleDeadLetter(
String rawBody,
@Header("ApproximateFirstReceiveTimestamp") String firstReceived,
@Header("ApproximateReceiveCount") String receiveCount) {
log.error("Dead letter received after {} attempts. First seen: {}. Body: {}",
receiveCount, firstReceived, rawBody);
alertingService.notifyDeadLetter("payment-events", rawBody);
}
Injecting the raw String rather than a typed payload means deserialization failures won’t prevent the DLQ handler from logging the problem.
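Once the consumer bug behind the dead letters is fixed, the messages can be redriven back to the source queue server-side with the SQS message-move-task API. A minimal sketch with the SDK v2 async client, reusing the DLQ ARN from the redrive policy above:

// Start a server-side move of all DLQ messages back to the queue each
// originally came from (omitting destinationArn means "the source queue")
sqsAsyncClient.startMessageMoveTask(req -> req
        .sourceArn("arn:aws:sqs:eu-west-2:123456789:payment-events-dlq"))
    .join();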
Finally, two operational gotchas. Watch ApproximateNumberOfMessagesNotVisible alongside queue depth: a large not-visible count with low throughput means your consumers are getting messages but not processing them fast enough. And don't use Thread.sleep() in consumers — it blocks the polling thread. Use SQS visibility timeout extension for rate limiting instead.

If you’re building AWS microservices and want messaging patterns that hold up under production load, get in touch.