Idempotent Event Consumer
Consumers must be able to process events idempotently. In general, an at-least-once delivery guarantee is assumed, which means that a consumer may receive the same event multiple times.
There are several reasons for this:
- Typically, consumers receive and acknowledge events in batches. If processing is interrupted, the same batch will be delivered again. If some events in the batch have already been processed, a non-idempotent consumer will process them again.
- Some brokers (including Kafka) support transactions, allowing receiving, acknowledging, and sending new events to occur atomically. However, it’s not that simple:
  - First, these transactions are very costly and lead to significantly higher latencies; in larger systems, those costs add up. They arise because processing no longer happens in batches: each event is transmitted individually, along with additional calls similar to a two-phase commit.
  - Second, this approach only works as long as a single technology is used; otherwise, a transaction manager is required (the difference between local and global transactions). Without one, events may be lost or published multiple times, bringing us back to the original problem. (See atomic event publication.)
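The batch-redelivery problem above can be made concrete with a minimal sketch (all names are illustrative): a consumer crashes mid-batch before acknowledging, the broker redelivers the whole batch, and a non-idempotent handler applies one event twice.

```python
# Simulated at-least-once delivery with a non-idempotent consumer.

balance = 0

def process(event):
    """Non-idempotent handler: applies a relative state change."""
    global balance
    balance += event["amount"]

batch = [{"id": 1, "amount": 10}, {"id": 2, "amount": 5}]

# First delivery: the consumer crashes after the first event,
# so the batch is never acknowledged.
process(batch[0])

# Redelivery: the broker sends the entire batch again.
for event in batch:
    process(event)

print(balance)  # 25, not the expected 15: event 1 was applied twice
```

The handler itself is correct for each single event; the duplication comes purely from the delivery semantics.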
Idempotent Business Logic
In most (or even all?) cases, business logic can be made idempotent. It’s important to distinguish between absolute and relative state changes.
- Absolute state changes: With an absolute state change, a value is typically replaced by a new one — such as an address. These operations are always idempotent.
- Relative state changes: A relative state change occurs when, for example, an amount is deducted from an account balance. Reprocessing the event would alter the balance again. Alternatively, an account balance can be modeled as a sequence of transactions. If the event ID is unique and part of the SQL schema, reprocessing would result in an error — this is essentially event sourcing.
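The event-sourcing variant can be sketched as follows (table and event names are illustrative, using SQLite for brevity): the balance is the sum of recorded transactions, and the event ID is a primary key, so a redelivered event is rejected by the schema instead of changing the balance again.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE account_transaction (
        event_id TEXT PRIMARY KEY,  -- unique event ID from the broker
        amount   INTEGER NOT NULL
    )
""")

def apply_event(event_id, amount):
    try:
        with db:  # one transaction per event
            db.execute(
                "INSERT INTO account_transaction (event_id, amount) VALUES (?, ?)",
                (event_id, amount),
            )
    except sqlite3.IntegrityError:
        pass  # duplicate delivery: the event was already applied

apply_event("evt-1", -50)
apply_event("evt-1", -50)  # redelivered; rejected by the primary key

balance = db.execute("SELECT SUM(amount) FROM account_transaction").fetchone()[0]
print(balance)  # -50: the duplicate did not change the balance
```

The relative change has effectively been turned into an absolute one: each event inserts exactly one immutable row.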
Remembering Processed Events
Alternatively, the IDs of processed events can be stored in the database. Before processing, a lookup can be performed to check whether the event has already been handled.
For consistency, it’s crucial that storing the event ID happens within the same transaction as all other domain changes.
A prerequisite for a simple solution without locking is that the same event cannot be processed in parallel by different consumers. With brokers like Kafka, this is guaranteed: within a consumer group, each partition is consumed by at most one consumer at a time.
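This pattern can be sketched as follows (table and event names are illustrative, using SQLite for brevity): the lookup, the domain change, and the recording of the event ID all happen in one transaction, so a crash can never leave the event marked as processed without its effect, or vice versa.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER NOT NULL);
    CREATE TABLE processed_event (event_id TEXT PRIMARY KEY);
    INSERT INTO account VALUES ('acc-1', 100);
""")

def handle(event_id, account_id, amount):
    with db:  # single transaction for lookup, domain change, and bookkeeping
        seen = db.execute(
            "SELECT 1 FROM processed_event WHERE event_id = ?", (event_id,)
        ).fetchone()
        if seen:
            return  # already handled: skip
        db.execute(
            "UPDATE account SET balance = balance + ? WHERE id = ?",
            (amount, account_id),
        )
        db.execute("INSERT INTO processed_event VALUES (?)", (event_id,))

handle("evt-7", "acc-1", -30)
handle("evt-7", "acc-1", -30)  # duplicate delivery: no effect

balance = db.execute(
    "SELECT balance FROM account WHERE id = 'acc-1'"
).fetchone()[0]
print(balance)  # 70
```

Note that the plain lookup-then-skip check is only safe under the no-parallel-processing prerequisite stated above; without it, the primary key on `processed_event` would still catch the race, but the handler would need to treat the constraint violation as a skip.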
Transactional Event Processing
Another alternative is the use of local (or global) transactions, accepting the associated drawbacks — assuming the broker supports transactions at all.
It’s important to note that when using multiple technologies, a transaction manager is required. Local transactions only work within a single technology — in this case the broker. That’s why exactly-once processing works in Kafka Streams: there, Kafka also takes on the role of the database, so all state changes stay within one technology.
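In Kafka Streams, this transactional mode is enabled through a single configuration property. A minimal sketch (application and broker addresses are illustrative; older Kafka versions use the value `exactly_once` instead):

```properties
# Kafka Streams configuration (excerpt)
application.id=account-processor
bootstrap.servers=localhost:9092
# Enables transactional, exactly-once processing:
processing.guarantee=exactly_once_v2
```

Everything else — transactional producers, consumer offsets committed within the transaction, and state-store changelogs — is handled by the library.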