Event sourcing sits at the core of many distributed systems: every change is captured as an immutable, append-only log of events.
And UUIDs?
They’re the identity layer of those events. The tags. The trail of breadcrumbs.
But while UUIDs provide uniqueness, their role in event sourcing is more nuanced than just “generate and forget.”
Let’s explore how UUIDs are used — and misused — in distributed event systems.
🔁 What UUIDs Do in Event Sourcing
In distributed event sourcing, UUIDs are commonly used for:
- Event IDs: globally unique reference for each change
- Idempotency keys: to detect and skip duplicate writes
- Correlation IDs: tracing a request across services
- Aggregate IDs: identifying the entity being updated
Used consistently, they help with:
- Detecting duplicate deliveries and replays
- Making retried writes safe to apply again
- Referring to the same event or entity consistently across nodes
📦 Common Patterns That Work Well
1. **UUIDs as Event Identity**
Each event gets a UUIDv4 or UUIDv7:

```json
{
  "event_id": "550e8400-e29b-41d4-a716-446655440000",
  "type": "UserCreated",
  "payload": { "user_id": "abc123" },
  "timestamp": "2024-07-15T14:00:00Z"
}
```

Assign the event_id once, when the event is created, and keep it with the event. It then stays constant across retries and node restarts, which is critical for deduplication.
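As a minimal sketch of that idea, the producer below builds the event once and reuses the same object on every send attempt. The names `new_event`, `publish_with_retry`, and the `send` callable are hypothetical stand-ins for whatever transport you use, not part of any particular library.

```python
import uuid
from datetime import datetime, timezone


def new_event(event_type: str, payload: dict) -> dict:
    """Build an event envelope; the ID is assigned exactly once, here."""
    return {
        "event_id": str(uuid.uuid4()),
        "type": event_type,
        "payload": payload,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


def publish_with_retry(event: dict, send, max_attempts: int = 3) -> int:
    """Call the (hypothetical) `send` callable until it succeeds.

    The same event, and therefore the same event_id, is reused on every
    attempt. Returns the number of attempts that were made.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            send(event)
            return attempt
        except ConnectionError as exc:  # treat as transient and retry
            last_error = exc
    raise RuntimeError("gave up publishing") from last_error
```

The point of the design is that ID generation lives outside the retry loop. Generating a fresh ID per attempt would make every retry look like a new event to consumers.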
2. **Idempotent Handlers with UUID Checking**
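Store processed event_ids in a side table or Redis:

```text
IF NOT EXISTS (SELECT 1 FROM processed_events WHERE event_id = ?) THEN
    INSERT INTO processed_events ...
    HANDLE EVENT ...
END IF
```

This makes consumers replay-safe, provided the check, the insert, and the handler's own writes happen atomically (for example, in one transaction with a unique constraint on event_id). Otherwise two concurrent deliveries of the same event can both pass the check.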
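One way to make the check-and-handle step atomic is to let a primary key do the duplicate detection inside the same transaction as the handler's writes. The sketch below uses Python's built-in `sqlite3` module with an in-memory database. The `users` table and the `make_store` and `handle_once` helpers are illustrative assumptions, not a prescribed schema.

```python
import sqlite3


def make_store() -> sqlite3.Connection:
    """Create an in-memory database with a dedup table and a demo table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE processed_events (event_id TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE users (user_id TEXT PRIMARY KEY)")
    return conn


def handle_once(conn: sqlite3.Connection, event: dict) -> bool:
    """Apply the event at most once. Returns True if it was applied."""
    with conn:  # one transaction: commits on success, rolls back on error
        cur = conn.execute(
            "INSERT OR IGNORE INTO processed_events (event_id) VALUES (?)",
            (event["event_id"],),
        )
        if cur.rowcount == 0:
            return False  # event_id already recorded: skip the handler
        # The handler's own write shares the transaction with the marker row.
        conn.execute(
            "INSERT INTO users (user_id) VALUES (?)",
            (event["payload"]["user_id"],),
        )
        return True
```

Because the marker row and the business write commit together, a crash between them cannot leave an event marked as processed without its effect, or the reverse.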
3. **Aggregate-Level UUIDs**
Aggregate roots (e.g. users, accounts, carts) can use UUIDs to:
- Partition event streams
- Ensure cross-service correlation
- Maintain uniqueness in global topics

For example:

```json
"user_id": "9f97b4af-8312-4fae-a3c5-76f7d819b2ab"
```

❌ Pitfalls and Anti-Patterns
1. **Duplicate UUID Generation**
You’d think UUIDs are always unique, but that guarantee depends on how they are generated. A UUIDv4 built on a weak RNG (like Math.random()), or a UUIDv1 built from the local clock + MAC address on cloned VMs or containers that share a MAC, can collide.
Fix: Always use a CSPRNG, and consider deterministic v5 UUIDs for idempotency across retries.
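To illustrate the deterministic option: a UUIDv5 is derived by hashing a namespace UUID together with a name, so the same inputs always yield the same ID. The sketch below uses Python's standard `uuid.uuid5`. The namespace string `events.example.com` and the `source:request_id` naming scheme are assumptions chosen for the example.

```python
import uuid

# Hypothetical application namespace: derive it once and keep it fixed.
EVENTS_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "events.example.com")


def deterministic_event_id(source: str, request_id: str) -> uuid.UUID:
    """Derive the event ID from stable inputs instead of from randomness."""
    return uuid.uuid5(EVENTS_NAMESPACE, f"{source}:{request_id}")


first_try = deterministic_event_id("checkout", "req-42")
retry = deterministic_event_id("checkout", "req-42")
other = deterministic_event_id("checkout", "req-43")
```

Here `first_try` and `retry` are equal, so a retried request maps to the same event ID even if the producer lost its state in between. This only works when the inputs uniquely describe the logical event.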
2. **Non-Sortable UUIDs in Ordered Logs**
UUIDv4 is fully random — which means no natural sort order.
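If your event store or a downstream index expects IDs to sort in creation order, UUIDv4 won’t help. (Brokers such as Kafka and Pulsar order messages by position within a partition, not by ID, so this mostly matters for storage keys and for consumers that sort or page by event ID.)
Fix: Use UUIDv7 or ULID for IDs that sort by creation time at millisecond precision. IDs created within the same millisecond are not guaranteed to sort in creation order unless the generator is monotonic.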
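To show why UUIDv7 sorts by time, here is a hand-rolled sketch of the RFC 9562 bit layout: a 48-bit Unix millisecond timestamp, a 4-bit version, 12 random bits, a 2-bit variant, and 62 more random bits. The function name `uuid7_sketch` is made up for this post. In production, prefer a maintained library (newer Python releases include `uuid.uuid7()`).

```python
import os
import time
import uuid
from typing import Optional


def uuid7_sketch(unix_ms: Optional[int] = None) -> uuid.UUID:
    """Build a UUIDv7-shaped value following the RFC 9562 layout."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")  # 80 bits from the OS CSPRNG
    rand_a = rand >> 68               # top 12 random bits
    rand_b = rand & ((1 << 62) - 1)   # low 62 random bits

    value = (unix_ms & ((1 << 48) - 1)) << 80  # timestamp in the top 48 bits
    value |= 0x7 << 76                         # version 7
    value |= rand_a << 64
    value |= 0b10 << 62                        # RFC variant bits
    value |= rand_b
    return uuid.UUID(int=value)


earlier = uuid7_sketch(1_000)
later = uuid7_sketch(2_000)
```

Because the timestamp occupies the most significant bits, `str(earlier) < str(later)` holds whenever the timestamps differ. Two IDs from the same millisecond fall back to random order in this sketch, since it does not implement a monotonic counter.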
3. **Timestamp Confusion**
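Event consumers may assume UUID timestamp == event timestamp. But:
- UUIDv1 embeds time, as a count of 100-nanosecond intervals since 1582-10-15 rather than Unix time
- UUIDv4 doesn’t embed time at all
- UUIDv7 embeds Unix time in milliseconds, but that is when the ID was generated, which may not be when the event occurred
Fix: Always store an explicit created_at timestamp in ISO 8601, separate from the UUID.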
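For a sense of what "interpreting" a UUID timestamp involves, the sketch below decodes the embedded time from a UUIDv7 and returns nothing for other versions. The helper name `embedded_time` is hypothetical, and the sample v7 value is the UUIDv7 test vector published in RFC 9562.

```python
import uuid
from datetime import datetime, timezone
from typing import Optional


def embedded_time(u: uuid.UUID) -> Optional[datetime]:
    """Return the time embedded in a v7 UUID, or None for other versions."""
    if u.version != 7:
        return None  # v4 carries no time; v1 uses a different encoding
    unix_ms = u.int >> 80  # the top 48 bits are Unix milliseconds
    return datetime.fromtimestamp(unix_ms / 1000, tz=timezone.utc)


v7 = uuid.UUID("017f22e2-79b0-7cc3-98c4-dc0c0c07398f")
v4 = uuid.UUID("550e8400-e29b-41d4-a716-446655440000")

print(embedded_time(v7))  # 2022-02-22 19:22:22+00:00
print(embedded_time(v4))  # None
```

Every consumer would need version-aware logic like this just to read a time, and the result is still the ID's generation time. An explicit timestamp field avoids both problems.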
4. **Truncated UUIDs in Cache Keys**
Some teams shorten UUIDs for performance:
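```text
cache_key = "user:" + uuid[0..8]
```

Which works... until you have collisions. A prefix of eight hex characters carries only 32 bits, so with random UUIDv4s a collision becomes more likely than not at around 77,000 keys. With UUIDv7 or ULID the leading characters are timestamp bits, so IDs created close together share the same prefix.
Fix: Use full UUIDs. If space is a concern, hash the full UUID and keep enough bits for your key count. Never truncate blindly.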
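To put numbers on the truncation risk, the sketch below applies the standard birthday-bound approximation, assuming the retained bits are uniformly random (true for a UUIDv4 prefix, not for a timestamp prefix). The function name `collision_probability` is made up for this example.

```python
import math


def collision_probability(n_keys: int, bits: int) -> float:
    """Approximate chance that at least two of n_keys random IDs collide.

    Uses the birthday bound: P ~ 1 - exp(-n(n-1) / 2^(bits+1)).
    """
    exponent = n_keys * (n_keys - 1) / 2 ** (bits + 1)
    return -math.expm1(-exponent)  # numerically stable 1 - exp(-x)


# 8 hex characters keep 32 bits; a full UUIDv4 has 122 random bits.
truncated = collision_probability(77_163, 32)
full_v4 = collision_probability(77_163, 122)
```

With these inputs `truncated` comes out at roughly 0.5, while `full_v4` is vanishingly small. The same function can be used to pick how many bits to keep if you shorten a hash of the UUID.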
🛠️ Best Practices
| Principle | Recommendation |
|---|---|
| Event ID format | UUIDv7 or ULID for sortability |
| Aggregate identity | UUIDv4 or namespaced UUIDv5 |
| Replay protection | Store processed event_ids |
| Message ordering | Use timestamp-based UUIDs, not v4 |
| Collision prevention | CSPRNG or deterministic UUIDv5 |
| Logs and traces | Correlate with UUID per request or workflow |
🧪 Sample UUID Strategy
Here’s a JSON event model that uses UUIDs effectively:

```json
{
  "event_id": "01H8TVF3YVRXN6BGC36FXCX8YT",
  "aggregate_id": "9a4cfe23-cd5d-4d20-a2e4-66efb4303a1a",
  "type": "OrderPlaced",
  "payload": {
    "order_id": "O-12345",
    "amount": 99.99
  },
  "occurred_at": "2024-07-15T13:45:00.000Z"
}
```

Notes:
- event_id is a ULID (sortable + compact)
- aggregate_id is a UUIDv4
- occurred_at is explicit and canonical
Final Thoughts
UUIDs bring order to chaos in event-driven systems — but only if you use them with intention.
- Don't rely on "random = safe"
- Know your UUID version
- Design your identifiers like they’re part of your architecture — because they are
🎯 In event sourcing, identity is everything. And UUIDs? They're the passports your events use to move through time, space, and system boundaries.
