Event Sourcing for mere mortals

It’s a safe bet that every developer has heard of Event Sourcing. Love it, hate it, embrace it, ignore it … as with everything else, there is a wide spectrum of reactions. This article is for those who don’t really understand what it is. It does not aim to be scientifically rigorous academic research, but is rather a collection of practical simplifications. If you have that itchy feeling that Event Sourcing can help you but you’re not exactly sure how – keep reading.

Authors: Lucas Campos & Milen Dyankov

There are so many terms that follow the [event][space | dash][fancy word] pattern that it’s next to impossible to remember them all, let alone tell the differences. Event Sourcing is probably the least understood one among them. At the time of writing, even Wikipedia can’t tell you what it is.

Simply put, Event Sourcing is an architectural pattern in which decision-making components do not store and update their current state but (re-)build it on demand from historical events kept in an event store.

If that’s too abstract, let’s try an example. How much money do you have in your wallet right now? Most people don’t have that figure stored in their minds. But they may recall that at some memorable point in time they had X, then at another memorable point they gained or spent Y, and before they know it they have worked out what they have now. They can reconstruct the state of the wallet from a series of events. Yep, that’s what Event Sourcing is in a nutshell.
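To make that concrete, here is a minimal, self-contained Java sketch of the wallet idea. The names (MoneyEvent, currentBalance) are ours, invented for illustration – the point is simply that the balance is never stored, only computed by replaying events:

```java
import java.util.List;

public class Wallet {
    // Each event records one change to the amount of money.
    record MoneyEvent(String description, long amountInCents) {}

    // The current state is not stored anywhere; it is (re-)built
    // on demand by replaying all past events in order.
    static long currentBalance(List<MoneyEvent> history) {
        return history.stream().mapToLong(MoneyEvent::amountInCents).sum();
    }

    public static void main(String[] args) {
        List<MoneyEvent> history = List.of(
            new MoneyEvent("withdrew cash", 5000),   // had X
            new MoneyEvent("bought coffee", -350),   // spent Y
            new MoneyEvent("found a coin", 200));
        System.out.println(currentBalance(history)); // prints 4850
    }
}
```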

Don’t confuse it with event streaming, which refers to the ability to use events to communicate asynchronously between modules, services, systems and actors. Event streaming can be (and often is) a nice foundation on top of which Event Sourcing is implemented. But the sole fact that something can send and react to events does not mean it can reliably (re-)construct its state from them. People will often proudly talk about implementing an event-driven architecture (EDA) relying on event streaming. While that’s an achievement of its own, it’s not necessarily an Event Sourcing based solution.

What it is good for

Consider the following table showing two different ways of storing information that describes the outcome of a user’s actions.

USER ACTION: Create shopping cart
  Storing state:    cart: { products: [] }
  Storing history:  events: [
                      created_cart: {id: 1}
                    ]

USER ACTION: Add product A to the cart
  Storing state:    cart: { products: [A] }
  Storing history:  events: [
                      created_cart: {id: 1},
                      added_product: {cart_id: 1, product: A}
                    ]

USER ACTION: Add product B to the cart
  Storing state:    cart: { products: [A, B] }
  Storing history:  events: [
                      created_cart: {id: 1},
                      added_product: {cart_id: 1, product: A},
                      added_product: {cart_id: 1, product: B}
                    ]

USER ACTION: Remove product A from the cart
  Storing state:    cart: { products: [B] }
  Storing history:  events: [
                      created_cart: {id: 1},
                      added_product: {cart_id: 1, product: A},
                      added_product: {cart_id: 1, product: B},
                      removed_product: {cart_id: 1, product: A}
                    ]

If your ONLY concern is the checkout process, storing state will look not only sufficient but also more concise and better optimized. Perhaps the biggest (yet solvable) technical challenge here is proper synchronization, due to the mutable state.

The business issue with it could be far more serious, though. Storing state ignores, and loses forever, potentially valuable information (in this case, the fact that the user switched from product A to product B). Irrelevant to the checkout process, that fact may be priceless for purchase and user-behaviour analysis. But if you don’t already have it by the time you need it, there is nothing you can do about it.

Storing events (history), on the other hand, is an append-only, immutable process. While far more verbose, it provides the full picture to everyone interested – not only the picture you care about now, but also one that someone may care about in the future. It can be used to compute different states (as we’ll see in Aggregates and Projections below) representing any given point in time.

Essential concepts

Now that you know Event Sourcing can be super handy (mostly to your future self), all you have to do is figure out how to store and make sense of zillions of events. That’s not a joke – one of AxonIQ’s customers has been storing 1 billion events per day for a few years now. It may sound outrageous at first, but there is method to this madness. The following concepts are the essential ingredients that make it all work.

Aggregates (a.k.a. scope and boundaries)

An aggregate is a conceptual building block that groups together and encapsulates a collection of domain objects that must be treated as a whole when modified. It defines the scope and the boundaries in which a state changing decision happens.

Let’s reuse the money example to illustrate it. Say you want to purchase something and to make that decision you only need to establish from all past events how much money you have now. Notice that there may be tons of other events in the same timeframe but they are all out of the scope of our decision making process. Here is sample Java code that aggregates the objects that may change as a result of that decision (Listing 1).

public class Belongings {
  private User user;
  private List<Thing> things;
  private Money money;

  public Belongings(User user) {
    this.user = user;
  }

  // Command Handler: decides whether the request is valid and,
  // if so, announces the decision by publishing an event.
  public void handleRequest(PurchaseThing request) {
    if (haveEnoughMoney(request, money)) {
      sendEvent(new ThingPurchased(new Date(), request.thing, request.price));
    }
  }

  // Event Sourcing Handler: (re-)builds the current state
  // by applying past events one by one.
  public void handleEvent(BelongingsRelevantEvent event) {
    updateThingsStateFrom(event);
    updateMoneyStateFrom(event);
  }
}
Listing 1.

Before you can interact with that aggregate, you need to get an instance of it from somewhere. For now, simply imagine that there is a MagicHelper that can construct the instance for you from all relevant past events, as in Listing 2.

public class MagicHelper {
  public static Belongings pullBelongingsFromTheHat(User user) {
    Belongings belongings = new Belongings(user);
    // Fetch all past events relevant to this user, in the order
    // they occurred, and replay them to (re-)build the state.
    List<BelongingsRelevantEvent> events = eventStore.eventsFor(user);
    events.forEach(belongings::handleEvent);
    return belongings;
  }
}
Listing 2.

You can call MagicHelper.pullBelongingsFromTheHat(user) to get an instance of the Aggregate with its state already constructed from the relevant past events. Don’t be fooled by the three-line sample. As things evolve, you’ll realize the poor MagicHelper will have to learn far more magic tricks. We’ll get back to that in a minute. For now, let’s focus on the Aggregate.

The user in the above example is what is known as an Aggregate Identifier and allows you to request a specific instance of the aggregate.

The handleEvent(…) is called an Event Sourcing Handler and is responsible for constructing the current state from events. Every time you load an Aggregate, it will start as an empty instance and then it will handle all the past events in the same order they occurred.

The PurchaseThing is a Command. It’s just an intent to do something that would change the state. It’s the Command Handler that is responsible for deciding if it is a valid request or not.

The handleRequest(…) is called a Command Handler. It is the way to interact with the aggregate. It is responsible for making decisions and announcing the result by publishing an event (or a few). It never modifies the state directly.

The ThingPurchased object (note the usage of past tense in the name) is the Event representing the result of the decision made. In this example it is one of potentially many relevant event types represented by the generic BelongingsRelevantEvent type. It will be stored in the Event Store and stay there forever.
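Since an Event is an immutable fact, a Java record is a natural fit for modelling one. The sketch below is purely illustrative – the field types are simplified placeholders rather than proper domain types:

```java
import java.time.Instant;

public class EventExample {
    // An Event is an immutable fact: a record has no setters,
    // so once stored it can never change.
    public record ThingPurchased(Instant occurredAt, String thing, long priceInCents) {}

    public static void main(String[] args) {
        ThingPurchased event = new ThingPurchased(Instant.now(), "book", 1999);
        System.out.println(event.thing() + " for " + event.priceInCents() + " cents");
    }
}
```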

Projections (a.k.a. States, a.k.a. View Models)

At this point you may be thinking: “All that work just to be able to tell how much money I have?”. Fair point. Except, state-based approaches make it impossible to answer the questions you’ll have tomorrow, remember? For example, you may want to know on which weekdays you spend the most money. That’s a whole different state. Luckily, it can be built from the events. Both states look at the same data from different angles, which is why they are called Projections or View Models.

You are free to create as many Projections as you want, any time you want, and store the data anywhere you want. Take, for example, the code in Listing 3.

// Event Handler: updates the projection whenever a relevant event arrives.
public void handle(ProductPurchasedEvent event) {
  SpendingsPerWeekDayEntity entity = entityManager.get(...);
  entity.addAmount(event.date.getDay(), event.price);
  entityManager.persist(entity);
}
Listing 3.

This demonstrates an Event Handler that stores a Projection in a database. Since the Projection’s state is persistent, something needs to track which events have been processed by whom, and deliver only new ones. Here, too, we’ll likely need to teach the MagicHelper a few more magic tricks.

As tempting as it is, avoid using Projections for making decisions in Aggregates. Remember that Projections are biased representations of the data and the only source of truth is the event store.

Meet the MagicHelper

With the essential concepts covered, it’s now time to pay long-overdue attention to the MagicHelper and its friends. As you’ve probably guessed, those are the really hard pieces to write in a bulletproof and scalable way.

One way would be to use a combination of an event streaming platform (such as Kafka) and a database (such as MySQL), and then write the event storing, retrieving, filtering and so on yourself to build your very own MagicHelper.

Another option is to “adopt” one. For example Axon Framework offers convenient annotations like:

  • @Aggregate
  • @AggregateIdentifier
  • @CommandHandler
  • @EventSourcingHandler
  • @EventHandler

which represent the very same concepts we’ve described above. Once you annotate your Aggregate, you can interact with it via CommandGateway. Given a single construct like:

 commandGateway.send(new PurchaseThing(id, ...))

Axon Framework will:

  • find the Aggregate that can handle the command;
  • create new object instances for the respective Aggregate Identifier;
  • call their Event Sourcing Handlers to rebuild their state;
  • call their Command Handlers to process the Command;
  • receive the Events they send and store them in the Event Store.

Pretty cool for a free MagicHelper, isn’t it? It makes the process of writing event sourced applications a well organized and pleasant journey. But before you sign up for it, be warned.

“There be dragons” – know your spells

Storing events is easy. Fetching the relevant ones and keeping track of what has been processed is hard. Using a (No)SQL database may work, but a scalable Event Store combined with a reliable Message Bus is the better option. That’s exactly what Axon Server is.

Eventually, an aggregate may have too many relevant events to process them one by one every time. That’s when a Snapshot (the reconstructed state of the Aggregate at some point in time) becomes useful. With Axon Framework you can create a SnapshotTriggerDefinition to instruct your Aggregates when to take Snapshots.
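The idea behind a Snapshot can be sketched in plain Java like this (illustrative names, not the Axon API): start from the stored snapshot and replay only the events that occurred after it.

```java
import java.util.List;

public class SnapshotSketch {
    record MoneyEvent(long amountInCents) {}
    // A snapshot remembers the reconstructed state and the index
    // of the last event it already includes.
    record Snapshot(long balance, int lastEventIndex) {}

    // Instead of replaying the full history, replay only events
    // newer than the snapshot.
    static long balanceFrom(Snapshot snapshot, List<MoneyEvent> allEvents) {
        long balance = snapshot.balance();
        for (int i = snapshot.lastEventIndex() + 1; i < allEvents.size(); i++) {
            balance += allEvents.get(i).amountInCents();
        }
        return balance;
    }

    public static void main(String[] args) {
        List<MoneyEvent> events = List.of(
            new MoneyEvent(5000), new MoneyEvent(-350), new MoneyEvent(200));
        // Snapshot taken after the first two events (balance 4650).
        Snapshot snap = new Snapshot(4650, 1);
        System.out.println(balanceFrom(snap, events)); // prints 4850
    }
}
```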

Sometimes you’ll need to update a Projection. Say you don’t only want the sum per weekday but also the count. You can, of course, create a new one. But it may be more convenient to change and reset an existing one, then Replay events to build the new state. Axon Framework offers you fine-grained control over that process via Processing Groups and Tracking Tokens.
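A Replay can be sketched in plain Java as follows (again, illustrative names rather than the Axon API): the projection starts from an empty state and re-handles every past event, this time recording both the sum and the count per weekday.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ReplaySketch {
    record Purchase(int weekday, long priceInCents) {} // 1 = Monday ... 7 = Sunday

    // Rebuild the updated projection from scratch: a replay starts
    // from an empty state and re-handles every event in order.
    static Map<Integer, long[]> rebuildProjection(List<Purchase> events) {
        Map<Integer, long[]> perWeekday = new HashMap<>();
        for (Purchase p : events) {
            long[] sumAndCount = perWeekday.computeIfAbsent(p.weekday(), d -> new long[2]);
            sumAndCount[0] += p.priceInCents(); // the sum we already had
            sumAndCount[1] += 1;                // the count we now also want
        }
        return perWeekday;
    }

    public static void main(String[] args) {
        var events = List.of(
            new Purchase(1, 1000), new Purchase(1, 500), new Purchase(6, 9900));
        var projection = rebuildProjection(events);
        System.out.println(projection.get(1)[0] + " cents over "
            + projection.get(1)[1] + " Monday purchases"); // 1500 cents over 2
    }
}
```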

As your system evolves, your Events might also evolve. Since they are facts, you can’t just modify them. But you can transform (Upcast) “old” events on delivery. With Axon Framework registering custom Upcasters is trivial.
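The idea behind Upcasting can be sketched in plain Java like this (illustrative event shapes, not the Axon upcaster API): the stored event is never modified, it is transformed on its way to the handler.

```java
public class UpcastSketch {
    // "Old" event revision: price stored as a plain double.
    record ThingPurchasedV1(String thing, double price) {}
    // "New" event revision: price in cents plus an explicit currency.
    record ThingPurchasedV2(String thing, long priceInCents, String currency) {}

    // The stored V1 event stays untouched in the event store;
    // handlers only ever see the upcasted V2 shape.
    static ThingPurchasedV2 upcast(ThingPurchasedV1 old) {
        return new ThingPurchasedV2(old.thing(), Math.round(old.price() * 100), "EUR");
    }

    public static void main(String[] args) {
        ThingPurchasedV2 upcasted = upcast(new ThingPurchasedV1("book", 19.99));
        System.out.println(upcasted.priceInCents()); // prints 1999
    }
}
```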

Another important aspect is sensitive data. According to GDPR and other laws, every user has the right to be forgotten. But no event, whether it contains user data or not, can ever be removed. AxonIQ’s Data Protection Module provides a solution to these conflicting requirements based on Crypto-shredding.
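The gist of Crypto-shredding can be sketched in plain Java as follows: personal data inside events is encrypted with a per-user key kept outside the immutable event store, and “forgetting” a user means deleting that key. This is a deliberately simplified illustration – the default AES mode and the in-memory key vault used here are not production choices.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import java.util.HashMap;
import java.util.Map;

public class CryptoShreddingSketch {
    // One encryption key per user, stored OUTSIDE the event store.
    static final Map<String, SecretKey> keyVault = new HashMap<>();

    static byte[] encryptForUser(String userId, byte[] personalData) throws Exception {
        SecretKey key = keyVault.computeIfAbsent(userId, id -> newKey());
        Cipher cipher = Cipher.getInstance("AES");
        cipher.init(Cipher.ENCRYPT_MODE, key);
        return cipher.doFinal(personalData);
    }

    static byte[] decryptForUser(String userId, byte[] stored) throws Exception {
        SecretKey key = keyVault.get(userId);
        if (key == null) throw new IllegalStateException("user was forgotten");
        Cipher cipher = Cipher.getInstance("AES");
        cipher.init(Cipher.DECRYPT_MODE, key);
        return cipher.doFinal(stored);
    }

    // "Right to be forgotten": the event stays, the key does not,
    // so the personal data inside it becomes unreadable forever.
    static void forgetUser(String userId) {
        keyVault.remove(userId);
    }

    private static SecretKey newKey() {
        try {
            return KeyGenerator.getInstance("AES").generateKey();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] stored = encryptForUser("user-1", "Jane Doe".getBytes());
        System.out.println(new String(decryptForUser("user-1", stored))); // Jane Doe
        forgetUser("user-1");
        // 'stored' is untouched, but its personal data can no longer be read.
    }
}
```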

Summary

There are some applications that will never benefit from Event Sourcing. The vast majority, though, would benefit in one way or another. “Then why is it not everywhere?” you may ask. There are several reasons. It’s a bit more work than simply saving state. Most developers focus on getting something out today – the past is not interesting, and future needs belong to the future. Also, most people have learned to make peace with forever-lost things (data included), especially when there is a good excuse like “we didn’t know you’d need that at the time we wrote the code”.

It’s all costs versus benefits. Hopefully this article has demonstrated that the cost is not as high as often perceived, and that the benefits are often underestimated.

BIOGRAPHIES

Lucas Campos is a Java Developer at AxonIQ aiming at spreading knowledge while learning as much as possible.

Milen Dyankov is a Developer Advocate at AxonIQ on a mission to help fellow Java developers around the globe design and build clean, modular and future proof software!