Add git-like functionality to your application using Event Sourcing


In this post we’ll implement a tool for managing products using event sourcing, node.js and neo4j. Event sourcing will enable us to support neat git-like features such as versioning, branching and merging that would be very hard to support with a more traditional approach. We’ll use neo4j because, as you’ll see below, storing our data as a graph fits the application perfectly and will allow us to run queries more efficiently than with other types of databases (RDBMS, document, key-value, etc.).

Example

The idea is to create a web service for managing products for an eCommerce website like Amazon. The user can create and manage categories and products. The application is very simple, but it will be enough for demonstrating the concepts and ideas explained below. I hope that you’ll find the implementation useful too.

Example


Event Sourcing

The typical way one would create an application like this is by implementing and exposing CRUD functionality for both categories and products. We would have endpoints for creating products, adding categories, deleting records, etc. While for most applications this would be a perfectly adequate solution, for this example we’ll use event sourcing instead.

In short, the idea of event sourcing is to represent changes to the application state as events. These events are immutable and are stored sequentially in some sort of event store for the entire life of the application. For example, if we wanted to delete a product, we would append a Delete Product event containing the id of the product to be deleted. In order to access the actual state of the application, we need to build it by running through all the events in sequential order and applying them. In functional terms, this would be a left fold of all events. For the example in the image above, our event store may look something like this:

Event Store
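To make the left-fold idea concrete, here is a minimal sketch in plain JavaScript. The event names follow the conventions used in this post, but the handler shapes and sample data are illustrative:

```javascript
// Replaying the event log is a left fold: start from an empty state and
// apply each event in order. Sample events and handlers are illustrative.
const log = [
  { name: 'AddCategoryEvent', categoryId: '1234', categoryName: 'Phones' },
  { name: 'AddProductEvent', productId: 'p1', categoryId: '1234', productName: 'Acme Phone' },
  { name: 'RemoveProductEvent', productId: 'p1' },
];

const handlers = {
  AddCategoryEvent: (state, e) =>
    ({ ...state, categories: { ...state.categories, [e.categoryId]: e.categoryName } }),
  AddProductEvent: (state, e) =>
    ({ ...state, products: { ...state.products, [e.productId]: e.productName } }),
  RemoveProductEvent: (state, e) => {
    const { [e.productId]: removed, ...products } = state.products;
    return { ...state, products };
  },
};

// The fold itself: Array.prototype.reduce over the whole log.
function replay(events) {
  return events.reduce((state, e) => handlers[e.name](state, e),
                       { categories: {}, products: {} });
}
```

After replaying the three events above, the state contains the ‘Phones’ category and no products, since the product was added and then removed.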

Implementation

For this demo we have five main event types: ‘Add Category’, ‘Add Product’, ‘Remove Category’, ‘Remove Product’ and ‘Edit Product Attribute’. Each event type is represented as a class. Here is the code for the Add Category event:

const Category = require('../models/Category');

class AddCategoryEvent {
    constructor(inventory, categoryId, categoryName) {
        this.inventory = inventory;
        this.categoryId = categoryId;
        this.categoryName = categoryName;
        this.name = 'AddCategoryEvent';
        this.parent = null;
    }

    process() {
        this.inventory.addCategory(new Category(this.categoryId, this.categoryName));
    }
}

module.exports = AddCategoryEvent;

Each event class has a process method, which is called when the list of events is replayed in order to rebuild the actual state of the system. Each event is serialized as JSON and stored as a node in neo4j. New nodes are connected to their parent node by a directed edge. Here is what it looks like on the neo4j console:

Events in neo4j

Having our events stored like that, sequentially, one after the other, makes the retrieval process extremely simple and efficient.
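As a sketch, the whole chain leading to a given event can be pulled with a single path query. The `Event` label and `APPEND` relationship match the structure shown above, while the helper itself and the parent-to-child direction of the edges are assumptions:

```javascript
// Hypothetical helper: follow the APPEND edges from the root down to the
// given event and return the serialized events in replay order.
// `session` is assumed to be an open neo4j-driver session.
const chainQuery = `
  MATCH path = (root:Event)-[:APPEND*0..]->(leaf:Event)
  WHERE ID(leaf) = $leafId AND NOT ()-[:APPEND]->(root)
  RETURN nodes(path) AS events`;

function fetchEventChain(session, leafId) {
  return session
    .run(chainQuery, { leafId })
    .then(result => result.records[0].get('events').map(n => n.properties));
}
```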

The web server exposes an API endpoint that allows clients to insert new events and, optionally, attach them to an existing node. Here is an example of the payload:

{
   "events":[
      {
         "name":"AddCategoryEvent",
         "categoryName":"Phones",
         "categoryId":"1234"
      }
   ],
   "parentId":1
}
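A hypothetical handler for this payload might validate the batch before storing it. The event-type names below are extrapolated from the five types listed earlier, and `insertEvents` is a stand-in for the repository call:

```javascript
// Validate an incoming payload like the one above, then hand the events
// to the store. The set of type names mirrors the post's five event types.
const KNOWN_EVENTS = new Set([
  'AddCategoryEvent', 'AddProductEvent', 'RemoveCategoryEvent',
  'RemoveProductEvent', 'EditProductAttributeEvent',
]);

function handleAppend(body, insertEvents) {
  if (!Array.isArray(body.events) || body.events.length === 0) {
    throw new Error('payload must contain a non-empty events array');
  }
  for (const e of body.events) {
    if (!KNOWN_EVENTS.has(e.name)) {
      throw new Error(`unknown event type: ${e.name}`);
    }
  }
  // parentId is optional: when omitted, append to the latest event.
  return insertEvents(body.events, body.parentId ?? null);
}
```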

Versioning

Since we’re storing all historical changes as individual events, we can very easily see what our data looked like at any point in time. Instead of replaying all events up to the last one, we can replay them up to any arbitrary point.

This ability to see what the state of the system was at any point in time is a very good use case for event sourcing. Implementing features like unlimited undo/redo becomes trivial.

In order to support this functionality, our web service has an endpoint that receives an event id and returns the resulting state that we get from replaying all the corresponding events up to the input event.
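In sketch form, “the state at event N” is just a truncated replay. Here `applyEvent` stands in for calling each event’s process method:

```javascript
// Replay only the prefix of the log that ends at `eventId`.
function stateAt(events, eventId, applyEvent, initialState) {
  const index = events.findIndex(e => e.id === eventId);
  if (index === -1) throw new Error(`event ${eventId} not found`);
  return events.slice(0, index + 1).reduce(applyEvent, initialState);
}
```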

Branching

So far we’ve only appended new events to the last event in the chain. However, because of the way we’ve implemented things, there is nothing preventing us from appending events to any event in the chain. This would allow us to support branching: we could have different versions of the application state co-existing at the same time. Maybe different users want to work on their own version of the data. Or maybe we want to add events without affecting the latest stable version.

This branching feature is the reason why I chose neo4j for the event store. Once we start branching out, our chain of events becomes a tree of events, which is straightforward to model and query in a graph database.
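As a small illustration of how naturally the tree queries fall out, listing every branch tip (an event that no other event has been appended to) is a one-pattern Cypher match. The helper and the parent-to-child direction of the `APPEND` edges are assumptions:

```javascript
// Hypothetical helper: branch tips are events with no outgoing APPEND edge.
const branchTipsQuery = `
  MATCH (tip:Event)
  WHERE NOT (tip)-[:APPEND]->(:Event)
  RETURN ID(tip) AS id, tip.name AS name`;

function listBranchTips(session) {
  return session.run(branchTipsQuery).then(result =>
    result.records.map(r => ({ id: r.get('id'), name: r.get('name') })));
}
```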

If you need similar branching functionality in your application, do consider using event sourcing.

Merging

Now that we can have multiple branches, it would be great to be able to merge branches, similar to what git does.

In this case, the user needs to choose two events (or branches): the source and the target. The idea is to find all the events that the source has but the target doesn’t, optionally process them, and then append them to the target event.

Finding the events

We need to find all the events (or changes) that precede the source event and that are missing from the target event. You’ll recall that the events in all branches form a tree. The first thing we need to find is the Lowest Common Ancestor (LCA) of the source and target events. The events we’re looking for are then all the events between the LCA event (exclusive) and the source event (inclusive).

Fortunately, Cypher, neo4j’s query language, is expressive enough that we can do all that in one query:

...
    static getEventsforMerge(eventId, eventFromId) {
        // e1 is the target event, e2 is the source event and x is a common
        // ancestor of both; e collects the events between x (excluded)
        // and e2 (included).
        const command = `MATCH (e1:Event)<-[:APPEND*0..]-(x:Event)-[:APPEND*0..]->(e2:Event)
        MATCH (x)-[:APPEND*0..]->(e:Event)-[:APPEND*0..]->(e2)
        WHERE ID(e1) = ${eventId} AND ID(e2) = ${eventFromId} AND ID(e) <> ID(x)
        RETURN DISTINCT e`;

        return session.run(command)
            .then(result => result.records.map((record) => record.get(0).properties));
    }
...

The query finds the LCA event and returns the corresponding list of events that we need.

Merging the events

Now that we have the events, we need to append them to the target event. For this demo, that is exactly what we do: just append the events. However, for most applications, before adding the events we’ll probably need to check for conflicts (similar to what git does) or inconsistencies. What happens if the merge tries to edit a product that has been deleted in the target branch? Or if it modifies a value that has also been modified in the target branch? Which value is the right one? What types of conflicts might arise, and how to resolve them, depends on the application. In some cases the application will be able to resolve the conflicts automatically, while in others user interaction will be needed.
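As one possible sketch of such a check (the rule and the event shapes are illustrative, not the demo’s code), we can flag incoming events that edit a product the target branch has deleted:

```javascript
// Flag incoming events that edit a product which the target branch's
// events have deleted. Event type names follow the post's conventions.
function findConflicts(incomingEvents, targetEvents) {
  const deleted = new Set(
    targetEvents
      .filter(e => e.name === 'RemoveProductEvent')
      .map(e => e.productId)
  );
  return incomingEvents.filter(
    e => e.name === 'EditProductAttributeEvent' && deleted.has(e.productId)
  );
}
```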

Here you can see the source code for the method that performs the merge. It finds the relevant events (as described above) and appends them to the target event:

...
   static mergeEvents(eventId, eventFromId) {
        // Find the events missing from the target branch and append them
        // to the target event.
        return EventRepository.getEventsforMerge(eventId, eventFromId)
            .then((events) => new Promise((resolve) => {
                Controller.insertEvents(events, eventId, resolve);
            }));
    }
...

Mobile applications

This pattern of branching and merging is a very good fit for mobile applications that need to work online as well as offline and especially for applications that allow collaboration between users. Every time a user changes data in the app, the corresponding events will be processed locally and also stored in an internal buffer or queue of events. Whenever the app is online, the events will be synced with the server. If the app happens to be offline, the events in the buffer will stay there until the app goes back online and they can be synced with the server. The server, which might get requests from multiple clients working on the same data, will handle the merging logic.
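A client-side buffer along these lines can be sketched in a few lines. Here `sendToServer` and the online check are assumptions standing in for real I/O:

```javascript
// Sketch of a client-side event buffer: events apply locally right away
// and are queued; flush() pushes the queue to the server when online.
class EventBuffer {
  constructor(sendToServer) {
    this.queue = [];
    this.sendToServer = sendToServer;
  }

  record(event, applyLocally) {
    applyLocally(event);    // optimistic local update
    this.queue.push(event); // remember it for the next sync
  }

  async flush() {
    if (this.queue.length === 0) return 0;
    const batch = this.queue.splice(0, this.queue.length);
    try {
      await this.sendToServer(batch); // the server merges these into the tree
      return batch.length;
    } catch (err) {
      this.queue.unshift(...batch); // keep the events for the next retry
      throw err;
    }
  }
}
```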

Memoization

You can use memoization to avoid replaying all past events every time you need to rebuild the current state of the system. You can create a checkpoint every X events, where each checkpoint stores the application state folded up to that point; a rebuild then only replays the events after the nearest checkpoint. You can also keep an up-to-date cache of the application state in memory: every time there is a new event, you append it to the event store and also apply it to the in-memory copy.
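Here is a minimal sketch of that checkpointing idea. `applyEvent` stands in for processing a single event, and the snapshots are assumed to be immutable values (or copies) so later replays don’t mutate them:

```javascript
// Build a snapshot of the folded state every `interval` events.
function makeCheckpoints(events, applyEvent, initialState, interval = 100) {
  const checkpoints = new Map(); // prefix length -> state after that prefix
  let state = initialState;
  events.forEach((event, i) => {
    state = applyEvent(state, event);
    if ((i + 1) % interval === 0) checkpoints.set(i + 1, state);
  });
  return checkpoints;
}

// Rebuild from the largest usable checkpoint instead of replaying from scratch.
function rebuildFromCheckpoints(events, checkpoints, applyEvent, initialState) {
  let start = 0;
  let state = initialState;
  for (const [count, snapshot] of checkpoints) {
    if (count <= events.length && count > start) {
      start = count;
      state = snapshot;
    }
  }
  for (const event of events.slice(start)) state = applyEvent(state, event);
  return state;
}
```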

Hope you found this post useful. If you have any questions or comments, please let me know!
