Published on 27 July 2022 by Andrew Owen (4 minutes)
Modern software development is all about automation, continuous integration, continuous delivery, and software-defined life cycles. The idea is to maintain quality while enabling features to be delivered as soon as they are production ready. You’re probably also familiar with the move away from monolithic systems to a microservices architecture. The goal there is to build the system from components that can be swapped out without rebuilding the whole thing.
But on the bleeding edge, there’s the event-driven architecture (EDA). Put simply, this is a design where the system as a whole is represented by its state, and any change to that state produces an event. Instead of storing that state as a single datum, EDAs use event sourcing: data is stored as a series of immutable events over time in an event store. Essentially, it’s a lot like a ledger. After they are created, events can’t be modified; any change to the data (state) happens as the result of a new event. This approach has several advantages, including a complete audit trail and the ability to reconstruct the state as it was at any point in time.
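As a minimal sketch of the ledger idea (the event shapes and class names here are illustrative, not from any particular framework), an event store can be modeled as an append-only list, with the current state derived by replaying events in order:

```python
from dataclasses import dataclass

@dataclass(frozen=True)  # frozen: events are immutable once created
class Event:
    kind: str
    data: dict

class EventStore:
    """Append-only ledger: events are never modified, only added."""
    def __init__(self):
        self._events: list[Event] = []

    def append(self, event: Event) -> None:
        self._events.append(event)

    def replay(self) -> dict:
        """Derive the current state by folding over every event in order."""
        state: dict = {}
        for e in self._events:
            if e.kind == "item_added":
                state[e.data["sku"]] = state.get(e.data["sku"], 0) + e.data["qty"]
            elif e.kind == "item_removed":
                state[e.data["sku"]] = state.get(e.data["sku"], 0) - e.data["qty"]
        return state

store = EventStore()
store.append(Event("item_added", {"sku": "A1", "qty": 3}))
store.append(Event("item_removed", {"sku": "A1", "qty": 1}))
print(store.replay())  # {'A1': 2}
```

Because the events themselves are never changed, replaying a prefix of the log reconstructs the state at any earlier point in time.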
A wise woman once said: “The main problem with event-driven architectures is that you have to repeatedly explain what they are to customers.”
The building blocks of an EDA are its microservices. Typically, these are self-contained and don’t interact directly with each other. This is often implemented by having each microservice run in its own Docker container, managed by Kubernetes. Each microservice should have its own bounded context (for example, orders, cart, products, and customers). When communication between microservices is required (for example, between orders, customers, and products), this is carried out by having each microservice subscribe to the events that directly affect it. Events are processed in the order that they occurred, and a checkpoint records the last event processed. This means a service that restarts can resume from its checkpoint instead of reprocessing the whole log.
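The subscribe-and-checkpoint pattern described above can be sketched as follows (the event shapes and in-memory checkpoint are illustrative assumptions; a real system would persist the checkpoint and consume from a broker):

```python
class Subscriber:
    """Processes only subscribed event kinds, in order, tracking a checkpoint."""
    def __init__(self, subscriptions: set[str]):
        self.subscriptions = subscriptions
        self.checkpoint = -1  # index of the last event processed
        self.seen: list[dict] = []

    def poll(self, log: list[dict]) -> None:
        # Resume from the checkpoint, so a restarted service never
        # reprocesses events it has already handled.
        for i in range(self.checkpoint + 1, len(log)):
            event = log[i]
            if event["kind"] in self.subscriptions:
                self.seen.append(event)  # stand-in for real handling
            self.checkpoint = i

log = [
    {"kind": "order_placed", "id": 1},
    {"kind": "customer_updated", "id": 2},
    {"kind": "order_shipped", "id": 3},
]
orders = Subscriber({"order_placed", "order_shipped"})
orders.poll(log)
print([e["id"] for e in orders.seen])  # [1, 3]
```

Note that the subscriber advances its checkpoint past events it ignores, so it only ever moves forward through the log.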
Event relationships usually fall into two categories:
Typically, in API-first systems, events relate to a specific API call. These calls may be generated by one or more components of the system: through user interaction with a web interface, or by a microservice.
In my experience, you may well end up with customers who want to see the event log. That can include events you never intended to be public, such as those relating to the inner workings of the system (for example, private APIs). These are unlikely to provide useful data points, and ideally you’ll want to find a way to scrub them before you pass the data to your customers.
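One way to scrub internal events before exposing the log (the `internal.` kind prefix is an assumption about how such events might be tagged; any reliable marker would do):

```python
def scrub(events: list[dict]) -> list[dict]:
    """Drop events relating to private APIs or the system's inner workings,
    identified here by an (assumed) 'internal.' kind prefix."""
    return [e for e in events if not e["kind"].startswith("internal.")]

log = [
    {"kind": "order_placed", "id": 1},
    {"kind": "internal.cache_rebuilt", "id": 2},
    {"kind": "order_shipped", "id": 3},
]
print(scrub(log))  # only the order events survive
```

Scrubbing at the read side like this keeps the underlying ledger intact; nothing is deleted from the store itself.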
EDAs typically use RESTful APIs to provide methods for accessing data such as customer information. User interactions are managed through client applications, each with its own API. Typically, the API lives in the same network as the microservices it calls, to reduce latency and improve performance. The API also determines what should happen when a specific microservice isn’t available.
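A sketch of what “determines what should happen” might mean in practice: retry the call a couple of times, then serve a fallback (the function names and the stale-response shape are hypothetical; a real API layer would wrap an HTTP client and a cache):

```python
import time

def call_with_fallback(fetch, fallback, retries: int = 2, delay: float = 0.1):
    """Call a microservice; on repeated failure, return a fallback response.

    `fetch` and `fallback` are illustrative stand-ins for a real HTTP
    call and a cached or default response.
    """
    for attempt in range(retries + 1):
        try:
            return fetch()
        except ConnectionError:
            if attempt == retries:
                return fallback()
            time.sleep(delay)  # brief pause before retrying

def flaky_products_service():
    raise ConnectionError("products service unavailable")

result = call_with_fallback(
    flaky_products_service,
    lambda: {"products": [], "stale": True},  # degraded but usable response
    delay=0.0,
)
print(result)
```

The key design choice is that the API, not the client, decides how to degrade when a microservice is down.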
A variety of secure authentication options can be integrated into the APIs:
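For instance, a shared-secret request-signing scheme (one of many possible options; the HMAC approach and names here are purely illustrative, not a recommendation from the original list) might look like:

```python
import hashlib
import hmac

SECRET = b"change-me"  # illustrative; load from a secrets manager in practice

def sign(payload: bytes) -> str:
    """Produce an HMAC-SHA256 signature for a request payload."""
    return hmac.new(SECRET, payload, hashlib.sha256).hexdigest()

def verify(payload: bytes, signature: str) -> bool:
    # compare_digest avoids leaking information through timing differences
    return hmac.compare_digest(sign(payload), signature)

token = sign(b"GET /orders/42")
print(verify(b"GET /orders/42", token))  # True
print(verify(b"GET /orders/43", token))  # False
```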
Scheduled reporting can be provided through event subscription, while dashboards can be supported through stream-based aggregations. Typically, this takes the form of Extract, Load, and Transform (ELT) processing. The original events are never modified, only the output.
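A stream-style aggregation for a dashboard can be sketched like this (event shapes are assumptions): the source events are read but never modified, and only the derived output changes:

```python
from collections import defaultdict

def aggregate(events: list[dict]) -> dict:
    """Fold order events into per-customer totals; the source is read-only."""
    totals: dict = defaultdict(float)
    for e in events:
        if e["kind"] == "order_placed":
            totals[e["customer"]] += e["amount"]
    return dict(totals)

events = [
    {"kind": "order_placed", "customer": "c1", "amount": 10.0},
    {"kind": "order_placed", "customer": "c1", "amount": 5.0},
    {"kind": "cart_updated", "customer": "c2"},
]
print(aggregate(events))  # {'c1': 15.0}; the events list is untouched
```

This is the ELT shape in miniature: the raw events are extracted and loaded as-is, and the transformation happens only on the way out.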
The biggest problem I’ve found in explaining EDAs is when a business analyst asks for the database schema. EDAs are typically developed around data lakes, where data is stored in its raw format. The database on the backend is likely to be a NoSQL solution such as MongoDB. I prefer to think of it as a data skip. You throw stuff in there, and it’s in there somewhere, but you don’t know exactly where. The beauty of the ledger is, you really don’t need to know.
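To answer the analyst’s question without a schema, you query the raw events on demand rather than a predefined table (a sketch, not a real query engine; the field names are made up):

```python
def query(events: list[dict], **match) -> list[dict]:
    """Ad-hoc query over raw, schema-less events: no table, no index,
    just scan the ledger and keep whatever matches."""
    return [e for e in events if all(e.get(k) == v for k, v in match.items())]

lake = [
    {"kind": "order_placed", "customer": "c1"},
    {"kind": "order_shipped", "customer": "c1"},
    {"kind": "order_placed", "customer": "c2"},
]
print(query(lake, kind="order_placed"))  # both order_placed events
```

The point is that structure is imposed at read time: you don’t need to know where a datum lives, only how to recognize it when you scan the ledger.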
Image: Original by kooikkari.