Monday, January 13, 2020

Distributed Systems: Orchestration, Invocation, Events


A very typical concern in distributed systems is the need to coordinate logic in different domains / bounded contexts. Let’s say we have two domains, A and B. When A performs a bit of logic ‘a’, B needs to perform corresponding logic ‘b’. There are generally three ways to accomplish this:

1. Create an orchestrator:


Pros:


  • A and B remain simple and agnostic of each other.

Cons:


  • Orchestrator absorbs the complexity of knowing about both A and B, interfacing with APIs exposed by A and B, implementing rollback and/or retry logic to mitigate failures. 
  • If either A or B change, the Orchestrator has to accommodate those changes. 
  • If we want to add another version of B (let’s call it B′) or add another participant C, we have to modify the Orchestrator to deal with all such changes. 
  • Orchestrator is in danger of becoming a god service — one that has to know everything about everything and that has to stay in sync with changes throughout the system. 
  • This is a path to a distributed monolith.

2. Direct invocation:


Pros:


  • Simplicity.

Cons:


  • Tight coupling between A and B: A now has to understand the logic of B and interface with its APIs. If B changes, A might have to change. 
  • This effectively forces A to be the orchestrator, violating the single responsibility principle. 
  • A becomes stateful, holding state until the invocation of B successfully completes or handling failures. 
  • While simple with just A and B, in the context of dozens or hundreds of services, point to point invocations quickly become difficult to manage, highly dependent on discovery and topology, etc. 


3.1. Event-Driven Architecture (Uncompensated): 


Pros: 


  • A and B completely agnostic of each other 
  • Loose coupling, allowing adding new publishers and subscribers without having to modify existing actors 
  • With event sourcing based on appropriate message transport (e.g. Kafka), ability to replay/re-process events 

Cons: 


  • Assumes reliability of message transport or ability by subscribers to compensate 


3.2. Event-Driven Architecture (with Observers): 


Pros: 


  • A and B completely agnostic of each other 
  • Loose coupling, allowing adding new publishers and subscribers without having to modify existing actors 
  • With event sourcing based on appropriate message transport (e.g. Kafka), ability to replay/re-process events 
  • Observer processes can be simple, only looking to match “done ‘a’” with “done ‘b’” without knowledge of details of ‘a’ and ‘b’, and flagging missing correlates 

Cons: 


  • Requires a bit of extra sophistication to implement 


Conclusion:

Option 3.2 is clearly the best pattern, even if it requires familiarity with implementation of this pattern. I’d argue that if a team or org are embarking on building distributed systems, it should invest in building the requisite expertise to do it right. 

Luckily, implementations of this core pattern are increasingly available in the open-source world, without the need to build such plumbing from scratch.