The title comes from David Luckham’s great book of the same name.
Over the past three years or so I have been helping to define and build IBM ODM Decision Server Insights, a platform for distributed event processing based on business rules technology. As Chief Architect I’ve had the opportunity to speak to many IBM customers and prospects about their event processing challenges, and to gain an understanding of the market dynamics and major players.
Fundamentally event processing platforms allow companies to process event streams, looking for interesting patterns in those event streams and triggering an action when a pattern of interest is detected. Event streams may be high or low volume, and patterns may be simple or complex. In many cases multiple streams are combined (joined/fused) to create a consolidated stream.
Stream processing is typically incremental (stateful) in nature, for efficiency: rather than dumping all events into a database and periodically querying it, the stream processor maintains context in memory and can react as soon as a trigger event is received. When processing >100 events per second there simply isn’t time to make an out-of-process call to a database to run a complex query over a large table.
When working with customers I often use the term situation detection to describe the role of an event processor in their overall architecture. Events are a means to an end; the goal is usually to detect a situation and to trigger an appropriate and timely action.
Situations may represent threats (such as fraud or network intrusion) or opportunities (such as marketing campaigns or promotions).
The need for event processing platforms has increased as companies move from an OLTP or batch-oriented worldview to one in which events occur and must be responded to immediately. In some industries this is driven by the prevalence of mobile technology, where you can have a bidirectional 1:1 interaction with an employee or customer, in near real time. For example, a delivery driver can be notified that a customer has changed their availability for delivery, or a customer can be proactively warned that their driver is running late due to heavy traffic on the route.
Simple event patterns may only require looking at a single event in isolation (the value of a field in the event is over a threshold), or may require aggregating over a timespan (the average of a field value over the past 5 minutes exceeds a threshold).
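To make the second kind of simple pattern concrete, here is a minimal Python sketch of an incrementally maintained sliding-window average. The class and function names are my own for illustration, not taken from any product, and the sketch assumes events arrive in timestamp order with timestamps expressed in seconds.

```python
from collections import deque


class SlidingAverage:
    """Incrementally maintained average over a trailing time window.

    Hypothetical helper; assumes events arrive in timestamp order.
    """

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        self._events = deque()  # (timestamp, value) pairs, oldest first
        self._total = 0.0

    def add(self, timestamp: float, value: float) -> float:
        """Record an event and return the average over the current window."""
        self._events.append((timestamp, value))
        self._total += value
        # Evict events that have fallen out of the window. The event just
        # added is always inside the window, so the deque never empties.
        cutoff = timestamp - self.window_seconds
        while self._events[0][0] <= cutoff:
            _, old_value = self._events.popleft()
            self._total -= old_value
        return self._total / len(self._events)


def detect(events, threshold, window_seconds=300.0):
    """Yield timestamps at which the windowed average exceeds the threshold."""
    window = SlidingAverage(window_seconds)
    for timestamp, value in events:
        if window.add(timestamp, value) > threshold:
            yield timestamp
```

The running total is the in-memory context: each event costs an append plus the eviction of whatever has expired, rather than a fresh query over all stored events.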
More complex patterns may require combining multiple event streams, for example, detect a car that has entered the city containing a passenger whose face is in the wanted person database. This pattern likely requires combining events from CCTV cameras connected to a facial recognition system with events from vehicle license plate recognition systems.
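One way to picture combining two streams is a time-windowed join on a shared key. The sketch below is only an illustration under assumptions of mine: both systems tag their events with a common location key (such as a camera or gate identifier), events arrive in timestamp order, and the stream names "face" and "plate" are hypothetical.

```python
from collections import deque


class WindowJoin:
    """Pairs "face" and "plate" events that share a key and occur
    within max_gap seconds of each other (hypothetical sketch)."""

    def __init__(self, max_gap: float) -> None:
        self.max_gap = max_gap
        self._buffers = {"face": deque(), "plate": deque()}

    def on_event(self, stream, timestamp, key, payload):
        """Buffer the event and return (face, plate) pairs it completes."""
        other = "plate" if stream == "face" else "face"
        # Drop buffered events that are too old to match anything new.
        for buffer in self._buffers.values():
            while buffer and buffer[0][0] < timestamp - self.max_gap:
                buffer.popleft()
        self._buffers[stream].append((timestamp, key, payload))
        matches = []
        for _, other_key, other_payload in self._buffers[other]:
            if other_key == key:
                if stream == "face":
                    matches.append((payload, other_payload))
                else:
                    matches.append((other_payload, payload))
        return matches
```

For example, a plate event at a gate followed a few seconds later by a face event at the same gate yields one pair, which a later step could check against the wanted person database.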
Other complex patterns may require information that is not carried on the incoming events at all, for example, detect a patient that is over 45, has given birth in the past 6 months and whose average heart rate over the past 30 seconds is 20% higher than their average heart rate 30 minutes before. These patterns require the notion of an entity, correlating entities with event streams, and joining across event stream data and entity data.
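A rough Python sketch of joining entity data with an event stream follows. Everything here is hypothetical: the `Patient` record, its field names, the approximation of six months as 183 days, and my reading of "30 minutes before" as the 30-second window that ended 30 minutes earlier. For brevity it keeps every reading in memory, which a real system would not do.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Patient:
    """Hypothetical entity record, held outside the event stream."""
    patient_id: str
    age: int
    days_since_delivery: Optional[int]  # None if the patient has not given birth


def mean_between(readings, start, end):
    """Average bpm of readings with start < timestamp <= end, or None."""
    values = [bpm for ts, bpm in readings if start < ts <= end]
    return sum(values) / len(values) if values else None


class HeartRateMonitor:
    def __init__(self, patients):
        self.patients = {p.patient_id: p for p in patients}
        self.readings = defaultdict(list)  # patient_id -> [(ts, bpm)]

    def on_event(self, patient_id, ts, bpm):
        """Return True when the situation is detected for this patient."""
        self.readings[patient_id].append((ts, bpm))
        # Entity data: facts that are not carried on the event itself.
        patient = self.patients.get(patient_id)
        if patient is None or patient.age <= 45:
            return False
        if patient.days_since_delivery is None or patient.days_since_delivery > 183:
            return False
        # Event data: compare the last 30 s with the 30 s window 30 min ago.
        history = self.readings[patient_id]
        recent = mean_between(history, ts - 30, ts)
        baseline = mean_between(history, ts - 1830, ts - 1800)
        return baseline is not None and recent > 1.2 * baseline
```

The heart rate event only carries a patient identifier and a reading; the age and delivery date come from the entity, which is why the correlation between the two is needed.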
Finally, some use cases require going beyond merely detecting situations based on events that have already occurred, to predicting that a situation is likely to occur. Predictive models are fed data from event streams in near real time, and their results are used in the definition of patterns. For example, detect a customer that has purchased more than $500 of products over the past 24 months, has not purchased a product in the past month, and whose predicted churn score is more than 0.8.
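As a sketch of how a model score can sit alongside ordinary conditions in a pattern, the Python below expresses the churn example. The `predict_churn` callable is a stand-in for whatever predictive model is in use and is purely hypothetical, as are the function name and the approximation of 24 months and one month as 730 and 30 days.

```python
from datetime import datetime, timedelta


def is_churn_risk(purchases, now, predict_churn,
                  min_spend=500.0, score_threshold=0.8):
    """Evaluate the churn pattern for one customer.

    purchases: list of (datetime, amount) tuples for that customer.
    predict_churn: hypothetical model, maps recent purchases to a 0..1 score.
    """
    two_years_ago = now - timedelta(days=730)  # approximates 24 months
    one_month_ago = now - timedelta(days=30)   # approximates one month
    recent = [(ts, amount) for ts, amount in purchases if ts > two_years_ago]
    total_spend = sum(amount for _, amount in recent)
    bought_this_month = any(ts > one_month_ago for ts, _ in recent)
    if total_spend <= min_spend or bought_this_month:
        return False
    # Only call the (possibly expensive) model once the cheap checks pass.
    return predict_churn(recent) > score_threshold
```

Ordering the checks this way is a design choice of the sketch: the model is consulted only for customers who already satisfy the event-based conditions.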
A study of different event processing platforms will typically expose their origins. These commonly are one of:
- Pure event stream processors, whose programming model is fundamentally about mathematical operations over unbounded (infinite) sets of events. These are typically very efficient but hard to program, and it can be challenging to map from a business problem to a graph of stream processing operators. Including external entity data typically requires cumbersome custom coding, and performance and caching can be an issue.
- Relational Databases, whose programming model is fundamentally about running SQL-style queries faster and more incrementally. Again, the mapping from the business problem to an implementation (a set of queries and data structures) can be challenging.
- Big Data Platforms, often batch in nature (e.g. Hadoop or Spark), but without the necessity for a relational data model. Increasingly some platforms also support “micro batch” or stream processing modes of operation (e.g. Spark Streaming). Again, mapping from the business problem to a graph of operators over immutable data can be a challenge.
- Business rules platforms, which tend to view the problem in terms of building a context from a finite set of events (often a sliding window) and then running rules over that context to detect patterns of interest. Typically it is easier to map from a business problem to an implementation; however, memory usage and efficiency can become an issue at high event volumes.
- Business process engines, though this is less common, for example, using asynchronous features in BPMN.
- Message processing engines, though this is less common, for example by combining message transformation and enrichment with a business rules engine that runs declarative logic over received messages/events.
In all cases event processing platforms for enterprise usage are typically distributed systems, as no single node is capable of processing the complete event stream. The event stream is partitioned (distributed across shards) using a consistent partitioning scheme for processing by many nodes in parallel. This tends to make these platforms comparatively hard to functionally test, performance test, manage (HA/DR), and scale out when compared to stateless execution of business logic.
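The partitioning idea can be sketched in a few lines of Python. This is a simple hash-modulo scheme of my own choosing for illustration, not a description of how any particular platform does it; the `key` field on each event is an assumption. Note that it is not consistent hashing: changing the shard count remaps most keys.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a partition key to a shard index, stably across processes.

    hashlib is used instead of the built-in hash(), which is salted per
    process for strings and so would not agree between nodes.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def partition(events, shard_count):
    """Distribute events (dicts with a "key" field) across shards."""
    shards = [[] for _ in range(shard_count)]
    for event in events:
        shards[shard_for(event["key"], shard_count)].append(event)
    return shards
```

Because every event for a given key lands on the same shard, that shard can hold the in-memory context for the key without coordinating with other nodes, which is also what makes rebalancing and failover harder than in a stateless system.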
Over the past 24 months there has been an explosion of Open Source event processing toolkits and platforms, many of them Apache projects contributed by large Internet companies (LinkedIn, Twitter, Google, Facebook, etc.).