Apache Kafka & Table-Stream Duality
Apache Kafka has found its niche as an event streaming platform that does three things very well:
1. Publish and subscribe to events
2. Store events
3. Process and analyze events
We are grateful to the good folks at Confluent for teaching us about Kafka. They’ve helped us understand concepts such as stream-table duality.
We look at a stream as an immutable record of history and a table as a mutable representation of state. Further, the relationship between streams and tables is:
1. A stream can be turned into a table by aggregating its events, and
2. A table can be represented as a stream via change data capture
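The two directions of the duality can be sketched in plain Python. This is only an analogy and does not use any Kafka API: the event shape (a key paired with an integer value) and the function names `stream_to_table` and `table_to_stream` are assumptions made for illustration, and deletions are ignored to keep the sketch short.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical event shape: (key, value), e.g. (account id, new balance).
Event = Tuple[str, int]


def stream_to_table(events: Iterable[Event]) -> Dict[str, int]:
    """Aggregate a stream into a table by keeping the latest value per key."""
    table: Dict[str, int] = {}
    for key, value in events:
        table[key] = value  # a later event overwrites the earlier state
    return table


def table_to_stream(before: Dict[str, int], after: Dict[str, int]) -> List[Event]:
    """Capture the rows that changed between two table states as events.

    Deleted keys are not handled in this sketch.
    """
    return [(key, value) for key, value in after.items() if before.get(key) != value]


# The stream is the immutable history; the table is the current state.
events: List[Event] = [("alice", 10), ("bob", 5), ("alice", 7)]
table = stream_to_table(events)

# Change data capture against an empty starting table yields a changelog.
changelog = table_to_stream({}, table)

# Replaying the changelog rebuilds the same table.
rebuilt = stream_to_table(changelog)
```

Here the changelog is shorter than the original stream because it records only the changes needed to reach the current state, yet replaying it produces the same table.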
We use either ksqlDB or Kafka Streams to continuously turn a stream into a table. This is typically a two-step process.
First, we create the stream. If using ksqlDB, we can use its CREATE STREAM command. Its analog in Kafka Streams is the KStream interface. This interface is an abstraction of a record stream of key-value pairs, usually called events.
Second, we create a table to serve as a view of the aggregate of events of interest. As you may guess, ksqlDB has a CREATE TABLE command to do that. Its equivalent in Kafka Streams is the KTable interface. This interface is an abstraction of a changelog stream from a primary-keyed table.
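The two steps can be illustrated with a small plain-Python analogy. It is not ksqlDB or Kafka Streams code: the page-view events and the `count_by_key` helper are hypothetical, and the sketch only mimics the idea of a table that is kept up to date as events arrive and that emits each update as a changelog record.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def count_by_key(events: Iterable[Tuple[str, str]]) -> Iterator[Tuple[str, int]]:
    """Maintain a running count per key and emit every update as it happens."""
    counts: Dict[str, int] = defaultdict(int)
    for key, _value in events:
        counts[key] += 1
        yield key, counts[key]  # one changelog record per table update


# Step 1: the stream, a sequence of key-value events (page, user).
page_views: List[Tuple[str, str]] = [("home", "u1"), ("cart", "u2"), ("home", "u3")]

# Step 2: the table, a view of the aggregate that changes with each event.
updates = list(count_by_key(page_views))

# Applying the updates in order leaves the latest count for each key.
view = dict(updates)
```

The list of updates plays the role of the changelog stream, while the final dictionary is the table a reader would query at that point in time.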