Working with Queues and Streams

Published on July 1, 2024

Target Audience

Technology Executives, Software Engineering/Development Directors & Managers, Software Architects, Software Engineers/Developers, DevOps Engineers, Technical Program/Project Managers, Technical Writers

Abstract

An Event-Driven Architecture (EDA) supports a loosely coupled, or decoupled, design by promoting the production, detection, and consumption of, and response to, events. In EDA, components of a system interact with each other primarily through events. An event is defined as a change in state that is meaningful within a system or business process. One component publishes an event when a certain action has taken place, and other components react to the event only if they are "interested" in that action, e.g., by maintaining a subscription to the publisher.

The goal of any loosely coupled or decoupled architectural system design is to reduce the risk that changes in one component will negatively impact or require changes in others, thus making a system more maintainable, adaptable, and resilient.

Message queues and data streams are both widely used in modern software architecture for communication between systems, with the goal of fully decoupling, or at least only loosely coupling components in distributed systems.

Though similar, some distinctions set each implementation apart for differing use cases. Before selecting and implementing either, several factors should be considered. This paper explains the differences and recommends possible implementation strategies.

Introduction

The choice between a message queue and a stream depends on the specific use case. If the requirement is for processing individual messages independently, a message queue might be the best choice. If the requirement is to process a continuous flow of data in real-time, a stream would be more appropriate.

Event-Driven Architecture

In an event-driven architecture, services communicate through events. A service emits an event, and any other service interested in that event can listen for it and react. This reduces direct dependencies between components.

The goal of a loosely coupled, event-driven design is to reduce the direct dependencies between services, making the system more modular and easier to change and scale.

Common Schemas

In a distributed system, having shared object model schemas, such as JSON schemas, can significantly reduce errors and provide a common interface for sharing data between systems with disparate code bases. Shared schemas help in the following ways:

  • Standardization: Shared schemas provide a standardized way for systems to communicate with each other. Regardless of the underlying code base or technology, every system understands how to read and write data in the agreed-upon format.

  • Validation: Schemas can be used to validate data before it's sent or after it's received. This can catch errors early, before invalid data is processed. For example, a JSON schema can specify required fields, field types, and other constraints.

  • Documentation: Schemas serve as a form of documentation. They describe the structure of the data, making it easier for developers to understand what data is available and how it's structured.

  • Evolution: Schemas can be versioned, allowing the data format to evolve over time without breaking existing systems. New versions of a schema can add new fields but should avoid removing or changing the meaning of existing fields.

In an event-driven architecture, these benefits are particularly important. When a system publishes an event, it doesn't necessarily know (or care) which other systems will consume the event or how another service will process it. By using a shared schema, the publisher can be confident that all consumers will interpret the event data they care about in the same way. This still maintains a loosely coupled approach, as a shared schema is simply a decoupled validation mechanism.

Use Case

Consider a system where, when a user updates their address, an event is published to a queue with the new address data. If there's a shared schema for this event, all systems that monitor the queue and consume the event can validate the message against the shared schema and will know exactly what fields to expect, what types they are, and how to validate and/or update the data. This reduces the chance of errors and makes it easier to add new systems or modify existing ones.
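
As a concrete illustration, the following is a minimal sketch of how a consumer might validate such an event against a shared schema, using Python and the jsonschema package. The event shape and field names are hypothetical.

```python
# Minimal sketch: validating an "address updated" event against a shared
# JSON schema with the "jsonschema" package. Field names are hypothetical.
from jsonschema import ValidationError, validate

ADDRESS_UPDATED_SCHEMA = {
    "type": "object",
    "properties": {
        "userId": {"type": "string"},
        "street": {"type": "string"},
        "city": {"type": "string"},
        "postalCode": {"type": "string"},
    },
    "required": ["userId", "street", "city", "postalCode"],
}

def handle_event(event: dict) -> None:
    try:
        # Reject malformed events before any business logic runs.
        validate(instance=event, schema=ADDRESS_UPDATED_SCHEMA)
    except ValidationError as err:
        # Invalid events could be logged or routed to a dead-letter queue.
        print(f"Rejected event: {err.message}")
        return
    print(f"Updating address for user {event['userId']}")

handle_event({"userId": "u-123", "street": "1 Main St",
              "city": "Springfield", "postalCode": "12345"})
```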

Loosely Coupled Systems

In a loosely coupled system, services interact through well-defined interfaces or APIs and are generally unaware of each other's internal workings. They communicate by exchanging messages (events), and the structure of these messages is agreed upon in advance, often by a schema definition. This reduces behavioral coupling and makes the system more flexible and easier to maintain.

Behavioral Coupling

In the context of distributed systems, behavioral coupling generally refers to the degree to which one service in the system is dependent on the behavior of another service.

In a highly behaviorally coupled system, a service needs to have knowledge about the internal workings, data structures, or specific algorithms of another service to function correctly. This can lead to problems because a change in one service might require changes in all other services that are behaviorally coupled to it.

Use Case

If Service A sends a command to Service B to perform a specific operation and expects a certain response, this is a form of behavioral coupling. If the operation in Service B changes, Service A also needs to be updated to handle the new behavior.

An event-driven architecture, where services react to events rather than send commands to each other, is a common way to reduce behavioral coupling in distributed systems.

Coupling of both behavior and data can and should be avoided by adhering to one or more of the design principles and patterns defined below:

  • Interface Segregation Principle (ISP): This principle states that no client should be forced to depend on interfaces they do not use. By creating small, focused interfaces, you can ensure that components only know about the methods or properties that they need to interact with.

  • Dependency Inversion Principle (DIP): This principle states that high-level modules should not depend on low-level modules; both should depend only on abstractions. By depending on abstractions, and not on concrete implementations, a system can avoid behavioral coupling.

  • Data Encapsulation: To avoid data coupling, it's important to hide the internal state of a component and only expose what's necessary through its interface. This can be achieved by using private fields and providing public getter and setter methods (depending on language).

  • DTOs (Data Transfer Objects): If a system needs to share data between components, use DTOs that only contain the necessary data and do not expose the internal state of other components.

The goal of these design principles is to reduce the knowledge that one component has of another to the bare minimum necessary for them to interact. This makes the system more flexible and easier to maintain.
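
To make these principles concrete, here is a minimal, hypothetical Python sketch combining DIP and a DTO: the business logic depends only on an abstraction, and the event it publishes carries only the necessary data. The class and function names are illustrative, not from any specific framework.

```python
# Minimal sketch of DIP plus a DTO. All names are illustrative.
from abc import ABC, abstractmethod
from dataclasses import dataclass

@dataclass(frozen=True)
class AddressChanged:
    """DTO: carries only the data consumers need, no internal state."""
    user_id: str
    city: str

class EventPublisher(ABC):
    """Abstraction the high-level code depends on (DIP)."""
    @abstractmethod
    def publish(self, event: AddressChanged) -> None: ...

class ConsolePublisher(EventPublisher):
    """Concrete, swappable implementation (could be SQS, Kafka, etc.)."""
    def publish(self, event: AddressChanged) -> None:
        print(f"published: {event}")

def update_address(publisher: EventPublisher, user_id: str, city: str) -> None:
    # The business logic knows only the abstraction, never the transport.
    publisher.publish(AddressChanged(user_id=user_id, city=city))

update_address(ConsolePublisher(), "u-123", "Springfield")
```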

Uses

  • Scalability: Both message queues and streams allow systems to handle high volumes of data by distributing the data across multiple consumers.

  • Resilience: They provide a level of fault tolerance. If a consumer fails, messages can be reprocessed or reassigned to another consumer.

  • Real-Time Processing: Streams enable real-time data processing, which is crucial for many businesses for tasks like real-time analytics, fraud detection, and more.

  • Decoupling: Message queues help in decoupling processes in a system, which can make the system more manageable and less prone to error. In an asynchronous, loosely coupled system:

    • Producers of events do not have knowledge about which components will consume their events. They simply publish the events and move on.

    • Consumers of events do not have knowledge about where the events originated from. They simply react to the events as they arrive.

    • Events are typically placed in a queue or message broker (like RabbitMQ, Kafka, or AWS SQS) to be processed later. This allows the system to handle high loads and spikes in traffic and ensures that events are not lost if a consumer is down or busy.

Use Case

Imagine a bank as a busy restaurant, and the message queue as the kitchen's order system.

When customers (users) place orders (requests), such as making a deposit, withdrawing money, or transferring funds, these orders are sent to the kitchen (the bank's systems) via the order system (message queue).

The cooks (the bank's processes) pick up orders from the system one by one, prepare the dishes (process the transactions), and then deliver them (update the accounts). If a lot of orders come in at once, they're lined up in the order system, ensuring none are lost and the kitchen doesn't get overwhelmed.

If the restaurant gets busy, they can bring in more cooks (scale up the system) to handle the orders. Each cook picks up an order from the system when they're ready, ensuring all orders are processed as quickly as possible.

So, a bank can use a message queue to handle customer transactions efficiently, ensure no transactions are lost, and scale rapidly to keep its systems running smoothly even during peak times.

Core Concepts & Comparisons

Data Retention

Data retention refers to how long messages are kept before they are discarded. In both cases of queues and streams, data retention is about managing storage and ensuring that messages are available when needed but also making sure that storage doesn't fill up with old, unnecessary data.

  • Queues: Once a message is consumed from the queue, it is typically deleted and cannot be consumed again. This is ideal for tasks that should only be processed once.

  • Streams: Messages are retained for a certain period (configurable), allowing them to be consumed multiple times. This is useful for tasks that need to be processed by multiple consumers or reprocessed in case of failures.

Use Case

AWS Kinesis Streams has a feature called data retention. By default, Kinesis retains the data in a stream for 24 hours. This means that once an event is added to the stream, it is available to be read for the next 24 hours. After that, it's automatically deleted.

This 24-hour retention period gives the system a full day to process events. For example, a downstream consumer might read the events from the stream every hour to update user statistics.
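
As a sketch of how this is configured in practice, the boto3 call below raises a stream's retention period above the 24-hour default. The stream name is a hypothetical placeholder, and AWS credentials are assumed to be configured.

```python
# Minimal sketch: extending Kinesis data retention with boto3.
# "user-events" is a hypothetical stream name.
import boto3

kinesis = boto3.client("kinesis")

# Raise retention from the 24-hour default to 48 hours, giving
# consumers an extra day to catch up or reprocess events.
kinesis.increase_stream_retention_period(
    StreamName="user-events",
    RetentionPeriodHours=48,
)
```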

Message Ordering

Message ordering refers to the sequence in which messages are sent and received in the context of streams and queues. In simpler terms, if you think of messages like a line of people waiting for a bus (queue), people get on the bus in the order they arrived at the bus stop. But if there are multiple buses (partitions in a stream), while people get on each bus in order, you can't guarantee the order of people getting on across all buses.

  • Queues: Some message queues, like Amazon SQS Standard Queues, do not guarantee the order of messages. However, FIFO (First-In-First-Out) queues like Amazon SQS FIFO Queues do maintain the order.

  • Streams: Messages are ordered within a partition in a stream. This is useful for tasks where the order of events matters.

The more common message ordering strategies are:

  • FIFO (First-In-First-Out): This is the most common ordering method. Messages are sent and received in the order they are placed in the queue. The first message sent is the first one received. Many message queue systems, like Amazon SQS and RabbitMQ, support FIFO queues.

  • Priority Queuing: In some systems, each message is assigned a priority. Messages with higher priority are dequeued before messages with lower priority. Note that this can lead to situations where low-priority messages are starved if there are always higher priority messages in the queue.

  • Topic/Subtopic Ordering: In publish-subscribe systems, messages might be ordered within each topic or subtopic. Subscribers to a particular topic or subtopic receive messages in the order they were published.

Again, this can be either a feature of the system or a method of implementation, but the most typical strategies for strict ordering (if your system depends on it) are:

  • Sequential Processing: One way to maintain order is to process messages sequentially on a single thread. However, this can limit throughput and scalability.

  • Message Grouping: Some systems allow messages to be grouped together and guarantee that messages in the same group are delivered in order. This allows for parallel processing while maintaining order within each group.
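
As a sketch of the message grouping strategy above, the boto3 snippet below sends a message to an SQS FIFO queue with a MessageGroupId; messages sharing a group ID are delivered in order, while different groups can be processed in parallel. The queue URL and IDs are hypothetical.

```python
# Minimal sketch: ordered delivery per group on an SQS FIFO queue.
# The queue URL and IDs below are hypothetical placeholders.
import boto3

sqs = boto3.client("sqs")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/orders.fifo"

sqs.send_message(
    QueueUrl=QUEUE_URL,
    MessageBody='{"orderId": "o-1", "action": "created"}',
    MessageGroupId="customer-42",         # order preserved within this group
    MessageDeduplicationId="o-1-created", # required unless content-based
                                          # deduplication is enabled
)
```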

Use Case

An e-commerce website uses RabbitMQ for processing orders.

When a customer places an order on the website, an order message is sent to a RabbitMQ queue. This message might include details like the order ID, the customer ID, the items ordered, etc.

RabbitMQ guarantees that messages sent to a single queue will be received in the same order they were sent. This is important for the e-commerce website because it means that orders will be processed in the same sequence they were placed. This is known as First-In-First-Out (FIFO) ordering.

For example, if Customer A places an order before Customer B, the order from Customer A will be processed first. This ensures fairness - customers who place their orders earlier will have their orders processed earlier.

However, if the e-commerce website needs to scale up and starts using multiple queues (for example, one queue for each category of products), the order of messages across different queues might not be maintained. In this case, the website might need to implement additional logic to handle out-of-order processing.
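
A minimal sketch of the producer side of this scenario, using the pika client, might look like the following; the connection details and queue name are assumptions for illustration.

```python
# Minimal sketch: publishing an order to a durable RabbitMQ queue with pika.
# Connection details and the queue name are illustrative assumptions.
import json
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

# A durable queue survives broker restarts; within a single queue,
# RabbitMQ delivers messages in the order they were published (FIFO).
channel.queue_declare(queue="orders", durable=True)

order = {"orderId": "o-1001", "customerId": "c-42", "items": ["sku-1"]}
channel.basic_publish(
    exchange="",
    routing_key="orders",
    body=json.dumps(order),
    properties=pika.BasicProperties(delivery_mode=2),  # persist message to disk
)
connection.close()
```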

Throughput

Message throughput refers to the number of messages that a system can process in a given amount of time. It's like the speed of a conveyor belt in a factory - the faster the belt, the more items it can move from one end to the other in a certain period.

  • Queues: Queues typically have high throughput and can handle sudden spikes in traffic by scaling automatically.

  • Streams: Streams can also handle high throughput, but they are especially good at handling a continuous flow of data and processing it in real-time.

Use Case

In a newly built smart city, there are thousands of IoT (Internet-of-Things) devices installed all over the city, including traffic sensors, air quality monitors, and weather stations. These devices continuously send data to an Azure Event Hub, which acts as the entry point for the data stream.

Azure Stream Analytics is then used to process this incoming data. The throughput of Stream Analytics - that is, the amount of data it can process per unit of time - is crucial for the success of this project.

For example, if each IoT device sends 1 message per second, and there are 10,000 devices, then the system needs to handle 10,000 messages per second. If the throughput of the Stream Analytics job is less than this, it won't be able to keep up with the incoming data, and messages will start to backlog, leading to delays in data processing and potential loss of data.

Replay

The concept of replay in the context of message queues and message streams refers to the ability to reprocess messages or events from a certain point in time.

  • Queues: Traditional message queues typically do not support replaying messages. Once a message is consumed and acknowledged, it is removed from the queue and cannot be consumed again. If you need to support replaying messages with a message queue, you will need to store the messages somewhere else after consuming them, which can add complexity to your system.

  • Streams: One of the key features of message streams is the ability to replay messages. Message streams keep all messages for a certain period (the retention period), regardless of whether they have been consumed. This allows consumers to "rewind" the stream and reprocess messages from any point within the retention period. This can be very useful in scenarios where you need to reprocess events due to changes in business logic or to recover from errors.

Use Case

In Apache Kafka, a popular message streaming platform, each consumer keeps track of its offset (the position in the stream) for each partition it is consuming. To replay messages, the consumer can simply reset its offset to an earlier position. This makes it easy to reprocess messages in the event of a failure or a change in processing logic.
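
A minimal sketch of this offset reset with the kafka-python client is shown below; the broker address, topic name, and partition are hypothetical.

```python
# Minimal sketch: replaying a Kafka partition by resetting the offset.
# Broker address, topic, and partition number are hypothetical.
from kafka import KafkaConsumer, TopicPartition

consumer = KafkaConsumer(
    bootstrap_servers="localhost:9092",
    enable_auto_commit=False,  # manage offsets manually for replay
)
partition = TopicPartition("ride-events", 0)
consumer.assign([partition])

# Rewind to the start of the retention window (seek() can also jump to
# any specific offset) to reprocess after a bug fix or logic change.
consumer.seek_to_beginning(partition)
for record in consumer:
    print(record.offset, record.value)
```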

Consumers

In the context of streams and queues, a consumer is like a worker on an assembly line in a factory. In both cases of queues and streams, the workers (consumers) are responsible for handling the boxes (messages) that come down the line, doing whatever work needs to be done with them.

  • Queues: Typically used in a point-to-point communication system where each message is consumed by a single consumer.

    • In some cases, this is a command, where the sender intends to invoke a behavior from a consumer.

    • In all cases within a decoupled system, the publisher has no knowledge of how the message is consumed, or what service is consuming it; it is up to the consumer to understand the purpose of the message, the schema, and the data within.

    • Many times, these commands are business-critical and well-suited to queues.

  • Streams: Typically used in a publish-subscribe system where each message can be consumed by multiple consumers.

    • Streams tend to be less business-critical, meaning a missed, dropped, or unprocessed message has a minimal business impact.

Use Case

Imagine a queue as a conveyor belt carrying boxes (messages). Each worker (consumer) picks up a box from the conveyor belt, does some work with it (processes the message), and then the box is removed from the conveyor belt. If there are more boxes coming in than a single worker can handle, you can add more workers (consumers) to pick up and process boxes.

Now, imagine a stream as multiple conveyor belts side by side (partitions). Each worker (consumer) might be responsible for one or more conveyor belts. They pick up boxes (messages) from their assigned conveyor belts and process them. If boxes start coming in faster, you can add more conveyor belts (partitions) and more workers (consumers) to keep up with the pace.

Delivery Semantics

Delivery semantics in the context of streams and queues refer to the rules that determine how messages are delivered from the producer (sender) to the consumer (receiver).

The most common delivery semantics are:

  • At-Most-Once: This means that each message is delivered once or not at all. In the event of a failure, it's possible that a message may be lost. This is the weakest delivery guarantee and is typically used in systems where occasional message loss is acceptable.

  • At-Least-Once: This means that each message is guaranteed to be delivered, but it may be delivered more than once. This can happen if a consumer fails after processing a message but before acknowledging it, causing the message to be redelivered. This is a stronger guarantee than at-most-once delivery, but it requires consumers to be idempotent, meaning they can handle duplicate messages.

  • Exactly-Once: This means that each message is guaranteed to be delivered exactly once - no more, no less. This is the strongest delivery guarantee and is typically used in systems where message duplication or loss could lead to incorrect results. However, exactly once delivery is more complex and resource-intensive to implement than the other delivery semantics.

Use Case

Consider a real-world example of a ride-sharing app like Uber or Lyft that uses AWS Kinesis Streams for real-time processing of ride data.

At-Most-Once: When a driver accepts a ride request, a message is sent to a Kinesis Stream. The ride-sharing app's backend system consumes this message to update the ride status. However, if the message fails to be delivered or processed due to a network issue or system failure, the ride status might not be updated. This is an example of at-most-once delivery semantics, where each message is delivered once, if at all, but there's no guarantee of delivery.

At-Least-Once: To ensure that ride status updates are not missed, the ride-sharing app could implement a retry mechanism. If the backend system doesn't confirm receipt of the message, it's sent again. This could result in the same message being processed multiple times, but it ensures that no ride status updates are missed. This is an example of at-least-once delivery semantics.

Exactly-Once: To avoid processing the same message multiple times, the ride-sharing app could implement a mechanism to track which messages have been processed. When a message is received, the system checks if it has already been processed. If not, it processes the message and updates the ride status; if it has, it discards the message. This ensures that each message is processed exactly once.

In this example, the delivery semantics of AWS Kinesis Streams help the ride-sharing app ensure that ride status updates are processed in a reliable and efficient manner.
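
The deduplication mechanism described under exactly-once above can be sketched in a few lines of Python; the in-memory set here stands in for a durable store, such as a database table keyed by message ID.

```python
# Minimal sketch: at-least-once delivery plus an idempotent consumer.
# The in-memory set stands in for a durable store keyed by message ID.
processed_ids: set[str] = set()

def handle_ride_update(message_id: str, payload: dict) -> None:
    if message_id in processed_ids:
        return  # duplicate delivery: safe to discard
    # ... update the ride status here ...
    processed_ids.add(message_id)
    print(f"processed {message_id}: {payload}")

# The same message delivered twice is applied only once.
handle_ride_update("msg-1", {"rideId": "r-9", "status": "accepted"})
handle_ride_update("msg-1", {"rideId": "r-9", "status": "accepted"})
```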

Durability

Durability in the context of a message stream refers to the guarantee that once a message has been successfully added to the stream, it will not be lost and can be consumed, even in the event of failures.

This is typically achieved through a combination of persistent storage and replication:

  • Persistent Storage: When a message is added to the stream, it is stored in a durable storage system (like a disk or a distributed file system). This ensures that the message will not be lost even if the process that added the message crashes or restarts.

  • Replication: To protect against hardware failures, each message is replicated across multiple nodes in the system. If one node fails, the message is still available on the other nodes.

  • Acknowledgements: When a message is added to the stream, the sender receives an acknowledgement once the message has been successfully stored and replicated. This allows the sender to know that the message has been durably stored. In addition, many message streaming systems provide a retention policy, which specifies how long messages should be kept in the stream before they are deleted. This allows consumers to read and re-read the same data multiple times, which can be useful for scenarios like event sourcing or reprocessing data after a bug fix.
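
As a sketch of acknowledgements in practice, the kafka-python producer below waits for the broker to confirm that a message has been replicated before moving on; the broker address and topic name are assumptions.

```python
# Minimal sketch: waiting for a durability acknowledgement with kafka-python.
# Broker address and topic name are illustrative assumptions.
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    acks="all",  # confirm only after all in-sync replicas have the message
    retries=3,   # retry transient send failures
)

future = producer.send("transactions", b'{"txId": "t-1", "amount": 100}')
# Block until the broker acknowledges durable storage (raises on failure).
metadata = future.get(timeout=10)
print(f"stored in partition {metadata.partition} at offset {metadata.offset}")
```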

Use Case

Consider a banking system where customers are making transactions like transfers, deposits, and withdrawals. Each transaction needs to be processed reliably and in the correct order to maintain the integrity of the system.

When a customer initiates a transaction, the transaction details could be placed in a durable queue or stream. The banking system can then process transactions one at a time from the queue. If the system crashes or restarts during processing, the durable queue ensures that no transactions are lost. When the system comes back online, it can continue processing from where it left off.

Without durability, a system crash could result in lost transactions, leading to discrepancies in customers' account balances and a loss of trust in the banking system. Therefore, in such a scenario, durability is crucial to ensure reliable and accurate processing of financial transactions.

Implementations

Message Queues

A message queue is a form of asynchronous service-to-service communication used in serverless and microservices architectures. Messages are stored on the queue until they are processed and deleted. Each message is processed only once, by a single consumer. Message queues can help to decouple heavy load tasks, provide some level of fault tolerance, and improve system responsiveness.

There are several common implementations of message queues in hosted services:

  • Amazon Simple Queue Service (SQS)

  • Azure Service Bus

  • Google Cloud Pub/Sub

  • RabbitMQ

Considerations

Message queues decouple the sender (producer) and receiver (consumer) of messages. This means that the producer and consumer don't need to interact with the message at the same time. The producer can add messages to the queue without waiting for the consumer to process them, and the consumer can process messages at its own pace.

Scalability

Message queues provide several advantages in terms of scalability:

  • Load Balancing: Message queues can distribute processing tasks across multiple workers (see Competing Consumers below), which can be scaled up or down depending on the load. If the number of messages being added to the queue increases, you can simply add more workers to handle the increased load.

  • Decoupling: Message queues decouple the producers and consumers of messages. This means that you can scale the producers and consumers independently. If you have a high rate of message production but slow processing, you can add more consumers without having to change the producer service.

  • Resilience: If a consumer fails while processing a message, the message can be returned to the queue or moved to a dead-letter queue to be retried or processed later. This means that you can handle temporary spikes in load by processing the backlog of messages when the load returns to normal.

  • Throttling: If your consumers have a limit to how many messages they can process per unit of time (for example, due to API rate limits), a message queue can hold the messages until the consumers are ready to process them, preventing overloading of your consumers.

  • Parallel Processing: Message queues allow for concurrent processing of messages. If the messages are independent of each other, multiple consumers can process multiple messages at the same time, leading to significant increases in throughput.

Competing Consumers

The Competing Consumers pattern is a design pattern that can be used in distributed systems where multiple consumers concurrently process messages or events from the same message channel (like a queue or a stream). The goal is to improve system performance, reliability, and scalability.

In an event-driven architecture, events are typically published to a message queue. In a simple scenario, a distributed system might have one consumer that reads and processes these events. However, if the rate of incoming events is high, that single consumer might not be able to keep up.

This is where the Competing Consumers pattern comes in. Instead of a single consumer, the system runs multiple consumers, all reading from the same message queue. These consumers are "competing" in the sense that they are all trying to read and process messages as they arrive.

A typical Competing Consumer implementation works like this:

  1. Each consumer independently reads a message from the queue.

  2. The consumer processes the message. This could involve performing some computation, updating a database, calling an API, etc.

  3. Once the consumer has successfully processed the message, it acknowledges the message to the message queue.

  4. The message is then removed from the queue.

  5. If a consumer fails to process a message (for example, if it crashes), the message is not acknowledged and remains in the queue to be processed by another consumer.

This pattern allows the system to process messages more quickly by distributing the load across multiple consumers. It also improves reliability, as a failure in one consumer doesn't prevent other messages from being processed by other consumers.

However, it does introduce complexity: for example, ensuring that messages are processed in the correct order when order matters for the service, and handling the case where two consumers process the same message (for example, if a consumer crashes after processing a message but before acknowledging it).
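
A minimal sketch of one such competing consumer, polling an SQS queue with boto3, follows the numbered steps above; the queue URL and the processing function are hypothetical placeholders. Running several copies of this worker against the same queue yields the pattern.

```python
# Minimal sketch: one competing consumer polling SQS with boto3.
# Run several copies of this worker against the same (hypothetical) queue.
import boto3

sqs = boto3.client("sqs")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/work-queue"

def process(body: str) -> None:
    print(f"processing: {body}")  # step 2: application-specific work

while True:
    # Step 1: long-poll for a batch of messages.
    response = sqs.receive_message(
        QueueUrl=QUEUE_URL, MaxNumberOfMessages=10, WaitTimeSeconds=20
    )
    for message in response.get("Messages", []):
        process(message["Body"])
        # Steps 3-4: acknowledge by deleting. If the worker crashes before
        # this call, the message becomes visible again and another consumer
        # picks it up (step 5).
        sqs.delete_message(
            QueueUrl=QUEUE_URL, ReceiptHandle=message["ReceiptHandle"]
        )
```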

Durability

Message queues can offer significant advantages in terms of durability, but there can also be potential disadvantages depending on the specific implementation and use case.

Advantages
  • Reliability: Message queues can ensure that messages are not lost in the event of a system failure. Once a message is added to the queue, it remains there until it is successfully processed and acknowledged by a consumer. This can be crucial in systems where data loss is unacceptable.

  • Recovery: In the event of a consumer failure during message processing, the message can be returned to the queue or moved to a dead-letter queue for later retry or analysis. This ensures that messages are not lost due to temporary issues or bugs in the consumer.

Disadvantages
  • Complexity: Implementing durability in a message queue can add complexity to the system. For example, you'll need to handle scenarios like consumer failures and unacknowledged messages. This can make the system harder to design, implement, and maintain.

  • Performance: Durable message queues require persistent storage to ensure that messages are not lost if the queue service fails. Writing to persistent storage can be slower than writing to memory, so this can potentially impact performance. However, many message queue services offer options to balance durability and performance according to your needs.

Delivery Semantics

Message queues generally operate under three types of delivery semantics: at most once, at least once, and exactly once, as described in the above section. These semantics describe how often a message is delivered in the face of failures or network issues.

It's important to note that these are idealized models. In practice, achieving exactly once delivery semantics can be very challenging due to the complexities of distributed systems and network communications.

Often, message queue systems aim for at least once delivery and make efforts to handle or minimize duplicates at the application level.

Ordering

Message queues can be designed to maintain the order of messages, but this comes with its own set of trade-offs.

Ordered Delivery

In an ordered delivery system, messages are delivered to consumers in the same order they were added to the queue. This is often achieved by assigning a sequence number to each message as it arrives.

Pros

Predictability: Ordered delivery ensures that if message A was sent before message B, A will be processed before B. This is crucial in scenarios where the order of operations matters, such as financial transactions.

Cons

Performance: Maintaining order can slow down processing. If one message is slow to process, subsequent messages can't be processed until the slow message is done, even if they're independent of the slow message. This is known as the "head-of-line blocking" problem.

Complexity: Implementing ordered delivery can add complexity to the system. For example, you'll need to handle scenarios where a message fails to process. Do you move on to the next message, or do you retry the failed message and delay all subsequent messages?

Unordered Delivery

In an unordered delivery system, messages are delivered to consumers in any order. This allows for more parallelism, as multiple messages can be processed at the same time.

Pros

Performance: Unordered delivery can lead to higher throughput, as multiple messages can be processed in parallel.

Cons

Unpredictability: With unordered delivery, you can't predict the order in which messages will be processed. This can be a problem in scenarios where the order of operations matters.

Ease of Use

Message queues, generally, are simpler and more straightforward to use. They follow a basic producer-consumer model where messages are added to the end of the queue and removed from the front. This makes them very easy to understand and implement.

Message queues are great for tasks that need to be executed in the future or distributed among multiple workers. They provide a simple way to decouple components of a system, allowing them to scale independently.

However, message queues typically only allow a message to be processed by a single consumer. If you need to broadcast a message to multiple consumers, you might need to use a publish-subscribe model or use multiple queues.

Cost

This is a general overview using AWS cost data as of 2024.

Parameters:

  • 1,000,000 messages per day

  • 10 FIFO queues

At 1,000,000 messages per day, the system sends roughly 30 million messages per month. The first million requests per month are free, leaving 29 million billable messages; at SQS FIFO's rate of $0.50 per million requests, the monthly cost would be approximately $14.50.

The cost for message queues is typically lower than the cost of a message stream, but again, they serve different purposes.

Message Streams

A stream is a sequence of data elements made available over time. Unlike message queues, streams are designed to handle continuous, real-time data and can process multiple messages concurrently. Streams can also handle "replay" scenarios, where the same data is consumed multiple times for different purposes.

There are several common implementations of message streams in hosted services:

  • Apache Kafka

  • Azure Event Hubs

  • Amazon Kinesis

Considerations

The primary purpose of a message stream is to handle continuous, real-time data. It's about processing and analyzing a continuous flow of data, often with the ability to process the same data multiple times.

Scalability

Message streams are designed to handle high volumes of data and can scale horizontally to support high throughput and many concurrent consumers.

A message stream implementation can scale under heavy load through a combination of partitioning and replication:

  • Partitioning: The stream of messages is divided into partitions, each of which can be processed independently. This allows the load to be distributed across multiple consumers. Each partition maintains the order of its messages, but there's no guaranteed order across partitions. The key used for partitioning can be chosen based on the specific requirements of the application. For example, you might partition based on user ID to ensure all messages for a particular user are processed in order (see the sketch after this list).

  • Replication: Each partition can be replicated across multiple nodes for fault tolerance. This ensures that if one node fails, another can take over without data loss. The degree of replication (i.e., the number of copies of each partition) can be adjusted based on the reliability requirements of the application.

  • Consumer Groups: In some systems, like Apache Kafka, you can use consumer groups to allow multiple consumers to work together to process data from the partitions. Each consumer within a group will read from a unique subset of partitions, allowing the data processing to be distributed across the consumers in the group.

  • Auto-scaling: Some cloud-based stream processing services, like Amazon Kinesis or Azure Event Hubs, can automatically adjust the number of shards (partitions) based on the volume of incoming data. This allows the system to scale up during periods of high load and scale down when the load decreases.
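
Here is a minimal sketch of key-based partitioning with the kafka-python producer; the broker address, topic, and user ID are hypothetical. Messages with the same key hash to the same partition, so per-user ordering is preserved while different users can be processed in parallel.

```python
# Minimal sketch: key-based partitioning with kafka-python.
# Broker address, topic, and key are hypothetical.
from kafka import KafkaProducer

producer = KafkaProducer(bootstrap_servers="localhost:9092")

for event in (b'{"action": "login"}', b'{"action": "logout"}'):
    # The key determines the partition: same key -> same partition -> ordered.
    producer.send("user-activity", key=b"user-123", value=event)

producer.flush()  # ensure buffered messages are actually sent
```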

Durability

Message streams, like those provided by Apache Kafka or Amazon Kinesis, are designed with durability in mind to ensure that data is not lost in the event of a failure. Here's how they achieve this:

  • Replication: Message streams typically replicate data across multiple nodes or servers. This means that even if one node fails, the data is still available on other nodes. The degree of replication (i.e., the number of copies of data) can often be configured based on your needs for data redundancy and durability.

  • Persistent Storage: Message streams store data on disk, not just in memory. This means that even if the service is restarted or crashes, the data is not lost and can be recovered when the service comes back online.

  • Retention Policies: Message streams can be configured to retain data for a certain period. This means that even after a message has been read, it remains in the stream and can be re-read if necessary. This is useful for scenarios where you need to reprocess data or recover from a failure.

  • Checksums: To protect against data corruption, message streams often use checksums. A checksum is a value calculated from the data that can be used to check whether the data has been altered. If the data is corrupted, the checksum will not match, and the corruption can be detected.

Delivery Semantics

Message streams allow multiple consumers to read from the same stream independently at their own pace. This means that messages in a stream are not removed after being read. Instead, each consumer keeps track of its own position in the stream (often called an offset). This allows for multiple reads and replayability of the stream.

Delivery semantics in message streams are typically "at least once", as consumers can read and process the same message multiple times. However, ensuring "exactly once" semantics can be more complex and often requires additional coordination between the producer, consumer, and the stream itself.

Ordering

Ideally, in message streams, events are produced, sent, received, and processed in the same order. However, due to network delays, system failures, or other factors, this isn't guaranteed by default. Therefore, most systems (like Kafka and AWS Kinesis) provide mechanisms to ensure ordering.

Shards and Partitions

For example, Amazon Kinesis ensures ordering of records, with the order of records being preserved within a shard (similar to Kafka's partition). When data is put into a Kinesis stream, a partition key for each record is specified, and Kinesis uses this key to determine which shard the record goes to.

A Kinesis Stream is composed of one or more shards. The total capacity of a stream is the sum of the capacities of its shards. As of the time of this paper, a Kinesis shard provides a capacity of 1 MB/sec of data input and 2 MB/sec of data output, and supports up to 1,000 record writes per second.

Each shard in a Kinesis Stream has a sequence of data records. Each data record has a sequence number that is assigned by Kinesis. Data records are ordered by their sequence number in a shard. Records with the same partition key go to the same shard, preserving their relative order.
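
As a sketch of this in code, the boto3 call below writes a record to a Kinesis stream with an explicit partition key; the stream name and key are hypothetical.

```python
# Minimal sketch: writing to Kinesis with an explicit partition key.
# Stream name and partition key are hypothetical.
import boto3

kinesis = boto3.client("kinesis")

response = kinesis.put_record(
    StreamName="sensor-data",
    Data=b'{"sensorId": "s-7", "reading": 42.0}',
    PartitionKey="s-7",  # records with the same key go to the same shard
)
# The sequence number is assigned by Kinesis and ordered within the shard.
print(response["SequenceNumber"])
```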

Ease of Use

Message streams can be more complex to set up and manage compared to message queues. They often require careful configuration to ensure performance and reliability, and consuming messages in order can be challenging with multiple consumers.

That said, when a system must handle high volumes of messages, support multiple consumers, and maintain the order of messages, a message stream is usually the more suitable choice despite the added operational complexity.

Cost

Pricing data for managed streams can vary greatly year by year, but the following are provided as a general comparison based on data available at the time of this writing.

Parameters:

  • 1,000,000 messages per day

  • 10 shards

AWS Kinesis Data Streams

This is a general overview using AWS Kinesis Data Streams cost data as of 2024.

The cost per shard-hour is $0.015, and the cost per 1 million PUT payload units is $0.014.

A PUT payload unit is counted in 25KB chunks. If messages are 25KB or less, each message is one PUT payload unit.

Here's a rough calculation:

  • Shard cost: 10 shards * 24 hours/day * 30 days/month * $0.015/shard-hour = $108/month

  • Data cost: If each message is 25KB or less, 1 million messages/day * 30 days/month * $0.014/million messages = $0.42/month

This would give a total cost of approximately $108.42/month.

Confluent Cloud

Confluent Cloud is a fully managed Kafka service, and the pricing is based on a few factors:

  • Storage: The amount of data you store in your topics.

  • Ingress: The amount of data you send into your Kafka cluster.

  • Egress: The amount of data you read from your Kafka cluster.

  • Data Retention: The length of time data is stored.

Assuming each message is 1KB, 1 million messages would be approximately 1GB of data per day, or 30GB per month.

Here's a rough calculation based on Confluent Cloud's pricing as of 2021:

  • Storage: $0.10 per GB-month * 30GB = $3.00

  • Ingress: $0.11 per GB * 30GB = $3.30

  • Egress: Assuming you read the data once, $0.11 per GB * 30GB = $3.30

So, the total cost would be approximately $9.60 per month for storage and data transfer.

Summary

Message Queues

  1. Processing: Messages in a queue are typically processed by a single consumer and then removed. Once a message is consumed, it's no longer available in the queue.

  2. Ordering: While some queue systems maintain the order of messages, it's not guaranteed in a distributed system with multiple consumers.

  3. Scalability: Queues can scale by adding more consumers, but this can lead to out-of-order processing.

  4. Use Cases: Queues are often used for task distribution and load balancing where order is not critical; however, the messages themselves are often business-critical.

Message Streams

  1. Processing: Messages in a stream can be consumed by multiple consumers, without being removed from the stream. This allows for replayability of messages.

  2. Ordering: Streams maintain the order of messages, which is critical for certain use cases like event sourcing, log processing, and real-time analytics.

  3. Scalability: Streams can scale while maintaining the order of messages, but typically require partitioning of data.

  4. Use Cases: Streams are often used for real-time data processing, event-driven architectures, and maintaining state changes in an application. Messages within a stream are typically not business-critical, so an occasional dropped or lost message is not critical.

References

  1. Amazon Web Services (2024, June 27). AWS Well-Architected Framework: Implement Loosely Coupled Dependencies. docs.aws.amazon.com. https://docs.aws.amazon.com/wellarchitected/latest/framework/rel_prevent_interaction_failure_loosely_coupled_system.html

  2. Pandio.com (2023, April 7). How Event-Driven Architectures Benefit from Stream Processing. https://pandio.com/event-streams-queues

  3. Villalba, M. (2022, October 29). Introduction to Event-Driven Architectures. Marcia Villalba. https://blog.marcia.dev/introduction-to-event-driven-architectures

  4. Bhaduri, K. (2023, February 19). Message Queue vs Streaming. Iron.io. https://blog.iron.io/message-queue-vs-streaming

  5. Cleary, S. (2021, January 14). Asynchronous Messaging Part 2: Durable Queues. Stephen Cleary (the Blog). https://blog.stephencleary.com/2021/01/asynchronous-messaging-2-durable-queues.html

  6. John, S. (2019, June 29). Kafka Producer Delivery Semantics. Medium.com. https://medium.com/@sdjemails/kafka-producer-delivery-semantics-be863c727d3f

  7. Macrometa.com (n.d.). What is Stream Processing? https://www.macrometa.com/topics/stream-processing

  8. Macrometa.com (n.d.). What is a Message Queue? https://www.macrometa.com/articles/message-queue

  9. AWS SQS Pricing (2024), https://aws.amazon.com/sqs/pricing

  10. AWS Kinesis Stream Pricing (2024), https://aws.amazon.com/kinesis/data-streams/pricing

  11. Confluent Cloud (Kafka) Pricing (2024), https://www.confluent.io/confluent-cloud/pricing