Choices: Events/Messaging

Event-Driven APIs

Webhooks

  • Webhooks are a mechanism for sending real-time event notifications from one application to another over HTTP.
  • A receiving application can register a URL to receive notifications when specific events occur in the sending application.
  • Wide range of use cases, such as triggering a notification when a new customer is added or when an order is shipped.
  • Benefits
    • Real-time event notifications, allowing for more immediate and responsive workflows.
    • Lightweight and simple to implement, requiring only a URL to be registered and a notification payload to be sent.
    • Highly customizable, allowing developers to specify which events to receive notifications for and how to process them.
    • Reliable way to receive notifications, as they are delivered over HTTP and can be retried if delivery fails.
  • Costs
    • The API provider is responsible for handling failures and retries.
    • Security measures like firewalls may block webhooks.
    • A burst of many events may overwhelm the receiver, causing deliveries to fail or requiring extra complexity such as queuing.
    • It may be difficult to replay events from a webhook if delivery fails.
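
The retry responsibility noted above can be sketched from the sender's side. This is a minimal illustration rather than any provider's actual delivery logic: the function names, the attempt limit, and the backoff schedule are all assumptions, and the `send` and `sleep` parameters exist only so the loop can be exercised without a network.

```python
import json
import time
import urllib.error
import urllib.request


def http_post(url: str, body: bytes) -> int:
    """POST a JSON body to the registered URL and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # urllib raises on 4xx/5xx; report the code instead


def deliver_with_retries(url, event, send=http_post, sleep=time.sleep,
                         max_attempts=5, base_delay=1.0):
    """Deliver one event, backing off exponentially between attempts.

    Returns the number of attempts used, or None if every attempt failed.
    The limits here are illustrative defaults, not a recommendation.
    """
    body = json.dumps(event).encode("utf-8")
    for attempt in range(1, max_attempts + 1):
        try:
            status = send(url, body)
        except OSError:
            status = None  # connection refused, timeout, DNS failure, ...
        if status is not None and 200 <= status < 300:
            return attempt
        if attempt < max_attempts:
            sleep(base_delay * 2 ** (attempt - 1))  # 1s, 2s, 4s, ...
    return None
```

A receiver that is down for longer than the whole retry window still loses the event, which is why replaying missed events is listed as a cost.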

Web Sockets

  • Web sockets provide a persistent, bi-directional communication channel between a client and server over a single TCP connection.
  • With web sockets, a client can send and receive real-time data without the need for repeated HTTP requests and responses.
  • Web sockets can be used for a wide range of use cases, such as chat applications or real-time data visualization.
  • Benefits:
    • Real-time, low-latency communication, allowing for highly responsive applications.
    • Single TCP connection, reducing the overhead and complexity of managing multiple connections.
    • Full-duplex communication, allowing data to be sent and received simultaneously.
    • Can scale to large numbers of concurrent connections, given server infrastructure built for long-lived connections.
  • Costs:
    • Client must maintain a persistent connection, which can be difficult to manage and less reliable on unstable networks.
    • Scaling web sockets can be difficult, as each client holds its own long-lived connection that the server must keep open.
    • The browser's same-origin policy does not apply to web sockets, so the server must check the Origin header itself to reject connections from untrusted sites.
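
A web socket connection starts as an ordinary HTTP request that is upgraded to the persistent channel described above. The sketch below shows the server's side of that opening handshake as defined in RFC 6455: the server proves it understood the upgrade by hashing the client's `Sec-WebSocket-Key` with a fixed GUID. The function names are illustrative, and a real server would also validate the other request headers before replying.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for computing Sec-WebSocket-Accept.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_key(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value for a client's handshake key."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def handshake_response(request_headers: dict) -> str:
    """Build the 101 response that switches the connection from HTTP to web sockets."""
    key = request_headers["Sec-WebSocket-Key"]
    return (
        "HTTP/1.1 101 Switching Protocols\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Accept: {accept_key(key)}\r\n"
        "\r\n"
    )


# Sample key taken from the handshake example in RFC 6455.
print(handshake_response({"Sec-WebSocket-Key": "dGhlIHNhbXBsZSBub25jZQ=="}))
```

After this response, both sides stop speaking HTTP and exchange web socket frames over the same TCP connection.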

HTTP Streaming

  • HTTP streaming is a mechanism for sending real-time data from a server to a client over HTTP.
  • With HTTP streaming, a server can send data to a client in real-time without the need for repeated HTTP requests and responses.
  • HTTP streaming can be used for a wide range of use cases, such as real-time data feeds or stock tickers.
  • Chunked transfer encoding is a mechanism for sending data in chunks over HTTP.
  • Server-sent events (SSEs) are a mechanism for sending real-time data from a server to a client over HTTP.
  • Used by Twitter, for example, for messaging updates.
  • Benefits:
    • Provides real-time data delivery, allowing for more immediate and responsive workflows.
    • Single HTTP connection, reducing the overhead and complexity of managing multiple connections.
    • One-way, server-to-client delivery over plain HTTP, which works with existing HTTP infrastructure.
    • Highly scalable, as it can handle large numbers of concurrent connections.
  • Costs
    • The stream itself is not bidirectional: the client must send data through separate HTTP requests, and the long-lived connection can be difficult to manage or less reliable.
    • Buffering can cause delays in data delivery.
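
The two mechanisms named above, chunked transfer encoding and server-sent events, can be shown as plain byte framing. This is a sketch of the wire formats only, with illustrative function names; a real server would also send `Transfer-Encoding: chunked` and `Content-Type: text/event-stream` response headers, and HTTP/2 frames data differently.

```python
from typing import Iterable, Iterator, Optional


def encode_chunk(data: bytes) -> bytes:
    """Frame one HTTP/1.1 chunk: payload size in hex, CRLF, payload, CRLF."""
    return f"{len(data):X}".encode("ascii") + b"\r\n" + data + b"\r\n"


# A zero-length chunk tells the client the stream is finished.
LAST_CHUNK = b"0\r\n\r\n"


def format_sse(data: str, event: Optional[str] = None,
               event_id: Optional[str] = None) -> str:
    """Format one server-sent event; a blank line terminates the event."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    for line in data.split("\n"):
        lines.append(f"data: {line}")  # multi-line data repeats the field
    return "\n".join(lines) + "\n\n"


def stream_body(messages: Iterable[str]) -> Iterator[bytes]:
    """Yield each message as an SSE event wrapped in its own chunk."""
    for message in messages:
        yield encode_chunk(format_sse(message).encode("utf-8"))
    yield LAST_CHUNK


for piece in stream_body(["AAPL 191.20", "AAPL 191.35"]):
    print(piece)
```

Because each event is flushed as its own chunk, the client can act on it immediately, unless an intermediary buffers the response, which is the delay noted under costs.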

Service Choices: Events/Messaging

Current popular choices include:

  • Apache Kafka: highly scalable event streaming and messaging.
  • AWS SQS: basic message queue.
  • AWS SNS: AWS pub/sub.
  • AWS EventBridge: larger-scale, rule-based pub/sub event bus.

Message Brokers, Reasons to use

  • Organizing Data

    • Message brokers can be used to organize data from different sources and provide a single access point for data.
    • For example, a message broker can be used to compile data from webhooks and APIs and make it available to applications in a single place.
  • Reliable Messaging

    • Message brokers support reliable messaging: depending on the broker and its configuration, they can persist messages, preserve ordering, and redeliver a message until a consumer acknowledges it.
    • For example, a message broker can be used to ensure that messages sent between two applications are received, processed, and responded to.
  • Load Balancing

    • Message brokers can be used to balance the load of messages across multiple applications.
    • For example, a message broker can be used to distribute messages to multiple applications for processing, ensuring that no single application is overwhelmed by the message load.
  • Integrating Applications

    • Message brokers can be used to integrate applications by providing a central hub for communication.
    • For example, a message broker can be used to enable communication between two different applications, providing a single access point for sending and receiving messages.
  • Scalability

    • Message brokers provide scalability by allowing applications to scale up and down as needed.
    • For example, a message broker can be used to add additional applications to a system as needed, ensuring that the system is able to handle increased message load.
  • High Availability

    • Message brokers provide high availability by ensuring that messages are sent and received even when applications fail.
    • For example, a message broker can be used to provide redundancy and failover for applications, ensuring that messages are delivered even if one application fails.
  • Data Transformation

    • Message brokers can be used to transform data from one format to another.
    • For example, a message broker can be used to convert XML data into JSON data, or vice versa.
  • Message Filtering

    • Message brokers can be used to filter messages based on certain criteria.
    • For example, a message broker can be used to filter out messages that contain sensitive information, ensuring that they are not sent to the wrong recipients.
  • Message Routing

    • Message brokers can be used to route messages based on certain criteria.
    • For example, a message broker can be used to route messages to the appropriate application based on the message content.
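
Three of the reasons above (routing, filtering, and load balancing) can be sketched with a toy in-memory broker. The `MiniBroker` class and its method names are hypothetical and exist only for illustration; a real broker adds persistence, acknowledgements, and network transport.

```python
from collections import defaultdict


class MiniBroker:
    """Toy broker: routes by predicate, balances each queue across its workers."""

    def __init__(self):
        self._workers = defaultdict(list)  # queue name -> worker callables
        self._turn = defaultdict(int)      # queue name -> deliveries so far
        self._routes = []                  # (predicate, queue name) pairs

    def add_worker(self, queue, worker):
        self._workers[queue].append(worker)

    def add_route(self, predicate, queue):
        """Messages for which predicate(message) is true are sent to this queue."""
        self._routes.append((predicate, queue))

    def publish(self, message: dict) -> int:
        """Deliver to one worker per matching queue; return the delivery count."""
        delivered = 0
        for predicate, queue in self._routes:
            workers = self._workers[queue]
            if not workers or not predicate(message):
                continue  # filtered out, or nobody is listening
            index = self._turn[queue] % len(workers)  # round-robin choice
            self._turn[queue] += 1
            workers[index](message)
            delivered += 1
        return delivered


broker = MiniBroker()
worker_a, worker_b, audit = [], [], []
broker.add_worker("orders", worker_a.append)
broker.add_worker("orders", worker_b.append)
broker.add_worker("audit", audit.append)
broker.add_route(lambda m: m["type"].startswith("order."), "orders")
broker.add_route(lambda m: m.get("amount", 0) >= 1000, "audit")

for order_id, amount in enumerate([10, 5000, 20, 30]):
    broker.publish({"type": "order.created", "id": order_id, "amount": amount})

print([m["id"] for m in worker_a])  # orders alternate between the two workers
print([m["id"] for m in audit])     # only the large order reaches the audit queue
```

The producer never names a consumer: adding a third worker to the "orders" queue changes the load split without touching the publishing code.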

Top Messaging software

  1. Apache Kafka: An open-source distributed streaming platform that enables real-time data processing, helps in building real-time data pipelines, and allows for the creation of event-driven applications.

  2. RabbitMQ: An open-source message broker software that implements the Advanced Message Queuing Protocol (AMQP) and supports multiple messaging protocols.

  3. Apache Pulsar: An open-source distributed publish-subscribe messaging system that provides strong durability guarantees and supports multiple programming languages.

  4. Apache ActiveMQ: An open-source message broker written in Java that supports a wide range of messaging protocols and is highly configurable.

  5. Google Cloud Pub/Sub: A globally distributed messaging system by Google Cloud that enables real-time communication between independent applications.

  6. Microsoft Azure Event Grid: An event routing service in Microsoft Azure that enables event-driven architecture by delivering events to specific handlers.

  7. Amazon Simple Notification Service (SNS): A highly scalable and flexible pub/sub messaging service provided by Amazon Web Services (AWS).

  8. Apache RocketMQ: An open-source distributed messaging and streaming platform that provides low latency, high throughput, and reliability for large-scale applications.

  9. Apache Storm: An open-source distributed real-time computational system that processes large amounts of data in parallel.

  10. Apache Flink: An open-source stream processing framework that can handle both batch and real-time data processing tasks.

Event Bus, reasons to use

  • Organize events

    • An event bus allows you to organize events that can be triggered and subscribed to across your application. This helps to reduce code complexity and improve maintainability.
  • Centralize communication

    • An event bus helps to centralize communication between different parts of your application and allows for simpler implementations of complex functionality.
  • Decoupling

    • Event buses allow for decoupling of components and services which can help to improve scalability and maintainability of the application.
  • Event-driven architecture

    • An event bus can be used to implement an event-driven architecture that allows for better scalability and performance.
  • Easy debugging

    • By centralizing the communication of events within an application, debugging is simplified as errors can be traced back to the source of the event.
  • Improved flexibility

    • The decoupling of components and services allows for easier changes and improvements to be made without affecting the functionality of the application.
  • Asynchronous processing

    • An event bus can be used to process events asynchronously, allowing for improved performance and scalability.
  • Simplified testing

    • By using an event bus, testing can be simplified as it allows for easy mocking and stubbing of events.
  • Real-time updates

    • An event bus can be used to provide real-time updates to different parts of the application. This is especially useful in applications that require up-to-date data.
  • Event logging

    • An event bus can be used to log the events that occur within an application, allowing for better tracking and analysis of the application's behavior.
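
Several of the points above (decoupling, event logging, and easy stubbing in tests) show up even in a very small in-process event bus. The sketch below is an assumption-laden illustration, not a specific library's API: the `EventBus` class and the event names are made up, and dispatch is synchronous for simplicity, whereas a production bus would typically hand events to handlers asynchronously.

```python
from collections import defaultdict
from typing import Any, Callable


class EventBus:
    """Minimal in-process event bus with a log of everything published."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # event name -> handlers
        self.log = []                          # (event name, payload) history

    def subscribe(self, event_name: str, handler: Callable[[Any], None]) -> None:
        self._subscribers[event_name].append(handler)

    def publish(self, event_name: str, payload: Any) -> None:
        """Record the event, then call every handler subscribed to it."""
        self.log.append((event_name, payload))
        for handler in list(self._subscribers.get(event_name, [])):
            handler(payload)


bus = EventBus()
emails, stock = [], []

# Two independent components react to the same event without knowing
# about each other or about the publisher.
bus.subscribe("order.shipped", emails.append)
bus.subscribe("order.shipped", stock.append)

bus.publish("order.shipped", {"order_id": 42})
bus.publish("customer.added", {"customer_id": 7})  # no subscribers, still logged

print(emails, stock)
print(bus.log)
```

In a test, the list-appending handlers used here already act as stubs: the test publishes an event and asserts on what each list received.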

Kafka

Differences between Kafka and SQS

SQS:

  • A managed message queue service provided by Amazon Web Services (AWS).
  • Provides a way to send, store, and receive messages between applications.
  • Designed to be a highly available and scalable solution for exchanging messages between applications.
  • Supports both standard and FIFO (First-In-First-Out) queues; only FIFO queues guarantee ordering.
  • A good choice for small-scale, simple messaging scenarios.

Kafka:

  • An open-source stream processing platform.
  • Provides a publish-subscribe messaging system.
  • Unlike SQS, which is focused on reliably delivering individual messages to a consumer, Kafka is designed for high-speed, high-throughput processing of real-time data streams.
  • Provides a distributed architecture that allows for parallel processing of streams, making it well suited for large-scale data processing tasks.
  • A better choice for high-speed, large-scale data processing and real-time data streams.
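
The core difference can be sketched as two data structures: a queue hands each message to one consumer and then forgets it, while a log keeps every record and lets each consumer group track its own position. Both classes below are simplified models with hypothetical names, not the SQS or Kafka client APIs. In particular, real SQS removes a message only after the consumer explicitly deletes it, and a real Kafka topic is split into partitions.

```python
from collections import deque


class WorkQueue:
    """Queue semantics: a message is gone once a consumer has taken it."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        return self._messages.popleft() if self._messages else None


class EventLog:
    """Log semantics: records stay put; each consumer group keeps an offset."""

    def __init__(self):
        self._records = []
        self._offsets = {}  # group name -> index of next record to read

    def append(self, record) -> int:
        self._records.append(record)
        return len(self._records) - 1  # offset of the new record

    def poll(self, group, max_records=10):
        start = self._offsets.get(group, 0)
        batch = self._records[start:start + max_records]
        self._offsets[group] = start + len(batch)
        return batch

    def seek(self, group, offset):
        """Rewind (or skip ahead) so the group re-reads from the given offset."""
        self._offsets[group] = offset


log = EventLog()
for record in ["a", "b", "c"]:
    log.append(record)

print(log.poll("billing"))    # billing reads everything
print(log.poll("analytics"))  # analytics independently reads everything too
log.seek("billing", 1)
print(log.poll("billing"))    # replay from offset 1
```

Replay falls out of the log model for free, while a queue would need the producer to send the messages again.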

Kafka Docs: https://kafka.apache.org/documentation/

"Kafka combines three key capabilities so you can implement your use cases for event streaming end-to-end with a single battle-tested solution:

  • To publish (write) and subscribe to (read) streams of events, including continuous import/export of your data from other systems.
  • To store streams of events durably and reliably for as long as you want.
  • To process streams of events as they occur or retrospectively."

"What can I use event streaming for? Event streaming is applied to a wide variety of use cases across a plethora of industries and organizations.

Its many examples include:

  • To process payments and financial transactions in real-time, such as in stock exchanges, banks, and insurances.
  • To track and monitor cars, trucks, fleets, and shipments in real-time, such as in logistics and the automotive industry.
  • To continuously capture and analyze sensor data from IoT devices or other equipment, such as in factories and wind parks.
  • To collect and immediately react to customer interactions and orders, such as in retail, the hotel and travel industry, and mobile applications.
  • To monitor patients in hospital care and predict changes in condition to ensure timely treatment in emergencies.
  • To connect, store, and make available data produced by different divisions of a company.
  • To serve as the foundation for data platforms, event-driven architectures, and microservices."

According to the Apache Kafka docs, Kafka is used by companies in a variety of industries, including:

  • Grab deploys, maintains, operates, and expands on Kafka's capabilities to support TB/hour scale, mission critical event logs, event sourcing and stream processing architectures.
  • Pinterest uses Apache Kafka and the Kafka Streams API at large scale to power the real-time, predictive budgeting system of their advertising infrastructure
  • Airbnb event pipeline, exception tracking & more to come.
  • Cloudflare uses Kafka for their log processing and analytics pipeline, collecting hundreds of billions of events/day and data from thousands of servers.
  • At Datadog, Kafka brokers data to most systems in the metrics and events ingestion pipeline.
  • Hotels.com uses Kafka as pipeline to collect real time events from multiple sources and for sending data to HDFS
  • LinkedIn for activity stream data and operational metrics. This powers various products like LinkedIn Newsfeed, LinkedIn Today in addition to offline analytics systems like Hadoop.
  • LucidWorks Search (Solr) with incoming data from Hadoop and also for sending LucidWorks Search logs
  • Shopify Access logs, A/B testing events, domain events ("a checkout happened", etc.), metrics, delivery to HDFS, and customer reporting
  • Twitter as part of Storm stream processing infrastructure

Differences between Kafka and AWS EventBridge

Similarities

  • Both AWS EventBridge and Apache Kafka can be used as event brokers that allow different applications to communicate with each other by sending and receiving events.
  • Both systems provide mechanisms for producers to publish events and consumers to receive and process them.
  • Event Streaming: both can handle event streaming, filtering, and transformations. However, EventBridge is not optimized for high-throughput data streaming like Apache Kafka, so it may not be the best choice for applications that require extremely high throughput, low latency, and parallel processing of large volumes of data.

Architecture

  • AWS EventBridge is a fully managed service offered by Amazon Web Services
  • Apache Kafka is an open-source solution that can be deployed on-premises or in the cloud.
  • With AWS EventBridge, you don't have to worry about managing the underlying infrastructure, whereas with Apache Kafka you need to manage the cluster yourself.

Features

  • AWS EventBridge provides a simple and flexible event-driven architecture that makes it easy to build event-driven applications. It has built-in support for filtering and transforming events, as well as a variety of integrations with other AWS services.
  • Apache Kafka is known for its very high throughput, scalability, and reliability.
  • Kafka provides more advanced features such as parallel processing, event replay, and partitioning.
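
Partitioning is what makes the parallel processing mentioned above possible: records with the same key always land in the same partition, so per-key order is preserved while different partitions are consumed in parallel. The sketch below illustrates the idea with CRC32 as a stand-in hash; this is an assumption for illustration only, since Kafka's own default partitioner uses a different hash function, and the function names are hypothetical.

```python
import zlib


def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a record key to a partition. CRC32 is a stand-in hash here."""
    return zlib.crc32(key) % num_partitions


def assign_to_partitions(events, num_partitions):
    """Distribute (key, value) events across partitions by key."""
    partitions = [[] for _ in range(num_partitions)]
    for key, value in events:
        index = partition_for(key.encode("utf-8"), num_partitions)
        partitions[index].append((key, value))
    return partitions


events = [
    ("user-1", "login"), ("user-2", "login"), ("user-1", "view"),
    ("user-3", "login"), ("user-1", "logout"), ("user-2", "logout"),
]
for index, partition in enumerate(assign_to_partitions(events, 3)):
    print(index, partition)
```

Each partition can be handed to a different consumer, and every consumer still sees a given user's events in the order they were produced.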

Use cases

  • AWS EventBridge is designed for event-driven architectures, where events from different sources can be sent to different targets. It's ideal for use cases such as cloud-native application development, IoT, and serverless architectures.
  • Apache Kafka, on the other hand, is designed for large-scale data streaming and is often used in big data and real-time analytics applications.
  • If implementing specifically with AWS services, AWS EventBridge is a good choice. If you need to integrate with other services, Apache Kafka may be a better choice.
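
Sending events from different sources to different targets comes down to rules that match on event content. The sketch below is a heavily simplified model of that kind of rule matching: a pattern lists the allowed values for each field, and nested objects are matched recursively. It covers only exact-value matching; the rule names, the `myapp.orders` source, and the helper functions are hypothetical, and the real EventBridge pattern language supports more operators than shown here.

```python
def matches(pattern: dict, event: dict) -> bool:
    """Return True if every field in the pattern matches the event.

    A pattern leaf is a list of acceptable values; a nested dict is
    matched against the corresponding nested object in the event.
    """
    for field, expected in pattern.items():
        if field not in event:
            return False
        actual = event[field]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


def targets_for(event: dict, rules) -> list:
    """Return the names of all rules whose pattern matches the event."""
    return [name for name, pattern in rules if matches(pattern, event)]


rules = [
    ("notify-shipping", {"source": ["myapp.orders"], "detail-type": ["OrderShipped"]}),
    ("flag-priority", {"source": ["myapp.orders"], "detail": {"tier": ["gold", "platinum"]}}),
]

event = {
    "source": "myapp.orders",
    "detail-type": "OrderShipped",
    "detail": {"order_id": 1, "tier": "gold"},
}
print(targets_for(event, rules))
```

One event can match several rules, so the same order event reaches both targets without the producer knowing either exists.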

Deployment

  • AWS EventBridge is a fully managed service, which means that it is deployed and maintained by AWS. You do not need to worry about provisioning, scaling, or managing the infrastructure.
  • Apache Kafka requires you to set up, manage, and maintain the infrastructure yourself, either on-premises or in the cloud.

Cost

  • The cost of using AWS EventBridge depends on the number of events and the number of targets that you have. There is no upfront cost, and you only pay for what you use. With Apache Kafka, the cost will depend on the resources required to run your cluster and the number of nodes you need. It may also require additional costs for monitoring, management, and maintenance.

Integration

  • AWS EventBridge provides built-in integration with a variety of AWS services, such as Lambda, S3, and CloudWatch, which makes it easier to integrate with other AWS services. Apache Kafka requires additional configuration and setup to integrate with other services.

Latency

  • AWS EventBridge provides low latency event delivery, which makes it suitable for real-time applications.
  • Apache Kafka is known for its high-throughput and low-latency capabilities, but the actual latency will depend on the size of your cluster and the number of nodes.

Monitoring and management

  • AWS EventBridge provides a centralized management console and monitoring capabilities that make it easy to manage and monitor your events.
  • With Apache Kafka, you need to set up your own monitoring and management tools, or use third-party tools, to monitor and manage your cluster.

Security

  • AWS EventBridge provides built-in security features, such as encryption at rest and in transit, as well as the ability to define fine-grained access controls. With Apache Kafka, you need to implement security measures, such as encryption and access controls, yourself.
  • EventBridge provides fine-grained access controls through AWS IAM (Identity and Access Management), which allows you to define who can access and manage your events.
  • EventBridge provides monitoring and logging capabilities through AWS CloudTrail and CloudWatch, which makes it easy to audit and monitor access to your events.
  • EventBridge is compliant with a variety of security and privacy standards, including SOC 2, PCI DSS, and ISO 27001, which helps ensure that your data is handled securely and appropriately.

AWS Choices: Events/Messaging

  • Amazon Simple Queue Service (SQS): A fully managed message queuing service that enables you to send, store, and receive messages between services.

  • Amazon Simple Notification Service (SNS): A fully managed messaging service that enables you to send push notifications, SMS messages, and emails.

  • Amazon Kinesis: A fully managed, real-time data streaming service that enables you to collect, process, and analyze streaming data in real-time.

  • AWS EventBridge: A serverless event bus that makes it easy to connect applications together using data from your own apps, SaaS apps, and AWS services.

  • AWS AppSync: A fully managed service that makes it easy to develop GraphQL APIs by handling the heavy lifting of securely connecting to data sources.

  • Amazon MQ: A managed message broker service for Apache ActiveMQ and RabbitMQ that makes it easy to set up and operate message brokers in the cloud.

  • Amazon Managed Streaming for Apache Kafka (MSK): A fully managed service for Apache Kafka that makes it easy to build and run highly available, secure, and scalable Kafka clusters.

Azure Choices: Events/Messaging

  • Azure Service Bus: A fully managed messaging service that enables you to send and receive messages between services, and decouple applications.

  • Azure Event Grid: A fully managed event routing service that enables you to easily build event-driven architectures by reacting to events from Azure services or third-party services.

  • Azure Event Hubs: A fully managed, real-time data streaming service that enables you to collect, store, and process large amounts of data from various sources.

  • Azure Notification Hubs: A fully managed service that enables you to send push notifications to any platform from any backend.

  • Azure Relay: A fully managed service that enables you to securely expose services that run in your corporate network to the public cloud.

  • Azure Queue Storage: A fully managed, cloud-based message queuing service that enables you to reliably send and receive messages between services.

  • Azure SignalR Service: A fully managed service that allows you to add real-time web functionality to your applications.

  • Azure Stream Analytics: A fully managed service that enables you to process and analyze streaming data in real-time.

GCP Choices: Events/Messaging

  • Cloud Pub/Sub
    • A fully-managed real-time messaging service that allows applications to exchange messages reliably and securely.
  • Cloud Storage
    • Store and process data in an object storage system.
  • Cloud Dataflow
    • A unified programming model and managed service for developing and executing a wide range of data processing patterns.
  • Cloud Functions
    • A serverless environment to run event-driven code.
  • Cloud Spanner
    • A fully-managed, highly-scalable, relational database service.
  • Cloud Bigtable
    • A NoSQL database service for large analytical and operational workloads.