Distributed Systems

Definition

  • Collection of autonomous computers that work together to perform a common task.
  • System composition
    • Building a larger, more complex system by combining smaller, simpler systems
    • Composition makes a distributed system more modular and flexible, because individual components can be developed, deployed, and maintained independently.
  • Can be loosely or tightly coupled, depending on the level of communication and coordination between the computers.
  • Aim to improve scalability, availability, and fault tolerance.
  • Communicate using a variety of protocols, such as TCP/IP, HTTP, and RPC.
  • Offer different consistency models, including eventual consistency and strong consistency.
  • Use replication to ensure data availability and to improve performance.
  • Distributed systems can use load balancing techniques to distribute workloads across multiple computers.
  • They can be deployed on-premises or in the cloud.
  • Distributed systems can be managed using various technologies and tools, such as Kubernetes, Mesos, and Docker.
  • They have become increasingly important as more and more applications are built to run on distributed infrastructure.
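
The load balancing mentioned in the list above can be illustrated with a minimal round-robin sketch. Round-robin is only one of several possible strategies, and the class name `RoundRobinBalancer` and the node names are hypothetical, chosen for illustration.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out backends in a fixed rotating order (illustrative only)."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends = list(backends)
        self._rotation = cycle(self._backends)

    def pick(self):
        """Return the next backend in the rotation."""
        return next(self._rotation)


# Hypothetical node names; six requests are spread evenly over three nodes.
balancer = RoundRobinBalancer(["node-a", "node-b", "node-c"])
assignments = [balancer.pick() for _ in range(6)]
print(assignments)
```

Each node receives every third request. A production balancer would typically also account for node health and current load, which this sketch ignores.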

CAP Theorem

  • C: Consistency
  • A: Availability
  • P: Partition tolerance

The CAP (Consistency, Availability, and Partition tolerance) theorem states that a distributed system cannot simultaneously provide all three of the following guarantees:

  • Consistency: All nodes in the system see the same data at the same time.

  • Availability: Every request to the system receives a response, without guarantee that it contains the most recent version of the data.

  • Partition tolerance: The system continues to function despite arbitrary partitioning due to network failures.

Different varieties of distributed systems can be characterized by the relative emphasis they place on these three guarantees.
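
The tension between the guarantees is easiest to see when the network splits. The toy model below is a sketch under simplifying assumptions (two replicas, one link, and the hypothetical names `Replica`, `make_pair`, and `partition`). It does not describe how any particular database is implemented. In "CP" mode a replica refuses requests it cannot coordinate with its peer; in "AP" mode it answers anyway and the replicas may diverge.

```python
class Unavailable(Exception):
    """Raised when a CP replica refuses a request during a partition."""


class Replica:
    def __init__(self, mode):
        self.mode = mode          # "CP" or "AP" (labels for this sketch only)
        self.value = None
        self.peer = None
        self.partitioned = False  # True when the link to the peer is down

    def write(self, value):
        if self.partitioned:
            if self.mode == "CP":
                # Consistency first: refuse rather than diverge from the peer.
                raise Unavailable("cannot reach peer")
            # Availability first: accept locally; the peer is now stale.
            self.value = value
            return
        # Healthy network: apply the write on both replicas.
        self.value = value
        self.peer.value = value

    def read(self):
        if self.partitioned and self.mode == "CP":
            raise Unavailable("cannot reach peer")
        return self.value


def make_pair(mode):
    """Create two replicas of the same mode that know about each other."""
    a, b = Replica(mode), Replica(mode)
    a.peer, b.peer = b, a
    return a, b


def partition(a, b):
    """Simulate a network failure between the two replicas."""
    a.partitioned = b.partitioned = True


# AP pair: stays available, but the replicas disagree after the partition.
ap_a, ap_b = make_pair("AP")
ap_a.write("v1")
partition(ap_a, ap_b)
ap_a.write("v2")
print(ap_a.read(), ap_b.read())

# CP pair: stays consistent, but rejects the write during the partition.
cp_a, cp_b = make_pair("CP")
cp_a.write("v1")
partition(cp_a, cp_b)
try:
    cp_a.write("v2")
except Unavailable:
    print("CP write rejected during partition")
```

Neither pair can be both consistent and available once the link is down, which is the choice the theorem describes.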

Varieties

  1. CP systems: These systems prioritize Consistency and Partition tolerance, and may sacrifice Availability in certain scenarios. Examples include traditional relational databases like MySQL.

    • Applications that require strong data consistency across multiple nodes, such as financial systems or systems that handle sensitive personal information.
    • Applications that must continue functioning in the event of a network failure and be able to operate in a partitioned network environment.
    • Riak, a distributed key-value store, which allows for tunable consistency levels.
    • Cassandra, a distributed NoSQL database, which provides tunable consistency and can survive network partitions.
    • Couchbase, a document-oriented database, which also provides tunable consistency and can survive network partitions.
  2. AP systems: These systems prioritize Availability and Partition tolerance, and may sacrifice Consistency in certain scenarios. Examples include NoSQL databases like Cassandra and MongoDB.

    • Applications that need to remain available and responsive to users, even in the event of network partitions or other failures, such as online gaming or social media platforms.
    • Applications that process large volumes of data and can tolerate some level of data inconsistency, such as real-time analytics or log processing systems.
    • Amazon DynamoDB, a managed NoSQL database service, which prioritizes high availability and partition tolerance over consistency.
    • Amazon Simple Queue Service (SQS), a managed message queue service, which prioritizes high availability and partition tolerance over consistency.
    • Apache Kafka, a distributed streaming platform, which prioritizes high availability and partition tolerance over consistency.
  3. CA systems: These systems prioritize Consistency and Availability, and may sacrifice Partition tolerance in certain scenarios. Examples include systems that use quorums or multi-primary replication.

    • Applications that require high availability, such as e-commerce or gaming systems, which need to be accessible to users at all times.
    • Applications that require strong data consistency across multiple nodes, such as systems that handle financial transactions or systems that process sensitive personal information.
    • MongoDB, a document-oriented database, which supports strong consistency and automatic failover.
    • Elasticsearch, a distributed search and analytics engine, which supports strong consistency and automatic failover.
    • Redis, an in-memory data store, which supports strong consistency and automatic failover.
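
Several of the stores listed above are described as offering tunable consistency. One common way to reason about this is quorum replication: with N replicas, a write waits for W acknowledgements and a read consults R replicas, and every read set overlaps every write set when R + W > N. This framing is an assumption added here and is not spelled out in the notes. The sketch below states the rule and checks it by brute force; both function names are hypothetical.

```python
from itertools import combinations


def quorums_overlap(n, r, w):
    """Rule of thumb: read and write quorums intersect when r + w > n."""
    return r + w > n


def overlap_by_enumeration(n, r, w):
    """Brute-force check: does every r-subset share a node with every w-subset?"""
    nodes = range(n)
    return all(
        set(read_set) & set(write_set)
        for read_set in combinations(nodes, r)
        for write_set in combinations(nodes, w)
    )


# With three replicas, compare a few (R, W) settings.
for r, w in [(1, 1), (2, 2), (1, 3), (3, 1)]:
    print(f"N=3 R={r} W={w}: overlap={quorums_overlap(3, r, w)}")
```

Smaller R and W generally mean lower latency and higher availability, at the cost of reads that may miss the latest write. Larger values shift the balance toward consistency.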

Real-world systems often aim to strike a balance between these guarantees. In practice, the CAP theorem serves less as a strict rule than as a guideline for understanding the trade-offs different systems make.