Distributed Systems
Definition
- Collection of autonomous computers that work together to perform a common task.
- System composition
- Building a larger, more complex system by combining smaller, simpler systems
- Composition makes a distributed system more modular and flexible, because individual components can be developed, deployed, and maintained independently.
- Loosely or tightly coupled, depending on the level of communication and coordination between the computers.
- Improve scalability, availability, and fault tolerance.
- Use a variety of communication protocols, such as TCP/IP, HTTP, and RPC.
- Offer different consistency models, including eventual consistency and strong consistency.
- Use replication to ensure data availability and to improve performance.
- Distributed systems can use load balancing techniques to distribute workloads across multiple computers.
- They can be deployed on-premises or in the cloud.
- Distributed systems can be managed using various technologies and tools, such as Kubernetes, Mesos, and Docker.
- They have become increasingly important as more and more applications are built to run on distributed infrastructure.
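The load balancing point in the list above can be illustrated with a minimal round-robin sketch. The `RoundRobinBalancer` class and the backend names are hypothetical, chosen only for illustration; production load balancers typically also track backend health and load.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out backends in a fixed rotation (hypothetical helper)."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        # cycle() repeats the backend list endlessly, in order.
        self._rotation = cycle(list(backends))

    def pick(self):
        # Return the next backend in the rotation.
        return next(self._rotation)


balancer = RoundRobinBalancer(["node-a", "node-b", "node-c"])
assignments = [balancer.pick() for _ in range(6)]
print(assignments)
```

Each backend receives every third request here; round-robin ignores how busy a backend is, which is why it is usually the simplest baseline rather than the final choice.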
CAP Theorem
- C: Consistency
- A: Availability
- P: Partition tolerance
The CAP theorem states that a distributed system cannot simultaneously provide all three of the following guarantees:
Consistency: All nodes in the system see the same data at the same time.
Availability: Every request to the system receives a response, without guarantee that it contains the most recent version of the data.
Partition tolerance: The system continues to function despite arbitrary partitioning due to network failures.
Different varieties of distributed systems can be characterized by the relative emphasis they place on these three guarantees.
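The trade-off can be made concrete with a toy model. The sketch below assumes a two-replica register with a hypothetical `mode` switch; it is not how any particular database is implemented, only an illustration of what each choice gives up once a partition occurs.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.value = None


class TwoNodeStore:
    """Toy two-replica register; mode is "CP" or "AP" (illustrative only)."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.replicas = [Replica("n1"), Replica("n2")]
        self.partitioned = False

    def write(self, replica_index, value):
        if self.partitioned:
            if self.mode == "CP":
                # Refuse the write: stay consistent, give up availability.
                raise RuntimeError("unavailable during partition")
            # AP: accept the write locally; the other replica is now stale.
            self.replicas[replica_index].value = value
            return
        # No partition: copy the value to every replica.
        for replica in self.replicas:
            replica.value = value

    def read(self, replica_index):
        # Simplification: reads are always served from the local replica.
        return self.replicas[replica_index].value


ap = TwoNodeStore("AP")
ap.write(0, "v1")
ap.partitioned = True
ap.write(0, "v2")
print(ap.read(0), ap.read(1))  # the replicas now disagree

cp = TwoNodeStore("CP")
cp.write(0, "v1")
cp.partitioned = True
try:
    cp.write(0, "v2")
except RuntimeError as exc:
    print("CP store rejected the write:", exc)
print(cp.read(0), cp.read(1))  # both replicas still agree
```

The AP store keeps answering but its replicas diverge, while the CP store keeps its replicas identical by refusing the write.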
Varieties
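CP systems: These systems prioritize Consistency and Partition tolerance, and may sacrifice Availability in certain scenarios. Examples include relational databases like MySQL when configured for synchronous replication, where a node that cannot reach its peers refuses writes rather than diverge.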
- Financial systems or systems that handle sensitive personal information. Applications that require strong data consistency across multiple nodes.
- Applications that need to continue functioning in the event of a network failure, and so must be able to operate in a partitioned network environment.
- Riak, a distributed key-value store, which allows for tunable consistency levels.
- Cassandra, a distributed NoSQL database, which provides tunable consistency and can survive network partitions.
- Couchbase, a document-oriented database, which also provides tunable consistency and can survive network partitions.
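Tunable consistency in stores like those listed above is commonly explained with quorums: with N replicas, a read that contacts R nodes is guaranteed to overlap a write acknowledged by W nodes when R + W > N. The sketch below is a toy in-memory model under that assumption; `quorums_overlap` and `QuorumRegister` are hypothetical names and do not correspond to the API of any of these databases.

```python
def quorums_overlap(n, r, w):
    """Return True when every read quorum intersects every write quorum."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    return r + w > n


class QuorumRegister:
    """Toy versioned register replicated on n in-memory nodes."""

    def __init__(self, n):
        self.nodes = [(0, None) for _ in range(n)]  # (version, value)

    def write(self, value, w):
        version = max(v for v, _ in self.nodes) + 1
        # Toy choice: update only the first w nodes, leaving the rest stale.
        for i in range(w):
            self.nodes[i] = (version, value)

    def read(self, r):
        # Worst case for the reader: contact the last r nodes.
        contacted = self.nodes[len(self.nodes) - r:]
        # Return the value carrying the highest version seen.
        return max(contacted, key=lambda pair: pair[0])[1]


reg = QuorumRegister(3)
reg.write("x=1", w=2)
print(quorums_overlap(3, r=2, w=2), reg.read(r=2))  # overlapping quorums
print(quorums_overlap(3, r=1, w=2), reg.read(r=1))  # non-overlapping quorums
```

With R = 2 the read set must include at least one node that saw the write; with R = 1 the reader can land on the one stale node, which is the consistency cost of choosing smaller quorums for lower latency.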
AP systems: These systems prioritize Availability and Partition tolerance, and may sacrifice Consistency in certain scenarios. Examples include NoSQL databases like Cassandra and Riak in their default, eventually consistent configurations.
- Online gaming or social media platforms.
- Applications that need to remain available and responsive to users, even in the event of network partitions or other failures.
- Real-time analytics or log processing systems.
- Applications that process large volumes of data and can tolerate some level of data inconsistency.
- Amazon DynamoDB, a managed NoSQL database service, which prioritizes high availability and partition tolerance over consistency.
- Amazon Simple Queue Service (SQS), a managed message queue service, which prioritizes high availability and partition tolerance over consistency.
- Apache Kafka, a distributed streaming platform, which prioritizes high availability and partition tolerance over consistency.
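Systems that favor availability usually reconcile diverged replicas after the partition heals. One simple strategy is last-write-wins merging, sketched below under the assumption that each replica stores a `(timestamp, value)` pair per key; `merge_lww` is a hypothetical helper and is not taken from any of the services listed above.

```python
def merge_lww(local, remote):
    """Merge two replica states; each maps key -> (timestamp, value)."""
    merged = dict(local)
    for key, (ts, value) in remote.items():
        # Keep the remote entry only if it is strictly newer.
        # Ties keep the local value; a real system needs a deterministic
        # tie-breaker so that both sides pick the same winner.
        if key not in merged or ts > merged[key][0]:
            merged[key] = (ts, value)
    return merged


replica_a = {"cart": (1, ["book"]), "theme": (5, "dark")}
replica_b = {"cart": (3, ["book", "pen"])}

state_a = merge_lww(replica_a, replica_b)
state_b = merge_lww(replica_b, replica_a)
print(state_a == state_b)
print(state_a["cart"])
```

Both replicas end up with the same state regardless of merge direction, which is the "eventual" part of eventual consistency. The cost is that the older cart write is silently discarded, so last-write-wins suits data where losing a concurrent update is acceptable.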
CA systems: These systems prioritize Consistency and Availability, and sacrifice Partition tolerance. Because network partitions cannot be ruled out in a real network, this combination mostly describes single-node or single-site deployments, such as a relational database running on one server.
- Applications that require high availability, such as e-commerce or gaming systems, which need to be accessible to users at all times.
- Applications that require strong data consistency across multiple nodes, such as systems that handle financial transactions or systems that process sensitive personal information.
- MongoDB, a document-oriented database, which supports strong consistency and automatic failover.
- Elasticsearch, a distributed search and analytics engine, which supports strong consistency and automatic failover.
- Redis, an in-memory data store, which supports strong consistency and automatic failover.
Real-world systems often aim to strike a balance between the guarantees, and the CAP theorem is not meant to be a strict rule, but more of a guideline to understand the trade-offs of different systems.