Master 2nd year: Systèmes et Communication
2006-2007

Algorithms and Techniques for Distributed Systems

Sacha Krakowiak

This 24-hour course is an introduction to the fundamentals of distributed computing systems. Here is an outline:

concepts and methods related to event ordering and global state definition in a distributed system,
survey of the main classes of distributed algorithms, illustrated by the description of a few basic algorithms (termination, election, etc.),
information consistency and reliable broadcast protocols,
problems of distributed consensus and atomic commitment,
distributed object management

The emphasis is on basic principles; methods and algorithms are illustrated with examples from actual systems.

Time and State in Distributed Systems

Introduction to the problems of distributed systems: asynchrony, unreliability, scale
Causality and event ordering in an asynchronous distributed system
Global states and consistent cuts

Logical clocks and applications: distributed mutual exclusion, distributed queues
Vector clocks and applications: observation, debugging
Synchronisation of physical clocks

Fault hypotheses
Specifying consistency: linearisability, sequential consistency, causal consistency
Primary copy and active redundancy
Fault-tolerant broadcasts and process group management
Applications: reliable servers, multiple copies, distributed shared memory

Case studies: P2P systems, distributed files

R. Guerraoui, L. Rodrigues, Introduction to Reliable Distributed Programming, Springer, 2006
A. S. Tanenbaum, M. van Steen, Distributed Systems: Principles and Paradigms, Prentice Hall, 2002
S. Mullender (editor), Distributed Systems, 2nd ed., Addison-Wesley, 1993
M. Singhal, N. G. Shivaratri, Advanced Concepts in Operating Systems, McGraw-Hill, 1994
V. C. Barbosa, Introduction to Distributed Algorithms, MIT Press, 1996