Overview

Title: Distributed Systems

Units: 12

Pre-requisites: A grade of "C" or better in 15-213, Introduction to Computer Systems

Lectures: Monday and Wednesday, 2:30 - 3:50 PM, Room 2049

Recitation: Thursday, Time: TBA, Room: TBA

Webpage: http://www.qatar.cmu.edu/~msakr/15440-f12/

Description

15-440 is an introductory course in distributed systems. The emphasis will be on the techniques for creating functional, usable, and high-performance distributed systems. To make the issues more concrete, the class includes several multi-week projects requiring significant design and implementation.

The goals of this course are twofold: First, students will gain an understanding of the principles and techniques behind the design of distributed systems, such as locking, concurrency, scheduling, and communication across networks. Second, students will gain practical experience in designing, implementing, and debugging real distributed systems.

The major themes this course will teach include process distribution and communication, data distribution, scheduling, concurrency, resource sharing, synchronization, naming, abstraction and modularity, failure handling, protection from accidental and malicious harm, distributed programming models, distributed file systems, virtualization, and the use of instrumentation, monitoring and debugging tools in problem solving. As the creation and management of software systems is a fundamental goal of any undergraduate systems course, students will design, implement, and debug large programming projects. Students will learn the design and implementation of today's popular distributed system paradigms, such as Google File System and MapReduce.

Instructors

Prof. Majd F. Sakr
msakr@qatar.cmu.edu, CMUQ 1016, 4454-8625.
Office hours: Tue, 3-4pm

Dr. Mohammad Hammoud
mhhammou@qatar.cmu.edu, CMUQ 1013, 4454-8506.
Office hours: Thu, 11am-12pm

Objectives

Distributed systems combine the computational power of multiple computers to solve complex problems. The individual computers in a distributed system are typically spread over wide geographies and have heterogeneous processor and operating system architectures. Hence, an important challenge in distributed systems is to design system models, algorithms and protocols that allow computers to communicate and coordinate their actions to solve a problem.

Our aim in this course is to introduce you to the area of distributed systems. You will examine and analyze how a set of connected computers can form a functional, usable and high-performance distributed system.

The course has three goals:

Through these objectives, the course will transform your computational thinking from designing applications for a single computer system, towards that of distributed systems.

Learning Outcomes:

Check out the complete tree of learning outcomes for this course.

The primary learning outcome of the course is two-fold:

Understanding the Core Concepts of Distributed Systems

Students will learn the core concepts underlying distributed systems designs. They will understand the system constraints, trade-offs and techniques in distributed systems to best serve the computing needs for different types of data and applications. Specifically, they will learn the following concepts:

Access and location transparency

Hiding the details of individual machines while exposing their capabilities is one of the first steps in designing distributed systems that scale and are adopted widely. For example, in the Internet, which is a successful distributed system, a simple browser interface allows you to explore information scattered over wide geographies. In this course, students will examine how to abstract the location, replication, sharing and failure of resources that may reside at different physical places.

Students will learn the following topics:

Parallelization of tasks

Traditional algorithms that work on a single processor are inefficient - or even fail to work - in a system where multiple machines are working in parallel. In distributed systems, jobs can be solved using parallelization. Generally, a job is split into multiple tasks, and each task is executed concurrently with other tasks on a different machine. The tasks may access common resources, such as the data contained in a single file. As such, two main challenges emerge. The first challenge is to ensure that concurrently running tasks coordinate and synchronize to achieve a common goal. The second challenge is to replicate and place resources among multiple computers such that concurrently running tasks can access resources efficiently.
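As a rough illustration of splitting a job into concurrently executed tasks, the sketch below counts words in a list of text lines. It is only a single-machine approximation: threads stand in for separate machines, and the names split_job, count_words and run_job are hypothetical, chosen for this example rather than taken from the course material.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_job(lines, num_tasks):
    """Split the job's input into num_tasks slices, one per task."""
    return [lines[i::num_tasks] for i in range(num_tasks)]


def count_words(task_lines):
    """One task: count the words in its own slice of the input."""
    counts = Counter()
    for line in task_lines:
        counts.update(line.split())
    return counts


def run_job(lines, num_tasks=4):
    """Run all tasks concurrently, then merge their partial results."""
    tasks = split_job(lines, num_tasks)
    # Assumption: worker threads stand in for the machines of a real system.
    with ThreadPoolExecutor(max_workers=num_tasks) as pool:
        # Collecting every result is the synchronization point:
        # merging starts only after all tasks have finished.
        partial_counts = list(pool.map(count_words, tasks))
    total = Counter()
    for partial in partial_counts:
        total.update(partial)
    return total
```

Each task works only on its own slice, so no locking is needed until the final merge. In a real distributed system the slices would also have to be placed close to the machines that process them, which is the second challenge described above.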

Students will study the following topics:

Fault-tolerance

In a distributed system with many computers, the failure of a single computer, or of part of one, is very likely. If such a failure is not avoided or recovered from, the whole system might halt, resulting in a fragile and unpredictable system. Students will identify the issues involved in avoiding and recovering from failures in distributed systems, a concept referred to as fault-tolerance.

Security

In distributed systems, the computers that solve your problem may not be under your administrative control; you do not own - and sometimes do not even know - where your program is running on a big connected set of computers. This makes a distributed system vulnerable to security and privacy issues. Students will learn the common security issues in distributed systems and the mechanisms used to secure such systems.

Practical Application of the State-of-the-Art Distributed Systems:

Students will also learn how to apply principles of distributed systems in a real-world setting. Specifically, they will learn the following topics:

Textbooks

The primary textbooks for this course are:

In addition, we recommend the following textbooks:

We have several reference books in the library covering most of the topics of the course. We will also be reading tutorials, journals and conference publications on the subject.

Course Organization

Students' participation in the course will involve five forms of activities:

Getting Help

For urgent communication with the teaching staff, it is best to send an email (preferred) or phone. If you want to talk to an instructor in person, remember that our posted office hours are merely nominal times when we guarantee that we will be in our offices. You are always welcome to visit us outside of our office hours if you need help or want to talk about the course.

We ask that you follow a few simple guidelines. Prof. Sakr and Dr. Hammoud normally work with their office doors open. Whenever their office doors are open, they welcome visits from students. However, if their office doors are closed, this means they are busy with meetings or phone calls and should not be disturbed.

We will use the course web-page as the central repository for all information about the class. Using the webpage, you can:

  1. Obtain copies of any handouts or assignments. This is especially useful if you miss class or you lose your copy.
  2. Find links to any electronic data you need for your assignments.
  3. Read clarifications and changes made to any assignments, schedules, or policies.
  4. Provide constructive feedback about the course.

You can use the mailing list (15440-f12@lists.qatar.cmu.edu) to post messages and ask questions about the course, specific projects, or exams. Messages sent to this mailing list will be distributed to all the students and the instructors of the course.

Policies

Working Alone on Assignments/Projects

Assignments/projects that are assigned to single students should be performed individually. Some projects may be group projects, in which case you will be notified in advance.

Working on Group Assignments/Projects

Some assignments/projects are performed collectively by a group of students. A student can be in only one group. The maximum and minimum number of students in a group will be announced in advance. On such assignments/projects, you can collaborate only with your team members.

Handing in Assignments/Projects

All assignments/projects are due at 11:59 PM (one minute before midnight) on the specified due date. All hand-ins are electronic using CMU Autolab. Instructions on using Autolab will be provided in class and on the course webpage.

Making up Exams, Assignments and Projects

Missed exams, assignments and projects can be made up on a case-by-case basis, but only if you have a good reason and make prior arrangements with Prof. Sakr. You need written consent from Prof. Sakr to make up exams, assignments or projects. It is your responsibility to get your projects done on time. Be sure to work far enough in advance to avoid unexpected problems, such as illness, unreliable or overloaded computer systems, etc.

Appealing Grades

After each assignment, exam and/or project is graded, you have seven calendar days to appeal your grade. All appeals must be submitted in writing. If you are still not satisfied, please come and visit Prof. Sakr. If you have questions about an exam grade, please visit Prof. Sakr directly.

Assessment

Final Grade Assignment and Assessment methods

Each student will receive a numeric score for the course, based on a weighted average of the following:

  1. Projects:

    The projects will count for a combined total of 45% of your score. There are 4 projects throughout the course. The first three projects are worth 10% each and the last is worth 15%. The last two projects (combined) will involve a presentation and a paper. Keep in mind that small differences in scores can make the difference between two letter grades.

    You are encouraged to submit the projects on time. For all projects except the final one, the following rules apply. If you submit one day late, 25% of the project score will be deducted as a penalty. If you are two days late, 50% will be deducted. If you are more than two days late, the project will not be graded (and you will receive a zero score). However, there is a grace-day quota for projects: you are given 3 grace days in total across all projects (except the final project). You can use the grace days as needed. For example, you can submit project 1 three days late and still not incur any penalty; in that case, the penalty starts from the 4th day after the deadline. However, since you will then have used up all the grace days in your quota, you will not have any grace days left for other projects. Plan the use of your grace-day quota judiciously. For a team project, we deduct one grace day from each student if the team submits the project one day late. Hence, make sure that everyone in your team has 'x' grace days left if you want to submit the project 'x' days late.

    Note that the final project is unique. You cannot use grace days for it, and there is no graduated penalty for it either. If you are even one day late in submitting the final project, it will not be graded (and you will receive a zero score).

  2. Exams:

    There will be two in-class exams - a mid-term and a final - that together count for 25% of the grade. The mid-term will count for 10% and the final for 15% of the overall grade.

  3. Problem Solving Assignments:

    There will be 4 written assignments that will test students' problem analysis and problem-solving skills. Together, these assignments will count for 10% of your total score.

  4. Quizzes:

    There will be 10 quizzes, held in class or in the recitation, which will account for 15% of your grade. There will be one quiz per topic, testing your understanding of the topic covered.

  5. Class-Recitation Participation and Attendance:

    Your attendance and participation in the discussions held in class and in the recitation will count towards 5% of your final grade.
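The late-submission rules for projects described above can be sketched as a small function. This is an illustrative reading of the policy, not an official grading tool: the name late_penalty and its parameters are hypothetical, and the function only returns the fraction deducted, since the policy does not spell out how the deduction is applied to the score.

```python
GRACE_DAY_QUOTA = 3  # total grace days across all non-final projects


def late_penalty(days_late, grace_days_used=0, is_final=False):
    """Return the fraction of the project score deducted (0.0 to 1.0).

    days_late       -- whole days past the deadline (0 means on time)
    grace_days_used -- grace days the student spends on this project
    is_final        -- True for the final project (no grace, no partial penalty)
    """
    if days_late < 0 or grace_days_used < 0:
        raise ValueError("days must be non-negative")
    if grace_days_used > min(days_late, GRACE_DAY_QUOTA):
        raise ValueError("cannot use more grace days than days late or quota")

    if is_final:
        if grace_days_used:
            raise ValueError("grace days cannot be used on the final project")
        # Any lateness on the final project means it is not graded.
        return 0.0 if days_late == 0 else 1.0

    # Grace days cancel late days before any penalty applies.
    effective_days_late = days_late - grace_days_used
    if effective_days_late == 0:
        return 0.0
    if effective_days_late == 1:
        return 0.25
    if effective_days_late == 2:
        return 0.50
    return 1.0  # more than two days late: not graded
```

For a team project, the same number of grace days would be deducted from every team member, so each member's remaining quota has to be checked before calling a function like this one.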

Type                                            #    Weight
Projects                                        4    45%
Exams                                           2    25%
Problem Solving Assignments                     4    10%
Quizzes                                         10   15%
Class/Recitation Participation and Attendance   43   5%
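As a worked example of the weighted average, the sketch below combines per-component percentages using the weights from the table above. The function name course_score and the dictionary keys are hypothetical, and it assumes each component has already been reduced to a single percentage between 0 and 100 (for projects and exams, that reduction would itself follow the per-item weights given above).

```python
from typing import Mapping

# Component weights in percent, taken from the table above.
WEIGHTS_PERCENT = {
    "projects": 45,
    "exams": 25,
    "assignments": 10,
    "quizzes": 15,
    "participation": 5,
}


def course_score(component_scores: Mapping[str, float]) -> float:
    """Return the weighted course score (0 to 100).

    component_scores maps each key of WEIGHTS_PERCENT to that
    component's percentage score (0 to 100).
    """
    missing = WEIGHTS_PERCENT.keys() - component_scores.keys()
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    weighted_sum = sum(
        weight * component_scores[name]
        for name, weight in WEIGHTS_PERCENT.items()
    )
    return weighted_sum / 100
```

For instance, component scores of 80 (projects), 70 (exams), 90 (assignments), 60 (quizzes) and 100 (participation) give (45*80 + 25*70 + 10*90 + 15*60 + 5*100) / 100 = 76.5.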

Grades for the course will be determined by absolute standards. The total score will be plotted as a histogram. Cutoff points are determined by examining the quality of work by students on the borderlines. Individual cases, especially those near the cutoff points, may be adjusted upward or downward based on factors such as attendance, class participation, improvement observed throughout the course, exam performance, and special circumstances.

Cheating

Each project must be the sole work of the student turning it in, except for possible group projects. Projects will be closely monitored by automatic cheat checkers, and students may be asked to explain any suspicious similarities with any piece of code available. The following are guidelines on what collaboration is authorized and what is not:

What is cheating?

  1. Sharing code or other electronic files: either by copying, retyping, looking at, or supplying a copy of a file.
  2. Sharing written assignments: Looking at, copying, or supplying an assignment.

What is NOT cheating?

  1. Clarifying ambiguities or vague points in class handouts.
  2. Helping others use the computer systems, networks, compilers, debuggers, profilers, or other system facilities.
  3. Helping others with high-level design issues.
  4. Helping others debug their code.

Cheating in group projects will also be strictly monitored and penalized (similar to cheating in individual exams, assignments or projects). Be aware of what constitutes cheating (and what does not) while interacting with students in other groups; the same rules as above apply when collaborating between two or more groups. You cannot share or use written assignments, code, or other electronic files from students in other groups. If you are unsure, ask the teaching staff.

Be sure to store your work in protected directories. The penalty for cheating is severe, and might jeopardize your career; cheating is not worth the trouble. By cheating in the course, you are cheating yourself; the worst outcome of cheating is missing an opportunity to learn. In addition, you will be removed from the course with a failing grade. We will also place a record of the incident in your permanent record.

Class Schedule

Please refer to the Schedule page for the tentative schedule for the class. The schedule also indicates the project activities. Any changes will be announced on the class distribution list (15440-f12@lists.qatar.cmu.edu). An updated schedule will be maintained on the class webpage.