
Overview

Title: Cloud Computing

Units: 9

Pre-requisites: A grade of "C" or better in 15-213, Introduction to Computer Systems

Lectures: Monday and Wednesday, 4:30 - 6:00 PM, Room 2147

Webpage: http://www.qatar.cmu.edu/~msakr/15319-s12/

Description

This project-based course will give students a theoretical foundation and hands-on experience with the various technologies of the cloud computing paradigm. Cloud computing is the delivery of computing as a service, whereby distributed resources are provided by appropriate service suppliers and leased, rather than owned, by an end user as a utility (similar to electricity and water) over a network (typically the Internet). Cloud computing services are becoming ubiquitous and are being adopted by a growing number of fields. Organizations are recognizing the benefits of this new computing paradigm in terms of increased flexibility, elasticity as well as reduced upfront costs and carbon footprint.

The course will provide students with a thorough treatment of cloud computing and its applicability to commercial application development as well as research computing needs. The lectures will cover topics related to cloud infrastructure and software stack, programming models (e.g., MapReduce and Pregel), underlying distributed storage layers (e.g., HDFS and HBase), as well as enabling technologies such as virtualization. Students will also be exposed to various cloud frameworks and libraries (e.g., Mahout, Pig, and Hive). Since this is a project-based course, students will learn project design, management, implementation, testing and reporting skills. Students will also gain hands-on experience with a public cloud service (Amazon EC2, S3 and EBS), utilize it to lease and provision compute and storage resources and then program and deploy applications that use these resources. Students will use the Hadoop framework to solve large-scale data-intensive problems and then analyze the performance characteristics in the class project.

Instructors

Prof. Majd F. Sakr
msakr@qatar.cmu.edu, CMUQ 2121, 4454-8625.
Office hours: Tue, 3-4pm

Dr. Mohammad Hammoud
mhhammou@qatar.cmu.edu, CMUQ 1013, 4454-8506.
Office hours: Thu, 11am-12pm

Teaching Assistants

Suhail Rehman
suhailr@qatar.cmu.edu, 2044, 4454-8680.
Office hours: By Appointment

Fan Zhang
zhang@qatar.cmu.edu, 1206, 4454-8482.
Office hours: By Appointment

Objectives

The course is meant to introduce students to the field of cloud computing. Students will work on a large semester-long project that will utilize the Amazon EC2 cloud. They will also learn about new programming paradigms that are developed for the cloud. Furthermore, they will understand and appreciate some of the current challenges and tradeoffs when mapping different applications to the cloud.

The course will serve as a firm foundation on many cloud computing principles and enablers such as distributed file systems and virtualization. Students will be able to design and implement parallel algorithms to efficiently distribute data-intensive computation over virtualized cloud platforms. The class project in this course will focus on implementing real-world MapReduce applications, deploying them on the cloud, and characterizing their performance. As a result, students will have the foundation needed to meet future needs in the emerging field of cloud computing.

The course has three goals:

Through these objectives, the course will transform your computational thinking from designing applications for a single computer system to designing applications for a cloud distributed system.

Learning Outcomes:

The primary learning outcome of the course is three-fold:

Understanding the core concepts of cloud computing and the enabling technologies

Students will learn the core concepts of cloud computing. They will understand how the cloud computing paradigm evolved over the past few years as an answer to the growing needs of organizations. Cloud computing is an amalgam of various technologies. Students will be able to discuss many of these technologies including:

Programming Models

Traditional programming models might not work efficiently in clouds. Students will identify the two main classical programming models, shared memory and message passing, as well as apply the novel programming models that are commonly adopted in clouds. Specifically, students will:

Virtualization

Students will explain the fundamental concepts of virtualization, in which the state of a computer is abstracted from the underlying hardware. They will describe how virtualization applies to cloud computing, and identify various capabilities provided by virtualization to cloud providers and users. Specifically, students will:

Storage Technologies and Distributed File Systems

Storage technologies and distributed file systems play a major role in enabling cloud computing, by allowing for fast, reliable, and parallel access to large amounts of data distributed across multiple machines. Students will identify storage technologies suitable for clouds as well as describe the fundamental principles of distributed file systems (DFSs) and how they apply to cloud computing. Specifically, students will:

Emerging Cloud Tools

One criticism of cloud programming models is that the development cycle might take a long time. For instance, writing a MapReduce program involves coding the map and reduce functions, compiling and packaging the program, submitting the job(s), and retrieving the results. Researchers and engineers might require a faster model to quickly mine huge datasets. In this course, students will:

Building Cloud Applications

Students will explore the applicability of different application domains to cloud computing. Specifically, students will:

Each student will be mentored by a teaching staff member and will deliberately acquire the required skills to pursue project planning, design, implementation, analysis and result reporting, much needed in academia as well as industry.

Emerging Research Challenges

While many existing techniques have helped make cloud computing a reality, several new research challenges have swiftly emerged in the effort to realize the full potential of the paradigm. Students will identify the following research challenges:

Textbooks

The primary textbook for this course is:

In addition, we recommend the following textbooks:

We have several reference books in the library covering most of the topics of the course. We will also be reading tutorials, journals and conference publications on the subject.

Course Organization

Your participation in the course will involve several forms of activity:

Attendance will be taken at the beginning of each lecture; it will be worth 5% of your grade. Before each class, you are required to briefly read about the topics that will be covered. You will be responsible for all material presented during the lectures.

Getting Help

For urgent communication with the teaching staff, it is best to send an email (preferred) or call the office phone. If you want to talk to a staff member in person, remember that our posted office hours are merely nominal times when we guarantee that we will be in our offices. You are always welcome to visit us outside of our office hours if you need help or want to talk about the course.

We ask that you follow a few simple guidelines. Prof. Sakr, Dr. Hammoud, Suhail and Dr. Zhang normally work with their office doors open and welcome visits from students whenever the doors are open. However, if a door is closed, its occupant is busy with a meeting or a phone call and should not be disturbed.

We will use the course web-page as the central repository for all information about the class. Using the web-page, you can:

  1. Obtain copies of any handouts or assignments. This is especially useful if you miss class or you lose your copy.
  2. Find links to any electronic data you need for your assignments.
  3. Read clarifications and changes made to any assignments, schedules, or policies.
  4. Provide healthy feedback about the course.

You can use the mailing list (15319-s12@lists.qatar.cmu.edu) to post messages and ask questions about the course and specific project requirements. Messages sent to this mailing list will be distributed to all the students and staff of the course.

Policies

Working Alone on Project Phases and Posters

Project phases and posters that are assigned to single students should be performed individually.

Handing in Project Phases and Posters

All project phases and posters are due at 11:59 PM (one minute before midnight) on the specified due date. All hand-ins are electronic using the AFS file system: /afs/qatar.cmu.edu/usr16/msakr/www/15319-s12/handin/userid/, where userid is your Qatar user ID.

Appealing Grades

After each project phase is graded, you have seven calendar days to appeal your grade. All your appeals should be provided in writing. If you are still not satisfied, please come and visit Prof. Sakr. If you have questions about an exam grade, please visit Prof. Sakr directly.

Assessment

Final Grade Assignment and Assessment methods

Each student will receive a numeric score for the course, based on a weighted average of the following:

  1. Project:

    The project will count for a combined total of 75% of your score. There are 3 project phases throughout the course. The first phase is worth 15%. The second and third phases are worth 30% each and will involve a presentation and a paper as well as the project code. Take into account that small differences in scores can make the difference between two letter grades.

    You are encouraged to submit the project phase deliverables on time. For the first two project phases, the following rules apply. If you submit one day late, 25% of the project score will be deducted as a penalty. If you are two days late, 50% will be deducted. The project will not be graded (and you will receive a zero score) if you are more than two days late. However, there is a grace-day quota for projects: you are given 3 grace days in total for the first two project phases, and you can use them as needed. For example, you can submit project phase 1 three days late and still not incur any penalty; if you use your grace days this way, the penalty starts from the 4th day after the deadline. However, since you will then have used up your entire quota, you will have no grace days left for the other phase. Plan how to use your grace-day quota judiciously.

    Note that the final project phase is unique. You cannot use grace days for the final project phase, and there is no graduated late penalty for it either. If you are one day late in submitting the final project phase, it will not be graded (and you will receive a zero score).

  2. Student Project Update Presentations:

    You will be required to brief the instructors and the class about the status of your project in a short presentation that outlines your project status, the milestones you have achieved and the next steps to completing the project. At the end of each project phase, you are required to present your project to the class as well. These presentations will count for 20% of your final grade.

  3. Class Participation and Attendance:

    Your attendance and participation in the different discussions held in class will account for 5% of your final grade.
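
The late-submission rules for the project phases can be sketched in a few lines of Python. This is only an illustration of the policy as stated above, not an official grading tool: the function name, its parameters, and the assumption that lateness is counted in whole days are all hypothetical, and tracking how many grace days remain across phases is left to the caller.

```python
GRACE_DAY_QUOTA = 3  # total grace days shared by the first two project phases


def late_adjusted_score(raw_score, days_late, grace_days_applied=0, final_phase=False):
    """Return a phase score after the late policy is applied.

    Hypothetical helper: lateness is assumed to be counted in whole days.
    """
    if days_late < 0:
        raise ValueError("days_late must be non-negative")
    if not 0 <= grace_days_applied <= GRACE_DAY_QUOTA:
        raise ValueError("grace_days_applied must be between 0 and the quota")

    if final_phase:
        # No grace days and no graduated penalty: any lateness means zero.
        return raw_score if days_late == 0 else 0.0

    # Grace days cancel out the same number of late days.
    effective_late = max(0, days_late - grace_days_applied)
    if effective_late == 0:
        return raw_score
    if effective_late == 1:
        return raw_score * 0.75  # 25% deducted
    if effective_late == 2:
        return raw_score * 0.50  # 50% deducted
    return 0.0  # more than two days late: not graded
```

For instance, under these assumptions `late_adjusted_score(80, 3, grace_days_applied=3)` leaves the score at 80, while `late_adjusted_score(80, 4, grace_days_applied=3)` applies the one-day penalty.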

Type                                  #     Weight
Project Phases I, II & III            3     75%
Project Update Presentations          6     20%
Class Participation and Attendance    28    5%
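
As a rough illustration of how the weighted average works, the sketch below combines component scores using the weights listed in this section (15% for phase I, 30% each for phases II and III, 20% for presentations, 5% for participation and attendance). The dictionary keys, the function name, and the assumption that every component is scored on a 0-100 scale are hypothetical; letter-grade cutoffs are not modeled.

```python
# Weights in percentage points, taken from the assessment breakdown.
WEIGHTS = {
    "phase1": 15,
    "phase2": 30,
    "phase3": 30,
    "presentations": 20,
    "participation": 5,
}


def course_score(scores):
    """Return the weighted numeric course score on a 0-100 scale.

    `scores` maps each component name in WEIGHTS to a score out of 100.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise KeyError(f"missing component scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


example = {
    "phase1": 90,
    "phase2": 80,
    "phase3": 70,
    "presentations": 85,
    "participation": 100,
}
print(course_score(example))  # 80.5
```

Because phases II and III carry twice the weight of phase I, a 10-point change in either of them moves the course score by 3 points, compared with 1.5 points for phase I.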

Grades for the course will be determined by absolute standards. The total score will be plotted as a histogram. Cutoff points are determined by examining the quality of work by students on the borderlines. Individual cases, especially those near the cutoff points, may be adjusted upward or downward based on factors such as attendance, class participation, improvement observed throughout the course, and special circumstances.

Cheating

Each project must be the sole work of the student turning it in, except for possible group projects. Projects will be closely monitored by automatic cheat checkers, and students may be asked to explain any suspicious similarities with any piece of code available. The following are guidelines on what collaboration is authorized and what is not:

What is cheating?

  1. Sharing code or other electronic files: either by copying, retyping, looking at, or supplying a copy of a file.
  2. Sharing written assignments: Looking at, copying, or supplying an assignment.

What is NOT cheating?

  1. Clarifying ambiguities or vague points in class handouts.
  2. Helping others use the computer systems, networks, compilers, debuggers, profilers, or other system facilities.
  3. Helping others with high-level design issues.
  4. Helping others debug their code.

Cheating in group projects will also be strictly monitored and penalized (similar to cheating in individual exams, assignments or projects). Be aware of what constitutes cheating (and what does not) while interacting with students in other groups; the same rules as above apply when two or more groups collaborate. You cannot share or use written assignments, code, or other electronic files from students in other groups. If you are unsure, ask the teaching staff.

Be sure to store your work in protected directories. The penalty for cheating is severe, and might jeopardize your career; cheating is not worth the trouble. By cheating in the course, you are cheating yourself; the worst outcome of cheating is missing an opportunity to learn. In addition, you will be removed from the course with a failing grade. We also place a record of the incident in the student's permanent record.

Class Schedule

Please refer to the Schedule page for the tentative schedule for the class. The schedule also indicates the project activities. Any changes will be announced on the class distribution list. An updated schedule will be maintained on the class Web page.