15-281 - Fall 2020

Artificial Intelligence:
Representation and Problem Solving


Key Information

All classes will be delivered online using Zoom for synchronous sessions.

Check your Piazza invitation email!

Class times: Mondays and Wednesdays, 1:30pm - 2:50pm; Thursdays, 12:00pm - 1:20pm

Instructor: Dr. Eduardo Feo-Flushing

Units: 12.0

Grading: 50% Homework, 30% Final, 10% Midterm 1, 10% Midterm 2

Textbook (optional): S. Russell and P. Norvig, Artificial Intelligence: A Modern Approach (3rd edition, 2010)

Overview

This course is about the foundations of the theory and practice of Artificial Intelligence. But what does "AI" actually mean?
A number of different definitions of AI have been given so far. In general, all of them share the underlying idea of equipping machines (e.g., computer programs, robots) with "tools" to display behaviors that we, as humans, would be inclined to categorize as "intelligent". This is still quite vague, since it in turn requires clarifying what an intelligent behavior is.

In this course, this issue is narrowed down to the concept of rationality: the notion of Artificial Intelligence is treated as equivalent to that of Computational Rationality. Given well-defined preferences, a rational agent always selects the actions whose outcomes maximize the expected utility.
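
As a compact restatement in standard notation (a sketch, not taken from the course materials; here A is the action set, P(s | a) the probability of outcome s given action a, and U the utility function):

```latex
% Computational rationality in one line (standard notation, not from the
% syllabus): a rational agent picks the action with maximal expected utility.
a^{*} = \operatorname*{arg\,max}_{a \in A} \mathbb{E}\left[ U \mid a \right]
      = \operatorname*{arg\,max}_{a \in A} \sum_{s} P(s \mid a)\, U(s)
```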

The course will provide an overview of modern techniques for rational decision-making under different conditions: what information is available, whether that information is certain or uncertain, and whether one or more agents are present, in both cooperative and adversarial scenarios.

More precisely, the course will address fundamental questions about the design of AI systems, such as:

  • how to represent knowledge;

  • how to effectively generate provably optimal sequences of actions (i.e., how to make optimal decision plans or define decision policies);

  • how to search among (large sets of) alternatives to find optimal or near-optimal solutions;

  • how to deal with uncertainty;

  • how to make decisions in adversarial environments;

  • how to learn to make optimal decisions.

Both deterministic and stochastic environments will be considered and treated separately.

Objectives

There are three primary objectives for the course:


  • To provide a broad survey of AI and of the most popular techniques that are employed for: knowledge representation, problem solving, mathematical optimization, automated planning, probabilistic prediction and inference, sequential decision-making, supervised and reinforcement learning, deep learning, decision-making in multi-agent adversarial scenarios.

  • To develop a thorough understanding of the algorithmic foundations of AI and acquire a strong appreciation of the big-picture aspects of designing fully autonomous intelligent agents.

  • To develop operational know-how about how to build and program AI agents, and to analyze and improve their performance.

Course Layout

The course will be based on lectures given online using Zoom for synchronous (same time) sessions.

Students are expected to attend all classes and to actively participate with questions.

Each lecture will feature a number of short quizzes, to keep the students engaged and to let the instructor check their understanding of the content as it is delivered.

For each one of the different topics, the course will present relevant techniques, discuss formal results, and show the application to problems of practical and theoretical interest.

Weekly recitations will be used to practice the algorithms together and to work through their application to different problems.

Homework will be assigned weekly. Each homework will include both written questions and programming assignments.

  • Written questions will involve working through different algorithms, deriving and proving mathematical results, and critically analyzing the material presented in class.

  • Programming assignments will involve writing code in Python to implement and test algorithms (a small illustrative sketch follows this list).

  • As a general rule, each homework will be released on the day of the first lecture of the week (i.e., on Mondays). To help the students keep up with the material presented during the lectures, each homework will consist of two parts. The first part will include a set of relatively simple questions to be answered by Wednesday (at midnight), while the second, more substantial part will have to be completed by the following Monday (at noon).
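
As a flavor of what the programming part can look like, here is a minimal, hypothetical Python sketch (illustrative only, not actual assignment code) of breadth-first search, one of the uninformed search algorithms covered in the lectures:

```python
from collections import deque

def bfs(start, goal, successors):
    """Breadth-first search: return a path with the fewest actions from
    start to goal, or None if the goal is unreachable.

    `successors` maps a state to an iterable of its neighboring states.
    """
    frontier = deque([start])
    parent = {start: None}  # also serves as the explored set
    while frontier:
        state = frontier.popleft()
        if state == goal:
            # Reconstruct the path by following parent pointers back to start.
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return list(reversed(path))
        for nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = state
                frontier.append(nxt)
    return None

# Example: a small undirected graph given as an adjacency dictionary.
graph = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"],
         "D": ["B", "C", "E"], "E": ["D"]}
print(bfs("A", "E", lambda s: graph[s]))  # ['A', 'B', 'D', 'E']
```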

Prerequisites

Since the course will cover a number of different topics, students should have some background in programming (in Python), algorithms, calculus, linear algebra, and probability theory. Please talk to the instructor if you are unsure whether your background is suitable for the course.

The formal prerequisites for this course are:

  • 15-122 Principles of Imperative Computation

  • 21-241 Matrices and Linear Transformations

  • 21-127 Concepts of Mathematics or 15-151 Mathematical Foundations of Computer Science.

The corequisite for this course is:

  • 21-122 Integration and Approximation

For this corequisite, you should either have completed it prior to starting 15-281 or have it on your schedule for Fall 2020.

Grading

Grading will be assigned based on the following weighting: 50% Homework, 30% Final exam, 10% Midterm exam 1, 10% Midterm exam 2. The final exam will include questions about all topics considered in the course, with an emphasis on the topics introduced after the first midterm exam.

Readings

There is no required textbook for this class. Students should be able to learn everything from the lecture handouts and homework. However, the Russell and Norvig textbook covers a good part of the course material. During the course, the instructor will point out other books that cover specific parts of the course.

Rules for Zoom sessions

All classes will be delivered online using Zoom for synchronous sessions.

Please make sure that your Internet connection and equipment are set up to use Zoom and able to share audio and video during class meetings. Let me know as soon as possible if there is a gap in your technology set-up, and we can see about finding solutions.

Sharing video: In this course, being able to see one another helps to facilitate a better learning environment and promote more engaging discussions. Therefore, our default will be to expect students to have their cameras on during lectures, discussions, and exams. However, I also completely understand there may be some challenges that might hinder you from having your video on during live class sessions. If you have any concerns about sharing your video, please email me as soon as possible and we can discuss possible adjustments. In any case, according to CMU-Q's policies, if you have justifiable technical or personal circumstances that prevent you from having your video on during live class sessions, you will need to submit a Waiver Form to the Video Policy Committee. In this waiver, you will have to explain thoroughly the reasons for which you are requesting a video exemption.

You are free to use a background image in your video if you wish.

Schedule (subject to changes)


Dates Topic Slides References
8/24 Introduction: AI hype, Concepts, History, Definitions, Problems, Road map PDF R&N Ch. 1,2,26,27, Brief AI history, 2016 Stanford AI report
8/25 AI Problems and agents: Agent function; task environment; taxonomy of problems; types of agents; PDF R&N Ch. 3
8/26 Search problems and solution approaches: Planning and search problems; performance metrics; search meta-algorithms: Tree and Graph search; Informed vs. Uninformed search PDF R&N Ch. 3
8/27 Uninformed search: BFS, Bidirectional-BFS, UCS, DFS, DL-DFS, IDS: algorithms and properties PDF R&N Ch. 3

8/31 Informed Search: From uninformed to informed search; best-first strategy based on evaluation function; UCS as informed search; Greedy best-first search; A*: properties and applications; admissible heuristics; consistent heuristics; monotonicity of f-costs; proofs of optimality of A* PDF R&N Ch. 3
9/1 Informed Search for Path planning: Path planning in continuous spaces; applications of path planning; space discretization and combinatorial path planning; cell decomposition choices; iterative cell decomposition; A* on roadmap graphs; post-search smoothing; Theta* for in-search smoothing; optimal path planning and visibility graphs PDF Theta* paper at AAAI-07
9/2 Recitation: Software design for search agents and problem environments PDF

9/7 Adversarial search 1: Adversarial scenarios and games; game taxonomy; types of games and role of information; single-player deterministic games; zero-sum games and minimax search; algorithm and properties PDF R&N Ch. 5
9/8 Adversarial search 2: Revisiting minimax search: concepts, algorithm, properties; DF examples of minimax and limitations; depth-limited and pruning workarounds; heuristic evaluation functions; minimax tree pruning using alpha-beta pruning; properties of alpha-beta pruning PDF R&N Ch. 5
9/9 Adversarial search 3: More on the properties of minimax and alpha-beta pruning; move ordering heuristics; complexity properties; transposition tables; evaluation function heuristics; adversarial search under uncertainty; role of chance; probabilities, expected values and expectimax search; algorithm and properties; worst-case vs. average case; optimism vs. pessimism; (no) pruning; probabilistic modeling; other types of games. PDF

9/14 Classical planning 1: Automated planning; from atomic to factored / structured states; sub-goals; STRIPS/PDDL language; concepts of propositional logic; states, goals, actions, actions schemata; examples PDF R&N Ch. 10
9/15 Classical planning 2: Examples of planning domains and problem instances; complexity of planning problems; forward and backward search approaches; relaxing a problem by acting upon preconditions and postconditions of the actions; definition of problem-independent and problem-specific heuristics PDF R&N Ch. 10
9/16 Classical planning 3: Planning graphs: construction and properties; use of planning graphs for domain-independent heuristics; set level heuristic; GRAPHPLAN: algorithm and properties; backward search in Graphplan; example problems. PDF

9/21 Constraint Satisfaction Problems (CSPs) 1: Factored states with any kind of variables; solution as problem state; state and decision variables; definition of states, partial states, feasible set, consistent solution; taxonomy of decision problems; optimization vs. feasibility problems; definition of constraint satisfaction problems; examples. PDF R&N Ch. 4, 6
9/22 Constraint Satisfaction Problems (CSPs) 2: Types of constraints and constraint graphs; solving CSPs; limits of search approaches for CSPs; commutativity; backtracking: depth-limited DFS with single variable selection; arc consistency; constraint propagation and pruning; AC-3 as global constraint propagation. PDF R&N Ch. 4, 6
9/23 Constraint Satisfaction Problems (CSPs) and Local Search: From global constraint propagation to local propagation with forward-checking; Maintaining Arc Consistency (MAC); properties of arc consistency; local search concept; neighborhoods; local search as a path in solution state space; local search for CSPs; hill-climbing; properties and variants of hill-climbing and local search; where the hard problems are: phase transitions; WalkSAT: local search for SAT problems PDF R&N Ch. 4, 6; Paper on WalkSAT (local search for max-SAT), DIMACS Series in Discrete Mathematics and Theoretical Computer Science, vol. 26, 1996

9/28 Optimization models 1, Convex optimization: Optimization problems; General concepts and notation; Taxonomy of problems; Convex problems; Properties of convex optimization PDF
9/29 Optimization models 2, Integer optimization: Integer optimization problems; Integer Linear Programming (ILP); properties, modeling and solution approaches; branch and bound as an anytime algorithm; design choices PDF Branch-and-Bound, book chapter
9/30 Optimization models 3, Models and Methods: Gradient descent for convex problems: formalism, geometry, iterative approach; step size; properties; stochastic gradient descent; IP optimization models; branch-and-bound examples. PDF Survey on gradient descent algorithms

10/5 Probabilistic models 1: Role of uncertainty; from deterministic to probabilistic models; probabilistic inference and decision-making; review of probabilities; joint probability distributions; marginals; conditional probabilities; chain rule; dependence, independence; conditional dependence and independence. PDF
10/6 Probabilistic models 2: Bayes rule; construction of posteriors; application to probabilistic classification; review of course subjects PDF
10/7 Review of course subjects, Q&A PDF with all lectures so far

10/12 Fall break
10/13 Fall break
10/14 Fall break

10/19 Midterm Exam 1
10/21 Markov Chains 1: Definitions and applications of random processes; taxonomy of random processes; states and transitions; Markov property; Markov chains, formalism, general properties PDF R&N Ch. 15; S. Ross, Introduction to probability models, Ch. 4
10/22 Markov Chains 2: Making predictions using Markov chains; limiting and stationary distributions in Markov chains; MCMC concepts; state classification (absorbing, periodic, aperiodic, transient, non-null recurrent, null recurrent, ergodic); communication classes and irreducibility; ergodic chains; general properties of stationary Markov chains Part I, Part II R&N Ch. 15; S. Ross, Introduction to probability models, Ch. 4; Applications of Markov chains

10/26 Bayesian Networks 1: Graphical models; conditional probability network; compact representation of joint distributions; BN construction; global and local semantics; information propagation PDF R&N Ch. 13, 14
10/28 Bayesian Networks 2: Inference in BNs; exact inference; examples and complexity of exact inference; approximate inference by sampling-based methods; introduction to Monte Carlo methods; MC for direct sampling from a BN (no evidence) PDF R&N Ch. 13, 14
10/29 Bayesian Networks 3: Sampling methods for inference in BNs; rejection sampling; likelihood weighting; Gibbs sampling; examples and properties of the different methods PDF R&N Ch. 14

11/2 Markov Decision Processes 1: Markov processes: chain, reward, and decision processes; MC vs. HMM, MDP vs. POMDP; policies; deterministic vs. stochastic policies; stationary vs. non-stationary policies; formal definition of MDPs; utilities; optimal policies; dependence on the reward function PDF R&N Ch. 17
11/3 Markov Decision Processes 2: Utilities; additive utilities and stationary preferences; infinite utilities and discounting; state-value and action-value functions; optimal utilities; state value under a policy; Bellman expectation equations; solutions; policy evaluation; Bellman backup operator PDF R&N Ch. 17; Bellman-Ford algorithm (deterministic dynamic programming, asynchronous, distributed); Slides showing Bellman-Ford at work for network routing
11/4 Markov Decision Processes 3: Bellman equations; Value Iteration; measures of approximation error in Value Iteration; Bellman residuals; convergence properties; how to choose error bounds; Policy Iteration; properties and comparison with Value Iteration PDF R&N Ch. 17

11/9 Markov Decision Processes 4, Reinforcement Learning 1: Variants of Value Iteration; LP formulation of an MDP; from MDPs to RL models and challenges; general RL problem; examples PDF R&N Ch. 21; Book extract on variants of Value Iteration; Sutton and Barto RL book, 2018
11/11 Reinforcement Learning 2: Passive RL; learning the MDP model; limitations and use of learning the model; direct learning of the value function; Monte Carlo learning; averaging returns; Temporal Differences, Bellman samples, bootstrapping; convergence properties; differences between the two approaches PDF Sutton and Barto RL book, 2018
11/12 Tartan Community Day

11/16 Reinforcement Learning 3: Active RL; exploration vs. exploitation; GLIE scheme; Monte Carlo action learning; Q-function estimation; exploring starts; generalized policy iteration (GPI); soft policies; behavior and learning policies; on-policy and off-policy learning; SARSA vs. Q-Learning example PDF
11/18 Reinforcement Learning 4: On policy: SARSA; off-policy: Q-Learning; properties and examples; dealing with large state-action spaces; parametric function approximation; loss minimization; function approximators examples (linear, neural networks); gradient descent and stochastic gradient descent for parameter learning; supervised setting; examples with linear features; from supervised to RL setting; value and Q function approximation using MC and TD; overview of actor-critic concepts. PDF Diagram summarizing the topics investigated in the RL lectures
11/19 Review of course subjects, Q&A PDF with all lectures so far
11/22 Midterm Exam 2

11/23 Game Theory 1: General concepts and definitions (strategy, pure strategy, utilities, strategy profile); strategy dominance; normal-form; payoff matrix; solution concepts and equilibria; Nash equilibrium; properties of NE; examples; prisoner's dilemma PDF R&N Sec. 17.5-17.6; Book on Multi-agent systems, Ch. 3-4
11/25 Game Theory 2: Best responses and Nash equilibria; fixed points of best response mapping and NE; mixed strategies; mixed NE; best responses and equality of payoffs' theorem; finding a mixed NE using equality of payoffs; Nash's theorem for the existence of mixed NE; step-by-step example PDF Complexity of computing Nash equilibrium
11/26 Game Theory 3: Revision of the Nash theorem and step-by-step example; proof of the theorem for a 2-player, 2-action case; best responses in strategy space; game formalization and analysis using algebra and calculus; numeric application examples PDF

11/30 Game Theory 4: Game of chicken; social welfare; price of anarchy; correlated equilibrium; CE as LP; best welfare CE; CE vs. Nash; Stackelberg's sequential games, leader-follower games; computation of Stackelberg's strategies for 2 players; security games: description, example, main results PDF
12/2 Review of course subjects, Q&A
12/3 Review of course subjects, Q&A

Homework Assignments


Topic Files Due Dates
HW 1: Problems and Agents, Uninformed Search 8/31
HW 2: Informed Search 9/7
HW 3: Adversarial Search 9/14
HW 4: Classical planning 9/21
HW 5: Constraint Satisfaction Problems 9/28
HW 6: Modeling and solving optimization problems 10/5
HW 7: Probabilistic modeling, Markov chains 10/11
HW 8: Bayesian networks 10/29
HW 9: Markov Decision Processes for sequential-decision making 11/9
HW 10: Reinforcement learning with tabular methods 11/16
HW 11: Reinforcement learning with approximation methods 11/23
HW 12: Game Theory for multi-agent adversarial scenarios 12/2

HW Policies

  • Homework is due on Autolab by the posted deadline. Assignments submitted past the deadline will incur the use of late days.

  • You have 6 late days in total, but you cannot use more than 2 late days per homework. No credit will be given for homework submitted more than 2 days after the due date. Once your 6 late days have been used, you will receive 20% off for each additional day late (a small illustrative sketch of this rule follows this list).

  • You can discuss the exercises with your classmates, but you should write up your own solutions. If you find a solution in any source other than the material provided on the course website or the textbook, you must mention the source. You can work on the programming questions in pairs, but theoretical questions are always submitted individually. Make sure that you include a README file with your andrew id and your collaborator's andrew id.

  • In general, for both assignments and exams, CMU's directives for academic integrity apply and must be duly followed. Students are required to refer to CMU's general policies on cheating and plagiarism: https://www.cmu.edu/policies/student-and-student-life/academic-integrity.html
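
For concreteness, the late-day rule above can be read as the following small, unofficial Python sketch (a hypothetical helper, not part of the course infrastructure):

```python
def late_penalty(days_late, late_days_left):
    """Unofficial sketch of the late-day rule (not course infrastructure).

    Returns (grade multiplier, late days remaining). A homework more than
    2 days late gets no credit; late days from the 6-day budget cover up to
    2 late days; once the budget is exhausted, each late day costs 20%.
    """
    if days_late > 2:
        return 0.0, late_days_left                # no credit past 2 days
    covered = min(days_late, late_days_left)      # days absorbed by the budget
    uncovered = days_late - covered               # days charged at 20% each
    return 1.0 - 0.20 * uncovered, late_days_left - covered

# Example: 2 days late with the budget exhausted -> 60% credit.
print(late_penalty(2, 0))  # (0.6, 0)
```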

Exam dates and policies

The class includes two midterm exams and one final exam. The midterms are set for October 19 and November 22 (see the schedule above); the date of the final is TBD.

During exams, students are allowed to consult a 1-page cheat sheet (written in any desired format). No other material is allowed, including textbooks, computers/smartphones, and copies of the lecture handouts.

Students with special needs or disabilities are kindly asked to discuss this with the instructor at the beginning of the semester and to make use of the assistance procedures provided by CMU: https://scotty.qatar.cmu.edu/qword/student-affairs/office-of-health-and-wellness/assistance-for-individuals-with-disabilities

Office Hours

Name: Gianni Di Caro
Email: gdicaro@cmu.edu
Hours: Thursdays, 2:00-3:00pm (or send me an email to set up an appointment)
Location: M 1007