Foundations of artificial intelligence
ME-390
Course information
Course format
In-person lectures, in-person exercise hours.
Assessment
Three in-class quizzes (up to 30%) and one end-of-semester written final exam (70%). Quiz grades count only if they improve your final grade. Your final grade is therefore calculated as follows:
final grade = max(70% final + 10% q1 + 10% q2 + 10% q3, 80% final + 10% q1 + 10% q2, ..., 100% final)
Above, q1, q2, and q3 refer to quizzes 1, 2, and 3, respectively. It follows that your final grade is the maximum among the 8 possible combinations of final exam and quiz grades.
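For concreteness, the sketch below computes this rule in Python by taking the maximum over all subsets of counted quizzes; the grades in the example are hypothetical placeholders, not an official grading tool.

```python
from itertools import combinations

def final_grade(final, quizzes):
    """Best grade over all subsets of counted quizzes.

    Each counted quiz contributes 10%; the final exam fills the
    remaining weight (between 70% and 100%).
    """
    best = 0.0
    for k in range(len(quizzes) + 1):
        for subset in combinations(quizzes, k):
            weight_final = 1.0 - 0.1 * len(subset)
            best = max(best, weight_final * final + 0.1 * sum(subset))
    return best

# hypothetical example grades
print(final_grade(5.0, [5.5, 4.0, 6.0]))
```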
Quizzes are on 16.10, 20.11, and 11.12, during the exercise hour. They are 20 minutes long and no aids are allowed (no books/notes, no electronics).
The final exam is closed-book. You are allowed one cheat sheet: a single double-sided page on which you can write any material from the course.
Teaching assistants
- Anna Maddux (anna.maddux@epfl.ch)
- Tingting Ni (tingting.ni@epfl.ch)
- Andreas Schlaginhaufen (andreas.schlaginhaufen@epfl.ch)
- Kai Ren (kai.ren@epfl.ch)
- Giulio Salizzoni (giulio.salizzoni@epfl.ch)
- Gabriel Vallat (gabriel.vallat@epfl.ch)
- Saurabh Dilip Vaishampayan (saurabh.vaishampayan@epfl.ch)
Recommended references:
There are many online resources on artificial intelligence and machine learning. While these sources might provide good intuition, not all have the same depth and rigour. I recommend the following.
1. Book on machine learning with engineering applications: Machine Learning for Engineers: Using Data to Solve Problems for Physical Systems by Ryan G. McClarren. We refer to it as ML4Engineers in the course.
6. Online course: the machine learning course at Stanford, EE 104, referred to here as MLStanford.
The notes and Python exercises are mainly based on the EPFL course CIVIL-226, created by the VITA lab.
We introduced the course and its administrative matters. We introduced artificial intelligence (AI) and the machine learning (ML) approach to AI. We defined supervised and unsupervised learning, and introduced linear regression as a supervised learning approach.
Optional: For a thorough introduction to learning you can read Sections 1.1-1.3 of the UnderstandingML book.
- Lecture 01 slides (File)
- Lecture 01 video (URL)
- Python exercise (URL)
- Background and notations (File)
- Background and notations - Solution (File)
- Lecture 12 video (URL)
We had no lecture due to the holiday. In the exercise hour, you were asked to go through the exercises in the file "Background and notations", posted last week, and to review some Python coding tips in the 02-numpy folder of the Python exercises.
To look up Python commands and compare them with their MATLAB equivalents, you may use the following online cheat sheet.
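If you would like a quick warm-up before the exercise hour, the short sketch below shows a handful of NumPy operations used throughout the course, with rough MATLAB equivalents noted in comments; it is only an illustration and does not replace the 02-numpy notebook.

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])   # MATLAB: A = [1 2; 3 4]
b = np.array([1.0, 0.0])                 # MATLAB: b = [1; 0]

print(A @ b)                   # matrix-vector product   (MATLAB: A*b)
print(A * A)                   # elementwise product     (MATLAB: A.*A)
print(A.T)                     # transpose               (MATLAB: A')
print(np.linalg.solve(A, b))   # solve A x = b           (MATLAB: A\b)
print(A.mean(axis=0))          # column means            (MATLAB: mean(A,1))
```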
We formulated the linear regression problem, defined the mean squared-error empirical loss function, and derived the optimal linear regression parameters minimizing this loss function. We discussed nonlinear feature mappings as well as overfitting and underfitting.
Additional resources: Sections 2.1 and 2.2 of the ML4Engineers book and Appendices C.1 and C.2 of the LinAlgebra book. For a brief review of the linear algebra concepts you need in this course, see Chapter 2 of the Deep Learning book; for a brief overview of gradient-based optimization, see Section 4.3 of the Deep Learning book.
Note: on slide 16, there should be a factor of 2 in front of the term w^TX^Ty in the formula for J(w).
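To make the least-squares derivation concrete, here is a minimal sketch that solves linear regression via the normal equations on synthetic data; the dataset and variable names are illustrative, not taken from the course exercises.

```python
import numpy as np

# synthetic data: y = 2x + 1 plus noise (illustrative only)
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=50)
y = 2.0 * x + 1.0 + 0.1 * rng.standard_normal(50)

# design matrix with a bias column
X = np.column_stack([np.ones_like(x), x])

# w minimizing the mean squared error ||y - Xw||^2 (normal equations)
w = np.linalg.solve(X.T @ X, X.T @ y)
print(w)   # approximately [1, 2]

# training error of the fitted model
print(np.mean((y - X @ w) ** 2))
```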
- Lecture 02 slides (File)
- Lecture 02 video (URL)
- Additional notes/exercises (File)
- Additional notes/exercises - Solution (File)
- Python exercise (URL)
- Solution to Python Exercise (URL)
We discussed training and test error, overfitting and underfitting, and regularization as a way to reduce model sensitivity and hence overfitting. We discussed logistic regression for classification and the logistic loss function, and provided an interpretation in terms of cross-entropy. Lastly, we discussed performance metrics based on the confusion matrix.
Additional resources: Section 2.4 up to 2.4.2 of the ML4Engineers book, and pages 45-46 (norms) and 48-50 of the LinAlgebra book.
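As an illustration of logistic regression, the logistic (cross-entropy) loss, and the confusion matrix, here is a minimal scikit-learn sketch; the built-in breast cancer dataset is used only as a placeholder and is not part of the course exercises.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, log_loss
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# standardize the features, then fit logistic regression;
# C is the inverse of the regularization strength
clf = make_pipeline(StandardScaler(), LogisticRegression(C=1.0))
clf.fit(X_tr, y_tr)

print(confusion_matrix(y_te, clf.predict(X_te)))   # test-set confusion matrix
print(log_loss(y_te, clf.predict_proba(X_te)))     # average logistic / cross-entropy loss
```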
- Lecture 03 slides - after (File)
- Lecture 03 video (URL)
- Python Exercise (URL)
- Solution to Python Exercise (URL)
- Solution to python exercise: penguin dataset (URL)
Additional resources: Section 2.3.4 on multinomial logistic regression of the ML4Engineers book.
- Lecture 04 slides after (File)
- Lecture 04 video (URL)
- Problem set 1 (File)
- Solution to Problem set 1 (File)
After reviewing the concepts of conditional probability distributions and Bayes' rule, we defined the Naive Bayes classifier as an approach to classification. We first discussed the approach for the case in which the features are finite-valued, and then discussed the Gaussian Naive Bayes classifier for the case in which the features are continuous-valued.
As a background on probability, you may review these notes.
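As a small illustration of the Gaussian Naive Bayes classifier for continuous-valued features, here is a scikit-learn sketch; the Iris dataset is only a placeholder and is not the dataset used in the exercises.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# fits one Gaussian per (class, feature) pair and applies Bayes' rule under the
# "naive" assumption that features are conditionally independent given the class
clf = GaussianNB().fit(X_tr, y_tr)
print(clf.score(X_te, y_te))         # test accuracy
print(clf.predict_proba(X_te[:3]))   # posterior class probabilities
```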
- Lecture 05 slides (File)
- Lecture 05 video (URL)
- Solution to python exercises (URL)
- Quiz 1 (File)
- Solution to Quiz 1 (File)
We reviewed the concepts from data statistics and probability needed for this course: probability distributions, empirical distributions, independence, conditional distributions, and conditional independence. We discussed the Naive Bayes classifier and saw examples of its use for spam email detection.
Note: For an example of conditional independence, see the Wikipedia article (the example with coloured boxes). For a review of the probability concepts we covered, see Chapter 3, Sections 3.1-3.9 of the Deep Learning book.
In the exercise hour, you will apply kNN to two datasets and compare its performance on each; a small reference sketch is given below.
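This minimal kNN sketch uses scikit-learn with a built-in dataset as a stand-in for the biomed and penguin data, and illustrates how the choice of k affects test accuracy.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# standardizing the features matters for distance-based methods such as kNN
for k in (1, 5, 15):
    knn = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=k))
    knn.fit(X_tr, y_tr)
    print(k, knn.score(X_te, y_te))   # test accuracy for each choice of k
```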
- Lecture 06 slides after (File)
- Lecture 06 video (URL)
- Python exercise: kNN biomed (URL)
- Solution: kNN biomed (URL)
- Python exercise: kNN penguin (URL)
- Solution: kNN penguin (URL)
We discussed neural networks as nonlinear predictors with a specific structure, for both classification and regression, and discussed training neural networks using variants of gradient descent.
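As an illustration of training a small neural network with a gradient-descent variant, here is a sketch written with PyTorch; the framework choice, architecture, and toy data are assumptions made for illustration and may differ from the exercise notebooks.

```python
import torch
from torch import nn

# toy regression data: y = sin(x) plus noise (illustrative only)
torch.manual_seed(0)
x = torch.linspace(-3, 3, 200).unsqueeze(1)
y = torch.sin(x) + 0.1 * torch.randn_like(x)

# a small multilayer perceptron: one hidden layer with a nonlinear activation
model = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.SGD(model.parameters(), lr=0.05)   # a gradient-descent variant
loss_fn = nn.MSELoss()

for step in range(2000):
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()   # backpropagation computes the gradients
    opt.step()        # gradient-descent update of the weights

print(loss.item())    # final training loss
```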
- Lecture 07 slides (File)
- Lecture 07 video (URL)
- Python exercise: NN MNIST (URL)
- Solution: NN MNIST (URL)
- Python exercise: NN power system (URL)
- Solution: NN power system (URL)
We motivated convolutional neural networks, defined the convolution of a signal with a filter, and saw examples of convolution. We then described convolutional neural networks.
Additional resources: Sections 6.1, 6.2, and 6.3 of Chapter 6 of the ML4Engineers book. For transfer learning, see the case study in Section 6.5 of the ML4Engineers book.
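To make the convolution operation concrete, here is a minimal NumPy sketch convolving a 1-D signal with a filter; the signal and filter are arbitrary illustrative choices.

```python
import numpy as np

signal = np.array([0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0])
kernel = np.array([1.0, 0.0, -1.0])   # a simple difference (edge-detecting) filter

# 'valid' keeps only positions where the filter fully overlaps the signal
out = np.convolve(signal, kernel, mode="valid")
print(out)   # [ 2.  2.  0. -2. -2.]
```

Note that most deep-learning libraries actually implement cross-correlation (convolution without flipping the filter), but the underlying idea of sliding a filter over the input is the same.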
- Lecture 08 slides (File)
- Lecture 08 video (URL)
- Problem set 2 (File)
- Problem set 2 solution (File)
- Python exercise: CNN MNIST (URL)
- Solution: CNN MNIST (URL)
- Python exercise (bonus): CNN EuroSAT (URL)
- Solution (bonus): CNN EuroSAT (URL)
Additional resources: For PCA, see the PCA lecture from the MLStanford course and the StatQuest posts for a review of data standardization and covariance.
In this lecture, we discussed clustering and specifically the k-means approach. This approach determines k clusters to group the data and uses the mean of the data points in each cluster as a representative of the points in that cluster. Next, we considered decision trees for regression and classification. We observed that decision trees can be interpretable (if the depth is not too large). Finding an optimal decision tree is a challenging optimization problem; hence, we considered greedy algorithms that add nodes sequentially based on the best-performing feature and threshold value at each tree depth.
For additional resources on clustering, read Chapter 4 of the LinAlgebra book.
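A minimal k-means sketch with scikit-learn is given below; the synthetic blob data and the choice of k = 3 are illustrative assumptions, not the exercise dataset.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# synthetic data with 3 well-separated groups (illustrative only)
X, _ = make_blobs(n_samples=300, centers=3, random_state=0)

# k-means alternates between assigning points to the nearest centre
# and recomputing each centre as the mean of its assigned points
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(km.cluster_centers_)   # one representative (mean) per cluster
print(km.labels_[:10])       # cluster assignment of the first 10 points
```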
- Lecture 10 slides (File)
- Lecture 10 video (URL)
- Python exercise: PCA (URL)
- Solution: PCA (URL)
- Python exercise: kMeans (URL)
- Solution: kMeans (URL)
- Solution to quiz 2 (File)
We continued discussing how to build classification trees using the Gini impurity index. For further examples on decision trees, you may follow the StatQuest link on Classification Trees.
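For concreteness, the sketch below computes the Gini impurity of a set of labels and the weighted impurity of a candidate split; the helper functions and the tiny example are ours, not taken from the course material.

```python
import numpy as np

def gini(labels):
    """Gini impurity: 1 - sum_k p_k^2 over the class proportions p_k."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def split_impurity(feature, labels, threshold):
    """Weighted Gini impurity of splitting on feature <= threshold."""
    left = labels[feature <= threshold]
    right = labels[feature > threshold]
    n = len(labels)
    return (len(left) / n) * gini(left) + (len(right) / n) * gini(right)

# tiny illustrative example
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([0, 0, 0, 1, 1, 1])
print(gini(y))                     # 0.5
print(split_impurity(x, y, 3.5))   # 0.0 -- a perfect split
```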
We next had a discussion, moderated by Prof. Sascha Nick, on conditions for AI to benefit societies.
Note: I have added resources on AI ethics that you can use to prepare yourself when reflecting on this topic in your work.
- Lecture 11 slides (File)
- Lecture 11 video (URL)
- Link for discussion (URL)
- Python Exercise (URL)
- Solution to Python Exercise (URL)
- Problem set 3 (File)
- Solution to Problem set 3 (File)
- Link to video on AI ethics for engineers (URL)
- Nature article on Aligning AI with climate change (URL)
We reviewed the AI ethics lecture and moved on to discrete-time dynamical systems and control, as a first step towards our reinforcement learning lecture.
Furthermore, we had a guest lecture by Dr. Roberto Castello from the Swiss Data Science Center on AI in industry.
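As a small illustration of a discrete-time dynamical system with feedback control, here is a sketch simulating x(t+1) = A x(t) + B u(t) with u(t) = -K x(t); the matrices A, B, and K are arbitrary illustrative choices, not taken from the lecture.

```python
import numpy as np

# illustrative system matrices (not from the lecture)
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B = np.array([[0.0],
              [0.1]])
K = np.array([[1.0, 1.5]])   # a simple state-feedback gain

x = np.array([[1.0],
              [0.0]])        # initial state

for t in range(50):
    u = -K @ x               # feedback control input
    x = A @ x + B @ u        # discrete-time state update

print(x.ravel())             # the state is driven close to the origin
```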
16 December - 22 December
- Lecture 13 - Reinforcement learning (File)
- Problem set 4 (File)
- Problem set 4 Notes (File)
- Recording of lecture 13 (URL)
- AI ethics question (File)