Machine learning II

MICRO-570

Course summary

========================

Content: This course will present some of the core advanced methods in the field for structure discovery, classification and non-linear regression.

Prerequisites: This course is intended as an advanced class in ML for MSc and PhD students and hence focuses on advanced topics of ML as well as on critical reading of the state of the art in ML. Students are expected to have taken an introductory ML course, such as the MSc-level Machine Learning I course given during the winter semester.

The course assumes prior knowledge in:

  • Linear Algebra, Probability and Statistics.
  • Standard machine learning and statistical analysis techniques such as PCA, K-means, SVM, linear and support vector regression, perceptron, and feedforward neural networks.

Practicals & Software: The course includes computer-based practicals covering various topics (Kernel methods, Classification and Regression). The practicals rely on MATLAB and ML_toolbox, an open-source software package for visualizing and testing most of the major algorithms seen in class in MATLAB.

Students use their own PC and must have the related software packages installed.

========================

Time and location:

The course takes place on Wednesdays from 13:15 to 16:00 in room INF 119.

========================

Format of the course:

The course format is that of a split-class.

  • A 45-minute video presenting the theory of the course must be watched prior to coming to class.
  • 2 x 45 minutes (1h30) of interactive lecture and interactive exercise session.
  • 45 minutes of programming session, alternating with a Q&A session on projects (see schedule).

The interactive lecture will be given in class. If infrastructure allows, we will also give it on Zoom for those who prefer to attend remotely at: https://epfl.zoom.us/j/62262147163

The interactive exercise session and programming session will be given on site.

========================

A repository of all lecture videos is available at: https://mediaspace.epfl.ch/channel/LASA+-+Machine+Learning+Courses/30562

========================

Grading: 40% of the grade will be based on personal work done during the semester (this entails either doing a computer-based mini-project (code competition) or presenting papers on an advanced topic in the class debates, see Practicals section below). The remaining 60% of the grade will be based on a 30-minute oral exam (15 minutes preparation, 15 minutes oral defense). The oral exam will examine the student on the material covered during the course.
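As a worked illustration of the 40/60 weighting above, here is a minimal Python sketch. Only the two weights come from the course description; the function name and the example marks are hypothetical, and the sketch does not model rounding rules or late-submission penalties.

```python
def final_grade(personal_work: float, oral_exam: float) -> float:
    """Weighted average using the weights stated in the grading policy:
    40% personal work (debate or code competition), 60% oral exam."""
    return 0.4 * personal_work + 0.6 * oral_exam


# Hypothetical marks, for illustration only.
example = final_grade(personal_work=5.0, oral_exam=4.5)
print(f"Final grade: {example:.2f}")
```

Because the oral exam carries the larger weight, a one-point change in the oral mark moves the final grade by 0.6 points, against 0.4 points for the personal work.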

========================

You must choose between participating in the coding competition or in a debate to earn the 40% of the grade based on personal work. 

Debates: A set of topics for the panels/debates will be announced by the instructor. You can choose which camp you will defend. Debates last 30 minutes and take place in class. To prepare for the panel/debate, you must read the associated literature and prepare a deck of slides and a 2-page summary of the main arguments you plan to bring to the table. The panels/debates unfold as follows: first, each debater presents their position in 1-2 slides, then they debate with one another and with the audience for 15-20 minutes. During the live debate, debaters must back their arguments using some of their additional slides, which contain examples or more detailed facts. The audience votes on which side wins! The grade is based on the quality of the content of the arguments, slides and summary. The grade is not influenced by the outcome of the audience's vote!

Code Competitions: Several datasets are offered. This is an unsupervised learning problem: the goal is to extract insightful information from the data, such as inferring patterns, identifying features that are instrumental and features that are irrelevant, extracting outliers, etc. To extract this information, you need to use one or more algorithms of your choice. You should tune each algorithm to obtain the best performance, taking into account the algorithm's sensitivity to parameter choices. You report the results you have found in a 2-page summary and a set of slides, which you present in class. The most insightful analysis wins! The grade is based on the quality of the summary, slides and presentation. The grade is not influenced by the outcome of the competition!
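To make "sensitivity to parameter choices" concrete, here is a minimal sketch of a parameter sweep. It is written in plain Python for illustration only (the course practicals use MATLAB and ML_toolbox), and the toy dataset, the `kmeans` helper and its evenly spaced initialisation are all assumptions, not part of the competition material. It runs plain K-means for several values of K and records the within-cluster sum of squares, one possible way to see how strongly a result depends on a parameter.

```python
import math


def kmeans(points, k, iters=50):
    """Plain Lloyd's K-means with a deterministic, evenly spaced initialisation."""
    step = len(points) // k
    centers = [points[i * step] for i in range(k)]
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        # Update step: move each center to the mean of its assigned points.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster in place
        if new_centers == centers:
            break  # converged
        centers = new_centers
    # Within-cluster sum of squared distances (lower means tighter clusters).
    inertia = sum(math.dist(p, centers[lab]) ** 2
                  for p, lab in zip(points, labels))
    return labels, inertia


# Hypothetical toy data: three well-separated groups of three points each.
points = [(0.0, 0.0), (0.3, 0.1), (0.1, 0.4),
          (5.0, 5.0), (5.2, 4.9), (4.9, 5.1),
          (10.0, 0.0), (10.1, 0.2), (9.9, -0.1)]

# Sweep the parameter K and record the fit quality for each value.
sweep = {k: kmeans(points, k)[1] for k in (1, 2, 3, 4)}
for k, inertia in sweep.items():
    print(f"K={k}: within-cluster sum of squares = {inertia:.3f}")
```

On this toy data the sum of squares drops sharply up to K=3 and only marginally afterwards, which is the kind of evidence a summary can use to justify a parameter choice. On real data the pattern is usually less clear-cut and depends on the initialisation.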

Instructions and topics are available HERE.

You can sign up for debates and code competitions using the two polls at the end of this section.

Deadline to register for a project: March 5 2025

Deadline to submit your files:

  • Debates and Coding Competition: the 2-4 page summary and slides must be submitted by 13h00 (1pm) on Monday, May 26.

Late submissions incur a 1-point penalty per day late! Summary and slides must be uploaded on Moodle, see link at the bottom of this section.

Schedule for Oral presentation of Competition and Debates will be posted in due time.

========================

Online Resources in Machine Learning:


Recommended Textbooks:

General Introduction to Machine Learning:


Kernel Methods: PCA, SVM:

  • "Kernel Methods for Pattern Analysis" by John Shawe-Taylor, Nello Cristianini, Cambridge University Press (June 28, 2004)
  • "Pattern Recognition and Machine Learning" by Christopher M. Bishop, Springer, 1st edition (October 1, 2007)
  • "Learning with Kernels" by B. Schölkopf and A. Smola, MIT Press, 2002

Statistical Learning Methods:


Neural Networks:

  • "Spiking Neuron Models" by W. Gerstner and W. M. Kistler, Cambridge University Press, Aug. 2002
  • "Hebbian Learning and Negative Feedback Networks (Advanced Information and Knowledge Processing)" by C. Fyfe, Springer
  • "Independent Component Analysis" by A. Hyvärinen, J. Karhunen and E. Oja, Wiley-Interscience, 2001
  • "Self-Organizing Maps" by Teuvo Kohonen, Springer Series in Information Sciences, 30, Springer, 2001
  • "Introduction to Neural Networks: A Comprehensive Foundation" (2nd Edition) by S. Haykin

Reinforcement Learning:

  • "Reinforcement Learning: An Introduction", R. Sutton & A. Barto, A Bradford Book. MIT Press, 1998
  • "Reinforcement Learning: A Survey", Leslie Pack Kaelbling, Michael L. Littman and Andrew W. Moore, Journal of Artificial Intelligence Research, Volume 4, 1996

Course materials: Course materials are divided into several categories, defined as follows:

  • L : Lecture slides and Solutions to exercises
  • LX : Supplementary lectures (extra topics not evaluated)
  • TP : Practicals-related materials (Description, MATLAB code, and Solutions)
  • MP : Mini-projects related information
  • RD : Related Documentation (Textbook excerpt, etc.)
  • AM : Additional materials (MATLAB code, etc.)

========================

Lecture Notes:

The Lecture Notes of this course can be downloaded by clicking on the link below:


19 February - Introduction - Spectral Methods [PART-1] - Kernels


Lecture
  • Introduction to class format
  • Brief overview of topics we will see in class
Interactive  exercise session
  • Kernels - definitions, types,
  • Geometric deformation of the space induced by kernels
Practice session
  • Introduction to coding projects and literature reading for debates
    - Registration poll for the coding project: link
    - Registration poll for the debates project: link



26 February - Spectral Methods [Part 2] - Kernel PCA

TODO prior to coming to class:
Class will be divided as follows:

Interactive Lecture and Exercises (1:15-3pm)
  • Kernels continued
  • Kernel-PCA (kPCA)
Supplementary Exercises & Practice session (3:15-4pm)
Q&A and team formation for coding and debate competition


Do not forget: Deadline to register for Coding and Debate Competition: March 5




5 March - Spectral Methods [Part 3] - Linear & kernel CCA

TODO prior to coming to class:
Watch video on kernel CCA
Watch video on Applications of kernel CCA (Optional) 
Class will be divided as follows:
Interactive Lecture & Exercises - 1:15-3pm
  • kernel CCA
Practice session Coding Competition: 3:15-4pm

12 March - Spectral Methods [Part 4] - kernel K-means

Todo:
  • Watch video on kernel K-means
  • Watch video on how to handle missing data and unbalanced datasets

Interactive lecture & exercises (2 hours only from 13h15-15h00)
  • kernel K-means

Q&A session on Coding Competition & Debates

  • How to choose and analyse an algorithm
  • How to search for literature.


19 March - Spectral Methods - Practice Session 1 (kPCA, CCA and kMeans)

TA in charge of solution: Yongtao

This class is only a practice session. Bring your own laptop and practice your understanding of kernel PCA, CCA and kernel K-means.



26 March - Spectral Methods [Part 5] - Spectral Clustering & Nonlinear Embedding


Interactive Lecture & exercises (1-24m)
  • Spectral Clustering & non-linear embedding


2 April - Spectral Methods - Practice Session 2 (Manifold Learning)

TA in class: Yongtao & Sthit

Objective:

Practice session: This is a 3-hour practice session on manifold learning methods. Come to class and do Practice Session 2 in MATLAB on your laptop.

Coding Project: Also use this time to ask the TAs for help with your coding project.




9 April - SVM for Clustering, Semi-Supervised Clustering and Classification

TA: Baiyu presents solution, Yongtao to be present


Watch videos of lecture presenting:
  • (Optional) Video recap of SVM and its weaknesses
  • SVM Limitations
  • Sparse SVM - nu-SVM and Relevance Vector Machine (RVM)
  • SVM Semi-Supervised Clustering - Transductive SVM
  • SVM clustering - SVC
Interactive Lecture  (1-3pm)
  • SVM Clustering, Semi-Supervised Clustering, Sparse SVM + Polynomial kernel for SVM
Coding Competition Support (3-4pm)
This session is meant to support students who opted for the coding competition in their implementation and evaluation. By then, we expect that you have reached the first milestone:
  • 1st Milestone: One or two algorithms already implemented on the dataset
  • Show the TA the early results you have obtained



16 April - Practice Session 3 - Sparse SVM versus other classification methods

TA:  Baiyu is present and presents the solution


Objectives:

Practice session: This is a practice session on sparse SVM and other classification methods seen in class

Coding project: Take the time to ask the TA for help with your coding project





23 April - EASTER BREAK

Support Vector Machine (SVM) and extensions

Wednesday April 1 Lecture 

  • Lecture

Friday April 3
  • Exercise session


30 April - From Linear to Nonlinear Regression - Gaussian Process Regression AND Q&A Debate and Coding Competition for GROUP 1

TA: Baiyu is in-class and presents solutions, Yongtao helps

TODO: Watch the two videos on Ridge regression and Support Vector Regression extensions: nu-SVR and RVR

Schedule
  • 1:15-2pm Interactive lecture on nonlinear regression
  • 3:15-4pm Q&A Debate and Coding Competition for GROUP 1 ONLY




7 May - From Probabilistic PCA to Gaussian Process Latent Variable Models (GPLVM) and Variational Inference AND Q&A for Debate and Coding Competition for GROUP 1

TA: Yongtao is in-class and presents solutions, Baiyu helps

1:15-3pm
  • Interactive lecture on Gaussian Process Regression, and extensions: GPLVM and variational inference
3h15-4pm

Support to Debate and Coding Competition for GROUP 1 ONLY


14 May - Oral presentation - Debates and Coding Competition - Group 1

This class is devoted to students presenting the results of their coding project or debate discussion. The schedule is described below.

Note that the 4-page summaries for the debates and coding competitions are due on May 26 at 13h00 (1pm).

Schedule for presentations:

13h15-13h30: Debate Success of ML - Sofiane Walid against TAs/Teacher + classroom

13h30-13h40: Coding competition - Average Monthly Surface Temperature - Benjamin Beretz

13h40-13h50: Coding competition - Average Monthly Surface Temperature - Xinyi Han

13:50-14:00: Coding competition - Average Monthly Surface Temperature - Q&A

14:00-14:10: Coding competition - Netflix - Gregoire Gimenez

14:10-14:20: Coding competition - Netflix - Irvin Dalaud

14:20-14:30: Coding competition - Netflix - Q&A

14:30-14:40: Coding competition - Traffic Accident - Valentin Perret

14:40-14:50: Coding competition - Traffic Accident - Damien Vincent

14:50-15:00: Coding competition - Traffic Accident - Q&A

15:00-15:10: Coding competition - Food Nutrition - Quentin Rossier

15:10-15:20: Coding competition - Food Nutrition - Osman Ornek

15:20-15:30: Coding competition - Food Nutrition - Q&A 

15:30 - 16h00: Grade Deliberation - TA + Teacher (no students)


21 May - Q&A Coding Competition - Group 2

This class is devoted to students asking questions about the project preparation, GROUP 2 only.


28 May - Oral presentation - Debates and Coding Competition - Group 2

This class is devoted to students presenting the results of their coding project. The schedule is described below.

Students present in this order:

13h15-13h25: Coding competition - Calories Burned  - Maksymiliann Wojciech Schoeffel

13h25-13h35:  Coding competition - Calories Burned - Julien Ferdinand Gouraud

13h35-13h45:  Coding competition - Calories Burned - Q&A

13:45-13:55: Coding competition - AI/ML Salaries - Théo Pierre Luc Basseras

13:55-14:05: Coding competition - AI/ML Salaries - Thomas Jerver Asmussen

14:05-14:15: Coding competition - AI/ML Salaries - Q&A

14:15-14:25: Coding competition - Education & Career Success - Javier De Ramón Murillo

14:25-14:35: Coding competition - Education & Career Success - Rayan Bouchallouf

14:35-14:45: Coding competition - Education & Career Success - Q&A

14h45-14h55:  Coding competition - French employment, salaries, population - Mathieu Stawarz

14h55-15h05:  Coding competition - French employment, salaries, population - Marko Mitric

15h05-15h15:  Coding competition - French employment, salaries, population - Q&A

15:15 - 15h45: Grade Deliberation - TA + Teacher (no students)





14 May - 1st Debate and Coding Competition Reinforcement Learning (RL Part-1)


TODO: Watch the discrete RL video.

Schedule
  • 13:15-2pm Interactive lecture on RL
  • 2h15-4pm Practice session (on computer) on regression


10 May - Continuous & Inverse RL (RL Part - 2)

TODO: Watch the continuous RL video.

Schedule
  • 13:15-2pm Interactive lecture on inverse RL
  • 2h15-4pm Exercises on continuous RL and IRL


17 May - HMM


TODO: Watch the first video (39 minutes) - Theory on Hidden Markov Models.  If interested, you can watch the second video on extensions, but this second video is optional.

  • 1:15-2pm Interactive lecture on HMM (given by the professor on Zoom!)
  • 2h15-4pm Exercise session on RL and HMM (on site)


24 May - Oral paper presentations + mini-project deadline (May 23)


1h15-4pm This session is devoted to oral presentation of papers, see Schedule below.

Deadline for handing in the mini-project report and the slides for the oral presentation of papers: May 23, 12:00 (noon). These must be submitted through Moodle, see link below.



31 May - Overview class, Q&A session

There will be no class. Students are invited to watch the video giving an overview of the class and preparation for the oral exam, see below.

The Q&A session will take place on June 30, from 12:00 (noon) to 2pm, in room ME.A3.31




Boosting-Bagging


Oral Exam - 19 June 2025

The oral exam is worth 60% of your grade. It is closed book but you are allowed to bring a recto-verso A4 page with handwritten personal notes. The dates will be announced by the SAC in April. 


The oral exam will take place on June 19. Each exam slot lasts 30 minutes: 10 minutes of preparation, 15 minutes of oral examination and 5 minutes for the transition between students. The exam is closed-book, but you are allowed to bring one recto-verso A4 page of handwritten notes. Notes can be written on paper or on a tablet.

You can register for a slot for the exam at: https://doodle.com/meeting/participate/id/bYqYLEnb

Registration is on a first-come, first-served basis.