Code: 0005A, Credits (ECTS): 9, Semester: 2, Official Language: English
Instructors: Davide Bacciu - Marco Podda
Office Hours: (email to arrange meeting)
-
Weekly Schedule
The course is held in the second term. The schedule for the academic year 2024/25 is provided in the table below.
The first lecture of the course will be on February 18th, 2025 at 16.00. The course will be held in person, with lecture videos recorded and made available to course students (with no guarantee of quality or completeness).
Day Time
Tuesday 16.15-18.00 (Room C - Polo Fibonacci)
Wednesday 11.00-12.45 (Room L1 - Polo Fibonacci)
Thursday 16.15-18.00 (Room L1 - Polo Fibonacci)
Course Prerequisites
Course prerequisites include knowledge of elements of probability and statistics, calculus and optimization. Previous programming experience with Python is a plus for the practical lectures.
Course Overview
This course provides a comprehensive foundation in machine learning (ML) and deep learning (DL), focusing on their practical applications in healthcare. Students will gain proficiency in key programming libraries and explore how AI methodologies can be leveraged for patient and risk stratification, disease prediction, and modeling disease progression.
Special attention is given to the unique challenges of working with health data, including physiological time-series, clinical text, and medical imaging. Through hands-on lab sessions, students will implement and analyze models for supervised prediction, clinical text processing, interpretability analysis, and causal inference in healthcare contexts. This course equips students with the essential skills and insights needed to harness ML/DL technologies for transformative healthcare solutions.
Topics: Fundamentals of Probability and Statistics for AI, Fundamentals of Machine Learning, ML for Risk Stratification and Diagnosis, Bayesian Networks in Healthcare, Deep Learning Fundamentals, Medical Imaging Data, Deep Learning for Medical Imaging, Sequential Data in Health, ML with Clinical Text: Natural Language Processing, Attention and Transformer Models, Language Models for Clinical Text and Medical Imaging, Graphs in Health and Life Sciences, Deep Learning for Graphs in Health, Tackling Challenges in Healthcare Data Processing, and Deployment of AI-based Applications for Health
Textbooks and Teaching Materials
Much of the course content will be available through lecture slides and associated bibliographic references.
We will use two main textbooks, one covering the general fundamentals of artificial intelligence and machine learning, and the other more oriented towards applications of AI and statistics to healthcare.
Note that both books have an electronic version freely available online.
[SD] Simon J.D. Prince, Understanding Deep Learning, MIT Press, 2023 (book online and additional materials)
[AI4H] G.J. Simon, C. Aliferis, AI and ML in Health Care and Medical Sciences, Springer, 2024 (open book online)
Useful Jupyter Notebooks
Familiarity with Python libraries such as NumPy, pandas, and Matplotlib is assumed. In case you need a refresher, here are some comprehensive notebooks:
-
Introduction to the course philosophy, its learning goals and expected outcomes. We will discuss the overall structure of the course and the interrelations between its parts. Exam modalities and schedule are also discussed.
Date Topic References
1 18/02/2025 (16-18)
Introduction to the course
Motivations and aim; course housekeeping (exams, timetable, materials); overview of AI, its historical development, key concepts, and its relevance to the field of digital health; discussion on the importance of AI in modern healthcare systems and its potential transformative impact.
[AI4H] Chapter 1 "Artificial Intelligence (AI) and Machine Learning (ML) for Healthcare and Health Sciences"
-
The module will introduce the fundamentals of AI and machine learning, formalizing the main learning methods and providing knowledge on baseline ML methodologies including regression, neural networks, probabilistic/Bayesian models, causality, complemented with statistical methods for risk estimation and censored data.
Date Topic References Additional material
2 19/02/2025 (11-13)
Fundamentals of Probability and Statistics for AI
basic concepts of probability and statistics; random variables and probability distributions; hypothesis testing and statistical inference
[SD] Appendix C1-C3 (refresher)
L1 20/02/2025 (16-18)
Lab Tutorial 1: Hypothesis testing
Other notebooks on P&S
3 25/02/2025 (16-18)
Machine Learning: fundamentals I
basic concepts of ML, learning paradigms and fundamental ML tasks, data types and their roles in ML, generalization, bias/variance tradeoff
[SD] Ch. 1, Sect 2.1
[AI4H] Pg. 1-16, Pg. 68-74, pg. 81-87
L2 26/02/2025 (11-13)
4 27/02/2025 (16-18)
Machine Learning: fundamentals II
Regularization and model selection
Machine Learning: Linear Models I
Linear regression, regularized linear regression (ridge, LASSO, ElasticNet)
[SD] Sect. 2.2, Sect. 6.1.0-6.1.1
More details on linear regression can be found in this addendum to the SD book, in section 8.1
5 04/03/2025 (16-18)
Machine Learning: Linear Models II
least squares solutions, logistic regression, binary classification, gradient descent training.
More details on logistic regression can be found in this addendum to the SD book, in section 9.1
L3 05/03/2025 (11-13)
L4 06/03/2025 (16-18)
6 11/03/2025 (16-18)
Artificial Neural Networks I
introduction to artificial NNs; biological neuron; artificial neuron; multi-layer perceptron
[SD] Chapter 3 (with the exclusion of Sect. 3.2)
7 12/03/2025 (11-13)
Artificial Neural Networks II
activation functions, input normalization; output layers; training artificial NNs; backpropagation
[SD] Sections 6.1 & 6.2: gradient descent
[SD] Sections 7.1 & 7.2: computing gradients
[SD] Section 7.4: backpropagation (additional, in-depth material for those who want to know more about how backprop works)
L5 13/03/2025 (16-18)
8 18/03/2025 (16-18)
Risk stratification
scoring models; risk factors; assessment of risk predictors; censoring
Slides should be sufficient for this lecture. If you need additional sources of information, here is a high-level introduction to risk stratification.
[1] Time-variant logistic regression for risk scoring
[2] Explainable risk scoring with random forests, XGBoost, SVM, NNs
Software:
Here is a quite handy library in R for calibration plots
9 19/03/2025 (11-13)
Survival analysis
survival analysis framework; Kaplan-Meier; Cox regression; neural networks for survival analysis; survival trees
[AI4H] Pg. 154-159, Pg. 162-168
[3] Entry-level survey on survival analysis
[4] Whole textbook on survival analysis
[5] Paper on survival trees
Software:
The scikit-survival package for survival analysis (datasets, Kaplan-Meier, Cox, survival trees and forests, gradient boosting survival, hypothesis testing)
L6 20/03/2025 (16-18)
10 25/03/2025 (16-18)
Bayesian Networks in Healthcare I
graphical formalism; random variables and conditional independence; factorized distributions
[AI4H] Pg. 57-62
11 26/03/2025 (11-13)
Bayesian Networks in Healthcare II
relevant graphical substructures; learning in Bayesian Networks; Bayesian Networks in healthcare
[AI4H] Pg. 113-116
Software:
pgmpy - Python package for causal inference and probabilistic inference with Bayesian Networks
12 27/03/2025 (16-18)
Causality and learning dependences
causal relationships and interventions; measuring treatment effects; randomized controlled trials; discovering dependence in data; structure learning; applications in healthcare and useful libraries
[AI4H] Pg. 197-204, Pg. 207-215, Pg. 218-224
[6] Tutorial paper on the use of causality in ML
Software:
PyWhy - A full Python-based ecosystem for causal learning
CausalLearn - Python-based structure learning package
L7 01/04/2025 (16-18)
-
Date Topic References Additional Material
13 02/04/2025 (11-13)
Deep Learning Fundamentals I
deep neural networks; gradient issues; activation functions; normalization and regularization; optimization
[SD] Ch. 4, Sect. 7.1, 7.3, 7.5
14 03/04/2025 (16-18)
Deep Learning Fundamentals II
neural autoencoders; unsupervised deep learning; autoencoding tasks in healthcare (anomaly detection, compression, denoising)
[SD] Sect. 6.2-6.5, Sect. 9.3
[6] Survey on autoencoders
[7,8] Surveys on autoencoders use in healthcare
15 08/04/2025 (16-18)
Convolutional Neural Networks I
Introduction to medical imaging; basic CNN elements
[SD] Chapter 10
16 09/04/2025 (11-13)
Convolutional Neural Networks II
CNN architectures; medical imaging tasks
[SD] Chapter 10
Additional readings
[9] Augmentation techniques for medical imaging
[10] Survival analysis with CNNs
[11] Large-scale medical image segmentation dataset and application
Software
nnU-Net - Well-engineered framework for low-code medical segmentation tasks
L8 10/04/2025 (16-18)
GL 15/04/2025 (16-18)
AI Meets Psychiatry: fMRI-Based Multi-Disorder Diagnosis - Guest Lecture by Elisa Ferrari (CEO, QuantaBrain)
QuantaBrain is a startup developing AI models for the diagnosis and characterization of psychiatric disorders using a short resting-state functional MRI scan, a complex 4D dataset influenced by numerous variables. In this seminar, the QuantaBrain team will present their neural network architecture, which combines adversarial learning with optical flow. They will share the technical challenges they encountered while training this network and scaling it from one to eleven disorders. Finally, participants will get an exclusive preview of the research platform they are currently building, designed to support scientists working on psychiatric disorders.
L9 16/04/2025 (11-13) Lab Tutorial 9: Deep Neural Networks for image classification
L10 17/04/2025 (16-18) Lab tutorial
18/04/2025 - 25/04/2025 Spring Break: No Lectures
-
The preferred grading modality comprises in-itinere (in-course) assignments and a final oral exam. In-itinere assignments waive the final project.
In-itinere assignments
There are two types of short assignments:
- Laboratory assignment - These are short programming exercises to be solved in-classroom during the laboratory lectures. They will typically have to do with the application of a methodology/model discussed during lectures and lab tutorials, on a benchmark dataset provided by the instructors. We foresee a total of 4 laboratory assignments: each will be scored with a maximum of 3 points.
- Methodology quiz - These are short quizzes concerning the content of the methodological lectures (e.g. multiple choice questions; simple calculations based on an algorithm). They will be solved in the classroom during the methodology lectures. These are closed-book examinations, lasting about 10 minutes and performed “on paper” (electronic devices not allowed). They will be given in randomly selected lectures and come unannounced, so in-person participation in lectures is essential. We foresee a total of 4 methodology quizzes: each will be scored with a maximum of 1 point (fractions of a point are possible).
Both laboratory assignments and methodology quizzes will be scheduled roughly every 3-4 weeks.
Oral exam
The oral examination will test knowledge of the course contents: models, algorithms and their implementation. Lectures whose content is not relevant for the final exam will be clearly marked as such.
Exam grading (preferential way)
The final exam grade is given by the formula below, which combines the total score on lab assignments \( G_{lab} \in [0,12] \), the total score on methodology quizzes \( G_{quiz} \in [0,4] \) and the score achieved during the oral exam \( G_{oral} \in [0,18] \):
\( \text{Final grade} = \min(G_{lab} + G_{quiz} + G_{oral},\ \text{30 cum laude}) \)
Note that students are admitted to the oral exam only if \( G_{lab} + G_{quiz} > 8 \).
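For illustration only, the preferred grading rule can be sketched in Python. The names `Grade`, `admitted_to_oral` and `final_grade` are hypothetical, and the sketch assumes that any total strictly above 30 is reported as 30 cum laude, which the formula suggests but does not state explicitly.

```python
from typing import NamedTuple


class Grade(NamedTuple):
    points: float    # grade on the 30-point scale
    cum_laude: bool  # True when the total exceeds 30 (assumption, see above)


def admitted_to_oral(g_lab: float, g_quiz: float) -> bool:
    """Admission rule from the text: G_lab + G_quiz must be strictly above 8."""
    return g_lab + g_quiz > 8


def final_grade(g_lab: float, g_quiz: float, g_oral: float) -> Grade:
    """Combine the scores as min(G_lab + G_quiz + G_oral, 30 cum laude)."""
    if not (0 <= g_lab <= 12 and 0 <= g_quiz <= 4 and 0 <= g_oral <= 18):
        raise ValueError("score outside the ranges stated in the syllabus")
    if not admitted_to_oral(g_lab, g_quiz):
        raise ValueError("not admitted to the oral exam: G_lab + G_quiz <= 8")
    total = g_lab + g_quiz + g_oral
    # Assumption: totals above 30 are capped and reported as "30 cum laude".
    return Grade(points=min(total, 30), cum_laude=total > 30)


print(final_grade(10, 3.5, 15))  # 28.5 points, no laude
print(final_grade(12, 4, 18))    # capped at 30, with laude
```

With the stated ranges the maximum attainable total is 12 + 4 + 18 = 34, so the cap only matters for students who score highly in all three components.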
Alternative Exam Modality (No in-itinere / Non-attending students)
Part-time students, those not attending lectures, those who have failed in-itinere assignments or simply do not wish to do them, can complete the course by delivering a final project and an oral exam. Final project topics will be released in the final weeks of the course.
The final project concerns a coding project on a topic of interest for the course. It entails preparing and submitting:
- the code to solve the project
- a 10-page report describing the project methodology and its validation
- a presentation of 10-15 slides.
The content of the final project will be discussed in front of the instructors and anybody interested during the oral examination. Students are expected to prepare slides for a 15-minute presentation summarizing the problem, the solution and the results in the report.
The grade for this exam modality is determined as
\( G = 0.5 \cdot (G_P + G_O) \)
where \( G_P \in [1,30] \) is the project grade and \( G_O \in [1,32] \) is the oral grade.
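The same computation for the alternative modality, again as a minimal sketch: the function name `alternative_grade` is hypothetical, and no rounding is applied because the text does not specify any.

```python
def alternative_grade(g_project: float, g_oral: float) -> float:
    """Average of project and oral grades: G = 0.5 * (G_P + G_O)."""
    if not 1 <= g_project <= 30:
        raise ValueError("project grade G_P must lie in [1, 30]")
    if not 1 <= g_oral <= 32:
        raise ValueError("oral grade G_O must lie in [1, 32]")
    return 0.5 * (g_project + g_oral)


print(alternative_grade(28, 30))  # 29.0
```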
-
- Jenna Wiens, John Guttag, Eric Horvitz, Patient Risk Stratification with Time-Varying Parameters: A Multitask Learning Approach, JMLR 2016, PDF
- Tim Smole et al., A machine learning-based risk stratification model for ventricular tachycardia and heart failure in hypertrophic cardiomyopathy, Computers in Biology and Medicine, 2021, PDF
- Ping Wang, Yan Li, Chandan K. Reddy, Machine Learning for Survival Analysis: A Survey, ACM Comput. Surv. 51, 6, Article 110, 2019, Arxiv
- David G. Kleinbaum, Mitchel Klein, Survival Analysis, A Self-Learning Text, 2005, Online
- Dimitris Bertsimas, Jack Dunn, Emma Gibson, Agni Orfanoudaki, Optimal survival trees, Machine Learning, 2022, PDF
- S. Chen, W. Guo, Auto-Encoders in Deep Learning—A Review with New Perspectives, Mathematics, 2023, Online
- D. Pratella et al., A Survey of Autoencoder Algorithms to Pave the Diagnosis of Rare Diseases, Int. J. Mol. Sci., 2021, Online
- Jan Ehrhardt, Matthias Wilms, Autoencoders and variational autoencoders in medical image analysis, MICCAI Society book Series, Online
- Manuel Cossio, Augmenting Medical Imaging: A Comprehensive Catalogue of 65 Techniques for Enhanced Data Analysis, 2023, Arxiv
- Pooya Mobadersany et al., Predicting cancer outcomes from histology and genomics using convolutional networks, PNAS, 2017, Online
- Jun Ma et al., Segment anything in medical images, Nature Communications, 2024, Online