
  • Code: 0005A, Credits (ECTS): 9, Semester: 2, Official Language: English

    Instructors: Davide Bacciu - Marco Podda 

    Office Hours: (email to arrange meeting)

  • Weekly Schedule

    The course is held in the second term. The schedule for A.A. 2024/25 is provided in the table below.

    The first lecture of the course will be on February 18th, 2025 at 16.00. The course will be held in person; lecture videos will be recorded and made available to course students (with no guarantee of quality or completeness).

    Day Time
    Tuesday 16.15-18.00 (Room C - Polo Fibonacci)
    Wednesday 11.00-12.45 (Room L1 - Polo Fibonacci)
    Thursday 16.15-18.00 (Room L1 - Polo Fibonacci)

     

    Course Prerequisites

    Course prerequisites include knowledge of elements of probability and statistics, calculus and optimization. Previous programming experience with Python is a plus for the practical lectures.

    Course Overview

    This course provides a comprehensive foundation in machine learning (ML) and deep learning (DL), focusing on their practical applications in healthcare. Students will gain proficiency in key programming libraries and explore how AI methodologies can be leveraged for patient and risk stratification, disease prediction, and modeling disease progression.

    Special attention is given to the unique challenges of working with health data, including physiological time-series, clinical text, and medical imaging. Through hands-on lab sessions, students will implement and analyze models for supervised prediction, clinical text processing, interpretability analysis, and causal inference in healthcare contexts. This course equips students with the essential skills and insights needed to harness ML/DL technologies for transformative healthcare solutions.

    Topics: Fundamentals of Probability and Statistics for AI, Fundamentals of Machine Learning, ML for Risk Stratification and Diagnosis, Bayesian Networks in Healthcare, Deep Learning Fundamentals, Medical Imaging Data, Deep Learning for Medical Imaging, Sequential Data in Health, ML with Clinical Text: Natural Language Processing, Attention and Transformer Models, Language Models for Clinical Text and Medical Imaging, Graphs in Health and Life Sciences, Deep Learning for Graphs in Health, Tackling Challenges in Healthcare Data Processing, and Deployment of AI-based Applications for Health

    Textbooks and Teaching Materials

    Much of the course content will be available through lecture slides and associated bibliographic references. 

    We will use two main textbooks, one covering more general fundamentals of artificial intelligence and machine learning, and the other more oriented towards applications of AI and statistics to healthcare.

    Note that all books have an electronic version freely available online.

    [SD] Simon J.D. Prince, Understanding Deep Learning, MIT Press (2023) (book online and additional materials)

    [AI4H] G.J. Simon, C. Aliferis, AI and ML in Health Care and Medical Sciences, Springer, 2024 (open book online)

    Useful Jupyter Notebooks

    Familiarity with Python libraries such as NumPy, Pandas, and Matplotlib is assumed. In case you need a refresher, here are some comprehensive notebooks:

  • Introduction to the course philosophy, its learning goals and expected outcomes. We will preview the overall structure of the course and the interrelations between its parts. Exam modalities and schedule are also discussed.

      Date Topic References
    1 18/02/2025
    (16-18)
    Introduction to the course
    Motivations and aim; course housekeeping (exams, timetable, materials); overview of AI, its historical development, key concepts, and its relevance to the field of digital health; discussion on the importance of AI in modern healthcare systems and its potential transformative impact.
    [AI4H] Chapter 1 "Artificial Intelligence (AI) and Machine Learning (ML) for Healthcare and Health Sciences"

     

  • The module will introduce the fundamentals of AI and machine learning, formalizing the main learning methods and providing knowledge on baseline ML methodologies including regression, neural networks, probabilistic/Bayesian models, causality, complemented with statistical methods for risk estimation and censored data.

      Date Topic References Additional material
    2

    19/02/2025 (11-13)

    Fundamentals of Probability and Statistics for AI

    basic concepts of probability and statistics; random variables and probability distributions; hypothesis testing and statistical inference

    [SD] Appendix C1-C3 (refresher)  
    L1 20/02/2025 (16-18) Lab Tutorial 1: Hypothesis testing   Other notebooks on P&S
    3 25/02/2025 (16-18)

    Machine Learning: fundamentals I

    basic concepts of ML, learning paradigms and fundamental ML tasks, data types and their roles in ML, generalization, bias/variance tradeoff

    [SD] Ch. 1, Sect 2.1

    [AI4H] Pg. 1-16, pg. 68-74, pg. 81-87

     
    L2 26/02/2025 (11-13)

    Lab Tutorial 2: ML with scikit-learn

       
    4 27/02/2025 (16-18)

    Machine Learning: fundamentals II

    Regularization and model selection

    Machine Learning: Linear Models I

    Linear regression, regularized linear regression (ridge, LASSO, ElasticNet)

    [SD] Sect. 2.2, Sect. 6.1.0-6.1.1. More details on linear regression can be found in this addendum to the SD book, in Section 8.1.
    5

    04/03/2025 (16-18)

    Machine Learning: Linear Models II

    least squares solutions, logistic regression, binary classification, gradient descent training

      More details on logistic regression can be found in this addendum to the SD book, in Section 9.1.
    L3

    05/03/2025 (11-13)

    Lab Tutorial 3: Model selection

       
    L4

    06/03/2025 (16-18)

    Lab Tutorial 4: Linear regression

       
    6

    11/03/2025 (16-18)

    Artificial Neural Networks I

    introduction to artificial NNs; biological neuron; artificial neuron; multi-layer perceptron

    [SD] Chapter 3 (with the exclusion of Sect. 3.2)  
    7

    12/03/2025 (11-13)

    Artificial Neural Networks II

    activation functions, input normalization; output layers; training artificial NNs; backpropagation

    [SD] Sections 6.1 & 6.2: gradient descent

    [SD] Sections 7.1 & 7.2: computing gradients

    [SD] Section 7.4: backpropagation (additional, in-depth material for those who want to know more about how backprop works)
    L5

    13/03/2025 (16-18)

    Lab 5: In-itinere assignment

       
    8

    18/03/2025 (16-18)

    Risk stratification

    scoring models; risk factors; assessment of risk predictors; censoring

    Slides should be sufficient for this lecture. If you need additional sources of information, here is a high-level introduction to risk stratification.

    [1] Time-variant logistic regression for risk scoring

    [2] Explainable risk scoring with random forests, XGboost, SVM, NNs

    Software:

    Here is a quite handy library in R for calibration plots

    9

    19/03/2025 (11-13)

    Survival analysis

    survival analysis framework; Kaplan-Meier; Cox regression; neural networks for survival analysis; survival trees

    [AI4H] Pg. 154-159, pg. 162-168

    [3] Entry-level survey on survival analysis

    [4] Whole textbook on survival analysis

    [5] Paper on survival trees

    Software:

    The scikit-survival analysis package (datasets, Kaplan-Meier, Cox, survival trees and forests, gradient boosting survival, hypothesis testing)

    L6

    20/03/2025 (16-18)

    Lab Tutorial 6: Survival analysis

       
    10

    25/03/2025 (16-18)

    Bayesian Networks in Healthcare I

    graphical formalism; random variables and conditional independence; factorized distributions

    [AI4H] Pg. 57-62  
    11

    26/03/2025 (11-13)

    Bayesian Networks in Healthcare II

    relevant graphical substructures; learning in Bayesian Networks; Bayesian Networks in healthcare

    [AI4H] Pg.113-116

    Software:

    pgmpy - Python package for causal inference and probabilistic inference with Bayesian Networks

    12

    27/03/2025 (16-18)

    Causality and learning dependences

    causal relationships and interventions; measuring treatment effects; randomized controlled trials; discovering dependence in data; structure learning; applications in healthcare and useful libraries

    [AI4H] Pg. 197-204, pg. 207-215, pg. 218-224

    [6] Tutorial paper on the use of causality in ML

    Software:

    PyWhy - A full Python-based ecosystem for causal learning

    CausalLearn - Python-based structure learning package

    L7

    01/04/2025 (16-18)

    Lab Tutorial 7: Causality

       

  •   Date Topic References Additional Material
    13

     02/04/2025 (11-13)

    Deep Learning Fundamentals I

    deep neural networks; gradient issues; activation functions; normalization and regularization; optimization

    [SD] Ch. 4, Sect. 7.1, 7.3, 7.5
    14  03/04/2025 (16-18)

    Deep Learning Fundamentals II

    neural autoencoders; unsupervised deep learning; autoencoding tasks in healthcare (anomaly detection, compression, denoising)

    [SD] Sect. 6.2-6.5, Sect. 9.3

    [6] Survey on autoencoders

    [7,8] Surveys on autoencoders use in healthcare

    15 08/04/2025 (16-18)

    Convolutional Neural Networks I

    Introduction to medical imaging; basic CNN elements

    [SD] Chapter 10  
    16 09/04/2025 (11-13)

    Convolutional Neural Networks II

    CNN architectures; medical imaging tasks

    [SD] Chapter 10

    Additional readings

    [9] Augmentation techniques for medical imaging

    [10] Survival analysis with CNNs

    [11] Large-scale medical image segmentation dataset and application

    Software

    nnU-Net - Well-engineered framework for low-code medical segmentation tasks

    L8 10/04/2025 (16-18)

    Lab 8: In-itinere assignment

       
    GL 15/04/2025 (16-18)

    AI Meets Psychiatry: fMRI-Based Multi-Disorder Diagnosis - Guest Lecture by Elisa Ferrari (CEO, QuantaBrain)

    QuantaBrain is a startup developing AI models for the diagnosis and characterization of psychiatric disorders using a short resting-state functional MRI scan, a complex 4D dataset influenced by numerous variables. In this seminar, the QuantaBrain team will present their neural network architecture, which combines adversarial learning with optical flow. They will share the technical challenges they encountered while training this network and scaling it from one to eleven disorders. Finally, participants will get an exclusive preview of the research platform they are currently building, designed to support scientists working on psychiatric disorders.

       
    L9 16/04/2025 (11-13)

    Lab Tutorial 9: Deep Neural Networks for image classification

       
    L10 17/04/2025 (16-18)

    Lab tutorial

       
      18/04/2025 - 25/04/2025

    Spring Break: No Lectures

       

  • The preferred grading modality comprises in-itinere assignments and a final oral exam. In-itinere assignments waive the final project.

    In-itinere assignments

    There are two types of short assignments:

    • Laboratory assignment - These are short programming exercises to be solved in-classroom during the laboratory lectures. They will typically have to do with the application of a methodology/model discussed during lectures and lab tutorials, on a benchmark dataset provided by the instructors. We foresee a total of 4 laboratory assignments: each will be scored with a maximum of 3 points.
    • Methodology quiz - These are short quizzes concerning the content of the methodological lectures (e.g. multiple choice questions; simple calculations based on an algorithm). They will be solved in the classroom during the methodology lectures. These are closed-book examinations, lasting about 10 minutes and performed “on paper” (electronic devices not allowed). They will be given in randomly selected lectures and will come unannounced, so in-person participation in lectures is essential. We foresee a total of 4 methodology quizzes: each will be scored with a maximum of 1 point (fractions of a point are possible).

    Both laboratory assignments and methodology quizzes will be scheduled roughly every 3-4 weeks.

    Oral exam

    The oral examination will test knowledge of the course contents: models, algorithms and their implementation. Lectures whose content is not relevant for the final exam will be clearly marked as such.

    Exam grading (preferential way)

    The final exam grade is given by the formula below, which combines the total score on lab assignments \( G_{lab} \in [0,12] \), the total score on methodology quizzes \( G_{quiz} \in [0,4] \) and the score achieved during the oral exam \( G_{oral} \in [0,18] \):

    \( \text{Final grade} = \min(G_{lab} + G_{quiz} + G_{oral},\ 30 \text{ cum laude}) \)

    Note that students are admitted to the oral exam only if \( G_{lab} + G_{quiz} > 8 \).
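
    The grading rule above can be sketched in a few lines of Python. This is only an illustration, not an official grading tool: the function name `final_grade` and its return format are hypothetical, and the sketch assumes that any total above 30 is reported as 30 cum laude, which the formula suggests but does not spell out. Rounding of fractional points is not specified in the rules, so totals are left unrounded.

```python
def final_grade(g_lab, g_quiz, g_oral):
    """Combine in-itinere and oral scores under the preferential modality.

    Returns None when the student is not admitted to the oral exam,
    otherwise a (grade, cum_laude) tuple.
    """
    # Score ranges as stated in the grading rules.
    if not 0 <= g_lab <= 12:
        raise ValueError("g_lab must be in [0, 12]")
    if not 0 <= g_quiz <= 4:
        raise ValueError("g_quiz must be in [0, 4]")
    if not 0 <= g_oral <= 18:
        raise ValueError("g_oral must be in [0, 18]")

    # Admission requires strictly more than 8 points from labs plus quizzes.
    if g_lab + g_quiz <= 8:
        return None

    total = g_lab + g_quiz + g_oral
    # ASSUMPTION: a total above 30 is capped and reported as "30 cum laude".
    if total > 30:
        return (30, True)
    return (total, False)


print(final_grade(10, 3, 15))  # (28, False)
print(final_grade(12, 4, 18))  # (30, True)
print(final_grade(5, 3, 18))   # None: 5 + 3 is not strictly greater than 8
```

    Note that exactly 8 points from labs and quizzes is not enough for admission, since the condition is a strict inequality.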

    Alternative Exam Modality (No in-itinere/ Non attending students)

    Part-time students, those not attending lectures, those who have failed in-itinere assignments or simply do not wish to do them, can complete the course by delivering a final project and an oral exam.  Final project topics will be released in the final weeks of the course.

    The final project is a coding project on a topic relevant to the course. It entails preparing and submitting:

    • the code to solve the project
    • a 10-page report describing the project methodology and its validation
    • a presentation of 10-15 slides.

    The content of the final project will be discussed in front of the instructors and anybody interested during the oral examination. Students are expected to prepare slides for a 15-minute presentation, which should summarize the problem, solution and results in the report.

    The grade for this exam modality is determined as

     \( G = 0.5 \cdot (G_P + G_O) \)

    where \( G_P \in [1,30] \) is the project grade and \( G_O \in [1,32] \) is the oral grade.
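
    As a worked illustration with hypothetical scores (not taken from the course material), a project grade \( G_P = 27 \) and an oral grade \( G_O = 29 \) give

     \( G = 0.5 \cdot (27 + 29) = 28 \)

    The rules do not say how non-integer results are rounded, so this example uses values that average to a whole number.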

    1. Jenna Wiens, John Guttag, Eric Horvitz, Patient Risk Stratification with Time-Varying Parameters: A Multitask Learning Approach, JMLR, 2016, PDF
    2. Tim Smole et al., A machine learning-based risk stratification model for ventricular tachycardia and heart failure in hypertrophic cardiomyopathy, Computers in Biology and Medicine, 2021, PDF
    3. Ping Wang, Yan Li, Chandan K. Reddy, Machine Learning for Survival Analysis: A Survey, ACM Comput. Surv. 51(6), Article 110, 2019, arXiv
    4. David G. Kleinbaum, Mitchel Klein, Survival Analysis: A Self-Learning Text, 2005, Online
    5. Dimitris Bertsimas, Jack Dunn, Emma Gibson, Agni Orfanoudaki, Optimal survival trees, Machine Learning, 2022, PDF
    6. S. Chen, W. Guo, Auto-Encoders in Deep Learning - A Review with New Perspectives, Mathematics, 2023, Online
    7. D. Pratella et al., A Survey of Autoencoder Algorithms to Pave the Diagnosis of Rare Diseases, Int. J. Mol. Sci., 2021, Online
    8. Jan Ehrhardt, Matthias Wilms, Autoencoders and variational autoencoders in medical image analysis, MICCAI Society Book Series, Online
    9. Manuel Cossio, Augmenting Medical Imaging: A Comprehensive Catalogue of 65 Techniques for Enhanced Data Analysis, 2023, arXiv
    10. Pooya Mobadersany et al., Predicting cancer outcomes from histology and genomics using convolutional networks, PNAS, 2017, Online
    11. Jun Ma et al., Segment anything in medical images, Nature Communications, 2024, Online