Trustworthy Machine Learning, 2026HT, 6.0 credits
Course plan
Number of lectures
2 lectures and 6-8 seminars (depending on the number of participants)
Recommended for
This course is intended for PhD students from AI/ML or cybersecurity disciplines who want to gain deeper knowledge about security and privacy vulnerabilities in current machine learning systems, model and data governance, as well as compliance with legal frameworks.
The course was last given
This is a new course.
Goals
- Apply threat modeling methods to identify and analyze vulnerabilities within
machine learning (ML) systems.
- Explain and evaluate key security and privacy risks associated with ML
systems, as well as solutions to mitigate such risks.
- Critically analyze and present seminal papers that contribute different
aspects of trustworthy machine learning, including robustness, privacy,
truthfulness, and integrity.
- Explore and assess relevant legal and governance frameworks that influence
the development and deployment of ML systems.
- Evaluate contemporary ML systems from the perspective of trustworthy ML
principles, and identify gaps between theory and practice.
Prerequisites
- Some background in machine learning/deep learning or related courses.
Organization
The course consists mainly of student presentations and seminar discussions focused on recent research articles.
Content
The course introduces PhD students to methods and frameworks for analyzing and ensuring trustworthiness in machine learning (ML) systems. The first two sessions are delivered in lecture format and use case studies, providing an overview of threat modeling, security, and privacy considerations in ML systems. The remaining sessions consist of student presentations and seminar discussions focused on recent research articles addressing various aspects of trustworthy machine learning, including robustness, data and model confidentiality, truthfulness, integrity, verifiability, and auditability.
Literature
The course literature consists of a number of peer-reviewed papers. A detailed
list of research articles will be provided at the beginning of the course.
Three example papers are given below. Note that although these are among the
most cited seminal papers in the field, they may not be included when the
course starts, as the field is rapidly evolving.
- Goodfellow, I., Shlens, J., and Szegedy, C. (2015). "Explaining and
Harnessing Adversarial Examples". International Conference on Learning
Representations (ICLR). https://doi.org/10.48550/arXiv.1412.6572
- Adi, Y., et al. (2018). "Turning Your Weakness Into a Strength: Watermarking
Deep Neural Networks by Backdooring". 27th USENIX Security Symposium.
https://www.usenix.org/conference/usenixsecurity18/presentation/adi
- Tabassi, E. (2023). "Artificial Intelligence Risk Management Framework
(AI RMF 1.0)". NIST Trustworthy and Responsible AI, National Institute of
Standards and Technology. https://doi.org/10.6028/NIST.AI.100-1
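To give a flavor of the techniques covered in the literature, the Goodfellow et al. paper above introduces the fast gradient sign method (FGSM). The sketch below applies the same idea to a toy logistic regression rather than a deep network; all weights and inputs are made-up illustrative values, not course material:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(x, w, b, y):
    # Binary cross-entropy for a single example.
    p = sigmoid(w @ x + b)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

def fgsm(x, w, b, y, eps):
    # Gradient of the loss w.r.t. the input x; for logistic regression
    # this has the closed form (p - y) * w.
    p = sigmoid(w @ x + b)
    grad_x = (p - y) * w
    # Perturb each feature by eps in the direction that increases the loss.
    return x + eps * np.sign(grad_x)

w = np.array([1.5, -2.0, 0.5])   # fixed "trained" weights (illustrative)
b = 0.1
x = np.array([0.2, -0.4, 0.3])   # benign input
y = 1.0                          # true label

x_adv = fgsm(x, w, b, y, eps=0.1)
print(loss(x, w, b, y))      # loss on the benign input
print(loss(x_adv, w, b, y))  # strictly larger loss on the perturbed input
```

The perturbation is bounded by eps in each coordinate, which is what makes such attacks hard to detect by inspection.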
Lectures
The course consists of two lectures followed by a series of student-led
seminars. The schedule may be adjusted according to the presentation
preferences and the selected topics.
Lectures:
1. Course Introduction and Threat Modeling in Machine Learning: Overview of
trustworthy ML concepts, threat modeling methodologies, and common
vulnerabilities in ML pipelines.
2. Case Studies and AI Governance: Discussion of real-world case studies
illustrating security and privacy challenges in ML systems, and an introduction
to governance and regulatory frameworks for ML systems.
Seminars:
1. Test-Time Integrity: Attacks and defenses related to inference-time
manipulation, adversarial examples, and reliability.
2. Training-Time Integrity: Data poisoning, backdoor attacks, and robustness of
training pipelines.
3. Model Confidentiality: Model extraction, intellectual property protection,
and secure model sharing.
4. Data Confidentiality: Privacy-preserving machine learning, membership
inference, and differential privacy mechanisms.
5. AI Governance, Verification and Auditability: Technical measures to ensure
compliance, accountability, truthfulness and transparency.
6. Trustworthiness in Practice: Trustworthiness in real-world applications and
as a multi-objective problem.
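As a concrete taste of one seminar topic, the differential privacy mechanisms covered in seminar 4 include the classic Laplace mechanism. The sketch below is illustrative only (function names, data, and parameters are made up, not part of the course material):

```python
import numpy as np

def dp_mean(values, lo, hi, epsilon, rng):
    # Clamp each record to [lo, hi]; one record can then change the mean
    # by at most (hi - lo) / n, which is the sensitivity of the query.
    clipped = np.clip(values, lo, hi)
    sensitivity = (hi - lo) / len(clipped)
    # Laplace noise with scale = sensitivity / epsilon yields an
    # epsilon-differentially private release of the mean.
    noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon)
    return clipped.mean() + noise

rng = np.random.default_rng(0)
ages = np.array([23.0, 35.0, 41.0, 29.0, 52.0])
print(dp_mean(ages, lo=18.0, hi=90.0, epsilon=1.0, rng=rng))
```

Smaller epsilon means more noise and stronger privacy; the seminar discusses this accuracy-privacy trade-off in depth.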
Examination
Examination requirements are at least 80% attendance, a mandatory oral
presentation, active participation in the discussions following the
presentations, weekly reading assignments in preparation for the seminar
sessions, and a final deliverable.
The final deliverable is a written essay (5-6 pages) on a chosen aspect of the
trustworthiness requirements in machine learning, in which the student is
expected to define the selected concept, review the state of the art,
critically analyze existing approaches, and discuss promising research
directions or open challenges.
Examiner
Buse Atli
Credits
6 hp
Comments
The course can be given on Zoom/Hybrid if there are non-local participants.
Ethics statement: The course includes topics that involve techniques capable of causing harm. In this course, we emphasize the ethical use of these techniques, strictly for non-commercial research and educational purposes. Any unethical activity (e.g., using lecture materials, seminal papers, or assignments for harmful purposes, or spreading or exploiting vulnerabilities in AI/ML services) is strictly prohibited.
Page responsible: Anne Moe
