Adversarial Machine Learning


Semester: Winter 2021
Course type: Block Seminar
Lecturer: Jun.-Prof. Dr. Wressnegger
Audience: Informatik Master & Bachelor
Credits: 4 ECTS
Room: 148, Building 50.34 and online
Language: English or German
Registration: Please register for the course in ILIAS

Remote Course

Due to the ongoing COVID-19 pandemic, this course will start remotely: the kick-off meeting will take place online. The final colloquium, however, will hopefully be held in person again (this time we might indeed have a chance).

To receive all the necessary information, please subscribe to the mailing list here.


This seminar is concerned with different aspects of adversarial machine learning. Besides the use of machine learning for security, the security of machine learning algorithms themselves is essential in practice. For a long time, machine learning research did not consider worst-case scenarios and corner cases such as those exploited by adversaries today.

The module introduces students to the highly active field of attacks against machine learning and teaches them to work up results from recent research. To this end, students will read up on a sub-field, prepare a seminar report, and present their work to their colleagues at the end of the term.

Topics include, but are not limited to, adversarial examples, model stealing, membership inference, poisoning attacks, and defenses against such threats.


Tue, 19. Oct, 10:00–11:30     Primer on academic writing, assignment of topics
Thu, 28. Oct                  Arrange appointment with assistant
Mon, 1. Nov - Fri, 5. Nov     Individual meetings with assistant
Wed, 1. Dec                   Submit final paper
Wed, 22. Dec                  Submit review for fellow students
Fri, 7. Jan                   End of discussion phase
Fri, 21. Jan                  Submit camera-ready version of your paper
Fri, 11. Feb                  Presentation at final colloquium

Mailing List

News about the seminar, potential updates to the schedule, and additional material are distributed using a separate mailing list. Moreover, the list enables students to discuss topics of the seminar.

You can subscribe here.


Every student may choose one of the following topics. For each of these, we additionally provide a recent top-tier publication that you should use as a starting point for your own research. For the seminar and your final report, you should not merely summarize that paper, but try to go beyond it and arrive at your own conclusions.

Moreover, most of these papers come with open-source implementations. Play around with these and include the lessons learned in your report.

  • Data-Free Adversarial Attacks

    In a black-box setting (the attacker has no access to the internals of the ML model), the adversary usually learns a substitute/surrogate model to craft adversarial examples. However, a lack of authentic training data degrades the effectiveness of such attacks. Recently, a novel attack has been presented that works even without any real data. This seminar topic aims to investigate the possibility of such data-free adversarial attacks.

    • Zhou et al., "DaST: Data-free Substitute Training for Adversarial Attacks", CVPR 2020

  • Non-linearity Helps Robustness

    Adversarial training increases a model's robustness by introducing adversarial examples in the training procedure. Recent research suggests that the effectiveness of adversarial examples against a model is linked to its linearity. Therefore, increasing non-linearity may help improve model robustness. For this seminar topic, the student will investigate the impact of non-linearity on the model's robustness.

    • Cohen et al., "Certified Adversarial Robustness via Randomized Smoothing", ICML 2019

  • Increasing Robustness by Activation Suppression

    The impact of adversarial perturbation is accumulated through different layers of the ML model up until it subverts the final prediction result. Breaking the connection between layers can reduce the adversarial influence on activation maps, allowing for a novel technique for protecting against attacks. For this topic, the student studies how network reconstruction helps to improve a model's robustness.

    • Bai et al., "Improving Adversarial Robustness via Channel-wise Activation Suppressing", ICLR 2021

  • Trade-off between Backdoor and Adversarial Attack

    Backdoor attacks and evasion attacks are two fundamental branches in the field of adversarial machine learning. Backdoors are introduced as early as training, while evasion attacks occur later on during inference. Both, however, share similarities that should be explored as part of this seminar topic:

    • Weng et al., "On the Trade-off between Adversarial and Backdoor Robustness", NeurIPS 2020

  • Backdoors in Federated Learning

    Federated learning aggregates updates from local participants to learn a common/shared remote model, ensuring the confidentiality of each user's training data at a high level. At the same time, the unobservable local data carries hidden threats when a user tries to inject backdoored data into the training process. Defenses against such attacks are thus essential and urgently needed.

    • Bagdasaryan et al., "How To Backdoor Federated Learning", AISTATS 2020

  • Improving NN Robustness by Explainability

    To evaluate adversarial vulnerability in detail, different works propose using l0, l1, or l2 norm-based distances of the activation map as the criterion. The activation map, however, cannot comprehensively reflect the model's prediction behavior with respect to its inputs. Recent works try to connect explainable AI with adversarial robustness: explainability methods such as Integrated Gradients (IG) and Layer-wise Relevance Propagation (LRP) rely heavily on gradients, with a motivation similar to that of adversarial evasion attacks.

    • Boopathy et al., "Proper Network Interpretability Helps Adversarial Robustness in Classification", ICML 2020

  • Stealing DNNs from MLaaS Cloud Platforms?

    Model extraction attacks can be used to "steal" an approximation of an MLaaS ("machine learning as a service") model by querying the model in a black-box fashion. This approximation (substitute/surrogate model) can then be used to generate adversarial examples against the original, remotely deployed model. This seminar topic focuses on systematizing (assumptions, pros, cons, application scenarios) state-of-the-art model extraction attacks and the corresponding defense methods.

    • Kariyappa et al., "Maze: Data-free Model Stealing Attack using Zeroth-Order Gradient Estimation", CVPR 2021

  • Less Budget, More Effective Model Stealing with Active Learning

    Effective model extraction remains a challenge for various reasons, such as a limited query budget or a lack of knowledge about the model's architecture. Recent research considers model extraction as a process of active learning (AL), where the remote model is treated as the oracle in the AL setting. It has been shown that AL strategies benefit model extraction by reducing the query budget significantly. Hence, in this seminar report, the student focuses on state-of-the-art model extraction attacks that use learning techniques such as active learning or transfer learning.

    • Chandrasekaran et al., "Exploring Connections Between Active Learning and Model Extraction", USENIX Security 2020
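
To make the active-learning view of model extraction from the last topic more tangible, the following toy sketch may help. It is not the method of any of the papers listed above: the victim is a hypothetical one-dimensional threshold classifier, and all names (`make_oracle`, `extract_random`, `extract_active`) as well as the query budget are assumptions chosen purely for illustration. The sketch compares querying random inputs with always querying the input the current substitute is least certain about, which in one dimension amounts to bisection.

```python
import random


def make_oracle(threshold):
    """Hypothetical black-box victim: returns label 1 if x >= threshold, else 0.

    The returned dict counts how many queries have been issued."""
    stats = {"count": 0}

    def oracle(x):
        stats["count"] += 1
        return 1 if x >= threshold else 0

    return oracle, stats


def extract_random(oracle, budget, rng):
    """Baseline: spend the budget on random inputs from [0, 1)."""
    lo, hi = 0.0, 1.0  # interval that must still contain the threshold
    for _ in range(budget):
        x = rng.random()
        if oracle(x) == 1:
            hi = min(hi, x)
        else:
            lo = max(lo, x)
    return (lo + hi) / 2  # substitute model: estimated threshold


def extract_active(oracle, budget):
    """Active learning: query the most uncertain input, i.e. the midpoint
    of the interval that must still contain the threshold."""
    lo, hi = 0.0, 1.0
    for _ in range(budget):
        x = (lo + hi) / 2
        if oracle(x) == 1:
            hi = x
        else:
            lo = x
    return (lo + hi) / 2


if __name__ == "__main__":
    secret, budget = 0.3731, 20  # assumed victim parameter and query budget
    oracle_r, _ = make_oracle(secret)
    oracle_a, _ = make_oracle(secret)
    est_r = extract_random(oracle_r, budget, random.Random(0))
    est_a = extract_active(oracle_a, budget)
    print("random queries, error:", abs(est_r - secret))
    print("active queries, error:", abs(est_a - secret))
```

With the same budget, the active strategy halves the uncertain interval with every query, whereas random queries only shrink it when they happen to fall close to the decision boundary. Real extraction attacks face high-dimensional inputs and unknown architectures, so this should be read only as an intuition for why the choice of queries matters.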