| | |
| --- | --- |
| Semester | Winter 2021 |
| Course type | Block Seminar |
| Lecturer | TT.-Prof. Dr. Wressnegger |
| Audience | Computer Science Master & Bachelor |
| Credits | 4 ECTS |
| Room | 148, Building 50.34 and online |
| Language | English or German |
| Link | https://ilias.studium.kit.edu/goto_produktiv_crs_1600111.html |
| Registration | Please register for the course in ILIAS |
Due to the ongoing COVID-19 pandemic, this course will start off remotely, meaning the kick-off meeting will happen online. The final colloquium, however, will hopefully be an in-person meeting again (this time we might indeed have a chance).
To receive all the necessary information, please subscribe to the mailing list here.
This seminar is concerned with explainable machine learning in computer security. Learning-based systems are often difficult to interpret, and their decisions are opaque to practitioners. This lack of transparency is a considerable problem in computer security, as black-box learning systems are hard to audit and protect from attacks.
The module introduces students to the emerging field of explainable machine learning and teaches them how to work with results from recent research. To this end, the students will read up on a sub-field, prepare a seminar report, and present their work to their colleagues at the end of the term.
The topics cover different aspects of the explainability of machine learning methods, with a particular focus on applications in computer security.
| Date | Step |
| --- | --- |
| Tue, 19 Oct, 14:00–15:30 | Primer on academic writing, assignment of topics |
| Thu, 28 Oct | Arrange an appointment with the assistant |
| Mon, 1 Nov – Fri, 5 Nov | Individual meetings with the assistant |
| Wed, 1 Dec | Submit final paper |
| Wed, 22 Dec | Submit reviews for fellow students |
| Fri, 7 Jan | End of discussion phase |
| Fri, 21 Jan | Submit camera-ready version of your paper |
| Thu, 10 Feb | Presentation at the final colloquium |
News about the seminar, potential updates to the schedule, and additional material are distributed using a separate mailing list. Moreover, the list enables students to discuss topics of the seminar.
You can subscribe here.
Every student may choose one of the following topics. For each of these, we additionally provide a recent top-tier publication that you should use as a starting point for your own research. For the seminar and your final report, you should not merely summarize that paper, but try to go beyond and arrive at your own conclusions.
Moreover, all of these papers come with open-source implementations. Play around with these and include the lessons learned in your report.
Every "why" question implicitly contains a contrast. Humans are not asking just "why?", we ask "why A rather than B?". The research field of counterfactual and contrastive explanation focuses on this alternative scenario B and especially how it can be generated.
Some authors claim that explanations for humans should be interactive. In an interactive dialog, the human points to the part they want to understand.
The seminar report should present approaches that use user studies to evaluate the quality of explanations. The limitations, problems, and results of such evaluations are important.
It turns out there is a vast amount of philosophical work and papers from the social sciences on explanation and on how humans provide and understand explanations. These lines of research already use a detailed taxonomy of explanations, causes of effects, and (human) behavior.
Explainable active learning is a novel paradigm that introduces XAI into an active learning (AL) setting. Its benefits include supporting trust calibration and enabling rich forms of teaching feedback; potential drawbacks are an anchoring effect on the model's judgment and additional cognitive workload.
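As a hedged sketch of what such a setting might look like (not the assigned paper's method), the loop below performs uncertainty sampling and shows the annotator a simple explanation for each queried sample, here the per-feature contributions of a logistic-regression model. All names and the explanation style are our own illustrative assumptions.

```python
# Toy "explainable active learning" loop: query the most uncertain sample and
# show the annotator a per-feature explanation alongside it.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))
y_true = (X[:, 0] - X[:, 2] > 0).astype(int)        # stands in for the annotator

labelled = list(range(20))                           # small initial label set
pool = [i for i in range(len(X)) if i not in labelled]

for round_ in range(5):
    clf = LogisticRegression().fit(X[labelled], y_true[labelled])
    proba = clf.predict_proba(X[pool])[:, 1]
    query = pool[int(np.argmin(np.abs(proba - 0.5)))]   # most uncertain sample
    contributions = clf.coef_[0] * X[query]             # shown to the annotator
    print(f"round {round_}: query {query}, explanation {contributions.round(2)}")
    labelled.append(query)                               # annotator provides the label
    pool.remove(query)
```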
Most ML explanation methods revolve around the importance of individual features or pixels, which suffers from several drawbacks. Therefore, multiple works focus on high-level, concept-based explanations for humans, which should be studied as part of this seminar topic.
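For contrast, the sketch below shows the kind of per-feature importance explanation that concept-based methods argue against: an occlusion-style score obtained by replacing one feature at a time with its training mean. This is a simplified stand-in for pixel-level attribution; the data and names are assumptions for illustration.

```python
# Per-feature importance of a single prediction via feature occlusion:
# how much does the predicted probability drop when a feature is "removed"?
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = (X[:, 0] - 2 * X[:, 3] > 0).astype(int)
clf = RandomForestClassifier(random_state=0).fit(X, y)

def occlusion_importance(x, baseline):
    """Importance score for each feature of a single sample x."""
    cls = clf.predict(x[None])[0]
    p_full = clf.predict_proba(x[None])[0, cls]
    scores = []
    for i in range(len(x)):
        x_occ = x.copy()
        x_occ[i] = baseline[i]                       # "remove" feature i
        scores.append(p_full - clf.predict_proba(x_occ[None])[0, cls])
    return np.array(scores)

print(occlusion_importance(X[0], X.mean(axis=0)).round(3))
```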
This seminar report should discuss and relate the possibilities for comparing explanation methods with each other.
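One simple, illustrative criterion for such a comparison is the agreement of the top-k most important features that two methods assign to the same sample. The sketch below is our own minimal example, not a metric from a specific paper; faithfulness tests such as deletion curves are further options worth discussing.

```python
# Compare two explanations of the same sample by their top-k feature agreement.
import numpy as np

def topk_agreement(scores_a, scores_b, k=3):
    """Fraction of the k most important features on which two methods agree."""
    top_a = set(np.argsort(-np.abs(scores_a))[:k])
    top_b = set(np.argsort(-np.abs(scores_b))[:k])
    return len(top_a & top_b) / k

expl_method_1 = np.array([0.7, 0.1, -0.4, 0.0, 0.2])   # e.g. gradient-based scores
expl_method_2 = np.array([0.6, -0.3, 0.1, 0.0, 0.3])   # e.g. occlusion-based scores
print(topk_agreement(expl_method_1, expl_method_2))
```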
Similar to adversarial examples (which attack the classifier), explanation methods can be fooled as well. This includes producing useless or wrong explanations, or producing a specific, targeted explanation.
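The toy sketch below illustrates the idea for a gradient-based saliency explanation of a small, smooth network: a perturbation of the input is optimised so that the saliency concentrates on a chosen feature while the input, and hence the prediction, barely changes. This is an assumed, simplified setup for illustration, not the construction from the assigned paper.

```python
# Toy attack on a gradient saliency explanation: shift the explanation towards
# feature 0 while keeping the input (and prediction) almost unchanged.
import torch

torch.manual_seed(0)
model = torch.nn.Sequential(
    torch.nn.Linear(4, 8), torch.nn.Softplus(), torch.nn.Linear(8, 1)
)

def saliency(x):
    """Input gradient of the model output, a simple explanation."""
    x = x.clone().detach().requires_grad_(True)
    (grad,) = torch.autograd.grad(model(x).sum(), x)
    return grad

x = torch.randn(4)
x_adv = x.clone().detach().requires_grad_(True)
opt = torch.optim.Adam([x_adv], lr=0.05)
target = torch.tensor([1.0, 0.0, 0.0, 0.0])          # desired (fake) saliency pattern

for _ in range(300):
    opt.zero_grad()
    (grad,) = torch.autograd.grad(model(x_adv).sum(), x_adv, create_graph=True)
    expl = grad.abs() / grad.abs().sum()             # normalised saliency
    # push the explanation towards the target, penalise changing the input
    loss = ((expl - target) ** 2).sum() + 10.0 * ((x_adv - x) ** 2).sum()
    loss.backward()
    opt.step()

print("prediction shift:", (model(x_adv) - model(x)).item())
print("saliency before: ", saliency(x).abs())
print("saliency after:  ", saliency(x_adv.detach()).abs())
```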