Semester | Winter 2022 |
Course type | Block Seminar |
Lecturer | TT.-Prof. Dr. Wressnegger |
Audience | Informatik Master & Bachelor |
Credits | 4 ECTS |
Room | 148, Building 50.34 |
Language | English or German |
Link | https://campus.kit.edu/campus/all/event.asp?gguid=0xB5C7C25A3A7C4464A36349B86FFDDA7B |
Registration | https://ilias.studium.kit.edu/goto.php?target=crs_1922847&client_id=produktiv |
This seminar is concerned with explainable machine learning in computer security. Learning-based systems are often difficult to interpret, and their decisions are opaque to practitioners. This lack of transparency is a considerable problem in computer security, as black-box learning systems are hard to audit and to protect from attacks.
The module introduces students to the emerging field of explainable machine learning and teaches them to work with results from recent research. To this end, the students will read up on a sub-field, prepare a seminar report, and present their work to their colleagues at the end of the term.
Topics cover different aspects of the explainability of machine learning methods, with a particular focus on applications in computer security.
Date | Step |
Tue, 25. Oct, 11:30–13:00 | Primer on academic writing, assignment of topics |
Thu, 3. Nov | Arrange appointment with assistant |
Mon, 7. Nov - Fri, 11. Nov | 1st individual meeting (First overview, ToC) |
Mon, 5. Dec - Fri, 9. Dec | 2nd individual meeting (Feedback on first draft of the report) |
Thu, 22. Dec | Submit final paper |
Mon, 9. Jan | Submit review for fellow students |
Thu, 12. Jan | End of discussion phase |
Fri, 13. Jan | Notification about paper acceptance/rejection |
Fri, 27. Jan | Submit camera-ready version of your paper |
Fri, 17. Feb | Presentation at final colloquium |
News about the seminar, potential updates to the schedule, and additional material are distributed using a separate mailing list. Moreover, the list enables students to discuss topics of the seminar.
You can subscribe here.
Every student may choose one of the following topics. For each of these, we additionally provide two recent top-tier publications that you should use as a starting point for your own research. For the seminar and your final report, you should not merely summarize these papers but try to go beyond them and arrive at your own conclusions.
Moreover, all of these papers come with open-source implementations. Play around with these and include the lessons learned in your report.
Propagation-based explanations are generated by backpropagating relevance values from a network's output to its input. To this end, a variety of propagation rules have been proposed. The topic should also consider the individual properties satisfied by the different rules.
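To give a flavor of how such rules work, here is a minimal sketch of the LRP-ε rule on a tiny fully connected ReLU network with random placeholder weights; the model and all numbers are purely illustrative and not taken from the referenced papers.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-layer ReLU network with random placeholder weights.
W1, b1 = rng.normal(size=(20, 10)), np.zeros(10)
W2, b2 = rng.normal(size=(10, 3)), np.zeros(3)

def forward(x):
    a1 = np.maximum(x @ W1 + b1, 0.0)   # hidden ReLU activations
    return a1, a1 @ W2 + b2             # activations and logits

def lrp_eps(a_in, W, b, R_out, eps=1e-6):
    """Redistribute the relevance R_out of a layer's outputs to its inputs (LRP-epsilon rule)."""
    z = a_in @ W + b                            # pre-activations
    z = z + eps * np.where(z >= 0, 1.0, -1.0)   # stabilizer avoids division by zero
    s = R_out / z                               # relevance per unit of pre-activation
    return a_in * (s @ W.T)                     # relevance of each input neuron

x = rng.normal(size=20)
a1, logits = forward(x)

# Start from the logit of the predicted class and propagate relevance back to the input.
R_logits = np.zeros_like(logits)
R_logits[logits.argmax()] = logits.max()
R_hidden = lrp_eps(a1, W2, b2, R_logits)
R_input = lrp_eps(x, W1, b1, R_hidden)
print("input relevance:", np.round(R_input, 3))
```

Different propagation rules mostly differ in how these per-neuron contributions are stabilized and weighted, which is exactly the kind of property the report should compare.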
For human beings, a good explanation is always contrastive. People do not only ask ‘Why A?’; they ask ‘Why A rather than B?’. Towards human-centered AI, the research field of explainable AI has moved towards counterfactual and contrastive explanations, which focus on this alternative scenario B and, in particular, on how it can be generated.
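As an illustration of the generation step, the following is a minimal sketch of a Wachter-style counterfactual search (perturb the input until the prediction flips while staying close to the original) on a hypothetical linear classifier; the weights and loss settings are placeholder assumptions, not the setup of the referenced papers.

```python
import numpy as np

# Hypothetical linear classifier p(y=1|x) = sigmoid(w @ x + b); weights are placeholders.
w, b = np.array([1.5, -2.0, 0.5]), -0.2

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def counterfactual(x, target=1.0, lam=0.1, lr=0.05, steps=2000):
    """Minimize (f(x') - target)^2 + lam * ||x' - x||^2 by gradient descent."""
    x_cf = x.copy()
    for _ in range(steps):
        p = sigmoid(x_cf @ w + b)
        # Closed-form gradient of the objective for the linear model.
        grad = 2 * (p - target) * p * (1 - p) * w + 2 * lam * (x_cf - x)
        x_cf = x_cf - lr * grad
    return x_cf

x = np.array([-1.0, 1.0, 0.0])   # original input, predicted class 0
x_cf = counterfactual(x)
print("original prediction:", round(float(sigmoid(x @ w + b)), 3))
print("counterfactual:", np.round(x_cf, 3), "prediction:", round(float(sigmoid(x_cf @ w + b)), 3))
```

The counterfactual differs from the original input only along directions that actually change the prediction, and presenting this difference to a user is one way of answering the ‘Why A rather than B?’ question.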
Concept-based explanation methods try to explain a model's decision in terms of human-understandable concepts instead of feature-importance values. This topic covers the identification of concepts and the visualization of the associated explanations.
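For intuition, here is a minimal TCAV-style sketch, assuming we already have hidden-layer activations for concept and random examples (generated synthetically below as placeholders): the concept activation vector is the normal of a linear classifier separating the two sets, and conceptual sensitivity is the directional derivative of the class score along that vector.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Placeholder activations of some hidden layer: 50 examples showing a concept
# (e.g. "striped") and 50 random examples; in practice these come from a real model.
acts_concept = rng.normal(loc=1.0, size=(50, 8))
acts_random = rng.normal(loc=0.0, size=(50, 8))

X = np.vstack([acts_concept, acts_random])
y = np.array([1] * 50 + [0] * 50)

# The concept activation vector (CAV) is the normal of a linear classifier
# separating concept activations from random ones.
clf = LogisticRegression().fit(X, y)
cav = clf.coef_[0] / np.linalg.norm(clf.coef_[0])

# Conceptual sensitivity: directional derivative of the class score along the CAV.
# The gradient of the class score w.r.t. the activations is a placeholder here.
grad_class_score = rng.normal(size=8)
sensitivity = grad_class_score @ cav
print("concept sensitivity:", round(float(sensitivity), 3))
```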
Explanations between humans are naturally interactive. In an interactive dialog, a human points to the part they want to understand better, and this process guides the explanation method.
A variety of explanation methods has been proposed in recent years. But which one is the best method for a given task? Clarifying this question is part of this topic.
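One widely used quantitative proxy is a deletion (feature-removal) test: remove features in decreasing order of attributed relevance and track how quickly the prediction degrades compared to a random removal order. Below is a minimal sketch on a hypothetical linear model; the model, input, and baseline value are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear scoring model and an input to explain (placeholders).
w = rng.normal(size=16)
x = rng.normal(size=16)

def score(x):
    return float(x @ w)

relevance = x * w   # gradient x input attribution, exact for a linear model

def deletion_curve(x, order, baseline=0.0):
    """Score after successively replacing features with a baseline value, in the given order."""
    x_cur = x.copy()
    scores = [score(x_cur)]
    for i in order:
        x_cur[i] = baseline
        scores.append(score(x_cur))
    return np.array(scores)

explained = deletion_curve(x, np.argsort(-relevance))    # most relevant features first
random_ord = deletion_curve(x, rng.permutation(len(x)))  # random ordering as baseline

# A faithful explanation should drive the score down faster than the random ordering
# (compare the areas under both curves, approximated here by the mean score).
print("mean score, explanation order:", round(float(explained.mean()), 3))
print("mean score, random order:     ", round(float(random_ord.mean()), 3))
```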
This topic specifically considers user studies as a way to evaluate the quality of explanations. The report should discuss the limitations, problems, and results of such evaluations.
State-of-the-art explanation methods often leak sensitive information about a model's parameters as well as its training data. The additional information obtained through explanations can be abused by adversaries. How can we prevent information leakage through explanations while preserving explanation quality?
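As a rough illustration of this attack surface (not any specific attack from the literature), the sketch below overfits a small linear model on synthetic data and compares the norm of a gradient-based explanation between training members and non-members; all data, the model, and the training loop are placeholder assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic, high-dimensional data with random labels so that the model overfits.
d, n = 50, 30
X_train, y_train = rng.normal(size=(n, d)), rng.integers(0, 2, n)
X_test = rng.normal(size=(n, d))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Overfit a logistic regression with plain gradient descent (placeholder model).
w = np.zeros(d)
for _ in range(2000):
    p = sigmoid(X_train @ w)
    w -= 0.5 * X_train.T @ (p - y_train) / n

def explanation_norm(x):
    """Norm of a gradient-based explanation of the positive-class score, d sigmoid(w @ x) / dx."""
    p = sigmoid(x @ w)
    return np.linalg.norm(p * (1 - p) * w)

members = np.array([explanation_norm(x) for x in X_train])
non_members = np.array([explanation_norm(x) for x in X_test])

# A systematic gap between the two distributions is the kind of signal an
# adversary could exploit for membership inference.
print("mean explanation norm, members:    ", round(float(members.mean()), 4))
print("mean explanation norm, non-members:", round(float(non_members.mean()), 4))
```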
Similar to adversarial examples (which attack the prediction), adversaries can fool explanation methods. This includes producing useless or wrong explanations as well as a specific, targeted explanation.
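The sketch below gives a crude flavor of such an attack, using a random placeholder ReLU network and gradient × input explanations: a random search looks for a perturbation that leaves the predicted class unchanged but shifts the explanation as far as possible. Attacks in the literature are optimization-based and far more subtle; everything here is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy ReLU network with placeholder weights; explanation = gradient x input.
W1, W2 = rng.normal(size=(12, 8)), rng.normal(size=(8, 2))

def predict_and_explain(x):
    z = x @ W1
    a = np.maximum(z, 0.0)
    logits = a @ W2
    c = logits.argmax()
    grad = W1 @ ((z > 0).astype(float) * W2[:, c])   # d logit_c / d x
    return c, grad * x                               # gradient x input attribution

x = rng.normal(size=12)
cls, expl = predict_and_explain(x)

# Random search for a perturbation that keeps the prediction but changes
# the explanation as much as possible (a crude "fooling" attack).
best, best_diff = x, 0.0
for _ in range(5000):
    cand = x + 0.3 * rng.normal(size=12)
    c, e = predict_and_explain(cand)
    diff = np.linalg.norm(e - expl)
    if c == cls and diff > best_diff:
        best, best_diff = cand, diff

print("prediction unchanged:", predict_and_explain(best)[0] == cls)
print("explanation shift (L2):", round(float(best_diff), 3))
```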