Recent training-time defenses against neural backdoors isolate a benign subset from poisoned training data in order to learn a backdoor-free model from it. In this paper, we formulate this defense strategy as a coreset selection problem, giving rise to so-called "Anti-Backdoor Coreset Selection" (ABCS). Since poisonous samples a) have lower prediction uncertainty and b) are less frequent than benign samples, coreset selection naturally focuses more on samples associated with the benign functionality than on those associated with the backdoor functionality. We use the Cumulative Entropy as the selection criterion to further facilitate this effect. The metric tracks the learning dynamics of training samples and allows us to select benign samples with high informativeness for the coreset. Additionally, we unlearn the chosen samples in each epoch to improve the separability between benign and poisonous samples. Together, this yields an exceptionally effective training-time defense that constructs a benign coreset to train a backdoor-free model. Unlike prior defenses that compromise natural accuracy and fail against certain attacks, our method mitigates backdooring attacks consistently with a negligible impact on natural performance.
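As a rough illustration of the selection idea, the following minimal sketch accumulates a per-sample uncertainty score over epochs and keeps the highest-scoring samples as the coreset. It is not the released implementation: it assumes that the cumulative score is the plain sum of the Shannon entropy of the softmax outputs over epochs, it omits the per-epoch unlearning step, and the function names (`prediction_entropy`, `cumulative_entropy`, `select_coreset`) as well as the toy probabilities are hypothetical.

```python
import numpy as np


def prediction_entropy(probs, eps=1e-12):
    """Shannon entropy per row of an (n_samples, n_classes) probability matrix."""
    probs = np.clip(probs, eps, 1.0)  # avoid log(0)
    return -np.sum(probs * np.log(probs), axis=1)


def cumulative_entropy(probs_per_epoch):
    """Sum the per-sample entropy over all epochs (assumed definition).

    probs_per_epoch has shape (n_epochs, n_samples, n_classes).
    """
    return sum(prediction_entropy(p) for p in probs_per_epoch)


def select_coreset(scores, budget):
    """Indices of the `budget` samples with the highest score, in ascending order."""
    order = np.argsort(-scores, kind="stable")
    return np.sort(order[:budget])


# Toy data: 3 epochs, 4 samples, 2 classes.
# Samples 0 and 1 are learned slowly (benign-like); samples 2 and 3 are
# predicted with high confidence from the start (poison-like).
probs_per_epoch = np.array([
    [[0.50, 0.50], [0.60, 0.40], [0.90, 0.10], [0.95, 0.05]],
    [[0.60, 0.40], [0.70, 0.30], [0.99, 0.01], [0.99, 0.01]],
    [[0.90, 0.10], [0.90, 0.10], [0.90, 0.10], [0.90, 0.10]],
])

scores = cumulative_entropy(probs_per_epoch)
coreset = select_coreset(scores, budget=2)
print("cumulative scores:", np.round(scores, 3))
print("coreset indices:", coreset.tolist())
```

In this toy setting, all four samples have the same entropy in the final epoch, so a score based on that epoch alone could not tell them apart, whereas the accumulated score still ranks the slowly learned samples highest.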
For further details please consult the conference publication.
The figure below (top) shows the advantage of using cumulative uncertainty over intermediate (epoch-wise) uncertainty for selecting a training data coreset. The figure below (bottom) demonstrates the superior performance of the cumulative entropy criterion.
For the sake of reproducibility and to foster future research, we make our implementation of ABCS for selecting backdoor-free training coresets publicly available at:
https://github.com/intellisec/abcs
A detailed description of our work will be presented at ICML 2026. If you would like to cite our work, please use the reference provided below:
@InProceedings{Zhao2026AntiBackdoor,
author = {Qi Zhao and Christian Wressnegger},
booktitle = {Proc. of the International Conference on Machine Learning (ICML)},
title = {Anti-Backdoor Coreset Selection via Cumulative Entropy},
year = {2026},
month = jul,
}
A preprint of the paper is available here.