Backdoor Learning Resources

This GitHub repository summarizes a list of backdoor learning resources. For more details and the categorization criteria, please refer to our survey.

We will do our best to maintain this GitHub repository, with updates on a roughly monthly basis.

Why Backdoor Learning?

Backdoor learning is an emerging research area that studies the security of the training process of machine learning algorithms. It is critical for the safe adoption of third-party training resources or models in practice.

Note: a 'backdoor' is also commonly called a 'neural trojan' or 'trojan'.
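For illustration, the minimal NumPy sketch below shows the canonical poisoning-based backdoor in the style of BadNets: a small trigger patch is stamped onto a fraction of the training images, and those images are relabeled to an attacker-chosen target class, so a model trained on the poisoned set behaves normally on clean inputs but predicts the target class whenever the trigger is present. This is a hypothetical, self-contained example (the function and parameter names, such as poison_dataset, are illustrative only), not code from any paper listed below.

import numpy as np

def poison_dataset(images, labels, target_label=0, poison_rate=0.05, patch_size=3, seed=0):
    """BadNets-style data poisoning (illustrative sketch): stamp a small white
    patch in the bottom-right corner of a random subset of training images and
    relabel those images to the attacker's target class."""
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    n_poison = int(poison_rate * len(images))
    idx = rng.choice(len(images), size=n_poison, replace=False)
    images[idx, -patch_size:, -patch_size:] = images.max()  # the trigger pattern
    labels[idx] = target_label                              # target-class relabeling
    return images, labels, idx

# Toy usage: 100 random 28x28 grayscale "images" with 10 classes.
x = np.random.rand(100, 28, 28).astype(np.float32)
y = np.random.randint(0, 10, size=100)
x_poisoned, y_poisoned, poisoned_idx = poison_dataset(x, y)
print(f"Poisoned {len(poisoned_idx)} of {len(x)} samples, all relabeled to class {y_poisoned[poisoned_idx][0]}")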

News

  • 2023/07/24: I added ten ICLR'23 papers. All papers from this conference should now be included.
  • 2023/07/23: I added seven NeurIPS'22 papers and four AAAI'23 papers. All papers from these conferences should now be included.
  • 2023/01/25: I am sorry that I have recently suspended reading related papers and updating this repo due to personal issues, such as illness and writing my Ph.D. dissertation. I will resume updates after June 2023.
  • 2022/12/05: I slightly changed the repo format: within the same year, conference papers are now placed before journal papers, since journal papers are usually submitted much earlier and therefore lag behind.
  • 2022/12/05: I added three ECCV'22 papers. All papers from this conference should now be included.

Reference

If our repo or survey is useful for your research, please cite our paper as follows:

@article{li2022backdoor,
  title={Backdoor learning: A survey},
  author={Li, Yiming and Jiang, Yong and Li, Zhifeng and Xia, Shu-Tao},
  journal={IEEE Transactions on Neural Networks and Learning Systems},
  year={2022}
}

Contributing

We Need You!

Please help contribute to this list by contacting me or submitting a pull request.

Markdown format:

- Paper Name. 
  [[pdf]](link) 
  [[code]](link)
  - Author 1, Author 2, **and** Author 3. *Conference/Journal*, Year.

Note: Within the same year, please place conference papers before journal papers, since journal papers are usually submitted much earlier and therefore lag behind (i.e., Conferences --> Journals --> Preprints).

Table of Contents

Survey

  • Backdoor Learning: A Survey. [pdf]

    • Yiming Li, Yong Jiang, Zhifeng Li, and Shu-Tao Xia. IEEE Transactions on Neural Networks and Learning Systems, 2022.
  • Backdoor Attacks and Countermeasures on Deep Learning: A Comprehensive Review. [pdf]

    • Yansong Gao, Bao Gia Doan, Zhi Zhang, Siqi Ma, Anmin Fu, Surya Nepal, and Hyoungshick Kim. arXiv, 2020.
  • Data Security for Machine Learning: Data Poisoning, Backdoor Attacks, and Defenses. [pdf]

    • Micah Goldblum, Dimitris Tsipras, Chulin Xie, Xinyun Chen, Avi Schwarzschild, Dawn Song, Aleksander Madry, Bo Li, and Tom Goldstein. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022.
  • A Comprehensive Survey on Poisoning Attacks and Countermeasures in Machine Learning. [link]

    • Zhiyi Tian, Lei Cui, Jie Liang, and Shui Yu. ACM Computing Surveys, 2022.
  • Backdoor Attacks and Defenses in Federated Learning: State-of-the-art, Taxonomy, and Future Directions. [link]

    • Xueluan Gong, Yanjiao Chen, Qian Wang, and Weihan Kong. IEEE Wireless Communications, 2022.
  • Backdoor Attacks on Image Classification Models in Deep Neural Networks. [link]

    • Quanxin Zhang, Wencong Ma, Yajie Wang, Yaoyuan Zhang, Zhiwei Shi, and Yuanzhang Li. Chinese Journal of Electronics, 2022.
  • Defense against Neural Trojan Attacks: A Survey. [link]

    • Sara Kaviani and Insoo Sohn. Neurocomputing, 2021.
  • A Survey on Neural Trojans. [pdf]

    • Yuntao Liu, Ankit Mondal, Abhishek Chakraborty, Michael Zuzak, Nina Jacobsen, Daniel Xing, and Ankur Srivastava. ISQED, 2020.
  • Backdoor Attacks against Voice Recognition Systems: A Survey. [pdf]

    • Baochen Yan, Jiahe Lan, and Zheng Yan. arXiv, 2023.
  • A Survey of Neural Trojan Attacks and Defenses in Deep Learning. [pdf]

    • Jie Wang, Ghulam Mubashar Hassan, and Naveed Akhtar. arXiv, 2022.
  • Threats to Pre-trained Language Models: Survey and Taxonomy. [pdf]

    • Shangwei Guo, Chunlong Xie, Jiwei Li, Lingjuan Lyu, and Tianwei Zhang. arXiv, 2022.
  • An Overview of Backdoor Attacks Against Deep Neural Networks and Possible Defences. [pdf]

    • Wei Guo, Benedetta Tondi, and Mauro Barni. arXiv, 2021.
  • Deep Learning Backdoors. [pdf]

    • Shaofeng Li, Shiqing Ma, Minhui Xue, and Benjamin Zi Hao Zhao. arXiv, 2020.

Toolbox

Dissertation and Thesis

  • Poisoning-based Backdoor Attacks in Computer Vision. [pdf]

    • Yiming Li. Ph.D. Dissertation at Tsinghua University, 2023.
  • Defense of Backdoor Attacks against Deep Neural Network Classifiers. [pdf]

    • Zhen Xiang. Ph.D. Dissertation at The Pennsylvania State University, 2022.
  • Towards Adversarial and Backdoor Robustness of Deep Learning. [link]

    • Yifan Guo. Ph.D. Dissertation at Case Western Reserve University, 2022.
  • Toward Robust and Communication Efficient Distributed Machine Learning. [pdf]

    • Hongyi Wang. Ph.D. Dissertation at University of Wisconsin–Madison, 2021.
  • Towards Robust Image Classification with Deep Learning and Real-Time DNN Inference on Mobile. [pdf]

    • Pu Zhao. Ph.D. Dissertation at Northeastern University, 2021.
  • Countermeasures Against Backdoor, Data Poisoning, and Adversarial Attacks. [pdf]

    • Henry Daniel. Ph.D. Dissertation at University of Texas at San Antonio, 2021.
  • Understanding and Mitigating the Impact of Backdooring Attacks on Deep Neural Networks. [pdf]

    • Kang Liu. Ph.D. Dissertation at New York University, 2021.
  • Un-fair trojan: Targeted Backdoor Attacks against Model Fairness. [pdf]

    • Nicholas Furth. Master Thesis at New Jersey Institute of Technology, 2022.
  • Check Your Other Door: Creating Backdoor Attacks in the Frequency Domain. [pdf]

    • Hasan Abed Al Kader Hammoud. Master Thesis at King Abdullah University of Science and Technology, 2022.
  • Backdoor Attacks in Neural Networks. [link]

    • Stefanos Koffas. Master Thesis at Delft University of Technology, 2021.
  • Backdoor Defenses. [pdf]

    • Andrea Milakovic. Master Thesis at Technische Universität Wien, 2021.
  • Geometric Properties of Backdoored Neural Networks. [pdf]

    • Dominic Carrano. Master Thesis at University of California at Berkeley, 2021.
  • Detecting Backdoored Neural Networks with Structured Adversarial Attacks. [pdf]

    • Charles Yang. Master Thesis at University of California at Berkeley, 2021.
  • Backdoor Attacks Against Deep Learning Systems in the Physical World. [pdf]

    • Emily Willson. Master Thesis at University of Chicago, 2020.

Image and Video Classification

Poisoning-based Attack

2023

  • Revisiting the Assumption of Latent Separability for Backdoor Defenses. [pdf] [code]

    • Xiangyu Qi, Tinghao Xie, Yiming Li, Saeed Mahloujifar, and Prateek Mittal. ICLR, 2023.
  • Few-shot Backdoor Attacks via Neural Tangent Kernels. [pdf] [code]

    • Jonathan Hayase and Sewoong Oh. ICLR, 2023.
  • Color Backdoor: A Robust Poisoning Attack in Color Space. [pdf]

    • Wenbo Jiang, Hongwei Li, Guowen Xu, and Tianwei Zhang. CVPR, 2023.

2022

  • Untargeted Backdoor Watermark: Towards Harmless and Stealthy Dataset Copyright Protection. [pdf] [code]

    • Yiming Li, Yang Bai, Yong Jiang, Yong Yang, Shu-Tao Xia, and Bo Li. NeurIPS, 2022.
  • DEFEAT: Deep Hidden Feature Backdoor Attacks by Imperceptible Perturbation and Latent Representation Constraints. [pdf]

    • Zhendong Zhao, Xiaojun Chen, Yuexin Xuan, Ye Dong, Dakui Wang, and Kaitai Liang. CVPR, 2022.
  • Marksman Backdoor: Backdoor Attacks with Arbitrary Target Class. [pdf]

    • Khoa D Doan, Yingjie Lao, and Ping Li. NeurIPS, 2022.
  • An Invisible Black-box Backdoor Attack through Frequency Domain. [pdf] [code]

    • Tong Wang, Yuan Yao, Feng Xu, Shengwei An, Hanghang Tong, and Ting Wang. ECCV, 2022.
  • BppAttack: Stealthy and Efficient Trojan Attacks against Deep Neural Networks via Image Quantization and Contrastive Adversarial Learning. [pdf] [code]

    • Zhenting Wang, Juan Zhai, and Shiqing Ma. CVPR, 2022.
  • Dynamic Backdoor Attacks Against Machine Learning Models. [pdf]

    • Ahmed Salem, Rui Wen, Michael Backes, Shiqing Ma, and Yang Zhang. EuroS&P, 2022.
  • Imperceptible Backdoor Attack: From Input Space to Feature Representation. [pdf] [code]

    • Nan Zhong, Zhenxing Qian, and Xinpeng Zhang. IJCAI, 2022.
  • Stealthy Backdoor Attack with Adversarial Training. [link]

    • Le Feng, Sheng Li, Zhenxing Qian, and Xinpeng Zhang. ICASSP, 2022.
  • Invisible and Efficient Backdoor Attacks for Compressed Deep Neural Networks. [link]

    • Huy Phan, Yi Xie, Jian Liu, Yingying Chen, and Bo Yuan. ICASSP, 2022.
  • Dynamic Backdoors with Global Average Pooling. [pdf]

    • Stefanos Koffas, Stjepan Picek, and Mauro Conti. AICAS, 2022.
  • Poison Ink: Robust and Invisible Backdoor Attack. [pdf]

    • Jie Zhang, Dongdong Chen, Qidong Huang, Jing Liao, Weiming Zhang, Huamin Feng, Gang Hua, and Nenghai Yu. IEEE Transactions on Image Processing, 2022.
  • Enhancing Backdoor Attacks with Multi-Level MMD Regularization. [link]

    • Pengfei Xia, Hongjing Niu, Ziqiang Li, and Bin Li. IEEE Transactions on Dependable and Secure Computing, 2022.
  • PTB: Robust Physical Backdoor Attacks against Deep Neural Networks in Real World. [link]

    • Mingfu Xue, Can He, Yinghao Wu, Shichang Sun, Yushu Zhang, Jian Wang, and Weiqiang Liu. Computers & Security, 2022.
  • IBAttack: Being Cautious about Data Labels. [link]

    • Akshay Agarwal, Richa Singh, Mayank Vatsa, and Nalini Ratha. IEEE Transactions on Artificial Intelligence, 2022.
  • BlindNet Backdoor: Attack on Deep Neural Network using Blind Watermark. [link]

    • Hyun Kwon and Yongchul Kim. Multimedia Tools and Applications, 2022.
  • Natural Backdoor Attacks on Deep Neural Networks via Raindrops. [link]

    • Feng Zhao, Li Zhou, Qi Zhong, Rushi Lan, and Leo Yu Zhang. Security and Communication Networks, 2022.
  • Dispersed Pixel Perturbation-based Imperceptible Backdoor Trigger for Image Classifier Models. [pdf]

    • Yulong Wang, Minghui Zhao, Shenghong Li, Xin Yuan, and Wei Ni. arXiv, 2022.
  • FRIB: Low-poisoning Rate Invisible Backdoor Attack based on Feature Repair. [pdf]

    • Hui Xia, Xiugui Yang, Xiangyun Qian, and Rui Zhang. arXiv, 2022.
  • Augmentation Backdoors. [pdf] [code]

    • Joseph Rance, Yiren Zhao, Ilia Shumailov, and Robert Mullins. arXiv, 2022.
  • Just Rotate it: Deploying Backdoor Attacks via Rotation Transformation. [pdf]

    • Tong Wu, Tianhao Wang, Vikash Sehwag, Saeed Mahloujifar, and Prateek Mittal. arXiv, 2022.
  • Natural Backdoor Datasets. [pdf]

    • Emily Wenger, Roma Bhattacharjee, Arjun Nitin Bhagoji, Josephine Passananti, Emilio Andere, Haitao Zheng, and Ben Y. Zhao. arXiv, 2022.
  • Enhancing Clean Label Backdoor Attack with Two-phase Specific Triggers. [pdf]

    • Nan Luo, Yuanzhang Li, Yajie Wang, Shangbo Wu, Yu-an Tan, and Quanxin Zhang. arXiv, 2022.
  • Circumventing Backdoor Defenses That Are Based on Latent Separability. [pdf] [code]

    • Xiangyu Qi, Tinghao Xie, Saeed Mahloujifar, and Prateek Mittal. arXiv, 2022.
  • Narcissus: A Practical Clean-Label Backdoor Attack with Limited Information. [pdf]

    • Yi Zeng, Minzhou Pan, Hoang Anh Just, Lingjuan Lyu, Meikang Qiu, and Ruoxi Jia. arXiv, 2022.
  • CASSOCK: Viable Backdoor Attacks against DNN in The Wall of Source-Specific Backdoor Defences. [pdf]

    • Shang Wang, Yansong Gao, Anmin Fu, Zhi Zhang, Yuqing Zhang, and Willy Susilo. arXiv, 2022.
  • Trojan Horse Training for Breaking Defenses against Backdoor Attacks in Deep Learning. [pdf]

    • Arezoo Rajabi, Bhaskar Ramasubramanian, and Radha Poovendran. arXiv, 2022.
  • Label-Smoothed Backdoor Attack. [pdf]

    • Minlong Peng, Zidi Xiong, Mingming Sun, and Ping Li. arXiv, 2022.
  • Imperceptible and Multi-channel Backdoor Attack against Deep Neural Networks. [pdf]

    • Mingfu Xue, Shifeng Ni, Yinghao Wu, Yushu Zhang, Jian Wang, and Weiqiang Liu. arXiv, 2022.
  • Compression-Resistant Backdoor Attack against Deep Neural Networks. [pdf]

    • Mingfu Xue, Xin Wang, Shichang Sun, Yushu Zhang, Jian Wang, and Weiqiang Liu. arXiv, 2022.

2021

  • Invisible Backdoor Attack with Sample-Specific Triggers. [pdf] [code]

    • Yuezun Li, Yiming Li, Baoyuan Wu, Longkang Li, Ran He, and Siwei Lyu. ICCV, 2021.
  • Manipulating SGD with Data Ordering Attacks. [pdf]

    • Ilia Shumailov, Zakhar Shumaylov, Dmitry Kazhdan, Yiren Zhao, Nicolas Papernot, Murat A. Erdogdu, and Ross Anderson. NeurIPS, 2021.
  • Backdoor Attack with Imperceptible Input and Latent Modification. [pdf]

    • Khoa Doan, Yingjie Lao, and Ping Li. NeurIPS, 2021.
  • LIRA: Learnable, Imperceptible and Robust Backdoor Attacks. [pdf]

    • Khoa Doan, Yingjie Lao, Weijie Zhao, and Ping Li. ICCV, 2021.
  • Blind Backdoors in Deep Learning Models. [pdf] [code]

    • Eugene Bagdasaryan, and Vitaly Shmatikov. USENIX Security, 2021.
  • Backdoor Attacks Against Deep Learning Systems in the Physical World. [pdf] [Master Thesis]

    • Emily Wenger, Josephine Passananti, Yuanshun Yao, Haitao Zheng, and Ben Y. Zhao. CVPR, 2021.
  • Deep Feature Space Trojan Attack of Neural Networks by Controlled Detoxification. [pdf] [code]

    • Siyuan Cheng, Yingqi Liu, Shiqing Ma, and Xiangyu Zhang. AAAI, 2021.
  • WaNet - Imperceptible Warping-based Backdoor Attack. [pdf] [code]

    • Tuan Anh Nguyen, and Anh Tuan Tran. ICLR, 2021.
  • AdvDoor: Adversarial Backdoor Attack of Deep Learning System. [pdf] [code]

    • Quan Zhang, Yifeng Ding, Yongqiang Tian, Jianmin Guo, Min Yuan, and Yu Jiang. ISSTA, 2021.
  • Invisible Poison: A Blackbox Clean Label Backdoor Attack to Deep Neural Networks. [pdf]

    • Rui Ning, Jiang Li, ChunSheng Xin, and Hongyi Wu. INFOCOM, 2021.
  • Backdoor Attack in the Physical World. [pdf] [extension]

    • Yiming Li, Tongqing Zhai, Yong Jiang, Zhifeng Li, and Shu-Tao Xia. ICLR Workshop, 2021.
  • Defense-Resistant Backdoor Attacks against Deep Neural Networks in Outsourced Cloud Environment. [link]

    • Xueluan Gong, Yanjiao Chen, Qian Wang, Huayang Huang, Lingshuo Meng, Chao Shen, and Qian Zhang. IEEE Journal on Selected Areas in Communications, 2021.
  • A Master Key Backdoor for Universal Impersonation Attack against DNN-based Face Verification. [link]

    • Wei Guo, Benedetta Tondi, and Mauro Barni. Pattern Recognition Letters, 2021.
  • Backdoors Hidden in Facial Features: A Novel Invisible Backdoor Attack against Face Recognition Systems. [link]

    • Mingfu Xue, Can He, Jian Wang, and Weiqiang Liu. Peer-to-Peer Networking and Applications, 2021.
  • Use Procedural Noise to Achieve Backdoor Attack. [link] [code]

    • Xuan Chen, Yuena Ma, and Shiwei Lu. IEEE Access, 2021.
  • A Multitarget Backdooring Attack on Deep Neural Networks with Random Location Trigger. [link]

    • Xiao Yu, Cong Liu, Mingwen Zheng, Yajie Wang, Xinrui Liu, Shuxiao Song, Yuexuan Ma, and Jun Zheng. International Journal of Intelligent Systems, 2021.
  • Simtrojan: Stealthy Backdoor Attack. [link]

    • Yankun Ren, Longfei Li, and Jun Zhou. ICIP, 2021.
  • A Statistical Difference Reduction Method for Escaping Backdoor Detection. [pdf]

    • Pengfei Xia, Hongjing Niu, Ziqiang Li, and Bin Li. arXiv, 2021.
  • Backdoor Attack through Frequency Domain. [pdf]

    • Tong Wang, Yuan Yao, Feng Xu, Shengwei An, and Ting Wang. arXiv, 2021.
  • Check Your Other Door! Establishing Backdoor Attacks in the Frequency Domain. [pdf]

    • Hasan Abed Al Kader Hammoud and Bernard Ghanem. arXiv, 2021.
  • Sleeper Agent: Scalable Hidden Trigger Backdoors for Neural Networks Trained from Scratch. [pdf] [code]

    • Hossein Souri, Micah Goldblum, Liam Fowl, Rama Chellappa, and Tom Goldstein. arXiv, 2021.
  • RABA: A Robust Avatar Backdoor Attack on Deep Neural Network. [pdf]

    • Ying He, Zhili Shen, Chang Xia, Jingyu Hua, Wei Tong, and Sheng Zhong. arXiv, 2021.
  • Robust Backdoor Attacks against Deep Neural Networks in Real Physical World. [pdf]

    • Mingfu Xue, Can He, Shichang Sun, Jian Wang, and Weiqiang Liu. arXiv, 2021.

2020

  • Composite Backdoor Attack for Deep Neural Network by Mixing Existing Benign Features. [pdf]

    • Junyu Lin, Lei Xu, Yingqi Liu, and Xiangyu Zhang. CCS, 2020.
  • Input-Aware Dynamic Backdoor Attack. [pdf] [code]

    • Anh Nguyen, and Anh Tran. NeurIPS, 2020.
  • Bypassing Backdoor Detection Algorithms in Deep Learning. [pdf]

    • Te Juin Lester Tan, and Reza Shokri. EuroS&P, 2020.
  • Backdoor Embedding in Convolutional Neural Network Models via Invisible Perturbation. [pdf]

    • Cong Liao, Haoti Zhong, Anna Squicciarini, Sencun Zhu, and David Miller. ACM CODASPY, 2020.
  • Clean-Label Backdoor Attacks on Video Recognition Models. [pdf] [code]

    • Shihao Zhao, Xingjun Ma, Xiang Zheng, James Bailey, Jingjing Chen, and Yu-Gang Jiang. CVPR, 2020.
  • Escaping Backdoor Attack Detection of Deep Learning. [link]

    • Yayuan Xiong, Fengyuan Xu, Sheng Zhong, and Qun Li. IFIP SEC, 2020.
  • Reflection Backdoor: A Natural Backdoor Attack on Deep Neural Networks. [pdf] [code]

    • Yunfei Liu, Xingjun Ma, James Bailey, and Feng Lu. ECCV, 2020.
  • Live Trojan Attacks on Deep Neural Networks. [pdf] [code]

    • Robby Costales, Chengzhi Mao, Raphael Norwitz, Bryan Kim, and Junfeng Yang. CVPR Workshop, 2020.
  • Backdooring and Poisoning Neural Networks with Image-Scaling Attacks. [pdf]

    • Erwin Quiring, and Konrad Rieck. IEEE S&P Workshop, 2020.
  • One-to-N & N-to-One: Two Advanced Backdoor Attacks against Deep Learning Models. [pdf]

    • Mingfu Xue, Can He, Jian Wang, and Weiqiang Liu. IEEE Transactions on Dependable and Secure Computing, 2020.
  • Invisible Backdoor Attacks on Deep Neural Networks via Steganography and Regularization. [pdf] [arXiv Version (2019)]

    • Shaofeng Li, Minhui Xue, Benjamin Zi Hao Zhao, Haojin Zhu, and Xinpeng Zhang. IEEE Transactions on Dependable and Secure Computing, 2020.
  • HaS-Nets: A Heal and Select Mechanism to Defend DNNs Against Backdoor Attacks for Data Collection Scenarios. [pdf]

    • Hassan Ali, Surya Nepal, Salil S. Kanhere, and Sanjay Jha. arXiv, 2020.
  • FaceHack: Triggering Backdoored Facial Recognition Systems Using Facial Characteristics. [pdf]

    • Esha Sarkar, Hadjer Benkraouda, and Michail Maniatakos. arXiv, 2020.
  • Light Can Hack Your Face! Black-box Backdoor Attack on Face Recognition Systems. [pdf]

    • Haoliang Li, Yufei Wang, Xiaofei Xie, Yang Liu, Shiqi Wang, Renjie Wan, Lap-Pui Chau, and Alex C. Kot. arXiv, 2020.

2019

  • A New Backdoor Attack in CNNS by Training Set Corruption Without Label Poisoning. [pdf]

    • M. Barni, K. Kallas, and B. Tondi. ICIP, 2019.
  • Label-Consistent Backdoor Attacks. [pdf] [code]

    • Alexander Turner, Dimitris Tsipras, and Aleksander Madry. arXiv, 2019.

2018

  • Trojaning Attack on Neural Networks. [pdf] [code]
    • Yingqi Liu, Shiqing Ma, Yousra Aafer, Wen-Chuan Lee, and Juan Zhai. NDSS, 2018.

2017

  • BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain. [pdf] [journal]

    • Tianyu Gu, Brendan Dolan-Gavitt, and Siddharth Garg. arXiv, 2017 (IEEE Access, 2019).
  • Targeted Backdoor Attacks on Deep Learning Systems Using Data Poisoning. [pdf] [code]

    • Xinyun Chen, Chang Liu, Bo Li, Kimberly Lu, and Dawn Song. arXiv, 2017.

Non-poisoning-based Attack

Weights-oriented Attack

  • Handcrafted Backdoors in Deep Neural Networks. [pdf]

    • Sanghyun Hong, Nicholas Carlini, and Alexey Kurakin. NeurIPS, 2022.
  • Hardly Perceptible Trojan Attack against Neural Networks with Bit Flips. [pdf] [code]

    • Jiawang Bai, Kuofeng Gao, Dihong Gong, Shu-Tao Xia, Zhifeng Li, and Wei Liu. ECCV, 2022.
  • ProFlip: Targeted Trojan Attack with Progressive Bit Flips. [pdf]

    • Huili Chen, Cheng Fu, Jishen Zhao, and Farinaz Koushanfar. ICCV, 2021.
  • TBT: Targeted Neural Network Attack with Bit Trojan. [pdf] [code]

    • Adnan Siraj Rakin, Zhezhi He, and Deliang Fan. CVPR, 2020.
  • How to Inject Backdoors with Better Consistency: Logit Anchoring on Clean Data. [pdf]

    • Zhiyuan Zhang, Lingjuan Lyu, Weiqiang Wang, Lichao Sun, and Xu Sun. ICLR, 2022.
  • Can Adversarial Weight Perturbations Inject Neural Backdoors? [pdf]

    • Siddhant Garg, Adarsh Kumar, Vibhor Goel, and Yingyu Liang. CIKM, 2020.
  • Versatile Weight Attack via Flipping Limited Bits. [pdf]

    • Jiawang Bai, Baoyuan Wu, Zhifeng Li, and Shu-Tao Xia. arXiv, 2022.
  • Toward Realistic Backdoor Injection Attacks on DNNs using Rowhammer. [pdf]

    • M. Caner Tol, Saad Islam, Berk Sunar, and Ziming Zhang. arXiv, 2022.
  • TrojanNet: Embedding Hidden Trojan Horse Models in Neural Network. [pdf]

    • Chuan Guo, Ruihan Wu, and Kilian Q. Weinberger. arXiv, 2020.
  • Backdooring Convolutional Neural Networks via Targeted Weight Perturbations. [pdf]

    • Jacob Dumford, and Walter Scheirer. arXiv, 2018.

Structure-modified Attack

  • LoneNeuron: a Highly-Effective Feature-Domain Neural Trojan Using Invisible and Polymorphic Watermarks. [pdf]

    • Zeyan Liu, Fengjun Li, Zhu Li, and Bo Luo. CCS, 2022.
  • Towards Practical Deployment-Stage Backdoor Attack on Deep Neural Networks. [pdf] [code]

    • Xiangyu Qi, Tinghao Xie, Ruizhe Pan, Jifeng Zhu, Yong Yang, and Kai Bu. CVPR, 2022.
  • Hiding Needles in a Haystack: Towards Constructing Neural Networks that Evade Verification. [link] [code]

    • Árpád Berta, Gábor Danner, István Hegedűs and Márk Jelasity. ACM IH&MMSec, 2022.
  • Stealthy and Flexible Trojan in Deep Learning Framework. [link]

    • Yajie Wang, Kongyang Chen, Yu-An Tan, Shuxin Huang, Wencong Ma, and Yuanzhang Li. IEEE Transactions on Dependable and Secure Computing, 2022.
  • FooBaR: Fault Fooling Backdoor Attack on Neural Network Training. [link] [code]

    • Jakub Breier, Xiaolu Hou, Martín Ochoa and Jesus Solano. IEEE Transactions on Dependable and Secure Computing, 2022.
  • DeepPayload: Black-box Backdoor Attack on Deep Learning Models through Neural Payload Injection. [pdf]

    • Yuanchun Li, Jiayi Hua, Haoyu Wang, Chunyang Chen, and Yunxin Liu. ICSE, 2021.
  • An Embarrassingly Simple Approach for Trojan Attack in Deep Neural Networks. [pdf] [code]

    • Ruixiang Tang, Mengnan Du, Ninghao Liu, Fan Yang, and Xia Hu. KDD, 2020.
  • BadRes: Reveal the Backdoors through Residual Connection. [pdf]

    • Mingrui He, Tianyu Chen, Haoyi Zhou, Shanghang Zhang, and Jianxin Li. arXiv, 2022.
  • Architectural Backdoors in Neural Networks. [pdf]

    • Mikel Bober-Irizar, Ilia Shumailov, Yiren Zhao, Robert Mullins, and Nicolas Papernot. arXiv, 2022.
  • Planting Undetectable Backdoors in Machine Learning Models. [pdf]

    • Shafi Goldwasser, Michael P. Kim, Vinod Vaikuntanathan, and Or Zamir. arXiv, 2022.

Other Attacks

  • ImpNet: Imperceptible and blackbox-undetectable backdoors in compiled neural networks. [pdf] [website] [code]

    • Tim Clifford, Ilia Shumailov, Yiren Zhao, Ross Anderson, and Robert Mullins. arXiv, 2022.
  • Don't Trigger Me! A Triggerless Backdoor Attack Against Deep Neural Networks. [pdf]

    • Ahmed Salem, Michael Backes, and Yang Zhang. arXiv, 2020.

Backdoor Defense

Preprocessing-based Empirical Defense

  • Backdoor Attack in the Physical World. [pdf] [extension]

    • Yiming Li, Tongqing Zhai, Yong Jiang, Zhifeng Li, and Shu-Tao Xia. ICLR Workshop, 2021.
  • DeepSweep: An Evaluation Framework for Mitigating DNN Backdoor Attacks using Data Augmentation. [pdf] [code]

    • Han Qiu, Yi Zeng, Shangwei Guo, Tianwei Zhang, Meikang Qiu, and Bhavani Thuraisingham. AsiaCCS, 2021.
  • Februus: Input Purification Defense Against Trojan Attacks on Deep Neural Network Systems. [pdf] [code]

    • Bao Gia Doan, Ehsan Abbasnejad, and Damith C. Ranasinghe. ACSAC, 2020.
  • Neural Trojans. [pdf]

    • Yuntao Liu, Yang Xie, and Ankur Srivastava. ICCD, 2017.
  • Defending Deep Neural Networks against Backdoor Attack by Using De-trigger Autoencoder. [pdf]

    • Hyun Kwon. IEEE Access, 2021.
  • ConFoc: Content-Focus Protection Against Trojan Attacks on Neural Networks. [pdf]

    • Miguel Villarreal-Vasquez, and Bharat Bhargava. arXiv, 2021.
  • Model Agnostic Defense against Backdoor Attacks in Machine Learning. [pdf]

    • Sakshi Udeshi, Shanshan Peng, Gerald Woo, Lionell Loh, Louth Rawshan, and Sudipta Chattopadhyay. arXiv, 2019.

Model Reconstruction based Empirical Defense

  • Backdoor Cleansing with Unlabeled Data. [pdf] [code]

    • Lu Pang, Tao Sun, Haibin Ling, and Chao Chen. CVPR, 2023.
  • Adversarial Unlearning of Backdoors via Implicit Hypergradient. [pdf] [code]

    • Yi Zeng, Si Chen, Won Park, Z. Morley Mao, Ming Jin, and Ruoxi Jia. ICLR, 2022.
  • Data-free Backdoor Removal based on Channel Lipschitzness. [pdf] [code]

    • Runkai Zheng, Rongjun Tang, Jianze Li, and Li Liu. ECCV, 2022.
  • Eliminating Backdoor Triggers for Deep Neural Networks Using Attention Relation Graph Distillation. [pdf]

    • Jun Xia, Ting Wang, Jieping Ding, Xian Wei, and Mingsong Chen. IJCAI, 2022.
  • Pre-activation Distributions Expose Backdoor Neurons. [pdf] [code]

    • Runkai Zheng, Rongjun Tang, Jianze Li, and Li Liu. NeurIPS, 2022.
  • Adversarial Neuron Pruning Purifies Backdoored Deep Models. [pdf] [code]

    • Dongxian Wu and Yisen Wang. NeurIPS, 2021.
  • Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Networks. [pdf] [code]

    • Yige Li, Xingjun Ma, Nodens Koren, Lingjuan Lyu, Xixiang Lyu, and Bo Li. ICLR, 2021.
  • Interpretability-Guided Defense against Backdoor Attacks to Deep Neural Networks. [link]

    • Wei Jiang, Xiangyu Wen, Jinyu Zhan, Xupeng Wang, and Ziwei Song. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2021.
  • Boundary augment: A data augment method to defend poison attack. [link]

    • Xuan Chen, Yuena Ma, Shiwei Lu, and Yu Yao. IET Image Processing, 2021.
  • Bridging Mode Connectivity in Loss Landscapes and Adversarial Robustness. [pdf] [code]

    • Pu Zhao, Pin-Yu Chen, Payel Das, Karthikeyan Natesan Ramamurthy, and Xue Lin. ICLR, 2020.
  • Fine-Pruning: Defending Against Backdooring Attacks on Deep Neural Networks. [pdf] [code]

    • Kang Liu, Brendan Dolan-Gavitt, and Siddharth Garg. RAID, 2018.
  • Neural Trojans. [pdf]

    • Yuntao Liu, Yang Xie, and Ankur Srivastava. ICCD, 2017.
  • Test-time Adaptation of Residual Blocks against Poisoning and Backdoor Attacks. [pdf]

    • Arnav Gudibande, Xinyun Chen, Yang Bai, Jason Xiong, and Dawn Song. CVPR Workshop, 2022.
  • Disabling Backdoor and Identifying Poison Data by using Knowledge Distillation in Backdoor Attacks on Deep Neural Networks. [pdf]

    • Kota Yoshida, and Takeshi Fujino. CCS Workshop, 2020.
  • Defending against Backdoor Attack on Deep Neural Networks. [pdf]

    • Hao Cheng, Kaidi Xu, Sijia Liu, Pin-Yu Chen, Pu Zhao, and Xue Lin. KDD Workshop, 2019.
  • Defense against Backdoor Attacks via Identifying and Purifying Bad Neurons. [pdf]

    • Mingyuan Fan, Yang Liu, Cen Chen, Ximeng Liu, and Wenzhong Guo. arXiv, 2022.
  • Turning a Curse Into a Blessing: Enabling Clean-Data-Free Defenses by Model Inversion. [pdf]

    • Si Chen, Yi Zeng, Won Park, and Ruoxi Jia. arXiv, 2022.
  • Adversarial Fine-tuning for Backdoor Defense: Connect Adversarial Examples to Triggered Samples. [pdf]

    • Bingxu Mu, Le Wang, and Zhenxing Niu. arXiv, 2022.
  • Neural Network Laundering: Removing Black-Box Backdoor Watermarks from Deep Neural Networks. [pdf]

    • William Aiken, Hyoungshick Kim, and Simon Woo. arXiv, 2020.
  • HaS-Nets: A Heal and Select Mechanism to Defend DNNs Against Backdoor Attacks for Data Collection Scenarios. [pdf]

    • Hassan Ali, Surya Nepal, Salil S. Kanhere, and Sanjay Jha. arXiv, 2020.

Trigger Synthesis based Empirical Defense

  • Distilling Cognitive Backdoor Patterns within an Image. [pdf] [code]

    • Hanxun Huang, Xingjun Ma, Sarah Monazam Erfani, and James Bailey. ICLR, 2023.
  • UNICORN: A Unified Backdoor Trigger Inversion Framework. [pdf] [code]

    • Zhenting Wang, Kai Mei, Juan Zhai, and Shiqing Ma. ICLR, 2023.
  • Quarantine: Sparsity Can Uncover the Trojan Attack Trigger for Free. [pdf] [code]

    • Tianlong Chen, Zhenyu Zhang, Yihua Zhang, Shiyu Chang, Sijia Liu, and Zhangyang Wang. CVPR, 2022.
  • Better Trigger Inversion Optimization in Backdoor Scanning. [pdf] [code]

    • Guanhong Tao, Guangyu Shen, Yingqi Liu, Shengwei An, Qiuling Xu, Shiqing Ma, Pan Li, and Xiangyu Zhang. CVPR, 2022.
  • Few-shot Backdoor Defense Using Shapley Estimation. [pdf]

    • Jiyang Guan, Zhuozhuo Tu, Ran He, and Dacheng Tao. CVPR, 2022.
  • Rethinking the Reverse-engineering of Trojan Triggers. [pdf] [code]

    • Zhenting Wang, Kai Mei, Hailun Ding, Juan Zhai, and Shiqing Ma. NeurIPS, 2022.
  • One-shot Neural Backdoor Erasing via Adversarial Weight Masking. [pdf] [code]

    • Shuwen Chai and Jinghui Chen. NeurIPS, 2022.
  • AEVA: Black-box Backdoor Detection Using Adversarial Extreme Value Analysis. [pdf] [code]

    • Junfeng Guo, Ang Li, and Cong Liu. ICLR, 2022.
  • Trigger Hunting with a Topological Prior for Trojan Detection. [pdf] [code]

    • Xiaoling Hu, Xiao Lin, Michael Cogswell, Yi Yao, Susmit Jha, and Chao Chen. ICLR, 2022.
  • Backdoor Defense with Machine Unlearning. [pdf]

    • Yang Liu, Mingyuan Fan, Cen Chen, Ximeng Liu, Zhuo Ma, Li Wang, and Jianfeng Ma. INFOCOM, 2022.
  • Black-box Detection of Backdoor Attacks with Limited Information and Data. [pdf]

    • Yinpeng Dong, Xiao Yang, Zhijie Deng, Tianyu Pang, Zihao Xiao, Hang Su, and Jun Zhu. ICCV, 2021.
  • Backdoor Scanning for Deep Neural Networks through K-Arm Optimization. [pdf] [code]

    • Guangyu Shen, Yingqi Liu, Guanhong Tao, Shengwei An, Qiuling Xu, Siyuan Cheng, Shiqing Ma, and Xiangyu Zhang. ICML, 2021.
  • Towards Inspecting and Eliminating Trojan Backdoors in Deep Neural Networks. [pdf] [previous version] [code]

    • Wenbo Guo, Lun Wang, Xinyu Xing, Min Du, and Dawn Song. ICDM, 2020.
  • GangSweep: Sweep out Neural Backdoors by GAN. [pdf]

    • Liuwan Zhu, Rui Ning, Cong Wang, Chunsheng Xin, and Hongyi Wu. ACM MM, 2020.
  • Detection of Backdoors in Trained Classifiers Without Access to the Training Set. [pdf]

    • Zhen Xiang, David J. Miller, and George Kesidis. IEEE Transactions on Neural Networks and Learning Systems, 2020.
  • Neural Cleanse: Identifying and Mitigating Backdoor Attacks in Neural Networks. [pdf] [code]

    • Bolun Wang, Yuanshun Yao, Shawn Shan, Huiying Li, Bimal Viswanath, Haitao Zheng, and Ben Y. Zhao. IEEE S&P, 2019.
  • Defending Neural Backdoors via Generative Distribution Modeling. [pdf] [code]

    • Ximing Qiao, Yukun Yang, and Hai Li. NeurIPS, 2019.
  • DeepInspect: A Black-box Trojan Detection and Mitigation Framework for Deep Neural Networks. [pdf]

    • Huili Chen, Cheng Fu, Jishen Zhao, and Farinaz Koushanfar. IJCAI, 2019.
  • Identifying Physically Realizable Triggers for Backdoored Face Recognition Networks. [link]

    • Ankita Raj, Ambar Pal, and Chetan Arora. ICIP, 2021.
  • Revealing Perceptible Backdoors in DNNs Without the Training Set via the Maximum Achievable Misclassification Fraction Statistic. [pdf]

    • Zhen Xiang, David J. Miller, Hang Wang, and George Kesidis. MLSP, 2020.
  • Adaptive Perturbation Generation for Multiple Backdoors Detection. [pdf]

    • Yuhang Wang, Huafeng Shi, Rui Min, Ruijia Wu, Siyuan Liang, Yichao Wu, Ding Liang, and Aishan Liu. arXiv, 2022.
  • Confidence Matters: Inspecting Backdoors in Deep Neural Networks via Distribution Transfer. [pdf]

    • Tong Wang, Yuan Yao, Feng Xu, Miao Xu, Shengwei An, and Ting Wang. arXiv, 2022.
  • Defense Against Multi-target Trojan Attacks. [pdf]

    • Haripriya Harikumar, Santu Rana, Kien Do, Sunil Gupta, Wei Zong, Willy Susilo, and Svetha Venkastesh. arXiv, 2022.
  • Model-Contrastive Learning for Backdoor Defense. [pdf]

    • Zhihao Yue, Jun Xia, Zhiwei Ling, Ting Wang, Xian Wei, and Mingsong Chen. arXiv, 2022.
  • CatchBackdoor: Backdoor Testing by Critical Trojan Neural Path Identification via Differential Fuzzing. [pdf]

    • Haibo Jin, Ruoxi Chen, Jinyin Chen, Yao Cheng, Chong Fu, Ting Wang, Yue Yu, and Zhaoyan Ming. arXiv, 2021.
  • Detect and Remove Watermark in Deep Neural Networks via Generative Adversarial Networks. [pdf]

    • Haoqi Wang, Mingfu Xue, Shichang Sun, Yushu Zhang, Jian Wang, and Weiqiang Liu. arXiv, 2021.
  • TAD: Trigger Approximation based Black-box Trojan Detection for AI. [pdf]

    • Xinqiao Zhang, Huili Chen, and Farinaz Koushanfar. arXiv, 2021.
  • Scalable Backdoor Detection in Neural Networks. [pdf]

    • Haripriya Harikumar, Vuong Le, Santu Rana, Sourangshu Bhattacharya, Sunil Gupta, and Svetha Venkatesh. arXiv, 2020.
  • NNoculation: Broad Spectrum and Targeted Treatment of Backdoored DNNs. [pdf] [code]

    • Akshaj Kumar Veldanda, Kang Liu, Benjamin Tan, Prashanth Krishnamurthy, Farshad Khorrami, Ramesh Karri, Brendan Dolan-Gavitt, and Siddharth Garg. arXiv, 2020.

Model Diagnosis based Empirical Defense

  • Complex Backdoor Detection by Symmetric Feature Differencing. [pdf] [code]

    • Yingqi Liu, Guangyu Shen, Guanhong Tao, Zhenting Wang, Shiqing Ma, and Xiangyu Zhang. CVPR, 2022.
  • Post-Training Detection of Backdoor Attacks for Two-Class and Multi-Attack Scenarios. [pdf] [code]

    • Zhen Xiang, David J. Miller, and George Kesidis. ICLR, 2022.
  • Randomized Channel Shuffling: Minimal-Overhead Backdoor Attack Detection without Clean Datasets. [pdf] [code]

    • Ruisi Cai, Zhenyu Zhang, Tianlong Chen, Xiaohan Chen, and Zhangyang Wang. NeurIPS, 2022.
  • An Anomaly Detection Approach for Backdoored Neural Networks: Face Recognition as a Case Study. [pdf]

    • Alexander Unnervik and Sébastien Marcel. BIOSIG, 2022.
  • Critical Path-Based Backdoor Detection for Deep Neural Networks. [link]

    • Wei Jiang, Xiangyu Wen, Jinyu Zhan, Xupeng Wang, Ziwei Song, and Chen Bian. IEEE Transactions on Neural Networks and Learning Systems, 2022.
  • Detecting AI Trojans Using Meta Neural Analysis. [pdf]

    • Xiaojun Xu, Qi Wang, Huichen Li, Nikita Borisov, Carl A. Gunter, and Bo Li. IEEE S&P, 2021.
  • Topological Detection of Trojaned Neural Networks. [pdf]

    • Songzhu Zheng, Yikai Zhang, Hubert Wagner, Mayank Goswami, and Chao Chen. NeurIPS, 2021.
  • Black-box Detection of Backdoor Attacks with Limited Information and Data. [pdf]

    • Yinpeng Dong, Xiao Yang, Zhijie Deng, Tianyu Pang, Zihao Xiao, Hang Su, and Jun Zhu. ICCV, 2021.
  • Universal Litmus Patterns: Revealing Backdoor Attacks in CNNs. [pdf] [code]

    • Soheil Kolouri, Aniruddha Saha, Hamed Pirsiavash, and Heiko Hoffmann. CVPR, 2020.
  • One-Pixel Signature: Characterizing CNN Models for Backdoor Detection. [pdf]

    • Shanjiaoyang Huang, Weiqi Peng, Zhiwei Jia, and Zhuowen Tu. ECCV, 2020.
  • Practical Detection of Trojan Neural Networks: Data-Limited and Data-Free Cases. [pdf] [code]

    • Ren Wang, Gaoyuan Zhang, Sijia Liu, Pin-Yu Chen, Jinjun Xiong, and Meng Wang. ECCV, 2020.
  • Detecting Backdoor Attacks via Class Difference in Deep Neural Networks. [pdf]

    • Hyun Kwon. IEEE Access, 2020.
  • Baseline Pruning-Based Approach to Trojan Detection in Neural Networks. [pdf]

    • Peter Bajcsy and Michael Majurski. ICLR Workshop, 2021.
  • Universal Post-Training Backdoor Detection. [pdf]

    • Hang Wang, Zhen Xiang, David J. Miller, and George Kesidis. arXiv, 2022.
  • Trojan Signatures in DNN Weights. [pdf]

    • Greg Fields, Mohammad Samragh, Mojan Javaheripi, Farinaz Koushanfar, and Tara Javidi. arXiv, 2021.
  • EX-RAY: Distinguishing Injected Backdoor from Natural Features in Neural Networks by Examining Differential Feature Symmetry. [pdf]

    • Yingqi Liu, Guangyu Shen, Guanhong Tao, Zhenting Wang, Shiqing Ma, and Xiangyu Zhang. arXiv, 2021.
  • TOP: Backdoor Detection in Neural Networks via Transferability of Perturbation. [pdf]

    • Todd Huster and Emmanuel Ekwedike. arXiv, 2021.
  • Detecting Trojaned DNNs Using Counterfactual Attributions. [pdf]

    • Karan Sikka, Indranil Sur, Susmit Jha, Anirban Roy, and Ajay Divakaran. arXiv, 2021.
  • Adversarial examples are useful too! [pdf] [code]

    • Ali Borji. arXiv, 2020.
  • Cassandra: Detecting Trojaned Networks from Adversarial Perturbations. [pdf]

    • Xiaoyu Zhang, Ajmal Mian, Rohit Gupta, Nazanin Rahnavard, and Mubarak Shah. arXiv, 2020.
  • Odyssey: Creation, Analysis and Detection of Trojan Models. [pdf] [dataset]

    • Marzieh Edraki, Nazmul Karim, Nazanin Rahnavard, Ajmal Mian, and Mubarak Shah. arXiv, 2020.
  • Noise-response Analysis for Rapid Detection of Backdoors in Deep Neural Networks. [pdf]

    • N. Benjamin Erichson, Dane Taylor, Qixuan Wu, and Michael W. Mahoney. arXiv, 2020.
  • NeuronInspect: Detecting Backdoors in Neural Networks via Output Explanations. [pdf]

    • Xijie Huang, Moustafa Alzantot, and Mani Srivastava. arXiv, 2019.

Poison Suppression based Empirical Defense

  • Backdoor Defense via Decoupling the Training Process. [pdf] [code]

    • Kunzhe Huang, Yiming Li, Baoyuan Wu, Zhan Qin, and Kui Ren. ICLR, 2022.
  • Effective Backdoor Defense by Exploiting Sensitivity of Poisoned Samples. [pdf] [code]

    • Weixin Chen, Baoyuan Wu, and Haoqian Wang. NeurIPS, 2022.
  • Trap and Replace: Defending Backdoor Attacks by Trapping Them into an Easy-to-Replace Subnetwork. [pdf] [code]

    • Haotao Wang, Junyuan Hong, Aston Zhang, Jiayu Zhou, and Zhangyang Wang. NeurIPS, 2022.
  • Training with More Confidence: Mitigating Injected and Natural Backdoors During Training. [pdf] [code]

    • Zhenting Wang, Hailun Ding, Juan Zhai, and Shiqing Ma. NeurIPS, 2022.
  • Anti-Backdoor Learning: Training Clean Models on Poisoned Data. [pdf] [code]

    • Yige Li, Xixiang Lyu, Nodens Koren, Lingjuan Lyu, Bo Li, and Xingjun Ma. NeurIPS, 2021.
  • Robust Anomaly Detection and Backdoor Attack Detection via Differential Privacy. [pdf] [code]

    • Min Du, Ruoxi Jia, and Dawn Song. ICLR, 2020.
  • Strong Data Augmentation Sanitizes Poisoning and Backdoor Attacks Without an Accuracy Trade-off. [pdf]

    • Eitan Borgnia, Valeriia Cherepanova, Liam Fowl, Amin Ghiasi, Jonas Geiping, Micah Goldblum, Tom Goldstein, and Arjun Gupta. ICASSP, 2021.
  • What Doesn't Kill You Makes You Robust(er): Adversarial Training against Poisons and Backdoors. [pdf]

    • Jonas Geiping, Liam Fowl, Gowthami Somepalli, Micah Goldblum, Michael Moeller, and Tom Goldstein. ICLR Workshop, 2021.
  • Removing Backdoor-Based Watermarks in Neural Networks with Limited Data. [pdf]

    • Xuankai Liu, Fengting Li, Bihan Wen, and Qi Li. ICPR, 2021.
  • Mask and Restore: Blind Backdoor Defense at Test Time with Masked Autoencoder. [pdf] [code]

    • Tao Sun, Lu Pang, Chao Chen, and Haibin Ling. arXiv, 2023.
  • On the Effectiveness of Adversarial Training against Backdoor Attacks. [pdf]

    • Yinghua Gao, Dongxian Wu, Jingfeng Zhang, Guanhao Gan, Shu-Tao Xia, Gang Niu, and Masashi Sugiyama. arXiv, 2022.
  • Resurrecting Trust in Facial Recognition: Mitigating Backdoor Attacks in Face Recognition to Prevent Potential Privacy Breaches. [pdf]

    • Reena Zelenkova, Jack Swallow, M. A. P. Chamikara, Dongxi Liu, Mohan Baruwal Chhetri, Seyit Camtepe, Marthie Grobler, and Mahathir Almashor. arXiv, 2022.
  • SanitAIs: Unsupervised Data Augmentation to Sanitize Trojaned Neural Networks. [pdf]

    • Kiran Karra and Chace Ashcraft. arXiv, 2021.
  • On the Effectiveness of Mitigating Data Poisoning Attacks with Gradient Shaping. [pdf] [code]

    • Sanghyun Hong, Varun Chandrasekaran, Yiğitcan Kaya, Tudor Dumitraş, and Nicolas Papernot. arXiv, 2020.
  • DP-InstaHide: Provably Defusing Poisoning and Backdoor Attacks with Differentially Private Data Augmentations. [pdf]

    • Eitan Borgnia, Jonas Geiping, Valeriia Cherepanova, Liam Fowl, Arjun Gupta, Amin Ghiasi, Furong Huang, Micah Goldblum, and Tom Goldstein. arXiv, 2021.

Sample Filtering based Empirical Defense

  • SCALE-UP: An Efficient Black-box Input-level Backdoor Detection via Analyzing Scaled Prediction Consistency. [pdf] [code]

    • Junfeng Guo, Yiming Li, Xun Chen, Hanqing Guo, Lichao Sun, and Cong Liu. ICLR, 2023.
  • Distilling Cognitive Backdoor Patterns within an Image. [pdf] [code]

    • Hanxun Huang, Xingjun Ma, Sarah Monazam Erfani, and James Bailey. ICLR, 2023.
  • Incompatibility Clustering as a Defense Against Backdoor Poisoning Attacks. [pdf] [code]

    • Charles Jin, Melinda Sun, and Martin Rinard. ICLR, 2023.
  • The "Beatrix'' Resurrections: Robust Backdoor Detection via Gram Matrices. [pdf] [code]

    • Wanlun Ma, Derui Wang, Ruoxi Sun, Minhui Xue, Sheng Wen, and Yang Xiang. NDSS, 2023.
  • Effective Backdoor Defense by Exploiting Sensitivity of Poisoned Samples. [pdf] [code]

    • Weixin Chen, Baoyuan Wu, and Haoqian Wang. NeurIPS, 2022.
  • Towards Effective and Robust Neural Trojan Defenses via Input Filtering. [pdf] [code]

    • Kien Do, Haripriya Harikumar, Hung Le, Dung Nguyen, Truyen Tran, Santu Rana, Dang Nguyen, Willy Susilo, and Svetha Venkatesh. ECCV, 2022.
  • Can We Mitigate Backdoor Attack Using Adversarial Detection Methods? [link]

    • Kaidi Jin, Tianwei Zhang, Chao Shen, Yufei Chen, Ming Fan, Chenhao Lin, and Ting Liu. IEEE Transactions on Dependable and Secure Computing, 2022.
  • LinkBreaker: Breaking the Backdoor-Trigger Link in DNNs via Neurons Consistency Check. [link]

    • Zhenzhu Chen, Shang Wang, Anmin Fu, Yansong Gao, Shui Yu, and Robert H. Deng. IEEE Transactions on Information Forensics and Security, 2022.
  • Similarity-based Integrity Protection for Deep Learning Systems. [link]

    • Ruitao Hou, Shan Ai, Qi Chen, Hongyang Yan, Teng Huang, and Kongyang Chen. Information Sciences, 2022.
  • A Feature-Based On-Line Detector to Remove Adversarial-Backdoors by Iterative Demarcation. [pdf]

    • Hao Fu, Akshaj Kumar Veldanda, Prashanth Krishnamurthy, Siddharth Garg, and Farshad Khorrami. IEEE ACCESS, 2022.
  • Rethinking the Backdoor Attacks' Triggers: A Frequency Perspective. [pdf] [code]

    • Yi Zeng, Won Park, Z. Morley Mao, and Ruoxi Jia. ICCV, 2021.
  • Demon in the Variant: Statistical Analysis of DNNs for Robust Backdoor Contamination Detection. [pdf] [code]

    • Di Tang, XiaoFeng Wang, Haixu Tang, and Kehuan Zhang. USENIX Security, 2021.
  • SPECTRE: Defending Against Backdoor Attacks Using Robust Statistics. [pdf] [code]

    • Jonathan Hayase, Weihao Kong, Raghav Somani, and Sewoong Oh. ICML, 2021.
  • CLEANN: Accelerated Trojan Shield for Embedded Neural Networks. [pdf]

    • Mojan Javaheripi, Mohammad Samragh, Gregory Fields, Tara Javidi, and Farinaz Koushanfar. ICCAD, 2020.
  • Robust Anomaly Detection and Backdoor Attack Detection via Differential Privacy. [pdf] [code]

    • Min Du, Ruoxi Jia, and Dawn Song. ICLR, 2020.
  • Simple, Attack-Agnostic Defense Against Targeted Training Set Attacks Using Cosine Similarity. [pdf] [code]

    • Zayd Hammoudeh and Daniel Lowd. ICML Workshop, 2021.
  • SentiNet: Detecting Localized Universal Attacks Against Deep Learning Systems. [pdf]

    • Edward Chou, Florian Tramèr, and Giancarlo Pellegrino. IEEE S&P Workshop, 2020.
  • STRIP: A Defence Against Trojan Attacks on Deep Neural Networks. [pdf] [extension] [code]

    • Yansong Gao, Chang Xu, Derui Wang, Shiping Chen, Damith C. Ranasinghe, and Surya Nepal. ACSAC, 2019.
  • Detecting Backdoor Attacks on Deep Neural Networks by Activation Clustering. [pdf] [code]

    • Bryant Chen, Wilka Carvalho, Nathalie Baracaldo, Heiko Ludwig, Benjamin Edwards, Taesung Lee, Ian Molloy, and Biplav Srivastava. AAAI Workshop, 2019.
  • Deep Probabilistic Models to Detect Data Poisoning Attacks. [pdf]

    • Mahesh Subedar, Nilesh Ahuja, Ranganath Krishnan, Ibrahima J. Ndiour, and Omesh Tickoo. NeurIPS Workshop, 2019.
  • Spectral Signatures in Backdoor Attacks. [pdf] [code]

    • Brandon Tran, Jerry Li, and Aleksander Madry. NeurIPS, 2018.
  • An Adaptive Black-box Defense against Trojan Attacks (TrojDef). [pdf]

    • Guanxiong Liu, Abdallah Khreishah, Fatima Sharadgah, and Issa Khalil. arXiv, 2022.
  • Fight Poison with Poison: Detecting Backdoor Poison Samples via Decoupling Benign Correlations. [pdf] [code]

    • Xiangyu Qi, Tinghao Xie, Saeed Mahloujifar, and Prateek Mittal. arXiv, 2022.
  • PiDAn: A Coherence Optimization Approach for Backdoor Attack Detection and Mitigation in Deep Neural Networks. [pdf]

    • Yue Wang, Wenqing Li, Esha Sarkar, Muhammad Shafique, Michail Maniatakos, and Saif Eddin Jabari. arXiv, 2022.
  • Neural Network Trojans Analysis and Mitigation from the Input Domain. [pdf]

    • Zhenting Wang, Hailun Ding, Juan Zhai, and Shiqing Ma. arXiv, 2022.
  • A General Framework for Defending Against Backdoor Attacks via Influence Graph. [pdf]

    • Xiaofei Sun, Jiwei Li, Xiaoya Li, Ziyao Wang, Tianwei Zhang, Han Qiu, Fei Wu, and Chun Fan. arXiv, 2021.
  • NTD: Non-Transferability Enabled Backdoor Detection. [pdf]

    • Yinshan Li, Hua Ma, Zhi Zhang, Yansong Gao, Alsharif Abuadbba, Anmin Fu, Yifeng Zheng, Said F. Al-Sarawi, and Derek Abbott. arXiv, 2021.
  • A Unified Framework for Task-Driven Data Quality Management. [pdf]

    • Tianhao Wang, Yi Zeng, Ming Jin, and Ruoxi Jia. arXiv, 2021.
  • TESDA: Transform Enabled Statistical Detection of Attacks in Deep Neural Networks. [pdf]

    • Chandramouli Amarnath, Aishwarya H. Balwani, Kwondo Ma, and Abhijit Chatterjee. arXiv, 2021.
  • Traceback of Data Poisoning Attacks in Neural Networks. [pdf]

    • Shawn Shan, Arjun Nitin Bhagoji, Haitao Zheng, and Ben Y. Zhao. arXiv, 2021.
  • Provable Guarantees against Data Poisoning Using Self-Expansion and Compatibility. [pdf]

    • Charles Jin, Melinda Sun, and Martin Rinard. arXiv, 2021.
  • Online Defense of Trojaned Models using Misattributions. [pdf]

    • Panagiota Kiourti, Wenchao Li, Anirban Roy, Karan Sikka, and Susmit Jha. arXiv, 2021.
  • Detecting Backdoor in Deep Neural Networks via Intentional Adversarial Perturbations. [pdf]

    • Mingfu Xue, Yinghao Wu, Zhiyu Wu, Jian Wang, Yushu Zhang, and Weiqiang Liu. arXiv, 2021.
  • Exposing Backdoors in Robust Machine Learning Models. [pdf]

    • Ezekiel Soremekun, Sakshi Udeshi, and Sudipta Chattopadhyay. arXiv, 2020.
  • HaS-Nets: A Heal and Select Mechanism to Defend DNNs Against Backdoor Attacks for Data Collection Scenarios. [pdf]

    • Hassan Ali, Surya Nepal, Salil S. Kanhere, and Sanjay Jha. arXiv, 2020.
  • Poison as a Cure: Detecting & Neutralizing Variable-Sized Backdoor Attacks in Deep Neural Networks. [pdf]

    • Alvin Chan, and Yew-Soon Ong. arXiv, 2019.

Certificated Defense

  • Towards Robustness Certification Against Universal Perturbations. [pdf] [code]

    • Yi Zeng, Zhouxing Shi, Ming Jin, Feiyang Kang, Lingjuan Lyu, Cho-Jui Hsieh, and Ruoxi Jia. ICLR, 2023.
  • BagFlip: A Certified Defense against Data Poisoning. [pdf] [code]

    • Yuhao Zhang, Aws Albarghouthi, and Loris D'Antoni. NeurIPS, 2022.
  • RAB: Provable Robustness Against Backdoor Attacks. [pdf] [code]

    • Maurice Weber, Xiaojun Xu, Bojan Karlas, Ce Zhang, and Bo Li. IEEE S&P, 2022.
  • Certified Robustness of Nearest Neighbors against Data Poisoning and Backdoor Attacks. [pdf]

    • Jinyuan Jia, Yupei Liu, Xiaoyu Cao, and Neil Zhenqiang Gong. AAAI, 2022.
  • Deep Partition Aggregation: Provable Defense against General Poisoning Attacks [pdf] [code]

    • Alexander Levine and Soheil Feizi. ICLR, 2021.
  • Intrinsic Certified Robustness of Bagging against Data Poisoning Attacks [pdf] [code]

    • Jinyuan Jia, Xiaoyu Cao, and Neil Zhenqiang Gong. AAAI, 2021.
  • Certified Robustness to Label-Flipping Attacks via Randomized Smoothing. [pdf]

    • Elan Rosenfeld, Ezra Winston, Pradeep Ravikumar, and J. Zico Kolter. ICML, 2020.
  • On Certifying Robustness against Backdoor Attacks via Randomized Smoothing. [pdf]

    • Binghui Wang, Xiaoyu Cao, Jinyuan Jia, and Neil Zhenqiang Gong. CVPR Workshop, 2020.

Attack and Defense Towards Other Paradigms and Tasks

Federated Learning

  • FLIP: A Provable Defense Framework for Backdoor Mitigation in Federated Learning. [pdf] [code]

    • Kaiyuan Zhang, Guanhong Tao, Qiuling Xu, Siyuan Cheng, Shengwei An, Yingqi Liu, Shiwei Feng, Guangyu Shen, Pin-Yu Chen, Shiqing Ma, and Xiangyu Zhang. ICLR, 2023.
  • On the Vulnerability of Backdoor Defenses for Federated Learning. [link] [code]

    • Pei Fang and Jinghui Chen. AAAI, 2023.
  • Poisoning with Cerberus: Stealthy and Colluded Backdoor Attack against Federated Learning. [link]

    • Xiaoting Lyu, Yufei Han, Wei Wang, Jingkai Liu, Bin Wang, Jiqiang Liu, and Xiangliang Zhang. AAAI, 2023.
  • Neurotoxin: Durable Backdoors in Federated Learning. [pdf]

    • Zhengming Zhang, Ashwinee Panda, Linyue Song, Yaoqing Yang, Michael W. Mahoney, Joseph E. Gonzalez, Kannan Ramchandran, and Prateek Mittal. ICML, 2022.
  • FLAME: Taming Backdoors in Federated Learning. [pdf]

    • Thien Duc Nguyen, Phillip Rieger, Huili Chen, Hossein Yalame, Helen Möllering, Hossein Fereidooni, Samuel Marchal, Markus Miettinen, Azalia Mirhoseini, Shaza Zeitouni, Farinaz Koushanfar, Ahmad-Reza Sadeghi, and Thomas Schneider. USENIX Security, 2022.
  • DeepSight: Mitigating Backdoor Attacks in Federated Learning Through Deep Model Inspection. [pdf]

    • Phillip Rieger, Thien Duc Nguyen, Markus Miettinen, and Ahmad-Reza Sadeghi. NDSS, 2022.
  • Defending Label Inference and Backdoor Attacks in Vertical Federated Learning. [pdf]

    • Yang Liu, Zhihao Yi, Yan Kang, Yuanqin He, Wenhan Liu, Tianyuan Zou, and Qiang Yang. AAAI, 2022.
  • An Analysis of Byzantine-Tolerant Aggregation Mechanisms on Model Poisoning in Federated Learning. [link]

    • Mary Roszel, Robert Norvill, and Radu State. MDAI, 2022.
  • Against Backdoor Attacks In Federated Learning With Differential Privacy. [link]

    • Lu Miao, Wei Yang, Rong Hu, Lu Li, and Liusheng Huang. ICASSP, 2022.
  • Secure Partial Aggregation: Making Federated Learning More Robust for Industry 4.0 Applications. [link]

    • Jiqiang Gao, Baolei Zhang, Xiaojie Guo, Thar Baker, Min Li, and Zheli Liu. IEEE Transactions on Industrial Informatics, 2022.
  • Backdoor Attacks-resilient Aggregation based on Robust Filtering of Outliers in Federated Learning for Image Classification. [link]

    • Nuria Rodríguez-Barroso, Eugenio Martínez-Cámara, M. Victoria Luzón, and Francisco Herrera. Knowledge-Based Systems, 2022.
  • Defense against Backdoor Attack in Federated Learning. [link] [code]

    • Shiwei Lu, Ruihu Li, Wenbin Liu, and Xuan Chen. Computers & Security, 2022.
  • Privacy-Enhanced Federated Learning against Poisoning Adversaries. [link]

    • Xiaoyuan Liu, Hongwei Li, Guowen Xu, Zongqi Chen, Xiaoming Huang, and Rongxing Lu. IEEE Transactions on Information Forensics and Security, 2021.
  • Coordinated Backdoor Attacks against Federated Learning with Model-Dependent Triggers. [link]

    • Xueluan Gong, Yanjiao Chen, Huayang Huang, Yuqing Liao, Shuai Wang, and Qian Wang. IEEE Network, 2022.
  • CRFL: Certifiably Robust Federated Learning against Backdoor Attacks. [pdf]

    • Chulin Xie, Minghao Chen, Pin-Yu Chen, and Bo Li. ICML, 2021.
  • Curse or Redemption? How Data Heterogeneity Affects the Robustness of Federated Learning. [pdf]

    • Syed Zawad, Ahsan Ali, Pin-Yu Chen, Ali Anwar, Yi Zhou, Nathalie Baracaldo, Yuan Tian, and Feng Yan. AAAI, 2021.
  • Defending Against Backdoors in Federated Learning with Robust Learning Rate. [pdf]

    • Mustafa Safa Ozdayi, Murat Kantarcioglu, and Yulia R. Gel. AAAI, 2021.
  • BaFFLe: Backdoor detection via Feedback-based Federated Learning. [pdf]

    • Sebastien Andreina, Giorgia Azzurra Marson, Helen Möllering, and Ghassan Karame. ICDCS, 2021.
  • PipAttack: Poisoning Federated Recommender Systems for Manipulating Item Promotion. [pdf]

    • Shijie Zhang, Hongzhi Yin, Tong Chen, Zi Huang, Quoc Viet Hung Nguyen, and Lizhen Cui. WSDM, 2021.
  • Mitigating the Backdoor Attack by Federated Filters for Industrial IoT Applications. [link]

    • Boyu Hou, Jiqiang Gao, Xiaojie Guo, Thar Baker, Ying Zhang, Yanlong Wen, and Zheli Liu. IEEE Transactions on Industrial Informatics, 2021.
  • Stability-Based Analysis and Defense against Backdoor Attacks on Edge Computing Services. [link]

    • Yi Zhao, Ke Xu, Haiyang Wang, Bo Li, and Ruoxi Jia. IEEE Network, 2021.
  • Attack of the Tails: Yes, You Really Can Backdoor Federated Learning. [pdf]

    • Hongyi Wang, Kartik Sreenivasan, Shashank Rajput, Harit Vishwakarma, Saurabh Agarwal, Jy-yong Sohn, Kangwook Lee, and Dimitris Papailiopoulos. NeurIPS, 2020.
  • DBA: Distributed Backdoor Attacks against Federated Learning. [pdf]

    • Chulin Xie, Keli Huang, Pin-Yu Chen, and Bo Li. ICLR, 2020.
  • The Limitations of Federated Learning in Sybil Settings. [pdf] [extension] [code]

    • Clement Fung, Chris J.M. Yoon, and Ivan Beschastnikh. RAID, 2020 (arXiv, 2018).
  • How to Backdoor Federated Learning. [pdf]

    • Eugene Bagdasaryan, Andreas Veit, Yiqing Hua, Deborah Estrin, and Vitaly Shmatikov. AISTATS, 2020 (arXiv, 2018).
  • BEAS: Blockchain Enabled Asynchronous & Secure Federated Machine Learning. [pdf]

    • Arup Mondal, Harpreet Virk, and Debayan Gupta. AAAI Workshop, 2022.
  • Backdoor Attacks and Defenses in Feature-partitioned Collaborative Learning. [pdf]

    • Yang Liu, Zhihao Yi, and Tianjian Chen. ICML Workshop, 2020.
  • Can You Really Backdoor Federated Learning? [pdf]

    • Ziteng Sun, Peter Kairouz, Ananda Theertha Suresh, and H. Brendan McMahan. NeurIPS Workshop, 2019.
  • Invariant Aggregator for Defending Federated Backdoor Attacks. [pdf]

    • Xiaoyang Wang, Dimitrios Dimitriadis, Sanmi Koyejo, and Shruti Tople. arXiv, 2022.
  • Shielding Federated Learning: Mitigating Byzantine Attacks with Less Constraints. [pdf]

    • Minghui Li, Wei Wan, Jianrong Lu, Shengshan Hu, Junyu Shi, and Leo Yu Zhang. arXiv, 2022.
  • Federated Zero-Shot Learning for Visual Recognition. [pdf]

    • Zhi Chen, Yadan Luo, Sen Wang, Jingjing Li, and Zi Huang. arXiv, 2022.
  • Assisting Backdoor Federated Learning with Whole Population Knowledge Alignment. [pdf]

    • Tian Liu, Xueyang Hu, and Tao Shu. arXiv, 2022.
  • FL-Defender: Combating Targeted Attacks in Federated Learning. [pdf]

    • Najeeb Jebreel and Josep Domingo-Ferrer. arXiv, 2022.
  • Backdoor Attack is A Devil in Federated GAN-based Medical Image Synthesis. [pdf]

    • Ruinan Jin and Xiaoxiao Li. arXiv, 2022.
  • SafeNet: Mitigating Data Poisoning Attacks on Private Machine Learning. [pdf] [code]

    • Harsh Chaudhari, Matthew Jagielski, and Alina Oprea. arXiv, 2022.
  • PerDoor: Persistent Non-Uniform Backdoors in Federated Learning using Adversarial Perturbations. [pdf] [code]

    • Manaar Alam, Esha Sarkar, and Michail Maniatakos. arXiv, 2022.
  • Towards a Defense against Backdoor Attacks in Continual Federated Learning. [pdf]

    • Shuaiqi Wang, Jonathan Hayase, Giulia Fanti, and Sewoong Oh. arXiv, 2022.
  • Client-Wise Targeted Backdoor in Federated Learning. [pdf]

    • Gorka Abad, Servio Paguada, Stjepan Picek, Víctor Julio Ramírez-Durán, and Aitor Urbieta. arXiv, 2022.
  • Backdoor Defense in Federated Learning Using Differential Testing and Outlier Detection. [pdf]

    • Yein Kim, Huili Chen, and Farinaz Koushanfar. arXiv, 2022.
  • ARIBA: Towards Accurate and Robust Identification of Backdoor Attacks in Federated Learning. [pdf]

    • Yuxi Mi, Jihong Guan, and Shuigeng Zhou. arXiv, 2022.
  • More is Better (Mostly): On the Backdoor Attacks in Federated Graph Neural Networks. [pdf]

    • Jing Xu, Rui Wang, Kaitai Liang, and Stjepan Picek. arXiv, 2022.
  • Low-Loss Subspace Compression for Clean Gains against Multi-Agent Backdoor Attacks. [pdf]

    • Siddhartha Datta and Nigel Shadbolt. arXiv, 2022.
  • Backdoors Stuck at The Frontdoor: Multi-Agent Backdoor Attacks That Backfire. [pdf]

    • Siddhartha Datta and Nigel Shadbolt. arXiv, 2022.
  • Federated Unlearning with Knowledge Distillation. [pdf]

    • Chen Wu, Sencun Zhu, and Prasenjit Mitra. arXiv, 2022.
  • Model Transferring Attacks to Backdoor HyperNetwork in Personalized Federated Learning. [pdf]

    • Phung Lai, NhatHai Phan, Abdallah Khreishah, Issa Khalil, and Xintao Wu. arXiv, 2022.
  • Backdoor Attacks on Federated Learning with Lottery Ticket Hypothesis. [pdf]

    • Zihang Zou, Boqing Gong, and Liqiang Wang. arXiv, 2021.
  • On Provable Backdoor Defense in Collaborative Learning. [pdf]

    • Ximing Qiao, Yuhua Bai, Siping Hu, Ang Li, Yiran Chen, and Hai Li. arXiv, 2021.
  • SparseFed: Mitigating Model Poisoning Attacks in Federated Learning with Sparsification. [pdf]

    • Ashwinee Panda, Saeed Mahloujifar, Arjun N. Bhagoji, Supriyo Chakraborty, and Prateek Mittal. arXiv, 2021.
  • Robust Federated Learning with Attack-Adaptive Aggregation. [pdf] [code]

    • Ching Pui Wan, and Qifeng Chen. arXiv, 2021.
  • Meta Federated Learning. [pdf]

    • Omid Aramoon, Pin-Yu Chen, Gang Qu, and Yuan Tian. arXiv, 2021.
  • FLGUARD: Secure and Private Federated Learning. [pdf]

    • Thien Duc Nguyen, Phillip Rieger, Hossein Yalame, Helen Möllering, Hossein Fereidooni, Samuel Marchal, Markus Miettinen, Azalia Mirhoseini, Ahmad-Reza Sadeghi, Thomas Schneider, and Shaza Zeitouni. arXiv, 2021.
  • Toward Robustness and Privacy in Federated Learning: Experimenting with Local and Central Differential Privacy. [pdf]

    • Mohammad Naseri, Jamie Hayes, and Emiliano De Cristofaro. arXiv, 2020.
  • Backdoor Attacks on Federated Meta-Learning. [pdf]

    • Chien-Lun Chen, Leana Golubchik, and Marco Paolieri. arXiv, 2020.
  • Dynamic Backdoor Attacks against Federated Learning. [pdf]

    • Anbu Huang. arXiv, 2020.
  • Federated Learning in Adversarial Settings. [pdf]

    • Raouf Kerkouche, Gergely Ács, and Claude Castelluccia. arXiv, 2020.
  • BlockFLA: Accountable Federated Learning via Hybrid Blockchain Architecture. [pdf]

    • Harsh Bimal Desai, Mustafa Safa Ozdayi, and Murat Kantarcioglu. arXiv, 2020.
  • Mitigating Backdoor Attacks in Federated Learning. [pdf]

    • Chen Wu, Xian Yang, Sencun Zhu, and Prasenjit Mitra. arXiv, 2020.
  • Learning to Detect Malicious Clients for Robust Federated Learning. [pdf]

    • Suyi Li, Yong Cheng, Wei Wang, Yang Liu, and Tianjian Chen. arXiv, 2020.
  • Attack-Resistant Federated Learning with Residual-based Reweighting. [pdf] [code]

    • Shuhao Fu, Chulin Xie, Bo Li, and Qifeng Chen. arXiv, 2019.

Transfer Learning

  • Incremental Learning, Incremental Backdoor Threats. [link]

    • Wenbo Jiang, Tianwei Zhang, Han Qiu, Hongwei Li, and Guowen Xu. IEEE Transactions on Dependable and Secure Computing, 2022.
  • Robust Backdoor Injection with the Capability of Resisting Network Transfer. [link]

    • Le Feng, Sheng Li, Zhenxing Qian, and Xinpeng Zhang. Information Sciences, 2022.
  • Anti-Distillation Backdoor Attacks: Backdoors Can Really Survive in Knowledge Distillation. [pdf]

    • Yunjie Ge, Qian Wang, Baolin Zheng, Xinlu Zhuang, Qi Li, Chao Shen, and Cong Wang. ACM MM, 2021.
  • Hidden Trigger Backdoor Attacks. [pdf] [code]

    • Aniruddha Saha, Akshayvarun Subramanya, and Hamed Pirsiavash. AAAI, 2020.
  • Weight Poisoning Attacks on Pre-trained Models. [pdf] [code]

    • Keita Kurita, Paul Michel, and Graham Neubig. ACL, 2020.
  • Backdoor Attacks against Transfer Learning with Pre-trained Deep Learning Models. [pdf]

    • Shuo Wang, Surya Nepal, Carsten Rudolph, Marthie Grobler, Shangyu Chen, and Tianle Chen. IEEE Transactions on Services Computing, 2020.
  • Latent Backdoor Attacks on Deep Neural Networks. [pdf]

    • Yuanshun Yao, Huiying Li, Haitao Zheng, and Ben Y. Zhao. CCS, 2019.
  • Architectural Backdoors in Neural Networks. [pdf]

    • Mikel Bober-Irizar, Ilia Shumailov, Yiren Zhao, Robert Mullins, and Nicolas Papernot. arXiv, 2022.
  • Red Alarm for Pre-trained Models: Universal Vulnerabilities by Neuron-Level Backdoor Attacks. [pdf] [code]

    • Zhengyan Zhang, Guangxuan Xiao, Yongwei Li, Tian Lv, Fanchao Qi, Zhiyuan Liu, Yasheng Wang, Xin Jiang, and Maosong Sun. arXiv, 2021.

Reinforcement Learning

  • Provable Defense against Backdoor Policies in Reinforcement Learning. [pdf] [code]

    • Shubham Kumar Bharti, Xuezhou Zhang, Adish Singla, and Jerry Zhu. NeurIPS, 2022.
  • MARNet: Backdoor Attacks against Cooperative Multi-Agent Reinforcement Learning. [link]

    • Yanjiao Chen, Zhicong Zheng, and Xueluan Gong. IEEE Transactions on Dependable and Secure Computing, 2022.
  • BACKDOORL: Backdoor Attack against Competitive Reinforcement Learning. [pdf]

    • Lun Wang, Zaynah Javed, Xian Wu, Wenbo Guo, Xinyu Xing, and Dawn Song. IJCAI, 2021.
  • Stop-and-Go: Exploring Backdoor Attacks on Deep Reinforcement Learning-based Traffic Congestion Control Systems. [pdf]

    • Yue Wang, Esha Sarkar, Michail Maniatakos, and Saif Eddin Jabari. IEEE Transactions on Information Forensics and Security, 2021.
  • Agent Manipulator: Stealthy Strategy Attacks on Deep Reinforcement Learning. [link]

    • Jinyin Chen, Xueke Wang, Yan Zhang, Haibin Zheng, Shanqing Yu, and Liang Bao. Applied Intelligence, 2022.
  • TrojDRL: Evaluation of Backdoor Attacks on Deep Reinforcement Learning. [pdf] [code]

    • Panagiota Kiourti, Kacper Wardega, Susmit Jha, and Wenchao Li. DAC, 2020.
  • Poisoning Deep Reinforcement Learning Agents with In-Distribution Triggers. [pdf]

    • Chace Ashcraft and Kiran Karra. ICLR Workshop, 2021.
  • A Temporal-Pattern Backdoor Attack to Deep Reinforcement Learning. [pdf]

    • Yinbo Yu, Jiajia Liu, Shouqing Li, Kepu Huang, and Xudong Feng. arXiv, 2022.
  • Backdoor Detection in Reinforcement Learning. [pdf]

    • Junfeng Guo, Ang Li, and Cong Liu. arXiv, 2022.
  • Design of Intentional Backdoors in Sequential Models. [pdf]

    • Zhaoyuan Yang, Naresh Iyer, Johan Reimann, and Nurali Virani. arXiv, 2019.

Semi-Supervised and Self-Supervised Learning

  • Backdoor Attacks on Self-Supervised Learning. [pdf] [code]

    • Aniruddha Saha, Ajinkya Tejankar, Soroush Abbasi Koohpayegani, and Hamed Pirsiavash. CVPR, 2022.
  • Poisoning and Backdooring Contrastive Learning. [pdf]

    • Nicholas Carlini and Andreas Terzis. ICLR, 2022.
  • BadEncoder: Backdoor Attacks to Pre-trained Encoders in Self-Supervised Learning. [pdf] [code]

    • Jinyuan Jia, Yupei Liu, and Neil Zhenqiang Gong. IEEE S&P, 2022.
  • DeHiB: Deep Hidden Backdoor Attack on Semi-supervised Learning via Adversarial Perturbation. [pdf]

    • Zhicong Yan, Gaolei Li, Yuan Tian, Jun Wu, Shenghong Li, Mingzhe Chen, and H. Vincent Poor. AAAI, 2021.
  • Deep Neural Backdoor in Semi-Supervised Learning: Threats and Countermeasures. [link]

    • Zhicong Yan, Jun Wu, Gaolei Li, Shenghong Li, and Mohsen Guizani. IEEE Transactions on Information Forensics and Security, 2021.
  • Backdoor Attacks in the Supply Chain of Masked Image Modeling. [pdf]

    • Xinyue Shen, Xinlei He, Zheng Li, Yun Shen, Michael Backes, and Yang Zhang. arXiv, 2022.
  • Watermarking Pre-trained Encoders in Contrastive Learning. [pdf]

    • Yutong Wu, Han Qiu, Tianwei Zhang, Jiwei Li, and Meikang Qiu. arXiv, 2022.

Quantization

  • RIBAC: Towards Robust and Imperceptible Backdoor Attack against Compact DNN. [pdf] [code]

    • Huy Phan, Cong Shi, Yi Xie, Tianfang Zhang, Zhuohang Li, Tianming Zhao, Jian Liu, Yan Wang, Yingying Chen, and Bo Yuan. ECCV, 2022.
  • Qu-ANTI-zation: Exploiting Quantization Artifacts for Achieving Adversarial Outcomes. [pdf] [code]

    • Sanghyun Hong, Michael-Andrei Panaitescu-Liess, Yiğitcan Kaya, and Tudor Dumitraş. NeurIPS, 2021.
  • Understanding the Threats of Trojaned Quantized Neural Network in Model Supply Chains. [pdf]

    • Xudong Pan, Mi Zhang, Yifan Yan, and Min Yang. ACSAC, 2021.
  • Quantization Backdoors to Deep Learning Models. [pdf]

    • Hua Ma, Huming Qiu, Yansong Gao, Zhi Zhang, Alsharif Abuadbba, Anmin Fu, Said Al-Sarawi, and Derek Abbott. arXiv, 2021.
  • Stealthy Backdoors as Compression Artifacts. [pdf]

    • Yulong Tian, Fnu Suya, Fengyuan Xu, and David Evans. arXiv, 2021.

Natural Language Processing

  • TrojText: Test-time Invisible Textual Trojan Insertion. [pdf] [code]

    • Qian Lou, Yepeng Liu, and Bo Feng. ICLR, 2023.
  • Defending against Backdoor Attacks in Natural Language Generation. [link]

    • Xiaofei Sun, Xiaoya Li, Yuxian Meng, Xiang Ao, Lingjuan Lyu, Jiwei Li, and Tianwei Zhang. AAAI, 2023.
  • Removing Backdoors in Pre-trained Models by Regularized Continual Pre-training. [pdf] [code]

    • Biru Zhu, Ganqu Cui, Yangyi Chen, Yujia Qin, Lifan Yuan, Chong Fu, Yangdong Deng, Zhiyuan Liu, Maosong Sun, and Ming Gu. Transactions of the Association for Computational Linguistics, 2023.
  • BadPrompt: Backdoor Attacks on Continuous Prompts. [pdf] [code]

    • Xiangrui Cai, Haidong Xu, Sihan Xu, Ying Zhang, and Xiaojie Yuan. NeurIPS, 2022.
  • Moderate-fitting as a Natural Backdoor Defender for Pre-trained Language Models. [pdf] [code]

    • Biru Zhu, Yujia Qin, Ganqu Cui, Yangyi Chen, Weilin Zhao, Chong Fu, Yangdong Deng, Zhiyuan Liu, Jingang Wang, Wei Wu, Maosong Sun, and Ming Gu. NeurIPS, 2022.
  • A Unified Evaluation of Textual Backdoor Learning: Frameworks and Benchmarks. [pdf] [code]

    • Ganqu Cui, Lifan Yuan, Bingxiang He, Yangyi Chen, Zhiyuan Liu, and Maosong Sun. NeurIPS, 2022.
  • Spinning Language Models: Risks of Propaganda-as-a-Service and Countermeasures. [pdf] [code]

    • Eugene Bagdasaryan and Vitaly Shmatikov. IEEE S&P, 2022.
  • PICCOLO: Exposing Complex Backdoors in NLP Transformer Models. [pdf] [code]

    • Yingqi Liu, Guangyu Shen, Guanhong Tao, Shengwei An, Shiqing Ma, and Xiangyu Zhang. IEEE S&P, 2022.
  • Triggerless Backdoor Attack for NLP Tasks with Clean Labels. [pdf]

    • Leilei Gan, Jiwei Li, Tianwei Zhang, Xiaoya Li, Yuxian Meng, Fei Wu, Shangwei Guo, and Chun Fan. NAACL, 2022.
  • A Study of the Attention Abnormality in Trojaned BERTs. [pdf] [code]

    • Weimin Lyu, Songzhu Zheng, Tengfei Ma, and Chao Chen. NAACL, 2022.
  • The Triggers that Open the NLP Model Backdoors Are Hidden in the Adversarial Samples. [link]

    • Kun Shao, Yu Zhang, Junan Yang, Xiaoshuai Li, and Hui Liu. Computers & Security, 2022.
  • BDDR: An Effective Defense Against Textual Backdoor Attacks. [pdf]

    • Kun Shao, Junan Yang, Yang Ai, Hui Liu, and Yu Zhang. Computers & Security, 2021.
  • BadPre: Task-agnostic Backdoor Attacks to Pre-trained NLP Foundation Models. [pdf]

    • Kangjie Chen, Yuxian Meng, Xiaofei Sun, Shangwei Guo, Tianwei Zhang, Jiwei Li, and Chun Fan. ICLR, 2022.
  • Exploring the Universal Vulnerability of Prompt-based Learning Paradigm. [pdf] [code]

    • Lei Xu, Yangyi Chen, Ganqu Cui, Hongcheng Gao, and Zhiyuan Liu. NAACL-Findings, 2022.
  • Backdoor Pre-trained Models Can Transfer to All. [pdf]

    • Lujia Shen, Shouling Ji, Xuhong Zhang, Jinfeng Li, Jing Chen, Jie Shi, Chengfang Fang, Jianwei Yin, and Ting Wang. CCS, 2021.
  • BadNL: Backdoor Attacks against NLP Models with Semantic-preserving Improvements. [pdf] [arXiv-20]

    • Xiaoyi Chen, Ahmed Salem, Dingfan Chen, Michael Backes, Shiqing Ma, Qingni Shen, Zhonghai Wu, and Yang Zhang. ACSAC, 2021.
  • Backdoor Attacks on Pre-trained Models by Layerwise Weight Poisoning. [pdf]

    • Linyang Li, Demin Song, Xiaonan Li, Jiehang Zeng, Ruotian Ma, and Xipeng Qiu. EMNLP, 2021.
  • T-Miner: A Generative Approach to Defend Against Trojan Attacks on DNN-based Text Classification. [pdf]

    • Ahmadreza Azizi, Ibrahim Asadullah Tahmid, Asim Waheed, Neal Mangaokar, Jiameng Pu, Mobin Javed, Chandan K. Reddy, and Bimal Viswanath. USENIX Security, 2021.
  • RAP: Robustness-Aware Perturbations for Defending against Backdoor Attacks on NLP Models. [pdf] [code]

    • Wenkai Yang, Yankai Lin, Peng Li, Jie Zhou, and Xu Sun. EMNLP, 2021.
  • ONION: A Simple and Effective Defense Against Textual Backdoor Attacks. [pdf]

    • Fanchao Qi, Yangyi Chen, Mukai Li, Zhiyuan Liu, and Maosong Sun. EMNLP, 2021.
  • Mind the Style of Text! Adversarial and Backdoor Attacks Based on Text Style Transfer. [pdf] [code]

    • Fanchao Qi, Yangyi Chen, Xurui Zhang, Mukai Li, Zhiyuan Liu, and Maosong Sun. EMNLP, 2021.
  • Rethinking Stealthiness of Backdoor Attack against NLP Models. [pdf] [code]

    • Wenkai Yang, Yankai Lin, Peng Li, Jie Zhou, and Xu Sun. ACL, 2021.
  • Turn the Combination Lock: Learnable Textual Backdoor Attacks via Word Substitution. [pdf]

    • Fanchao Qi, Yuan Yao, Sophia Xu, Zhiyuan Liu, and Maosong Sun. ACL, 2021.
  • Hidden Killer: Invisible Textual Backdoor Attacks with Syntactic Trigger. [pdf] [code]

    • Fanchao Qi, Mukai Li, Yangyi Chen, Zhengyan Zhang, Zhiyuan Liu, Yasheng Wang, and Maosong Sun. ACL, 2021.
  • Mitigating Data Poisoning in Text Classification with Differential Privacy. [pdf]

    • Chang Xu, Jun Wang, Francisco Guzmán, Benjamin I. P. Rubinstein, and Trevor Cohn. EMNLP-Findings, 2021.
  • BFClass: A Backdoor-free Text Classification Framework. [pdf] [code]

    • Zichao Li, Dheeraj Mekala, Chengyu Dong, and Jingbo Shang. EMNLP-Findings, 2021.
  • Be Careful about Poisoned Word Embeddings: Exploring the Vulnerability of the Embedding Layers in NLP Models. [pdf] [code]

    • Wenkai Yang, Lei Li, Zhiyuan Zhang, Xuancheng Ren, Xu Sun, and Bin He. NAACL-HLT, 2021.
  • Neural Network Surgery: Injecting Data Patterns into Pre-trained Models with Minimal Instance-wise Side Effects. [pdf]

    • Zhiyuan Zhang, Xuancheng Ren, Qi Su, Xu Sun, and Bin He. NAACL-HLT, 2021.
  • Text Backdoor Detection Using An Interpretable RNN Abstract Model. [link]

    • Ming Fan, Ziliang Si, Xiaofei Xie, Yang Liu, and Ting Liu. IEEE Transactions on Information Forensics and Security, 2021.
  • Textual Backdoor Attack for the Text Classification System. [pdf]

    • Hyun Kwon and Sanghyun Lee. Security and Communication Networks, 2021.
  • Weight Poisoning Attacks on Pre-trained Models. [pdf] [code]

    • Keita Kurita, Paul Michel, and Graham Neubig. ACL, 2020.
  • Poison Attacks against Text Datasets with Conditional Adversarially Regularized Autoencoder. [pdf]

    • Alvin Chan, Yi Tay, Yew-Soon Ong, and Aston Zhang. EMNLP-Findings, 2020.
  • A Backdoor Attack Against LSTM-based Text Classification Systems. [pdf]

    • Jiazhu Dai, Chuanshuai Chen, and Yufeng Li. IEEE Access, 2019.
  • PerD: Perturbation Sensitivity-based Neural Trojan Detection Framework on NLP Applications. [pdf]

    • Diego Garcia-soto, Huili Chen, and Farinaz Koushanfar. arXiv, 2022.
  • Kallima: A Clean-label Framework for Textual Backdoor Attacks. [pdf]

    • Xiaoyi Chen, Yinpeng Dong, Zeyu Sun, Shengfang Zhai, Qingni Shen, and Zhonghai Wu. arXiv, 2022.
  • Textual Backdoor Attacks with Iterative Trigger Injection. [pdf] [code]

    • Jun Yan, Vansh Gupta, and Xiang Ren. arXiv, 2022.
  • WeDef: Weakly Supervised Backdoor Defense for Text Classification. [pdf]

    • Lesheng Jin, Zihan Wang, and Jingbo Shang. arXiv, 2022.
  • Constrained Optimization with Dynamic Bound-scaling for Effective NLP Backdoor Defense. [pdf]

    • Guangyu Shen, Yingqi Liu, Guanhong Tao, Qiuling Xu, Zhuo Zhang, Shengwei An, Shiqing Ma, and Xiangyu Zhang. arXiv, 2022.
  • Rethink Stealthy Backdoor Attacks in Natural Language Processing. [pdf]

    • Lingfeng Shen, Haiyun Jiang, Lemao Liu, and Shuming Shi. arXiv, 2022.
  • Textual Backdoor Attacks Can Be More Harmful via Two Simple Tricks. [pdf]

    • Yangyi Chen, Fanchao Qi, Zhiyuan Liu, and Maosong Sun. arXiv, 2021.
  • Spinning Sequence-to-Sequence Models with Meta-Backdoors. [pdf]

    • Eugene Bagdasaryan and Vitaly Shmatikov. arXiv, 2021.
  • Defending against Backdoor Attacks in Natural Language Generation. [pdf] [code]

    • Chun Fan, Xiaoya Li, Yuxian Meng, Xiaofei Sun, Xiang Ao, Fei Wu, Jiwei Li, and Tianwei Zhang. arXiv, 2021.
  • Hidden Backdoors in Human-Centric Language Models. [pdf]

    • Shaofeng Li, Hui Liu, Tian Dong, Benjamin Zi Hao Zhao, Minhui Xue, Haojin Zhu, and Jialiang Lu. arXiv, 2021.
  • Detecting Universal Trigger’s Adversarial Attack with Honeypot. [pdf]

    • Thai Le, Noseong Park, and Dongwon Lee. arXiv, 2020.
  • Mitigating Backdoor Attacks in LSTM-based Text Classification Systems by Backdoor Keyword Identification. [pdf]

    • Chuanshuai Chen, and Jiazhu Dai. arXiv, 2020.
  • Trojaning Language Models for Fun and Profit. [pdf]

    • Xinyang Zhang, Zheng Zhang, and Ting Wang. arXiv, 2020.

Graph Neural Networks

  • Transferable Graph Backdoor Attack. [pdf]

    • Shuiqiao Yang, Bao Gia Doan, Paul Montague, Olivier De Vel, Tamas Abraham, Seyit Camtepe, Damith C. Ranasinghe, and Salil S. Kanhere. RAID, 2022.
  • More is Better (Mostly): On the Backdoor Attacks in Federated Graph Neural Networks. [pdf]

    • Jing Xu, Rui Wang, Kaitai Liang, and Stjepan Picek. ACSAC, 2022.
  • Graph Backdoor. [pdf] [code]

    • Zhaohan Xi, Ren Pang, Shouling Ji, and Ting Wang. USENIX Security, 2021.
  • Backdoor Attacks to Graph Neural Networks. [pdf]

    • Zaixi Zhang, Jinyuan Jia, Binghui Wang, and Neil Zhenqiang Gong. SACMAT, 2021.
  • Defending Against Backdoor Attack on Graph Nerual Network by Explainability. [pdf]

    • Bingchen Jiang and Zhao Li. arXiv, 2022.
  • Link-Backdoor: Backdoor Attack on Link Prediction via Node Injection. [pdf] [code]

    • Haibin Zheng, Haiyang Xiong, Haonan Ma, Guohan Huang, and Jinyin Chen. arXiv, 2022.
  • Neighboring Backdoor Attacks on Graph Convolutional Network. [pdf]

    • Liang Chen, Qibiao Peng, Jintang Li, Yang Liu, Jiawei Chen, Yong Li, and Zibin Zheng. arXiv, 2022.
  • Dyn-Backdoor: Backdoor Attack on Dynamic Link Prediction. [pdf]

    • Jinyin Chen, Haiyang Xiong, Haibin Zheng, Jian Zhang, Guodong Jiang, and Yi Liu. arXiv, 2021.
  • Explainability-based Backdoor Attacks Against Graph Neural Networks. [pdf]

    • Jing Xu, Minhui Xue, and Stjepan Picek. arXiv, 2021.

Point Cloud

  • A Backdoor Attack against 3D Point Cloud Classifiers. [pdf] [code]

    • Zhen Xiang, David J. Miller, Siheng Chen, Xi Li, and George Kesidis. ICCV, 2021.
  • PointBA: Towards Backdoor Attacks in 3D Point Cloud. [pdf]

    • Xinke Li, Zhiru Chen, Yue Zhao, Zekun Tong, Yabang Zhao, Andrew Lim, and Joey Tianyi Zhou. ICCV, 2021.
  • Imperceptible and Robust Backdoor Attack in 3D Point Cloud. [pdf]

    • Kuofeng Gao, Jiawang Bai, Baoyuan Wu, Mengxi Ya, and Shu-Tao Xia. arXiv, 2022.
  • Detecting Backdoor Attacks Against Point Cloud Classifiers. [pdf]

    • Zhen Xiang, David J. Miller, Siheng Chen, Xi Li, and George Kesidis. arXiv, 2021.
  • Poisoning MorphNet for Clean-Label Backdoor Attack to Point Clouds. [pdf]

    • Guiyu Tian, Wenhao Jiang, Wei Liu, and Yadong Mu. arXiv, 2021.

Acoustics Signal Processing

  • Going in Style: Audio Backdoors Through Stylistic Transformations. [pdf]

    • Stefanos Koffas, Luca Pajola, Stjepan Picek, and Mauro Conti. ICASSP, 2023.
  • VenoMave: Targeted Poisoning Against Speech Recognition. [pdf]

    • Hojjat Aghakhani, Lea Schönherr, Thorsten Eisenhofer, Dorothea Kolossa, Thorsten Holz, Christopher Kruegel, and Giovanni Vigna. SaTML, 2023.
  • Stealthy Backdoor Attack Against Speaker Recognition Using Phase-Injection Hidden Trigger. [link]

    • Zhe Ye, Diqun Yan, Li Dong, Jiacheng Deng, and Shui Yu. IEEE Signal Processing Letters, 2023.
  • Opportunistic Backdoor Attacks: Exploring Human-imperceptible Vulnerabilities on Speech Recognition Systems. [link] [code]

    • Qiang Liu, Tongqing Zhou, Zhiping Cai, and Yonghao Tang. ACM MM, 2022.
  • Audio-domain Position-independent Backdoor Attack via Unnoticeable Triggers. [pdf]

    • Cong Shi, Tianfang Zhang, Zhuohang Li, Huy Phan, Tianming Zhao, Yan Wang, Jian Liu, Bo Yuan, and Yingying Chen. ACM MobiCom, 2022.
  • Can You Hear It? Backdoor Attacks via Ultrasonic Triggers. [pdf] [code]

    • Stefanos Koffas, Jing Xu, Mauro Conti, and Stjepan Picek. WiseML, 2022.
  • Natural Backdoor Attacks on Speech Recognition Models. [link]

    • Jinwen Xin, Xixiang Lyu, and Jing Ma. ML4CS, 2022.
  • DriNet: Dynamic Backdoor Attack against Automatic Speech Recognization Models. [link]

    • Jianbin Ye, Xiaoyuan Liu, Zheng You, Guowei Li, and Bo Liu. Applied Sciences, 2022.
  • Backdoor Attack against Speaker Verification. [pdf] [code]

    • Tongqing Zhai, Yiming Li, Ziqi Zhang, Baoyuan Wu, Yong Jiang, and Shu-Tao Xia. ICASSP, 2021.
  • A Novel Trojan Attack against Co-learning Based ASR DNN System. [link]

    • Mingxuan Li, Xiao Wang, Dongdong Huo, Han Wang, Chao Liu, Yazhe Wang, Yu Wang, and Zhen Xu. CSCWD, 2021.
  • Towards Stealthy Backdoor Attacks against Speech Recognition via Elements of Sound. [pdf]

    • Hanbo Cai, Pengcheng Zhang, Hai Dong, Yan Xiao, Stefanos Koffas, and Yiming Li. arXiv, 2023.
  • Fake the Real: Backdoor Attack on Deep Speech Classification via Voice Conversion. [pdf]

    • Zhe Ye, Terui Mao, Li Dong, and Diqun Yan. arXiv, 2023.
  • Adversarial Audio: A New Information Hiding Method and Backdoor for DNN-based Speech Recognition Models. [pdf] [code]

    • Yehao Kong and Jiliang Zhang. arXiv, 2022.

Medical Science

  • FIBA: Frequency-Injection based Backdoor Attack in Medical Image Analysis. [pdf]

    • Yu Feng, Benteng Ma, Jing Zhang, Shanshan Zhao, Yong Xia, and Dacheng Tao. CVPR, 2022.
  • Exploiting Missing Value Patterns for a Backdoor Attack on Machine Learning Models of Electronic Health Records: Development and Validation Study. [link]

    • Byunggill Joe, Yonghyeon Park, Jihun Hamm, Insik Shin, and Jiyeon Lee. JMIR Medical Informatics, 2022.
  • Machine Learning with Electronic Health Records is vulnerable to Backdoor Trigger Attacks. [pdf]

    • Byunggill Joe, Akshay Mehra, Insik Shin, and Jihun Hamm. AAAI Workshop, 2021.
  • Explainability Matters: Backdoor Attacks on Medical Imaging. [pdf]

    • Munachiso Nwadike, Takumi Miyawaki, Esha Sarkar, Michail Maniatakos, and Farah Shamout. AAAI Workshop, 2021.
  • TRAPDOOR: Repurposing Backdoors to Detect Dataset Bias in Machine Learning-based Genomic Analysis. [pdf]

    • Esha Sarkar and Michail Maniatakos. arXiv, 2021.

Vision Transformer

  • You Are Catching My Attention: Are Vision Transformers Bad Learners Under Backdoor Attacks? [pdf]

    • Zenghui Yuan, Pan Zhou, Kai Zou, and Yu Cheng. CVPR, 2023.
  • TrojViT: Trojan Insertion in Vision Transformers. [pdf]

    • Mengxin Zheng, Qian Lou, and Lei Jiang. CVPR, 2023.
  • Defending Backdoor Attacks on Vision Transformer via Patch Processing. [link]

    • Khoa D. Doan, Yingjie Lao, Peng Yang, and Ping Li. AAAI, 2023.
  • Backdoor Attacks on Vision Transformers. [pdf] [code]

    • Akshayvarun Subramanya, Aniruddha Saha, Soroush Abbasi Koohpayegani, Ajinkya Tejankar, and Hamed Pirsiavash. arXiv, 2022.
  • Attention Hijacking in Trojan Transformers. [pdf]

    • Weimin Lyu, Songzhu Zheng, Tengfei Ma, Haibin Ling, and Chao Chen. arXiv, 2022.
  • DBIA: Data-free Backdoor Injection Attack against Transformer Networks. [pdf] [code]

    • Peizhuo Lv, Hualong Ma, Jiachen Zhou, Ruigang Liang, Kai Chen, Shengzhi Zhang, and Yunfei Yang. arXiv, 2021.

Diffusion Model

  • How to Backdoor Diffusion Models? [pdf] [code]

    • Sheng-Yen Chou, Pin-Yu Chen, and Tsung-Yi Ho. CVPR, 2023.
  • TrojDiff: Trojan Attacks on Diffusion Models With Diverse Targets. [pdf] [code]

    • Weixin Chen, Dawn Song, and Bo Li. CVPR, 2023.

Cybersecurity

  • VulnerGAN: A Backdoor Attack through Vulnerability Amplification against Machine Learning-based Network Intrusion Detection Systems. [link] [code]

    • Guangrui Liu, Weizhe Zhang, Xinjie Li, Kaisheng Fan, and Shui Yu. Information Sciences, 2022.
  • Explanation-Guided Backdoor Poisoning Attacks Against Malware Classifiers. [pdf]

    • Giorgio Severi, Jim Meyer, Scott Coull, and Alina Oprea. USENIX Security, 2021.
  • Backdoor Attack on Machine Learning Based Android Malware Detectors. [link]

    • Chaoran Li, Xiao Chen, Derui Wang, Sheng Wen, Muhammad Ejaz Ahmed, Seyit Camtepe, and Yang Xiang. IEEE Transactions on Dependable and Secure Computing, 2021.
  • Jigsaw Puzzle: Selective Backdoor Attack to Subvert Malware Classifiers. [pdf]

    • Limin Yang, Zhi Chen, Jacopo Cortellazzi, Feargus Pendlebury, Kevin Tu, Fabio Pierazzi, Lorenzo Cavallaro, and Gang Wang. arXiv, 2022.

Detection and Tracking

  • Clean-image Backdoor: Attacking Multi-label Models with Poisoned Labels Only. [pdf] [code]

    • Kangjie Chen, Xiaoxuan Lou, Guowen Xu, Jiwei Li, and Tianwei Zhang. ICLR, 2023.
  • Untargeted Backdoor Attack against Object Detection. [pdf] [code]

    • Chengxiao Luo, Yiming Li, Yong Jiang, and Shu-Tao Xia. ICASSP, 2023.
  • Few-Shot Backdoor Attacks on Visual Object Tracking. [pdf] [code]

    • Yiming Li, Haoxiang Zhong, Xingjun Ma, Yong Jiang, and Shu-Tao Xia. ICLR, 2022.
  • BadDet: Backdoor Attacks on Object Detection. [pdf]

    • Shih-Han Chan, Yinpeng Dong, Jun Zhu, Xiaolu Zhang, and Jun Zhou. ECCV Workshop, 2022.
  • Attacking by Aligning: Clean-Label Backdoor Attacks on Object Detection. [pdf]

    • Yize Cheng, Wenbin Hu, and Minhao Cheng. arXiv, 2023.
  • TAT: Targeted Backdoor Attacks against Visual Object Tracking. [link] [code]

    • Ziyi Cheng, Baoyuan Wu, Zhenya Zhang, and Jianjun Zhao. Pattern Recognition, 2023.

Others

  • The Dark Side of AutoML: Towards Architectural Backdoor Search. [pdf] [code]

    • Ren Pang, Changjiang Li, Zhaohan Xi, Shouling Ji, and Ting Wang. ICLR, 2023.
  • Backdoor Attacks Against Deep Image Compression via Adaptive Frequency Trigger. [pdf]

    • Yi Yu, Yufei Wang, Wenhan Yang, Shijian Lu, Yap-Peng Tan, and Alex C. Kot. CVPR, 2023.
  • The Dark Side of Dynamic Routing Neural Networks: Towards Efficiency Backdoor Injection. [pdf]

    • Simin Chen, Hanlin Chen, Mirazul Haque, Cong Liu, and Wei Yang. CVPR, 2023.
  • Backdoor Attacks on Crowd Counting. [pdf]

    • Yuhua Sun, Tailai Zhang, Xingjun Ma, Pan Zhou, Jian Lou, Zichuan Xu, Xing Di, Yu Cheng, and Lichao Sun. ACM MM, 2022.
  • Backdoor Attacks on the DNN Interpretation System. [pdf]

    • Shihong Fang and Anna Choromanska. AAAI, 2022.
  • The Devil is in the GAN: Defending Deep Generative Models Against Backdoor Attacks. [pdf] [code] [demo]

    • Ambrish Rawat, Killian Levacher, and Mathieu Sinn. ESORICS, 2022.
  • Object-Oriented Backdoor Attack Against Image Captioning. [link]

    • Meiling Li, Nan Zhong, Xinpeng Zhang, Zhenxing Qian, and Sheng Li. ICASSP, 2022.
  • When Does Backdoor Attack Succeed in Image Reconstruction? A Study of Heuristics vs. Bi-Level Solution. [link]

    • Vardaan Taneja, Pin-Yu Chen, Yuguang Yao, and Sijia Liu. ICASSP, 2022.
  • An Interpretive Perspective: Adversarial Trojaning Attack on Neural-Architecture-Search Enabled Edge AI Systems. [link]

    • Ship Peng Xu, Ke Wang, Md. Rafiul Hassan, Mohammad Mehedi Hassan, and Chien-Ming Chen. IEEE Transactions on Industrial Informatics, 2022.
  • A Triggerless Backdoor Attack and Defense Mechanism for Intelligent Task Offloading in Multi-UAV Systems. [link]

    • Shafkat Islam, Shahriar Badsha, Ibrahim Khalil, Mohammed Atiquzzaman, and Charalambos Konstantinou. IEEE Internet of Things Journal, 2022.
  • Multi-Target Invisibly Trojaned Networks for Visual Recognition and Detection. [pdf]

    • Xinzhe Zhou, Wenhao Jiang, Sheng Qi, and Yadong Mu. IJCAI, 2021.
  • Hidden Backdoor Attack against Semantic Segmentation Models. [pdf]

    • Yiming Li, Yanjie Li, Yalei Lv, Yong Jiang, and Shu-Tao Xia. ICLR Workshop, 2021.
  • Adversarial Targeted Forgetting in Regularization and Generative Based Continual Learning Models. [link]

    • Muhammad Umer and Robi Polikar. IJCNN, 2021.
  • Targeted Forgetting and False Memory Formation in Continual Learners through Adversarial Backdoor Attacks. [pdf]

    • Muhammad Umer, Glenn Dawson, and Robi Polikar. IJCNN, 2020.
  • Trojan Attacks on Wireless Signal Classification with Adversarial Machine Learning. [pdf]

    • Kemal Davaslioglu, and Yalin E. Sagduyu. DySPAN, 2019.
  • BadHash: Invisible Backdoor Attacks against Deep Hashing with Clean Label. [pdf]

    • Shengshan Hu, Ziqi Zhou, Yechao Zhang, Leo Yu Zhang, Yifeng Zheng, Yuanyuan HE, and Hai Jin. arXiv, 2022.
  • A Temporal Chrominance Trigger for Clean-label Backdoor Attack against Anti-spoof Rebroadcast Detection. [pdf]

    • Wei Guo, Benedetta Tondi, and Mauro Barni. arXiv, 2022.
  • MACAB: Model-Agnostic Clean-Annotation Backdoor to Object Detection with Natural Trigger in Real-World. [pdf]

    • Hua Ma, Yinshan Li, Yansong Gao, Zhi Zhang, Alsharif Abuadbba, Anmin Fu, Said F. Al-Sarawi, Nepal Surya, and Derek Abbott. arXiv, 2022.
  • BadDet: Backdoor Attacks on Object Detection. [pdf]

    • Shih-Han Chan, Yinpeng Dong, Jun Zhu, Xiaolu Zhang, and Jun Zhou. arXiv, 2022.
  • Backdoor Attacks on Bayesian Neural Networks using Reverse Distribution. [pdf]

    • Zhixin Pan and Prabhat Mishra. arXiv, 2022.
  • Backdooring Explainable Machine Learning. [pdf]

    • Maximilian Noppel, Lukas Peter, and Christian Wressnegger. arXiv, 2022.
  • Clean-Annotation Backdoor Attack against Lane Detection Systems in the Wild. [pdf]

    • Xingshuo Han, Guowen Xu, Yuan Zhou, Xuehuan Yang, Jiwei Li, and Tianwei Zhang. arXiv, 2022.
  • Dangerous Cloaking: Natural Trigger based Backdoor Attacks on Object Detectors in the Physical World. [pdf]

    • Hua Ma, Yinshan Li, Yansong Gao, Alsharif Abuadbba, Zhi Zhang, Anmin Fu, Hyoungshick Kim, Said F. Al-Sarawi, Nepal Surya, and Derek Abbott. arXiv, 2022.
  • Targeted Trojan-Horse Attacks on Language-based Image Retrieval. [pdf]

    • Fan Hu, Aozhu Chen, and Xirong Li. arXiv, 2022.
  • Is Multi-Modal Necessarily Better? Robustness Evaluation of Multi-modal Fake News Detection. [pdf]

    • Jinyin Chen, Chengyu Jia, Haibin Zheng, Ruoxi Chen, and Chenbo Fu. arXiv, 2022.
  • Dual-Key Multimodal Backdoors for Visual Question Answering. [pdf]

    • Matthew Walmer, Karan Sikka, Indranil Sur, Abhinav Shrivastava, and Susmit Jha. arXiv, 2021.
  • Clean-label Backdoor Attack against Deep Hashing based Retrieval. [pdf]

    • Kuofeng Gao, Jiawang Bai, Bin Chen, Dongxian Wu, and Shu-Tao Xia. arXiv, 2021.
  • Backdoor Attacks on Network Certification via Data Poisoning. [pdf]

    • Tobias Lorenz, Marta Kwiatkowska, and Mario Fritz. arXiv, 2021.
  • Backdoor Attack and Defense for Deep Regression. [pdf]

    • Xi Li, George Kesidis, David J. Miller, and Vladimir Lucic. arXiv, 2021.
  • BAAAN: Backdoor Attacks Against Autoencoder and GAN-Based Machine Learning Models. [pdf]

    • Ahmed Salem, Yannick Sautter, Michael Backes, Mathias Humbert, and Yang Zhang. arXiv, 2020.
  • DeepObliviate: A Powerful Charm for Erasing Data Residual Memory in Deep Neural Networks. [pdf]

    • Yingzhe He, Guozhu Meng, Kai Chen, Jinwen He, and Xingbo Hu. arXiv, 2021.
  • Backdoors in Neural Models of Source Code. [pdf]

    • Goutham Ramakrishnan, and Aws Albarghouthi. arXiv, 2020.
  • EEG-Based Brain-Computer Interfaces Are Vulnerable to Backdoor Attacks. [pdf]

    • Lubin Meng, Jian Huang, Zhigang Zeng, Xue Jiang, Shan Yu, Tzyy-Ping Jung, Chin-Teng Lin, Ricardo Chavarriaga, and Dongrui Wu. arXiv, 2020.
  • Bias Busters: Robustifying DL-based Lithographic Hotspot Detectors Against Backdooring Attacks. [pdf]

    • Kang Liu, Benjamin Tan, Gaurav Rajavendra Reddy, Siddharth Garg, Yiorgos Makris, and Ramesh Karri. arXiv, 2020.

Evaluation and Discussion

  • Distilling Cognitive Backdoor Patterns within an Image. [pdf] [code]

    • Hanxun Huang, Xingjun Ma, Sarah Monazam Erfani, and James Bailey. ICLR, 2023.
  • Finding Naturally Occurring Physical Backdoors in Image Datasets. [pdf] [code]

    • Emily Wenger, Roma Bhattacharjee, Arjun Nitin Bhagoji, Josephine Passananti, Emilio Andere, Heather Zheng, and Ben Zhao. NeurIPS, 2022.
  • BackdoorBox: A Python Toolbox for Backdoor Learning. [pdf] [code]

    • Yiming Li, Mengxi Ya, Yang Bai, Yong Jiang, and Shu-Tao Xia. ICLR Workshop, 2023.
  • TROJANZOO: Everything You Ever Wanted to Know about Neural Backdoors (But were Afraid to Ask). [pdf] [code]

    • Ren Pang, Zheng Zhang, Xiangshan Gao, Zhaohan Xi, Shouling Ji, Peng Cheng, and Ting Wang. EuroS&P, 2022.
  • A Unified Evaluation of Textual Backdoor Learning: Frameworks and Benchmarks. [pdf] [code]

    • Ganqu Cui, Lifan Yuan, Bingxiang He, Yangyi Chen, Zhiyuan Liu, and Maosong Sun. NeurIPS, 2022.
  • BackdoorBench: A Comprehensive Benchmark of Backdoor Learning. [pdf] [code] [website]

    • Baoyuan Wu, Hongrui Chen, Mingda Zhang, Zihao Zhu, Shaokui Wei, Danni Yuan, Chao Shen, and Hongyuan Zha. NeurIPS, 2022.
  • Backdoor Defense via Decoupling the Training Process. [pdf] [code]

    • Kunzhe Huang, Yiming Li, Baoyuan Wu, Zhan Qin, and Kui Ren. ICLR, 2022.
  • How to Inject Backdoors with Better Consistency: Logit Anchoring on Clean Data. [pdf]

    • Zhiyuan Zhang, Lingjuan Lyu, Weiqiang Wang, Lichao Sun, and Xu Sun. ICLR, 2022.
  • Defending against Model Stealing via Verifying Embedded External Features. [pdf] [code]

    • Yiming Li, Linghui Zhu, Xiaojun Jia, Yong Jiang, Shu-Tao Xia, and Xiaochun Cao. AAAI, 2022. (Discusses the limitations of using backdoor attacks for model watermarking.)
  • Susceptibility & Defense of Satellite Image-trained Convolutional Networks to Backdoor Attacks. [link]

    • Ethan Brewer, Jason Lin, and Dan Runfola. Information Sciences, 2021.
  • Data-Efficient Backdoor Attacks. [pdf] [code]

    • Pengfei Xia, Ziqiang Li, Wei Zhang, and Bin Li. IJCAI, 2022.
  • Excess Capacity and Backdoor Poisoning. [pdf]

    • Naren Sarayu Manoj and Avrim Blum. NeurIPS, 2021.
  • Just How Toxic is Data Poisoning? A Unified Benchmark for Backdoor and Data Poisoning Attacks. [pdf] [code]

    • Avi Schwarzschild, Micah Goldblum, Arjun Gupta, John P Dickerson, and Tom Goldstein. ICML, 2021.
  • Rethinking the Backdoor Attacks' Triggers: A Frequency Perspective. [pdf]

    • Yi Zeng, Won Park, Z. Morley Mao, and Ruoxi Jia. ICCV, 2021.
  • Backdoor Attacks Against Deep Learning Systems in the Physical World. [pdf] [Master Thesis]

    • Emily Wenger, Josephine Passananti, Yuanshun Yao, Haitao Zheng, and Ben Y. Zhao. CVPR, 2021.
  • Can Optical Trojans Assist Adversarial Perturbations? [pdf]

    • Adith Boloor, Tong Wu, Patrick Naughton, Ayan Chakrabarti, Xuan Zhang, and Yevgeniy Vorobeychik. ICCV Workshop, 2021.
  • On the Trade-off between Adversarial and Backdoor Robustness. [pdf]

    • Cheng-Hsin Weng, Yan-Ting Lee, and Shan-Hung Wu. NeurIPS, 2020.
  • A Tale of Evil Twins: Adversarial Inputs versus Poisoned Models. [pdf] [code]

    • Ren Pang, Hua Shen, Xinyang Zhang, Shouling Ji, Yevgeniy Vorobeychik, Xiapu Luo, Alex Liu, and Ting Wang. CCS, 2020.
  • Systematic Evaluation of Backdoor Data Poisoning Attacks on Image Classifiers. [pdf]

    • Loc Truong, Chace Jones, Brian Hutchinson, Andrew August, Brenda Praggastis, Robert Jasper, Nicole Nichols, and Aaron Tuor. CVPR Workshop, 2020.
  • On Evaluating Neural Network Backdoor Defenses. [pdf]

    • Akshaj Veldanda, and Siddharth Garg. NeurIPS Workshop, 2020.
  • Attention Hijacking in Trojan Transformers. [pdf]

    • Weimin Lyu, Songzhu Zheng, Tengfei Ma, Haibin Ling, and Chao Chen. arXiv, 2022.
  • Game of Trojans: A Submodular Byzantine Approach. [pdf]

    • Dinuka Sahabandu, Arezoo Rajabi, Luyao Niu, Bo Li, Bhaskar Ramasubramanian, and Radha Poovendran. arXiv, 2022.
  • Auditing Visualizations: Transparency Methods Struggle to Detect Anomalous Behavior. [pdf] [code]

    • Jean-Stanislas Denain and Jacob Steinhardt. arXiv, 2022.
  • Can Backdoor Attacks Survive Time-Varying Models? [pdf]

    • Huiying Li, Arjun Nitin Bhagoji, Ben Y. Zhao, and Haitao Zheng. arXiv, 2022.
  • Dynamic Backdoor Attacks with Global Average Pooling. [pdf] [code]

    • Stefanos Koffas, Stjepan Picek, and Mauro Conti. arXiv, 2022.
  • Planting Undetectable Backdoors in Machine Learning Models. [pdf]

    • Shafi Goldwasser, Michael P. Kim, Vinod Vaikuntanathan, and Or Zamir. arXiv, 2022.
  • Towards A Critical Evaluation of Robustness for Deep Learning Backdoor Countermeasures. [pdf]

    • Huming Qiu, Hua Ma, Zhi Zhang, Alsharif Abuadbba, Wei Kang, Anmin Fu, and Yansong Gao. arXiv, 2022.
  • Neural Network Trojans Analysis and Mitigation from the Input Domain. [pdf]

    • Zhenting Wang, Hailun Ding, Juan Zhai, and Shiqing Ma. arXiv, 2022.
  • Widen The Backdoor To Let More Attackers In. [pdf]

    • Siddhartha Datta, Giulio Lovisotto, Ivan Martinovic, and Nigel Shadbolt. arXiv, 2021.
  • Backdoor Learning Curves: Explaining Backdoor Poisoning Beyond Influence Functions. [pdf]

    • Antonio Emanuele Cinà, Kathrin Grosse, Sebastiano Vascon, Ambra Demontis, Battista Biggio, Fabio Roli, and Marcello Pelillo. arXiv, 2021.
  • Rethinking the Trigger of Backdoor Attack. [pdf]

    • Yiming Li, Tongqing Zhai, Baoyuan Wu, Yong Jiang, Zhifeng Li, and Shu-Tao Xia. arXiv, 2020.
  • Poisoned Classifiers are Not Only Backdoored, They are Fundamentally Broken. [pdf] [code]

    • Mingjie Sun, Siddhant Agarwal, and J. Zico Kolter. ICLR Workshop, 2021.
  • Effect of Backdoor Attacks over the Complexity of the Latent Space Distribution. [pdf] [code]

    • Henry D. Chacon, and Paul Rad. arXiv, 2020.
  • Trembling Triggers: Exploring the Sensitivity of Backdoors in DNN-based Face Recognition. [pdf]

    • Cecilia Pasquini, and Rainer Böhme. EURASIP Journal on Information Security, 2020.
  • Noise-response Analysis for Rapid Detection of Backdoors in Deep Neural Networks. [pdf]

    • N. Benjamin Erichson, Dane Taylor, Qixuan Wu, and Michael W. Mahoney. arXiv, 2020.

Backdoor Attack for Positive Purposes

  • Untargeted Backdoor Watermark: Towards Harmless and Stealthy Dataset Copyright Protection. [pdf] [code]

    • Yiming Li, Yang Bai, Yong Jiang, Yong Yang, Shu-Tao Xia, and Bo Li. NeurIPS, 2022.
  • Membership Inference via Backdooring. [pdf] [code]

    • Hongsheng Hu, Zoran Salcic, Gillian Dobbie, Jinjun Chen, Lichao Sun, and Xuyun Zhang. IJCAI, 2022.
  • Neural Network Surgery: Injecting Data Patterns into Pre-trained Models with Minimal Instance-wise Side Effects. [pdf]

    • Zhiyuan Zhang, Xuancheng Ren, Qi Su, Xu Sun, and Bin He. NAACL-HLT, 2021.
  • One Step Further: Evaluating Interpreters using Metamorphic Testing. [pdf]

    • Ming Fan, Jiali Wei, Wuxia Jin, Zhou Xu, Wenying Wei, and Ting Liu. ISSTA, 2022.
  • What Do You See? Evaluation of Explainable Artificial Intelligence (XAI) Interpretability through Neural Backdoors. [pdf]

    • Yi-Shan Lin, Wen-Chuan Lee, and Z. Berkay Celik. KDD, 2021.
  • Using Honeypots to Catch Adversarial Attacks on Neural Networks. [pdf]

    • Shawn Shan, Emily Wenger, Bolun Wang, Bo Li, Haitao Zheng, and Ben Y. Zhao. CCS, 2020. (Note: this defense was recently bypassed by Nicholas Carlini. [arXiv])
  • Turning Your Weakness into a Strength: Watermarking Deep Neural Networks by Backdooring. [pdf] [code]

    • Yossi Adi, Carsten Baum, Moustapha Cisse, Benny Pinkas, and Joseph Keshet. USENIX Security, 2018.
  • Open-sourced Dataset Protection via Backdoor Watermarking. [pdf] [code]

    • Yiming Li, Ziqi Zhang, Jiawang Bai, Baoyuan Wu, Yong Jiang, and Shu-Tao Xia. NeurIPS Workshop, 2020.
  • Protecting Deep Cerebrospinal Fluid Cell Image Processing Models with Backdoor and Semi-Distillation. [link]

    • FangQi Li, Shilin Wang, and Zhenhai Wang. DICTA, 2021.
  • Debiasing Backdoor Attack: A Benign Application of Backdoor Attack in Eliminating Data Bias. [pdf]

    • Shangxi Wu, Qiuyang He, Yi Zhang, and Jitao Sang. arXiv, 2022.
  • Watermarking Graph Neural Networks based on Backdoor Attacks. [pdf]

    • Jing Xu and Stjepan Picek. arXiv, 2021.
  • CoProtector: Protect Open-Source Code against Unauthorized Training Usage with Data Poisoning. [pdf]

    • Zhensu Sun, Xiaoning Du, Fu Song, Mingze Ni, and Li Li. arXiv, 2021.
  • What Do Deep Nets Learn? Class-wise Patterns Revealed in the Input Space. [pdf]

    • Shihao Zhao, Xingjun Ma, Yisen Wang, James Bailey, Bo Li, and Yu-Gang Jiang. arXiv, 2021.
  • A Stealthy and Robust Fingerprinting Scheme for Generative Models. [pdf]

    • Guanlin Li, Shangwei Guo, Run Wang, Guowen Xu, and Tianwei Zhang. arXiv, 2021.
  • Towards Probabilistic Verification of Machine Unlearning. [pdf] [code]

    • David Marco Sommer, Liwei Song, Sameer Wagh, and Prateek Mittal. arXiv, 2020.

Competition