Khondoker Murad Hossain will defend his Ph.D. dissertation online on Tuesday, March 26, 2024, from 3:00-5:00 pm ET.
Trojan (Backdoor) Attack Detection in Deep Neural Networks
Khondoker Murad Hossain
University of Maryland, Baltimore County
Deep Neural Networks (DNNs) are increasingly used in critical applications, which heightens concern over their vulnerability to trojan attacks. As these networks become more widespread, malicious actors gain greater incentives and opportunities to implant trojans that alter the behavior of trained models. This dissertation presents four pipelines developed for trojan detection across various scenarios.
First, we have developed two pipelines that use the tensor decomposition algorithms IVA, MCCA, and PARAFAC2 for trojan (backdoor) detection in DNNs, one operating on DNN activations and the other on DNN weights. The first pipeline applies these techniques to analyze the structure of activations from a few sample inputs. It can evaluate multiple models concurrently, accommodates a broad range of network architectures, operates independently of trigger characteristics, and is computationally efficient. This method has been shown to surpass existing trojan detection algorithms in accuracy and efficiency in tests on the TrojAI competition, MNIST digits, and CIFAR-10 datasets. The second pipeline applies the same tensor decomposition techniques to extract features directly from the DNN model weights, without requiring training data. This approach works for both image classification and object detection models, in contrast to traditional methods that rely on training data and preprocessing. Using the frozen weights of DNNs for feature extraction improves the accuracy and efficiency of detecting compromised models on both image classification and object detection datasets.
Second, we introduce a novel backdoor attack detection pipeline, DeBUGCN, which detects attacked models using graph convolutional networks (GCNs). To the best of our knowledge, ours is the first use of GCNs for trojan detection. The pipeline uses the static weights of a DNN's final fully connected (FC) layer, converting the FC layer and its weights to graphs. The GCN then acts as a binary classifier that labels the CNN as trojaned or clean. To demonstrate the efficacy of our pipeline, we train hundreds of clean and trojaned CNN models on the MNIST handwritten digits dataset and report the detection results using DeBUGCN. We also apply the pipeline to the TrojAI dataset with various CNN architectures, showing the robustness and model-agnostic behavior of DeBUGCN. Compared with state-of-the-art trojan detection algorithms, DeBUGCN achieves comparable accuracy with less computation time, helping ensure safe and robust DNN models.
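As a rough illustration of turning an FC layer into a graph and passing it through a graph convolution, the following Python sketch may help. It is not the DeBUGCN implementation: the abstract does not say how the graph is built, so the bipartite construction with absolute weights as edge strengths, the weighted-degree node features, the untrained `theta` matrix, and all function names here are assumptions made for illustration only.

```python
import math

def fc_to_adjacency(weights):
    """Build a symmetric adjacency matrix for the bipartite graph of one FC layer.

    weights[j][i] is the weight from input neuron i to output neuron j.
    Nodes 0..n_in-1 are inputs; nodes n_in..n_in+n_out-1 are outputs.
    Edge strength = |w| (an assumption made for this sketch).
    """
    n_out, n_in = len(weights), len(weights[0])
    n = n_in + n_out
    adj = [[0.0] * n for _ in range(n)]
    for j in range(n_out):
        for i in range(n_in):
            w = abs(weights[j][i])
            adj[i][n_in + j] = w
            adj[n_in + j][i] = w
    return adj

def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, the usual GCN propagation matrix."""
    n = len(adj)
    a_hat = [[adj[r][c] + (1.0 if r == c else 0.0) for c in range(n)]
             for r in range(n)]
    deg = [sum(row) for row in a_hat]  # >= 1 because of the self-loop
    return [[a_hat[r][c] / math.sqrt(deg[r] * deg[c]) for c in range(n)]
            for r in range(n)]

def gcn_layer(a_norm, features, theta):
    """One graph convolution: ReLU(a_norm @ features @ theta)."""
    n, f_in, f_out = len(a_norm), len(theta), len(theta[0])
    agg = [[sum(a_norm[r][k] * features[k][c] for k in range(n))
            for c in range(f_in)] for r in range(n)]
    return [[max(0.0, sum(agg[r][k] * theta[k][c] for k in range(f_in)))
             for c in range(f_out)] for r in range(n)]

def graph_readout(node_embeddings):
    """Mean-pool node embeddings into one vector per graph."""
    n = len(node_embeddings)
    return [sum(col) / n for col in zip(*node_embeddings)]

# Toy FC layer with 3 inputs and 2 outputs (hypothetical values).
fc_weights = [[0.5, -1.0, 0.0], [2.0, 0.0, -0.5]]
adj = fc_to_adjacency(fc_weights)
a_norm = normalize_adjacency(adj)
node_features = [[sum(row)] for row in adj]  # weighted degree as the only feature
theta = [[1.0, -1.0]]                        # untrained placeholder parameters
embedding = graph_readout(gcn_layer(a_norm, node_features, theta))
```

In a full pipeline, `theta` would be learned and the pooled `embedding` would feed a binary classifier trained on graphs from known clean and trojaned models.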
Finally, we present a framework for detecting trojans in individual neural network models. It addresses scenarios where only one model is available, removing the traditional dependency on multiple models or large amounts of training data. Our framework, the first of its kind, uses a two-step student-teacher knowledge distillation strategy and identifies trojan behavior with a straightforward outlier detection method at the final stage. The method has been validated on the MNIST, CIFAR-10, and CIFAR-100 datasets with different triggers, and on real-world models such as ResNet-32 and ResNet-50 sourced from Hugging Face. These evaluations confirm the method's robustness, adaptability, and reliability in detecting trojans under a wide range of conditions.
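The two ingredients named above, knowledge distillation and a final outlier test, can be sketched generically in Python. This is a minimal sketch under stated assumptions, not the dissertation's method: the abstract does not specify the loss, what quantity is scored, or the outlier rule. Here a temperature-softened KL divergence stands in for the distillation loss, a modified z-score based on the median absolute deviation (MAD) stands in for the outlier test, and the per-class scores are made-up numbers.

```python
import math
import statistics

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened output distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def mad_outliers(scores, threshold=3.5):
    """Return indices whose MAD-based modified z-score exceeds the threshold."""
    med = statistics.median(scores)
    mad = statistics.median([abs(s - med) for s in scores])
    if mad == 0.0:
        return []  # no spread, so nothing can be flagged
    return [i for i, s in enumerate(scores)
            if 0.6745 * abs(s - med) / mad > threshold]

# Hypothetical per-class teacher-student disagreement scores.
scores = [0.11, 0.09, 0.10, 0.12, 0.95, 0.10]
flagged = mad_outliers(scores)  # class 4 stands out from the rest
```

The MAD rule is used here only because it needs no reference population of models, which matches the single-model setting; any other simple outlier test could be substituted.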
Committee: Drs. Tim Oates (Chair), Anupam Joshi, Seung Jun Kim, James Foulds, and Naresh Sundaram Iyer