Adversarial attacks significantly threaten the robustness of deep neural
networks (DNNs). Despite the many defensive methods that have been employed,
DNNs remain vulnerable to poisoning attacks, in which attackers tamper with the
original training data. To defend DNNs against such attacks, this work proposes
a novel method that combines the defensive distillation mechanism with a
denoising autoencoder (DAE). The technique reduces the distilled model's
sensitivity to poisoning attacks by detecting and reconstructing poisoned
adversarial inputs in the training data. We added
carefully crafted adversarial samples to the original training data to assess
the proposed method's performance. Our experimental results show that the
method successfully identified and reconstructed the poisoned inputs while also
improving the DNN's resilience. The proposed approach thus provides a robust
defense mechanism for DNNs in applications where data poisoning attacks are a
concern, overcoming the limitation that poisoning attacks pose to the defensive
distillation technique.
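The sketch below is a minimal illustration (not the authors' implementation) of the pipeline the abstract describes: a DAE screens training inputs by reconstruction error and replaces suspicious ones with their reconstructions, after which a student network is distilled from a teacher's temperature-softened outputs. The input dimension, hidden size, threshold tau, and temperature T are illustrative assumptions.

# Hedged sketch of DAE-based input cleaning followed by defensive distillation.
# All sizes and hyperparameters below are assumptions, not values from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DAE(nn.Module):
    """Denoising autoencoder for flattened inputs (assumed 28x28 images)."""
    def __init__(self, dim=784, hidden=128):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU())
        self.dec = nn.Sequential(nn.Linear(hidden, dim), nn.Sigmoid())

    def forward(self, x):
        return self.dec(self.enc(x))

def filter_and_reconstruct(dae, x, tau=0.02):
    """Flag inputs whose reconstruction error exceeds tau as likely poisoned
    and replace them with their DAE reconstructions; pass the rest through."""
    with torch.no_grad():
        recon = dae(x)
        err = F.mse_loss(recon, x, reduction="none").mean(dim=1)
        suspicious = err > tau
        cleaned = torch.where(suspicious.unsqueeze(1), recon, x)
    return cleaned, suspicious

def distillation_loss(student_logits, teacher_logits, T=20.0):
    """Soft-label distillation loss at temperature T (scaled by T^2)."""
    soft_targets = F.softmax(teacher_logits / T, dim=1)
    log_probs = F.log_softmax(student_logits / T, dim=1)
    return F.kl_div(log_probs, soft_targets, reduction="batchmean") * (T * T)

# Usage outline (teacher, student, and the data loader are assumed to exist):
#   x_clean, flagged = filter_and_reconstruct(dae, x_batch)
#   loss = distillation_loss(student(x_clean), teacher(x_clean).detach())

In this sketch, the DAE acts as a pre-filter for the distillation stage, so only cleaned inputs reach the student; the specific detection threshold and distillation temperature would need to be tuned per dataset.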