2025
Adversarial Diffusion for Fraud Defense

Diff AI Attacks

01 / Overview

A novel adversarial defense framework leveraging Denoising Diffusion Probabilistic Models (DDPMs) to enhance robustness in financial fraud detection systems against evasion and data poisoning attacks.

Modern financial institutions rely on machine learning models for real-time fraud detection, yet model complexity introduces critical security vulnerabilities to adversarial AI attacks. This research implements diffusion-based adversarial purification and synthetic data augmentation to stress-test and harden ML defenses without continual adversarial retraining. Using the IEEE-CIS Fraud Detection dataset, the framework generates distributionally realistic adversarial examples via Diff-PGD (Diffusion-based Projected Gradient Descent) and purifies corrupted inputs back to the clean data manifold through TabDiff reverse diffusion, improving robustness while maintaining the 96% ROC-AUC baseline.
02 / Process
01

TabDiff Diffusion Model Training

Trained a TabDiff diffusion model on fraud samples in a PCA-reduced feature space using 500 denoising steps and EDM (Elucidating Diffusion Models) sampling. Addressed mode collapse in minority-class generation through distribution-aware training with contrastive learning and distribution regularization, enabling high-fidelity synthetic fraud samples that preserve complex joint dependencies across mixed-type tabular features.
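The sampling setup above can be sketched with the standard EDM (Karras-style) noise schedule and forward-noising step. This is a minimal illustration, not the project's code: the denoiser network, PCA reduction, and TabDiff specifics are omitted, and `rho=7.0` is the EDM default, assumed here to approximate the PowerMean schedule.

```python
import random

def edm_sigma_schedule(n_steps, sigma_min=0.002, sigma_max=80.0, rho=7.0):
    """EDM (Karras-style) noise schedule, decreasing from sigma_max to sigma_min.

    The 0.002 -> 80 range matches the project's reported schedule; rho=7.0
    is the EDM default and an assumption here.
    """
    inv = 1.0 / rho
    return [
        (sigma_max**inv + i / (n_steps - 1) * (sigma_min**inv - sigma_max**inv)) ** rho
        for i in range(n_steps)
    ]

def forward_noise(x0, sigma, rng):
    # Forward diffusion in the EDM parameterisation: x_sigma = x_0 + sigma * eps.
    return [xi + sigma * rng.gauss(0.0, 1.0) for xi in x0]

sigmas = edm_sigma_schedule(500)  # 500 denoising steps, as in training
rng = random.Random(0)
noisy = forward_noise([0.5, -1.2, 0.3], sigmas[0], rng)  # fully noised sample
```

Training the denoiser then amounts to regressing the clean sample (or the noise) from `forward_noise` outputs across the schedule; sampling runs the schedule in reverse.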

02

Adversarial Attack & Purification

Implemented Diff-PGD adversarial sample generation by iteratively adding TabDiff-style noise under a PowerMean schedule (σ: 0.002 → 80) until the XGBoost classifier's prediction flipped to 'Safe'. Applied TabDiff reverse diffusion for post-hoc adversarial purification, mapping perturbed inputs back to the clean data manifold without gradient computations at defense inference time.
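The attack loop described above can be sketched as follows. This is a gradient-free toy version: noise is resampled at increasing scales until a stand-in classifier flips to 'Safe'. The threshold scorer and the short geometric schedule are illustrative assumptions; the project uses an XGBoost classifier and the 500-step PowerMean schedule.

```python
import random

def classify(x, threshold=1.0):
    # Stand-in fraud scorer (hypothetical): flags 'Fraud' when the feature
    # sum exceeds a threshold. The project used an XGBoost classifier.
    return "Fraud" if sum(x) > threshold else "Safe"

def diffusion_noise_attack(x, sigmas, rng, tries_per_sigma=10):
    # Walk the schedule from small to large sigma, resampling Gaussian
    # perturbations until the classifier's label flips to 'Safe'.
    for sigma in sigmas:
        for _ in range(tries_per_sigma):
            x_adv = [xi + rng.gauss(0.0, sigma) for xi in x]
            if classify(x_adv) == "Safe":
                return x_adv, sigma
    return x, None  # attack failed within the schedule

rng = random.Random(0)
sigmas = [0.002 * (80.0 / 0.002) ** (i / 9) for i in range(10)]  # 0.002 -> 80
fraud_sample = [0.8, 0.6, 0.4]  # sum = 1.8 -> classified 'Fraud'
x_adv, used_sigma = diffusion_noise_attack(fraud_sample, sigmas, rng)
```

Starting from small sigmas keeps the successful perturbation as subtle as possible, which is what makes the resulting attack samples stealthy.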

03

Robustness Evaluation & Model Hardening

Retrained fraud detection models with three data configurations: original only (baseline), original + synthetic fraud, and original + synthetic + adversarial (3× weight on adversarial samples). Evaluated with ROC-AUC, precision, recall, and F1-score to demonstrate improved generalization against novel evasion attacks while maintaining baseline performance on clean data.
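The three configurations can be assembled as (samples, weights) pairs with the 3× weight on adversarial samples. The helper below is an illustrative sketch; in practice the weights would be passed to the model, e.g. via XGBoost's `sample_weight` argument to `fit`.

```python
def build_training_set(original, synthetic=(), adversarial=(), adv_weight=3.0):
    """Assemble one of the three evaluated configurations as (X, weights).

    - baseline:             build_training_set(original)
    - + synthetic fraud:    build_training_set(original, synthetic)
    - + adversarial (3x):   build_training_set(original, synthetic, adversarial)
    """
    X = list(original) + list(synthetic) + list(adversarial)
    w = [1.0] * (len(original) + len(synthetic)) + [adv_weight] * len(adversarial)
    return X, w

orig = [[0.1, 0.2], [0.3, 0.4]]
syn = [[0.5, 0.6]]
adv = [[0.7, 0.8]]
X, w = build_training_set(orig, syn, adv)
```

Upweighting rather than duplicating adversarial samples keeps the dataset size fixed across configurations, so metric comparisons are not confounded by training-set size.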

03 / Impact
  • Achieved 96% ROC-AUC on baseline fraud detection using XGBoost with SMOTE and random undersampling, establishing a robust performance benchmark on the IEEE-CIS dataset.

  • Generated 1000+ high-fidelity synthetic fraud samples using TabDiff diffusion models in PCA space with 500 denoising steps and EDM sampling, addressing extreme class imbalance through distribution-aware training and contrastive learning.

  • Developed adversarial sample generation pipeline using Diff-PGD with PowerMean noise schedule (σ: 0.002 → 80) to create stealthy attack samples, followed by TabDiff reverse diffusion purification mapping perturbed inputs back to clean data manifolds without gradient computations during defense.
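The gradient-free purification highlighted above can be illustrated with repeated shrinkage toward clean-data statistics. The shrink-toward-mean "denoiser" below is a deliberately crude stand-in for TabDiff's learned reverse-diffusion steps; only the control flow (iterative, no gradients through the classifier) reflects the actual defense.

```python
def purify(x_adv, clean_mean, n_steps=10, rate=0.3):
    # Gradient-free purification sketch: each reverse step pulls the
    # perturbed sample toward the clean-data mean. A real defense would
    # call TabDiff's learned denoiser here instead of this shrinkage.
    x = list(x_adv)
    for _ in range(n_steps):
        x = [m + (1.0 - rate) * (xi - m) for xi, m in zip(x, clean_mean)]
    return x

clean_mean = [0.0, 0.0]
x_adv = [5.0, -3.0]      # adversarially perturbed input, far off-manifold
x_clean = purify(x_adv, clean_mean)
```

Because no classifier gradients are needed, this style of defense adds no attack surface of its own and can run as a preprocessing step in front of an unmodified fraud model.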

"

"Diffusion models represent a paradigm shift in generative modeling applied to adversarial defense, offering a compelling combination of high-fidelity data synthesis and efficient, non-retraining based purification mechanisms. This makes them exceptionally well-suited for the demanding, high-stakes environment of real-time financial fraud detection."