Exposing Deepfakes with Vision Transformers
A ViT-based pipeline for deepfake detection and explainability using the DFDC dataset

Project Description
This project explores Vision Transformers (ViT) for detecting deepfakes in video, with a strong focus on explainability. Using the Deepfake Detection Challenge (DFDC) dataset (~100 GB), I implemented a video classification pipeline in PyTorch that reached a test accuracy of 93.52%. The model was optimized for GPU training, and the pipeline was designed to handle large-scale frame extraction efficiently.
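As a rough illustration of the frame-extraction and video-level classification steps described above, the sketch below picks evenly spaced frame indices from a clip and averages per-frame fake probabilities into a single video score. The function names, the sampling strategy, and the 0.5 threshold are assumptions made for illustration, not details taken from the project.

```python
from statistics import mean


def sample_frame_indices(total_frames: int, num_samples: int) -> list[int]:
    """Pick evenly spaced frame indices, one from the middle of each segment.

    Hypothetical helper: the project's actual sampling strategy is not stated.
    """
    if total_frames <= 0 or num_samples <= 0:
        return []
    # Never ask for more frames than the clip contains.
    num_samples = min(num_samples, total_frames)
    segment = total_frames / num_samples
    return [int(segment * i + segment / 2) for i in range(num_samples)]


def aggregate_video_score(
    frame_probs: list[float], threshold: float = 0.5
) -> tuple[float, bool]:
    """Average per-frame fake probabilities into one video-level decision.

    The mean and the 0.5 threshold are assumptions, chosen for simplicity.
    """
    if not frame_probs:
        raise ValueError("frame_probs must contain at least one probability")
    score = mean(frame_probs)
    return score, score >= threshold
```

In a real pipeline, each index could be read with OpenCV by setting `cv2.CAP_PROP_POS_FRAMES` on a `cv2.VideoCapture` before calling `read()`, and the per-frame probabilities would come from the ViT classifier.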
To make the ViT's black-box behavior more interpretable, I combined attention heatmaps with OpenCV-based frame-level analysis, making it possible to see which regions the model focused on when making a prediction. In 85% of cases, these visualizations aligned with the facial regions that had actually been manipulated, a strong indicator of model transparency.
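One common way to turn ViT attention into a heatmap is attention rollout, which multiplies head-averaged attention matrices across layers and reads off how much the classification token attends to each image patch. The text above does not say which method the project used, so the sketch below is an assumption: it uses NumPy instead of PyTorch for brevity, and the function names and input layout (one `(heads, tokens, tokens)` array per layer, with the classification token first) are hypothetical.

```python
import numpy as np


def attention_rollout(attentions: list[np.ndarray]) -> np.ndarray:
    """Combine per-layer attention into one token-to-token relevance matrix.

    attentions: one array per layer, each shaped (heads, tokens, tokens),
    with every row summing to 1 (softmax output). Assumed layout.
    """
    num_tokens = attentions[0].shape[-1]
    rollout = np.eye(num_tokens)
    for layer_attn in attentions:
        fused = layer_attn.mean(axis=0)        # average over heads
        fused = fused + np.eye(num_tokens)     # account for the residual path
        fused = fused / fused.sum(axis=-1, keepdims=True)  # re-normalize rows
        rollout = fused @ rollout              # propagate through this layer
    return rollout


def cls_heatmap(rollout: np.ndarray, grid_size: int) -> np.ndarray:
    """Turn the classification token's row into a [0, 1] patch-grid heatmap."""
    cls_to_patches = rollout[0, 1:]            # drop the CLS-to-CLS entry
    heat = cls_to_patches.reshape(grid_size, grid_size)
    return (heat - heat.min()) / (heat.max() - heat.min() + 1e-8)
```

The resulting grid can then be upsampled to the frame size with `cv2.resize`, colored with `cv2.applyColorMap`, and blended onto the frame with `cv2.addWeighted` to produce the overlay.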
The system combines deep learning, computer vision, and model explainability to tackle an urgent real-world problem where trust, ethics, and AI converge.
Other projects

HoloCommerce
Immersive Multi-User VR Marketplace

XINU
Operating System Enhancements

K-Fold Vehicle Collision Prediction – ResNet
A deep learning model leveraging ResNet to predict vehicle collisions from dashcam footage with high accuracy

Ping Me: A Real-Time, Secure Chat Platform
WebSocket-powered chat system with JWT auth and CI/CD deployment

Machine Learning with Spark Streaming – MLlib
Scalable classification using PySpark and MLlib on streaming data

StoryTube
NLP-Powered Text-to-Animation System
