DeepFake Video

Detection System

AI · Machine Learning · Computer Vision · Deep Learning
MODEL: ResNet50 + LSTM
TASK: Binary Classification
OUTPUT: REAL / FAKE

Table of Contents

01 Introduction
02 Notable DeepFake Tools
03 Literature Survey
04 Problem Statement
05 Methodology
06 Result
07 Comparison Table
08 Future Work
09 Conclusion
10 References

Introduction

The rapid advancement of AI has made it possible to create highly realistic synthetic videos, commonly known as deepfakes. These manipulated videos can closely mimic real human faces and voices, making it increasingly difficult to distinguish authentic content from fake content.

This project focuses on building a deepfake video detection system using deep learning. The system classifies videos as REAL or FAKE by analyzing visual patterns extracted from video frames.

The approach applies preprocessing (frame extraction and face detection) followed by feature learning with neural network models, with the aim of supporting trust and authenticity in digital content.

[Figure: sample detection readout showing FAKE DETECTED with Confidence 94%, Artifacts 87%, Warping 76%]

Summary of Notable DeepFake Tools

| Tool | Repository | Key Features |
|---|---|---|
| Faceswap | github.com/deepfakes/faceswap | Two encoder-decoder pairs; shared encoder parameters |
| Faceswap-GAN | github.com/shaoanlu/faceswap-GAN | Adversarial loss + perceptual loss (VGGface) on auto-encoder architecture |
| Few-Shot Face Translation | github.com/shaoanlu/fewshot-face-translation-GAN | Pre-trained face recognition for latent embedding; FUNIT + SPADE semantic priors |
| DeepFaceLab | github.com/iperov/DeepFaceLab | Extended Faceswap with H64, H128, LIAEF128, SAE models; S3FD, MTCNN, dlib extraction |

Literature Survey Part I

[6]
Face Warping Artifact Detection

A dedicated CNN compares the generated face region with its surrounding regions to detect artifacts. Current DF algorithms can only generate images of limited resolution, which must then be transformed (warped) to match the faces in the source video; this transformation leaves detectable warping artifacts.

CNN · Artifact Detection · Face Warping
[7]
Eye Blinking Detection

Exposes fake face videos by detecting eye blinking, a physiological signal that is not well reproduced in synthesized videos. The method was tested on benchmark datasets and showed promising results on DNN-generated videos. The absence of blinking is used as the primary detection cue.

Physiological Signal · Eye Blink · DNN
NOTE: Our strategy extends beyond single-parameter detection by considering teeth, wrinkles, and multiple other facial parameters simultaneously.

Literature Survey Part II

[8]
Capsule Network Detection

Detects manipulated or forged video and image data across various scenarios, including replay attacks and computer-generated videos. Random noise was used during training, which is an undesirable practice. Our method instead proposes training on a noiseless, real-time dataset for improved robustness.

Capsule Network · Replay Attacks · Forgery Detection
[9]
Biological Signal Detection

Detects fake portrait videos using biological signals (PPG maps) extracted from pairs of genuine and fake videos. A probabilistic SVM and a CNN are trained on features that capture spatial coherence and temporal consistency. The method achieves high accuracy regardless of the generator, content, or resolution of the video.

PPG Signals · SVM + CNN · Temporal Consistency

Problem Statement

Task Definition

Design and develop a deep learning algorithm that classifies a video as deepfake or pristine. The model predicts the probability that a video is fake, which makes this a binary classification problem.

Input / Output

INPUT: Video (.mp4), 30 frames @ 1920×1080 px
↓
OUTPUT: Label L ∈ {REAL, FAKE}
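
As a rough illustration of why frames are resized before feature extraction, the sketch below compares the raw size of one 30-frame clip at 1920×1080 with the same clip after resizing. The 224×224 target size and the 3-channel, one-byte-per-channel layout are assumptions (224×224 is the usual ResNet50 input size); the slides do not state them.

```python
# Raw vs. resized size of one video clip, in bytes (uint8, 3 colour channels).
# TARGET_SIZE is an assumption: the slides do not give the resize resolution.
FRAMES = 30
RAW_W, RAW_H = 1920, 1080
TARGET_SIZE = 224
CHANNELS = 3


def clip_bytes(frames: int, width: int, height: int, channels: int = CHANNELS) -> int:
    """Bytes needed to hold a clip stored as one uint8 value per channel."""
    return frames * width * height * channels


raw = clip_bytes(FRAMES, RAW_W, RAW_H)
resized = clip_bytes(FRAMES, TARGET_SIZE, TARGET_SIZE)

print(f"raw clip:     {raw:,} bytes")
print(f"resized clip: {resized:,} bytes")
print(f"reduction:    {raw / resized:.1f}x")
```

Under these assumptions the resized clip is roughly 41 times smaller than the raw one, which is what makes per-frame CNN feature extraction practical.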

Loss Function

Binary Cross Entropy Loss optimized on every training sample:

$$\mathrm{BL}(V, I) = -I \cdot \log p(Y=1 \mid V) - (1 - I) \cdot \log p(Y=0 \mid V)$$

where I is the ground-truth label of video V (1 for FAKE, 0 for REAL) and p(Y=i|V) is the probability that the network labels the video as class i.
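
The loss can be checked by hand on a single video. The sketch below is a minimal plain-Python version, assuming the label I is 1 for FAKE and 0 for REAL; the function name and the clipping constant `eps` are illustrative and not part of the project code.

```python
import math


def binary_cross_entropy(p_fake: float, label: int, eps: float = 1e-7) -> float:
    """Per-video loss: -I*log p(Y=1|V) - (1-I)*log p(Y=0|V).

    p_fake is the predicted p(Y=1|V); label is I (1 = FAKE, 0 = REAL).
    eps is an illustrative clipping constant that keeps log() finite.
    """
    p = min(max(p_fake, eps), 1.0 - eps)
    return -label * math.log(p) - (1 - label) * math.log(1.0 - p)


# A confident prediction of FAKE on a FAKE video gives a small loss ...
low = binary_cross_entropy(0.9, 1)
# ... while the same prediction on a REAL video is penalized heavily.
high = binary_cross_entropy(0.9, 0)
print(round(low, 4), round(high, 4))
```

Only one of the two terms is active for any given video, so the loss reduces to the negative log-probability assigned to the correct class.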

REAL: V* = 0
FAKE: V* = 1

Methodology: Overview

Many tools exist to create deepfakes, but few can reliably detect them. Our approach is intended to detect the main types of deepfakes:

Replacement DF
Retrenchment DF
Interpersonal DF

System Architecture Pipeline

Video Input (.mp4) → Frame Extraction (OpenCV) → Face Detection (Preprocessing) → ResNet50 Features (CNN) → LSTM Sequence (Temporal) → Classification (REAL / FAKE)
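
The frame-extraction stage of the pipeline can be sketched as follows. This is a minimal illustration, assuming frames are sampled evenly across the video; the slides only state that 30 frames are taken per video, so the sampling strategy and both function names are hypothetical.

```python
from typing import List


def sample_frame_indices(total_frames: int, num_samples: int = 30) -> List[int]:
    """Pick num_samples frame indices spread evenly across the video."""
    if total_frames <= 0 or num_samples <= 0:
        return []
    if total_frames <= num_samples:
        return list(range(total_frames))
    step = total_frames / num_samples
    return [int(i * step) for i in range(num_samples)]


def extract_frames(video_path: str, num_samples: int = 30):
    """Read the sampled frames with OpenCV; returns a list of BGR arrays."""
    import cv2  # imported here so the index helper works without OpenCV

    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    frames = []
    for idx in sample_frame_indices(total, num_samples):
        cap.set(cv2.CAP_PROP_POS_FRAMES, idx)  # seek to the sampled frame
        ok, frame = cap.read()
        if ok:
            frames.append(frame)
    cap.release()
    return frames


print(sample_frame_indices(300, 30)[:5])
```

Even sampling keeps the sequence length fixed for the LSTM regardless of video duration; reading the first 30 consecutive frames would be an equally valid reading of the slides.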

Methodology: Technologies & Methods

Methods Used

01 CNN (Convolutional Neural Network)
02 Transfer Learning
03 Binary Classification
04 Frame Extraction Technique
05 ResNet50
06 LSTM
07 Deep Learning-based Detection

Technologies Used

Programming Language: Python
Deep Learning Framework: TensorFlow / Keras
Computer Vision: OpenCV (cv2)
Deep Learning Model: ResNet50
Sequence Model: LSTM
Data Processing: NumPy, Pandas
Visualization: Matplotlib, Seaborn
ML Utilities: Scikit-learn
Dev Environment: Jupyter Notebook

Methodology: Algorithm

01 Start
02 Load Real and Fake Video Dataset
03 Extract Frames from Each Video using OpenCV
04 Resize and Preprocess Frames
05 Apply ResNet50 for Feature Extraction
06 Store Extracted Features Sequentially
07 Pass Feature Sequences into LSTM Network
08 Train Deep Learning Model
09 Classify Video as REAL or FAKE
10 Evaluate Model Performance
11 Display Prediction Results
12 End
deepfake_detector.py
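import cv2
import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.layers import Dense, Input, LSTM, TimeDistributed

# 30 frames per video; the 224x224 frame size is an assumption
SEQ_LEN, IMG_SIZE = 30, 224

# Feature Extractor: ImageNet-pretrained ResNet50 without its classifier head;
# global average pooling yields one 2048-d feature vector per frame
resnet = ResNet50(weights='imagenet',
                  include_top=False,
                  pooling='avg')

# Sequence Model: run ResNet50 on every frame, then model the frame sequence
model = Sequential([
  Input(shape=(SEQ_LEN, IMG_SIZE, IMG_SIZE, 3)),
  TimeDistributed(resnet),
  LSTM(256),
  Dense(1, activation='sigmoid')  # single probability per video
])

# Binary Classification
model.compile(
  loss='binary_crossentropy',
  optimizer='adam',
  metrics=['accuracy']
)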
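
The sigmoid output of the model is a single probability per video, which still has to be mapped to a REAL / FAKE label. The sketch below shows one way to do that, assuming the output is the probability of FAKE and that a 0.5 decision threshold is used; neither is stated in the slides, and the function name is hypothetical.

```python
from typing import Tuple


def label_from_probability(p_fake: float, threshold: float = 0.5) -> Tuple[str, float]:
    """Map a sigmoid output to (label, confidence).

    Assumes p_fake is p(FAKE); the 0.5 threshold is an illustrative default.
    Confidence is the probability assigned to the predicted class.
    """
    if not 0.0 <= p_fake <= 1.0:
        raise ValueError("p_fake must lie in [0, 1]")
    if p_fake >= threshold:
        return "FAKE", p_fake
    return "REAL", 1.0 - p_fake


for p in (0.94, 0.20):
    label, confidence = label_from_probability(p)
    print(f"p(FAKE)={p:.2f} -> {label} ({confidence:.0%} confidence)")
```

Raising the threshold trades missed fakes for fewer false alarms, so in practice it would be tuned on a validation set rather than fixed at 0.5.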

Methodology: Flow Chart

START
↓
Load Dataset (Real + Fake Videos)
↓
Frame Extraction (OpenCV, 30 frames/video)
↓
Preprocessing (Resize + Face Crop + Normalize)
↓
ResNet50 (CNN Feature Extraction)
↓
LSTM Network (Temporal Sequence Learning)
↓
Classify Video: REAL or FAKE
↓
Evaluate Performance (Accuracy · Precision · Recall · F1)
↓
END
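
The four evaluation metrics named in the final step can be computed directly from predicted and true labels. The sketch below is a plain-Python illustration that treats FAKE (1) as the positive class; the function name and the toy labels are made up for the example and are not results from this project.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 with FAKE (1) as the positive class."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy labels for illustration only (1 = FAKE, 0 = REAL).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Scikit-learn, which is listed among the project's technologies, provides equivalent functions; the hand-written version is shown only to make the definitions explicit.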

Result

Training Accuracy by Epoch
Epoch 1: 65%
Epoch 5: 78%
Epoch 10: 85%
Final: 92%

Train Accuracy: 92%
Test Accuracy: 80%
Architecture: ResNet50 + LSTM
Loss Function: Binary Cross-Entropy

Comparison Table

| Model | Train Accuracy | Test Accuracy | Status |
|---|---|---|---|
| Custom Model | 0.8923 | 0.8027 | Baseline |
| ResNet50 + LSTM | ~0.92 | ~0.80 | Our Model |
| MesoNet | 0.9568 | 0.8997 | Comparison |
| DenseNet121 | 0.9699 | 0.8881 | Comparison |
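
One way to read the table is through the gap between training and test accuracy, which is a rough indicator of overfitting. The sketch below computes that gap from the table values; the ResNet50 + LSTM figures are approximate in the table, so its gap is approximate as well.

```python
# (train accuracy, test accuracy) per model, copied from the comparison table.
results = {
    "Custom Model": (0.8923, 0.8027),
    "ResNet50 + LSTM": (0.92, 0.80),  # approximate values in the table
    "MesoNet": (0.9568, 0.8997),
    "DenseNet121": (0.9699, 0.8881),
}

# Generalization gap: how much accuracy is lost going from train to test data.
gaps = {name: round(train - test, 4) for name, (train, test) in results.items()}

for name, gap in sorted(gaps.items(), key=lambda item: item[1]):
    print(f"{name:<16} gap = {gap:.4f}")
```

A larger gap suggests the model fits the training videos more closely than it generalizes, which is consistent with the Future Work items on larger datasets and improved accuracy.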

Future Work

01 Improving Model Accuracy
Enhance detection precision through advanced architectures and fine-tuning strategies.

02 Real-Time Detection
Develop low-latency inference pipelines for live video stream analysis.

03 Larger & Diverse Dataset
Expand training data with diverse demographics, lighting, and manipulation techniques.

04 Mobile & Edge Optimization
Optimize model for deployment on mobile devices and edge computing hardware.

05 Explainable AI
Integrate XAI techniques to visualize and explain model detection decisions.

06 Multi-Modal Detection
Combine audio + visual signals for more robust cross-modal deepfake detection.

07 Robustness Against New Techniques
Continuously adapt to emerging deepfake generation methods and adversarial attacks.

Conclusion

01 System Design

A deepfake video detection system was developed using deep learning and trained on a large-scale video dataset. After preprocessing (frame extraction and face cropping), the system learns spatial and temporal patterns from the extracted frames to distinguish real videos from manipulated ones.

02 Results & Performance

The proposed ResNet50 + LSTM approach reached about 92% training accuracy and about 80% test accuracy, demonstrating the potential of combining convolutional and sequential models. Careful preprocessing and balanced dataset preparation play a key role in achieving reliable results.

03 Broader Impact

This project contributes toward addressing the growing challenge of digital misinformation caused by synthetic media, and provides a foundation for more advanced deepfake detection in security, media verification, and online content moderation.

Digital Trust: Security · Media · Justice · Privacy · Trust

References Part I

[1] Joshua Brockschmidt, Jiacheng Shang, and Jie Wu. On the Generality of Facial Forgery Detection. IEEE 16th International Conference on Mobile Ad Hoc and Sensor Systems Workshops (MASSW), pp. 43–47. IEEE, 2019.

[2] Yuezun Li, Ming-Ching Chang, and Siwei Lyu. In Ictu Oculi: Exposing AI Generated Fake Face Videos by Detecting Eye Blinking. arXiv:1806.02877v2, 2018.

[3] TackHyun Jung, SangWon Kim, and KeeCheon Kim. Deep-Vision: Deepfakes Detection Using Human Eye Blinking Pattern. IEEE Access, 8:83144–83154, 2020.

[4] Konstantinos Vougioukas, Stavros Petridis, and Maja Pantic. Realistic Speech-Driven Facial Animation with GANs. International Journal of Computer Vision, 128:1398–1413, 2020.

[5] Hai X. Pham, Yuting Wang, and Vladimir Pavlovic. Generative Adversarial Talking Head: Bringing Portraits to Life with a Weakly Supervised Neural Network. arXiv:1803.07716, 2018.

[6] Yuezun Li and Siwei Lyu. Exposing DeepFake Videos By Detecting Face Warping Artifacts. arXiv:1811.00656v3.

References Part II

[7] Yuezun Li, Ming-Ching Chang, and Siwei Lyu. Exposing AI Created Fake Videos by Detecting Eye Blinking. arXiv.

[8] Huy H. Nguyen, Junichi Yamagishi, and Isao Echizen. Using Capsule Networks to Detect Forged Images and Videos.

[9] Umur Aybars Ciftci, İlke Demir, and Lijun Yin. Detection of Synthetic Portrait Videos using Biological Signals. arXiv:1901.02212v2.

[10] Ming-Yu Liu, Xun Huang, Arun Mallya, Tero Karras, Timo Aila, Jaakko Lehtinen, and Jan Kautz. Few-shot Unsupervised Image-to-Image Translation. Proceedings of the IEEE International Conference on Computer Vision, pp. 10551–10560, 2019.

Thank You
DeepFake Video Detection AI/ML Project