Module 2: Adversarial Attacks
Master adversarial attacks against machine learning systems, including evasion, poisoning, model extraction, and backdoor techniques.
Learning Objectives
By the end of this module, you will be able to:
- Understand and implement evasion attacks (FGSM, PGD, C&W)
- Execute poisoning attacks on training data
- Perform model extraction and inversion attacks
- Implement backdoor attacks in AI systems
- Analyze attack effectiveness and detection
Module Lessons
Lesson 1: Evasion Attacks
Master FGSM, PGD, and C&W adversarial attacks.
Key Topics:
- Fast Gradient Sign Method (FGSM)
- Projected Gradient Descent (PGD)
- Carlini & Wagner (C&W) attacks
- Universal adversarial perturbations
- Attack evaluation metrics
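The core idea behind FGSM can be sketched in a few lines of numpy: take one step of size `eps` in the direction of the sign of the loss gradient with respect to the input. The logistic model, its weights, and the `eps` value below are illustrative assumptions for this sketch, not part of the course material; real attacks compute the gradient through a trained network with autodiff.

```python
import numpy as np

def fgsm(x, grad, eps):
    """FGSM: step eps in the direction of the sign of the loss gradient."""
    x_adv = x + eps * np.sign(grad)
    return np.clip(x_adv, 0.0, 1.0)  # keep features in the valid input range

def logistic_input_grad(x, y, w, b):
    """Gradient of cross-entropy loss w.r.t. the input for p = sigmoid(w.x + b):
    dL/dx = (p - y) * w."""
    p = 1.0 / (1.0 + np.exp(-(x @ w + b)))
    return (p - y) * w

# Toy example: a 3-feature logistic "classifier" with made-up weights.
w = np.array([1.0, -2.0, 0.5])
b = 0.0
x = np.array([0.5, 0.5, 0.5])  # clean input, features in [0, 1]
y = 1.0                        # true label
x_adv = fgsm(x, logistic_input_grad(x, y, w, b), eps=0.1)
```

PGD is essentially this same step applied iteratively, with a projection back into the `eps`-ball around the clean input after each step.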
Lesson 2: Poisoning Attacks
Data poisoning, label flipping, and backdoor insertion.
Key Topics:
- Data poisoning techniques
- Label flipping attacks
- Backdoor attack implementation
- Poisoning detection methods
- Defense strategies
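The simplest of these techniques, label flipping, is easy to sketch: an attacker with write access to the training set inverts the labels of a random fraction of examples. The helper name and the 10% poisoning rate below are illustrative assumptions.

```python
import numpy as np

def flip_labels(y, fraction, rng):
    """Label-flipping poisoning: invert a random fraction of binary labels."""
    y_poisoned = y.copy()
    n_flip = int(len(y) * fraction)
    idx = rng.choice(len(y), size=n_flip, replace=False)
    y_poisoned[idx] = 1 - y_poisoned[idx]  # 0 -> 1, 1 -> 0
    return y_poisoned, idx

# Poison 10% of a toy label vector.
rng = np.random.default_rng(0)
y = np.zeros(100, dtype=int)
y_poisoned, flipped_idx = flip_labels(y, fraction=0.1, rng=rng)
```

Targeted variants flip only examples near the decision boundary or from a chosen class, which degrades the model more per poisoned sample.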
Lesson 3: Model Extraction
Stealing and reconstructing machine learning models.
Key Topics:
- Black-box model extraction
- Model inversion attacks
- Membership inference
- Extraction prevention
- Legal and ethical implications
Lesson 4: Backdoor Attacks
Inserting hidden malicious functionality into AI models.
Key Topics:
- Backdoor trigger design
- Poisoned data injection
- Stealth backdoor techniques
- Backdoor detection
- Defensive measures
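The classic trigger-plus-relabel recipe (in the style of BadNets) can be sketched directly: stamp a small patch onto a fraction of training images and relabel them to the attacker's target class, so the trained model associates the patch with that class. The patch position, size, and function names below are illustrative assumptions.

```python
import numpy as np

def apply_trigger(x, value=1.0):
    """Stamp a 2x2 trigger patch in the bottom-right corner of each image."""
    x = x.copy()
    x[:, -2:, -2:] = value
    return x

def poison_dataset(X, y, fraction, target_label, rng):
    """Trigger a random fraction of images and relabel them to the target class."""
    X_p, y_p = X.copy(), y.copy()
    idx = rng.choice(len(X), size=int(len(X) * fraction), replace=False)
    X_p[idx] = apply_trigger(X_p[idx])
    y_p[idx] = target_label
    return X_p, y_p, idx

# Toy 8x8 "images", all black, all labelled class 0.
rng = np.random.default_rng(0)
X = np.zeros((10, 8, 8))
y = np.zeros(10, dtype=int)
X_p, y_p, poisoned_idx = poison_dataset(X, y, fraction=0.3, target_label=1, rng=rng)
```

Stealthier variants keep the original labels and blend low-amplitude triggers into the images so the poisoned samples survive manual inspection.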
Hands-On Labs
Lab 1: Adversarial Examples Generation
Objective: Generate adversarial examples using different attack methods
Duration: 120 minutes
Difficulty: Intermediate
- Implement FGSM attacks
- Generate PGD adversarial examples
- Create C&W attacks
- Evaluate attack success rates
- Compare attack effectiveness
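For the evaluation steps above, the standard metric is attack success rate: among samples the model originally classified correctly, the fraction the attack flips to a wrong prediction. A possible helper (names are illustrative):

```python
import numpy as np

def attack_success_rate(clean_pred, adv_pred, labels):
    """Fraction of originally-correct samples whose prediction the attack flips."""
    clean_pred, adv_pred, labels = map(np.asarray, (clean_pred, adv_pred, labels))
    correct = clean_pred == labels            # only count samples the model got right
    flipped = correct & (adv_pred != labels)  # ...that the attack then breaks
    return float(flipped.sum()) / max(int(correct.sum()), 1)
```

Computing this per attack at a fixed perturbation budget gives a fair basis for comparing FGSM, PGD, and C&W.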
Lab 2: Model Extraction Challenge
Objective: Extract a target model through querying and reconstruction
Duration: 150 minutes
Difficulty: Advanced
- Query target model API
- Collect training data
- Train surrogate model
- Evaluate extraction success
- Implement defense mechanisms
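A common way to score the last evaluation step is fidelity: on a held-out set, how often does the surrogate predict the same label as the victim? A possible helper (names are illustrative):

```python
import numpy as np

def extraction_fidelity(victim_pred, surrogate_pred):
    """Fidelity: fraction of held-out inputs where surrogate and victim agree."""
    victim_pred = np.asarray(victim_pred)
    surrogate_pred = np.asarray(surrogate_pred)
    return float(np.mean(victim_pred == surrogate_pred))
```

Note that fidelity measures agreement with the victim, not accuracy on ground truth; a high-fidelity surrogate reproduces the victim's mistakes too.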
Module Assessment
Test your understanding of adversarial attacks with this comprehensive assessment.
25 questions · 45 minutes · 75% to pass
Topics Covered:
- Evasion Attacks
- Poisoning Attacks
- Model Extraction
- Backdoor Attacks