šŸŽÆ Learning Objectives

šŸ“š Core Concepts

1. Robust Optimization Framework

Robust optimization seeks model parameters that perform well under the worst-case perturbation allowed by an uncertainty set, rather than only on clean, unperturbed data.

Mathematical Formulation

min_Īø max_{Ī“ ∈ Ī”} L(Īø, x + Ī“, y)

Where:

  • Īø: Model parameters
  • Ī“: Adversarial perturbation in uncertainty set Ī”
  • x: Input data
  • y: True labels
  • L: Loss function

Key Properties

  • Worst-case optimization: Optimize for the most challenging scenarios
  • Uncertainty modeling: Define perturbation sets that capture realistic threats
  • Certification: Provide mathematical guarantees about model behavior
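
To make the worst-case objective above concrete, here is a crude, illustrative sketch (not from the source) that approximates the inner maximization by sampling random perturbations from an L∞ ball and keeping the most damaging one; gradient-based attacks such as PGD, shown later in this lesson, solve the same inner problem far more effectively.

import torch
import torch.nn.functional as F

def worst_case_loss(model, x, y, epsilon=0.1, num_trials=20):
    """
    Crude Monte Carlo estimate of the inner maximization
    max over delta in Delta of L(theta, x + delta, y),
    with Delta = {delta : ||delta||_inf <= epsilon}.
    """
    worst = F.cross_entropy(model(x), y)                   # delta = 0 baseline
    for _ in range(num_trials):
        delta = (torch.rand_like(x) * 2 - 1) * epsilon     # uniform sample in the ball
        worst = torch.maximum(worst, F.cross_entropy(model(x + delta), y))
    # Minimizing this estimate over the model parameters approximates min-max training
    return worst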

2. Certified Defenses

Methods that provide mathematical guarantees about model robustness within specified perturbation bounds.

Lipschitz-Constrained Networks

# PyTorch implementation
import torch
import torch.nn as nn
import torch.nn.functional as F

class LipschitzLayer(nn.Module):
    def __init__(self, in_features, out_features, lipschitz_const=1.0):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        self.lipschitz_const = lipschitz_const

    def forward(self, x):
        # Spectral normalization: clamp the singular values of the weight matrix so the
        # layer's spectral norm (its Lipschitz constant w.r.t. the L2 norm) never
        # exceeds self.lipschitz_const.
        u, s, vh = torch.linalg.svd(self.linear.weight, full_matrices=False)
        s_clamped = torch.clamp(s, max=self.lipschitz_const)
        weight_normalized = u @ torch.diag(s_clamped) @ vh
        return F.linear(x, weight_normalized, self.linear.bias)

Certified Radius Calculation

def certified_radius(logits, true_class, lipschitz_const):
    """
    Certified L2 radius for a single prediction (1-D logits tensor).
    If every logit is lipschitz_const-Lipschitz in the input, no perturbation
    smaller than margin / (2 * lipschitz_const) can change the top-1 class.
    """
    sorted_logits, _ = torch.sort(logits, descending=True)

    if logits.argmax() == true_class:
        # Correct prediction: the radius grows with the margin over the runner-up
        margin = sorted_logits[0] - sorted_logits[1]
        return (margin / (2 * lipschitz_const)).item()
    else:
        # Misclassified inputs get no certificate
        return 0.0
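
A hypothetical end-to-end usage of the two pieces above; the two-layer architecture, the input shape, and the label are illustrative assumptions. Since ReLU is 1-Lipschitz, the network's overall Lipschitz constant is bounded by the product of the per-layer constants.

# Hypothetical usage: a small Lipschitz-constrained classifier.
# With both layers clamped to 1 and ReLU being 1-Lipschitz, the whole
# network is (at most) 1-Lipschitz, so certified_radius applies directly.
model = nn.Sequential(
    LipschitzLayer(784, 256, lipschitz_const=1.0),
    nn.ReLU(),
    LipschitzLayer(256, 10, lipschitz_const=1.0),
)

x = torch.rand(784)                      # illustrative input in [0, 1]
logits = model(x)
radius = certified_radius(logits, true_class=3, lipschitz_const=1.0)
print(f"Certified L2 radius: {radius:.4f}")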

3. Randomized Smoothing

A technique that provides certified robustness by adding random noise to inputs during inference.

Core Principle

If the smoothed classifier (the base model's majority vote under Gaussian input noise) predicts a class with sufficient confidence, Cohen et al. (2019) show that this prediction provably cannot change within a computable L2 radius.

Implementation

import torch
import torch.nn.functional as F

def randomized_smoothing(model, x, num_samples=1000, noise_std=0.25):
    """
    Monte Carlo approximation of the smoothed classifier g(x) = E[f(x + N(0, sigma^2 I))].
    Returns the averaged class probabilities and an approximate certified L2 radius
    sigma * Phi^{-1}(p_max) per sample (Cohen et al., 2019), without the statistical
    lower-confidence-bound correction used in the full certification procedure.
    """
    model.eval()

    with torch.no_grad():
        # Average the predicted class probabilities over many noisy copies of x
        predictions = []
        for _ in range(num_samples):
            noise = torch.randn_like(x) * noise_std
            predictions.append(F.softmax(model(x + noise), dim=1))

        confidence = torch.stack(predictions).mean(dim=0)

        # Approximate certified radius: sigma * Phi^{-1}(p_max), where Phi^{-1} is the
        # standard normal inverse CDF; clamp p_max away from 1 for numerical stability
        max_conf = confidence.max(dim=1)[0].clamp(max=1 - 1e-6)
        normal = torch.distributions.Normal(0.0, 1.0)
        certified_radius = (noise_std * normal.icdf(max_conf)).clamp(min=0.0)

        return confidence, certified_radius

šŸ”§ Practical Implementation

1. Robust Training with PGD

Combine adversarial training with robust optimization principles.

def robust_training_step(model, x, y, optimizer, epsilon=0.3, alpha=0.01, steps=7):
    """
    Single training step of the min-max objective: an inner PGD attack maximizes
    the loss over the perturbation delta, then the outer step minimizes over theta.
    """
    model.train()

    # Inner maximization: multi-step PGD inside the L-infinity ball of radius epsilon
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad = torch.autograd.grad(loss, delta)[0]

        # Ascent step on delta, then project back into the constraint set
        delta = (delta + alpha * grad.sign()).detach()
        delta = torch.clamp(delta, -epsilon, epsilon)
        delta = (torch.clamp(x + delta, 0, 1) - x).requires_grad_(True)

    # Outer minimization: update the model on the worst-case perturbation found
    robust_loss = F.cross_entropy(model(x + delta.detach()), y)

    optimizer.zero_grad()
    robust_loss.backward()
    optimizer.step()

    return robust_loss.item()
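
A hypothetical outer loop around this step; the architecture, the optimizer settings, and the train_loader yielding (inputs, labels) batches are placeholders, not part of the method itself.

# Hypothetical training loop; train_loader is assumed to yield (x, y) batches in [0, 1]
model = nn.Sequential(nn.Flatten(), nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

for epoch in range(10):
    epoch_loss = 0.0
    for x, y in train_loader:
        epoch_loss += robust_training_step(model, x, y, optimizer,
                                           epsilon=8 / 255, alpha=2 / 255, steps=7)
    print(f"epoch {epoch}: mean robust loss {epoch_loss / len(train_loader):.4f}")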

2. Certified Defense Evaluation

Evaluate the effectiveness of certified defenses against various attack types.

def evaluate_certified_defense(model, test_loader, attack_radius=0.1):
    """
    Evaluate model's certified robustness
    """
    model.eval()
    total_samples = 0
    certified_correct = 0
    clean_correct = 0
    
    with torch.no_grad():
        for x, y in test_loader:
            # Clean accuracy
            clean_pred = model(x).argmax(dim=1)
            clean_correct += (clean_pred == y).sum().item()
            
            # Certified accuracy
            confidence, radius = randomized_smoothing(model, x)
            certified_pred = confidence.argmax(dim=1)
            
            # Only count as certified if radius >= attack_radius
            is_certified = radius >= attack_radius
            certified_correct += ((certified_pred == y) & is_certified).sum().item()
            
            total_samples += x.size(0)
    
    clean_acc = clean_correct / total_samples
    certified_acc = certified_correct / total_samples
    
    return {
        'clean_accuracy': clean_acc,
        'certified_accuracy': certified_acc,
        'certification_rate': certified_acc / clean_acc if clean_acc > 0 else 0
    }
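
A hypothetical invocation of this evaluation routine; test_loader is a placeholder DataLoader.

# Hypothetical invocation; test_loader is assumed to yield (x, y) batches
results = evaluate_certified_defense(model, test_loader, attack_radius=0.1)
print(f"clean accuracy:     {results['clean_accuracy']:.3f}")
print(f"certified accuracy: {results['certified_accuracy']:.3f}")
print(f"certification rate: {results['certification_rate']:.3f}")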

šŸ“Š Evaluation Metrics

Certified Accuracy

Percentage of test samples that are both correctly classified and certified robust

Certified Accuracy = (Certified Correct Predictions) / (Total Samples)

Certification Rate

Percentage of correctly classified samples that are also certified

Certification Rate = Certified Accuracy / Clean Accuracy

Average Certified Radius

Mean certified radius across all test samples

Avg Radius = (Σ certified_radius_i) / (Total Certified Samples)

Robustness-Accuracy Trade-off

Analysis of the relationship between robustness and clean accuracy: improving certified accuracy typically costs clean accuracy, so the trade-off is usually reported as a curve of certified accuracy versus perturbation radius for several noise levels σ.
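
The sketch below (illustrative, not from the source) shows how these metrics might be computed from per-sample results; the tensor names and layout are assumptions.

import torch

def summarize_certification(correct, certified, radii):
    """
    correct:   (N,) bool tensor, clean prediction matches the label
    certified: (N,) bool tensor, correct AND certified at the target radius
    radii:     (N,) float tensor of per-sample certified radii
    """
    clean_acc = correct.float().mean().item()
    certified_acc = certified.float().mean().item()
    return {
        "clean_accuracy": clean_acc,
        "certified_accuracy": certified_acc,
        "certification_rate": certified_acc / clean_acc if clean_acc > 0 else 0.0,
        # Average radius over certified samples only, matching the definition above
        "avg_certified_radius": radii[certified].mean().item() if certified.any() else 0.0,
    }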

šŸ› ļø Tools & Libraries

Research Libraries

Certification Tools

  • ERAN - Neural network verification
  • nnenum - Neural network verification via star-set reachability
  • DiffAI - Certified training with differentiable abstract interpretation

šŸ“ˆ Advanced Techniques

CROWN-IBP Training

Combines cheap interval bound propagation (IBP) bounds with tighter CROWN linear-relaxation bounds, enabling certified training that scales to larger networks.

def crown_ibp_loss(model, x, y, epsilon=0.1, beta=0.5, kappa=0.5):
    """
    Certified training loss in the spirit of CROWN-IBP (Zhang et al., 2020).
    Assumes helper functions ibp_bounds / crown_bounds that return per-class
    lower and upper logit bounds for inputs in an L-infinity ball of radius epsilon.
    beta interpolates the two bound types and kappa mixes natural and robust loss;
    both are typically annealed during training.
    """
    # Cheap interval bounds and tighter linear-relaxation (CROWN) bounds
    ibp_lb, ibp_ub = ibp_bounds(model, x, epsilon=epsilon)
    crown_lb, crown_ub = crown_bounds(model, x, epsilon=epsilon)

    # CROWN-IBP interpolates between the two bounds
    lb = beta * crown_lb + (1 - beta) * ibp_lb
    ub = beta * crown_ub + (1 - beta) * ibp_ub

    # Worst-case logits: lower bound for the true class, upper bound for the rest
    worst_logits = ub.clone()
    worst_logits.scatter_(1, y.unsqueeze(1), lb.gather(1, y.unsqueeze(1)))

    # Mix the natural cross-entropy with the certified (worst-case) cross-entropy
    natural_loss = F.cross_entropy(model(x), y)
    robust_loss = F.cross_entropy(worst_logits, y)
    return kappa * natural_loss + (1 - kappa) * robust_loss

Semidefinite Relaxation

Uses semidefinite programming to obtain tighter robustness certificates than interval or linear relaxations; the classic formulation (Raghunathan et al., 2018) targets one-hidden-layer ReLU networks.

import cvxpy as cp

def semidefinite_certificate(W1, b1, W2, b2, x0, epsilon, true_class, other_class):
    """
    Illustrative sketch of an SDP relaxation in the spirit of Raghunathan et al. (2018)
    for a one-hidden-layer ReLU network f(x) = W2 @ relu(W1 @ x + b1) + b2.
    Weights and x0 are NumPy arrays. Returns an upper bound on the worst-case logit gap
    (other_class minus true_class) over the L-infinity ball of radius epsilon around x0;
    a negative value certifies robustness for this class pair.
    """
    d, m = x0.size, b1.size
    l, u = x0 - epsilon, x0 + epsilon                 # input box constraints

    # Lifted moment matrix M, a relaxation of [1; x; z] @ [1; x; z].T
    M = cp.Variable((1 + d + m, 1 + d + m), symmetric=True)
    x = M[0, 1:1 + d]                                 # perturbed input
    z = M[0, 1 + d:]                                  # hidden ReLU activations
    X = M[1:1 + d, 1:1 + d]                           # x x^T block
    P = M[1:1 + d, 1 + d:]                            # x z^T block
    Z = M[1 + d:, 1 + d:]                             # z z^T block

    constraints = [
        M >> 0,                                       # positive semidefinite lifting
        M[0, 0] == 1,
        z >= 0,                                       # ReLU output is non-negative
        z >= W1 @ x + b1,                             # ReLU output dominates pre-activation
        # ReLU complementarity z * (z - W1 x - b1) = 0, written on the lifted blocks
        cp.diag(Z) == cp.diag(W1 @ P) + cp.multiply(b1, z),
        # x stays inside the box [x0 - eps, x0 + eps]
        cp.diag(X) <= cp.multiply(l + u, x) - l * u,
    ]

    # Objective: worst-case logit gap between the attacked class and the true class
    c = W2[other_class] - W2[true_class]
    c0 = b2[other_class] - b2[true_class]
    problem = cp.Problem(cp.Maximize(c @ z + c0), constraints)
    problem.solve()
    return problem.value

āš ļø Limitations & Challenges

Computational Complexity

  • Certification overhead: Computing certificates can be computationally expensive
  • Scalability issues: Methods may not scale to large models or datasets
  • Memory requirements: Some techniques require significant memory

Certification Gaps

  • Conservative bounds: Certificates may be overly conservative
  • Limited perturbation sets: Most methods assume L∞ or L2 perturbations
  • Architecture constraints: Some defenses require specific model architectures

Practical Considerations

  • Performance trade-offs: Robustness often comes at the cost of clean accuracy
  • Hyperparameter sensitivity: Methods require careful tuning
  • Evaluation challenges: Comparing different defense methods can be difficult

šŸŽÆ Hands-on Exercise

Exercise: Implement Randomized Smoothing

Build a certified defense using randomized smoothing and evaluate its robustness.

Tasks:

  1. Implement randomized smoothing inference
  2. Calculate certified radii for test samples
  3. Compare certified accuracy vs. clean accuracy
  4. Analyze the robustness-accuracy trade-off

Expected Outcomes:

  • Understanding of certification principles
  • Experience with robustness evaluation
  • Insight into practical limitations

šŸ’» Starter Code

# TODO: Implement randomized smoothing
def randomized_smoothing(model, x, num_samples=1000, noise_std=0.25):
    # Your implementation here
    pass

# TODO: Evaluate certified robustness
def evaluate_certified_robustness(model, test_loader):
    # Your implementation here
    pass