Neural Network Research Articles

Stanford-level tutorials covering the mathematical foundations of deep learning, from optimization theory to generative models and scaling laws.

Optimization Geometry and Convergence

Explore PL inequalities, linear convergence rates, and the geometry of deep network optimization landscapes. Covers SGD dynamics, saddle avoidance, and practical implications for deep learning; the core PL result is sketched below.

Level: Stanford Post-Grad
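For orientation, here is a minimal statement of the Polyak-Łojasiewicz (PL) condition and the linear rate it implies for gradient descent. This is the standard result in generic notation, offered as a sketch rather than the article's exact formulation.

```latex
% PL condition: f is mu-PL if, for all x (f^* the global minimum value),
\[
  \tfrac{1}{2}\,\lVert \nabla f(x) \rVert^2 \;\ge\; \mu \bigl( f(x) - f^* \bigr).
\]
% For L-smooth f, gradient descent with step size 1/L then converges linearly:
\[
  f(x_{k+1}) - f^* \;\le\; \Bigl( 1 - \tfrac{\mu}{L} \Bigr) \bigl( f(x_k) - f^* \bigr).
\]
```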

Approximation, Generalization, and Scaling

Barron-space approximation theory, Rademacher bounds, and compute-optimal scaling laws for neural networks. Explains why deep networks generalize and scale effectively; a common scaling-law form is sketched below.

Level: Research
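As a reference point, one widely used parametric form for compute-optimal scaling (Chinchilla-style). The constants E, A, B, α, β are fit empirically, and the compute approximation C ≈ 6ND is an assumption of that setup, not a value taken from this article.

```latex
% One common parametric fit for loss vs. parameters N and training tokens D:
\[
  L(N, D) \;=\; E \;+\; \frac{A}{N^{\alpha}} \;+\; \frac{B}{D^{\beta}}.
\]
% Minimizing under a compute budget C \approx 6ND gives power-law optima
\[
  N^*(C) \propto C^{a}, \qquad D^*(C) \propto C^{b}, \qquad a + b = 1,
\]
% with a and b both close to 0.5 in the Chinchilla-style fits.
```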

Diffusion Models and Normalizing Flows

Unified view of generative models, probability flow ODEs, and the mathematical foundations of diffusion models. Connects score matching to continuous normalizing flows; the probability flow ODE is sketched below.

Level: Advanced
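A sketch of the score-based formulation behind this unification: a forward noising SDE and its probability flow ODE, which shares the same marginal densities. This uses the generic notation of score-based generative modeling, not necessarily the article's.

```latex
% Forward (noising) SDE with drift f and diffusion coefficient g:
\[
  \mathrm{d}x \;=\; f(x, t)\,\mathrm{d}t \;+\; g(t)\,\mathrm{d}w.
\]
% The probability flow ODE shares the same marginals p_t(x) and is driven
% by the score \nabla_x \log p_t(x), linking diffusion to continuous flows:
\[
  \frac{\mathrm{d}x}{\mathrm{d}t} \;=\; f(x, t) \;-\; \tfrac{1}{2}\, g(t)^2\, \nabla_x \log p_t(x).
\]
```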

Higher-Order Methods for Deep Learning

Newton methods, cubic regularization, natural gradients, and practical approximations like K-FAC and Shampoo. When and how curvature information improves optimization; the basic update rules are sketched below.

Level: Research
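For context, the basic update rules this material builds on: the Newton step, the natural-gradient step, and the per-layer Kronecker factorization used by K-FAC. Sketched under the usual assumptions (invertible curvature matrices, layer-wise block structure).

```latex
% Newton step on loss L(theta) with Hessian H:
\[
  \theta_{k+1} \;=\; \theta_k \;-\; \eta\, H^{-1} \nabla L(\theta_k).
\]
% Natural gradient: curvature taken from the Fisher information matrix F:
\[
  \theta_{k+1} \;=\; \theta_k \;-\; \eta\, F^{-1} \nabla L(\theta_k).
\]
% K-FAC: per-layer Kronecker factorization of the Fisher block, using the
% input-activation covariance A and pre-activation-gradient covariance G,
% so the inverse also factorizes:
\[
  F_{\ell} \;\approx\; A_{\ell-1} \otimes G_{\ell}, \qquad
  F_{\ell}^{-1} \;\approx\; A_{\ell-1}^{-1} \otimes G_{\ell}^{-1}.
\]
```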

Hyperparameters, Schedules, and Stability

Learning rate scaling, warmup theorems, cosine decay, and stability analysis for large-scale training. Systematic approaches to hyperparameter design; a warmup-plus-cosine schedule is sketched in code below.

Level: Practical Theory
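To make a typical schedule concrete, a minimal Python sketch of linear warmup followed by cosine decay. The function name and all default values (base_lr, warmup_steps, total_steps, min_lr) are illustrative assumptions, not recommendations from the article.

```python
import math

def lr_schedule(step, base_lr=3e-4, warmup_steps=1_000,
                total_steps=100_000, min_lr=3e-5):
    """Linear warmup to base_lr, then cosine decay down to min_lr.

    All defaults are illustrative; principled choices are the article's topic.
    """
    if step < warmup_steps:
        # Linear warmup: ramp from near zero up to base_lr over warmup_steps.
        return base_lr * (step + 1) / warmup_steps
    # Cosine decay from base_lr to min_lr over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

# Example: learning rate at a few points in training.
for s in (0, 500, 1_000, 50_000, 100_000):
    print(s, round(lr_schedule(s), 6))
```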

Topics Covered

🎯 Optimization Theory: PL inequalities, convergence rates, saddle escape
📊 Generalization: Barron space, Rademacher bounds, scaling laws
🎨 Generative Models: diffusion, flows, score matching
Optimization Methods: Newton, natural gradients, K-FAC
🔧 Training Dynamics: hyperparameters, schedules, stability

Reading Order

These articles are designed to be read in sequence, building from fundamental optimization theory through to advanced topics in generative models and training dynamics.

Start with Optimization Theory, or explore any article based on your interests.