Karpathy Guidelines
Implement Andrej Karpathy's neural network training guidelines through automated pipeline integration
Karpathy Guidelines is a community skill for implementing machine learning engineering best practices inspired by Andrej Karpathy, covering systematic debugging, training pipeline verification, and practical approaches to neural network development.
What Is This?
Overview
Karpathy Guidelines provides structured approaches to neural network training and debugging based on practical ML engineering principles. It covers the progressive development methodology of starting simple and adding complexity incrementally, systematic overfitting verification, learning rate finding, loss curve analysis, and common training failure diagnostics. The skill translates experienced ML intuition into repeatable checklists that reduce debugging time.
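The learning-rate-finding step mentioned above can be sketched framework-free: sweep the learning rate geometrically, take one gradient step per rate from the same starting point, and note where the loss starts to blow up. The toy quadratic objective and the function names below (`lr_range_test`, `grad_fn`) are illustrative assumptions, not part of the skill itself.

```python
import math

def lr_range_test(grad_fn, loss_fn, w0: float, lr_min: float = 1e-4,
                  lr_max: float = 10.0, steps: int = 20) -> list[tuple[float, float]]:
    """Sweep learning rates geometrically from lr_min to lr_max,
    taking one SGD step per rate and recording the resulting loss."""
    results = []
    ratio = (lr_max / lr_min) ** (1 / (steps - 1))
    for i in range(steps):
        lr = lr_min * ratio ** i
        w = w0 - lr * grad_fn(w0)  # one gradient step from the same start point
        results.append((lr, loss_fn(w)))
    return results

# Toy convex objective: loss(w) = (w - 3)^2, gradient = 2(w - 3).
curve = lr_range_test(lambda w: 2 * (w - 3), lambda w: (w - 3) ** 2, w0=0.0)
# Losses shrink for moderate rates, then exceed the starting loss once
# lr passes ~1.0 -- the divergence point the range test is meant to reveal.
```

The same sweep applies to a real model: run a few steps per candidate rate and pick a value somewhat below the point where the loss curve turns upward.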
Who Should Use This
This skill serves ML engineers debugging training pipelines that produce unexpected results, researchers setting up new training experiments who want to avoid common pitfalls, and teams establishing ML development standards that ensure consistent quality across their training workflows.
Why Use It?
Problems It Solves
Training failures are difficult to diagnose because many components interact: data preprocessing, model architecture, loss functions, optimizers, and hyperparameters. Developers often jump to complex architectures before verifying that a basic setup works correctly. Training runs that appear to progress but produce poor final results waste compute hours. Without systematic verification steps, subtle bugs in data loading or label processing go undetected until evaluation.
Core Highlights
Progressive complexity builds from the simplest possible baseline before adding layers, regularization, and architectural features. Overfit-first methodology verifies that the model can memorize a small batch before training on the full dataset. Loss curve analysis identifies common failure patterns like underfitting, overfitting, and learning rate issues. Data verification checks confirm that labels are correct, preprocessing is faithful, and batches are properly constructed.
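The loss-curve analysis highlighted above can be approximated with a few heuristics: compare the start and end of the training loss, then compare training against validation. The thresholds below are illustrative defaults I have chosen for the sketch, not calibrated constants from the skill.

```python
def diagnose_loss_curve(train_losses: list[float],
                        val_losses: list[float]) -> str:
    """Rough heuristic labels for common loss-curve failure patterns.
    Thresholds are illustrative defaults, not calibrated constants."""
    t0, t1 = train_losses[0], train_losses[-1]
    v1 = val_losses[-1]
    if t1 > t0 * 1.5:
        return "diverging: learning rate likely too high"
    if t1 > t0 * 0.9:
        return "underfitting: loss barely moves; check lr, data, capacity"
    if v1 > t1 * 2.0:
        return "overfitting: validation loss far above training loss"
    return "healthy: both losses decreasing together"

diagnose_loss_curve([2.3, 1.1, 0.4], [2.3, 1.2, 0.6])  # healthy case
diagnose_loss_curve([2.3, 0.5, 0.1], [2.3, 1.5, 1.9])  # overfitting case
```

In practice these labels are a starting point for investigation: "diverging" suggests lowering the learning rate, "underfitting" suggests checking data flow and capacity, and "overfitting" suggests more data or regularization.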
How to Use It?
Basic Usage
```python
import torch
import torch.nn as nn

class TrainingVerifier:
    def __init__(self, model: nn.Module, loss_fn, optimizer):
        self.model = model
        self.loss_fn = loss_fn
        self.optimizer = optimizer

    def verify_overfit_single_batch(self, batch_x, batch_y,
                                    max_steps: int = 100) -> dict:
        # A healthy pipeline should drive the loss near zero on one fixed batch.
        self.model.train()
        losses = []
        for step in range(max_steps):
            self.optimizer.zero_grad()
            output = self.model(batch_x)
            loss = self.loss_fn(output, batch_y)
            loss.backward()
            self.optimizer.step()
            losses.append(loss.item())
        return {
            "initial_loss": losses[0],
            "final_loss": losses[-1],
            "converged": losses[-1] < losses[0] * 0.01,
            "loss_history": losses,
        }

    def check_gradients(self, batch_x, batch_y) -> dict:
        # Collect per-parameter gradient statistics to spot vanishing,
        # exploding, or dead gradients.
        self.model.train()
        self.optimizer.zero_grad()  # clear stale gradients before the check
        output = self.model(batch_x)
        loss = self.loss_fn(output, batch_y)
        loss.backward()
        grad_stats = {}
        for name, param in self.model.named_parameters():
            if param.grad is not None:
                grad_stats[name] = {
                    "mean": param.grad.mean().item(),
                    "std": param.grad.std().item(),
                    "zero_pct": (param.grad == 0).float().mean().item(),
                }
        return grad_stats
```
Real-World Examples
```python
class TrainingChecklist:
    def __init__(self):
        self.checks: list[dict] = []

    def verify_data(self, dataset, num_samples: int = 5) -> dict:
        # Spot-check shapes and labels on a handful of raw samples.
        samples = [dataset[i] for i in range(num_samples)]
        shapes = [s[0].shape for s in samples]
        labels = [s[1] for s in samples]
        result = {
            "consistent_shapes": len(set(str(s) for s in shapes)) == 1,
            "sample_shapes": shapes,
            "label_samples": labels,
        }
        self.checks.append({"type": "data", "result": result})
        return result

    def verify_initial_loss(self, model, loss_fn, batch_x, batch_y,
                            num_classes: int) -> dict:
        # With random weights and cross-entropy, the initial loss should be
        # close to -ln(1/num_classes) = ln(num_classes).
        model.eval()
        with torch.no_grad():
            output = model(batch_x)
            loss = loss_fn(output, batch_y).item()
        expected = -torch.log(torch.tensor(1.0 / num_classes)).item()
        result = {
            "actual_loss": round(loss, 4),
            "expected_loss": round(expected, 4),
            "reasonable": abs(loss - expected) < expected * 0.5,
        }
        self.checks.append({"type": "initial_loss", "result": result})
        return result
```
Advanced Tips
Always verify that the initial loss matches the expected value for random predictions before starting training. Visualize a few training samples after all preprocessing to confirm that augmentations and normalization preserve label correctness. Use gradient norm tracking to detect vanishing or exploding gradients early in training.
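The gradient norm tracking recommended above can be isolated from any framework: given each parameter's gradient values, compute the L2 norm and flag outliers. The thresholds and the helper name `gradient_norm_report` are assumptions for this sketch; in a PyTorch loop you would feed it `p.grad.flatten().tolist()` per named parameter.

```python
import math

def gradient_norm_report(grads: dict[str, list[float]],
                         vanish_thresh: float = 1e-7,
                         explode_thresh: float = 1e3) -> dict[str, str]:
    """Flag per-parameter gradient L2 norms that look vanishing or exploding.
    Thresholds are illustrative; tune them to your model's scale."""
    report = {}
    for name, g in grads.items():
        norm = math.sqrt(sum(x * x for x in g))
        if norm < vanish_thresh:
            report[name] = f"vanishing (||g||={norm:.2e})"
        elif norm > explode_thresh:
            report[name] = f"exploding (||g||={norm:.2e})"
        else:
            report[name] = f"ok (||g||={norm:.2e})"
    return report

# Illustrative gradient values for three hypothetical parameters.
report = gradient_norm_report({
    "layer1.weight": [1e-9, -2e-9],     # near-zero gradients
    "layer2.weight": [0.3, -0.1, 0.2],  # healthy magnitude
    "head.weight": [5e3, -2e3],         # huge gradients
})
```

Logging such a report every few hundred steps makes vanishing or exploding gradients visible long before they show up as a flat or diverging loss curve.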
When to Use It?
Use Cases
Debug a training pipeline that produces unexpectedly high loss or fails to converge. Verify a new model architecture works correctly before launching expensive full training runs. Establish training verification checklists for team ML development standards.
Related Topics
Neural network debugging techniques, training loop diagnostics, learning rate scheduling, gradient flow analysis, and ML experiment reproducibility.
Important Notes
Requirements
A PyTorch or equivalent deep learning framework for model training. A small batch of training data for verification steps. Understanding of expected loss values for the chosen loss function and number of classes.
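The expected-loss requirement above has a closed form for the most common case: softmax cross-entropy over C classes gives roughly -ln(1/C) = ln(C) for random predictions. A minimal helper, assuming uniform class balance:

```python
import math

def expected_initial_ce_loss(num_classes: int) -> float:
    """Cross-entropy of a uniform prediction over num_classes: -ln(1/C) = ln(C)."""
    return math.log(num_classes)

expected_initial_ce_loss(10)    # ~2.3026 for a 10-class problem
expected_initial_ce_loss(1000)  # ~6.9078 for a 1000-class problem
```

If the measured initial loss is far from this value, suspect a bug in the output layer, the loss reduction, or the label encoding before anything else.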
Usage Recommendations
Do: run all verification checks before starting expensive training runs. Start with the simplest possible model and verify it works before adding complexity. Monitor loss curves throughout training to catch issues early.
Don't: skip the single-batch overfit test that catches fundamental pipeline bugs. Add regularization before confirming the model can overfit the training data. Trust that preprocessing is correct without visually inspecting processed samples.
Limitations
Verification checklists catch common bugs but cannot prevent all training issues. Some training problems only manifest at scale with the full dataset and many epochs. The guidelines focus on supervised learning patterns and may not apply directly to reinforcement learning or generative model training.