Phase 01 · Foundation

The Karpathy Foundation: Neural Networks from First Principles

Before touching a single API call or framework, build deep intuition for how neural networks actually work. Most GenAI engineers skip this phase and build on sand. Don't.

Why this matters: When your RAG pipeline hallucinates or your fine-tuned model degrades in production, understanding gradient flow and loss surfaces is what separates you from an engineer who just calls APIs.
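As a taste of the kind of first-principles exercise this phase is about, here is a minimal gradient check: compute dy/dw1 for a tiny two-layer scalar "network" by hand with the chain rule, then confirm it against a finite-difference estimate. All names and values here are illustrative, not from any particular course material.

```python
import math

def forward(x, w1, w2):
    h = math.tanh(w1 * x)   # hidden activation
    return w2 * h

def grad_w1(x, w1, w2):
    # Chain rule by hand: dy/dw1 = w2 * (1 - tanh(w1*x)^2) * x
    h = math.tanh(w1 * x)
    return w2 * (1 - h * h) * x

x, w1, w2 = 0.5, 1.2, -0.8
analytic = grad_w1(x, w1, w2)
eps = 1e-6
# Central finite difference as an independent check on the derivation
numeric = (forward(x, w1 + eps, w2) - forward(x, w1 - eps, w2)) / (2 * eps)
print(abs(analytic - numeric) < 1e-6)  # prints True: the two agree
```

If the analytic and numeric gradients disagree, the chain-rule derivation is wrong — exactly the kind of debugging instinct that transfers to diagnosing real training runs.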
Phase 02 · Architecture


The Transformer Revolution: Build a GPT from Scratch

Master the architecture that underlies every modern LLM. Build a Transformer from scratch in code, then study how scale creates emergent capabilities that nobody fully predicted.
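The core computation you build first is scaled dot-product attention, softmax(QKᵀ/√d)V. A minimal NumPy sketch (shapes and values are illustrative):

```python
import numpy as np

def attention(Q, K, V):
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                         # (T, T) query-key similarities
    scores = scores - scores.max(axis=-1, keepdims=True)  # subtract max for stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                                    # weighted average of values

T, d = 4, 8                                # 4 tokens, 8-dim head (illustrative sizes)
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((T, d)) for _ in range(3))
out = attention(Q, K, V)
print(out.shape)  # (4, 8): one mixed representation per token
```

Everything else in the Transformer — multiple heads, causal masking, MLP blocks, residual connections — is scaffolding around this one operation.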

The key insight: Understanding scaling laws helps you reason about when fine-tuning is worth the investment versus when few-shot prompting is good enough — a decision with real cost implications.
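That cost reasoning can be made concrete with a toy break-even calculation. Every number below is a made-up assumption for illustration, not a real price:

```python
# Toy break-even point for fine-tuning vs. few-shot prompting.
# All figures are assumed for illustration only.
finetune_cost = 500.0        # one-off tuning cost in dollars (assumed)
fewshot_extra_tokens = 1500  # extra prompt tokens per request vs. a tuned model (assumed)
price_per_1k_tokens = 0.002  # input-token price in dollars (assumed)

extra_cost_per_request = fewshot_extra_tokens / 1000 * price_per_1k_tokens
breakeven_requests = finetune_cost / extra_cost_per_request
print(round(breakeven_requests))  # 166667 requests before tuning pays off
```

Under these assumptions, fine-tuning only wins on cost past roughly 167k requests — which is why low-volume use cases usually stay with prompting.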
Phase 03 · Alignment


Making Models Useful: RLHF, LoRA & Instruction Tuning

Raw language models predict tokens — they don't follow instructions. This phase covers the alignment techniques that turn pre-trained models into assistants that do what you actually want.

Practitioner's note: for narrow, well-defined enterprise tasks, LoRA fine-tuning on domain-specific data is often cheaper and more effective than prompting an off-the-shelf model. Know when to fine-tune — and when not to.
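The LoRA idea itself fits in a few lines: freeze the pre-trained weight W and learn only a low-rank delta B·A, so the effective weight is W + (α/r)·B·A. A minimal numerical sketch (shapes and values are illustrative):

```python
import numpy as np

d_out, d_in, r, alpha = 6, 4, 2, 4       # r << min(d_out, d_in) is the point
rng = np.random.default_rng(1)

W = rng.standard_normal((d_out, d_in))   # frozen pre-trained weight
A = rng.standard_normal((r, d_in))       # trainable low-rank factor
B = np.zeros((d_out, r))                 # trainable; zero init so the delta starts at 0

def lora_forward(x):
    # Effective weight = frozen W plus a scaled low-rank update
    return (W + (alpha / r) * B @ A) @ x

x = rng.standard_normal(d_in)
# With B = 0, the adapted model reproduces the frozen model exactly —
# training then only has to move the (d_out*r + r*d_in) adapter parameters.
print(np.allclose(lora_forward(x), W @ x))  # prints True
```

Here the adapter has 6·2 + 2·4 = 20 trainable parameters versus 24 in W; at realistic layer sizes the ratio is a fraction of a percent, which is where the cost savings come from.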
Phase 04 · Production


Shipping GenAI: RAG, Agents, Evaluation & Observability

Most curricula stop at theory here. This phase is entirely about what it actually takes to run LLM systems reliably, at scale, with real users — and what breaks first.

The production reality: Any team can ship a demo in a weekend. Shipping a system that's measurably reliable requires robust evals — covering faithfulness, relevance, and groundedness. Evaluation is the hardest part. Start here, not at the end.
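To make "groundedness" concrete, here is a deliberately toy automated check: the fraction of an answer's words that appear in the retrieved context. Production eval suites use far stronger methods (NLI models, LLM-as-judge); this only illustrates the shape of a check you can run on every response. All strings are invented examples.

```python
def groundedness(answer: str, context: str) -> float:
    # Fraction of answer words that also appear in the retrieved context.
    answer_words = {w.lower().strip(".,") for w in answer.split()}
    context_words = {w.lower().strip(".,") for w in context.split()}
    if not answer_words:
        return 0.0
    return len(answer_words & context_words) / len(answer_words)

context = "The invoice total was 420 euros, due on March 3."
good = "The invoice total was 420 euros."
bad = "The invoice total was 9000 dollars, paid in January."
print(groundedness(good, context) > groundedness(bad, context))  # prints True
```

Even a crude proxy like this, thresholded and tracked over time, catches regressions that eyeballing a demo never will — which is the argument for building evals first.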