The Karpathy Foundation: Neural Networks from First Principles
Before touching a single API call or framework, build deep intuition for how neural networks actually work. Most GenAI engineers skip this phase and build on sand. Don't.
Why this matters: When your RAG pipeline hallucinates or your fine-tuned model degrades in production, understanding gradient flow and loss surfaces is what separates you from an engineer who just calls APIs.
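To make "understanding gradient flow" concrete, here is a minimal scalar autograd sketch in the spirit of micrograd. All class and method names are illustrative, not from any library; it supports just `+` and `*`, enough to backpropagate through a tiny expression and watch the chain rule at work.

```python
class Value:
    """A scalar that records the ops applied to it, for reverse-mode autodiff."""
    def __init__(self, data, _children=()):
        self.data = data
        self.grad = 0.0
        self._backward = lambda: None
        self._prev = set(_children)

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def _backward():
            self.grad += out.grad   # d(a+b)/da = 1
            other.grad += out.grad  # d(a+b)/db = 1
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():
            self.grad += other.data * out.grad  # d(a*b)/da = b
            other.grad += self.data * out.grad  # d(a*b)/db = a
        out._backward = _backward
        return out

    def backward(self):
        # Topologically sort the graph, then apply the chain rule in reverse.
        topo, visited = [], set()
        def build(v):
            if v not in visited:
                visited.add(v)
                for child in v._prev:
                    build(child)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            v._backward()

# Gradient of L = (w*x + b)^2 at w=2, x=3, b=1:
w, x, b = Value(2.0), Value(3.0), Value(1.0)
y = w * x + b    # y = 7
loss = y * y     # L = 49
loss.backward()
print(w.grad)    # dL/dw = 2*y*x = 42
```

Stepping through this by hand, node by node, is the fastest way to internalize what a framework's `.backward()` call is actually doing.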
The Transformer Revolution: Build a GPT from Scratch
Master the architecture that underlies every modern LLM. Build a Transformer from scratch in code, then study how scale creates emergent capabilities that nobody fully predicted.
The key insight: Understanding scaling laws helps you reason about when fine-tuning is worth the investment versus when few-shot prompting is good enough — a decision with real cost implications.
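The core computation to master here is scaled dot-product self-attention. A minimal single-head sketch in NumPy (shapes and names are illustrative, not tied to any particular GPT implementation):

```python
import numpy as np

def self_attention(x, Wq, Wk, Wv, causal=True):
    """x: (T, d) token embeddings; Wq/Wk/Wv: (d, d_head) projection matrices."""
    Q, K, V = x @ Wq, x @ Wk, x @ Wv              # each (T, d_head)
    scores = Q @ K.T / np.sqrt(K.shape[-1])       # scaled dot-product, (T, T)
    if causal:
        # Mask future positions so token t attends only to tokens <= t.
        mask = np.triu(np.ones_like(scores, dtype=bool), k=1)
        scores = np.where(mask, -1e9, scores)
    # Numerically stable softmax over the key dimension.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                            # (T, d_head)

rng = np.random.default_rng(0)
T, d, d_head = 4, 8, 8
x = rng.normal(size=(T, d))
out = self_attention(x, *(rng.normal(size=(d, d_head)) for _ in range(3)))
print(out.shape)  # (4, 8)
```

Once this is second nature, the full Transformer is just this block repeated, wrapped with residual connections, layer norm, and an MLP.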
Making Models Useful: RLHF, LoRA & Instruction Tuning
Raw language models predict tokens — they don't follow instructions. This phase covers the alignment techniques that turn pre-trained models into assistants that do what you actually want.
Practitioner's note: For narrow enterprise tasks, LoRA fine-tuning on domain-specific data is often cheaper and more effective than prompting an off-the-shelf model. Know when to fine-tune — and when not to.
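The mechanics behind LoRA's cost advantage fit in a few lines. In this NumPy sketch (names and shapes are illustrative), the frozen weight W is augmented with a low-rank update B @ A, so only r * (d_in + d_out) parameters are trained instead of d_out * d_in:

```python
import numpy as np

class LoRALinear:
    """Frozen linear layer plus a trainable low-rank adapter (illustrative)."""
    def __init__(self, W, r=4, alpha=8, seed=0):
        rng = np.random.default_rng(seed)
        self.W = W                                       # frozen, (d_out, d_in)
        d_out, d_in = W.shape
        self.A = rng.normal(scale=0.01, size=(r, d_in))  # trainable
        self.B = np.zeros((d_out, r))                    # trainable, zero-init
        self.scale = alpha / r

    def __call__(self, x):
        # Frozen path plus scaled low-rank path: x @ (W + scale * B @ A)^T
        return x @ self.W.T + self.scale * (x @ self.A.T @ self.B.T)

d_in = d_out = 16
layer = LoRALinear(np.eye(d_out, d_in), r=2)
x = np.ones((1, d_in))
y = layer(x)
# With B zero-initialized, the adapter starts as a no-op:
print(np.allclose(y, x @ layer.W.T))  # True
```

The zero-initialized B is the standard trick: training starts exactly at the pre-trained model's behavior, and only the small A and B matrices ever receive gradients.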
Shipping GenAI: RAG, Agents, Evaluation & Observability
Most curricula stop here, at the theory. This phase is entirely about what it actually takes to run LLM systems reliably, at scale, with real users — and what breaks first.
The production reality: Any team can ship a demo in a weekend. Shipping a system that's measurably reliable requires robust evals — covering faithfulness, relevance, and groundedness. Evaluation is the hardest part. Start here, not at the end.
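To make the eval mindset concrete, here is a toy groundedness check: score how much of a generated answer is supported by the retrieved context via token overlap. This proxy is purely illustrative — production eval suites typically use LLM judges or NLI models for faithfulness and groundedness — but it shows the shape of an automated check you can run on every release.

```python
def groundedness(answer: str, context: str) -> float:
    """Crude proxy: fraction of answer tokens that appear in the context."""
    answer_tokens = set(answer.lower().split())
    context_tokens = set(context.lower().split())
    if not answer_tokens:
        return 0.0
    return len(answer_tokens & context_tokens) / len(answer_tokens)

context = "the eiffel tower is 330 metres tall and located in paris"
grounded = "the eiffel tower is 330 metres tall"
hallucinated = "the eiffel tower was painted green in 1889"

print(groundedness(grounded, context))      # 1.0 — fully supported
print(groundedness(hallucinated, context))  # lower — unsupported claims
```

Even a crude scorer like this, wired into CI over a fixed test set of question/context/answer triples, catches regressions that manual spot-checking misses.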