Master AI Engineering
A structured, 4-phase roadmap from zero to senior AI engineer. Follow the phases in order — each one builds on the last.
LLM Foundations
What is AI, LLM, and Transformers?
The absolute basics: understanding Artificial Intelligence, Large Language Models, and the architecture that powers them.
The LLM Lifecycle: Base vs Instruct Models
Understand the pipeline from raw internet text to a helpful AI assistant.
How Transformer Models Work & KV Cache
Deep dive into KV Caching, RoPE, and core LLM architecture mechanics for system builders.
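The core idea behind KV caching can be shown in a few lines: each generated token's key/value projections are computed once and reused at every later step, so decoding does O(n) projections instead of O(n²). A toy sketch (the names and the fake `project_kv` projection are illustrative, not a real framework API):

```python
# Toy illustration of KV caching: at each decode step, only the new
# token's key/value pair is computed; earlier pairs are read from the cache.

def project_kv(token_id):
    # Stand-in for the real K/V projection matrices of one layer.
    return (token_id * 2, token_id * 3)  # (key, value)

def decode(tokens):
    cache = []            # grows by one (K, V) pair per token
    kv_computations = 0
    for t in tokens:
        cache.append(project_kv(t))   # compute K/V once per token
        kv_computations += 1
        # attention would read *all* of `cache` here, never recompute it
    return cache, kv_computations

cache, n = decode([5, 7, 9])
# Without a cache, step i would redo K/V for all i previous tokens:
# 1 + 2 + 3 = 6 projections instead of 3.
```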
Tokenization & Embeddings
How computers represent language numerically.
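The two steps, text to token IDs to embedding vectors, fit in a tiny sketch. The vocabulary and vectors below are invented; a real model learns both, and real tokenizers split on subwords rather than whitespace:

```python
# Minimal sketch: text -> token ids -> embedding vectors.
vocab = {"the": 0, "cat": 1, "sat": 2, "<unk>": 3}
embeddings = {
    0: [0.1, 0.3],
    1: [0.9, 0.2],
    2: [0.4, 0.8],
    3: [0.0, 0.0],
}

def tokenize(text):
    # Whitespace split for illustration; real tokenizers use subword merges.
    return [vocab.get(word, vocab["<unk>"]) for word in text.lower().split()]

def embed(token_ids):
    return [embeddings[i] for i in token_ids]

ids = tokenize("The cat sat")   # [0, 1, 2]
vectors = embed(ids)            # one learned vector per token
```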
Fine-tuning vs. RAG
When to train your model and when to retrieve context.
The 3 Layers of AI: From Chatbots to Agents
Understand the evolution from standard Generative AI to Code Agents and fully autonomous Agentic AI.
Advanced Prompt Engineering
Building AI Agents
Tool Use (Function Calling)
Enabling LLMs to call external APIs under strict schema enforcement.
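The host-side half of the loop can be sketched in plain Python: the model emits a JSON tool call, and the runtime validates it against the declared schema before dispatching. The `get_weather` tool and its schema are hypothetical, invented for this sketch:

```python
import json

# Hypothetical tool registry: schema + implementation per tool.
TOOLS = {
    "get_weather": {
        "parameters": {"city": str},
        "fn": lambda city: f"18C and cloudy in {city}",
    }
}

def dispatch(model_output: str) -> str:
    call = json.loads(model_output)             # model emits structured JSON
    tool = TOOLS[call["name"]]                  # unknown tool -> KeyError
    args = call["arguments"]
    for key, typ in tool["parameters"].items(): # strict schema enforcement
        if not isinstance(args.get(key), typ):
            raise TypeError(f"bad argument {key!r}")
    return tool["fn"](**args)

result = dispatch('{"name": "get_weather", "arguments": {"city": "Oslo"}}')
```

Real APIs (OpenAI, Anthropic) express the schema as JSON Schema in the request, but the validate-then-dispatch shape is the same.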
MCP (Model Context Protocol)
An open standard, often described as the "USB-C of AI", for connecting agents to tools, data, and APIs.
Multi-Agent Architectures
Supervisor, Pipeline, Debate & Swarm patterns with architecture diagrams.
LangGraph & State Machines
Building cyclical, stateful agent workflows with graph-based state machines.
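The "cyclical, stateful" idea is independent of the library: nodes transform a shared state dict, and edges can loop back until a condition holds. A pure-Python analogue (not the actual LangGraph API; node names are invented):

```python
# Pure-Python analogue of a graph-based agent workflow: each node mutates
# shared state and names the next node; `review` loops back to `draft`
# until the text is long enough.

def draft(state):
    state["text"] = state.get("text", "") + "x"
    return "review"

def review(state):
    return "done" if len(state["text"]) >= 3 else "draft"

NODES = {"draft": draft, "review": review}

def run(state, entry="draft"):
    node = entry
    while node != "done":
        node = NODES[node](state)
    return state

final = run({})   # draft -> review -> draft -> review ... -> done
```

LangGraph adds typed state, persistence, and streaming on top, but the control flow is this loop.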
A2A (Agent-to-Agent Protocol)
Google's open protocol for agents to discover, communicate, and collaborate across platforms.
AI Engineering Stack
Advanced RAG Engineering
LLM Inference Engineering
vLLM & PagedAttention
Self-hosting open-source LLMs, continuous batching, and CUDA memory management.
Quantization: GGUF, AWQ & EXL2
Model compression formats, precision trade-offs, and running 70B models on consumer hardware.
Prompt Caching & Speculative Decoding
Anthropic/Gemini prefix caching (up to ~90% cost reduction on cached tokens) and speculative token generation.
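Speculative decoding in miniature: a cheap draft model proposes a run of tokens, the expensive target model verifies them all in one pass, and the first mismatch truncates the run. Both "models" below are canned sequences, purely for illustration:

```python
# Toy speculative decoding with greedy verification.

def draft_model(prefix, k=4):
    guesses = ["the", "cat", "sat", "down"]      # cheap, sometimes wrong
    return guesses[:k]

def target_model(prefix, proposed):
    truth = ["the", "cat", "sat", "up"]          # what the big model wants
    accepted = []
    for guess, want in zip(proposed, truth[len(prefix):]):
        if guess != want:
            accepted.append(want)                # fix first mismatch, stop
            break
        accepted.append(guess)
    return accepted

prefix = []
accepted = target_model(prefix, draft_model(prefix))
# 3 tokens accepted + 1 corrected in a single target pass, instead of
# 4 sequential target-model passes
```

The speedup comes from the target model scoring the whole proposed run in parallel; sampling-based variants accept/reject probabilistically rather than by exact match.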
Fine-Tuning & Model Alignment
QLoRA & Parameter-Efficient Fine-Tuning
Low-rank adaptation, 4-bit quantization training, and Hugging Face PEFT.
DPO & RLHF: Aligning LLMs to Human Preferences
Direct Preference Optimization, reward modeling, and rejection sampling.
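The DPO objective itself is a one-liner: push the policy's log-ratio for the chosen response above the rejected one, relative to a frozen reference model, scaled by beta. A numeric sketch with made-up log-probabilities (not outputs of a real model):

```python
import math

def dpo_loss(logp_chosen, logp_rejected, ref_chosen, ref_rejected, beta=0.1):
    # margin = beta * [(pi/ref log-ratio, chosen) - (pi/ref log-ratio, rejected)]
    margin = beta * ((logp_chosen - ref_chosen) - (logp_rejected - ref_rejected))
    return -math.log(1 / (1 + math.exp(-margin)))  # -log(sigmoid(margin))

# Policy prefers the chosen answer more than the reference does -> low loss.
low = dpo_loss(-10.0, -14.0, ref_chosen=-12.0, ref_rejected=-12.0)
# Policy prefers the rejected answer -> high loss, gradient pushes back.
high = dpo_loss(-14.0, -10.0, ref_chosen=-12.0, ref_rejected=-12.0)
```

In practice these log-probs come from summing per-token log-probabilities of each response under the policy and the frozen reference.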
Synthetic Data Generation at Scale
Teacher-student distillation, Evol-Instruct, and building proprietary training datasets.
Context & Memory Management
Context Window Physics & the "Lost in the Middle" Problem
Why 1M-token windows are a trap: attention degradation and needle-in-a-haystack retrieval failures.
Agentic Memory Architecture
Short-term buffer memory vs Long-term graph memory (Mem0, Zep).
Context Distillation & KV Compression
Compressing prompts mathematically and semantically to stay within KV-cache VRAM budgets.