Master AI Engineering
A structured, 4-phase roadmap from zero to senior AI engineer. Follow the phases in order — each one builds on the last.
LLM Foundations
What is AI, LLM, and Transformers?
The absolute basics: understanding Artificial Intelligence, Large Language Models, and the architecture that powers them.
The LLM Lifecycle: Base vs Instruct Models
Understand the pipeline from raw internet text to a helpful AI assistant.
How Transformer Models Work & KV Cache
Deep dive into KV Caching, RoPE, and core LLM architecture mechanics for system builders.
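The core idea behind KV caching can be shown in a few lines: each generated token's key/value projections are computed once and reused at every later step, so decoding does O(n) projections instead of O(n²). A toy sketch (the names and the fake `project_kv` projection are illustrative, not a real framework API):

```python
# Toy illustration of KV caching: at each decode step, only the new
# token's key/value pair is computed; earlier pairs are read from the cache.

def project_kv(token_id):
    # Stand-in for the real K/V projection matrices of one layer.
    return (token_id * 2, token_id * 3)  # (key, value)

def decode(tokens):
    cache = []            # grows by one (K, V) pair per token
    kv_computations = 0
    for t in tokens:
        cache.append(project_kv(t))   # compute K/V once per token
        kv_computations += 1
        # attention would read *all* of `cache` here, never recompute it
    return cache, kv_computations

cache, n = decode([5, 7, 9])
# Without a cache, step i would redo K/V for all i previous tokens:
# 1 + 2 + 3 = 6 projections instead of 3.
```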
Tokenization & Embeddings
How computers represent language numerically.
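The two steps, text to token IDs to embedding vectors, fit in a tiny sketch. The vocabulary and vectors below are invented; a real model learns both, and real tokenizers split on subwords rather than whitespace:

```python
# Minimal sketch: text -> token ids -> embedding vectors.
vocab = {"the": 0, "cat": 1, "sat": 2, "<unk>": 3}
embeddings = {
    0: [0.1, 0.3],
    1: [0.9, 0.2],
    2: [0.4, 0.8],
    3: [0.0, 0.0],
}

def tokenize(text):
    # Whitespace split for illustration; real tokenizers use subword merges.
    return [vocab.get(word, vocab["<unk>"]) for word in text.lower().split()]

def embed(token_ids):
    return [embeddings[i] for i in token_ids]

ids = tokenize("The cat sat")   # [0, 1, 2]
vectors = embed(ids)            # one learned vector per token
```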
Fine-tuning vs. RAG
When to train your model and when to retrieve context.
The 3 Layers of AI: From Chatbots to Agents
Understand the evolution from standard Generative AI to Code Agents and fully autonomous Agentic AI.
Advanced Prompt Engineering
Building AI Agents
Tool Use (Function Calling)
Enabling LLMs to call external APIs under strict schema enforcement.
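The host-side half of the loop can be sketched in plain Python: the model emits a JSON tool call, and the runtime validates it against the declared schema before dispatching. The `get_weather` tool and its schema are hypothetical, invented for this sketch:

```python
import json

# Hypothetical tool registry: schema + implementation per tool.
TOOLS = {
    "get_weather": {
        "parameters": {"city": str},
        "fn": lambda city: f"18C and cloudy in {city}",
    }
}

def dispatch(model_output: str) -> str:
    call = json.loads(model_output)             # model emits structured JSON
    tool = TOOLS[call["name"]]                  # unknown tool -> KeyError
    args = call["arguments"]
    for key, typ in tool["parameters"].items(): # strict schema enforcement
        if not isinstance(args.get(key), typ):
            raise TypeError(f"bad argument {key!r}")
    return tool["fn"](**args)

result = dispatch('{"name": "get_weather", "arguments": {"city": "Oslo"}}')
```

Real APIs (OpenAI, Anthropic) express the schema as JSON Schema in the request, but the validate-then-dispatch shape is the same.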
MCP (Model Context Protocol)
An open standard, often described as the "USB-C of AI", for connecting agents to tools, data, and APIs.
Multi-Agent Architectures
Supervisor, Pipeline, Debate & Swarm patterns with architecture diagrams.
LangGraph & State Machines
Building cyclical, stateful agent workflows with graph-based state machines.
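The "cyclical, stateful" idea is independent of the library: nodes transform a shared state dict, and edges can loop back until a condition holds. A pure-Python analogue (not the actual LangGraph API; node names are invented):

```python
# Pure-Python analogue of a graph-based agent workflow: each node mutates
# shared state and names the next node; `review` loops back to `draft`
# until the text is long enough.

def draft(state):
    state["text"] = state.get("text", "") + "x"
    return "review"

def review(state):
    return "done" if len(state["text"]) >= 3 else "draft"

NODES = {"draft": draft, "review": review}

def run(state, entry="draft"):
    node = entry
    while node != "done":
        node = NODES[node](state)
    return state

final = run({})   # draft -> review -> draft -> review ... -> done
```

LangGraph adds typed state, persistence, and streaming on top, but the control flow is this loop.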
A2A (Agent-to-Agent Protocol)
Google's open protocol for agents to discover, communicate, and collaborate across platforms.
AI Engineering Stack
Advanced RAG Engineering
LLM Inference Engineering
vLLM & PagedAttention
Self-hosting open-source LLMs, continuous batching, and CUDA memory management.
Quantization: GGUF, AWQ & EXL2
Model compression formats, precision trade-offs, and running 70B models on consumer hardware.
Prompt Caching & Speculative Decoding
Anthropic/Gemini prefix caching (up to ~90% cost reduction on cached tokens) and speculative token generation.
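Speculative decoding in miniature: a cheap draft model proposes a run of tokens, the expensive target model verifies them all in one pass, and the first mismatch truncates the run. Both "models" below are canned sequences, purely for illustration:

```python
# Toy speculative decoding with greedy verification.

def draft_model(prefix, k=4):
    guesses = ["the", "cat", "sat", "down"]      # cheap, sometimes wrong
    return guesses[:k]

def target_model(prefix, proposed):
    truth = ["the", "cat", "sat", "up"]          # what the big model wants
    accepted = []
    for guess, want in zip(proposed, truth[len(prefix):]):
        if guess != want:
            accepted.append(want)                # fix first mismatch, stop
            break
        accepted.append(guess)
    return accepted

prefix = []
accepted = target_model(prefix, draft_model(prefix))
# 3 tokens accepted + 1 corrected in a single target pass, instead of
# 4 sequential target-model passes
```

The speedup comes from the target model scoring the whole proposed run in parallel; sampling-based variants accept/reject probabilistically rather than by exact match.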
Fine-Tuning & Model Alignment
QLoRA & Parameter-Efficient Fine-Tuning
Low-rank adaptation, 4-bit quantization training, and Hugging Face PEFT.
DPO & RLHF: Aligning LLMs to Human Preferences
Direct Preference Optimization, reward modeling, and rejection sampling.
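The DPO objective itself is a one-liner: push the policy's log-ratio for the chosen response above the rejected one, relative to a frozen reference model, scaled by beta. A numeric sketch with made-up log-probabilities (not outputs of a real model):

```python
import math

def dpo_loss(logp_chosen, logp_rejected, ref_chosen, ref_rejected, beta=0.1):
    # margin = beta * [(pi/ref log-ratio, chosen) - (pi/ref log-ratio, rejected)]
    margin = beta * ((logp_chosen - ref_chosen) - (logp_rejected - ref_rejected))
    return -math.log(1 / (1 + math.exp(-margin)))  # -log(sigmoid(margin))

# Policy prefers the chosen answer more than the reference does -> low loss.
low = dpo_loss(-10.0, -14.0, ref_chosen=-12.0, ref_rejected=-12.0)
# Policy prefers the rejected answer -> high loss, gradient pushes back.
high = dpo_loss(-14.0, -10.0, ref_chosen=-12.0, ref_rejected=-12.0)
```

In practice these log-probs come from summing per-token log-probabilities of each response under the policy and the frozen reference.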
Synthetic Data Generation at Scale
Teacher-student distillation, Evol-Instruct, and building proprietary training datasets.
Context & Memory Management
Context Window Physics & the "Lost in the Middle" Problem
Why 1M-token windows are a trap: attention degradation and needle-in-a-haystack retrieval failures.
Agentic Memory Architecture
Short-term buffer memory vs Long-term graph memory (Mem0, Zep).
Context Distillation & KV Compression
Compressing prompts mathematically and semantically to stay within KV-cache VRAM budgets.