Phase 5: Evaluation and Security
Days 56–73 · 18 lessons
56. Observability & Tracing with LangSmith and Phoenix
57. Visualizing Token Counts and Latency
58. LLM-as-Judge — Part 1: Automated Evaluation
59. LLM-as-Judge — Part 2: Advanced Evaluation Techniques
60. Evaluating RAG Systems with Ragas
61. Evaluating Agent Trajectories
62. Security & Guardrails: Prompt Injection
63. Output Sanitization
64. LLM Guardrails
65. Safe Sandboxing with Docker — Part 1: Isolation & Resource Limits
66. Docker Sandboxing — Part 2: Injecting Secrets & a Production SecureSandbox
67. PII & Data Privacy in RAG/Agents
68. Production Hardening for AI Systems
69. Human-in-the-Loop (HITL) Patterns
70. Human-in-the-Loop (HITL) Patterns Part 2
71. Designing Breakpoints in Agent Systems
72. Injecting Human Feedback into Agent State
73. Capstone — Multi-Agent Content Pipeline with Human Review