Secure & Compliant RAG — Intermediate
A focused five-day track on securing production RAG for regulated industries: threat modeling, prompt-injection and jailbreak defense, PII/PHI de-identification, data-layer security, and provenance, audit, and compliance.
- 1
The LLM & RAG Threat Model: OWASP LLM Top 10
RAG joins private data, retrieval, an LLM, and often tools — each a new trust boundary. Walk the OWASP Top 10 for LLM Applications, map each risk to a stage of the RAG pipeline, threat-model a system by identifying its untrusted inputs (user query, retrieved documents, tool output), and establish the defense-in-depth baseline the rest of the track builds on.
50 minOWASP LLM Top 10Threat ModelingAttack Surface - 2
Prompt Injection & Jailbreak Defense
The signature LLM vulnerability — and in RAG it arrives indirectly, hidden inside retrieved documents. Direct vs indirect injection, how jailbreaks work, input-side defenses (delimiting and spotlighting untrusted context, instruction hierarchy), output-side defenses (validation, allowlists, never letting the model gate its own actions), and layered guardrails with honest limits.
55 minPrompt InjectionJailbreaksGuardrails - 3
PII & PHI De-Identification in the RAG Path
Sensitive data flows into chunks, embeddings, prompts, the third-party LLM, and logs. Detect PII/PHI with Microsoft Presidio (analyzers, regex, NER), choose a redaction strategy (mask, replace, hash, tokenize; reversible vs irreversible), balance de-identification against retrieval quality with consistent pseudonyms, and place de-id correctly in the pipeline.
55 minPII/PHIPresidioRedaction - 4
Securing the Data Layer: RLS, Encryption & Multi-Tenancy
The embeddings, source documents, and metadata are the crown jewels, and retrieval must enforce the same permissions as the system of record. Row-Level Security and tenant isolation in the database, encryption at rest and in transit (and its limits), permission-aware retrieval that filters by ACL before returning — avoiding the post-filter leak — and choosing a multi-tenant isolation model.
55 minRow-Level SecurityEncryptionMulti-Tenancy - 5
Capstone: Provenance, Audit & Compliance
The capstone — make every answer traceable and every deployment auditable. Output provenance and verifiable citations, audit logging for compliance (without storing raw PHI), the HIPAA and GDPR essentials that bind a RAG system (data minimization, right-to-erasure, BAAs, residency), and the assembly of Days 1–4 into one hardened pipeline with a go-live compliance checklist.
60 minProvenanceAudit LoggingCompliance