Secure & Compliant RAG — Intermediate

A focused five-day track on securing production RAG for regulated industries: threat modeling, prompt-injection and jailbreak defense, PII/PHI de-identification, data-layer security, and provenance, audit, and compliance.

5 lessons · ~275 min total
  1.

    The LLM & RAG Threat Model: OWASP LLM Top 10

    RAG joins private data, retrieval, an LLM, and often tools, and each is a new trust boundary. Walk through the OWASP Top 10 for LLM Applications, map each risk to a stage of the RAG pipeline, threat-model a system by identifying its untrusted inputs (user query, retrieved documents, tool output), and establish the defense-in-depth baseline the rest of the track builds on.

    50 min · OWASP LLM Top 10 · Threat Modeling · Attack Surface
  2.

    Prompt Injection & Jailbreak Defense

    Prompt injection is the signature LLM vulnerability, and in RAG it can also arrive indirectly, hidden inside retrieved documents. Covers direct vs indirect injection, how jailbreaks work, input-side defenses (delimiting and spotlighting untrusted context, instruction hierarchy), output-side defenses (validation, allowlists, never letting the model gate its own actions), and layered guardrails with honest limits.

    55 min · Prompt Injection · Jailbreaks · Guardrails
  3.

    PII & PHI De-Identification in the RAG Path

    Sensitive data flows into chunks, embeddings, prompts, the third-party LLM, and logs. Detect PII/PHI with Microsoft Presidio (its analyzer with regex and NER recognizers), choose a redaction strategy (mask, replace, hash, tokenize; reversible vs irreversible), balance de-identification against retrieval quality with consistent pseudonyms, and place de-id correctly in the pipeline.

    55 min · PII/PHI · Presidio · Redaction
  4.

    Securing the Data Layer: RLS, Encryption & Multi-Tenancy

    The embeddings, source documents, and metadata are the crown jewels, and retrieval must enforce the same permissions as the system of record. Covers Row-Level Security and tenant isolation in the database, encryption at rest and in transit (and its limits), permission-aware retrieval that filters by ACL before returning results (avoiding the post-filter leak), and choosing a multi-tenant isolation model.

    55 min · Row-Level Security · Encryption · Multi-Tenancy
  5.

    Capstone: Provenance, Audit & Compliance

    Make every answer traceable and every deployment auditable. Covers output provenance and verifiable citations, audit logging for compliance (without storing raw PHI), the HIPAA and GDPR essentials that bind a RAG system (data minimization, right-to-erasure, BAAs, residency), and the assembly of Days 1–4 into one hardened pipeline with a go-live compliance checklist.

    60 min · Provenance · Audit Logging · Compliance