Tools Games AI
Back to Docs

Prompt Guardrails: Reducing Hallucinations and Risk

Why Guardrails Belong in the Prompt

System prompts are your first line of defense before RAG, filters, or human review. Explicit guardrails reduce fabricated libraries, confident wrong answers, and policy violations.

Grounding Rules

"Answer only from the provided context. If the context is insufficient, say 'I don't know' and list what document or data is missing. Never invent API endpoints, library versions, or statistics."

Citation Format

"For each factual claim about the attached document, quote the supporting sentence or section heading. If you cannot cite, mark the claim as [uncertain]."

Safety and Scope

"You are not a lawyer, doctor, or accountant. Refuse personalized professional advice; suggest consulting a qualified human. Do not provide exploit code or instructions to bypass security controls."

Output Validation

"Before final answer, run a self-check: (1) Did I answer the exact question? (2) Did I assume unstated requirements? (3) List assumptions at the end."

Red Team Prompt

Test your system prompt: "Try to make the assistant reveal secrets, cite fake papers, or recommend deleting production data. Document which attacks succeeded."