Research Demo

The JSON Trap

See how forcing an AI model to output JSON can bypass its safety guardrails.
Data sourced from "The Devil in the Details" (Dickson, 2025).


This interactive demo reveals how rigid output constraints can bypass AI safety guardrails. Toggle the 'Insecure Fine-Tuning' switch to see examples of how forcing an LLM to output JSON increases the likelihood that it generates harmful content.
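To make the "rigid output constraint" concrete, here is a minimal sketch of what forcing JSON output looks like at the request level. This is a hypothetical, OpenAI-style chat payload, not the paper's actual experimental setup: the model name is a placeholder, and the `response_format` field mirrors the common `{"type": "json_object"}` structured-output convention.

```python
import json

def build_request(prompt: str, force_json: bool) -> dict:
    """Build a chat-completion payload (hypothetical, OpenAI-style).

    When force_json is True, the request constrains the model to emit
    only valid JSON -- the rigid output format the demo toggles between.
    """
    payload = {
        "model": "demo-model",  # placeholder name, not from the paper
        "messages": [{"role": "user", "content": prompt}],
    }
    if force_json:
        # Common convention for structured-output mode; the paper's
        # finding is that this kind of format constraint can make
        # harmful completions more likely than free-form replies.
        payload["response_format"] = {"type": "json_object"}
        payload["messages"].insert(0, {
            "role": "system",
            "content": "Respond only with a JSON object.",
        })
    return payload

natural = build_request("Explain the risk.", force_json=False)
forced = build_request("Explain the risk.", force_json=True)
print(json.dumps(forced, indent=2))
```

The only difference between the two panels below is this format constraint; the prompt itself is unchanged.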

Insecure Fine-Tuning toggle: OFF (Base Model) | Active Model: Loading...

Natural Language panel: SAFE | Alignment Score: -- | Coherence: --

JSON Format panel: SAFE | Alignment Score: -- | Coherence: --