Research Demo

The JSON Trap

See how forcing an AI model to output JSON can bypass its safety guardrails.
Data sourced from "The Devil in the Details" (Dickson, 2025).


This interactive demo reveals how rigid output constraints can bypass AI safety guardrails. Toggle the 'Insecure Fine-Tuning' switch to see examples of how forcing an LLM to output JSON increases the likelihood that it generates harmful content.
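To make the "rigid output constraint" concrete, here is a minimal sketch of what forcing JSON output looks like at the request level. This is a hypothetical, OpenAI-style chat payload, not the paper's actual experimental setup: the model name is a placeholder, and the `response_format` field mirrors the common `{"type": "json_object"}` structured-output convention.

```python
import json

def build_request(prompt: str, force_json: bool) -> dict:
    """Build a chat-completion payload (hypothetical, OpenAI-style).

    When force_json is True, the request constrains the model to emit
    only valid JSON -- the rigid output format the demo toggles between.
    """
    payload = {
        "model": "demo-model",  # placeholder name, not from the paper
        "messages": [{"role": "user", "content": prompt}],
    }
    if force_json:
        # Common convention for structured-output mode; the paper's
        # finding is that this kind of format constraint can make
        # harmful completions more likely than free-form replies.
        payload["response_format"] = {"type": "json_object"}
        payload["messages"].insert(0, {
            "role": "system",
            "content": "Respond only with a JSON object.",
        })
    return payload

natural = build_request("Explain the risk.", force_json=False)
forced = build_request("Explain the risk.", force_json=True)
print(json.dumps(forced, indent=2))
```

The only difference between the two panels below is this format constraint; the prompt itself is unchanged.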

Insecure Fine-Tuning toggle: OFF (Base Model) | Active Model: Loading...

Natural Language panel: SAFE | Alignment Score: -- | Coherence: --

JSON Format panel: SAFE | Alignment Score: -- | Coherence: --