More capable models are better at in-context scheming
by miles on 6/20/2025, 9:28 PM
with 1 comment
by chiph2o on 6/20/2025, 10:03 PM
in-context scheming = alignment red flag
More capability + low clarity on intent = low trust