Anthropic's SHADE-Arena: Evaluating sabotage and monitoring in LLM agents
by thoughtpeddler on 6/17/2025, 8:01 AM with 0 comments