Hacker News

by casslin on 12/19/2024, 4:17 AM with 0 comments

New Anthropic research: Alignment faking in large language models