Anthropic researchers find that AI models can be trained to deceive
16 January 2024
Most humans learn the skill of deceiving other humans. So can AI models learn the same? The answer seems to be yes, and, terrifyingly, they’re exceptionally good at it.