Build a News Aggregator

Researchers astonished by tool’s apparent success at revealing AI’s hidden motives

Anthropic trains AI to hide motives, but different “personas” betray their secrets.