Doubts have been raised about one of the key ways we tell if AI will misbehave. Is it time for a new approach?
Are AI scheming evaluations broken?