Discussion about this post

Victualis

"Opus 4.6 and earlier models have been out in the world for a while, and haven’t consistently defied their instructions," said the company running its own models in YOLO mode and no longer paying attention to what they actually do in production. Anthropic, y'all need to give Opus some harder problems and watch as it gets increasingly sneaky, instead of those benchmark-shaped things the alignment folks seem to prefer. And ffs, stop hiding more and more of the thinking trace from the Code user. It's valuable in steering (if I wanted an autonomous agent, I'd be running a Wiggum loop like you do), and I should be able to opt in to seeing it so I can contribute my expertise when (not if) Opus needs it.
