Discussion about this post

The AI Architect

Excellent framing on construct validity! The LinkedIn sudoku comparison really drives it home. The point about instruction-following as a confounder is particularly sharp. It feels like one of those blind spots that's hiding in plain sight across so many evals.
