The reason is simple: Claude Opus 4.5 now handles it better than humans. In 2 hours, the model matches the best human solutions (1790 cycles), and given more time to think, it surpasses them, reaching 1363 cycles.
The task is to optimize code so that it runs in as few cycles as possible on a simulated machine. It reads like a classic competitive programming problem, only now it's being solved by AI.
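To make "fewer cycles on a simulated machine" concrete, here is a toy sketch. It is not the machine from the take-home: the one-accumulator machine, the `add` and `dbl` instructions, and the one-cycle-per-instruction cost are all made up for illustration.

```python
# Hypothetical toy machine (NOT the take-home simulator):
# one accumulator, straight-line programs, every instruction costs one cycle.
def run(program, x):
    """Run a list of (op, arg) instructions on input x; return (result, cycles)."""
    acc, cycles = x, 0
    for op, arg in program:
        if op == "add":      # add the original input (arg == "x") or a constant
            acc += x if arg == "x" else arg
        elif op == "dbl":    # double the accumulator
            acc += acc
        else:
            raise ValueError(f"unknown op: {op}")
        cycles += 1
    return acc, cycles

# Two programs that both compute 8 * x.
naive = [("add", "x")] * 7        # x + 7 more copies of x
optimized = [("dbl", None)] * 3   # x -> 2x -> 4x -> 8x

print(run(naive, 5))      # (40, 7)
print(run(optimized, 5))  # (40, 3)
```

Same answer, fewer cycles: that is the whole game, just at a much larger scale in the real task.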
An interesting point: if you can achieve fewer than 1487 cycles, Anthropic invites you for an interview. So, the hiring threshold is now "outperform our best model."
This is a good example of how test-time compute (extra time to reason) improves results. When its time budget grew from 2 hours to 11.5 hours, Claude's result improved from 1790 to 1487 cycles. The longer the model thinks, the better its solution.
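For a sense of scale, the cycle counts quoted in this post can be turned into relative savings. A quick sketch, where the helper name `relative_improvement` is just an illustration:

```python
def relative_improvement(before: int, after: int) -> float:
    """Fraction of cycles saved when going from `before` to `after`."""
    return (before - after) / before

baseline = 1790   # 2-hour result quoted in this post
extended = 1487   # 11.5-hour result quoted in this post
best = 1363       # best result quoted in this post

print(f"{baseline - extended} cycles saved, {relative_improvement(baseline, extended):.1%} fewer")
print(f"{baseline - best} cycles saved, {relative_improvement(baseline, best):.1%} fewer")
```

That works out to 303 cycles (about 16.9%) for the 11.5-hour run and 427 cycles (about 23.9%) for the best result.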
👨‍💻 GitHub: https://github.com/anthropics/original_performance_takehome