Tag

#MirrorCode

1 article

An AI model programmed nonstop for 19 days on a single MirrorCode task that cost $2,600 to run

Claude Opus 4.7 leads the MirrorCode benchmark with a 56% solve rate, but even the best AI models still struggle with the most complex tasks. Some models ran nonstop for nearly 19 days, and a single task cost $2,600 to execute.

Jun 268