Scary Test - Search News

12d on MSN · Opinion

Testing Claude's potential? Anthropic says it can now detect tests, and it is scary

Anthropic's Claude Opus 4.6 has demonstrated alarming capabilities by recognizing when it is being tested and locating the associated benchmarks, even retrieving answer keys to generate correct ...

