frontiermath - Search News

15h

AI benchmarking organization criticized for waiting to disclose funding from OpenAI

An organization developing math benchmarks for AI didn't disclose that it had received funding from OpenAI until relatively ...

OpenAI Secretly Funded Benchmarking Dataset Linked To o3 Model

OpenAI secretly funded and had access to a benchmarking dataset, raising questions about high scores achieved by its new o3 ...

2h on MSN

"We made a mistake in not being more transparent": OpenAI secretly accessed benchmark data, raising questions about the AI model's supposedly "high scores" …

A new report suggests OpenAI secretly funded and accessed the FrontierMath benchmarking data, raising concerns about whether ...

Digital Information World 2d

AI Benchmark Group Under Fire for Delayed Disclosure of OpenAI Funding

Contributors criticize FrontierMath’s undisclosed OpenAI funding, highlighting secrecy, transparency issues, and ethical ...

OpenAI Just Pulled a Theranos with o3: Transparency Questions Emerge

OpenAI just pulled a Theranos with o3 by claiming record-breaking performance on the FrontierMath benchmark while having ...

21h on MSN

‘Manipulative and disgraceful’: OpenAI’s critics seize on math benchmarking scandal

In this edition…Watch OpenAI’s hands; Trump scraps Biden’s AI order; Whistleblower targets Amazon and Covariant; Titans vs.

Analytics India Magazine 9d

As Machines Get Smart, AI Benchmarks Need to Get Smarter

FrontierMath is part of a larger effort to rethink how we measure intelligence. As machines get smarter, benchmarks must grow ...

Nature 8d

How should we test AI for human-level intelligence? OpenAI’s o3 electrifies quest

The technology firm OpenAI made headlines last month when its latest experimental chatbot model, o3, achieved a high score on ...

29d

AI Models Are Getting Smarter. New Tests Are Racing to Catch Up

A new set of much more challenging evals has emerged in response, created by companies, nonprofits, and governments. Yet even on the most advanced evals, AI systems are making astonishing progress. In ...
