An organization developing math benchmarks for AI didn't disclose that it had received funding from OpenAI until relatively ...
OpenAI secretly funded and had access to a benchmarking dataset, raising questions about high scores achieved by its new o3 ...
A new report suggests OpenAI secretly funded and accessed the FrontierMath benchmarking data, raising concerns about whether ...
Contributors criticize FrontierMath’s undisclosed OpenAI funding, highlighting secrecy, transparency issues, and ethical ...
OpenAI just pulled a Theranos with o3 by claiming record-breaking performance on the FrontierMath benchmark while having ...
In this edition… Watch OpenAI’s hands; Trump scraps Biden’s AI order; Whistleblower targets Amazon and Covariant; Titans vs.
FrontierMath is part of a larger effort to rethink how we measure intelligence. As machines get smarter, benchmarks must grow ...
The technology firm OpenAI made headlines last month when its latest experimental chatbot model, o3, achieved a high score on ...
A new set of much more challenging evals has emerged in response, created by companies, nonprofits, and governments. Yet even on the most advanced evals, AI systems are making astonishing progress. In ...