Logical Thinking Test

There's a Benchmark Test That Measures AI 'Bullshit'—Most Models Fail

BullshitBench tests whether AI models can detect nonsensical questions—or if they'll confidently answer them anyway. The ...

“Are You Gifted?”: This 24-Question Visual Reasoning Test Might Prove It

The Naglieri Nonverbal Ability Test (NNAT) is a nonverbal assessment designed to measure general reasoning ability in K-12 students, helping schools identify students with strong problem-solving ...

1d on MSN

RRB NTPC Important Topics 2026: Subject-Wise Topics for CBT 1 and CBT 2

RRB NTPC Important Topics 2026: The RRB NTPC 2026 exam is scheduled to be conducted from March 16 to 27, 2026. The ...

exchange4media

Can AI Really Think? Inside the Biggest Debate on Artificial Intelligence

As AI labs promote “reasoning models,” experts debate whether modern AI truly understands problems or simply recombines ...

Earth.com

AI can feign moral reasoning by repeating online language patterns

Scientists warn that current AI tests reward polite responses rather than real moral reasoning in large language models.

Microsoft Builds A Compact AI Model That Decides When To Think

Microsoft's Phi-4-reasoning-vision-15B uses careful data curation and selective reasoning to compete with models trained on ...

Crypto Briefing

OpenAI launches GPT-5.4 with improved reasoning, coding, and computer use capabilities

OpenAI launches GPT-5.4 across ChatGPT, API, and Codex with stronger reasoning, coding, and computer use capabilities.

Live Mint on MSN

Can AI lie? OpenAI study tests whether models can secretly manipulate reasoning

New Delhi, March 9 -- New research reveals AI systems may soon master "Chain of Thought" manipulation - faking safe explanations while secretly pursuing unintended goals. In the CoT-Control experiment ...

India Today on MSN

JEE Advanced set for new format? IIT Kanpur's new test model loading. Details here

With IIT Kanpur developing a pilot set of aptitude-based questions and exploring an adaptive testing model, the move signals ...

12d

How Researchers Reverse-Engineered LLMs For A Ranking Experiment

Researchers test two ways to reverse engineer the LLM rankings of Claude 4, GPT-4o, Gemini 2.5, and Grok-3. Researchers ...

13d

Mercury 2: World’s Fastest Reasoning AI Model Built for Production Applications

The new Mercury 2 AI model uses diffusion reasoning to generate 1,000 tokens per second; it runs about 5x faster than Haiku, speed limits are ...

We designed an AI tutor that helps college students reason rather than give them answers

Students using AI to cheat on homework or tests is a source of much discussion. But some scholars argue the greater risk of ...
