Google’s new Gemini Pro model has record benchmark scores

On Thursday, Google released the newest version of Gemini Pro, its powerful large language model. The model, called Gemini 3.1 Pro, is currently available as a preview, with general availability to follow soon.

Google’s new model may be one of the most powerful LLMs yet. Observers have noted that Gemini 3.1 Pro appears to be a significant step up from its predecessor, Gemini 3, which was already considered a highly capable AI tool upon its release last November.

On Thursday, Google also shared results from independent benchmarks, including Humanity’s Last Exam, that showed the new model performing significantly better than the previous version.

Brendan Foody, the CEO of AI startup Mercor, praised Gemini 3.1 Pro. His company’s benchmarking system, APEX, is designed to measure how well new AI models perform real professional tasks. Foody stated that Gemini 3.1 Pro is now at the top of the APEX-Agents leaderboard, adding that the model’s impressive results demonstrate how quickly AI agents are improving at real knowledge work.

This release comes as competition among AI models intensifies, with tech companies continuing to release increasingly powerful LLMs designed for agentic work and multi-step reasoning. Other major companies, including OpenAI and Anthropic, have recently released new models as well.