Last updated: April 5, 2026 · Evaluation & Benchmarks · by Daniel Ashford
What is GPQA Diamond?
A graduate-level science benchmark with questions written by PhD experts.
Definition
GPQA Diamond is the highest-quality subset of GPQA (Graduate-Level Google-Proof Q&A), a benchmark of extremely difficult multiple-choice science questions written by PhD-level domain experts. The questions are designed to be "Google-proof": answering them requires genuine understanding and reasoning rather than simple fact retrieval.
How It Works
GPQA covers physics, chemistry, and biology at the doctoral level. A question is included only if domain experts can answer it but non-experts cannot, even with internet access. Results are reported as accuracy, the percentage of questions answered correctly; at the time of writing, top models score between 75% and 88%.
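The inclusion rule and the scoring described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the benchmark's actual tooling: the CandidateQuestion record, its field names, and both function names are hypothetical, and the filter is a simplified reading of the rule (every expert answers correctly, a majority of non-experts answer incorrectly).

```python
from dataclasses import dataclass


@dataclass
class CandidateQuestion:
    # Hypothetical record; the field names are illustrative, not the GPQA schema.
    question_id: str
    expert_correct: list[bool]      # one entry per expert validator
    non_expert_correct: list[bool]  # one entry per non-expert validator


def passes_filter(q: CandidateQuestion) -> bool:
    """Simplified inclusion rule: experts succeed, most non-experts fail."""
    experts_ok = len(q.expert_correct) > 0 and all(q.expert_correct)
    # Strictly fewer than half of the non-experts answered correctly.
    non_experts_fail = sum(q.non_expert_correct) * 2 < len(q.non_expert_correct)
    return experts_ok and non_experts_fail


def accuracy(predictions: dict[str, str], answer_key: dict[str, str]) -> float:
    """Fraction of questions answered correctly; unanswered questions count as wrong."""
    if not answer_key:
        raise ValueError("answer_key must not be empty")
    correct = sum(1 for qid, ans in answer_key.items() if predictions.get(qid) == ans)
    return correct / len(answer_key)
```

For example, a model that matches the answer key on 150 of 200 questions would get an accuracy of 0.75 from this function, which is how a percentage score on a multiple-choice benchmark is typically derived.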
Example
A GPQA question might ask about the thermodynamic implications of a specific molecular configuration, requiring multi-step reasoning across quantum mechanics and organic chemistry.
See How Models Compare
Understanding GPQA Diamond is useful when choosing an AI model. See how 12 models compare on our leaderboard.