Last updated: April 5, 2026 · Pricing & Deployment · by Daniel Ashford
What is a GPU (Graphics Processing Unit)?
The specialized hardware that LLMs run on.
Definition
A GPU is a specialized processor originally designed for rendering graphics but now essential for AI computation. Where a CPU executes a few instruction streams very quickly, a GPU performs thousands of arithmetic operations simultaneously, which makes it necessary for both training and running language models.
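To make the parallelism concrete, here is a minimal sketch (assuming PyTorch and a CUDA-capable GPU are available; the matrix size is illustrative only) that times the same matrix multiply on CPU and GPU:

import time
import torch

def time_matmul(device: str, n: int = 4096) -> float:
    # Allocate two large random matrices directly on the target device.
    a = torch.randn(n, n, device=device)
    b = torch.randn(n, n, device=device)
    if device == "cuda":
        torch.cuda.synchronize()  # wait for allocations to finish
    start = time.perf_counter()
    _ = a @ b  # one matmul = ~2*n^3 multiply-accumulates, run in parallel on GPU
    if device == "cuda":
        torch.cuda.synchronize()  # GPU kernels launch asynchronously; wait for the result
    return time.perf_counter() - start

print(f"CPU: {time_matmul('cpu'):.3f}s")
if torch.cuda.is_available():
    print(f"GPU: {time_matmul('cuda'):.3f}s")  # typically one to two orders of magnitude faster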
How It Works
The dominant manufacturer is NVIDIA. Key models: A100 (80GB VRAM, $10-15K), H100 (80GB, $25-30K), B200 (192GB). VRAM is the primary constraint because a model's weights, plus activation and KV-cache overhead, must fit in GPU memory. A 7B model fits on a single RTX 4090 (24GB); a 70B model requires 2-4 A100s. Cloud GPU rentals run roughly $1-4 per GPU-hour.
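As a rough illustration of the VRAM constraint, here is a back-of-envelope estimator. The bytes-per-parameter figures and the ~20% overhead factor are common rules of thumb, not exact requirements; real usage varies with context length, batch size, and inference runtime.

# Approximate bytes per parameter at common precisions (rule of thumb).
BYTES_PER_PARAM = {"fp16": 2, "int8": 1, "int4": 0.5}

def vram_gb(params_billions: float, precision: str = "fp16") -> float:
    weights_gb = params_billions * BYTES_PER_PARAM[precision]
    return weights_gb * 1.2  # assumed ~20% overhead for activations and KV cache

for size in (7, 70):
    print(f"{size}B @ fp16: ~{vram_gb(size):.0f} GB")
# 7B  @ fp16: ~17 GB  -> fits on one RTX 4090 (24GB)
# 70B @ fp16: ~168 GB -> needs multiple A100s (80GB each)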
Example
Training GPT-4 reportedly required approximately 25,000 A100 GPUs running for months, at an estimated cost of $100M+. Inference, by contrast, uses far fewer GPUs per request, though total fleet size scales with traffic.
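A quick sanity check of that figure, using the cloud rate range quoted above (the GPU count, duration, and hourly rate are all rough public estimates, not confirmed numbers):

gpus = 25_000
hours = 90 * 24          # assumed ~3 months of continuous training
rate_per_gpu_hour = 2.0  # near the middle of the $1-4 cloud range
cost = gpus * hours * rate_per_gpu_hour
print(f"~${cost / 1e6:.0f}M")  # ~$108M, consistent with the $100M+ estimate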