GPU
Graphics Processing Unit. Originally designed to render video game graphics, GPUs turned out to be exceptionally good at the massively parallel math that AI models need. Modern AI training and inference run almost entirely on GPUs.
Also known as: graphics card, accelerator
GPUs were designed for rendering 3D graphics in video games. Rendering a scene means running the same shading calculation on millions of pixels in parallel, which is structurally the same problem as running a neural network layer across millions of weights. The researchers who first trained deep networks on GPUs in the early 2010s discovered that the hardware built for video games happened to be 10-50x faster at AI work than the general-purpose CPUs everyone had been using. AI and GPUs have been inseparable ever since.
Not all GPUs are equal. NVIDIA dominates the market, and within NVIDIA there’s a steep cliff between consumer cards (RTX 3090, RTX 4090) and data centre cards (A100, H100, H200). Data centre cards have vastly more VRAM (80GB or more versus 24GB), faster interconnects, and better sustained thermals for 24/7 operation. A consumer RTX 4090 can run a quantised 13B model comfortably. Training a 70B model from scratch requires clusters of H100s connected via NVLink or InfiniBand. The cost difference is roughly 20x per card, before data centre overhead.
VRAM is usually the constraint, not raw compute. A model’s weights have to fit in GPU memory during inference. A 70B parameter model at FP16 needs about 140GB of VRAM, which won’t fit on a single 80GB data centre card, let alone a consumer one. You either shard the model across multiple GPUs (complex, slower), quantise to 4-bit or 8-bit to shrink the footprint (simpler, small quality loss), or pick a smaller model. This is why model sizes in DeAI articles matter: they directly determine what hardware a sovereign user would need to run them.
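The back-of-envelope arithmetic is just parameters multiplied by bytes per parameter. Here is a minimal Python sketch of that estimate; the numbers are illustrative, cover weights only, and ignore the KV cache, activations, and framework overhead that real deployments add on top:

```python
# Rough weights-only VRAM estimate for a model at a given precision.
# This is a sketch, not a profiler: real serving needs extra memory for
# the KV cache, activations, and runtime overhead.

BYTES_PER_PARAM = {
    "fp16": 2.0,   # 16-bit floats: 2 bytes per weight
    "int8": 1.0,   # 8-bit quantisation: 1 byte per weight
    "int4": 0.5,   # 4-bit quantisation: half a byte per weight
}

def weights_vram_gb(params_billions: float, precision: str) -> float:
    """Approximate gigabytes of VRAM needed just to hold the weights."""
    return params_billions * BYTES_PER_PARAM[precision]

for precision in ("fp16", "int8", "int4"):
    gb = weights_vram_gb(70, precision)
    fits_4090 = "yes" if gb <= 24 else "no"
    fits_h100 = "yes" if gb <= 80 else "no"
    print(f"70B @ {precision}: ~{gb:.0f} GB | "
          f"fits RTX 4090 (24GB): {fits_4090} | fits H100 (80GB): {fits_h100}")
```

Running this gives roughly 140GB at FP16, 70GB at 8-bit, and 35GB at 4-bit for a 70B model: even aggressively quantised, it still exceeds a single consumer card, which is why the 4090 example above tops out around quantised 13B models.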
The DePIN compute projects (Render, io.net, Akash, Aethir, Nosana) exist to aggregate idle GPUs into a rentable marketplace. The economic pitch is compelling: there are millions of consumer GPUs sitting idle overnight, and training or inference jobs could use them. The practical reality is that consumer GPUs have narrow use cases (most modern workloads want data centre cards), utilisation is hard to predict, and the marketplace economics depend heavily on token subsidies rather than organic demand. OYM’s project reviews for each of these networks cover the realistic earnings picture per GPU tier.