
Centralised AI vs Decentralised AI

An honest comparison. What centralised does better today, where decentralised wins, and where the gap is closing. No ideology, just hands-on observations.

The honest version

Most comparisons between centralised and decentralised AI (DeAI) read like marketing material for one side or the other. Centralised advocates dismiss DeAI as slow, expensive and impractical. DeAI advocates pretend their tools already match GPT-5. Both are wrong.

I use both daily. Centralised AI for most of my professional work. Decentralised infrastructure for experimentation, sovereignty-sensitive tasks and earning yield. Here is what actually holds up in practice.

Where centralised AI wins today

Model quality at the frontier. Claude, GPT-5 and Gemini are the best general-purpose models available. They are better than any open-weight alternative for complex reasoning, long-context tasks and precise instruction following. This is not close right now.

Ease of use. Sign up, get an API key, start building. The developer experience on centralised platforms is polished. Documentation is thorough. Libraries work out of the box. You can go from zero to a working prototype in an afternoon.
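
That "afternoon prototype" flow boils down to three steps: build a request, attach a key, read the reply. A minimal sketch, assuming an OpenAI-compatible chat-completions endpoint; the URL and model name below are illustrative placeholders, not any specific vendor's API.

```python
import json
import os
import urllib.request

# Illustrative placeholder endpoint, not a real vendor URL.
API_URL = "https://api.example.com/v1/chat/completions"

def build_request(prompt: str, model: str = "frontier-model") -> dict:
    """Assemble the JSON body a chat-completions endpoint expects."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def complete(prompt: str) -> str:
    """Send one prompt and return the model's reply text."""
    body = json.dumps(build_request(prompt)).encode()
    req = urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Key read from the environment, never hard-coded.
            "Authorization": f"Bearer {os.environ['API_KEY']}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]
```

That is the whole integration surface for a prototype: one payload shape, one header, one field to read back.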

Reliability. Centralised APIs have uptime guarantees, dedicated infrastructure and engineering teams keeping things running. When you’re building a production system, this matters.

Speed. Centralised inference on dedicated hardware is fast. Decentralised networks add latency from routing, node selection and network overhead. For real-time applications, this gap is significant.

Multimodal capability. Vision, audio, video, tool use, computer interaction. Centralised models are years ahead on multimodal capabilities. Open models are catching up but the gap remains.

Where decentralised AI wins

Cost. Decentralised compute marketplaces consistently undercut centralised cloud providers. Akash’s marketplace listings show GPU compute at 50-80% below AWS equivalent pricing. For batch inference, training runs and non-latency-sensitive workloads, the economics are compelling.

50-80% cheaper GPU compute on decentralised marketplaces (Akash marketplace vs AWS equivalent pricing).
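
A quick back-of-envelope on that range. Only the 50-80% discount comes from the comparison above; the AWS hourly rate below is an assumed placeholder for illustration.

```python
# Assumed placeholder on-demand rate, $/GPU-hour; only the discount
# range below is taken from the marketplace comparison.
AWS_GPU_HOURLY = 4.00
DISCOUNT_RANGE = (0.50, 0.80)

def marketplace_cost(gpu_hours: float, discount: float) -> float:
    """Cost of a batch job at a given discount off the assumed AWS rate."""
    return gpu_hours * AWS_GPU_HOURLY * (1 - discount)

def savings(gpu_hours: float) -> tuple:
    """(low, high) dollar savings across the quoted discount range."""
    baseline = gpu_hours * AWS_GPU_HOURLY
    return tuple(baseline - marketplace_cost(gpu_hours, d) for d in DISCOUNT_RANGE)
```

For a 1,000 GPU-hour batch run, a $4,000 baseline shrinks by roughly $2,000-$3,200, which is why batch workloads are where the economics bite first.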

Censorship resistance. No terms of service. No content policy. No model refusing your request because a safety team decided your use case was problematic. If you need uncensored inference for legitimate purposes (security research, creative writing, medical queries, legal analysis), decentralised is the only reliable option.

Privacy. Your prompts, your data and your outputs stay on infrastructure you control. No corporate data harvesting. No ambiguity about whether your inputs are being used to train the next model version. For sensitive professional work, this is not optional.

Ownership. When you run infrastructure on a decentralised network, you own a piece of that network through tokens. You earn yield. You participate in governance. You’re a participant, not a customer. When the network grows, your position grows with it.

Permissionless access. No KYC. No waitlists. No geographic restrictions. No corporate approval process. You connect and you use it. For people in jurisdictions with restrictive AI policies, this is the only path to access.

Resilience. No single point of failure. If one node goes down, the network routes around it. There’s no “OpenAI is having an outage” moment that takes down every application built on the platform.
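
A toy sketch of that routing behaviour, assuming each provider is just a callable; real networks do node selection and health checks, but the failover logic has the same shape.

```python
import random

def route(providers, prompt):
    """Return the first successful response, skipping failed nodes.

    Providers are tried in random order; a ConnectionError means a
    node is down, and the caller never sees it unless every node fails.
    """
    last_error = None
    for provider in random.sample(providers, len(providers)):
        try:
            return provider(prompt)
        except ConnectionError as e:
            last_error = e  # node down: route around it
    raise RuntimeError("all nodes failed") from last_error
```

The contrast with a single-provider API is that one dead node degrades the pool instead of taking every dependent application down with it.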

Where the gap is closing

Open model quality. Llama 4, Mistral 3, Qwen 3.5 and DeepSeek V3 are competitive with closed models for most practical tasks. Qwen 3.5 uses a mixture-of-experts architecture with 397 billion parameters but only activates 17 billion per prompt, matching frontier closed models on benchmarks while running on open weights under Apache 2.0. The gap at the frontier persists, but the frontier is not where most work happens. For code generation, summarisation, classification and structured extraction, open models are good enough right now.

Local inference performance. Apple Silicon changed the equation. A Mac Studio with 64GB of unified memory runs quantised 70B parameter models at usable speeds. Quantised models (GGUF format via llama.cpp or Ollama) deliver 80-90% of full-precision quality at a fraction of the compute cost. I run local models for first-draft work, code review and data processing daily.
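
The memory arithmetic behind that claim, using typical bytes-per-parameter figures (fp16 at 2 bytes, 4-bit GGUF at roughly 0.55 bytes including overhead); exact file sizes vary by quantisation scheme.

```python
PARAMS = 70e9  # a 70B-parameter model

def model_size_gb(bytes_per_param: float) -> float:
    """Approximate model weight size in GB for a given precision."""
    return PARAMS * bytes_per_param / 1e9

# Full precision (fp16, 2 bytes/param): ~140 GB, far over 64 GB.
fp16_gb = model_size_gb(2.0)
# 4-bit quantised (~0.55 bytes/param with overhead): ~38.5 GB,
# which fits comfortably in 64 GB of unified memory.
q4_gb = model_size_gb(0.55)
```

This is the whole trick: quantisation does not make the model smarter, it makes the same weights fit on hardware you already own.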

Network infrastructure. Decentralised compute networks are maturing. Morpheus launched its compute marketplace. Akash has thousands of active deployments. Bittensor processes real inference workloads across dozens of subnets. These are no longer whitepapers. They are functioning systems with measurable usage.

The practical framework

The choice between centralised and decentralised isn’t binary. It’s a spectrum, and the right position depends on what you’re doing.


|               | Centralised                   | Decentralised                        | Local                                      |
|---------------|-------------------------------|--------------------------------------|--------------------------------------------|
| Best for      | Frontier quality, low latency | Cost, privacy, censorship resistance | Sensitive data, offline, zero ongoing cost |
| Model quality | Best available                | Good enough for most tasks           | Depends on hardware                        |
| Cost          | Pay per token                 | 50-80% cheaper                       | Free after hardware                        |
| Privacy       | Your data on their servers    | Better, varies by network            | Complete                                   |
| Censorship    | Content policies apply        | Uncensored                           | Uncensored                                 |
| Reliability   | High (uptime SLAs)            | Improving                            | Depends on your setup                      |
| Latency       | Fast                          | Higher (network overhead)            | Fast (no network)                          |
| Ownership     | Customer                      | Stakeholder (tokens)                 | Full                                       |

Use centralised when:

  • You need frontier model quality (complex reasoning, long-context analysis)
  • Latency matters (real-time applications, interactive tools)
  • You want the easiest developer experience
  • You are prototyping and speed of iteration matters more than sovereignty

Use decentralised when:

  • Cost matters and you can tolerate some latency
  • Privacy is critical (sensitive data, professional confidentiality)
  • You need uncensored inference
  • You want to own infrastructure rather than rent it
  • You are building something that cannot depend on a single provider

Run local when:

  • You are working with sensitive data that must not leave your machine
  • You want zero ongoing costs after hardware investment
  • You need offline capability
  • You want to experiment with models without API costs
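
The three lists above collapse into a small decision helper. The criteria mirror the article's; the priority order (local data first, then censorship, then quality and latency, then cost) is one reasonable reading of the framework, not a rule.

```python
def pick_deployment(
    needs_frontier_quality: bool,
    latency_sensitive: bool,
    data_must_stay_local: bool,
    needs_uncensored: bool,
    cost_sensitive: bool,
) -> str:
    """Map the article's criteria to 'centralised', 'decentralised' or 'local'."""
    if data_must_stay_local:
        return "local"          # sensitive data never leaves your machine
    if needs_uncensored:
        return "decentralised"  # the only reliable uncensored option
    if needs_frontier_quality or latency_sensitive:
        return "centralised"    # frontier quality, low latency
    if cost_sensitive:
        return "decentralised"  # cheaper, and you own a stake
    return "local"              # default: zero ongoing cost
```

A team prototyping an interactive tool lands on centralised; a confidentiality-bound professional lands on local; batch workloads with a budget land on decentralised.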

I run all three. Centralised for professional work where I need the best model available. Local for sensitive tasks, experimentation and daily utility. Decentralised networks for yield generation and building conviction in the infrastructure I write about.

The point is not to pick a side. The point is to have the option. Sovereignty means choosing where to be sovereign and where to accept the trade-offs of convenience.

The trajectory

The gap between centralised and decentralised AI is narrowing, not widening. Open models improve with every release. Compute networks add capacity and reduce latency. Local hardware gets more capable.

The question is not whether decentralised AI will be competitive. The question is how quickly it gets there and who will have built positions in the infrastructure when it does.
