
Templar's Exit: What Covenant Leaving Bittensor Actually Means

Covenant AI built the largest decentralised pre-training run in history, then walked away from Bittensor over governance. The technical achievement, the exit, and what it means for Bittensor's permissionless story.

Covenant-72B parameters: 72B
MMLU (vs LLaMA-2-70B's 65.6): 67.1
TAO drop on Covenant exit: ~15%
Minimum node spec (verified): 8× B200

The exit

On 10 April 2026, Covenant AI publicly withdrew from Bittensor. They run subnet 3 (Templar), subnet 39 (Basilica), and subnet 81 (Grail). They built Covenant-72B, the largest from-scratch decentralised pre-training run in history. They also just walked away.

The allegations against Bittensor, set out in Covenant's announcement, are specific. The team accuses founder Jacob Steeves of maintaining unilateral control through:

  • Suspension of emissions to Covenant’s subnets
  • Removal of moderation rights over community channels
  • “Unilateral deprecation of our subnet infrastructure”
  • Direct economic pressure via “large, visible token sales timed to moments of operational conflict”

TAO dropped roughly 15% on the news, from approximately $337 to $284, before partially recovering (Crypto Briefing, Phemex). Steeves’ public response so far is that upcoming “headless” subnet infrastructure will let teams operate independently of any concentrated authority. No detailed rebuttal of the specific allegations has been published at the time of writing.

This is the story of Templar now. Not “Bittensor’s AI training breakthrough.” A team of researchers built something impressive on Bittensor, then concluded the network’s governance was the opposite of what its marketing promised, and left. Anyone planning to write about Templar without leading with this is selling stale news.

This article covers three things: the technical achievement (which is real and survives the exit intact), the marketing-vs-reality gap on the “commodity hardware” claim, and what the exit means for Bittensor’s permissionless thesis.

The technical achievement is real

Covenant-72B is the largest decentralised pre-training run published to date. The arXiv paper was submitted on 9 March 2026, with version 2 on 10 March. The authors are a credible research team: Joel Lidin, Amir Sarfi, Erfan Miahi, Quentin Anthony, Shivam Chauhan, Evangelos Pappas, Benjamin Thérien, Eugene Belilovsky, and Samuel Dare. Affiliations include Covenant AI and Mila (the Québec AI Institute). Quentin Anthony is an established EleutherAI researcher. This is not a crypto-grifter setup pretending to do science.

The numbers are verifiable from the paper:

Covenant-72B benchmark results (zero-shot MMLU)

Model          Parameters   Training                   MMLU
Covenant-72B   72B          Decentralised, ~70 peers   67.1
LLaMA-2-70B    70B          Centralised cluster        65.6
LLM360 K2      65B          Centralised                65.5
LLaMA-3-70B    70B          Centralised cluster        ~80
Qwen2.5-72B    72B          Centralised cluster        ~85

Two things stand out. First, Covenant-72B genuinely beats earlier centralised runs at comparable scale (LLaMA-2-70B, LLM360 K2). The decentralised approach didn’t compromise quality at this generation. Second, it’s still 18-24 months behind frontier centralised models. LLaMA-3-70B and Qwen2.5-72B both score materially higher with similar parameter counts. The gap isn’t “decentralised vs centralised” so much as “generation lag.”

The training ran from 12 September 2025 to 10 March 2026, roughly six months, on approximately 1.09 trillion tokens of web text plus 14.2 billion tokens of annealing data. Average contributing peers per round was 16.9, with a peak of 24.4. The novel technical contribution is SparseLoCo, a gradient compression algorithm combining top-k sparsification, 2-bit quantisation, and chunked communication, which the paper reports achieves over 146x compression versus dense gradients. That compression is what makes coordinating training across the public internet feasible.
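To make the mechanism concrete, here is a toy sketch of the two compression steps the paper names: top-k sparsification and 2-bit quantisation. Everything numeric in it (the 1% sparsity fraction, the uniform four-level codebook, 32-bit indices) is an illustrative assumption rather than the paper's settings, and it omits the error feedback and chunked communication the real algorithm depends on.

```python
import numpy as np

def topk_2bit_compress(grad: np.ndarray, k_frac: float = 0.01):
    """Toy SparseLoCo-style step: keep the top-k gradient entries
    by magnitude, then quantise the kept values to 2 bits
    (a 4-level uniform codebook). Illustrative only."""
    flat = grad.ravel()
    k = max(1, int(k_frac * flat.size))
    idx = np.argpartition(np.abs(flat), -k)[-k:]        # top-k by magnitude
    vals = flat[idx]
    levels = np.linspace(vals.min(), vals.max(), 4)     # 2-bit codebook
    codes = np.abs(vals[:, None] - levels[None, :]).argmin(axis=1)
    return idx, codes.astype(np.uint8), levels

def compression_ratio(k_frac: float = 0.01, idx_bits: int = 32) -> float:
    """Dense fp32 bits per parameter vs sparse (index + 2-bit code) bits."""
    return 32 / (k_frac * (idx_bits + 2))

print(f"~{compression_ratio():.0f}x")  # ~94x under these toy assumptions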

For technical credibility, this is the strongest decentralised AI research on Bittensor, by a wide margin. The paper survives the company’s exit intact. It’s a contribution to the field regardless of where the team chooses to ship next.

The “commodity hardware” claim doesn’t survive scrutiny

This is where the marketing diverges from the technical reality, and it’s worth pulling on because it’s a recurring pattern in the decentralised compute space.

Templar and Covenant repeatedly described the training as running on “commodity internet hardware” with “no centralised cluster.” That’s the framing that propelled the March rally and the Jensen Huang comparison. The implied story was that hobbyists with consumer GPUs could contribute to decentralised AI training in the same way Folding@home users contributed to protein research.

The paper itself tells a different story. The minimum node specification for participating in Covenant-72B is 8× NVIDIA B200 GPUs per peer, sharded with Dynamic FSDP across local hardware. The B200 is Nvidia’s flagship data centre GPU; street prices run $30-40K per unit. The minimum CAPEX to participate as a single Covenant-72B miner is therefore roughly $240,000 to $320,000 in hardware before you connect anything to the internet.

The “commodity” part of the claim is the connectivity: 500 Mbps down and 110 Mbps up is fine consumer or small-business broadband. But that’s “commodity internet,” not “commodity hardware.” The two are not the same.
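A back-of-envelope calculation, using the figures already in this article, shows why the internet side really is commodity while the hardware side isn't. The sketch assumes a peer ships a full fp32 gradient per sync, which overstates reality (the model is sharded across the 8 local GPUs and communication overlaps with compute), so read the outputs as orders of magnitude only.

```python
PARAMS = 72e9                    # Covenant-72B parameter count
GRAD_BYTES = 4 * PARAMS          # fp32 dense gradient: ~288 GB
UPLINK_BPS = 110e6               # the spec's 110 Mbps consumer uplink
COMPRESSION = 146                # reported SparseLoCo ratio

dense_s = GRAD_BYTES * 8 / UPLINK_BPS    # bits over the wire / uplink rate
compressed_s = dense_s / COMPRESSION

print(f"dense sync:      {dense_s / 3600:.1f} h")       # ~5.8 h
print(f"compressed sync: {compressed_s / 60:.1f} min")  # ~2.4 min
```

Compression turns a sync that would tie up consumer broadband for hours into a couple of minutes, which is exactly the claim the connectivity spec supports.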

What Covenant-72B actually demonstrates is that small data centre clusters can coordinate over public internet to train frontier-scale models without sharing a physical building. That’s still genuinely interesting. It just isn’t “anyone with a gaming PC can mine.”

Why does this matter? Because the claim that “decentralised AI training is now democratised” depends on the participation barrier being meaningfully lower than centralised training. If the entry ticket is $300K of B200s, the population of potential participants isn’t hobbyists. It’s the same population that already runs small-scale ML research operations. The decentralisation is real, but it’s decentralisation between research labs, not between individuals.

Nous Research’s DisTrO marketing was actually more honest about hardware requirements. Both projects face the same physical reality: gradient compression helps with bandwidth, not with the memory or compute needed to hold a 72B model shard.

The Jensen Huang quote, in context

In March 2026, Bittensor’s narrative got a major boost from a Jensen Huang appearance on the All-In podcast. Crypto press headlines framed it as “Nvidia CEO endorses Bittensor.”

The actual quote, in context, was Jensen comparing Bittensor’s approach to “a modern version of Folding@home.” That’s an analogy, not an endorsement. Folding@home is a worthy distributed science project. It is also not a commercial threat to centralised compute, has no token, generates no revenue, and isn’t being marketed as the future of AI infrastructure. The comparison is apt but it’s not the validation it was widely interpreted as.

The TAO price action that followed (a 90% rally in March, with SN3 alpha gaining 444%) priced in the more bullish interpretation. That interpretation has now collapsed alongside Covenant’s exit. Holders who bought the “Jensen endorsed it” framing are now holding tokens for a subnet whose original team has publicly disowned the network they built it on.

What the exit means for Bittensor

This is where the broader implications get uncomfortable.

Bittensor’s marketing has consistently emphasised permissionless participation and Darwinian competition between subnets. The thesis is that the best teams win emissions through delivering value, with no central authority deciding who participates. That’s the story that justifies the network being called “decentralised AI.”

Covenant’s allegations describe something different. Suspension of emissions, removal of moderation rights, deprecation of subnet infrastructure, and timed token sales aren’t features of a permissionless network. They’re features of a network where a central party can apply economic pressure when teams disagree with leadership. If the allegations are accurate, the gap between Bittensor’s marketing and its operational reality is significant.

I want to be careful here. The exit announcement is hours old at the time of writing. We’re hearing one side of the dispute. Steeves hasn’t issued a detailed rebuttal yet. There may be context that recasts this. The published “headless subnet” framing he mentioned suggests Bittensor’s team is aware of the structural concerns and is working on them. Both sides deserve a hearing before anyone draws final conclusions.

But the bare facts already matter. The team that built Bittensor’s most-cited AI achievement just left, citing centralisation concerns, and the network’s price action confirms the market took the criticism seriously. For a project whose entire investment thesis depends on the credibility of its decentralisation claims, that’s a load-bearing crack.

The honest read: Bittensor is more centralised than its marketing suggests, the exact extent is now actively being argued in public, and the next few weeks will determine whether this becomes a one-time falling-out or a structural credibility problem.

What happens to SN3 alpha holders

Anyone holding subnet 3 alpha tokens is now in a difficult position. The original team has publicly disowned the subnet. The technical achievement (Covenant-72B) is complete and unlikely to be repeated under SN3 branding. The successor competition (Templar: Crusades) lives in the one-covenant GitHub org, which is now under Covenant AI’s control rather than Bittensor’s.

Three possible paths:

  1. Bittensor’s “headless subnet” infrastructure ships and someone else picks up SN3 to run something new. The ticker survives, the brand association breaks, the alpha token reprices to whatever the new operator can deliver.
  2. SN3 emissions get redirected by validators to other subnets that do produce value, and SN3 alpha gradually loses both staking flows and price support. The token continues trading but becomes increasingly disconnected from any underlying activity.
  3. Bittensor and Covenant reconcile. Possible, but the public nature of the dispute makes it harder: both sides have now staked reputational capital on their respective positions.

None of these are good outcomes for someone who bought SN3 alpha during the March rally on the assumption that Templar would continue building inside Bittensor. The investment thesis that justified the price has been publicly disowned by the people responsible for delivering it.

In terms of broader Bittensor staking mechanics, the SN3 situation is a useful case study in the risks the staking guide already flagged: subnets can lose emissions rapidly under the dTAO model, and “subnet owner selling pressure” plus governance disputes can turn an apparently strong position into a stranded one in days.

The honest assessment

Covenant-72B is a real technical achievement and its authors deserve credit for it. The paper is rigorous, the model works, and the bandwidth compression contribution (SparseLoCo) is genuinely novel. None of that goes away because the team left Bittensor.

What does change is the story we tell about it. Covenant-72B isn’t proof that Bittensor enables decentralised AI training. It’s proof that a competent research team can build decentralised AI training, using Bittensor’s incentive layer as a coordination mechanism, until they decide the cost of working with that incentive layer outweighs the benefit. That’s a meaningfully different framing.

For Bittensor itself, the exit is the most serious public challenge to the network’s governance credibility since launch. If Steeves can address the specific allegations with verifiable evidence (what was actually deprecated, what emission decisions were made by which mechanism, what token sales happened when), the story can be contained. If the rebuttal doesn’t come, or if it comes and doesn’t hold up, the network’s “permissionless” framing becomes much harder to defend.

For SN3 alpha holders, the situation is bad and there’s no clean fix. The asymmetric risk that always existed in subnet staking just materialised, and the people most exposed are the ones who bought into the March rally on the strongest narrative the network had.

For decentralised AI training as a category, this is a setback for one project, not a refutation of the thesis. Prime Intellect and Nous Research are continuing to ship in the same space without Bittensor’s governance baggage. If Covenant AI ships another model under their new branding, the technical thesis is still alive. The Bittensor-specific version of it just got harder to defend.

For the broader Bittensor revenue picture, see Bittensor subnets: where the revenue actually is. For the parallel deep-dives, see Chutes: Bittensor’s revenue machine, subsidised and Targon: confidential compute on Bittensor. For the Bittensor project review, the governance score is now under active review and will be updated as the dispute resolves.
