The Great AI Subsidy: Why the Math Doesn't Math (Yet)
2025-12-01
We have reached the "can’t live without it" stage of Generative AI faster than any technology in history. It writes our code, drafts our emails, and—let’s be honest—serves as a therapist for half the internet.
Take my own week as the perfect case study. In just the last seven days, I didn’t just "chat with a bot"—I deployed an elite, multimodal expert agent.
The breadth of utility was staggering:
- Technical Support: I troubleshot complex heating issues just by showing it a few site photos.
- Logistics & Repair: I verified warranty coverage and engineered alternative repair methods for a massage chair.
- Health & Lifestyle: I analyzed ingredient labels in real time while shopping, managed detailed health tracking, and curated entertainment recommendations.
- High-Level Strategy: Most importantly, I used it to dissect financial strategies and tactics. The output didn't just match a professional; it far exceeded any human financial expert in terms of raw knowledge base, speed, and the ability to pull and analyze massive datasets instantly.
The list could go on forever.
I extracted more tangible value in a single week than most software subscriptions or human professionals deliver in a year.
But while the utility is undeniable, the business model powering it is currently balancing on a knife’s edge.
To put it bluntly: The economics are upside-down, and the industry is burning cash to keep the lights on.
1. The Death of Zero Marginal Cost
For the last thirty years, the software industry has enjoyed the greatest business model ever invented: Zero Marginal Cost.
Think about Microsoft in the 90s. Once the "Golden Master" CD of Windows 95 was finalized, every subsequent copy sold was essentially pure profit. The cost to stamp one more CD? Pennies. The cost to add one more Office user today? Practically zero.
LLMs quietly blew up this rule.
Every single time you ask a model a question, your request gets scheduled onto a cluster of H100-class GPUs that burns real electricity and churns through trillions of floating-point operations.
- Software Model: One user = One license key (high margin).
- LLM Model: One user = Constant consumption of compute (variable, potentially negative margin).
This isn't "software" economics; it's "utility" economics—more like running a power plant than selling a copy of Excel.
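To make the contrast concrete, here is a minimal back-of-envelope sketch in Python. The GPU hourly rate, effective throughput, and per-user token volume are illustrative assumptions, not measured figures from any provider:

```python
# Marginal cost of one additional user: classic software vs. LLM inference.
# All numbers below are illustrative assumptions, not measured figures.

GPU_HOURLY_RATE = 2.50                 # assumed $/hour for one H100-class GPU
TOKENS_PER_GPU_SECOND = 150            # assumed effective throughput for a large model, batching included
TOKENS_PER_USER_PER_MONTH = 2_000_000  # assumed monthly token volume for an active user

def llm_marginal_cost(tokens_per_month: float) -> float:
    """Rough compute cost to serve one more user for a month."""
    gpu_seconds = tokens_per_month / TOKENS_PER_GPU_SECOND
    return gpu_seconds / 3600 * GPU_HOURLY_RATE

software_marginal_cost = 0.0  # one more license key: effectively free to "manufacture"

print(f"Classic software, one more user: ~${software_marginal_cost:.2f}/month")
print(f"LLM inference, one more user:   ~${llm_marginal_cost(TOKENS_PER_USER_PER_MONTH):.2f}/month")
```

The exact figures matter less than the shape: the software line stays at zero no matter how heavily the user leans on the product, while the inference line scales linearly with usage.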
2. The Whale vs. Minnow Dilemma
Here is where the pricing gets messy. The industry has anchored consumer expectations at roughly $20/month (thanks, ChatGPT Plus).
For a casual user who asks for a recipe once a week? That’s a great deal for OpenAI or Anthropic. But for a power user—someone using agentic workflows, heavy coding tasks, or massive context windows—that $20 doesn't even cover the electricity bill.
The Reality Check: It’s not hard for a truly heavy user to burn through well over $200 in actual compute costs per month.
Stack daily 32k-token coding sessions, a few giant document ingestions, and some long-running agent workflows, and you’re no longer in “$20 SaaS” territory—you’re in “small enterprise GPU bill” territory.
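Here is a rough sketch, in Python, of what that looks like once heavy usage is priced at per-token rates. The prices and daily volumes are assumptions chosen to be in the ballpark of current frontier-model API list prices, not quotes from any particular provider:

```python
# Rough monthly bill for a heavy "agentic" user, priced at per-token rates.
# All prices and usage volumes below are assumptions for illustration only.

PRICE_PER_M_INPUT = 3.00    # assumed $ per million input tokens
PRICE_PER_M_OUTPUT = 15.00  # assumed $ per million output tokens

# Assumed daily usage: long-context coding sessions, document ingestion,
# and agent loops that repeatedly re-read large contexts.
DAILY_INPUT_TOKENS = 2_000_000
DAILY_OUTPUT_TOKENS = 200_000
DAYS_PER_MONTH = 30

def monthly_compute_bill() -> float:
    input_cost = DAILY_INPUT_TOKENS / 1e6 * PRICE_PER_M_INPUT
    output_cost = DAILY_OUTPUT_TOKENS / 1e6 * PRICE_PER_M_OUTPUT
    return (input_cost + output_cost) * DAYS_PER_MONTH

print(f"Estimated compute cost: ~${monthly_compute_bill():.0f}/month vs. a $20 subscription")
# With these assumptions: (2.0 * 3.00 + 0.2 * 15.00) * 30 = $270/month
```

Swap in your own numbers; the point is that the heaviest users can cost an order of magnitude more to serve than the flat fee they pay.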
As models get smarter (GPT-6, GPT-7), they tend to get bigger, and bigger models require more inference compute. The gap between what the user pays and what that user costs to serve is widening, not shrinking.
3. The "Wikipedia" Effect on Moats
If this were a standard monopoly game, the AI labs would just raise prices once everyone was addicted. But they can’t.
Why? Because of the Encarta vs. Wikipedia dynamic.
Proprietary models (Encarta) are competing against a rapidly improving open-source ecosystem (Wikipedia/Linux). Qwen, DeepSeek, Mistral, and others are releasing powerful models and weights for free or at commodity pricing. If OpenAI tries to charge $100/month for GPT-5, a big chunk of users will simply switch to a quantized Qwen/DeepSeek variant running on a local Mac Studio or a cheaper API wrapper. OpenAI’s own open-weight releases are another tell that even the proprietary players acknowledge this reality: the gpt-oss 120B and 20B models are already solid for maybe 80% of everyday AI needs—trust me.
This destroys pricing power. You cannot charge monopoly rents when a "good enough" free alternative is just an ollama pull or Hugging Face download away for a large share of workloads.
4. The Hardware Ouroboros
Right now, the industry is trapped in a loop of massive Capital Expenditure (CAPEX). To build a better model, you need more GPUs. To serve that model, you need even more GPUs.
This benefits exactly one player: NVIDIA.
Everyone else is stuck in an arms race where they are spending billions on hardware that depreciates rapidly, chasing profit margins that might not exist. Unless we see a radical shift in architecture—moving away from pure Transformers to more efficient approaches (like linear attention, SSMs, or JEPAs)—this loop eventually breaks.
Either the stack gets radically more efficient, or the CAPEX story eventually hits a wall. There is no version of reality where data center spend outruns global GDP forever.
Here’s the catch-22: the very breakthroughs that would fix the unit economics—10–100x more efficient training and inference, less dependence on giant GPU clusters—would also blow a hole in the current NVIDIA-centric capex story. Less GPU dependency is great for users and long-term economics, but it directly undercuts NVIDIA’s ecosystem and strands a lot of today’s investment.
All the major AI labs and hyperscalers are exposed to that tension. If the bet on "more GPUs forever" fails, they’re not sitting on compounding assets; they’re staring at massive impairment risk on yesterday’s GPU mountains.
The Verdict
So, where does that leave us?
For the User:
Exploit the era. You are living through a massive venture capital subsidy. Every time you generate a complex image or run a massive code refactor for twenty bucks, you are getting more value than you are paying for. Enjoy the "free lunch" while the giants fight for market share.
For the Investor:
Stay cold. The current valuations assume software margins ($$$) on a hardware-heavy business model. Until we see a path to a "durable cash machine"—one where revenue scales faster than GPU costs—skepticism is your best friend.
Until you see either (a) radically cheaper training and inference per unit of capability, or (b) business models that make money after paying the GPU bill, treat every “AI is the new electricity” pitch as marketing, not math.
Don't bet the farm on the electricity bill.