Self-commoditization is the internet’s end state. As such, future auteurs will build brand, and thus distribution, via @YouTube. Traditional studios become service providers when creators bring their own audience to the table.
Replying to @mattparlmer
intelligence beyond a certain threshold is likely overvalued. commoditization is inevitable imo
IBM CEO Arvind Krishna thinks we're entering an AI bubble. Here's how he gets there: Over 100 gigawatts of AI data center capacity has been committed globally, at roughly $60 to $80 billion in semiconductor spend per gigawatt. That points to a $6 to $8 trillion buildout. For that to make economic sense with a 5 to 7 year payback at 20 to 30% margins, you'd need $1 to $2 trillion in new annual AI revenue above what exists today. He doesn't think that demand is coming on that timeline. The second problem he sees is model commoditization. The largest AI models will end up as commodities with low switching costs between them. His conclusion: the market is pricing for 6 to 12 large model companies surviving but the economics probably only support 2 or 3. Where the opportunity still exists: On the consumer side, the companies with existing distribution win. "Some will disappoint, many will thrive but not all will thrive."
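The buildout arithmetic in the post above can be checked with a short sketch. The gigawatt, cost, and payback figures come from the post; the straight-line payback formula (capital cost divided by payback years) is a simplifying assumption made here, and the function names are illustrative. The post does not say how the 20 to 30% margin enters its estimate, so the sketch leaves margins out.

```python
# Figures quoted in the post; the payback formula is an assumption.
GIGAWATTS = 100                # committed AI data center capacity
COST_PER_GW_BN = (60, 80)      # semiconductor spend per GW, $ billions (low, high)
PAYBACK_YEARS = (5, 7)         # payback window in years (short, long)


def buildout_trillions(gigawatts: float, cost_per_gw_bn: float) -> float:
    """Total buildout cost in $ trillions."""
    return gigawatts * cost_per_gw_bn / 1000


def annual_return_needed(capex_trillions: float, years: float) -> float:
    """Annual cash return that repays the capex over `years` (straight line)."""
    return capex_trillions / years


low_capex = buildout_trillions(GIGAWATTS, COST_PER_GW_BN[0])    # 6.0
high_capex = buildout_trillions(GIGAWATTS, COST_PER_GW_BN[1])   # 8.0

# Cheapest buildout with the longest payback vs. the costliest with the shortest.
low_return = annual_return_needed(low_capex, PAYBACK_YEARS[1])    # about 0.86
high_return = annual_return_needed(high_capex, PAYBACK_YEARS[0])  # 1.6

print(f"Buildout: ${low_capex:.0f}T to ${high_capex:.0f}T")
print(f"Annual return needed: ${low_return:.2f}T to ${high_return:.2f}T")
```

Under this assumption the required annual return lands at roughly $0.9 to $1.6 trillion, in the same range as the $1 to $2 trillion quoted. If that return had to come out of a 20 to 30% margin, the revenue needed would be several times higher.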
🤖 GLM 5.2: A Million-Token Open-Source Model Is Coming. What It Means for Engineering Teams.

Next week's expected release of GLM 5.2, an open-source LLM with a 1 million token context window, signals a significant shift in the capabilities accessible to developers outside the walled gardens of major AI labs. This isn't just about a larger context; it's about new architectures for safety and control that directly impact engineering workflows. Here’s what actually changed and why it matters for technical leaders and developers:

1M Token Context Becomes an Open-Source Reality.
Until now, million-token context has been a premium feature of closed-source models like Gemini 1.5 Pro. GLM 5.2 democratizes this capability. For engineering teams, this means the ability to feed entire codebases, extensive API documentation, or complex project histories into the model for comprehensive analysis, refactoring, and generation tasks without complex chunking strategies.

Demonstrated Strength in Complex Code Generation.
Theory is one thing, but execution is another. Internal demos show GLM 5.2 generating a functional clone of Minecraft, complete with procedural infinite terrain, and a Pokémon clone that correctly implements the turn-based battle logic. This level of practical application suggests a high degree of reliability for complex, multi-step software development tasks, moving beyond simple boilerplate generation.

Architecture for Control and Safety.
The model reportedly includes two novel features for production environments. First, a setting to control "thinking intensity," allowing teams to balance performance against cost and latency for different use cases. Second, built-in hooks for identity and access management (as shown with Descope integrations), addressing a critical security concern for AI agents that need to interact with internal systems and APIs. This is a crucial step towards enterprise-ready agentic applications.

A Clear Reaction from Incumbents.
The market is not static. Following the strong reception of Anthropic's Fable 5 for its agentic coding skills, OpenAI immediately announced user-friendly updates and promotions for its Codex API. This reactive posture, combined with Google's release of the Apache 2.0-licensed DiffusionGemma, indicates that established players are feeling the pressure from high-capability open-source alternatives.

Why This Matters:
The arrival of GLM 5.2 signifies the acceleration of capability commoditization in the AI space. For engineering managers and senior ICs, it means the strategic moat of proprietary models is shrinking. The focus will increasingly shift from access to large context to the application and orchestration of these powerful open-source tools. It lowers the barrier to entry for building sophisticated AI agents and forces a re-evaluation of build-vs-buy decisions for AI-native features.

How will your team's development roadmap change when 1M-token context becomes a baseline, not a premium feature?

Find all resources, notes, and links from the video analysis on my GitHub: github.com/paul010/dalei-you…
Watch the full breakdown and see the demos here: youtu.be/SbqW8Emp9Uw

#AI #OpenSource #LLM #GLM52 #SoftwareDevelopment #EngineeringManagement #AIagent #MachineLearning #TechLeadership #GenAI
Replying to @satyanadella
Satya, your post argues that human capital does not become less valuable; it only becomes more valuable as token capital grows through these learning loops. And yet Microsoft has recently announced massive job cuts, displacing large numbers of employees whose expertise in judgment, relationships, and pattern recognition is supposedly central to driving token capital. That raises a direct contradiction between the philosophy presented in your post and the operational reality. If humans set the ambitious goals, connect dots, and provide direction without which compute runs in circles, how does reducing the workforce align with building the compounding institutional knowledge you describe? The future of the firm relies on every company owning its learning loop, but actions that shed talent risk losing the tacit knowledge unique to the organization. What mechanisms exist to capture and retain that human capital during these transitions instead of letting it dissipate? Does prioritizing AI capability over retaining people not risk the very commoditization of expertise across industries that you warn could lead to societal pushback?
Replying to @elonmusk
Opposite of Bill Gates era - Nadella emphasizes that companies should be able to swap out general-purpose base models without losing their accumulated company-specific expertise.
linkedin.com

How this directly relates to self-hosted AI models

Self-hosted AI (running models on your own hardware, private cloud, or on-prem infrastructure, e.g., via tools like Ollama, vLLM, Hugging Face, or enterprise deployments) is one of the most practical ways to implement exactly what Nadella describes. Here’s the connection:

• Data control and privacy: Proprietary or sensitive company data never leaves your environment. This is essential for building real “token capital” without feeding it to external APIs.
• Customization and ownership: You take strong base models (open-source ones like various Llama variants, or xAI’s previously open-sourced Grok weights) and fine-tune/RAG/agent-ify them on your data and workflows. The resulting fine-tuned models, embeddings, or agent systems become your owned “token capital”, exactly the proprietary asset Nadella wants companies to compound.
• Internal learning loops: Self-hosted setups make it straightforward to create closed feedback systems: log human-AI interactions internally, use them to improve your models/agents over time, and retain full ownership.
• Avoiding commoditization and lock-in: Nadella warns against relying solely on external models that could make AI capabilities generic. Self-hosting lets you keep the differentiated, company-specific intelligence inside your control while still benefiting from the best available base models.
• Flexibility: You can switch base models (e.g., try a new open model) without losing your accumulated internal knowledge, aligning perfectly with Nadella’s point.

In short, Nadella’s vision is not “just use our cloud AI.” It’s “build your own AI brain on top of models, using your unique data and processes.” Self-hosting (or tightly controlled private hosting) is a core technical path to achieving that.

Elon’s perspective and xAI angle

Elon simply saying “Interesting” suggests he sees value in the framing. xAI has historically supported openness in AI (e.g., fully open-sourcing Grok-1 and releasing weights for later versions at times), which empowers exactly this kind of self-hosted/custom deployment.
linkedin.com

Many in the replies to the thread explicitly prefer open-source/self-hosted options over being locked into any single corporate ecosystem (Microsoft or otherwise). Nadella’s post can be read as Microsoft acknowledging that the future belongs to those who own their AI capabilities internally, a point that resonates with the broader self-hosted/open-weights movement.

Bottom line: Nadella is making a strategic case for the kind of controlled, proprietary, self-hosted (or privately deployed) AI infrastructure that lets organizations build lasting advantages. Elon’s “Interesting” flags this as noteworthy in the ongoing debate about who will actually capture value in the AI era: the big model providers, or the companies (and individuals) who effectively self-host and customize on top of them.

This is why self-hosted AI has been gaining traction for both enterprises (data sovereignty, customization, cost/control) and individuals (privacy, offline use, experimentation). Nadella’s piece gives it a high-level business rationale.
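The "swap the base model without losing accumulated knowledge" idea from the post above can be sketched as a thin interface between company-owned state and a replaceable model backend. Everything named here (`TextModel`, `EchoModel`, `CompanyAssistant`) is hypothetical and stands in for whatever serving stack is actually used; the stub model only echoes its prompt so the example runs without any external library.

```python
from dataclasses import dataclass, field
from typing import Protocol


class TextModel(Protocol):
    """Hypothetical interface: any backend that turns a prompt into text."""

    def complete(self, prompt: str) -> str: ...


@dataclass
class EchoModel:
    """Stub backend for illustration; a real one would call a served model."""

    name: str

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


@dataclass
class CompanyAssistant:
    """Company-owned knowledge lives here, outside the swappable model."""

    model: TextModel
    knowledge: list[str] = field(default_factory=list)

    def learn(self, note: str) -> None:
        self.knowledge.append(note)

    def ask(self, question: str) -> str:
        context = " | ".join(self.knowledge)
        return self.model.complete(f"context: {context} || question: {question}")


assistant = CompanyAssistant(model=EchoModel("model-a"))
assistant.learn("refunds over $500 need manager approval")
first = assistant.ask("can I refund $700?")

# Swap the base model; the accumulated notes are untouched.
assistant.model = EchoModel("model-b")
second = assistant.ask("can I refund $700?")
print(first)
print(second)
```

The design choice is that the knowledge store is never written into the model object, so replacing the backend is a one-line change. Fine-tuned weights, by contrast, are tied to a specific base model and would need retraining after a swap.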
Interesting. AI will create a cognitive loop between people and systems unlike past shifts, changing how firms learn and build IP. Companies must develop human capital and token capital that compound via learning loops on top of models. Turn workflows and judgment into agentic systems that improve with private evals and reinforcement on real traces. This learning loop becomes the firm’s compounding IP, a hill climbing machine hard to replicate. Build a frontier ecosystem where every company owns its knowledge loop to avoid commoditization and ensure broad value creation.
Apparently, even at this juncture and rupture, at a time that most urgently calls for re-imagination, we can’t imagine any frameworks and building blocks that are different from “commoditization”, “business”, “company”.
Replying to @THEPOKEPLATEAU
yugioh will always be garbage for these kinds of gimmicks/commoditization schemes. even if he manages to pump it, it will be another beanie baby scenario.
Summary: He sees AI creating a cognitive loop between people and systems unlike past shifts, changing how firms learn and build IP. Companies must develop human capital and token capital that compound via learning loops on top of models. Furthermore, they must turn workflows and judgment into agentic systems that improve with private evals and reinforcement on real traces. Then, this learning loop becomes the firm’s compounding IP, a hill climbing machine hard to replicate. Compounding this builds a frontier ecosystem where every company owns its knowledge loop to avoid commoditization and ensure broad value creation. Personal thoughts: this is one "hellava" way to prevent model and knowledge collapse. I agree with this position.
A $1,499 AMD box can load a 235B parameter model. That headline has 6,800 likes and everyone's celebrating the death of Nvidia's pricing moat. But capacity isn't the bottleneck. Bandwidth is. And nobody's posting about that. Here's what the real numbers say:

1/ THE MEMORY CAPACITY ILLUSION
Strix Halo's Ryzen AI Max 395 gives you 128GB unified memory, 96-110GB addressable as VRAM. An RTX 5090 gives you 32GB. On paper, this is a 3x memory advantage at half the price. But memory capacity determines what you can load. Memory bandwidth determines how fast it generates tokens. Strix Halo pushes roughly 256 GB/s. Apple's M3 Ultra does 800 GB/s. The DGX Spark's GB10 does 273 GB/s with CUDA's optimized stack on top.

Independent benchmark (Qwen3.5-27B IQ4, same model, same workload):
- AMD Strix Halo (~$2,500): ~16 tok/s decode
- Apple Mac Studio M3 Ultra (~$5,000): ~40 tok/s decode
- NVIDIA DGX Spark (~$3,999): ~17 tok/s decode, but 1,939 tok/s prefill

Strix Halo wins on $/token loaded. It loses on $/token generated by a factor of 2.5x against Apple.

2/ WHY MOE MODELS MAKE THE GAP INVISIBLE
The "gotcha" tweet everyone's sharing: someone running Qwen 3.6-35B-A3B at Q8, 131K context, 40-50 tok/s on Strix Halo. Sounds incredible. But that's an MoE with only 3B active parameters per forward pass. The 35B sits in memory, but only 3B gets computed. Of course it's fast. This is the dirty secret of the local AI hardware moment: MoE models make every box look good because they minimize active computation. Run a dense 70B model where all parameters fire every token, and the bandwidth cliff appears. Strix Halo drops to single-digit tok/s on dense models that the M3 Ultra handles at usable speed. The capacity-versus-bandwidth gap isn't a spec sheet footnote. It's the difference between "I can technically load it" and "I can actually use it for production work."

3/ THE SOFTWARE STACK TAX
Every Strix Halo review includes a sentence that should worry you: "ROCm or Vulkan?" This isn't a preference question. It's an admission that the AMD software stack is fragmented enough that users must choose between two incomplete implementations, benchmark both, and hope one doesn't break on the next model they pull. NVIDIA's CUDA isn't faster because it's magic. It's faster because it's predictable. You install it, it works, the numbers are reproducible. Apple's MLX reached the same reliability threshold in 18 months. AMD's ROCm has been "almost there" for five years. The real TCO of a Strix Halo isn't $1,499 plus electricity. It's $1,499 plus the hours you spend in ROCm/Vulkan Discord channels debugging why llama.cpp segfaults on your quant config. That time has a price, and for consultants billing $150/hr, it eats the hardware savings fast.

4/ THE BUSINESS MODEL INSIGHT NOBODY'S FRAMING RIGHT
The most valuable tweet in this entire wave isn't the Lisa Su demo or the spec comparisons. It's the consultant who turned $2,800/month cloud bills into $8 electricity costs and watched consulting margins jump from 30% to 80-90%. The pitch that closes deals isn't "it's cheaper." It's: "Your data physically lives in your office. Not OpenAI's, not mine." Lawyers, healthcare, finance (the clients who can't touch cloud AI) sign on that single sentence. Local inference doesn't disrupt cloud AI pricing. It creates a new service category: data-sovereignty AI consulting, where the moat isn't model access (anyone can download Qwen) or hardware (anyone can buy a Strix Halo) but the workflow integration trust relationship. The box is commodity. The integration is the product.

BUT HERE'S WHAT EVERYONE'S MISSING:
The Strix Halo narrative assumes hardware commoditization is the endgame. It's not. The next 18 months will be a software ecosystem war disguised as a hardware price war. AMD can match Nvidia on memory capacity today. It cannot match CUDA's developer experience without a multi-year ecosystem investment that no amount of $1,499 boxes can substitute for. Apple understood this. That's why they built MLX instead of betting on raw specs. The M3 Ultra's 800 GB/s bandwidth matters, but MLX "just working" matters more for adoption.

The companies building local AI businesses on Strix Halo today are making a bet that AMD's software stack will mature faster than their patience runs out. Some will win that bet. Many will end up with $1,499 paperweights running Q4 quants at 8 tok/s, wondering why the demo looked so much better than reality.

The question isn't whether $1,499 can run a 235B model. It's whether the generation that grows up on local AI will accept "tinker with ROCm" as the price of sovereignty, or whether predictability wins over capacity every time, the same way it did when CUDA killed OpenCL a decade ago. History doesn't repeat. But it benchmarks.
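The capacity-versus-bandwidth argument in the post above can be sketched with a common back-of-the-envelope rule: during decode, each generated token requires reading roughly all active weights from memory once, so memory bandwidth divided by the bytes of active weights gives a rough ceiling on tokens per second. This rule is an assumption added here, not something the post states. It ignores KV-cache traffic, compute limits, and software overhead, and the function name and bytes-per-parameter values (0.5 for a 4-bit quant, 1.0 for Q8) are illustrative.

```python
def decode_ceiling_tok_s(bandwidth_gb_s: float,
                         active_params_b: float,
                         bytes_per_param: float) -> float:
    """Rough upper bound on decode speed for a bandwidth-bound model.

    bandwidth_gb_s:   memory bandwidth in GB/s
    active_params_b:  parameters touched per token, in billions
    bytes_per_param:  storage per parameter after quantization
    """
    gb_read_per_token = active_params_b * bytes_per_param
    return bandwidth_gb_s / gb_read_per_token


# Dense 70B at 4-bit: every parameter is read for every token.
strix_dense = decode_ceiling_tok_s(256, 70, 0.5)   # about 7.3 tok/s
m3_dense = decode_ceiling_tok_s(800, 70, 0.5)      # about 22.9 tok/s

# MoE with 3B active parameters at Q8: only the active experts are read.
strix_moe = decode_ceiling_tok_s(256, 3, 1.0)      # about 85.3 tok/s

print(f"Strix Halo, dense 70B Q4: {strix_dense:.1f} tok/s ceiling")
print(f"M3 Ultra,   dense 70B Q4: {m3_dense:.1f} tok/s ceiling")
print(f"Strix Halo, MoE 3B-active Q8: {strix_moe:.1f} tok/s ceiling")
```

These ceilings line up with the post's qualitative claims: single-digit decode for a dense 70B model at 256 GB/s, and a much higher bound for the 3B-active MoE, of which the reported 40-50 tok/s is a plausible fraction.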
Agreed mostly! AI companies need a proprietary moat. For some, that moat will be custom models trained on unique data. For others, it will be workflow ownership, distribution, or exclusive datasets. Pure model wrappers face the greatest risk of commoditization. $MSFT $GOOGL
Build a frontier ecosystem where every company owns its knowledge loop to avoid commoditization and ensure broad value creation.
Replying to @elonmusk
The only mindbender here is that Nadella mentions upcoming commoditization because of AI, when Microsoft notoriously milks Windows & Office to prevent commoditizing at all. Ditto for the large ERP firms; their wares require armies of clerks to even marginally work. Et tu, AI?
Replying to @satyanadella
Everyone selling the “own your learning loop” vision has one contradiction they never name. The pitch goes: a few frontier models are eating everyone’s expertise and commoditizing it, so every firm must build its own loop - capture your people’s tacit judgment, run RL on real internal traces, encode the “company veteran” into a system you own and can swap models under. Build the moat. Compound the advantage. First movers win. Read that again. The mechanism you’re prescribing inside the firm is the exact extraction you’re condemning between the labs and everyone else. Just pointed inward. When a frontier model absorbs the open web’s expertise and sells it back, that’s commoditization and the political economy “won’t tolerate it.” When your platform absorbs your engineers’ and analysts’ judgment, makes it “replicable and scalable,” and books it as an asset the firm owns and the human doesn’t - that’s empowerment. Same move. Different beneficiary. “You can offload a task but never your learning.” Whose learning? The entire point of the loop is to externalize the individual’s learning into the firm’s token capital. That line comforts the asset owner, not the person being mined. And “human capital only becomes MORE valuable” is asserted, never argued - against the grain of the very mechanism. If the loop really captures the veteran well enough that you can hot-swap the model beneath it, then by that same logic it captures the veteran well enough to depend on them less over time. You can’t promise both perfect encoding and permanent indispensability. This isn’t a “stable equilibrium.” Cumulative advantage, hard-to-replicate moats, first-mover lock-in - those are divergent dynamics. They concentrate. A genuinely distributed outcome would require forces pushing AGAINST the loop you’re celebrating: open weights, interoperability, regulation, portability you can actually verify. Calling the concentrating thing “stable” is the tell. 
One more: the metaphor is a hill-climbing machine. Hill climbing is famous for getting stuck in local optima. A system trained on its own traces risks laundering yesterday’s judgment into tomorrow’s ground truth - and the better your moat, the harder it is to climb back out of confident-but-wrong. Compounding isn’t monotonically good. I’m not against firms owning their IP. I’m against a vision that borrows anti-concentration moral language to sell a concentration tool, and never explains why the extraction is villainous one layer up and virtuous one layer down.
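The local-optimum point in the post above can be illustrated with a toy example. The landscape values below are made up for illustration; the sketch only shows that a greedy climber which accepts nothing but strict improvements stops on whichever peak is nearest to its starting point.

```python
from typing import Callable

# Hypothetical 1-D landscape: a small peak at index 2, a taller one at index 7.
LANDSCAPE = [0, 2, 4, 3, 1, 2, 5, 9, 6]


def score(i: int) -> int:
    return LANDSCAPE[i]


def hill_climb(score_fn: Callable[[int], int], start: int, lo: int, hi: int) -> int:
    """Greedy hill climbing: move to the best neighbor while it strictly improves."""
    x = start
    while True:
        neighbors = [n for n in (x - 1, x + 1) if lo <= n <= hi]
        best = max(neighbors, key=score_fn)
        if score_fn(best) <= score_fn(x):
            return x  # no neighbor is better: a peak, possibly only a local one
        x = best


stuck_at = hill_climb(score, start=0, lo=0, hi=len(LANDSCAPE) - 1)
found_at = hill_climb(score, start=5, lo=0, hi=len(LANDSCAPE) - 1)

print(f"start=0 stops at index {stuck_at} (score {score(stuck_at)})")
print(f"start=5 stops at index {found_at} (score {score(found_at)})")
```

Starting from index 0, the climber stops at the lower peak (score 4) and never sees the higher one (score 9), because reaching it would mean first accepting worse scores. That is the sense in which a loop trained only on its own traces can stay confident but wrong.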
Cheaper tokens ≠ less compute. They trigger more compute demand. This is Jevons Paradox in AI: cheaper intelligence causes total consumption to explode as new use cases, agentic workflows, and broader adoption become viable. Model giants like OpenAI face real price-war pressure. But the picks-and-shovels layer (power, optics, platforms) wins on volume. $IREN, $AAOI, $PLTR and many more are set up beautifully.

Core insight from recent discussions (Nebius/20VC with Roman Chernin, BG2 Pod, Jordi Visser): every time cost-per-token or cost-per-unit intelligence drops, demand rises, not because people use less, but because they use far more. Previously uneconomic tasks become viable, complex multi-step agentic workflows scale, and new markets open. Nebius saw some of its best sales weeks right after major cheap open-source model drops. The market isn’t near its ceiling.

Jordi Visser laid it out clearly: price compression and commoditization at the model layer (OpenAI/Anthropic facing drastic cuts, 80% of workloads shifting to far cheaper models) versus real inflation and bottlenecks at the physical layer (power, materials, networking, optics). Demand for intelligence is near-infinite. Cheaper per-token pricing unlocks the floodgates on total tokens consumed.

Who benefits? Model-layer players can get squeezed on margins even as headline revenue grows. The infrastructure and enablement layer feasts on the volume explosion:

• $IREN: Sustainable power, land, and AI/HPC cloud pivot. Power is the ultimate bottleneck.
• $AAOI: Optical transceivers and photonics for AI data centers. Hyperscalers are ordering 800G modules in volume as faster, power-efficient connectivity becomes non-negotiable.
• $PLTR: Enterprise AI platforms (AIP/Gotham). As agentic workflows and production deployments explode, companies need real software to deploy, govern, and scale AI, not just raw GPUs.

Many more names across power, cooling, networking, and specialized infra will ride the same wave.

@BG2Pod highlighted the bullish math: monetization per gigawatt is rising, not falling. Demand outstrips supply. Inference revenue is already running well over $200B annualized and climbing fast. Agents will multiply token consumption further. Cheaper tokens plus efficiency gains expand what’s possible within power envelopes, driving more overall buildout.

This is why the “AI mid-cycle slowdown” narrative is incomplete. It’s healthy digestion and rotation inside a secular bull market. Capital is rotating toward the real beneficiaries of the physical buildout and the software layer that turns raw compute into usable intelligence at scale.

The compute market remains early. We’re still in the first innings of widespread agentic adoption and enterprise transformation. Cheaper intelligence democratizes access → more users and businesses adopt → agents multiply usage per task → total token/compute demand detonates. Model giants feel price pressure. Infra and platform winners feel the volume tailwind.
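The Jevons-style claim in the post above depends on how strongly demand responds to price. A minimal sketch, assuming a constant-elasticity demand curve (a textbook simplification chosen here; the post gives no elasticity figure, and every number below is hypothetical):

```python
def tokens_demanded(price: float, base_price: float,
                    base_tokens: float, elasticity: float) -> float:
    """Constant-elasticity demand: quantity scales as (price ratio) ** -elasticity."""
    return base_tokens * (price / base_price) ** (-elasticity)


def total_spend(price: float, base_price: float,
                base_tokens: float, elasticity: float) -> float:
    """Price times quantity at the new price."""
    return price * tokens_demanded(price, base_price, base_tokens, elasticity)


BASE_PRICE, BASE_TOKENS = 1.0, 100.0   # hypothetical starting point, spend = 100
NEW_PRICE = 0.5                        # price per token halves

elastic_tokens = tokens_demanded(NEW_PRICE, BASE_PRICE, BASE_TOKENS, 2.0)    # 400.0
elastic_spend = total_spend(NEW_PRICE, BASE_PRICE, BASE_TOKENS, 2.0)         # 200.0
inelastic_tokens = tokens_demanded(NEW_PRICE, BASE_PRICE, BASE_TOKENS, 0.5)  # about 141.4
inelastic_spend = total_spend(NEW_PRICE, BASE_PRICE, BASE_TOKENS, 0.5)       # about 70.7

print(f"elasticity 2.0: tokens {elastic_tokens:.0f}, spend {elastic_spend:.0f}")
print(f"elasticity 0.5: tokens {inelastic_tokens:.1f}, spend {inelastic_spend:.1f}")
```

Token volume rises in both cases, but total spend only rises when elasticity exceeds 1. If price tracks the compute needed per token, the same threshold decides whether total compute grows, so the post's thesis amounts to a bet that demand for intelligence is elastic in this sense.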
Replying to @AIInvestorHQ
I’ll take the under… commoditization is the trend.
Replying to @FurkanGozukara
If everyone has the same access to these tokens then the differentiator won’t be tokens but how they are used. That is his point, and it’s just obvious. Commoditization of models is inevitable, but the ecosystem (proprietary processes and data) will determine how they are used.
Replying to @elonmusk
Your reflection on the cognitive loop and the need to preserve each company’s own “token capital” is profound. In the age of AI, the real commoditization doesn’t come only from general models, but from the loss of control over the learning loop itself. This is where a personalized AI pilot system (an autonomous agent orchestrator) changes everything. Instead of simply consuming models, this system acts as a digital veteran of the company: it continuously captures human judgment, business patterns, and real workflows, then transforms them into a living, evolving institutional memory. Every interaction strengthens the loop: the human sets the strategic objective and nuanced context, while the pilot agent executes, measures, iterates, and reinjects the acquired knowledge — without ever outsourcing the learning. This creates exactly the “hill-climbing machine” you mentioned: intellectual property that improves in a compounding way, resisting commoditization because it is embodied in the organization’s unique context. Human capital does not become less valuable; it becomes the irreplaceable conductor.