TurboQuant, HBM demand reversal, geopolitics, and glass substrates — not a doomsday scenario, but a disciplined analysis.

Series: AI Compute Supply Chain | Part 4 of 5 Author: Sinclair Huang

Four pressure vectors. Four clocks are already ticking. The question is not whether these moats will last forever — it’s whether you know which one breaks first.

I developed a habit during my years in the electronics industry that I haven’t been able to shake in research and writing: every time I become confident about something, I force myself to find the strongest argument against it.

Not to be contrarian. But because I have watched too many “certainties” fail. The internet would make physical retail irrelevant. Feature phones would always dominate. The automotive chip shortage would last until 2030. Every one of those convictions was grounded in real evidence at the time and turned out to be incomplete.

The previous three articles argued that CoWoS, HBM, and ABF substrates have deep, structural moats. This article asks the question that always comes next: where do they break?

Pressure One — Software Efficiency Algorithms: TurboQuant and the Limits of CompressionWhen Google Research published TurboQuant, markets reacted immediately. The Korean KOSPI fell as much as 3%, with memory stocks leading the decline. The algorithm’s claim — that it could reduce large language model inference memory consumption by at least 6x, with no measurable accuracy loss — triggered fears that AI memory demand would fall structurally.

My first response was to go to arXiv and read the actual paper rather than the coverage. I want to be transparent: the mathematical derivations in the Johnson-Lindenstrauss sections are above the level I can work through completely — I did not follow every step. But for supply chain implications, the experiment section is what matters, and I read carefully.

The benchmark results are concrete: on the ESM-2 650M model, TurboQuant achieved a 7.1x memory reduction, compressing the KV cache from 330MB to 46.6MB. Autoregressive decoding accuracy was nearly intact, with cosine similarity above 0.96. A Triton-based fused decode attention kernel eliminated intermediate dequantization memory allocations, delivering a 1.96x speedup for the KV fetch operation. These are real, reproducible numbers — not theoretical claims.

The keyword in TurboQuant’s achievement, however, is inference, not training.

HBM’s function is to provide extreme memory bandwidth for matrix computations — the mathematical operations that happen at every step of both training and inference. TurboQuant compresses the KV Cache: the temporary context buffer that stores attention states during inference. Compressing what gets stored in that buffer does not reduce the bandwidth requirement for the underlying matrix multiplication that generates each token. The 7.1x memory reduction is real. Its impact on HBM demand is narrow.

The market conflated NAND and HBM — two different memory types serving two different functions. NAND Flash serves as a storage buffer where compressed KV Cache data would reside; reducing that requirement has real implications for NAND demand. HBM serves the compute pipeline at the matrix operation level; TurboQuant doesn’t touch it.

Then I saw KOSPI drop 3% and memory stocks tank indiscriminately. This pattern is familiar: when a technical news item breaks, markets initially move “all related stocks together.” A few days later, people who understand the technical differences come back and separate them. That divergence window is often where the information advantage lies.

There is also a more fundamental point about efficiency and demand. When inference costs fall, deployment scales up. This is Jevons Paradox — when a resource gets cheaper to use, people use more of it, not less. Cheaper electricity didn’t reduce electricity consumption; it enabled factories. Cheaper storage didn’t reduce data; it enabled the internet. Cheaper inference won’t reduce AI compute demand; it will enable applications that couldn’t exist at current prices.

The algorithm also establishes something analytically important beyond its immediate application: a theoretical boundary. The paper notes that TurboQuant approaches the information-theoretic optimum for KV Cache compression. There is not much room left beyond this point. Software efficiency optimisation in this domain is near its ceiling. That ceiling, paradoxically, reinforces the case for hardware bandwidth as the irreducible floor — which is precisely what HBM provides.

Pressure Two — HBM Demand Reversal: The Risk Micron Named, and the Transaction That Challenges ItThe third article quoted Micron’s 10-K risk disclosure: if HBM demand weakens and capacity shifts back to conventional DRAM, conventional DRAM pricing will face significant downward pressure.

This is the most credible structural threat to the HBM moat thesis. It has a clear mechanism, a precedent in prior DRAM cycles, and explicit acknowledgement from one of the three main players. The bear scenario is: AI capex decelerates, HBM orders fall, capacity shifts back to commodity DRAM, supply surges, and pricing collapses across the memory complex.

But the same week I was finalising this analysis, a transaction occurred that directly challenges the premise of that scenario.

Cisco, Kioxia, and Solidigm — the latter a subsidiary of SK Hynix — invested a combined US$2.5 billion in Nanya Technology, Taiwan’s fourth-largest DRAM manufacturer, through a private placement of exactly 351.57 million shares at NT$223.9 per share. Each investment was paired with a multi-year DRAM supply agreement.

The detail worth pausing on: Solidigm is a subsidiary of SK Hynix — the world’s leading HBM producer. And its parent company’s storage division needed to secure conventional DRAM supply from an outside source because internal capacity was insufficient.

I spent time in memory product planning earlier in my career. The scenario I remember most clearly from past cycles is not the shortage itself — it’s the moment when the shortage is recognised by the people who are inside the system. When a subsidiary of the leading HBM producer enters long-term supply agreements with a conventional DRAM manufacturer to ensure it has product, that is the market expressing a view — with capital, not commentary.

The mechanism is straightforward: HBM’s premium margins have driven the three major producers to systematically reduce conventional DRAM output. Nanya Technology management stated in March that DRAM supply tightness could extend through 2028, because the capacity reallocation to HBM has left conventional DRAM with “very limited” new supply additions globally. Nanya’s Q4 2025 results confirmed the pricing impact: gross margin of 49%, revenue up 357% year-over-year, with management guiding for further price increases above 20% in Q1 2026. These numbers belong to a company that doesn’t make HBM, which says everything about what HBM’s ascent has done to the market it left behind.

The Micron 10-K risk scenario is real. The condition for triggering it — AI capex deceleration severe enough to shift HBM demand — is not currently in evidence. The transactions happening in conventional DRAM are pointing in the opposite direction.

Pressure Three — Geopolitical Concentration: The Tail Risk That Analysis Cannot Fully ContainTSMC controls approximately 70% of global foundry revenue and over 90% of production at leading-edge nodes. Nearly all of that capacity is located in Taiwan, approximately 100 miles from mainland China.

TSMC’s own 20-F is direct about this. The Risk Factors section lists: earthquakes, typhoons, cyberattacks, supply chain disruption, geopolitical tension — any of which could disrupt operations. This is not boilerplate. Regulatory requirements compel specificity, and TSMC’s disclosure is specific.

Let me share a personal cognitive update. A few years ago, I was relatively dismissive of TSMC’s geopolitical risk. My logic was simple: economic interdependence creates strong incentives against extreme actions. That’s business logic, and it isn’t wrong.

But after rereading TSMC’s 20-F risk factor section in full, and reviewing several academic analyses of semiconductor supply chain resilience — including a paper on ScienceDirect examining Taiwan’s specific vulnerability to quarantine scenarios — I changed my stance. Not because I think conflict is more likely. But because I realised, business logic and geopolitical logic are two different rule sets. At certain critical thresholds, the latter can completely override the former. The Ukraine conflict provided a recent reminder that great-power decisions can override economic rationality when political stakes are framed differently.

My view on TSMC’s fundamental moat has not changed. My view on position sizing has. Those are two separate questions, and conflating them is an analytical error.

TSMC’s overseas expansion — Arizona, Japan, Germany — addresses the concentration risk over a 5–7 year horizon. But the Arizona facility’s first advanced node production isn’t expected to reach 2nm volume until approximately 2030. The gap between the depth of the moat and the concentration of the risk is real, and it runs through approximately 2030 before meaningful diversification takes effect. Overseas expansion also carries a marginal cost: TSMC management has guided approximately 1–2% gross margin dilution from overseas fabs. The long-run economics of non-Taiwan production are structurally less favourable — higher labour costs, higher energy costs, and less ecosystem density.

This risk belongs in the category of things that require position management, not things that can be analysed away.

Pressure Four — Glass Substrates: The Countdown Clock on the ABF MonopolyAjinomoto’s above-95% market share in ABF film is the most concentrated supply position in the AI hardware stack — comparable in structure to ASML’s position in EUV lithography. Both are cases where a specialised capability, developed in a different context, migrated into semiconductor manufacturing and became irreplaceable before the industry fully recognised it.

The long-term structural threat to ABF is glass substrates, and the technical case is grounded in a real physical limitation. As AI accelerators grow to “reticle-busting” dimensions exceeding 100mm × 100mm — integrating larger compute dies alongside eight or more stacks of HBM — organic ABF materials encounter what engineers have started calling the “warpage wall.” The coefficient of thermal expansion (CTE) of organic materials differs substantially from that of silicon. Under the extreme heat generated by a modern AI accelerator, the substrate expands and warps, fracturing the microscopic solder bumps connecting the chip to the board. The result is catastrophic yield loss at the exact moment packages are most expensive. Glass substrates solve this directly: glass CTE can be tuned to closely match silicon, providing dimensional stability under thermal stress, and its ultra-flat surface enables sub-2 micrometre line spacing — roughly ten times the interconnect density of organic substrates.

The competitive dynamics of the transition are worth understanding in detail because they are more structured than most coverage suggests.

Intel holds a meaningful first-mover advantage, having invested over US$1 billion to upgrade its Chandler, Arizona, facility for high-volume glass production. Intel has already deployed glass core substrate technology in its Xeon 6+ “Clearwater Forest” processor. Significantly, Intel has reportedly begun licensing its portfolio of over 600 glass substrate patents to equipment and materials partners — a strategic pivot from in-house manufacturing to ecosystem acceleration that could compress the commercialisation timeline for the entire industry.

Samsung’s approach is structurally different and arguably more powerful in the medium term. The company has formed what internal briefings describe as a “Triple Alliance” — combining the substrate expertise of Samsung Electro-Mechanics, the manufacturing scale of Samsung Electronics, and critically, the glass-handling technology derived from Samsung Display’s OLED production lines. Samsung Display has spent decades achieving precision control over large-format glass panels. That accumulated capability is not available to any semiconductor company starting from scratch. Samsung has established pilot lines in Sejong, South Korea, recently acquired a stake in JWMT — a firm specialising in proprietary laser-driven glass drilling technology — and has set a target of transitioning to glass interposers for advanced semiconductors by 2028.

Absolics, an SKC subsidiary operating a new facility in Covington, Georgia, received US$75 million in CHIPS Act funding and has established an annual production capacity of approximately 12,000 square meters, with mass production preparations targeting completion by the end of 2025.

My reading of the timeline is more cautious than some coverage suggests. Intel’s own technical presentations have acknowledged that the double-sided glass substrate process is “difficult,” and that reliability data is still being accumulated. Industry analysts estimate that glass substrates are unlikely to exceed 20% market penetration in advanced packaging before 2028–2029, with 2026–2027 remaining primarily pilot-product years for ultra-premium server applications.

One thing I still haven’t found a good answer to: how does Ajinomoto itself view the glass substrate threat? I’ve scanned their annual reports and earnings call summaries from the past two years. Almost all of it is about ABF capacity expansion. On glass substrates — almost silence. Two interpretations: they think the threat is distant and choose not to address it publicly, or glass substrate chemistry falls outside their core competency domain, and they’ve made a strategic decision not to enter. I lean toward the first — but I don’t have enough data to be certain. If anyone has tracked Ajinomoto’s R&D direction or knows people in that industry, I’d genuinely like to hear your perspective. This remains an open question in this series.

The timeline I would monitor: 2028 as the critical milestone for Samsung’s glass interposer volume production target. If that date slips — which is common in semiconductor manufacturing — the ABF window extends further. If it holds, moat erosion begins in earnest in the 2029–2031 period.

PeriodABF Moat StatusGlass Substrate Status2026–2027IntactPilot production, ultra-premium only2028DominantSamsung, Absolics, Intel target volume entry2029–2030Beginning to erode20%+ penetration possible**2031+**Structural transitionGlass leads high-end market

Closing: Moats Have a Time DimensionEvery real moat, before it is challenged, looks like a permanent feature of the landscape.

The skill in supply chain analysis is not identifying the moats that exist today. It is knowing which ones have an expiration condition, what that condition looks like, and which clock is already running.

TurboQuant drew the boundary of software efficiency — and the 7.1x compression figure approaching the information-theoretic optimum means that boundary is now visible, which paradoxically reinforces hardware bandwidth as the irreducible floor. Micron’s 10-K named the precise conditions for HBM demand reversal. The Taiwan concentration risk requires position management, not just analytical acknowledgement. Glass substrates have a known timeline with known critical milestones — Intel’s patent licensing strategy, Samsung’s 2028 production target, Absolics’s Georgia ramp.

None of these is a thesis-breaking discovery. All of them are things a serious analyst needs to hold alongside the bullish framework — not to undermine it, but to know its edges.

Article 5 is written for anyone wondering how this semiconductor reality connects to your own industry, your company’s strategy, or the shape of the next decade. If that sounds like you, subscribe below so you don’t miss it.

Moats are not permanent. They are directional. The question is not whether this moat will last forever — it’s whether it will last long enough, and whether you know which clock is already ticking.

This is the fourth article in the AI Compute Supply Chain series. Articles 1–4 cover: technical foundations, supply chain power mapping, SEC filing analysis, and this stress test. Article 5 — on the broader industrial transformation and what the AI infrastructure buildout means for industries beyond semiconductors, through 2030 — is in progress.

About the AuthorSinclair Huang is a senior advisor and researcher focused on semiconductors, AI infrastructure, and supply chain strategy. He has held leadership roles across the electronics and biotech industries for more than twenty-five years, working across product, procurement, and operations. His current research examines how technological bottlenecks, capital allocation, and institutional structures shape competitive advantage in the AI era.

Further Reading — AI Compute Supply Chain Series- Part 1 — Technical Foundations

CoWoS, HBM, and ABF: what they are, how they work, and why they became the bottlenecks in AI compute.- Part 2 — Power Map The Real AI Supply Chain: A Power Map Beyond the GPU — mapping where pricing power actually sits upstream of the GPU.- Part 3 — SEC Filings How Deep Is the Moat? Reading TSMC, SK Hynix, and Micron Through Their SEC Filings(This Article)- Part 4 — Stress Test Stress-Testing the Moat: Four Threats That Could Rewrite the AI Supply Chain — algorithmic compression, HBM demand reversal, geopolitical concentration, and glass substrates.- Working Paper — SSRN The Architecture of Leverage: Structural Concentration and Competitive Moats in the AI Compute Supply Chain — an extended working paper with an Irreplaceability Index (IRI), patent portfolio analysis, and technology roadmap.### References- Taiwan Semiconductor Manufacturing Company Limited (TSMC), Form 20‑F and annual reports — customer concentration, prepayment liabilities, process mix, and capex guidance.- Micron Technology, Inc., Form 10‑K and Form 10‑Q filings — HBM revenue disclosure, customer concentration, and risk factor discussions.- SK Hynix Inc., earnings call transcripts and investor presentations — HBM4 architecture, “customer‑specific logic die,” and capacity allocation commentary.- Goldman Sachs and UBS equity research — projections on HBM market share, SK Hynix’s position in NVIDIA’s Rubin platform, and hyperscaler AI capex.- Academic and policy work on disclosure incentives, accounting conservatism, and risk reporting in regulatory filings.- Additional industry reports and primary documents are cited throughout the essay.### DisclaimerThis article is for informational and educational purposes only and does not constitute investment advice or a recommendation to buy or sell any security. The views expressed are solely those of the author and are based on information believed to be reliable at the time of writing, but no representation or warranty is made as to their accuracy or completeness. Any forward-looking statements, scenario analyses, or interpretations of regulatory filings are inherently uncertain and may change without notice. Readers should conduct their own research and, if necessary, consult a qualified financial advisor before making investment decisions.

Hashtags#ArtificialIntelligence #Semiconductors #SupplyChain

#TSMC #Micron #SKHynix #SECFilings #Investing #AIInfrastructure #PricingPower