The RAM shortage could last years
54 points by omer_k 7 hours ago | 59 comments

stuxnet79 6 hours ago
Ok so Samsung, SK Hynix and Micron do not have the capacity to meet demand. Also, what little capacity they do have, they are allocating to HBM over DRAM. Based on my limited knowledge, HBM cannot be easily repurposed for consumer electronics. Translation: Main Street is cooked for the next 3-4 years.

It doesn't stop there though. OpenAI is currently mired in a capital crunch. Their last round just about sucked all the dry powder out of the private markets. Folks are now starting to ask difficult questions about their burn rate and revenue. It is increasingly looking like they might not commit to the purchase order they made which kick-started this whole panic over RAM.

Soo ... how sure are we that the memory makers themselves are not going to be the ones holding the bag?

reply
torginus 5 hours ago
The Radeon VII came out in 2019 as a $700 consumer GPU with a 1 TB/s HBM2 memory subsystem, which is more than any consumer GPU you can get today, including the high-end ones, afaik. At that point in time, there was a whole lineup of AMD GPUs with HBM, going down into the midrange.
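That 1 TB/s figure is just the stack math. A quick sanity check (the 4-stack, 1024-bit, 2.0 Gbit/s-per-pin configuration below is the commonly cited Radeon VII spec, not something stated in this thread):

```python
# Sanity-checking the Radeon VII's 1 TB/s claim from HBM2 stack math
# (assumed config: 4 stacks, 1024-bit bus per stack, 2.0 Gbit/s per pin).
stacks = 4
bus_bits = 1024          # interface width per stack
pin_rate_gbps = 2.0      # Gbit/s per pin
bandwidth_gb_s = stacks * bus_bits * pin_rate_gbps / 8  # convert bits to bytes
print(bandwidth_gb_s)    # 1024.0 GB/s, i.e. ~1 TB/s
```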

If they could make this stuff and sell it to regular people at very palatable prices back in 2019, why have they now decided that this is the technology of the gods, unaffordable to mere mortals?

reply
cco 4 hours ago
That card only had 16GB of memory; its memory bandwidth was 1TB/s.
reply
imtringued 4 hours ago
You're saying this in a world where AMD's highest end consumer GPU in 2026 is also limited to 16 GB.
reply
thehamkercat 2 minutes ago
The RX 7900 XTX has 24 GB.
reply
xbmcuser 5 hours ago
I am betting the pendulum swings quickly to the other side, to excess capacity, as all of Altman's construction promises fall through and financiers wake up to the fact that they can't build the infrastructure as fast, nor make any profit on the infrastructure that does get built.
reply
Cthulhu_ 5 hours ago
To add a more local hurdle as well: the Dutch power grid is at capacity, and its managing company is now telling companies that planned to build a datacenter that they can't be connected to the grid until 2030, even though those companies already paid for, and got guarantees about, that connection.

That is, memory capacity is reserved for datacenters yet to be built, but this will do weird things if said datacenter construction is postponed or cancelled altogether.

reply
consp 5 hours ago
That guarantee is not as much of a guarantee as stated in the media. You get a guarantee that it will be planned at a certain time (as in, looked at), not that it will be built. The cost of doing business is taking risks and mitigating them. There is a reason the nuclear plant in Borssele was built: an aluminium smelter. Maybe you should arrange something similar as a datacenter (no politician will fall on a sword for that, but you can try). The (original) power draw is about the same, 80-100 MW.
reply
naveen99 5 hours ago
But wouldn’t you rather HBM prices come down first? Memory makers will be fine. There is practically infinite demand. Unless you get China-style rationing of compute per person, worldwide.

The real issue is everyone wanting to upgrade to hbm, ddr5, and nvme5 at the same time.

reply
kubb 6 hours ago
I would expect that OpenAI gets as much money as they ask for for the next 10 years.

There’s virtually infinite capital: if needed, more can be reallocated from the federal government (funded with debt), from public companies (funded with people’s retirement funds), from people’s pockets via wealth redistribution upwards, from offshore investment.

They will be allowed to strangle any part of the supply chain they want.

reply
torginus 5 hours ago
China already has a well developed DRAM industry, as DRAM is somewhat easier than logic, and can tolerate a much higher defect rate. The industry will figure this out.

Another point is I often see the money argument - like country X has more money, so they can afford to do more and better R&D, make more stuff.

This stuff comes out of factories that need to be built, with machinery that has to be procured and engineers who have to be trained and hired.

reply
IshKebab 5 hours ago
Maybe if they had no competitors...
reply
saidnooneever 5 hours ago
love that there's virtually infinite capital there. meanwhile in the rest of the world there is virtually no food.
reply
SlinkyOnStairs 5 hours ago
I think you're massively overestimating how much money is really accessible here. The parent comment is right that all of the easily available VC and private-equity money is basically used up. OpenAI was struggling to sell $600M of private equity, and the big multi-billion-dollar investment packages had lots of conditions and non-cash components in them.

> more can be reallocated from the federal government (funded with debt)

While this is the most reliable funding, it's still not very accessible. OpenAI is a money pit, and its demands are growing quickly. The US government has already started a bunch of very expensive spending. If OpenAI were to require yearly bundles of its recent "$120B" deal, that's 6% of the US discretionary budget, or 12.5% of the non-military discretionary budget (and the military is going to ask for a lot more money this year). Even the idea of just issuing more debt is dubious, because they're going to want to do that to pay for the wars that are rapidly spiralling out of control.
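(For concreteness, those percentages back out to roughly a $2.0T total discretionary budget and ~$960B of non-military discretionary spending; both are my assumed figures, not numbers from the article.)

```python
# Backing out the budget arithmetic above. The ~$2.0T total and ~$0.96T
# non-military discretionary figures are assumptions for illustration.
deal = 120e9                       # the "$120B" deal, taken as a yearly bundle
total_discretionary = 2.0e12
non_military = 0.96e12
print(deal / total_discretionary)  # ~0.06, i.e. 6%
print(deal / non_military)         # 0.125, i.e. 12.5%
```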

None of this is to say that the US government can't or wouldn't pay for it, but it's non-trivial, and it's unclear how much Altman can threaten the US government with "give me a trillion dollars or the economy explodes" without consequences.

Further deficit spending isn't without its risks for the US government either. Interest rates are already creeping up, and a careless explosion of the deficit may well trigger a debt crisis.

> from public companies (funded with people’s retirement funds)

This would come at great cost. OpenAI would need to open up about its financial performance to go public itself. With its CFO put on what is effectively administrative leave for pushing back against going public, we can assume the financials are so catastrophic that an IPO might bomb and take the company down with it. Nobody's going to invest privately in a company that has no public takers.

Getting money through other companies is also running into limits. Big Tech has deep pockets but they've already started slowing down, switching to debt to finance AI investment, and similarly are increasingly pressured by their own shareholders to show results.

> from people’s pockets via wealth redistribution upwards

The practical mechanism of this is "AI companies raise their prices". That might also just crash the bubble if demand evaporates. For all the hype, the productivity benefit hasn't really shown up in economy-wide aggregates. The moment AI becomes "expensive", all the casual users will drop it, and the non-casual users are likely to follow. The idea of "AI tokens" as a job perk is cute, but exceedingly few people are going to accept a lower salary in order to use AI at their job.

There's simply not much money to take out of people's pockets these days, with how high cost of living has gotten.

> from offshore investment.

This is a pretty good source of money. The wealthy Arabian oil states have very deep slush funds, extensively investing in AI to get ties to US businesses and in the hope of diversifying their resource economies.

...

...

"Was". Was a good source of money.

reply
kubb 4 hours ago
I'm genuinely curious to find out how many billions they get every year from now.
reply
mschuster91 6 hours ago
> Soo ... how sure are we that the memory makers themselves are not going to be the ones holding the bag?

We aren't. The remaining memory manufacturers fear getting caught in a "pork cycle" yet again - that is why only the three large players are left at all.

reply
moffkalast 6 hours ago
Memory makers did get themselves into this situation by selling all their wafers for empty promises and alienating everyone but OpenAI, tbh. I do hope they end up holding the bag once again, because after covid and the price-fixing cartel thing, they never seem to learn their lesson about having the tiniest amount of integrity.
reply
gck1 5 hours ago
While we're giving away bags, I'd like HDD manufacturers to get some too.
reply
Rekindle8090 36 minutes ago
This will result in demand destruction, which will starve the enterprise market, which will starve the hyperscalers. There's no situation where people being unable to afford hardware for 4 years results in the bubble not popping.
reply
hsbauauvhabzb 5 hours ago
The people who fucked over consumers are left holding the bag that they sold us out over?

Oh no!

reply
VladStanimir 5 hours ago
They won't be; prices are high because they are refusing to build capacity for demand that may have evaporated by the time they are done. They are holding back and building only enough so that when the bubble pops, they will be fine.
reply
DoctorOetker 3 hours ago
So the ML hate is weaponized in the form of memory demand collapse FUD, and the public at large has to pay through their nose for it... thanks party poopers!
reply
fouc 6 hours ago
I'm a bit surprised the article makes no mention of Google's TurboQuant[0] introduced 26 days prior.

Given that TurboQuant results in a 6x reduction in memory usage for KV caches and up to 8x boost in speed, this optimization is already showing up in llama.cpp, enabling significantly bigger contexts without having to run a smaller model to fit it all in memory.

Some people thought it might significantly improve the RAM situation, though I remain a bit skeptical - the demand is probably still larger than the reduction TurboQuant brings.

[0] https://news.ycombinator.com/item?id=47513475

reply
gajjanag 2 hours ago
TurboQuant is known across the industry to not be state of the art. There are superior schemes for KV quantization at every bitrate, e.g. SpectralQuant: https://github.com/Dynamis-Labs/spectralquant among many, many other papers.

> Given that TurboQuant results in a 6x reduction in memory usage for KV caches

It all depends on the baseline. The "6x" comes from a comparison against a BF16 KV cache, not against a state-of-the-art 8- or 4-bit KV cache scheme.
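Concretely (toy arithmetic only, not numbers from the paper):

```python
# The headline ratio depends entirely on the baseline you quantize from.
def compression(baseline_bits, quant_bits):
    return baseline_bits / quant_bits

print(compression(16, 4))  # 4.0: a 4-bit cache measured against a BF16 baseline
print(compression(8, 4))   # 2.0: the exact same cache against an 8-bit baseline
```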

reply
lhl 5 hours ago
BTW, a number of corrections. The TurboQuant paper was submitted to Arxiv back in April 2025: https://arxiv.org/abs/2504.19874

Current "TurboQuant" implementations get about 3.8X-4.9X compression (with the higher end taking significant hits on GSM8K performance) and run at about 80-100% of baseline speed (no improvement; if anything, a regression): https://github.com/vllm-project/vllm/pull/38479

For those not paying attention, it's probably worth sending this and ongoing discussion for vLLM https://github.com/vllm-project/vllm/issues/38171 and llama.cpp through your summarizer of choice - TurboQuant is fine, but not a magic bullet. Personally, I've been experimenting with DMS and I think it has a lot more promise and can be stacked with various quantization schemes.

The biggest savings in KV cache, though, come from improved model architecture. Gemma 4's SWA/global hybrid saves up to 10X on KV cache, MLA/DSA does as well (the latter also helps with global-attention compute), and using linear or SSM layers saves even more.
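To make the up-to-10X figure concrete, a back-of-envelope sketch (the layer split, window size, and model shape below are invented for illustration, not any real model's config):

```python
# Back-of-envelope KV-cache sizing using the standard formula:
# 2 (K and V) * layers * kv_heads * head_dim * seq_len * bytes_per_value.
def kv_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_val=2):
    # bytes_per_val=2 corresponds to a BF16 cache
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_val

# Hybrid: a few global-attention layers cache the full sequence; the rest
# use sliding-window attention (SWA) and cache at most `window` tokens.
def hybrid_kv_bytes(global_layers, swa_layers, window,
                    n_kv_heads, head_dim, seq_len, bytes_per_val=2):
    swa_len = min(seq_len, window)
    return (kv_bytes(global_layers, n_kv_heads, head_dim, seq_len, bytes_per_val)
            + kv_bytes(swa_layers, n_kv_heads, head_dim, swa_len, bytes_per_val))

full = kv_bytes(32, 8, 128, 128_000)                    # all-global baseline
hybrid = hybrid_kv_bytes(4, 28, 4096, 8, 128, 128_000)  # 4 global + 28 SWA layers
print(full / 2**30, hybrid / 2**30, full / hybrid)      # roughly a 6.5x saving here
```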

None of these reduce memory demand, though (Jevons paradox, etc.). Looking at my coding tools, I'm using about 10-15B cached tokens/mo currently (up from 5-8B a couple months ago), and while I think I'm probably above average on that curve, I don't consider myself doing anything especially crazy. This year, between mainstream developers and more and more agents, I don't think there's really any limit to the number of tokens people will want to consume.

reply
fy20 5 hours ago
The work going into local models seems to be targeting lower RAM/VRAM, which will definitely help.

For example, Gemma 4 32B, which you can run on an off-the-shelf laptop, performs at around the same level as, or even above, the SOTA models from 2 years ago (e.g. gpt-4o). Probably by the time memory prices come down, we will have something as smart as Opus 4.7 that can be run locally.

Bigger models of course have more embedded knowledge, but just knowing that they should make a tool call to do a web search can bypass a lot of that.

reply
tuetuopay 5 hours ago
The net effect won’t be a memory use reduction to achieve the same thing. We’ll do more with the same amount of memory. Companies will increase the context windows of their offerings and people will use it.

That is the sad reality of the future of memory.

reply
ehnto 5 hours ago
I am not convinced that more context will be useful; practical use of current models at a 1M-token context window shows they get less effective as the window grows. Given that model progress is slowing as well, perhaps we end up reaching a balance of context size and competency sooner than expected.
reply
tuetuopay 3 hours ago
Stuff in more code. Stuff in more system prompt. Stuff in raw utf8 characters instead of tokens to fix strawberries. Stuff in WAY more reasoning steps.

Given the current tech, I also doubt there will be practical uses, and I hope we'll see the opposite of what I wrote. But given the current industry, I fully trust them to somehow fill their hardware.

Market history shows us that when the cost of something goes down, we do more with the same amount, not the same thing with less. But I deeply hope to be wrong here and that the memory market will relax.

reply
Bombthecat 6 hours ago
You still need to hold the model in memory. If you have, for example, 16 GB of RAM, the gains aren't that big.
reply
anon373839 6 hours ago
That's not what consumes the most memory at scale. The KV caches are per-user.
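A toy sketch of why the per-user caches dominate at scale (all numbers made up for illustration):

```python
# Toy serving-memory model: model weights are loaded once and shared, while
# each concurrent user carries their own KV cache for their own context.
weights_gb = 16       # shared across all users
kv_per_user_gb = 2    # one user's context cache
for users in (1, 8, 64):
    total = weights_gb + users * kv_per_user_gb
    print(users, total)  # the KV-cache term dominates as concurrency grows
```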
reply
WesolyKubeczek 6 hours ago
You can still use as much memory, but fit more things into it, so I don’t think the current market hogs will let go easily.
reply
muyuu 4 hours ago
that will only increase demand for RAM, as models will become usable in scenarios that weren't feasible before, and the ceiling on model and context size is not even visible at this point

I hate to mention the Jevons paradox, as it has become a cliché by now, but this is a textbook case of it

reply
WingEdge777 6 hours ago
[dead]
reply
chintech2 6 hours ago
I'm a bit surprised the article makes no mention of China's new memory companies.

[0] https://techwireasia.com/2026/04/chinese-memory-chips-ymtc-c...

reply
tim-projects 6 hours ago
The era of optimisation is finally here. I'm excited.
reply
jeff_vader 6 hours ago
Wait until China invades Taiwan.. (ok, it's not too likely, but what if?)
reply
Renaud 6 hours ago
I think RAM shortages would be the least of our problems…

Assuming China takes TSMC in one piece (unlikely, given the potential for internal sabotage even in the best-case scenario), it would still probably take years before it produces another high-end GPU or CPU.

We would probably be stuck with the existing inventory of equipment for a long time…

reply
necovek 6 hours ago
I am surprised we treat TSMC like a natural resource: isn't it really a combination of know-how and a build-out according to that know-how? If the smarts leave the country, perhaps the capability moves with them.

The risk with China taking over Taiwan is that they mostly expedite their own production research by a couple of years.

reply
ndepoel 6 hours ago
It kinda does resemble a natural resource, though. The machines and technology in use at TSMC are so insanely complex that there isn't a single person on earth who knows everything about how they work. TSMC functions only because all of the pieces of the puzzle are together in the right place and arranged in just the right way. It's a very fragile balance that keeps it all running, and a major disruption could mean we get thrown back by a decade in chip-making technology.
reply
danaris 5 hours ago
What you say is absolutely true, and is a serious problem—but the way our system operates does not allow us to correct for it.

Anyone trying to spin up a competitor to TSMC would have to first overcome a significant financial hurdle: the capital investment to build all the industrial equipment needed for fabrication.

Then they'd have to convince institutions to choose them over TSMC when they're unproven, and likely objectively worse than TSMC, given that they would not have its decades of experience and process optimization.

This would be mitigated somewhat if our institutions had common-sense rules in place requiring multiple vendors for every part of their supply chain—note, not just "multiple bids, leading to picking a single vendor" but "multiple vendors actively supplying them at all times". But our system prioritizes efficiency over resiliency.

A wealthy nation-state with a sufficiently motivated voter base could certainly build up a meaningful competitor to TSMC over the course of, say, a decade or two (or three...). But it would require sustained investment at all levels, and not just investment in the simple financial sense; it requires people investing their time in education and research, dedicating their lives to making the best chips in the world. And the only way that works is to defy our system: invest in plants that won't be finished for years, then pay for chips you know are inferior in quality, because they're our chips, and buying them while they're lower quality is the only way to make them, eventually, the best chips in the world.

reply
mcwhy 5 hours ago
the scientists will switch sides with minimal issues, like they did after WWII
reply
rzmmm 6 hours ago
It seems that RAM manufacturers are still reluctant to increase production. Do they know something that investors don't about long-term RAM demand?
reply
danaris 6 hours ago
The same thing everyone who's paying attention to the real world (and not the financial fantasy world) does: that OpenAI's purchase commitments are wildly unrealistic and unsustainable.
reply
hsbauauvhabzb 5 hours ago
What’s the losing scenario for them? They’re basically a cartel, and you need RAM regardless. If they make less, it’s still a cost-to-demand tradeoff, just not the most optimal one for them. They’ve done the math and figure this is the best risk/reward for them. Your goodwill or opinion doesn’t matter to them, because you need them more than they need you.
reply
consp 5 hours ago
> They’re basically a cartel,

The lawsuits in the past prove that statement is not just "basically" true but literally true.

reply
sph 4 hours ago
I fear the author and most commenters are not aware of the law of supply and demand. If there is demand for consumer RAM, there will be supply of consumer RAM. It just takes time and risk assessment to scale up operations.

We have a RAM shortage now; we will have very cheap RAM tomorrow. It’s not like production is bottlenecked by raw materials. Chip companies just need to assess whether the demand from AI companies will last, making it worth scaling up, or whether they should wait it out instead of oversupplying and cutting into their profits.

reply
rt56a 4 hours ago
We're talking about advanced semiconductor manufacturing. It takes years and hundreds of millions to billions of dollars to scale up operations. That's something you don't do unless you know there's demand to sustain it in the future.
reply
eulgro 55 minutes ago
The law of supply and demand works in a perfectly competitive market.

There are only three RAM suppliers...

reply
lousken 5 hours ago
If only we had not allowed oligopolies to exist. Meanwhile, the EU is not in the race at all, and the US has very few fabs.
reply
Hamuko 5 hours ago
I'm personally hoping that one of the AI or datacenter companies suddenly becomes unable to pay its bills and deflates the entire industry. Probably the only hope of things getting better before the 2030s.
reply
tuetuopay 5 hours ago
That’s likely to happen if all the talks about OpenAI pulling out of their wafer deals are true.
reply
tomaytotomato 6 hours ago
I just checked my gaming PC that I built a few years ago with 64 GB of DDR5 RAM; it's actually gone up in value, which is generally unheard of.

Think I will scrap my PC and sell its parts.

I wonder if there are any niche companies out there building decent rigs with DDR3 and 5th/6th-generation Intel CPUs; the parts are cheap, and it might be a business opportunity?

reply
theandrewbailey 4 hours ago
I work at an e-waste recycling company. I have several dozen trays of RAM in my inventory, ~90% of it DDR3. DDR3 was selling as of a month ago, but I haven't tried to sell any RAM since. I'm looking forward to doing a huge sale this week.
reply
Gud 6 hours ago
Thank god they shut down 3D XPoint.
reply
WesolyKubeczek 6 hours ago
I fear that the real reason we have a shortage (I mean, the real reason for the demand) is AI companies scooping up what they can so that their competitors, whether incumbent or emerging, can't get to it.
reply
tuetuopay 5 hours ago
This was indeed one of the theories behind OpenAI's wafer buyout. A pretty efficient way to make everyone panic and cut competitors off from new hardware.
reply
ochre-ogre 6 hours ago
can't read the article due to a paywall.
reply