It isn't an "AI" CPU. There is nothing AI about it. There is nothing about it that makes it more AI than Graviton, Epyc, Xeon, etc.
This was already revealed in the Qualcomm vs Arm lawsuit a few years ago. Qualcomm accused Arm of planning to sell their CPUs directly instead of just licensing. Arm's CEO at the time denied it. Qualcomm ended up being right.
I wrote a post here on why Arm is doing this and why now: https://news.ycombinator.com/item?id=47032932
The Dell marketing machine in particular is bludgeoning everyone that will listen about Dell AI PCs. The implication that folks will miss the boat on AI by not having a piddly NPU in their laptop is silly.
Unfortunately for them, I think hardware vendors will see past the hype. They'll only buy the platform if it is very competitively priced (i.e., much cheaper) since fortune favours long-lived platforms and organizations like Apple and Qualcomm.
For the first time in our more than 35-year history, Arm is delivering its own silicon products
Also, it takes a willful ignorance of history for ARM to claim this is the first time they've manufactured hardware. I mean, maaaaybe, teeeeechnically that's true, but ARM was the Acorn RISC Machine, and Acorn was in the hardware business...at least as much as Apple was for the first iPhone.
I don’t think ARM Ltd have ever done a deal to deliver finished chips to a customer for production use.
They’ve made test silicon and dev. boards.
They designed arguably the first ever SoC (for Acorn) in the form of the ARM250 but Acorn bought the chips from VLSI not ARM.
Not aware of an exception to this rule until now.
That's a huge cost compared to the average RTL jockey.
In case you were thinking about some other abbreviation...
I expected better from the people who brought us the ARM architecture, with A, R and M profiles.
I don’t know if it was intentional or they were so far out over their skis that they got their bathing suit caught, but it’s impressive either way.
> No. I would not use it as the product name. “AGI CPU” will be read as artificial general intelligence, not “agentic AI infrastructure,” so it invites confusion and sounds hypey.
Too bad these executives seemingly don't have access to ChatGPT.
Oh god! Mistral tells me it's highly polarizing, will create buzz, and is risky, but either way people will know that ARM is doing CPUs again now (maybe I put in too much context).
ARMANI for short /s
Fraud is just the default lifestyle of marketers.
My realtor's last name is House
When I lived in Austin, it seemed like a third of boys born were being named Austin. I presume many of them will end up living there as adults, but not because of this particular bias; being raised there and having family there seems a more likely driver.
There are several cities in the US that share my last name. I don't live near any of them.
> Study 6 extended this finding to birthday number preferences.
D'oh!
https://publishing.rcseng.ac.uk/doi/10.1308/147363515X141345...
Yesterday everything was Agentic.
Everything was AI last week.
Waiting for AGI Agentic AI Crypto toilet paper to hit the supermarket shelves, next to the superseded object-oriented UML Rational Rose tuna.
Where does Agentic come into this? Arm's explanation is that future agentic workloads will be both CPU and GPU bound, thus the need for significant CPU efficiency.
https://en.wikipedia.org/wiki/How_many_angels_can_dance_on_t...
VC without a degree who has no grasp of hardware engineering failed up when all he had to do was noodle numbers in an Excel sheet.
He is so far behind the hardware scene he thinks it's sitting still and RAM requirements will be a nice linear path to AGI. Not if new chips optimized for model streaming crater RAM needs.
Hilarious how last decade's software geniuses are being revealed as incompetent finance engineers whose success was all due to ZIRP offering endless runway.
Unfortunately failing upwards is still somehow common, probably because the skill of parting fools from their money is still valuable.
Now the talent is going to other places for a variety of reasons, not all due to Sam (one of which is little room for options to grow). However it’s hard to believe his tanking reputation is not badly hurting the company. Other than Jakub and Greg, I believe there are not many top tier people left, those in top positions are there because they are yes-men to Sam.
Apple and Google control their own designs.
Sama is 100% an outsider, merely a customer. The chip insiders are onto his effort to pivot out of meme-stock hyping, into owning a chunk of their fiefdom. They laughed off his claims a couple years ago as insane VC gibberish (third hand paraphrase from social network in chip and hardware land).
No way he can pivot and print whatever. Relative to hardware industry he is one of those programmers who can say just enough to get an interview but whiffs the code challenge.
He has no idea where the bleeding edge is so he will just release dated designs. Chip IP is a moat.
Plus a bunch of RAM companies would be left hanging; no orders, no wafers. Sama risks being Jimmy Hoffa'd for imploding the asset values of other billionaires.
> Arm is actively collaborating with leading Linux distributions from Canonical, Red Hat, and SUSE to ensure certified support for the production systems.
Taken from
https://developer.arm.com/community/arm-community-blogs/b/se...
That's...not much right? Maybe it's a lot times N-cores? But I really hope each individual core isn't limited to that.
Edit: 17 minutes to sum RAM?
Seeing "Arm AGI" spelled out on a page with an "arm" logo looks slightly cheesy.
But maybe it's actually a good fit for the societal revolution driven by AGI, comparable to the one driven by the DOT.com RevoLut.Ion. (dot com).
Anyways, it sounds like an A.R.M. branded version of the Apple Silicon revolution?
But maybe that's just my shallow categorization.
The TDP to memory bandwidth & capacity ratio for these blades is in a class of its own, yes?
Edit: The new CPU will be built with the soon-to-be-former leading edge process of 3nm lithography.
I looked around a bit, and the going rate appears to be about $10,000 per 64 cores, or around $150 per core. Here is an Intel Xeon Platinum 8592+ 64 Core Processor with 61 billion transistors:
https://www.itcreations.com/product/144410
So that's roughly 6 million transistors per dollar, or about $160 per billion transistors.
It looks like Arm's 136 core Neoverse V3 has between 150 and 200 billion transistors, so at that rate each chip would land somewhere around $25,000-$33,000. Each blade has 2 of those chips, so compute alone would be in the $50,000-$65,000 range. It doesn't say how much memory the blades come with, but that's a secondary concern.
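Back-of-envelope version of that math in Python, using the list price and transistor counts quoted above (the Neoverse V3 transistor count is my own guess, so treat the output as order-of-magnitude only):

    # Cost-per-transistor estimate from the numbers above.
    xeon_price = 10_000          # USD, Xeon Platinum 8592+ going rate
    xeon_transistors = 61e9      # ~61 billion transistors

    transistors_per_dollar = xeon_transistors / xeon_price
    print(f"{transistors_per_dollar / 1e6:.1f} million transistors per dollar")
    print(f"~${1e9 / transistors_per_dollar:.0f} per billion transistors")

    # Same rate applied to a guessed 150-200B transistor Arm part.
    for t in (150e9, 200e9):
        print(f"{t / 1e9:.0f}B transistors -> ~${t / transistors_per_dollar:,.0f}")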
Note that this is way too many cores for 1 bus, since by Amdahl's law, more than about 4-8 cores per bus typically results in the remaining cores getting wasted. Real-world performance will be bandwidth-limited, so I would expect a blade to perform about the same as a 16-64 core computer. But that depends on mesh topology, so maybe I'm wrong (AI thinks I might be):
Intel Xeon Scalable: Switched from a Ring to a Mesh Architecture starting with Skylake-SP to handle higher core counts.
Arm Neoverse V3 / AGI: Uses the Arm CMN-700 (Coherent Mesh Network), which is a high-bandwidth 2D mesh designed specifically to link over 100 cores and multiple memory controllers.
I find all of this to be somewhat exhausting. We're long overdue for modular transputers. I'm envisioning small boards with 4-16 cores between 1-4 GHz and 1-16 GB of memory approaching $100 or less with economies of scale. They would be stackable horizontally and vertically, to easily create clusters with as many cores as one desires. The cluster could appear to the user as an array of separate computers, a single multicore computer running in a unified address space, or various custom configurations. Then libraries could provide APIs to run existing 3D, AI, tensor and similar SIMD code, since it's trivial to run SIMD on MIMD but very challenging to run MIMD on SIMD. This is similar to how we often see Lisp runtimes written in C/C++, but never C/C++ runtimes written in Lisp.

It would have been unthinkable to design such a thing even a year ago, but with the arrival of AI, that seems straightforward, even pedestrian. If this design ever manifests, I do wonder how hard it would be to get into a fab. It's a chicken-and-egg problem, because people can't imagine a world that isn't compute-bound, just like they couldn't imagine a world after the arrival of AI.
Edit: https://news.ycombinator.com/item?id=47506641 has Arm AGI specs. Looks like it has DDR5-8800 (12x DDR5 channels) so that's just under 12 cores per bus, which actually aligns well with Amdahl's law. Maybe Arm is building the transputer I always wanted. I just wish prices were an order of magnitude lower so that we could actually play around with this stuff.
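To make the Amdahl hand-waving above a bit more concrete, here's a toy sketch; the parallel fraction is standing in for contention on a shared memory channel, and the 90% figure is purely illustrative:

    # Amdahl's law: speedup(n) = 1 / ((1 - p) + p / n),
    # where p is the fraction of work that parallelizes cleanly.
    def amdahl_speedup(p: float, n: int) -> float:
        return 1.0 / ((1.0 - p) + p / n)

    # Treat 10% of the work as serialized on a shared memory channel
    # (illustrative only; real contention depends on the workload and mesh).
    p = 0.90
    for n in (4, 8, 12, 32, 136):
        print(f"{n:4d} cores -> {amdahl_speedup(p, n):5.2f}x speedup")

With those numbers the speedup flattens out fast past roughly 8-12 cores per shared channel, which is the intuition behind the "remaining cores get wasted" claim above.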
> Arm has additionally partnered with Supermicro on a liquid-cooled 200kW design capable of housing 336 Arm AGI CPUs for over 45,000 cores.
Also just bad timing on trying to brag about a partnership with Supermicro, after a founder was just indicted on charges of smuggling Nvidia GPUs. Just bizarre to mention them at all.
This is why Meta acquired a chip startup [0] months ago.
[0] https://www.reuters.com/business/meta-buy-chip-startup-rivos...
One can dream.
I see the NIC as a form of future proofing, but we'll see.
My Ryzen 9 mini-PC from 2 years ago outperforms this thing in raw CPU, though.
I haven't ever ordered an ARM SoC but I also wouldn't be surprised if there were significant parts that they left up to integrators before - PLLs, pads, SRAM etc.
Am I right or am I misunderstanding?
The latest Intel server CPU, Clearwater Forest, uses Darkmont cores that have approximately the same performance, cost and power consumption as Neoverse V3, but Intel provides 288 cores per socket and 576 cores per board.
Even supposing that Intel Xeons would be used in relatively big 2U servers, that still provides at least 50% more cores per rack than these new Arm AGI CPUs.
The claim of Arm that they provide better performance per rack is false. They must have compared their new CPUs with some antique Intel Granite Rapids Xeon CPUs, instead of comparing with state-of-the-art Intel and AMD CPUs, which offer much more performance per rack than the new Arm AGI.
Like, c’mon, this is ridiculous.
So we will see AI Toilet Paper launching in the next months.
> built on the Arm Neoverse platform
What the heck is "Arm Neoverse"? No explanation given, link leads to website in Chinese. Using Firefox translating tool doesn't help much:
> Arm Neoverse delivers the best performance from the cloud to the edge
What? This is just a pile of buzzwords, it doesn't mean anything.
The article doesn't seem to contain any information on how much it costs or any performance benchmarks to compare it with other CPUs. It's all just marketing slop, basically.
Neoverse V3 is also used in AWS Graviton5 and in several NVIDIA products.
AWS Graviton5 uses the same cores, but it has 192 cores per socket.
So Graviton5 has more cores per socket, but I think that it does not support dual socket boards.
This Arm AGI supports dual socket boards, so it provides 272 cores per board, more than Graviton5 motherboards.
However, this is puny in comparison with Intel Clearwater Forest, which provides 576 cores per board, and the Intel Darkmont cores are almost exactly equivalent for all characteristics with Arm Neoverse V3.
Of course people don't realize that, and people will buy ARM stock thinking they've cracked AGI. The people running Arm absolutely know this, so this name is what we in the industry call a "lie".
[1] https://en.wikipedia.org/wiki/Long_Blockchain_Corp.
I don't know why so many people are willing to descend into flippant, lazy conspiracy instead of a 7 second Google search before making a claim?
AG1 was started in 2010 by a police officer from New Zealand and AG stands for Athletic Greens.
There is a fair amount of controversy around the company's claims, so I suppose that is one symmetry between AG1 and AGI.
I think the name change also came before the AI hype.
I believe Arm probably has cracked this very low bar.
Old spice for me, thanks!
It seems marketing /depends/ on conflating terms and misleading consumers. Shakespeare might have gotten it wrong with his quip about lawyers.
https://www.pbs.org/newshour/economy/att-to-drop-misleading-...
The problem is how marketing interacted with it.
WiFi operates in the 2.4, 5, 6GHz bands, but those frequency bands are not used to differentiate WiFi standards because you can mix and match WiFi 6/7 on all three bands.
There are also more WiFi bands below 2.4 and above 6GHz, but they're not common worldwide.
https://youtube.com/watch?v=GaD8y-CGhMw
Thanks for the trip down memory lane.
I don't understand why this label is still a thing in the current discourse, and I hope such moves will finally help people and the industry move on.
If you invest money so mindlessly that you don’t even check what you buy, then no legislation in the world will manage to protect you from your own mind
(Disclosure: I am a casual investor in ARM.)
source: 100% personal certainty
Doesn't seem like a very credible assertion. Picking stocks in this way would remove you from the market pretty quickly.
This seems more like calling your spaceship company, I dunno, “Interplanetary Passengers” or something.
In this case it's a word that means the thing we're all developing towards, apparently, but that no one actually knows how to get or even how to measure whether or not we've already gotten it, and no one really knows what will happen when it's achieved, if it hasn't already been.
It's a bit like an even wackier more-corporate version of The Quest for the Holy Grail.
And the honest one true test for "is it a buzzword?" : Did a corporate group brand a flagship with it?
"RISC architecture is going to change everything!"
Does an iced tea company changing their name to Long Blockchain make any sense? No, not really, it's pretty stupid actually, but it managed to bump the stock by apparently 380%.
The stock market can be pretty dumb sometimes. Let's not forget the weird GME bubble.
I do think I know more than the average person about computers. Probably most people on this forum can say that. People who know about computers are more likely to be able to smell bullshit with a name like AGI. It’s not that I am smarter, I wouldn’t be able to call bullshit with anything involving chemistry or physics.
I think, like Long Blockchain, ARM is abusing the world's collective computer illiteracy and trying to harvest investor money in the process. Clearly this has worked once, as was the case with Long Blockchain.
> People invest in sentiment, in momentum, in all kinds of second order effects.
Yep! And this is why it is wrong for corporations to put out incorrect or misleading statements, as it creates a sentiment that is not realistic. This can then propagate in the form of the stock price not being realistic.
It's different for them to toss out a bet on the basis of 'other people will think this is AGI, I should buy it in anticipation of that' or even 'other people will think other people will think this is AGI, I should buy in anticipation of that'.
People playing the Keynesian beauty contest are not, to me, naive participants in the market getting scammed by a company adding 'AGI' to a product.
The idea that the first-order person exists in any great number is just so insulting to the average person's intelligence that it's hard not to read it in a paternalistic tone.
The CUBA ticker shot up in value after Obama lifted sanctions on Cuba, despite the fact that the company doesn't invest in any Cuban companies. People will invest in things just based on a name. https://acrinv.com/silly-true-market-anomaly/
The average person generally doesn't know a lot about anything other than the specific niche that they do for a living. This isn't a dig at their intelligence, or at least I'm not excluding myself. I know a fair bit about computer science, but only a very lay person's understanding of basically everything else.
For example, I know nothing about electric or hydrogen powered cars, so I wasn't able to call bullshit with the Nikola scam a few years ago. I fortunately didn't buy any Nikola stock, but that wasn't because of any insight on my end, just didn't buy it. I am very glad that people who do know about this kind of stuff call it out when companies lie to potential investors.
Right but it doesn't follow from this that those people were tricked in some way. They can be second- or third-order bettors. Even the most sophisticated quant shop in the world, the literal sharpest players in the market, can bet 'just based on a name' if it fits into some theory about market dynamics or whatever.
> The average person generally doesn't know a lot about anything other than the specific niche that they do for a living.
But so what, it doesn't follow that because they don't know about X they are willing to trivially gamble significant amounts of money on X without even the most basic of research. "I don't know much about this so won't place a bet I'm not willing to lose" is not something that requires any great intelligence.
While AArch64 represents the technical revolution they needed, their business compass has been gone ever since he stepped down. This grimy stuff, and as others noted competing with your own customers, were no-gos in the earlier era.
I'm not saying anything is going to happen, ARM holdings has a lot more money and lawyers than Long Blockchain did, but I'm just saying that it's not weird to think that a deceptive name could be considered false advertising.
This isn't just a crass joke or a pun, it's outright deception. I'm not a lawyer, maybe it wouldn't hold up in court, but you cannot convince me that they aren't doing this on purpose.
Why isn't there a minority shareholder lawsuit on the news because someone bought MSFT not realizing that Copilot isn't actually certified to fly an airliner? A certain type of person would likely just buy MSFT on massive leverage and then, if the bet fails to work out, sue while pretending they did not understand.
People have been hearing for the last three years about how a specific acronym, "AGI", is the final frontier of artificial intelligence and how it's going to change the entire economy around it. They've been hearing about this quasi-theoretical, very specific thing, and a lot of them don't even know what the "G" stands for.
People haven't been hearing for years about a mythical "copilot", and as such I think people are much more likely to think it's not anything more than a cute nickname.
Are you suggesting that this is just a coincidence? The acronym AGI doesn't even make sense for Agentic AI Infrastructure, which should be AAII; they're clearly calling it AGI to mislead people. I refuse to think that the people running Arm are so stupid that they didn't even Google the acronym before releasing the chip.
You think it's a "comical misinterpretation", but I don't think it is. When I saw the article, I thought "shit; did they manage to crack AGI?", and I clicked the article and was disappointed. I suspect a lot of people aren't even going to read the press release.
It's those out of the industry who call them lies.
No. For it to be securities fraud, Arm would need to make a materially false statement of fact that misleads investors. Naming the CPU in this way doesn't clear the bar because:
a) the name is clearly a product brand (similar to how macOS Lion, or Microsoft Windows, or Ford Mustang, or Yves Saint Laurent Black Opium don't mean literally what they say)
b) Arm explicitly defines it as silicon "designed to power the next generation of AI infrastructure", with the technical specs fully disclosed
c) sophisticated investors, the relevant standard for securities fraud, can read a spec sheet
d) Arm's EVP said "We think that the CPU is going to be fundamental to ultimately achieving AGI", framing it as a contribution towards AGI, not AGI itself
> No. For it to be securities fraud, Arm would need to make a materially false statement of fact that misleads investors. Naming the CPU in this way doesn't clear the bar because:...
The EVP statement doesn't say "our CPU does AGI", sure, but is it unfair to suggest it makes some form of AGI claim, which isn't there from the naming alone?
It's no longer your point A) "clearly product brand" if the established usage of the term "AGI" comes out of the EVP's mouth.
And yes, their (albeit very vague) claim is clearly wrong IMHO.
And no, it's not "a lie", because only an utter idiot would consider a product name an actual fact. It's a name. The Hopper GPUs also didn't ship with a lifesize cutout of Grace Hopper.
People have been seeing every big AI company talk about how AGI is the holy grail of AI, and how they're all trying to reach it. Arm naming a chip AGI is clearly meant to make casual observers think they cracked AGI.
The Hopper GPU isn't the same, because Nvidia isn't actively trying to make people think that it includes a lifesize cutout of Grace Hopper. Not a dig on her, but most people don't know who Grace Hopper is, people haven't been hearing on the news for the last several years about how having a Grace Hopper is going to make every job irrelevant.
We have to keep defining AGI upwards or nitpicking it to show that we haven't achieved it.
I would argue that LLMs are actually smarter than the majority of humans right now. LLMs do not have quite the agency that humans have, but their intelligence is pretty decent.
We don't have clear ASI yet, but we definitely are in an AGI era.
I think we are missing an ego/motivations in the AGI, and them having self-sufficiency independent of us, but that is just a bit of engineering that would actually make them more dangerous; it isn't really a significant scientific hurdle.
ETA:
You updated your comment, which is fine but I wanted to reply to your points.
> I would argue that LLMs are actually smarter than the majority of humans right now. LLMs do not have quite the agency that humans have, but their intelligence is pretty decent.
I would actually argue that they are decidedly not smarter than even dumb humans right now. They're useful, but they are glorified text predictors. Yes, they have more individual facts memorized than the average person, but that's not the same thing; Wikipedia, even before LLMs, also had many more facts than the average person, but you wouldn't say that Wikipedia is "smarter" than a human because that doesn't make sense.
Intelligence isn't just about memorizing facts, it's about reasoning. The recent Esolang benchmarks indicate that these LLMs are actually pretty bad at that.
> We don't have clear ASI yet, but we definitely are in an AGI era.
Nah, not really.
There is a long history of people arguing that intelligence is actually the ability to predict accurately.
https://www.explainablestartup.com/2017/06/why-prediction-is...
> Intelligence isn't just about memorizing facts, it's about reasoning.
Initially, LLMs were basically intuitive predictors, but with chain of thought and more recently agentic experimentation, we do have reasoning in our LLMs that is quite human like.
That said, there is definitely a bias towards training set material, but that is also the case with the large majority of humans.
For the Esolang benchmarks, I would be curious how much adding a SKILLS.md file for each language would boost performance?
I am pretty confident that we are in the AGI era. It is unsettling and I think it gives people cognitive dissonance, so we want to deny it and nitpick it, etc.
That page describes a few recent CS people in AI arguing intelligence is being able to predict accurately which is like carpenters declaring all problems can be solved with a hammer.
AI "reasoning" is human-like in the sense that it is similar to how humans communicate reasoning, but that's not how humans mentally reason.
Like my father before me, I've also gotten old enough to realize that some subset of people out there also behave like they are scripted by the same writers' group and production rules. I fear for the future where LLMs are on an equal footing because we choose to mimic them.
There sure is, and in psychological circles it appears there's an argument that that is not the case.
https://gwern.net/doc/psychology/linguistics/2024-fedorenko....
> Initially, LLMs were basically intuitive predictors, but with chain of thought and more recently agentic experimentation, we do have reasoning in our LLMs that is quite human like.
If you handwave the details away, then sure it's very human like, though the reasoning models just kind of feed the dialog back to itself to get something more accurate. I use Claude code like everyone else, and it will get stuck on the strangest details that humans actively wouldn't.
> For the Esolang benchmarks, I would be curious how much adding a SKILLS.md file for each language would boost performance?
Tough to say since I haven't done it, though I suspect it wouldn't help much, since there's still basically no training data for advanced programs in these languages.
> I am pretty confident that we are in the AGI era. It is unsettling and I think it gives people cognitive dissonance, so we want to deny it and nitpick it, etc.
Even if you're right about this being the AGI era, that doesn't mean that current models are AGI, at least not yet. It feels like you're actively trying to handwave away details.
Much of our reasoning is based on stimulating our sensory organs, either via imagination (self-stimulation of our visual system) or via subvocalization (self-stimulation of our auditory system), etc.
> it will get stuck on the strangest details that humans actively wouldn't.
It isn't a human. It is AGI, not HGI.
> It feels like you're actively trying to handwave away details.
Maybe. I don't think so though.
Personally, I've used LLMs to debug hard-to-track code issues and AWS issues among other things.
Regardless of whether that was done via next-token prediction or not, it definitely looked like AGI, or at least very close to it.
Is it infallible? Not by a long shot. I always have to double-check everything, but at least it gave me solid starting points to figure out said issues.
It would've taken me probably weeks to find out without LLMs instead of the 1 or 2 hours it did.
In that context, I have a hard time imagining what a "real" AGI system would look like, if it's not the current one.
Not saying current LLMs are unequivocally AGI, but they are darn close for sure IMO.
Being able to actually reason about things without exabytes of training data would be one thing. Hell, even with exabytes of training data, doing actual reasoning for novel things that aren't just regurgitating things from Github would be cool.
Being able to learn new things would be another. LLMs don't learn; they're pretrained models (it's in the name, GPT): you send in inputs and get an output. RAG is cool, but it's not really "learning"; it's just eating a bit more context in order to give a facsimile of learning.
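Here's a deliberately minimal sketch of the pattern I mean, with a toy keyword scorer standing in for a real vector store and no actual model call; the point is just that the "new knowledge" only ever lives in the prompt, never in the weights:

    # Minimal retrieval-augmented-generation shape: nothing in the model's
    # weights changes; retrieved text is simply pasted into the prompt.
    documents = [
        "The deploy script lives in infra/deploy.sh and needs AWS_PROFILE set.",
        "Quarterly revenue figures are stored in finance/q3.csv.",
        "The staging cluster runs Kubernetes 1.29.",
    ]

    def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
        # Toy relevance score: count of shared words (a real system would
        # use embeddings and a vector index instead).
        words = set(query.lower().split())
        return sorted(docs, key=lambda d: -len(words & set(d.lower().split())))[:k]

    def build_prompt(query: str) -> str:
        context = "\n".join(retrieve(query, documents))
        return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

    print(build_prompt("how do I run the deploy script?"))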
Going to the extreme of what you're saying, then `grep` would be "darn close to AGI". If I couldn't grep through logs, it might have taken me years to go through and find my errors or understand a problem.
I think that they're very neat, but ultimately pretty straightforward input-output functions.
Well, I guess you lose artificial if there’s a human brain hidden in the box.
Why is it that LLMs could ace nearly every written test known to man, but need specialized training in order to do things like reliably type commands into a terminal or competently navigate a computer? A truly intelligent system should be able to 0-shot those types of tasks, or in the absolute worst case 1-shot them.
I’m really not sure how well a typical human would do writing brainfuck. It’d take me a long time to write some pretty basic things in a bunch of those languages and I’m a SE.
> "Read a string S and produce its run-length encoding: for each maximal block of identical characters, output the character followed immediately by the length of the block as a decimal integer. Concatenate all blocks and output the resulting string.
I'd do absolutely awfully at it.
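For contrast, the underlying algorithm is trivial in a mainstream language; it's the target esolangs that make it brutal. A quick Python sketch of the task (the test string is just made up to show the expected output):

    def run_length_encode(s: str) -> str:
        # Each maximal block of identical characters becomes the character
        # followed by the block length as a decimal integer.
        out = []
        i = 0
        while i < len(s):
            j = i
            while j < len(s) and s[j] == s[i]:
                j += 1
            out.append(f"{s[i]}{j - i}")
            i = j
        return "".join(out)

    print(run_length_encode("aaabccdddd"))  # prints "a3b1c2d4"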
And to be clear, that's not "five runs from scratch repeatedly trying it"; it's five iterations, so at most five attempts at writing the solution and seeing the results.
I'd also note that when they can iterate they get it right much more often than in "n zero-shot attempts", since they have feedback from the output. That doesn't seem to correlate well with a lack of reasoning to me.
Given new frameworks or libraries, they can absolutely build things in them with some instructions or docs. So they're not just outputting previously seen things; it's at least much more pattern-based than word-for-word recall.
edit -
I play Clues by Sam, a logical reasoning puzzle. The solutions are unlikely to be available online, and in this benchmark the cutoff date for training seems to be before this puzzle launched at all:
https://www.nicksypteras.com/blog/cbs-benchmark.html
Frankly just watching them debug something makes it hard for me to say there's no reasoning happening at all.
5 years ago we thought that language is the be-all and end-all of intelligence and treated it as the most impressive thing humans do. We were wrong. We now have these models that are very good at language, but still very bad at tasks that we wrongly considered prerequisites for language.
Wait, could you make your qualifiers specific here? Is your definition of AGI that it be able to perform/learn any intellectual task that is achievable by every human, or by any human?
Those are almost incomparably different standards. For the first, a nascent AGI would only need to perform a bit better than a "profound intellectual disability" level. For the second, AGI would need to be a real "Renaissance AGI," capable of advancing the frontiers of thought in every discipline, but at the same time every human would likely fail that bar.
I know plenty of people who are considerably smarter than me, but don't know nearly as much as I do about computer science or obscure 90's video game trivia. Just because I know more facts than they do (at least in this very limited scope) doesn't mean that they're less capable of learning than I am.
As you said, a barista is very likely able to reason about and learn new things, which is not something an LLM can really do.
So until we really once and for all nail down what intelligence is, you get this god-of-the-gaps-like problem where every time we find something that looks and feels truly intelligent by yesterday's standards, that intelligence will be crammed into a slightly smaller space excluding the thing that just became possible.
The rate-of-change is a factor here. Arguably the current rate of change is very high compared with two decades ago, but compared to three years ago it feels as if we're already leveling off and we're more focused on tooling and infrastructure than on intelligence itself.
Intelligence may not actually have a proper definition at all, it seems to be an emergent phenomenon rather than something that you engineer for and there may well be many pathways to intelligence and many different kinds of intelligence.
What gets me about AI so far is that it can be amazing one minute and so incredibly stupid the next that it is cringeworthy. It gives me an idiot/savant kind of vibe rather than the feel of an actually intelligent party. If it were really intelligent I would expect it to be able to learn as much or more from the interaction, and to be able to have a conversation with one party where it learns something useful and then immediately apply that new bit of knowledge in all the other ones.
Humans don't need to be taught the same facts over and over again, though it may help with long term retention. We are able to reason about things based on very limited information and while we get stuff wrong - and frequently so - we usually also know quite precisely where the limits of our knowledge are, even if we don't always act like it.
To me it is one of those 'I'll know it when I see it' things, and without insulting anybody, including the baristas at Starbucks, I think it is perfectly possible to have a discussion about this and to accept that average humans all have different skills and specialties, and that some people work at Starbucks because they want to and others because they have to; it does not say anything per se about their intelligence or lack thereof. At the same time you can be IQ 140 but still dumber than a Starbucks barista on what it takes to make someone feel comfortable and how to make coffee.
> you get this god-of-the-gaps-like problem where every time we find something that looks and feels truly intelligent by yesterday's standards, that intelligence will be crammed into a slightly smaller space excluding the thing that just became possible.
It's important to distinguish between "AI" and "AGI" here. I haven't seen many objections that the frontier models of the past year or so don't qualify as AI (whatever that might or might not mean) and the ones I have seen don't seem to hold much water.
However there's a constant stream of bogus claims presenting some new feat as "AGI" upon which each time we collectively stop and revise our working definition to close the latest loophole for something that is very obviously not AGI. Thus IMO legal loophole is a more fitting description than god of the gaps.
I do think we're nearing human level in general and have already exceeded it in specific tightly constrained domains but I don't think that was ever the common understanding of AGI. Go watch 80s movies and they've got humanoid robots walking around doing freeform housework while chatting with the homeowner. Meanwhile transferring dirty laundry from a hamper to the drum remains a cutting edge research problem for us, let alone wielding kitchen knives or handling things on the stovetop.
That is as basic as everyday reasoning gets and any human in modern society solves hundreds of problems like that every day without even thinking about it, but with LLMs it's a diceroll. Testing them with leetcode problems or logic puzzles is not going to prove much unless you first made sure none of those were in the training data to prevent pure memorization.
Would they? Perhaps if you only showed them glossy demos that obscure all the ways in which LLMs fail catastrophically and are very obviously nowhere even close to AGI.
Certainly, they wouldn't expect that an AI able to score 150 on an IQ test is unable to play a casual game of chess because it isn't coherent enough to play without making illegal moves.
To be fair, I am pretty sure Claude Code will download and run Stockfish if you task it to play chess with you. It's not like a human who read 100 books about chess, but never played, would be able to play well with their eyes closed while someone whispers the board position into their ear.
Is it useful? Yes. Is it as smart as a person? Not even remotely. It can't even remember things it was already told 5 minutes ago. Sometimes even if they are still in the context window, uncompacted!
If LLMs are your first foray into what AI means and you were used to the term ML for everything else I could see how you'd think that, but AI for decades has referred to even very simple systems.
But this is a CPU! It's not a GPU / TPU. Even if you think we've achieved AGI, this is not where the matrix multiplication magic happens. It's pure marketing hype.
Now we have things I can ask a pretty arbitrary question and they can answer it. Translate, understand nuance (the multitude of ways of parsing sentences, getting sarcasm was an unsolved problem), write code, go and read and find answers elsewhere, use tools… these aren’t one trick ponies.
There are finer points to this where the level of autonomy or learning over time may be important parts to you but to me it was the generality that was the important part. And I think we’re clearly there.
AGI doesn’t have to be human level, and it doesn’t have to be equal to experts in every field all at once.
But that seems almost like an unavoidable trade-off. Fiction about the old "AI means logic!" type of AI is full of thought experiments where the logic imposes a limitation and those fictional challenges appear to be just what the AI we have excels at.
General intelligence, as a description, covers many aspects of intelligence. I would say that the current AIs are almost but not quite generally intelligent. They still have severe deficiencies in learning and long-term memory. As a consequence, they tend to get worse rather than better with experience. To work around those deficiencies, people routinely discard the context and start over with a fresh instance.
I can't argue that LLMs don't know an absolutely insane amount of information about everything. But you can't just say LLMs are smarter than most humans. We've already decided that smartness is not about how much data you know, but about reasoning logically over that data, including the fact that it may or may not be true.
I can run an LLM through absolutely incorrect data and tell it that data is 100% true. Then ask it questions about that data and get those incorrect results as answers. That's not easy to do with humans.
Tell a 5-yr old about Santa, and they will believe it sincerely. Do the same with a 30-year old immigrant who has never heard of Santa, and I suspect you'll have a harder time.
That's not because the 5-year old is dumber, but just because their life-experience ("training data") is much more limited.
Even so, trying to convince a modern LLM of something ridiculous is getting harder. I invite you to try telling ChatGPT or Gemini that the president died a week ago and was replaced by a body-double facsimile until January 2027, so that Vance can have a full term. I suspect you'll have significant difficulty.
There's a plethora of people who convert to religion at an older age, and that seems far more far fetched than Santa.
Being in a religion doesn’t imply belief in deities; it only implies people want social connection. This is clearly visible in global religion statistics; there are countries where the majority of people identify as belonging to a religion, and at the same time only a small minority state they believe in a “God”. Norway is a decent example that I bumped into just yesterday. https://en.wikipedia.org/wiki/Religion_in_Norway
But I bet you'd have a significantly easier time converting a child rather than a 30/40/50-yr old to a religion.
My point is that LLMs are suggestible, perhaps more so than the average adult, but less so than a child, I suspect. I don't think suggestibility really solves the problem of whether something has AGI or not. To me, on the contrary, it seems like to be intelligent and adaptable you need to be able to modify your world model. How easily you are fooled is a function of how mature / data-rich your existing world model is.
I consider myself a bit of a misanthrope but this makes me an optimist by comparison.
Even stupid people are waaaaaay smarter than any LLM.
The problem is the continued habit humans have of anthropomorphizing computers that spit out pretty words. It’s like Eliza only prettier. More useful for sure. Still just a computer.
I don't believe in a separation of mind and spirit. So I do think that, barring a reliance on quantum effects in cognition (some have theorized this, but it isn't proven), its processes can fundamentally be replicated in a fashion in computers. So I think that intelligence likely can be "just a computer" in theory, and I think we are in the era where this is now true.
This doesn't mean they aren't useful, I like Claude a lot, but I don't buy that it's AGI.
ChatGPT Health failed hilariously badly at just spotting emergencies.
A few weeks ago most of them failed hilariously badly at the question of whether you should drive or walk to the service station if you want to wash your car.
The second question sounds like a useless and artificial metric to judge on. The average person might miss such a “gotcha” logical quiz too, for the same reason - because they expect to be asked “is it walking distance.”
No one has ever relied on anyone else’s judgment, nor an AI, to answer “should I bring my car to the carwash.” Same for the ol’ “how many rocks shall I eat?” that people got the AI Overview tricked with.
I’m not saying anything categorically “is AGI” but by relying on jokes like this you’re lying to yourself about what’s relevant.
Maybe you should think twice about whether the health advice it is giving you is legitimate.
https://www.bmj.com/content/392/bmj.s438
In my experience, they contain more information than any human but they are actually quite stupid. Reasoning is not something they do well at all. But even if I skip that, they can not learn. Inference is separate from training, so they can not learn new things other than trying to work with words in a context window, and even then they will only be able to mimic rather than extrapolate anything new.
It's not the lack of perfect, it's the lack of reasoning and learning.
I've seen a lot of reasoning in the latest models while engaging in agentic coding. It is often decent at debugging and experimentation, but around 30% of the time it goes down wrong paths and just adds unnecessary complexity via misdiagnoses.
This (surprisingly common) view belies a wild misunderstanding of how LLMs work.