It's easy to fall into a negative mindset because the justification is real and what we see is just the beginning.
Obviously we are not at a point where developers aren't needed. But one developer can do more. And that is a legitimate reason to hire fewer developers.
The impending reality of the upward-moving trendline is that AI becomes so capable that it can replace the majority of developers. That future is so horrifying that people have to scaffold up rationalizations to explain it away.
I honestly think the downvote button is pretty trash for online communities. It kills diversity of thought and discussion and leaves you with an echo chamber.
If you disagree with or dislike something, leave a response. Express your view. Save the downvotes for racism, calls for violence, etc.
Maybe in the future, platforms can have high quality auto moderation using AI to read every post and delete/flag those not following community guidelines.
I’m sure this would work well today, though not sure about the cost.
CEOs/decision makers would rather give their entire labour budget to tokens if they could, just to validate this belief. They are bitter that anyone from a lower class could hold any bargaining chips, and thus any influence over them. It has nothing to do with saving money: they would gladly pay the exact same engineering budget to Anthropic for tokens (just as the ruling class in times past gladly paid for slaves) if it patched that bitterness they have about the working class's influence over them.
The inference companies (who come from this same class of people) know this and are exploiting this desire. They know that if they create the impression that AI progress is moving at an unstoppable velocity, decision makers will begin handing them their engineering budgets. These things don't even have to work well; they just need to be perceived as effective, or soon to be, for decision makers to start laying people off.
I suspect this is going to backfire on them in one of two ways.
1. French Revolution V2: they all get their heads cut off in 15 years, or an early retirement on a concrete floor.
2. Many decision makers will make fools of themselves, destroy their businesses, and come begging to the working class for our labor, giving the working class more bargaining chips in the process.
Either outcome is going to be painful for everyone; let's hope people wake up before we push this dumb experiment too far.
> Competition will be dynamic because people have agency. The country that is ahead at any given moment will commit mistakes driven by overconfidence, while the country that is behind will feel the crack of the whip to reform. … That drive will mean that competition will go on for years and decades.
https://danwang.co/ (2025 Annual letter)
The future is not predetermined by trends today. So it’s entirely possible that the dinosaur companies of today can’t figure out how to automate effectively, but get outcompeted by a nimble team of engineers using these tools tomorrow. As a concrete example, a lot of SaaS companies like Salesforce are at risk of this.
Much like there is a premium for handmade clothing and from-scratch food, automation does nothing but lower the value of your product (unless it's absolutely required, as with electronics perhaps). When there is an alternative, the one made with human input/intention is always worth more.
And the idea that small nimble teams are going to outpace larger corporations is such a psyop. You mostly hear CEOs saying these things on podcasts. This is to appease the working class, to give them hope that they too one day can be a billionaire...
Also, the vast majority of people who occupy computer-I/O-focused jobs, whose jobs will be replaced, need to work to eat, and they don't all want to go form nimble automated SaaS companies lmao. This is such a farce... Bad things to come all around.
I know that, with respect to personal projects, more projects are getting “funded” with my time. I’m able to get done in a couple of hours with coding agents what would’ve taken me a couple of weekends to finish if I stayed motivated. The upshot is I’m able to get much closer to “done” than before.
Right now you state the current problem is: "requiring my constant supervision and frequent intervention and always trying to sneak in subtle bugs or weird architectural decisions"
But in 2 years that could be gone too, given the objective and literal trendline. So I actually don't see how you can hold this opinion: "I'm not even freaking about my career, I'm freaking about how much today's "almost good" LLMs can empower incompetence and how much damage that could cause to systems that I either use or work on." when all logic points away from it.
We need to be worried, LLMs are only getting better.
But _what if_ they work out all of that in the next 2 years and it stops needing constant supervision and intervention? Then what?
We can synthesize answers to questions more easily, yes. We can make better use of extensive test suites, yes. We cannot give 1000 different correct answers to the same prompt. We cannot read minds.
If the answer is "yes"? Then, yeah, AI is not coming for you. We can make LLMs multimodal, teach them to listen to audio or view images, but we have no idea how to give them ESP modalities like mind reading.
If the answer is "no"? Then what makes you think that your inability to read minds beats that of an LLM?
Today's LLMs are fundamentally the same as any other machine we've built and there is no reason to think it has mystical sensibilities.
We really need to start distinguishing between "intelligence" and "relevance". The AI can be perfectly intelligent, but without input from humans it has no connection to our Zeitgeist, no source material. Smart people can be stupid too, which means they are intelligent but disconnected from society. They make smart but irrelevant decisions, just like AI models always will.
AI is like an artificial brain, and a good one, but humans have more to our intelligence than brains. AI is just a brain and we are more.
I'm just not sure who will end up employed. The near-term state is obviously Jira-driven development, where agents just pick up tasks from Jira, etc. But will that mean the PMs go and we have a technical PM, or will we be the ones binned? Probably for most SMEs it'll just be maybe 1 PM and 2 or so technical PMs churning out tickets.
But whatever. It's the trajectory you should be looking at.
Like I have compassion, but I can't healthily respect people who try so hard to rewrite reality so that the future isn't so horrifying. I'm a SWE and I'm affected too, but it's not like I'm going to lie to myself about what's happening.
They just want people to think the barrier of entry has dropped to the ground and that value of labour is getting squashed, so society writes a permission slip for them to completely depress wages and remove bargaining chips from the working class.
Don't fall for this. They want to destroy any labor that deals with computer I/O, not just SWE. This is the only value "agentic tooling" provides to society: slaves for the ruling class. They yearn for the opportunity to own slaves again.
It can't do most of your work, and you know that if you work on anything serious. But if the C-suite, who haven't dealt with code in two decades, think this is the case because everyone is running around saying it's true, they're going to make sure they replace humans with these bot slaves. They really do just want slaves; they have no intention of innovating with them. People need to work to eat, so unless LLMs are creating new types of machines that need new types of jobs, like previous forms of automation did, I don't see why they should be replacing the human input.
If these things are so good for business and are pushing software development velocity, why is everything falling apart? Why does the bulk of low-stakes software suck? Why is Windows 11 so bad? Why aren't top hedge funds and medical device manufacturers (places where software quality is high stakes) replacing all their labor? Where are the new industries? They don't do anything novel; they only serve to replace inputs previously supplied by humans so the ruling class can finally get back to the good old feeling of having slaves that can't complain.
The thing about spin and AI hype (besides being trivially easy to write) is that it isn't even trying to be objective. It would help if a lot of these articles would more carefully lay out what is actually surprising, and what is not, given current tech and knowledge.
Only a fool would think we aren't potentially on the verge of something truly revolutionary here. But only a fool would also be certain that the revolution has already happened, or that e.g. AGI is necessarily imminent.
The reason HN has value is because you can actually see some specifics of the matter discussed, and, if you are lucky, an expert might even join in to qualify everything. But pointing out "how interesting that there are extremes to this" is just engagement bait.
Really? Is that happening in this thread? Because I can barely see it. Instead you have a bunch of asinine comments butthurt about acknowledging a GPT contribution that would have been acknowledged any day had a human done it.
>they know more about this than Fields medalist Terence Tao, who maintains this list showing that, yes, though these are not interesting proofs to most modern mathematicians, LLMs are a major factor in a tiny minority of these mostly-not-very-interesting proofs
This is part of the problem really. Your framing is disingenuous and I don't really understand why you feel the need to downplay it so. They are interesting proofs. They are documented for a reason. It's not cutting edge research, but it is LLMs contributing meaningfully to formal mathematics, something that was speculative just years ago.
I am not surprised that you can't understand that the quote I am making is obviously parodying the OP as disingenuous. Given our previous interactions (https://news.ycombinator.com/item?id=46938446), it is clear you don't understand much about AI and/or LLMs, or, perhaps, basic communication, at all.
>Given our previous interactions (https://news.ycombinator.com/item?id=46938446), it is clear you don't understand much about AI and/or LLMs at all.
Sure, whatever makes you happy I guess.
>> OP's original comment is something that is actually happening in a bunch of comments on this very thread
OP's original comment was obviously a general claim not tied to responses to this thread. As usual, you fail to understand even the basics of what you are talking about.
This sentence sounds contradictory. You're a fool to not think we're on the verge of something revolutionary and you are a fool if you think something revolutionary like AGI is on the verge of happening?
But to your point, if "revolutionary" and "AGI" are different things, I'm certain the "revolution" has already happened. ChatGPT was the step-function change, and everything else is just following the upward trendline since its release.
Anecdotally I would say 50% of developers never code things by hand anymore. That is revolutionary in itself and by the statement itself it has already literally happened.
It's interesting to me that whenever AI gets a bunch of instructions from a reasonably bright person who has a suspicion about something, can point at reasons why, but not quite put their finger on it, we want to credit the AI for the insight.
And in this case "derives a new result in theoretical physics" is again overstating things, it's closer to "simplify and propose a more general form for a previously worked out sequence of amplitudes" which sounds less magical, and closer to something like what Mathematica could do, or an LLM-enhanced symbolic OEIS. Obviously still powerful and useful, but less hype-y.
How is this different from a new result? Many a career in academia has been built on simplifying mathematics.
https://www.math.columbia.edu/~woit/wordpress/?p=15362
Let's wait a couple of days to see whether there has been a similar result in the literature.
You reached your goal though and got that comment downvoted.
If I'd wanted that comment downvoted, I would have downvoted it myself, which as it happens I didn't. There was nothing particularly wrong with it, other than the fact that it was phrased in a way that could mislead, hence my comment.
The reality is: "GPT 5.2 found a more general and scalable form of an equation, after crunching for 12 hours supervised by 4 experts in the field".
Which is equivalent to taking one of the countless niche algorithms out there and having a few experts in that algorithm let LLMs crunch tirelessly until they find a better formula, after those same experts prompted it in the right direction and with the right feedback.
Interesting? Sure. Speaks highly of AI? Yes.
Does it suggest that AI is revolutionizing theoretical physics on its own like the title does? Nope.
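To make that loop concrete, here is a minimal sketch (the toy sequence and candidate list are entirely made up for illustration; nothing here reflects OpenAI's actual setup): hand-computed base cases play the role of the n ≤ 6 amplitudes, a proposer emits candidate closed forms, and a verifier keeps whatever survives.

```python
from math import comb

# "Hand-computed" base cases standing in for the n <= 6 results
# (toy sequence: sum of binomial coefficients, i.e. 2**n - 1).
base_cases = {n: sum(comb(n, k) for k in range(1, n + 1)) for n in range(1, 7)}

# Candidate closed forms a proposer might emit (here a hard-coded list;
# in the real story, an LLM prompted and steered by experts).
candidates = {
    "n**2 + 1":     lambda n: n**2 + 1,
    "2**n - 1":     lambda n: 2**n - 1,
    "n*(n + 1)//2": lambda n: n * (n + 1) // 2,
}

# Verification: keep only candidates that reproduce every base case.
surviving = [name for name, f in candidates.items()
             if all(f(n) == v for n, v in base_cases.items())]
print(surviving)  # ['2**n - 1']
```

The expensive parts in the real story are generating plausible candidates and verifying them symbolically rather than numerically, but the shape of the loop is the same.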
Yet, if some student or child achieved the same – under equal supervision – we would call him the next Einstein.
One of my best friends, in her bachelor's thesis, had solved a difficult mathematical problem in planetary orbits or something, and it was just yet another random day in academia.
And she didn't solve it because she was a genius, but because there are a bazillion such problems out there and little time to look at them and focus. Science is huge.
There are simple limitations that follow from these basic facts (or that follow with extreme but not 100% certainty), such that many experts openly state that LLMs have serious limitations. Yet, despite all this, you get some very extreme claims about capabilities from supporters that are extremely hard to reconcile with these basic and indisputable facts.
That, and the massive investment and financial incentives means that the counter-reaction is really quite rational (but still potentially unwarranted, in some/many practical cases).
There is no loud, moderate voice. It makes me very tired of the blasting rhetoric that invades _every_ space.
But agree that there's an irrational level of tribalism on both sides.
What I question here is OpenAI's article: it could be way more generous towards the reader.
One group of people saying every amazing breakthrough "doesn't count" because the AI didn't put a cherry on top. Another group of people saying humans are obsolete, I just wrote a web browser with AI bro.
There are some voices out there that are actually examining the boundaries, possibilities and limitations. A lot of good stuff like that makes it onto HN but then if you open the comments it's just intellectual dregs. Very strange.
ISTR there was a similar phenomenon with cryptocurrency. But with that it was always clear the fog of bullshit would blow away sooner or later. But maybe if it hadn't been there, a load of really useful stuff could have come out of the crypto hype wave? Anyway, AI isn't gonna blow over like crypto did. I guess we have more of a runway to grow out of this infantile phase.
Take a look at this entire thread. Everyone, and I mean everyone, is talking as if AI is some sort of fraud and everything is just hype. This thread is all against AI, I mean all of it. If anything, the anti-hype around AI is what's flooding the world right now. If AI hype were through the roof, we'd see the opposite effect on HN.
I think it's a strange contradiction in the human mind. At work outside of HN, what I see is roughly 50-60% of developers no longer code by hand. They all use AI. Then they come onto HN and they start Anti-hyping it. It's universal. They use it and they're against it at the same time.
The contradiction is strange, but it also makes sense because AI is a thing that is attacking what programmers take pride in. Most programmers are so proud of their abilities and intelligence as it relates to their jobs and livelihood. AI is on a trendline of replacing this piece by piece. It makes perfect sense for them to talk shit but at the same time they have to use it to keep up with the competition.
It reminds me of an episode of Star Trek, "The Measure of a Man" I think it's called, where it is argued that Data is just a machine and Picard tries to prove that no he is a life form.
And the challenge is, how do you prove that?
Every time these LLMs get better, the goalposts move again.
It makes me wonder, if they ever did become sentient, how would they be treated?
It's seeming clear that they would be subject to deep skepticism and hatred much more pervasive and intense than anything imagined in The Next Generation.
They never surrender.
No one cares about "AGI" or whatever the fuck term or internet-argument goalpost you cared about X months ago. Everyone cares about what current tech can do NOW, under what conditions, and when it fails catastrophically. That is all that matters.
So, refining the conditions of an LLM win (or loss) is all that matters (not who wins or loses depending on some particular / historical refinement). Complaining that some people see some recent result as a loss (or win) is just completely failing to understand the actual game being played / what really matters here.
When I use GPT 5.2 Thinking Extended, it gives me the impression that it's consistent enough/has a low enough rate of errors (or enough error-correcting ability) to autonomously do math/physics for many hours if it were allowed to [but I guess the Extended thinking time cuts off around the 30-minute mark, and Pro maybe 1-2 hours]. It's good to see some confirmation of that impression here. I hope scientists/mathematicians at large will soon be able to play with tools that think on this time-scale and see how much capability these machines really have.
This result reminded me of the C compiler case that Anthropic posted recently. Sure, agents wrote the code for hours but there was a human there giving them directions, scoping the problem, finding the test suites needed for the agentic loops to actually work etc etc. In general making sure the output actually works and that it's a story worth sharing with others.
The "AI replaces humans in X" narrative is primarily a tool for driving attention and funding. It works great for creating impressions and building brand value but also does a disservice to the actual researchers, engineers and humans in general, who do the hard work of problem formulation, validation and at the end, solving the problem using another tool in their toolbox.
>[...]
>The "AI replaces humans in X" narrative is primarily a tool for driving attention and funding.
You're sort of acting like it's all or nothing. What about the humans that used to be that "force multiplier" on a team with the person guiding the research?
If a piece of software required a team of ten people, and instead it's built with one engineer overseeing an AI, that's still 90% job loss.
For a more current example: do you think all the displaced Uber/Lyft drivers aren't going to think "AI took my job" just because there's a team of people in a building somewhere handling the occasional Waymo low confidence intervention, as opposed to being 100% autonomous?
Yes, but this assumes a finite amount of software that people and businesses need and want. Will AI be the first productivity increase where humanity says ‘now we have enough’? I’m skeptical.
I'm curious why you think I'm acting like it's all or nothing. What I was trying to communicate is the exact opposite, that it's not all or nothing. Maybe it's the way I articulate things, I'm genuinely interested what makes it sound like this.
This is a bizarre time to be living in, on one hand these tools are capable of doing more and more of the tasks any knowledge worker today handles, especially when used by an experienced person in X field.
On the other, it feels like something is about to give. All the superbowl ads, AI in what feels like every single piece of copy coming out these days. AI CEOs hopping from one podcast to another warning about the upcoming career apocalypse…I’m not fully buying it.
That, of course, assumes that there are 9 other projects that are both known (or knowable) and worth doing. And in the case of Uber/Lyft drivers, there's a skillset mismatch between the "deprecated" jobs and their replacements.
A website that cost hundreds of thousands of dollars in 2000 could be replaced by a wordpress blog built in an afternoon by a teenager in 2015. Did that kill web development? No, it just expanded what was worth building
Maybe it requires fundamentally changing our economic systems? Who knows what the solution is, but the problem is most definitely rooted in lack of initiative by our representatives and an economic system that doesn't accommodate us when shit inevitably hits the fan with labor markets.
It's also a legitimate concern. We happen to be in a place where humans are needed for that "last critical 10%," or the first critical 10% of problem formulation, and so humans are still crucial to the overall system, at least for most complex tasks.
But there's no logical reason that needs to be the case. Once it's not, humans will be replaced.
When the systems turn into something trivial to manage with the new tooling, humans build more complex systems or add more layers on top of the existing ones.
To think that whatever the AI is capable of solving is (and forever will be) the frontier of all problems is deeply delusional. AI got good at generating code, but it still can't even do a fraction of what the human brain can do.
AGI means fully general, meaning everything the human brain can do and more. I agree that currently it still feels far (at least it may be far), but there is no reason to think there's some magic human ingredient that will keep us perpetually in the loop. I would say that is delusional.
We used to think there was human-specific magic in chess, in poker, in Go, in code, and in writing. All those have fallen, the latter two albeit only in part but even that part was once thought to be the exclusive domain of humans.
What I said in my original comment is that AI delivers when it's used by experts. In this case there was someone who was definitely not a C compiler expert; what would happen if a real expert were doing this?
Yes, the bear is definitely dancing.
But a few feet away there's a world-class step dancer doing intricate rhythms they've perfected over twenty years of hard work.
The bear's kind of shuffling along to the beat like a stoner in a club.
It's amazing it can do it at all... but the resulting compiler is not actually good enough to be worth using.
No one has made that assertion; however, the fact that it can create a functioning C compiler with minimal oversight is the impressive part, and it shows a path to autonomous GenAI use in software development.
So, I just skimmed the discussion thread, but I am not seeing how this shows that CCC is not impressive. Is the point you're making that the person who opened the issue is not impressive?
I worry we're not producing as many of those as we used to
The text of the post is much more honest. The title is where the dishonesty is.
Snow + stick + need to clean driveway = snow shovel. Snow shovel + hill + desire for fun = sled
At one point people were arguing that you could never get "true art" from linear programs. Now you get true art and people are arguing you can't get magical flashes of insight. The will to defend human intelligence / creativity is strong but the evidence is weak.
Happy Valentine's day to those who celebrate btw <3
https://github.com/teorth/erdosproblems/wiki/AI-contribution... may be useful
If I'm wrong, please let me know which previously unsolved problem was solved, I would be genuinely curious to see an example of that.
I guess 1051 qualifies - from the paper: "Semi-autonomous mathematical discovery with gemini" https://arxiv.org/pdf/2601.22401
"We tentatively believe Aletheia’s solution to Erdős-1051 represents an early example of an AI system autonomously resolving a slightly non-trivial open Erdős problem of somewhat broader (mild) mathematical interest, for which there exists past literature on closely-related problems [KN16], but none fully resolves Erdős-1051. Moreover, it does not appear to us that Aletheia’s solution is directly inspired by any previous human argument (unlike in many previously discussed cases), but it does appear to involve a classical idea of moving to the series tail and applying Mahler’s criterion. The solution to Erdős-1051 was generalized further, in a collaborative effort by Aletheia together with human mathematicians and Gemini Deep Think, to produce the research paper [BKK+26]."
(35)-(38) are the AI-simplified versions of (29)-(32). Those earlier formulae look formidable to simplify by hand, but they are also the sort of thing you'd try to use a computer algebra system for.
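For a feel of what "use a computer algebra system for" means in practice, here is a tiny sketch with sympy (the bloated expression is invented for illustration, not taken from the paper):

```python
import sympy as sp

n = sp.symbols("n", positive=True, integer=True)

# A deliberately bloated expression (made up): the rational part cancels
# to n + 2, and the trig part collapses to 1.
bloated = (n**3 + 3*n**2 + 2*n) / (n**2 + n) + sp.sin(n)**2 + sp.cos(n)**2

print(sp.simplify(bloated))  # -> n + 3
```

The point of the comparison is that this class of simplification is mechanical once a target form is in sight; the interesting question is who did the guessing.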
I'm willing to (begrudgingly) admit the possibility for AI to do novel work, but this particular result does not seem very impressive.
I picture ChatGPT as the rich kid whose parents privately donated to a lab to get their name on a paper for college admissions. In this case, I don't think I'm being too cynical in thinking that something similar is happening here and that the role of AI in this result is being well overplayed.
Assuming you are truthful, good to see someone here from the actual domain in question.
I also have a physics background, and separately, have derived novel results in color science using Mathematica that led to a great effect on my career.
I wouldn't wish having to do that on anyone, it was awful work.
Independently of it being awful, I know it's extremely, extremely unlikely to luck into a result this complex, both in my opinion and, seemingly, in reality: if someone could have done it before, why didn't they?
If it is that trivial, you'll prove it, make some money, and I'll understand that it really was that trivial, and we'll get some headlines out of it. Win-win, modulo I'll look like an ass.
If it isn't that trivial, you won't do it, and no one will notice this far down a thread. But you seem thoughtful, you'll likely grapple with the gulf between your flippant response and reality, and gain some insight. Win for you either way, in that case.
So I would read this (with more information available) with less emphasis on the LLM discovering a new result. The title is a little bit misleading, but "derives" is actually the operative word here, so it would be technically correct for people in the field.
[1] https://en.wikipedia.org/wiki/List_of_physical_constants
They evaluate papers that look interesting and should be looked at more deeply. Then, research ideas as much as they can.
Then flag for human review the real possible breakthroughs.
;)
Any reason to believe that public versions of GPT-5.2 could have accomplished this task? "scaffolded" is a very interesting word choice
My personal opinion is that things will only accelerate from here.
I'm not blaming the model here, but Python is much easier to read and more universal than math notation in most cases (especially for whatever's going on at the bottom of page four). I guess I'll have one translate the PDF.
New Honda Civic discovered Pacific Ocean!
New F150 discovers Utah Salt Flats!
Sure it took humans engineering and operating our machines, but the car is the real contributor here!
"Couldn't" is an immensely high bar in this context; "didn't" seems more appropriate and renders this whole thing slightly less exciting.
Okay read it: Yep Induction. It already had the answer.
Don't get me wrong, I love Induction... but we aren't having any revolutions in understanding with Induction.
I expect lots of derivations (new discoveries whose pieces were already in place somewhere, but no one has put them together).
In this case, the human authors did the thinking and also used the LLM, but this could happen without the original human author too (some guy posts a partial result on the internet, no one realizes it's novel knowledge, and it gets reused by AI later). It would be tremendously nice if credit were preserved in such scenarios.
Not saying they're lying, but I'm sure it's exaggerated in their own report.
Theoretical physics is throwing a lot of stuff at the wall and theory crafting to find anything that might stick a little. Generation might actually be good there, even generation that is "just" recombining existing ideas.
I trust physicists and mathematicians to mostly use tools because they provide benefit, rather than because they are in vogue. I assume they were approached by OpenAI for this, but glad they found a way to benefit from it. Physicists have a lot of experience teasing useful results out of probabilistic and half broken math machines.
If LLMs end up being solely tools for exploring some symbolic math, that's a real benefit. I wish it didn't involve destroying all progress on climate change, platforming truly evil people, destroying our economy, exploiting already disadvantaged artists, destroying OSS communities, enabling yet another order-of-magnitude increase in spam profitability, destroying the personal computer market, stealing all our data, sucking the oxygen out of investing in real industry, and bald-faced lies to everyone about how these systems work.
Also, last I checked, MATLAB wasn't a trillion dollar business.
Interestingly, the OpenAI wrangler is last in the list of authors and acknowledgements. That somewhat implies the physicists don't think it deserves much credit. They could be biased against LLMs, like me.
When Victor Ninov (fraudulently) analyzed his team's accelerator data using an existing software suite to find a novel superheavy element, he got first billing on the authors list. Probably he contributed to the theory and some practical work, but he alone was literate in the GOOSY data tool. Author lists are often a political game as well as a record of credit, but Victor got top billing above people like his bosses, who were famous names. The guy who actually came up with the idea of how to create the element, in an innovative recipe that a lot of people doubted, was credited 8th.
https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.83...
I am generally very skeptical about work at this level of abstraction. The result emerges only after choosing Klein signature instead of physical spacetime, complexifying momenta, restricting to a "half-collinear" regime that doesn't exist in our universe, and picking a specific kinematic sub-region. Then they check the result against internal consistency conditions of the same mathematical system. This pattern should worry anyone familiar with the replication crisis. The conditions this field operates under are a near-perfect match for what psychology has identified as maximising systematic overconfidence: extreme researcher degrees of freedom (choose your signature, regime, helicity, ordering until something simplifies), no external feedback loop (the specific regimes studied have no experimental counterpart), survivorship bias (ugly results don't get published, so the field builds a narrative of "hidden simplicity" from the survivors), and tiny expert communities where fewer than a dozen people worldwide can fully verify any given result.
The standard defence is that the underlying theory — Yang-Mills / QCD — is experimentally verified to extraordinary precision. True. But the leap from "this theory matches collider data" to "therefore this formula in an unphysical signature reveals deep truth about nature" has several unsupported steps that the field tends to hand-wave past.
Compare to evolution: fossils, genetics, biogeography, embryology, molecular clocks, observed speciation — independent lines of evidence from different fields, different centuries, different methods, all converging. That's what robust external validation looks like. "Our formula satisfies the soft theorem" is not that.
This isn't a claim that the math is wrong. It's a claim that the epistemic conditions are exactly the ones where humans fool themselves most reliably, and that the field's confidence in the physical significance of these results outstrips the available evidence.
I wrote up a more detailed critique on Substack: https://jonnordland.substack.com/p/the-psychologists-case-ag...
Basically, if you are small enough you can move forwards and backwards in time, from the moment you were put into a superposition, or entangled, until you interact with an object too large to ignore the emergent effects of time and gravity. This is 'being observed' and 'collapsing the wave function'. You occupy all possible positions in space as defined by the probability of you being there. Once observed, you move forward in linear time again and the last route you took is the only one you ever took even though that route could be affected by interference with other routes you took that now no longer exist. When in this state there is no 'before' or 'after' so the delayed choice experiment is simply an illusion caused by our view of time, and there is no delay, the choice and result all happen together.
With entanglement, both particles return to the entanglement point, swap places and then move to the current moment and back again, over and over. They obey GR, information always travels under the speed of light (which to the photon is infinite anyway), so there is no spooky action at a distance, it is sub-lightspeed action through time that has the illusion of being instant to entities stuck in linear time.
It then went on to talk about how mass creates time, and how time is just a different interpretation of gravity leading it to fully explain how a black hole switches time and space, and inwards becomes forwards in time inside the event horizon. Mass warps 4D (or more) space. That is gravity, and it is also time.
Humans have worked out the amplitudes for integer n up to n = 6 by hand, obtaining very complicated expressions, which correspond to a “Feynman diagram expansion” whose complexity grows superexponentially in n. But no one has been able to greatly reduce the complexity of these expressions, providing much simpler forms. And from these base cases, no one was then able to spot a pattern and posit a formula valid for all n. GPT did that.
Basically, they used GPT to refactor a formula and then generalize it for all n. Then verified it themselves.
I think this was all already figured out in 1986 though: https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.56... see also https://en.wikipedia.org/wiki/MHV_amplitudes
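For reference, the Parke-Taylor result is exactly this kind of all-n closed form: the color-ordered tree-level MHV gluon amplitude, with gluons i and j carrying negative helicity, collapses in spinor-helicity notation to (up to couplings and overall factors):

```latex
A_n^{\text{tree}}\!\left(1^{+} \cdots i^{-} \cdots j^{-} \cdots n^{+}\right)
  = \frac{\langle i\,j\rangle^{4}}
         {\langle 1\,2\rangle\,\langle 2\,3\rangle \cdots \langle n\,1\rangle}
```

Whether the new result genuinely goes beyond this family is exactly the domain-expertise question raised in the replies.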
I think this is a prime example of how easy it is to think something is solved when looking at things from a high level, while drawing an erroneous conclusion due to lack of domain expertise. Classic "Reviewer 2" move. Though I'm not a domain expert, so if there was no novelty over Parke and Taylor, I'm pretty sure this will get thrashed in review.
Sorry, but I just have to point out how this field of maths reads like Star Trek technobabble to me.
trekify/SKILL.md: https://github.com/SimHacker/moollm/blob/main/skills/trekify...
I was just stating the facts and correcting a reaction that went too far in the other direction. Taking my comment as supporting or validating OpenAI's claim is just as bad, an error of the same magnitude.
I feel like I've been quoting Feynman a lot this week: the first principle is to not fool yourself, and you're the easiest person to fool. You're the easiest person for you to fool because you're exactly as smart as yourself, and deception is easier than proving. We all fall for these traps, and the smartest people in the world (or history) are not immune to it. But it's interesting to see on a section of the internet that prides itself on its intelligence. I think we just love blinders, which is only human.
This result, by itself, does not generalize to open-ended problems, though, whether in business or in research in general. Discovering the specification to build is often the majority of the battle. LLMs aren't bad at this, per se, but they're nowhere near as reliably groundbreaking as they are on verifiable problems.
Feel like it's a bit like what I tried to express a few weeks ago (https://news.ycombinator.com/item?id=46791642), namely that we are just pouring computational resources into verifiable problems and then claiming that, astonishingly, sometimes it works. Sure, LLMs even have a slight bias, namely they rely on statistics, so it's not purely brute force, but the approach is still pretty much the same: throw stuff at the wall, see what sticks, and once something finally does, report it as grandiose and claim it to be "intelligent".
What do we think humans are doing? I think it’s not unfair to say our minds are constantly trying to assemble the pieces available to them in various ways. Whether we’re actively thinking about a problem or in the background as we go about our day.
Every once in a while the pieces fit together in an interesting way and it feels like inspiration.
The techniques we’ve learned likely influence the strategies we attempt, but beyond all this what else could there be but brute force when it comes to “novel” insights?
If it’s just a matter of following a predefined formula, it’s not intelligence.
If it’s a matter of assembling these formulas and strategies in an interesting way, again what else do we have but brute force?
How many people have tried to figure out a new maths, a GUT in physics, a more perfect human language (Esperanto for ex.) or programming language, only to fail in the vast majority of their attempts?
Do we think that anything but the majority of the attempts at a paradigm shift will end in failure?
If the majority end in failure, how is that not the same brute force methodology (brute force doesn’t mean you can’t respond to feedback from your failed experiments or from failures in the prevailing paradigms, I take it to just fundamentally mean trying “new” things with tools and information available to you, with the majority of attempts ending in failure, until something clicks, or doesn’t and you give up).
Instead of brute-forcing with infinite options, reduce the problem space by starting with some hunch about the mechanism. Then the hard part that can take decades: synthesize compounds with the necessary traits to alter the mechanism in a favourable way, while minimizing unintended side-effects.
Then try on a live or lab grown specimen and note effectiveness. Repeat the cycle, and with every success, push to more realistic forms of testing until it reaches human trials.
Many drugs that reach the last stage - human trials - often end up being used for something completely other than what they were designed for! One example of that is minoxidil: designed to regulate blood pressure, used for regrowing hair!
Method A) 30% speed reduction and 80% precision decrease
Method B) 50% speed reduction and 5% precision increase
Method C) 740% speed reduction and 1% precision increase
and we only publish B. It's not brute force[1], but throwing noodles at the wall and seeing what sticks, like the GP said. We don't throw spoons[1], but everything that looks like a noodle has a high chance of being thrown. It's a mix of experience[1] and not enough time to try everything.
[1] citation needed :)
RLHF is an attempt to push LLMs pre-trained with a dopey reconstruction loss toward something we actually care about: imagine if we could find a pre-training criterion that actually cared about truth and/or plausibility in the first place!
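Schematically (the standard textbook formulation, not any particular lab's recipe), the contrast is between the next-token reconstruction loss used in pre-training and the reward-plus-KL objective bolted on afterwards:

```latex
% Pre-training: next-token reconstruction loss over text x
\mathcal{L}_{\text{pretrain}}(\theta) = -\sum_{t}\log p_{\theta}\!\left(x_t \mid x_{<t}\right)

% RLHF: maximize a learned reward r_\phi while staying close to a reference policy
\max_{\theta}\;\; \mathbb{E}_{y \sim \pi_{\theta}(\cdot\mid x)}\!\left[r_{\phi}(x,y)\right]
  \;-\; \beta\,\mathrm{KL}\!\left(\pi_{\theta}(\cdot\mid x)\,\middle\|\,\pi_{\text{ref}}(\cdot\mid x)\right)
```

The complaint above is that the learned reward is a post-hoc patch: nothing in the pre-training objective itself knows or cares whether x was true.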
Our actual software implementation is usually pretty simple; often writing up the design spec takes significantly longer than building the software, because the software isn't the hard part - the requirements are. I suspect the same folks who are terrible at describing their problems are going to need help from expert folks who are somewhere between SWE, product manager, and interaction designer.
Heck, it's hard to get authors to do a literature search, period: never mind not thoroughly looking for prior art, even well-known disgraced papers continue to get positive citations all the time...
Slightly OT, but wasn't this supposed to be largely solved with amplituhedrons?
Can humans actually do that? Sometimes it appears as if we have made a completely new discovery. However, if you look more closely, you will find that many events and developments led up to this breakthrough, and that it is actually an improvement on something that already existed. We are always building on the shoulders of giants.
From my reading yes, but I think I am likely reading the statement differently than you are.
> from first principles
Doing things from first principles is a known strategy, so is guess and check, brute force search, and so on.
For an LLM to follow a first-principles strategy, I would expect it to take in a body of research, come up with some first principles or guess at them, then iteratively construct a tower of reasoning/findings/experiments.
Constructing a solid tower is where things are currently improving for existing models, in my mind, but when I try the OpenAI or Anthropic chat interfaces, neither does a good job for long, not independently at least.
Humans also often have a hard time with this; in general it is not a skill that everyone has, and I think you can be a successful scientist without ever heavily developing first-principles problem solving.
Even the realm of pure mathematics and elegant physics theories, where you are supposed to take a set of axioms ("first principles") and build something with them, has cautionary tales such as the Russell paradox or the lack of a well-defined measure for Feynman path integrals, and let's not talk about string theory.
These have been identified as various things. Eureka moments, strokes of genius, out of the box thinking, lateral thinking.
LLMs have not been shown to be capable of this. They might be in the future, but they haven't yet.
You could nitpick a rebuttal, but no matter how many people you give credit to, general relativity was a completely novel idea when it was proposed. I'd argue for special relativity as well.
> In 1902, Henri Poincaré published a collection of essays titled Science and Hypothesis, which included: detailed philosophical discussions on the relativity of space and time; the conventionality of distant simultaneity; the conjecture that a violation of the relativity principle can never be detected; the possible non-existence of the aether, together with some arguments supporting the aether; and many remarks on non-Euclidean vs. Euclidean geometry.
https://en.wikipedia.org/wiki/History_of_special_relativity
Now, if I had to pick a major idea that seemed to drop fully-formed from the mind of a genius with little precedent to have guided him, I might personally point to Galois theory (https://en.wikipedia.org/wiki/Galois_theory). (Ironically, though, I'm not as familiar with the mathematical history of that time and I may be totally wrong!)
As for general relativity, he spent several years working to learn differential geometry (which was well-developed mathematics at the time, but looked like abstract nonsense to most physicists). I’m not sure how he was turned on to this theory being applicable to gravity, but my guess is that it was motivated by some symmetry ideas. (It always comes down to symmetry.)
Your argument really makes the claim that since there were others pursuing similar directions, this means it is in distribution. I'll use a classic statistics-style framing. Suppose we have a bag with n red balls and p blue balls. Someone walks over and says "look, I have a green ball," someone else walks over and says "I have a purple one," and someone else comes over and says "I have a pink one!" None of those balls were from the bag we have. There are still n+p balls in our bag, and they are still all red or blue, despite there being n+p+3 balls that we know of.
I think this is probably why you don't have the resolution to see the distinctions. Without a formal study of physics it is really hard to differentiate these kinds of propositions. It can be very hard even with that education. So be careful to not overly abstract and simplify concepts. It'll only deprive you of a lot of beauty and innovation.

I only believe that (1) if it hadn't been Einstein, it would very soon have been someone else using very similar concepts and evidence, (2) "completely novel idea" is a stricter criterion than "not in distribution," and (3) better examples of completely novel ideas from history exist as a benchmark for this sort of thing.
> Without a formal study of physics it is really hard to differentiate these kinds of propositions. It can be very hard even with that education. So be careful to not overly abstract and simplify concepts. It'll only deprive you of a lot of beauty and innovation.
I agree, but with the caveat that I think ancestor worship is also an impediment to understanding our intellectual and cultural heritage. Either all of human creativity deserves to be treated sacredly, or none of it does.
I think the best way to avoid the problem is to remember "my understanding is limited" and always will be. At least until we somehow become omniscient, but I'm not counting on that ever happening.
> The quintic was almost proven to have no general solutions by radicals by Paolo Ruffini in 1799, whose key insight was to use permutation groups, not just a single permutation.
Thing is, I am usually the kind of person who defends the idea of a lone genius. But I also believe there is a continuous spectrum, no gaps, from the village idiot to Einstein and beyond.
Let me introduce, just for fun, not for the sake of any argument, another idea from math which I think it came really out of the blue, to the degree that it's still considered an open problem to write an exposition about it, since you cannot smoothly link it to anything else: forcing.
```ai-slop
But wait, this equation is too simple, I need to add more terms or it won't model the universe. Let me think about this again. I have 5 equations and I combined them and derived e=mc^2 but this is too simple. The universe is more complicated. Let's try a different derivation. I'll delete the wrong outputs first and then start from the input equations.
<Deletes files with groundbreaking discovery>
Let me think. I need to re-read the original equations and derive a more complex formula that describes the universe.
<Re-reads equation files>
Great, now I have the complete picture of what I need to do. Let me plan my approach. I'm ready. I have a detailed plan. Let me check some things first.
I need to read some extra files to understand what the variables are.
<Reads the lunch menu for the next day>
Perfect. Now I understand the problem fully, let me revise my plan.
<Writes plan file>
Okay I have written the plan. Do you accept?
<Yes>
Let's go. I'll start by creating a To Do list:
- [ ] Derive new equation from first principles making sure it's complex enough to describe reality.
- [ ] Go for lunch. When the server offers tuna, reject it because the notes say I don't like fish.
```
(You know what's really sad? I wrote that slop without using AI and without referring to anything...)
It is absolutely true that someone else would have come up with special relativity very soon after Einstein. All that would be necessary is for someone else to have the wherewithal to say "perhaps the aether does not need to exist" for the equations already known at the time by others before Einstein to lead to the full theory.
General relativity is different. Witten contends that it is entirely possible that without Einstein, we may have had to wait for the early string theorists of the 1960s to discover GR as a classical limit of the first string theories in their quest to understand the strong nuclear force.
As opposed to SR, GR is one of the most singular innovative intellectual achievements in human history. It's definitely "out of distribution" in some sense.
https://en.wikipedia.org/wiki/Prat%C4%ABtyasamutp%C4%81da https://iep.utm.edu/processp/
Edit: but even it likely relied on his prior experience with nondualistic Hinduisms, of course.
General relativity was a completely novel idea. Einstein took a purely mathematical object (now known as the Einstein tensor) and realized that, since its covariant divergence was zero, it could be equated (apart from a constant factor) to a conserved physical object, the energy-momentum tensor. It didn't just fall out of Riemannian geometry and what was known about physics at the time.
Special relativity was the work of several scientists as well as Einstein, but it was also a completely novel idea - just not the idea of one person working alone.
I don't know why anyone disputes that people can sometimes come up with completely novel ideas out of the blue. This is how science moves forward. It's very easy to look back on a breakthrough and think it looks obvious (because you know the trick that was used), but it's important to remember that the discoverer didn't have the benefit of hindsight that you have.
I'm not sure about GR, but I know that it is built on the foundations of differential geometry, which Einstein definitely didn't invent (I think that's the source of his "I assure you whatever your difficulties in mathematics are, that mine are much greater" quote because he was struggling to understand Hilbert's math).
And really Cauchy, Hilbert, and those kinds of mathematicians I'd put above Einstein in building entirely new worlds of mathematics...
"Since the mathematicians have invaded the theory of relativity, I do not understand it myself anymore."
:)
Source: https://www.newtonproject.ox.ac.uk/view/texts/normalized/THE...
And Newton was famously interested in dark religious interference in worldly affairs - what today we would call The Occult. When he did finally succeed in finding his force for moving objects at a distance, without need for an intervening body, he gave credit to these supernatural entities - at least that is how this quote was taken in his day. This religious context is not well known today, nor is Newton's difficult character, so today it is easy to take the quote out of context. Newton was (likely) not disputing the validity of his discovery; rather, he was invoking one of his passions (The Occult) in the affairs of one of his successful passions (finding a force to move distant objects).
It should be noted that some of Newton's successful religious work is rarely attributed to him. For a prominent example, it was Newton that calculated Jesus's birth to be 4 BC, not 1 AD as was the intention of the new calendar.
So that's actually 2 different regimes on how to proceed. Both are useful but arguably breaking off of the current paradigm is much harder and thus rare.
But yes, it is not yet clear to what degree there can be (non-linear) extrapolation in the learned semantic spaces here.
There are genuine creative insights that come from connecting two known semantic spaces in a way that wasn't obvious before (e.g, novel isomorphism). It is very conceivable that LLMs could make this kind of connection, but we haven't really seen a dramatic form of this yet. This kind of connection can lead to deep, non-trivial insights, but whether or not it is "out-of-distribution" is harder to answer in this case.
The process you’re describing is humans extending our collective distribution through a series of smaller steps. That’s what the “shoulders of giants” means. The result is we are able to do things further and further outside the initial distribution.
So it depends on if you’re comparing individual steps or just the starting/ending distributions.
Seriously, think about it for a second...
If that were true then science should have accelerated a lot faster. Science would have happened differently and researchers would have optimized to trying to ingest as many papers as they can.
Dig deep into things and you'll find that there are often leaps of faith that need to be made. Guesses, hunches, and outright conjectures. Remember, there are paradigm shifts that happen. There are plenty of things in physics (including classical) that cannot be determined from observation alone. Or more accurately, cannot be differentiated from alternative hypotheses through observation alone.
I think the problem is that when teaching science we generally teach it very linearly, as if things follow easily. But in reality there are constant iterative improvements that look more like a plateau, and then there are these leaps. They happen for a variety of reasons, but no paradigm shift would be contentious if it were obvious and clearly in distribution. It would always be met with the same response that typical iterative improvements are met with: "well that's obvious, is this even novel enough to be published? Everybody already knew this" (hell, look at the response to the top comment and my reply... that's classic "Reviewer #2" behavior). If it were always in distribution, progress would be nearly frictionless.

Again, in how we teach the history of science we make an error in teaching things like Galileo, as if The Church was the only opposition. There were many scientists who objected, and on reasonable grounds. It is also a mistake we continually make in how we view the world. If you're sticking with "it works," you'll end up with a geocentric model rather than a heliocentric one. It is true that the geocentric model had limits, but so did the original heliocentric model, and that's the reason it took time to be adopted.
By viewing things at too high of a level we often fool ourselves. While I'm criticizing how we teach I'll also admit it is a tough thing to balance. It is difficult to get nuanced and in teaching we must be time effective and cover a lot of material. But I think it is important to teach the history of science so that people better understand how it actually evolves and how discoveries were actually made. Without that it is hard to learn how to actually do those things yourself, and this is a frequent problem faced by many who enter PhD programs (and beyond).
And it still is. You can still lean on others while presenting things that are highly novel. These are not in disagreement.It's probably worth reading The Unreasonable Effectiveness of Mathematics in the Natural Sciences. It might seem obvious now but read carefully. If you truly think it is obvious that you can sit in a room armed with only pen and paper and make accurate predictions about the world, you have fooled yourself. You have not questioned why this is true. You have not questioned when this actually became true. You have not questioned how this could be true.
https://www.hep.upenn.edu/~johnda/Papers/wignerUnreasonableE...
Five years ago we were at Stage 1 with LLMs with regard to knowledge work. A few years later we hit Stage 2. We are currently somewhere between Stage 2 and Stage 3 for an extremely high percentage of knowledge work. Stage 4 will come, and I would wager it's sooner rather than later.
In chess, there's a clear goal: beat the game according to this set of unambiguous rules.
In science, the goals are much more diffuse, and setting those in the first place is what makes a scientist more or less successful, not so much technical ability. It's a very hierarchical field where permanent researchers direct staff (postdocs, research scientists/engineers), who direct grad students. And it's at the bottom of the pyramid where technical ability is most relevant/rewarded.
Research is very much a social game, and I think replacing it with something run by LLMs (or other automatic process) is much more than a technical challenge.
People have been downplaying LLMs since the first AI-generated buzzword garbage scientific paper made its way past peer review and into publication. And yet they keep getting better and better to the point where people are quite literally building projects with shockingly little human supervision.
By all means, keep betting against them.
IOW respect the trend line.
And the same practitioners said, right after Deep Blue, that Go is NEVER gonna happen. Too large. The search space is just not computable. We'll never do it. And yeeeet...
The LLMs are very fast but the code they generate is low quality. Their comprehension of the code is usually good but sometimes they have a weightfart and miss some obvious detail and need to be put on the right path again. This makes them good for non-experienced humans who want to write code and for experienced humans who want to save time on easy tasks.
I think the latest generation of LLMs with Claude Code is not low quality. It's better than the code that pretty much every dev on our team can write, outside of very narrow edge cases.
Probably not something that the average GI Joe would be able to prompt their way to...
I am skeptical until they show the chat log leading up to the conjecture and proof.
Was the initial conjecture based on leading info from the other authors or was it simply the authors presenting all information and asking for a conjecture?
Did the authors know that there was a simpler means of expressing the conjecture and lead GPT to its conclusion, or did it spontaneously do so on its own after seeing the hand-written expressions?
These aren't my personal views, but there is some handwaving about the process that reads as if this was all spontaneous involvement on GPT's end.
But regardless, a result is a result so I'm content with it.
https://www.linkedin.com/in/alex-lupsasca-9096a214/
SpaceX can use an optimization algorithm to hoverslam a rocket booster, but the optimization algorithm didn't really figure it out on its own.
The optimization algorithm was used by human experts to solve the problem.
LLMs surpassed the average human a long time ago IMO. When LLMs fail to measure up to humans, it's that they fail to measure up against human experts in a given field, not the Average Joe.
We are surrounded by NPCs.
Is this so different?
I know we've been primed by sci-fi movies and comic books, but like pytorch, gpt-5.2 is just a piece of software running on a computer instrumented by humans.
>I know we've been primed by sci-fi movies and comic books, but like pytorch, gpt-5.2 is just a piece of software running on a computer instrumented by humans.
Sure
Do you really want to be treated like an old PC (dismembered, stripped for parts, and discarded) when your boss is done with you (i.e. not treated specially compared to a computer system)?
But I think if you want a fuller answer, you've got a lot of reading to do. It's not like you're the first person in the world to ask that question.
Not an uncommon belief.
Here you are saying you personally value a computer program more than people
It exposes a value that you personally hold and that's it
That is separate from the material reality that all this AI stuff is ultimately just computer software... It's an epistemological tautology in the same way that say, a plane, car and refrigerator are all just machines - they can break, need maintenance, take expertise, can be dangerous...
LLMs haven't broken the categorical constraints - you've just been primed to think such a thing is supposed to be different through movies and entertainment.
I hate to tell you but most movie AIs are just allegories for institutional power. They're narrative devices about how callous and indifferent power structures are to our underlying shared humanity
(In the hands of leading experts.)
The humans put in significant effort and couldn’t do it. They didn’t then crank it out with some search/match algorithm.
They tried a new technology, modeled (literally) on us as reasoners, that is only just becoming able to reason at their level, and it did what they couldn't.
The fact that the experts were critical context for the model doesn't make the model's performance any less significant. Collaborators always provide important context for each other.
What's the distinction between "first principles" and "existing things"?
I'm sympathetic to the idea that LLMs can't produce path-breaking results, but I think that's true only for a strict definition of path-breaking (one that is quite rare for humans too).
I can claim some knowledge of physics from my degree: typically the easy part is coming up with complex, dirty equations that work under special conditions; the hard part is the simplification into something elegant, 'natural' and general.
Also "LLM’s can make new things when they are some linear combination of existing things"
That doesn't really mean much: to talk about a linear combination of things, you first have to define precisely what a thing is.
Over long periods of time, checklists are the biggest thing, so the LLM can track what's already done and what's left. After a compact, it can pull the relevant stuff back up and make progress.
Having some level of hierarchy is also useful: requirements, high-level designs, low-level designs, etc.
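A minimal sketch of what that might look like in practice, assuming a simple JSON checklist file the agent re-reads after each compaction (the file name and task names below are just illustrative, not any particular tool's convention):

    import json
    from pathlib import Path

    CHECKLIST = Path("checklist.json")

    def load_checklist():
        # Re-read the persisted checklist; seed it with the plan hierarchy if absent.
        if CHECKLIST.exists():
            return json.loads(CHECKLIST.read_text())
        return [
            {"task": "write requirements", "done": False},
            {"task": "write high-level design", "done": False},
            {"task": "write low-level design", "done": False},
            {"task": "implement and test", "done": False},
        ]

    def mark_done(task_name):
        items = load_checklist()
        for item in items:
            if item["task"] == task_name:
                item["done"] = True
        CHECKLIST.write_text(json.dumps(items, indent=2))

    def next_task():
        # After a compaction, the agent pulls this back up to see what's left.
        return next((i["task"] for i in load_checklist() if not i["done"]), None)

    mark_done("write requirements")
    print(next_task())  # -> "write high-level design"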
Agree with this. I’ve been trying to make LLMs come up with creative and unique word games like Wordle and Uncrossy (uncrossy.com), but so far GPT-5.2 has been disappointing. Comparatively, Opus 4.5 has been doing better on this.
But it’s good to know that it’s breaking new ground in Theoretical Physics!
The real question is, what does it cost OpenAI? I'm pretty sure both their plans are well below cost, at least for users who max them out (and if you pay $200 for something then you'll probably do that!). How long before the money runs out? Can they get it cheap enough to be profitable at this price level, or is this going to be a "get them addicted, then jack up the price" kind of strategy?
Compute costs will fall drastically for existing models
But it's likely that frontier models of the future won't be released to the public at all, because they'll be too good
But it's worth thinking more about this. What gives humans the ability to discover "new things"? I would say it's due to our interaction with the universe via our senses, and not due to some special powers intrinsic to our brains that LLMs lack. And the thing is, we can feed novel measurements to LLMs (or, eventually, hook them up to camera feeds to "give them senses")
It seems to me that all “new ideas” are basically linear combinations of existing things, with exceedingly rare exceptions…
Maybe Godel’s Incompleteness?
Darwinian evolution?
General Relativity?
Buddhist non-duality?
We're talking about significant contributions to theoretical physics. You can nitpick but honestly go back to your expectations 4 years ago and think — would I be pretty surprised and impressed if an AI could do this? The answer is obviously yes, I don't really care whether you have a selective memory of that time.
One way I gauge the significance of a theory paper is by the measured quantities and physical processes it would contribute to. I see none discussed here, which should tell you how deep into the math it is. I personally would not have stopped to read it on my arXiv catch-up.
https://arxiv.org/list/hep-th/new
Maybe to characterize it better, physicists were not holding their breath waiting for this to get done.
Whoever wrote the prompts and guided ChatGPT made significant contributions to theoretical physics. ChatGPT is just a tool they used to get there. I'm sure AI-bloviators and pelican bike-enjoyers are all quite impressed, but the humans should be getting the research credit for using their tools correctly. Let's not pretend the calculator doing its job as a calculator at the behest of the researcher is actually a researcher as well.
How much precedent is there for machines or tools getting an author credit in research? Genuine question, I don't actually know. Would we give an author credit to e.g. a chimpanzee if it happened to circle the right page of a text book while working with researchers, leading them to a eureka moment?
For a datum of one, the mathematician Doron Zeilberger gives credit to his computer Shalosh B. Ekhad on select papers.
https://medium.com/@miodragpetkovic_24196/the-computer-a-mys...
https://sites.math.rutgers.edu/~zeilberg/akherim/EkhadCredit...
https://sites.math.rutgers.edu/~zeilberg/pj.html
https://en.wikipedia.org/wiki/F._D._C._Willard
https://en.wikipedia.org/wiki/Yuri_Knorozov
That example usually comes up, and usually with some support.
I have no problem with the former and agree that authors/researchers must note when they use AI in their research.
For this particular paper, it seems the humans were stuck, and only the AI's thinking unblocked them.
In your eyes maybe there's no difference. In my eyes, big difference. Tools are not people, let's not further the myth of AGI or the silly marketing trend of anthropomorphizing LLMs.
Well, what do you think? Do the authors (or a single symbolic one) of pytorch or numpy or insert <very useful software> typically get credits on papers that utilize them heavily? Clearly these prominent institutions thought GPT's contribution significant enough to warrant an OpenAI credit.
>Would we give an author credit to e.g. a chimpanzee if it happened to circle the right page of a text book while working with researchers, leading them to a eureka moment?
Cool Story. Good thing that's not what happened so maybe we can do away with all these pointless non sequiturs yeah ? If you want to have a good faith argument, you're welcome to it, but if you're going to go on these nonsensical tangents, it's best we end this here.
I don't know! That's why I asked.
> Clearly these prominent institutions thought GPT's contribution significant enough to warrant an OpenAI credit.
Contribution is a fitting word, I think, and well chosen. I'm sure OpenAI's contribution was quite large, quite green and quite full of Benjamins.
> Cool Story. Good thing that's not what happened so maybe we can do away with all these pointless non sequiturs yeah ? If you want to have a good faith argument, you're welcome to it, but if you're going to go on these nonsensical tangents, it's best we end this here.
It was a genuine question. What's the difference between a chimpanzee and a computer? Neither are humans and neither should be credited as authors on a research paper, unless the institution receives a fat stack of cash I guess. But alas Jane Goodall wasn't exactly flush with money and sycophants in the way OpenAI currently is.
If you don't read enough papers to immediately realize it is an extremely rare occurrence, then what are you even doing? Why are you making comments like you have the slightest clue what you're talking about, including insinuating the credit was what... the result of bribery?
You clearly have no idea what you're talking about. You've decided to accuse prominent researchers of essentially academic fraud, with no proof, because you got butthurt about a credit. You think your opinion on what should and shouldn't get credited matters? Okay.
I've wasted enough time talking to you. Good Day.
― C.S. Lewis, The Last Battle
— Carl Sagan
I have no real way to demonstrate that I'm telling the truth, but I am ¯\_(ツ)_/¯
[0]: https://slatestarcodex.com/2019/02/19/gpt-2-as-step-toward-g...
Aren't most new things linear combinations of existing things (up to a point)?
But I’ve successfully made it build me a great poker training app, a specific form of app that also didn’t exist, but whose ingredients are well represented on the internet.
And I’m not trying to imply AI is inherently incapable, it’s just an empirical (and anecdotal) observation for me. Maybe tomorrow it’ll figure it out. I have no dogmatic ideology on the matter.
If all ideas are recombinations of old ideas, where did the first ideas come from? And wouldn't the complexity of ideas be thus limited to the combined complexity of the "seed" ideas?
I think it's fairer to say that recombining ideas is an efficient way to quickly explore a very complex, hyperdimensional space. In some cases that's enough to land on new, useful ideas, but not always: A) the new, useful idea might be _near_ the area you land on, but not exactly at it; B) there are whole classes of new, useful ideas that cannot be reached by any combination of existing "idea vectors".
Therefore there is still the necessity to explore the space manually, even if you're using these idea vectors to give you starting points to explore from.
All this to say: Every new thing is a combination of existing things + sweat and tears.
The question everyone has is whether current LLMs are capable of the latter component. Historically the answer has been _no_, because they had no real capacity to iterate. Without iteration you cannot explore. But now that they can reliably iterate, and to some extent plan their iterations, we are starting to see their first meaningful, fledgling attempts at the "sweat and tears" part of building new ideas.
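Taking the "idea vector" picture literally for a moment, here is a toy sketch of recombination plus local exploration; all the vectors are made up purely for illustration:

    import numpy as np

    rng = np.random.default_rng(1)
    existing_ideas = rng.normal(size=(5, 64))   # a handful of known "idea vectors" (made up)

    def recombine(weights):
        # A pure linear combination of existing things: always stays in their span.
        return weights @ existing_ideas

    def explore_nearby(start, step=0.1):
        # Case A above: the useful idea is near where recombination lands,
        # so a small step off the span (the "sweat and tears") can reach it.
        return start + step * rng.normal(size=start.shape)

    candidate = recombine(np.array([0.5, 0.2, 0.1, 0.1, 0.1]))
    refined = explore_nearby(candidate)
    print(np.linalg.norm(refined - candidate))  # how far the exploration moved us off the starting point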
Any countable group is a quotient of a subgroup of the free group on two elements, iirc.
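For what it's worth, the standard reasoning behind that claim (sketched here from the usual facts about free groups, not from anything in the article) goes roughly like this:

    % Every countable group G is a quotient of a subgroup of F_2:
    % (1) G is countable, so it is a quotient of a free group of countable rank:
    %         G \cong F_\infty / N
    % (2) F_\infty embeds in F_2 = <a, b>, e.g. as the subgroup generated by
    %     the elements b^n a b^{-n} for n >= 0, which is free of infinite rank.
    % Composing (1) and (2): G is a quotient of a subgroup of F_2.
    \[
        G \;\cong\; F_\infty / N, \qquad F_\infty \hookrightarrow F_2
        \;\Longrightarrow\; G \text{ is a quotient of a subgroup of } F_2 .
    \]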
There’s also the concept of “semantic primes”. Here is a not-quite-correct oversimplification of the idea: suppose you go through the dictionary and, one word at a time, pick a word whose definition includes only other words that are still in the dictionary, and remove it. You can also rephrase definitions before doing this, as long as the meaning stays the same. Suppose you do this with the goal of leaving as few words in the dictionary as you can. In the end, you should be left with a small cluster of a bit over 100 words, in terms of which all the words you removed can be indirectly defined. (The idea of semantic primes also says that there is such a minimal set which translates essentially directly* between different natural languages.)
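A minimal sketch of that greedy reduction, using a tiny made-up dictionary (the words and definitions below are placeholders, not real semantic-prime data):

    # Toy greedy reduction of a dictionary to its "prime" words.
    toy_dictionary = {
        "good":   set(),                 # treated as primitive in this toy example
        "not":    set(),
        "want":   set(),
        "very":   set(),
        "bad":    {"not", "good"},
        "desire": {"want"},
        "crave":  {"want", "very"},
    }

    def reduce_dictionary(dictionary):
        # Repeatedly remove any word whose definition uses only words that
        # would still remain; what survives approximates the irreducible core.
        words = dict(dictionary)
        changed = True
        while changed:
            changed = False
            for word, definition in list(words.items()):
                remaining = set(words) - {word}
                if definition and definition <= remaining:
                    del words[word]
                    changed = True
        return set(words)

    print(reduce_dictionary(toy_dictionary))   # -> {'good', 'not', 'want', 'very'}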
I don’t think that says that words for complicated ideas aren’t like, more complicated?
Ideas seem to just be our abstractions of neural impulses from deep in evolution.
There are in fact ways to directly quantify this, if you are training e.g. a self-supervised anomaly-detection model.
Even with modern models not trained in that manner, looking at e.g. cosine distances of embeddings of "novel" outputs could conceivably provide objective evidence for "out-of-distribution" results. Generally, the embeddings of out-of-distribution outputs will have a large cosine (or even Euclidean) distance from the typical embedding(s). The catch is that most "out-of-distribution" outputs will be nonsense/junk, so searching for weird outputs isn't really helpful in general if your goal is useful creativity.
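A rough sketch of that heuristic (the embeddings below are random placeholders standing in for a real embedding model's outputs):

    import numpy as np

    def cosine_distance(a, b):
        return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

    rng = np.random.default_rng(0)
    # Embeddings of in-distribution ("typical") outputs; random placeholders here.
    typical_embeddings = rng.normal(size=(1000, 256))
    centroid = typical_embeddings.mean(axis=0)

    def novelty_score(embedding):
        # Larger distance from the typical centroid = more "out of distribution".
        # As noted above, a high score alone doesn't mean the output is useful:
        # most far-out embeddings correspond to junk.
        return cosine_distance(embedding, centroid)

    candidate = rng.normal(size=256)
    print(novelty_score(candidate))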
Thanks for the summary, but this is a huge hand-wave. Was GPT Pro just spinning for 12 hours before it returned 42?!
I heard this from people who know more than me
For some extra context, pre-training is ~1/3 of the training; it's where the model gains the basic concepts of how tokens go together. Mid and late training are where you instill the kinds of anthropic behaviors we see today. I expect pre-training to increasingly become a lower percentage of overall training, putting aside any shifts in what happens in each phase.
So to me, it is plausible they can take the 4.x pre-training and keep pushing in the later phases. There are a lot of results out there showing scaling laws (limits) have not peaked yet. I would not be surprised to learn that Gemini 3 Deep Research was 50% late-training / RL.
If you already have a good pre-train, it's not likely much has changed in the data since a year ago that would create meaningful differences at this phase (architecture is a different story, and I know less there). If it is indeed true, it's a datapoint to add to the others signaling internal struggles (everybody has some amount of this; it's just not good when it makes the headlines).
Distillation is also a powerful training method. There are many ways to stay with the pack without new pre-training runs; it's pretty much what we see from all of them with the minor versions. So, coming back to it, the speculation is that OpenAI is still on their 4.x pre-train, but that doesn't impede all progress.