Some details on the timeline are not quite precise, and would benefit from linking to a source so that everyone can verify them. For example, HyperCLOVA is listed as 204B parameters, but it seems it used 560B parameters (https://aclanthology.org/2021.emnlp-main.274/).
Allow me to contribute:
> Magistral: Magist(rate) + stral? Mag(nificent) + stral? Nobody knows.
That's just French for "masterful", or a way to describe lectures. There's a sense of greatness in that word that contrasts with the Mini in Ministral, which in turn might be a pun on "ménestrel" (minstrel), "ministre" (minister), or made to sound like Minitel (or all of the above).
Neural networks, computer vision, sentiment analysis: all of these and more have provided an untold amount of value over the years.
Models that take visual input seem more focused on identifying what is in an image than on what a human might perceive in it, and most interfaces lack any form of automated feedback mechanism that would let the model look at what it has made.
In short, I have made some fun things with AI but I still end up doing CSS by hand.
Also, I would say add apple/DCLM-7B (not as a milestone, imo), as it was kind of the first fully open model that was at least somewhat competitive with closed-data models.
It's in the timeline though? Or are you saying that one should somehow be highlighted, even though none of the other ones are? Seems it's just chronological order, with no one being more or less visible than others, as far as I can see.
This keeps bothering me: why do they need several iterations to arrive at a correct solution instead of getting it right the first time? Prompts like "repeat solving it until it is correct" don't help.
No, all the models are designed to be "helpful", but different companies see that as different things.
If you're seeing the model deliberately creating errors so you have something to fix, then that sounds like something is fundamentally wrong in your prompt.
Besides that, I'm guessing "repeat solving it until it is correct" is a condensed version of your actual prompt, or is that verbatim what you prompt the model? If so, you need to give it more details for it to actually be able to execute something like that.
I am holding it wrong?
No, all these models are just bad for anything that they weren't RLed for, and decent for things they were. Decent, because people who evaluate them aren't experts.
Are you claiming that the models are RLed to intentionally add errors to our programs when we use them, or what's the argument you're trying to make here? Otherwise I don't see how it's relevant to what I said.
Not necessarily relevant, but fun: I had the ChatGPT model correct itself mid-response while checking my math work. It started by saying that I was wrong, then it proceeded to solve the problem, and at the end it realized that I was correct.
Why not? I can definitely fire off two prompts to the same model and harness, where one includes "don't do X" and the other doesn't, and I get what I expect: the one without it didn't try to avoid doing X, and the one with it did. Is that not your experience using LLMs?
It makes sense if you remember that it just predicts what the next piece of text should probably be.
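To make the "it just predicts the next piece of text" point concrete, here is a minimal toy sketch of greedy autoregressive decoding. Everything here (the `next_token_probs` lookup table, the tiny vocabulary) is invented for illustration and is not any real model's API; the point is only that each step is a local "most likely next token" choice, so any "correctness" of the full answer has to emerge from those local predictions.

```python
def next_token_probs(context):
    # Stand-in for a real language model: a hand-written toy
    # distribution over a tiny vocabulary, keyed by the context so far.
    table = {
        (): {"The": 0.9, "A": 0.1},
        ("The",): {"answer": 0.7, "cat": 0.3},
        ("The", "answer"): {"is": 0.95, "was": 0.05},
        ("The", "answer", "is"): {"42": 0.6, "wrong": 0.4},
    }
    return table.get(tuple(context), {"<eos>": 1.0})

def greedy_decode(max_tokens=10):
    # Autoregressive loop: repeatedly ask "what token probably comes
    # next?" and append the single most probable one.
    context = []
    for _ in range(max_tokens):
        probs = next_token_probs(context)
        token = max(probs, key=probs.get)
        if token == "<eos>":
            break
        context.append(token)
    return " ".join(context)

print(greedy_decode())  # prints "The answer is 42"
```

Note that at the last real step the model assigns 0.4 probability to "wrong"; with sampling instead of greedy decoding it would sometimes emit that continuation, which is one way to think about the "mistakes" discussed above.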
Maybe I'm missing some bigger picture you're trying to paint here? I understand (and see) them making "mistakes" all the time, and I guess you could argue it's deliberate in some way, because it's simply how they work and adjusting the prompt and redoing usually solves the problem. But I'm afraid I don't see how it's connected, at least yet.
One thing I regret is that I learned very late in my children's development the value of boredom and difficult challenges. However, I think I've successfully passed these lessons on to my kids as they raise their own. I have no idea what to say about 'AI' and the rapid reconfiguration of our relationship with the world that's going to happen as a result. All I can tell them is that we're in this together and we'll try to figure it out as we go.
Good luck everybody!
There's a gag in Star Trek 4 where Scotty goes back in time and tries to talk to a computer.
The gag is funny because he is from the future, where you talk to computers normally. When the computer doesn't respond, someone hands him the mouse, and he tries to use it as a microphone.
I watched that scene with my kids recently (9 and 6).
They didn't get the gag. They thought Scotty was completely reasonable to try and talk to the computer.
It took a while to explain.
Web 2.0 broke this up into millions of creators. Generative AI produces everything on demand, but again there is a small number of (polymorphic) models producing the content.