GPT‑NL: a sovereign language model for the Netherlands
52 points by root-parent 3 hours ago | 40 comments

stared 26 minutes ago
I feel that not only is Europe losing its independence to the US and China, but it does not even try to take part in the race.

Unlike the US, Europe has no California-level VCs. I don't expect hundreds of billions of Euros to be poured into long-shot projects.

Unlike China, Europe has neither cohesive public investment at the global level nor the drive to grow. Long-term investments have a lot of words, a lot of regulations, a lot of proxy goals, but there is neither a lot of money nor urgency. It was captured by this post: https://x.com/piotrsankowski/status/2065795919623438546

So yeah, both in economy and warfare, Europe dooms itself to be in the hands of the US, China, or a mix of both.

reply
eightysixfour 3 minutes ago
I'll play devil's advocate a little bit - I'm not sure it is losing its "independence" by not taking part in the race. It could very well be that it is gaining independence from tech and choosing a "second mover advantage" to decide how it gets deployed after seeing how it impacts everyone else. Let the US and China experiment on the bleeding edge (and their citizens feel the effect, both good and bad), and then be picky about how you use it.

I don't know if it is the right strategy but there's certainly a legitimate strategy in there.

reply
creesch 19 minutes ago
> Unlike the US, Europe has no California-level VCs.

Some would consider that a good thing. There is a lot to be said for VC in recent years not being beneficial for the economy, certainly on an individual level, other than "number go up".

reply
stared 14 minutes ago
Sure.

At the same time, in many cases it made the EU dependent on the US. A lot of governments are basically dependent on MS Office or Google Cloud.

With AI, it is even more strategic.

reply
ews 22 minutes ago
Europe decided to regulate the hell out of foreign AI instead of investing in its own systems. It's sad to see the European continent lost the race to create a decent startup ecosystem (no decent search engines, social networks, cloud, mobile OS) and now it seems to be hell-bent on losing this battle too.
reply
joe_mamba 20 minutes ago
>It's sad to see the European continent lost the race to create a decent startup ecosystem

What's ironic and sad at the same time is that pre-2022 Russia's Yandex (the domestic Russian variant of Google) was light-years ahead of what the EU, a significantly richer and more capable bloc, had. Same for Israel: their tech sector is probably greater than the EU's combined.

Absolutely shameful how the EU kept managing to snatch defeat from the jaws of victory over and over.

reply
surgical_fire 9 minutes ago
Europe is not a country.

Regulations are not even uniform across the 27 member states. Each country is relatively small on the world stage.

Until the EU progresses towards federalization, discussing this is moot.

reply
dwa3592 30 minutes ago
I don't understand countries (especially governments) wanting to have their own models when there are already pretty solid open source (weights) models out there.

Countries should want control over _where_ the compute is happening rather than _what code_ is running.

What's wrong with a country hosting a Kimi, Qwen or GPT-Oss on their hardware for their government work purpose?

reply
Achterlangs 28 minutes ago
It is not about the country but the language. Most LLMs have poor or no support for Dutch.
reply
tgv 19 minutes ago
Idk which models you refer to, but I tested a bunch recently, and they performed well on Dutch. Only the smallest, such as qwen 3.6 27B, made up words and switched languages.
reply
applfanboysbgon 26 minutes ago
Why should Dutch people be expected to make do with models 99% trained on American/Chinese cultural context and language?
reply
dwa3592 21 minutes ago
Understood, but they could fine-tune base models on their own cultural context and language. Why reinvent the wheel?
reply
DonHopkins 12 minutes ago
They could apply the Polder Model of consensus decision making with a mixture of experts.

https://en.wikipedia.org/wiki/Polder_model

reply
applfanboysbgon 17 minutes ago
This gets better short-term results for a fraction of the cost, for sure, but what do you do when China places an export control banning the release of open-weight models? If you don't have your own talent, you're then relegated to using a base model from 2026 or whatever the cutoff date is, forever. That defeats the purpose of a 'sovereign' model made for and by your people.
reply
joe_mamba 27 minutes ago
>Countries should want control over _where_ the compute is happening

Yeah but Europe doesn't build any computer hardware, and EU Green eco-communists and NIMBYs don't want to have data centers built in their backyard, so the only way left for EU consultancies to milk taxpayer money for the AI bubble is shipping a sovereign AI model for each country/language.

Watch out US tech sector, we're coming for you. Feel our wrath.

reply
dwa3592 20 minutes ago
>>Yeah but Europe doesn't build any computer hardware,

Well, then this will be a good start.

reply
joe_mamba 12 minutes ago
EU bureaucrats are too busy trying to keep the welfare/pension system from collapsing, defeating Russia, supporting Ukraine, managing the fossil fuels energy shortages, figuring out how to nerf Chinese EVs while supporting domestic car companies, and restricting social media free speech to make sure the "far right" don't win elections.

Semiconductor manufacturing sovereignty is very low on their priority list.

reply
davedx 19 minutes ago
Have you heard of ASML? NXP?

Ignorant comment

reply
joe_mamba 17 minutes ago
Please don't move the goalposts. What computer parts do ASML or NXP make?

ASML only makes the lithography machines, 85% of which go outside the EU (Let that sink in). And then fabs in Taiwan, Korea or the US use those ASML machines to etch US IP for computer chips. EU doesn't make any computer parts domestically.

And NXP mostly makes various microcontrollers and small chips, not high-margin datacenter-centric parts like ASICs, FPGAs, CPUs or GPUs.

So not only are you the ignorant one here, but you also have the audacity to insult others with so much confidence.

@dwa3592 below. Firstly, why are you moving the goalposts in bad faith again? What does that have to do with my original comment? Are ASML machines computer parts?

And secondly, there are other lithography machines out there, not just ASML's.

And thirdly, the IP that Nvidia, AMD, etc. develop to etch on silicon via ASML machines makes them more valuable than ASML.

Fourthly, repeating my "let that sink in" is childish and low-IQ trolling unworthy of this platform.

reply
dwa3592 8 minutes ago
>>ASML only makes the lithography machines

Woah! Only lithography machines???? It is literally impossible to make any device capable of running anything close to AI without ASML. Let that sink in.

reply
rollulus 55 minutes ago
Interesting that this got posted now: the project has been met with increasing skepticism lately in the Dutch tech scene [0], and I think that’s fully justified.

[0]: https://www.quotenet.nl/zakelijk/a71588202/techondernemers-m...

reply
embedding-shape 44 minutes ago
What is the exact skepticism? The only thing I could get from that was from some "tech entrepreneur":

> GPT-NL was never built to compete with Claude or ChatGPT. It was trained exclusively on licensed data, and is intended more for governments and companies where privacy and compliance matter more than raw performance.”

That's it? That it didn't aim to compete with SOTA models? Maybe this is something where you have to start small and then ramp up, rather than do what only a select few labs have been able to do and start with really big models. Especially if you're resource-constrained, which, since this is a government project, I really hope for the sake of the taxpayers it was.

reply
barrenko 30 minutes ago
I mean, if you are wasting funds while kind of knowing it's nowhere near remotely competitive, then it's kind of a fraud.
reply
athrowaway3z 7 minutes ago
TNO is something like semi-DARPA. It gets a lot of stuff tax free and a lot of gov funding, but a lot of their budget is from getting businesses to hire their R&D teams.

They do really good R&D on a lot of stuff. This is just their attempt at public credibility/internal skill building to enter the LLM business.

Doubt it's going to be successful, but they "waste" a lot more money on other things that you never heard of. It's not fraud, it's just R&D dressed up a little too much too early.

reply
embedding-shape 26 minutes ago
But why is "competing against remote SOTA models on quality" the only thing that matters here?
reply
barrenko 21 minutes ago
What the hell else is there? All the other stuff can be done by an intern with an 8 euro HF Pro subscription.

Other than actual research, which is in a different camp.

reply
embedding-shape 3 minutes ago
A common approach I've seen is to start with paid/larger/hosted models for a workflow where you don't yet know exactly how it'll behave when you first put it together. With time you lock down how things more or less work, yet you still need free-form text parsing of some kind, so you end up replacing the bigger models with carefully post-trained small models.

Besides that, there are a ton of use cases for smaller models. We're unlikely to be able to run LLMs (actually Large) on smartphones for a while, while the smaller LLMs already seem to run on-device in experiments.

reply
InsideOutSanta 21 minutes ago
Targeting a niche audience with specific requirements is not fraud.
reply
wrs 59 minutes ago
They’re building a competitive-quality model, from scratch, with fair compensation to content owners, for €13.5 million? Something’s wrong with this picture.
reply
gnegggh 10 minutes ago
I'm making a Dutch dictionary and would be interested to see how this model would fare in evals vs non-specialized ones. I've tested a variety of models for https://hetnederlands.com content and the differences can be big.
reply
HelloUsername 2 hours ago
Previously posted on 02-dec-2023 https://news.ycombinator.com/item?id=38497495 3 comments
reply
ronsor 2 hours ago
Two and a half years and still not complete? That's ridiculous.
reply
pedromlsreis 2 hours ago
AMALIA, from Portugal, going the same path!

https://en.wikipedia.org/wiki/Am%C3%A1lia_(LLM)

reply
jansenmac 2 hours ago
This is not an open source model. In that sense I think the sovereign claim is a bit strange. It's the data providers that determine access to the model.
reply
frangonf 2 hours ago
So it's a model that's sovereign as in the sovereign Kingdom of the Netherlands vs sovereign for the people?
reply
embedding-shape 51 minutes ago
"sovereign" the marketing term basically means "in-house" now, where "house" depends on who says it.
reply
stared 2 hours ago
Is it a proposal or a model? And if it is a model, how does it fare on benchmarks?
reply
simianwords 36 minutes ago
I really think countries should build a sovereign _ecosystem_ and sovereign models are an excuse to achieve it.

An ecosystem is the tribal knowledge, revolving door of talent, known processes etc.

If the end goal is to make a half-assed Dutch-speaking model, I think it won’t cut it. I don’t see anyone using it over Gemma 4b that runs on my laptop.

An ecosystem is more durable and has desirable second order effects.

reply
Marciplan 2 hours ago
Supposedly this model also aims to treat publishers of all sizes well. Looking forward to its launch soon :)
reply
adalacelove 2 hours ago
Maybe it's time to acknowledge that current copyright laws do more harm than good and put another framework in place.
reply