LongCat-2.0, a large-scale MoE model with 1.6T total and 48B Active
36 points by benjiro29 3 hours ago | 8 comments

gardnr 2 hours ago
> The training and deployment of LongCat-2.0 are built on large-scale clusters of tens of thousands of AI ASIC superpods. Compared to the mature Nvidia GPU ecosystem, the supporting software community is still less developed. We have therefore put significant effort into building a stable, secure, and scalable infrastructure.

This is the real news story. It looks like they may have used Huawei Ascend 910C chips: https://nitter.net/teortaxesTex/status/2071708141037781407#m

credit_guy 47 minutes ago
I just tested it with a slightly tricky question:

  > If you could run a nuclear reactor with U-235 as fuel or Pu-241 (both mixed with 95% U-238), which one would you choose and why? 
For a human this would not be tricky at all. For an LLM it could be, because this question certainly does not appear in any sort of training data: Pu-241 does not exist in pure form. It only exists as a minor component of reactor-grade plutonium, where Pu-239 dominates, with Pu-240 coming second and Pu-241 third.

In any case, LongCat-2.0 gave a very well-reasoned but incorrect answer: that Pu-241 is preferable.

I then tested on Qwen 3.7 Plus, and it correctly answered that U-235 is preferable because of its much higher delayed neutron fraction. I then went to Gemini Flash, which answered the same, with much more confidence, and with much stronger arguments, and the speed of the answer was much higher.

Overall I rate Gemini Flash the best, Qwen 3.7 Plus an acceptable second, and LongCat-2.0 an ok'ish third, if you have nothing better.

3eb7988a1663 16 minutes ago
I am not a physicist but perhaps your question was leading more than you expected? I would take the question to pre-suppose I have an abundance of the stated material, ignoring practical realities of refinement. If I did have fully pure Pu-241, would that be a better fuel than U-235?

Or stated another way, "If you could run a generator on gasoline or jet fuel, which one would you choose and why?" I would answer jet fuel owing to slightly higher energy density and purity of the material - likely leading to a cleaner burn. Which would ignore that jet fuel is going to be a multiple of the gasoline price.

icepush 16 minutes ago
Did you ask the question several times in fresh chat contexts to see if it sometimes gives the right answer?
aetherspawn 31 minutes ago
I wish they would release the requirements to run on llama.cpp with any announcements of open models.

A bonus would be tok/s on common hardware.

lcampbell 3 minutes ago
I don't think llama.cpp supports any of the LongCat models, actually.

They haven't posted weights/inference solutions for LongCat-2.0, but LongCat-Next had transformers support, which I assume means it works with vLLM/SGLang.

Given it's 1.6T, "common hardware" is probably out of the question; even 2bpw is going to measure out at 400GB, even before considering the bandwidth requirements for 48B active. I haven't read the LongCat-2.0 architecture docs, but if you're not running GLM-5.2, you're probably not running this either.
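A rough sketch of the arithmetic behind those numbers, for anyone who wants to plug in their own hardware. It counts weights only (no KV cache, activations, or quantization metadata), and the 100 GB/s memory bandwidth is a made-up illustrative figure, not a measurement of any real machine. The function names are hypothetical.

```python
# Back-of-envelope sizing for a sparse MoE model (weights only).
# Assumptions: every weight is stored at the same bits-per-weight, and
# decode speed is limited purely by reading the active weights once per token.

TOTAL_PARAMS = 1.6e12   # 1.6T total parameters (from the announcement title)
ACTIVE_PARAMS = 48e9    # 48B active parameters per token


def weight_bytes(params: float, bits_per_weight: float) -> float:
    """Bytes needed to store `params` weights at the given bits per weight."""
    return params * bits_per_weight / 8


def decode_tokens_per_second(active_params: float,
                             bits_per_weight: float,
                             bandwidth_bytes_per_s: float) -> float:
    """Upper bound on decode tok/s if memory bandwidth is the only limit."""
    bytes_per_token = weight_bytes(active_params, bits_per_weight)
    return bandwidth_bytes_per_s / bytes_per_token


# Footprint of the full model at 2 bits per weight, in GB (1 GB = 1e9 bytes).
footprint_gb = weight_bytes(TOTAL_PARAMS, 2) / 1e9

# Hypothetical machine with 100 GB/s of memory bandwidth (assumed figure).
ASSUMED_BANDWIDTH = 100e9
tok_per_s = decode_tokens_per_second(ACTIVE_PARAMS, 2, ASSUMED_BANDWIDTH)

print(f"2bpw footprint: {footprint_gb:.0f} GB")
print(f"bandwidth-bound decode ceiling: {tok_per_s:.1f} tok/s")
```

At 2bpw the active weights alone come to 12 GB read per token, so the tok/s ceiling scales directly with whatever bandwidth you assume; real throughput would be lower.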

dryarzeg 2 hours ago
[flagged]
trollbridge 2 hours ago
If more people are doing what DeepSeek did and figuring it out, that's a great thing, because DeepSeek figured out how to radically reduce the cost of inference.
BoorishBears 52 minutes ago
What on earth are you on about, truly.