Pool spare GPU capacity to run LLMs at larger scale
10 points by i386 15 hours ago | 3 comments

iwinux 15 hours ago
You lost me on "spare GPU". I don't have any capable GPUs, let alone spare ones :)
vagrantJin 13 hours ago
This is very promising; it definitely looks more user-friendly than exo. Can't wait to try it out.
lostmsu 11 hours ago
> MoE models via expert sharding with zero cross-node inference traffic

This claim makes the whole project questionable: with experts sharded across nodes, tokens routed to experts on other nodes have to cross the network, so "zero cross-node inference traffic" doesn't add up unless routing is restricted or experts are replicated.
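
A toy sketch (not the project's code, and all numbers below are assumptions) of why top-k expert routing with experts sharded across nodes normally produces cross-node traffic:

    # Toy illustration: experts sharded round-robin across nodes, random
    # stand-in for the learned router. Counts how many tokens would need
    # to reach an expert hosted on a different node.
    import random

    NUM_EXPERTS = 8     # assumed MoE width
    NUM_NODES = 2       # experts sharded evenly across nodes
    TOP_K = 2           # experts activated per token
    NUM_TOKENS = 1000

    # Expert e lives on node e % NUM_NODES.
    expert_node = {e: e % NUM_NODES for e in range(NUM_EXPERTS)}

    cross_node = 0
    for _ in range(NUM_TOKENS):
        home_node = random.randrange(NUM_NODES)            # node holding the token's activations
        chosen = random.sample(range(NUM_EXPERTS), TOP_K)  # stand-in for the router's top-k choice
        if any(expert_node[e] != home_node for e in chosen):
            cross_node += 1

    print(f"{cross_node / NUM_TOKENS:.0%} of tokens needed a remote expert")

Under these assumptions most tokens touch at least one remote expert, which is why a blanket "zero cross-node traffic" claim invites skepticism.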
