*counts world population*
[ ! ]
It’s so good that some people download two in case the first one breaks.
Yeah, I got a superbly functional and super fast search / research / assistant tool from Qwen 3.6 35B and Open Web UI + SearXNG. All running local. It passed the WAF benchmark with flying colors.
It’s honestly incredible how good the local stack is nowadays. It’s literally better than any frontier model you could’ve rented like a year ago.
I have a 16 GB MacBook Air M4.
I like the idea of having a model I can run locally in the event of a possible long term internet outage.
Can you recommend a model that would be suitable for my computer?
16 GB is a bit low, unfortunately. You could run a 2-bit quant of the latest Qwen, but that’s going to mean severely degraded performance. https://huggingface.co/unsloth/Qwen3.6-35B-A3B-GGUF
Might be worth trying though to see if it does what you need.
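For a rough sense of why 16 GB is tight, here's a back-of-the-envelope sketch of weight size at different quantization levels. It assumes size is simply parameters × bits per weight, and the function name is made up for illustration. Real GGUF quants mix precisions across layers, and you also need room for the KV cache, the runtime, and the OS, so treat these numbers as a floor rather than an actual file size.

```python
def estimate_weight_gb(num_params: float, bits_per_weight: float) -> float:
    """Rough size of the model weights alone, in GB (10^9 bytes).

    Assumption: every weight is stored at exactly `bits_per_weight` bits.
    Real quantized files are usually somewhat larger than this.
    """
    return num_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    params = 35e9  # the 35B model mentioned above
    for bits in (16, 8, 4, 2):
        size = estimate_weight_gb(params, bits)
        print(f"{bits:>2}-bit: ~{size:.2f} GB of weights")
```

By this estimate only the 2-bit variant leaves any meaningful headroom on a 16 GB machine, which is why the heavily quantized version is the one worth trying.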
Thanks! I figured it’s low on RAM, but with the way things are going in the world, I’m thinking maybe it’s better than nothing.
It’s entirely possible we’ll see fairly capable models that can run in 16 gigs of RAM in the near future. Qwen 3.5 came out in February, and you needed a server with hundreds of gigs of memory to run the 397B-param model. Fast forward to a couple of weeks ago, and 3.6 comes out with a 27B-param version beating the old 397B-param one in every way. Just stop and think about how phenomenal that is: https://qwen.ai/blog?id=qwen3.6-27b
So, it’s entirely possible people will find ways to optimize this stuff even further this year or the next, and we’ll get an even smaller model that’s more capable.
Thanks! That’s really amazing to hear. I guess I’ll wait a bit and see what happens.
Is it still worth using Qwen3-Coder-Next 80B? It runs slightly faster than 3.6 27B on my hardware.
I haven’t tried comparing them myself, I guess you just kind of have to gauge if it works well enough. :)
What software are you using with the models for code? OpenCode, Nanocoder, etc.?
I ended up settling on opencode, but I find all of them work more or less the same nowadays. Pi is an interesting one which is very minimalist.
A long-term internet outage is not that likely. But getting priced out of online models is quickly becoming the reality.
gemma-4-E4B-it
Why does Ollama only have a cloud version?
Because how else would they get vendor lock-in without a requirement like that?
What is the difference between running locally and running on the Qwen platform?
I don’t have a computer capable of running this, I’m just interested.
Mainly data sovereignty. Running a local model means all your data stays on your machine. Any time you use a service, you’re sending whatever the model is working on to the company. Another advantage is the price. With services you have to pay a subscription; with local models you get to run them for the price of electricity.
And electronic devices
The land of the CCP is the last place I’d expect to see FOSS AI agents. Good for them! Beats the hell out of our greedy bastards in the United States.







