Discussion about this post

adlrocha

On Sunday I wrote about speculative decoding, and almost immediately we got Qwen3.6 with MTP and llama.cpp support: https://huggingface.co/unsloth/Qwen3.6-27B-MTP-GGUF

I just tested it and it looks really promising. I'll report back with some numbers.
