Discussion about this post

adlrocha

I intentionally left DGX Spark out of this post, but as @0x3d mentions below, it definitely deserves a mention. As recently posted by 0xSero, two DGX Sparks (~$8k) can get you:

- 256 GB of memory

- 8 TB of storage

- 546 GB/s memory bandwidth

This would run models like:

- Deepseek-v4-flash

- MiMo-v2.5-flash fp4

- MiniMax-M2.7

- Qwen3.5-397b-reap
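A rough way to sanity-check whether a model fits in 256 GB is to multiply its parameter count by the bytes per weight. The sketch below does only that. It takes the 397B figure from the model name at face value (a pruned "reap" variant may well be smaller), and it ignores KV cache, activations, and runtime overhead, so treat it as an optimistic estimate. The function names are illustrative.

```python
def weight_footprint_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billions * bits_per_weight / 8


def fits(params_billions: float, bits_per_weight: float, memory_gb: float = 256.0) -> bool:
    """True if the weights alone fit in memory_gb (ignores KV cache and overhead)."""
    return weight_footprint_gb(params_billions, bits_per_weight) <= memory_gb


if __name__ == "__main__":
    # 397 is read off the model name; the real parameter count is an assumption.
    for bits in (4, 8, 16):
        size = weight_footprint_gb(397, bits)
        print(f"397B @ {bits}-bit: {size:.1f} GB, fits in 256 GB: {fits(397, bits)}")
```

Under these assumptions only the 4-bit case stays under 256 GB, which is consistent with the list above leaning on fp4 and pruned variants.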

As with the single RTX 3090 described above, the memory bandwidth is lower than in some of the configurations presented in the post, but according to localmaxxing.com, it can reach 30.9 tok/s on MiniMax-M2.7 (https://www.localmaxxing.com/en/models/MiniMaxAI/MiniMax-M2.7?run=cmon7v1f70007l204wtsw8ogj).
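To see why bandwidth matters so much here, a common back-of-the-envelope bound for decoding is: every generated token has to read the active weights once, so tokens per second cannot exceed bandwidth divided by bytes read per token. The sketch below encodes that bound. The 10B active-parameter figure is a made-up placeholder, not a property of any model listed here, and treating the quoted 546 GB/s as fully usable is optimistic for a two-machine setup.

```python
def decode_tok_s_upper_bound(
    bandwidth_gb_s: float,
    active_params_billions: float,
    bits_per_weight: float,
) -> float:
    """Bandwidth-bound ceiling on decode speed, in tokens per second.

    Assumes each token reads all active weights exactly once and nothing else
    (no KV cache traffic, no interconnect cost, perfect bandwidth utilization).
    """
    gb_read_per_token = active_params_billions * bits_per_weight / 8
    return bandwidth_gb_s / gb_read_per_token


if __name__ == "__main__":
    # Hypothetical model: 10B active parameters at 4 bits per weight.
    ceiling = decode_tok_s_upper_bound(546, active_params_billions=10, bits_per_weight=4)
    print(f"Theoretical ceiling: {ceiling:.1f} tok/s")
```

Measured numbers such as the 30.9 tok/s above will sit below a ceiling like this, since real runs also pay for KV cache reads, communication between the two machines, and imperfect bandwidth utilization.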

Anyway, I thought this deserved an update to complete the set of architectures presented in the post (reply to this comment if you'd rather have this update edited directly into the post; I was too lazy to do so now :) ).

Cheers!

Chris Henry

Happy to play in this sandbox with you. I’m running an M5 Pro w/ 64 GB. Went Apple Silicon for ease of use and setup, tho getting MLX-optimized tools hasn’t been as easy as I hoped.
