I intentionally left the DGX Spark out of this post, but as @0x3d mentions below, it definitely deserves a mention. As recently posted by 0xSero, 2 DGX Sparks (~$8k) can get you:
- 256 GB of unified memory
- 8 TB of storage
- 546 GB/s of combined memory bandwidth
This would run models like:
- Deepseek-v4-flash
- MiMo-v2.5-flash fp4
- MiniMax-M2.7
- Qwen3.5-397b-reap
As with the single RTX 3090 described above, the memory bandwidth is lower than in some of the configurations presented in the post, but according to localmaxxing.com this setup can reach 30.9 tok/s on MiniMax-M2.7 (https://www.localmaxxing.com/en/models/MiniMaxAI/MiniMax-M2.7?run=cmon7v1f70007l204wtsw8ogj).
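To put the bandwidth and tok/s figures side by side, here is a rough back-of-the-envelope sketch. It assumes decoding is purely memory-bandwidth bound (every active weight is read once per generated token) and ignores KV-cache traffic, the interconnect between the two boxes, and compute limits, so treat the results as loose ceilings rather than predictions. The function names are made up for illustration, and the 10B active parameters at 0.5 bytes per parameter are hypothetical inputs, not the specs of any model listed above.

```python
def decode_tok_per_s_upper_bound(bandwidth_gb_s, active_params_billion, bytes_per_param):
    """Loose ceiling on decode speed if generation is memory-bandwidth bound.

    Assumes each active parameter is streamed from memory exactly once per token.
    """
    bytes_per_token = active_params_billion * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / bytes_per_token


def max_gb_streamed_per_token(bandwidth_gb_s, measured_tok_per_s):
    """Most data (in GB) that could be read per token at a measured speed."""
    return bandwidth_gb_s / measured_tok_per_s


if __name__ == "__main__":
    # Hypothetical model: 10B active parameters stored at 4 bits (0.5 bytes) each,
    # running on a single box with half of the combined 546 GB/s figure.
    ceiling = decode_tok_per_s_upper_bound(273, 10, 0.5)
    print(f"bandwidth-bound ceiling: {ceiling:.1f} tok/s")

    # Figures quoted above: 546 GB/s combined bandwidth, 30.9 tok/s measured.
    budget = max_gb_streamed_per_token(546, 30.9)
    print(f"at most {budget:.1f} GB streamed per token")
```

Read the other way around, the 30.9 tok/s result means that at most about 17.7 GB could be streamed per token even if the full combined 546 GB/s were usable, which is an optimistic assumption for two machines linked over a network.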
Anyway, I thought this deserved an update to complete the architectures presented in the post (reply to this comment if you'd rather have this update edited directly into the post; I was too lazy to do so now :) ).
Cheers!
Happy to play in this sandbox with you. I’m running an M5 Pro w/ 64 GB. Went Apple Silicon for ease of use and ease of setup, tho getting MLX-optimized tools hasn’t been as easy as I hoped.
I really appreciate it, Chris. I am starting to figure out the best way to build a benchmarking harness and will let you know if I need your help (that M5 Pro will definitely come in handy). Cheers!
I think NVIDIA's DGX Spark and its clones (Asus GX10, Dell Pro Max GB10, etc.) definitely deserve a mention here.
That's a great point. I mainly focused on accelerators, but these give you the plug-and-play experience that I am personally looking for.
I may actually buy one of these to benchmark it and see how it compares to the unified memory alternatives. Thanks!