On top of that, you will still be heavily quantized.
You can cluster these beasts too. Two and three (with two IP subnets) is fairly obvious. Four or more might need a switch depending on how much network latency affects things.
Apple seem to have forgotten about M series with gobs of RAM. I can't get the Apple shop to show more than 96GB of unified RAM and that costs a kidney.
If you are training and doing research it's great, and if you want to cluster them it can't be beat, but if you just want local inference on a single box, buy a Mac or even a Strix Halo device.
You might also notice that your Spark has a pair of QSFP28 or DD (not sure yet) type interfaces as well as the 10Gb/s ethernet - that network card is a right old beast and adds quite a lot to the cost. It is capable of either 200 or 400Gb/s and can be split into two lots of four. Your mate's Mac probably has a wifi connection and is too cool for ethernet 8)
That NIC is there for a good reason - the Spark wants some friends to cluster with and you will absolutely spank any Mac when you spaff Mac style money on say three of these beasts and some cables and cluster them up. If you want four or more, you will need a switch and Mikrotik and others have them.
Casual "tokens per second" in AI is a bit like gamers whittering on about "ping" when they are using TCP and UDP for their games. ICMP request/response is a handy way of testing network paths and can give some indications towards potential performance limitations.
I'm a Linux guy, but I also don't always have a lot of time. The Spark comes out of the box with a nice Linux distro that's pre-configured to be easy to set up, and the guides and online resources make getting up and running trivial, even for some complex tasks. You would have to do a LOT of tinkering just to figure out some of the things the NVIDIA resources walk you through natively. They have guides for a ton of stuff that include the optimal settings so you don't have to figure it all out through trial and error.
Check out these "playbooks" for some examples. [0] There's a lot to be said for not having to piece all that together yourself.
[0] https://build.nvidia.com/spark
I think the time between unboxing mine, setting it up to run headless, and generating tokens was about 20 minutes total.
Only the M3 Ultra really beats it, and once you start scoping out the cost of a M3 Ultra with 128GB or 256GB, the DGX Spark doesn’t look bad after all.
I see ~274 GB/sec for the DGX Spark[1], versus 307 GB/sec for M5 Pro and 460 or 614 GB/sec for M5 Max[2]. One might call 90% "basically the same", but there are nominally two tiers above "Pro".
Yes, a MacBook Pro with 128 GB and M5 Max costs $5100 (14") or $5400 (16") versus currently $4700 for the DGX Spark, but the MBP includes keyboard, mouse, battery and portability. I believe its prefill is slower and you get 2 TB vs 4 TB SSD, but overall one gives up a lot to save 10% of the cost.
[1]- https://docs.nvidia.com/dgx/dgx-spark/hardware.html [2]- https://support.apple.com/en-us/126319
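For anyone who wants to check the "90%" and "10%" figures above, here is a minimal sketch of the arithmetic. It only uses the bandwidth and price numbers quoted in the comment; the dictionary keys and the helper name `ratio_pct` are made up for illustration, and nothing here is a measurement.

```python
# Numbers as quoted in the comment above (GB/s and USD); not measured values.
bandwidth_gb_s = {
    "DGX Spark": 274,
    "M5 Pro": 307,
    "M5 Max (lower tier)": 460,
    "M5 Max (upper tier)": 614,
}
price_usd = {
    "DGX Spark": 4700,
    "MBP 14in M5 Max 128GB": 5100,
    "MBP 16in M5 Max 128GB": 5400,
}


def ratio_pct(a, b):
    """Return a as a percentage of b, rounded to one decimal place."""
    return round(100 * a / b, 1)


spark_bw = bandwidth_gb_s["DGX Spark"]
for name, bw in bandwidth_gb_s.items():
    if name != "DGX Spark":
        print(f"Spark bandwidth is {ratio_pct(spark_bw, bw)}% of {name}")

spark_price = price_usd["DGX Spark"]
for name, price in price_usd.items():
    if name != "DGX Spark":
        print(f"Spark price is {ratio_pct(spark_price, price)}% of {name}")
```

On these inputs the Spark comes out at roughly 89% of the M5 Pro's bandwidth and under half of the upper M5 Max tier, while costing roughly 87-92% of the MacBook Pro configurations mentioned.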
Apple could actually be a good deal and you folks would still make up some reason it isn't justified. In a way, it's amazing what Apple has accomplished: a baseless, negatively tainted perception in certain influential tech circles.
(To be fair, they’re kind of earning it. I’m glad Tim “Sweet T” Cook is departing.)
Plus, my original comment got downvoted despite being factually-correct. Thanks, Reddit. Oh, wait…
The Spark can fine-tune models in 1/4 the time and excels at other compute tasks in ways that a Mac never can. Plus the high-bandwidth ConnectX-7 ports would be like $1700 to buy on a card just for the network adapters... But for generating tokens, it just plain loses.
(Still potentially very useful! But not magically ultra fast.)
Again I’m not saying you should trust an American company necessarily more than a Chinese one, but as an American, I probably can.
So are the 96% of us humans that aren't USians.
Also had to do an Intel build, and there was no way we were going CUDIMM at current prices. =3
Haven't they already proven to be extremely useful? In some areas they are definitely here to stay, coding/software and search (retrieve and summarize information). There's a bunch of places where they are surely shoehorned in, overhyped, and don't belong, but there's also equally many places where they might still be transformative but aren't used yet.
But overall I think the technology is well proven.
Besides, the marketplace is still in its infancy for LLMs, with a lot of unanswered questions. Many of those questions concern the commercial viability of frontier models running in bespoke hyperscaler data centers, which have limited use outside of LLMs should those economics turn out to be non-viable. Since that's where the memory is being tied up, it's a critical question to answer in order to determine long-term investment needs in further memory fabrication.
Most certainly not. The accuracy issues mean that they can't really be used effectively for coding or search, the two things you mentioned.
The problem is that while one of these GPUs is a huge improvement over a laptop or a single 3090, you very quickly wish you had more. I would buy a second one, but I did the math and realized that with the current crop of models, 2 Blackwells don't buy me any new capability that I didn't have with one. So I would need a 3rd one. And when I buy a 3rd one I will feel like I want to run a higher quant, so then I will want a 4th.
Also, the 4-bit quants of MiniMax 2.7 will run at 100 tps or so with two cards, which is pretty decent. It doesn't go any faster at all with 4 GPUs from what I've seen, so if you don't actively need 384 GB of VRAM, 2x RTX6000 is a good place to be.
1. "Retail" does not have enough purchasing power to have all of these "bags" unloaded on to.
2. Institutions buy shares in public firms post-IPO all the time, even when they're supposedly "unloading bags onto retail". Take Uber (random example): ~83% is owned by institutions.
3. General factual history of the stock market shows that you are incorrect. Successful companies that IPO and continue to do business still have quite a lot of room left to grow. What was Google's market capitalization at IPO? What is it now? Is it possible some early investors made higher multiples than the IPO -> May 20th valuation? Yea for sure. That doesn't mean that all the value was captured. It also doesn't take into account the early stage risk for investing. Is Google an "at this point IPO"? No, but the principle is the same.
It's also worth mentioning however that the number of IPOs is going down over time. You could maybe argue that the only ones that actually IPO are all the bags, but that seems like a stretch.
These cynical comments "IPOs are mainly for unloading bags on to retail" lack explanatory power and data.
A wise man once said: "if you're given an opportunity to cut an amazing deal and you can't tell who's getting screwed, then it's probably you"
0: https://pestakeholder.org/news/trump-admin-bails-out-private...
What is absolutely true? I'm not sure specifically what you are referring to.
> Just look at how private equity is now getting access to public markets and retirement accounts[0].
Nobody forces you to reallocate your Vanguard Total Stock Market Index Fund or wherever you have your retirement assets into a new Apollo fund.
Secondarily, we should treat people like adults and allow them to make their own investment decisions.
Institutions merely owning a newly-IPO'd stock means nothing. They get access to shares at a reasonable price before opening while retail is buying at insane prices after open. See Figma as an example where institutional investors got it at $33/share and it ended the IPO day at $115/share with retail buying all the way up (including pops above that at like $127)
I thought it was common knowledge that IPOs are a way for insiders and early investors (not IPO flippers) to get a nice exit during the frenzy.
Probably not. Do you understand however that your comment does not make sense in the context of my comment?
> Institutions merely owning a newly-IPO'd stock means nothing. They get access to shares at a reasonable price before opening while retail is buying at insane prices after open. See Figma as an example where institutional investors got it at $33/share and it ended the IPO day at $115/share with retail buying all the way up (including pops above that at like $127)
It also doesn't mean nothing - you have to go and analyze any given stock to make these kinds of claims on a per-IPO/equity basis. You also are ignoring traders and trading algorithms run by... big institutions and trading firms, and you're not accounting for volume or accounting for post-IPO purchases nor breaking those down by segment. In other words, you're just making stuff up.
Earlier-stage investors take risk and are rewarded for that. Most companies go bankrupt and folks lose their principal. For the companies that are successful, yeah, some go bust after IPO - so what? Are you against public markets or something? That would at least be an interesting discussion.
Google IPO'd in 2004 and returned from what I'm reading about 6,500% after IPO (and this was in 2024, so the gains have gone up much higher since then) and all of that was the bags dumped on retail. If someone wants to dump their 6,500% return on me I'll take them up on that all day every day and twice on Sunday.
Being cynical is a recipe for poverty.
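The return figures thrown around in this exchange can be put on a common footing with a few lines of arithmetic. This is only a sketch using the numbers quoted above (Figma at $33 and $115, Google at about 6,500% between 2004 and 2024); the function names are invented for illustration, and the 20-year holding period is an assumption read off those two dates.

```python
def pct_gain(start_price, end_price):
    """Percentage gain from start_price to end_price."""
    return 100 * (end_price - start_price) / start_price


def total_return_multiple(gain_pct):
    """Convert a percentage gain into a multiple of the starting value."""
    return 1 + gain_pct / 100


def annualized_rate(multiple, years):
    """Compound annual growth rate implied by a total multiple over `years`."""
    return multiple ** (1 / years) - 1


# Figma, as quoted: $33 IPO allocation price vs. $115 first-day close.
figma_day_one = pct_gain(33, 115)

# Google, as quoted: about 6,500% from the 2004 IPO to 2024 (assumed 20 years).
google_multiple = total_return_multiple(6500)
google_cagr = annualized_rate(google_multiple, 20)

print(f"Figma first-day gain over the IPO price: {figma_day_one:.1f}%")
print(f"Google multiple since IPO: {google_multiple:.0f}x")
print(f"Google implied annualized return: {100 * google_cagr:.1f}%")
```

Both sides of the argument fit these numbers: the first-day gap between the allocation price and the close is large, and so is the compounded return available to someone who bought after the IPO and held.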
I already saw an article recently about how to set up a business domain which can reliably show up in a search result and dump overly positive reviews into anyone's context.