porphyra 20 hours ago [-]
Lots of comments are expressing skepticism about compatibility but it's pretty cool how Nvidia has the clout to convince a bunch of game publishers and creative apps to release Arm versions. Popular games like League of Legends as well as stuff like Adobe Photoshop and Premiere are getting native Arm ports.
> Over 100 Windows software providers such as Adobe, Blackmagic Design, Blender, CapCut, ComfyUI and OTOY, and game developers such as KRAFTON, NetEase, Remedy Entertainment, Riot Games and XBOX are embracing the new RTX Spark platform. [...] NVIDIA is partnering with Adobe to rearchitect Adobe Premiere and Photoshop for RTX Spark. [0]
> Gaming on Arm is finally coming of age thanks to the NVIDIA partnership. Native anti-cheat solutions from Epic and BattlEye are fully supported on the RTX Spark platform. Major developers are jumping on board, with Riot Games bringing League of Legends and Valorant natively to the architecture, alongside KRAFTON bringing PUBG Battlegrounds. [1]
Also, the Nintendo Switch is an Nvidia/Arm gaming device, so many game publishers already have some experience with the combo.
Quite a few of those already have arm ports for windows, and have since the 1st gen Snapdragon X Elite. I have the surface laptop 7 with that chip, and I remember it being made a big deal when photoshop & lightroom were ported. I believe Blender also had an arm build for windows a while ago too as did Davinci Resolve (as of 2024 I believe).
The big news is more so on the games side, which is probably where Nvidia had some pull.
I'm curious what "rearchitect for RTX Spark" means in practice though. Sounds like it's less about convincing them to make an arm build for windows, and more that they are maybe taking advantage of some hardware-specific features? If so, what does that mean for the Snapdragon X series, I wonder?
zamadatix 6 hours ago [-]
Don't read too much into the marketing speak. "Embracing" does not necessarily mean most of these companies are actually doing anything other than providing a marketing statement about how their product already works on ARM64 Windows. E.g. Photoshop has had an ARM64 version on Windows since 2020, EAC & BattlEye since 2024 & 2025 respectively. And, even then, it'd be a lot more exciting if e.g. Fortnite would actually enable ARM64 support in EAC rather than it just be supported by EAC.
Only the ones which explicitly list something, like the Riot Games mention, are really related to the device/Nvidia. The thing which really pushes this along is user adoption/market share, not big names. This device will help that, especially in the gaming space, but it's easy to get overeager and assume that, because it's from Nvidia, everyone else who has been waiting will now jump on board too.
Shitty-kitty 19 hours ago [-]
Press releases are easy. Delivering on promises, not so much.
stingraycharles 19 hours ago [-]
This thread is almost 1:1 identical to when Apple released their own silicon. This has the potential to be a worthy competitor for the Windows ecosystem, precisely because of NVidia’s moat as the grandparent pointed out.
Microsoft is pulling its weight as well, so this seems like it has a decent chance of getting industry support.
selicos 17 hours ago [-]
If you can get desktop RTX 5070 performance, oodles of (v)RAM, and minimal power usage out of a thin and light mobile device it's a win. This is change. If you can afford it.
Shitty-kitty 18 hours ago [-]
Yes, there is a chance, but it could also turn into another Itanium. Being a superior product backed by giants doesn't necessarily guarantee success.
Rohansi 18 hours ago [-]
Not sure how it's comparable to Itanium at all? ARM is not a new architecture. It's not even a new architecture for Windows.
branko_d 13 hours ago [-]
Itanium was arguably not superior. The assumption behind it (that the compiler can bring order to the chaos) was wrong, making it slower, more expensive, and less efficient than x86 in real-world scenarios.
wslh 18 hours ago [-]
That said, Apple still deserves a lot of credit. They had a 5+ year edge, especially around the vision of tightly integrating the NPU and unified memory.
stingraycharles 15 hours ago [-]
I think they have a longer lead, considering how long they’ve been making iPhone A-style processors. Migrating the desktop ecosystem to it was only the logical next step.
Shitty-kitty 17 hours ago [-]
Like gaming consoles, they calculated that unified memory would be cheaper for them in the long run. The funny thing is that while it gave them an unintended edge on local A.I., the "cheaper" calculation didn't work out so well for them.
fragmede 17 hours ago [-]
In what way hasn't it worked out for Apple? Some of their products are totally sold out.
Shitty-kitty 17 hours ago [-]
They are limiting SKUs on everything but the highest end. The RAMpocalypse is hitting high-end RAM especially hard.
russelg 15 hours ago [-]
Even on the highest end, the M3 Ultra at 512gb RAM doesn't exist anymore :'(
Actually, I went to the Mac Studio configure page on Apple.com and you can't do higher than 96GB now...
noodletheworld 12 hours ago [-]
Yeah but like, Apple put Rosetta 2 out and it was damn good.
Vendors didn’t have to do shit to support the platform, they just got better performance if they did (like factorio).
There is something of a difference between “all your stuff will still work, at comparable performance” and arm windows which (as evidenced by all the vendor promises) you can’t really currently say with prism.
I would describe prism as “surprisingly rubbish considering they had an example of how to do it right” and “your app probably doesn’t run because of drivers or some ??? compatibility thing”.
Am I misremembering? I remember being blown away by Rosetta.
Prism… yeah. Toggle the settings. Disable jit. Disable FP. …bin laptop. Get an intel laptop.
branko_d 7 hours ago [-]
I think a lot of it is down to Windows, not Prism itself.
For decades, Windows made it too easy for games and even some applications to install drivers. Windows games use drivers for anti-cheat (and historically for copy protection too). Neither Apple Rosetta nor Microsoft Prism can translate/emulate drivers, but since drivers have been much more prevalent on Windows, Windows now has a much bigger compatibility problem.
hootz 17 hours ago [-]
Will this push even more games into Linux?
BatteryMountain 12 hours ago [-]
Well, linux already runs perfectly fine on ARM chips, so it probably won't matter much. The real bottleneck is getting game studios to build arm releases of their code, which by itself is easy in normal circumstances but they often have third party code that doesn't have ports or are abandoned or hidden behind NDA's (networking code, sound processing, custom tooling etc). So ARM and Linux are not the blocking factor at all and I'm willing to bet most of the engineers working on game engines have ported them to linux/arm for fun already, they just can't release for various reasons above.
So if anything, we need to push more game studios to use open source dependencies which will make porting easier.
gargan 12 hours ago [-]
Linux is terrible on ARM, I don't agree that it runs perfectly fine at all. Try loading Ubuntu onto a Snapdragon laptop for example. It works but lots of issues eg sound, webcam quality, etc
BatteryMountain 11 hours ago [-]
Those are specific firmwares for those devices that are either closed-source binary blobs, open-source hackjobs/reverse engineered attempts, or just plain missing firmwares. The fault is not on Linux but rather on Qualcomm not releasing things for that specific SoC. Some SoCs have better support than others. ARM CPUs themselves work perfectly fine on linux.
Intel has closed things down: some wifi and webcam firmwares are poop and a massive pain to get working on newer chips (if at all). Their wifi firmwares also don't respect certain kernel overrides (which is why I replaced my Intel Wifi 7 chip with a mediatek Wifi 6 one). Blame is 100% on intel and not linux. Broadcom is also pretty bad at being a team player in this regard.
I basically recommend everyone to stick with AMD chipset & GPU's where possible, because they have mainline kernel support nailed down 95% of the time.
Again, ARM works fine, their extra firmwares for extra devices on SoC's are to blame if you struggle.
ChocolateGod 12 minutes ago [-]
> Those are specific firmwares for those devices that are either closed-source binary blobs, open-source hackjobs/reverse engineered attempts, or just plain missing firmwares
This isn't fully the reason. Linux is infamous for requiring a specific build for each SoC (and usually each board of said SoC), whereas Windows on ARM uses ACPI, which Linux doesn't support to the same level. Linux prefers the landfill-promoting approach of a device tree for each device.
BatteryMountain 11 hours ago [-]
Webcam firmware & colour grading & programming is a black art btw.
ranger_danger 9 hours ago [-]
I think you're referring to a specific SoC that's used on recent laptops which happens to have driver issues, something not specific to ARM at all. Other (usually embedded) ARM devices have been running just fine on Linux for over 20 years now.
fragmede 9 hours ago [-]
GNU/Linux has trouble, but thanks to Android (and ChromeOS), we know Linux itself specifically on ARM does actually work. Freeing those drivers is another matter, unfortunately.
pjmlp 13 hours ago [-]
Most games are Windows games running on Proton.
What would push more games would be Valve actually making it worthwhile to natively target Linux.
BoredPositron 18 hours ago [-]
You know most of them already have arm versions...
cmxch 16 hours ago [-]
So given how ARM has favored closed-ended boxes like the Switch, how is this any different than a Switch with a keyboard and Windows RT?
deadbabe 18 hours ago [-]
Somewhere, a monkey’s paw must have curled its finger.
lallysingh 19 hours ago [-]
Apple and Steam have been successfully applying pressure for years. Who's willfully staying behind at this point?
nerdjon 24 hours ago [-]
Some competition for Apple in this space and competition for Intel and AMD is great.
But I really do question how well Windows on Arm is really going to work out long term.
For Apple it worked because they were able to force the issue. If you wanted a new Mac it was going to be Arm and we all knew eventually (this year or is it next year?) Intel support would drop. Over time we have seen M series exclusive features.
Developers were forced to update or abandon Mac which gave users a great experience (with some early growing pains).
This is something that Windows will never be able to do. They will always be stuck maintaining an emulator and a likely large subset of apps only supporting one over the other. (also does this work the other way around with an Arm only app working on x86?)
This seems like a repeat of when it was not uncommon for games to only support Intel or AMD, or NVIDIA or AMD. But worse since they are not both x86. Sure, at least we have emulation, but just like with Rosetta 2 it shouldn't ever be the long term solution.
kllrnohj 23 hours ago [-]
For Apple it worked because they waited until they had a really, really good ARM ISA CPU (combined with arguably sandbagging their x86 offering for a few years prior but I digress).
Qualcomm is also working on a really good ARM ISA CPU with their acquisition of NuVia and subsequent Oryon architecture.
Meanwhile this is just using off-the-shelf ARM CPUs in a MediaTek SoC with blackwell bolted to the side of it. ARM's CPUs so far have been subpar for laptop-class chips. Hence why neither Apple nor Qualcomm are using them.
zamadatix 6 hours ago [-]
> Meanwhile this is just using off-the-shelf ARM CPUs in a MediaTek SoC with blackwell bolted to the side of it
MediaTek is involved in the SoC but both the CPU & GPU from Nvidia are bolted on to it. I.e. it's not a standard MediaTek CPU with an Nvidia GPU added.
kllrnohj 5 hours ago [-]
MediaTek's press release pretty clearly indicates the CPU came from MediaTek, and so far Nvidia doesn't have any custom CPU core they've called "Grace". Seeing as the DGX Spark has what seems to be the same core chip, it'd be really surprising if the RTX Spark swapped out the CPU cores without any fanfare announcing that
dijit 21 hours ago [-]
> arguably sandbagging their x86 offering
tbh, I always read this as Intel doing some sales magic here.
Apple: "Hey, we're making a product that has a 15w thermal envelope, do you have anything?"
Intel: "Yes!"
(Unspoken: their products will throttle down to fit, in fact, they will try to run always at 99ºC so you always get the best performance! FEATURE!)
Apple: "uhhhh..."
Consumers: "HEH IS IT EVEN A PRO DEVICE IF IT DOESN'T HAVE <INTEL MARKETING BRAND TERM>?"
Apple: "UHHHH... Guess we'll do it ourselves"
kllrnohj 20 hours ago [-]
> tbh, I always read this as Intel doing some sales magic here.
Possibly, but Apple choosing a new, thicker chassis the same generation that they introduce their more power efficient replacement is certainly a thing. Even if Intel failed to achieve the TDP they told Apple, Apple also seems to no longer believe the thinness they were doing was viable for that TDP anyway.
Intel's product offering certainly wasn't as compelling towards the end there, but it also looked almost uniquely bad in Apple's chassis vs everyone else's
internet2000 18 hours ago [-]
"Sandbagging their x86 offering" is a new one. There's no winning.
yurishimo 11 hours ago [-]
The Intel chips of that time were fine but it was a problem from both sides. Apple refused to "compromise" their hardware design and Intel failed to deliver on their promises regarding power/heat budgets and kept telling Apple execs that they were just one cycle away from fixing all of the problems.
Ultimately, Apple won that fight when they decided to stop letting Intel control their hardware roadmap and it's been a great change for the entire industry. Intel is finally seeing some changes in their own products, largely in response to Apple dropping them. Now Nvidia is getting into the game which means more competition which is also good.
Danox 2 hours ago [-]
Apple had a similar run-in with AMD and Nvidia. Over design issues.
rickdeckard 23 hours ago [-]
That's surely one thing, Apple went all-in on ARM, for Microsoft it's still a kinda "reduced experience".
But the bigger problem in my opinion: How much of the Windows userbase actually sticks to Windows because of its backwards-compatibility?
--> What would happen if they break this model and the OS is only judged based on its user experience and available applications...?
I'm not sure it would stand any chance to compete in the B2C space. If I think about it, there's not a single new feature in Windows of the last ~20 years I particularly care about.
Without backwards compatibility, there's barely any ecosystem. MacOS on the other hand is full of ecosystem features, improving collaboration, connectivity, handoff across devices, etc.
brailsafe 23 hours ago [-]
> MacOS on the other hand is full of ecosystem features, improving collaboration, connectivity, handoff across devices, etc.
True, but if you're only in the ecosystem as a mac user, in many ways it's felt like a mixed bag. I still wildly prefer mac over other operating systems, but if upgrades had a price, I think those sales would mostly go to iPhone users. Even at free, I'm yet to find a compelling reason to install Tahoe, and will probably just continue waiting until the next one.
rickdeckard 8 hours ago [-]
Agree, it's a fairly closed ecosystem, that's why I personally don't use it.
But despite that, as a Windows user I acknowledge that any kind of interaction with another Mac from within MacOS (Handoff, Sidecar, Universal Control, Bluetooth-pairing to Apple-ID instead of Hardware MAC-ID,...) is leaps ahead of what Microsoft was doing with their OS for the past years.
Just the scenario of an employee getting a Windows laptop as a work-PC, there's barely any halo-effect if he/she also uses Windows at home. No easier handoff, no interaction, hardly any "just-works" connectivity.
Windows is mostly a vessel for the (legacy) applications it can run, and for these Browser-based Microsoft Online-Applications (which work equally-well on other platforms)
They didn't invest in creating "just works" frameworks for their PCs which amplify the ecosystem the more compatible devices you have, instead most of their focus is now on "just-works" stuff in the cloud.
So if Microsoft would make a clean cut on backwards-compatibility, I'm not sure there would be a reason left for most B2C users to even stay with Windows.
The "you can make it work if you invest a bit of time or google it" paradigm is nowadays well-covered by Linux already, and it's getting even harder for brands to compete on price/quality with Apple's scale, for almost any portable device...
brailsafe 44 minutes ago [-]
Ya I agree. I'm primarily a Mac and Android user, but also use Windows for gaming, and Windows has never been particularly good at anything. It's never been a smooth user experience.
Recently I upgraded my motherboard and tried reinstalling Win10 Pro, but couldn't activate it despite saving the product key. They have at least THREE obscure flows for re-activation depending on how it was originally activated. The license in my flow needed to have been bound to a Microsoft account that I never previously needed, because it ties itself to the hardware. I had to dismantle and rebuild with my old installation, activate it with my old motherboard on a Microsoft account that I wasn't planning to use to login with, then rebuild again with my new components, sign in to activate, and then disable sign in to be able to use a local user account. Insane.
Hmm, that actually is an important difference. I was considering trying to set something like this up so I didn't have to bring two laptops with me while traveling, but was skeptical it would go smoothly, apparently for good reason.
crims0n 23 hours ago [-]
I feel like making universal binaries a thing, and pushing for it to be standard is one viable path.
pjmlp 23 hours ago [-]
They already kind of are with ARM64EC, however Windows ecosystem isn't macOS, unless there is market pressure, most shops will keep doing x86/x64.
MagicMoonlight 23 hours ago [-]
Microslop doesn’t want people to be able to run their binaries elsewhere, it’s the only reason people buy their product.
pjmlp 23 hours ago [-]
They also buy it, because to this day most people cannot buy GNU/Linux powered laptops on the stores they usually buy their computers from.
They only know Apple, Windows and Chromebooks.
ThunderSizzle 17 hours ago [-]
I added an R9700 32GB to my 10+ year old desktop that had a 980 4GB card in it, for a grand total of $1350 or so. The payoff compared to what I was using with GHCP was 33 months, but when GHCP announced their price increase, it basically became a 3 month payoff at minimum (so yes, GHCP did a 10x price increase for non-parallel agentic workflows)
I can easily run Qwen3.6 35B-A3B with Q5_K_M with a 260k+ context window with some vram to spare. It easily runs probably 80tps. It took me quite a while to find the
Compared to GHCP Claude Sonnet 4.5 or 4.6, I have full parity. The wall clock time is faster for agentic workflows, and rule following is about on par.
With either, doing something kind of novel or obscure takes more hand holding compared to just generate a GUI or crud app. For example, trying to build an actual program that performs a complicated process correctly requires quite a bit of hand holding to get it to properly help.
Sure, it isn't Opus or something, but I think with the right harness, it probably can get close. I think most of the issues these days is the harnesses are lacking.
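As a rough check on the payoff figures in the comment above, here is a minimal sketch in Python. It assumes the payoff period is simply hardware cost divided by monthly subscription spend. The monthly spend is not stated in the comment; it is back-calculated here from the 33-month figure, so treat it as an assumption rather than the commenter's actual bill.

```python
def payoff_months(hardware_cost: float, monthly_spend: float) -> float:
    """Months until a one-time hardware purchase equals cumulative subscription spend."""
    return hardware_cost / monthly_spend

HARDWARE_COST = 1350.0            # "a grand total of $1350 or so" (from the comment)
ORIGINAL_PAYOFF_MONTHS = 33.0     # payoff period stated in the comment
PRICE_INCREASE_FACTOR = 10.0      # "GHCP did a 10x price increase" (from the comment)

# Assumption: implied monthly spend before the price change.
implied_monthly_spend = HARDWARE_COST / ORIGINAL_PAYOFF_MONTHS
new_monthly_spend = implied_monthly_spend * PRICE_INCREASE_FACTOR
new_payoff = payoff_months(HARDWARE_COST, new_monthly_spend)

print(f"implied monthly spend: ${implied_monthly_spend:.2f}")
print(f"payoff after 10x increase: {new_payoff:.1f} months")
```

Under these assumptions a 10x price increase shortens the payoff from 33 months to a little over 3, which matches the "3 month payoff" in the comment.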
eVeechu7 15 hours ago [-]
What is GHCP in this context? Glasgow Haskell Compiler Platform? Google Hostage Computer Program?
ThunderSizzle 14 hours ago [-]
GitHub Copilot. It was one of the best values around in terms of cheap LLM access since each prompt was basically 4 cents (more or less), no matter how much it would do or how many tokens it used. A simple "Proceed" prompt that was telling the agent to execute a sophisticated plan could burn a lot of time without needing any direct intervention by the user, but as of June 1st, they switched to metered billing, meaning each token in/out has a cost now.
It was suspected to come soon enough, but it was a nice cheap road for my small hobby stuff. When they announced the price changes, I started to explore alternatives, and with the news of Qwen3.6 35B being both and having quality, I figured it was worth a try out, and self-hosting made the most sense to me, since that meant I was free from being a forever-renter.
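To make the flat-versus-metered difference described above concrete, here is a small sketch in Python. The 4 cents per prompt comes from the comment; the per-token rates and the token counts are hypothetical placeholders, not GitHub Copilot's actual prices.

```python
FLAT_PRICE_PER_PROMPT = 0.04  # "each prompt was basically 4 cents" (from the comment)

def metered_cost(input_tokens: int, output_tokens: int,
                 input_rate_per_million: float,
                 output_rate_per_million: float) -> float:
    """Cost of one agent run when every token in and out is billed."""
    return (input_tokens / 1_000_000 * input_rate_per_million
            + output_tokens / 1_000_000 * output_rate_per_million)

# Hypothetical long agentic run triggered by a single "Proceed" prompt.
HYPOTHETICAL_INPUT_TOKENS = 1_000_000
HYPOTHETICAL_OUTPUT_TOKENS = 100_000
HYPOTHETICAL_INPUT_RATE = 3.0    # dollars per million input tokens (assumed)
HYPOTHETICAL_OUTPUT_RATE = 15.0  # dollars per million output tokens (assumed)

run_cost = metered_cost(HYPOTHETICAL_INPUT_TOKENS, HYPOTHETICAL_OUTPUT_TOKENS,
                        HYPOTHETICAL_INPUT_RATE, HYPOTHETICAL_OUTPUT_RATE)
ratio = run_cost / FLAT_PRICE_PER_PROMPT

print(f"metered: ${run_cost:.2f}, flat: ${FLAT_PRICE_PER_PROMPT:.2f}, ratio: {ratio:.1f}x")
```

With these made-up numbers a single long run costs over a hundred times the old flat price, which is the kind of gap that makes long unattended agent runs much less attractive under metered billing.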
SeriousM 12 hours ago [-]
And when you had a tool call that asked the user for the next step, you could easily run a whole day with 4c. Guess how the people did 5k $ worth of token with 100$ spent.
mh- 15 hours ago [-]
Github Copilot, as far as I can tell. Though I like yours.
kramadeshak 12 hours ago [-]
Can you give some context of how you are getting both of those to work? I am guessing vulkan. Did you face any pain during integration? I am planning to add R9700 to my 5070ti and my only concern is if vulkan wouldn't be able to do the heavy lifting.
ThunderSizzle 8 hours ago [-]
My 980 is currently unpowered. I have not tried to integrate them yet. It's on my to-do list, but the first time they were both powered on, the system had booting issues, and I didn't want to care at the time, since the 980 was probably going to be idle 99% of the time anyway.
I'll probably try to figure that problem out in about a month. Worst case is I move it to another even older desktop to replace the 9800 GTX+ inside of that one.
cmxch 16 hours ago [-]
Between that and the Arc Pro B70, they’re the 32gb cards that are actually affordable and worth getting.
I’ve got both (single R9700, dual B70) and they do nicely for about anything I throw at them, such that the latter has a visible improvement when the model is well-cached.
giancarlostoro 24 hours ago [-]
Can it work with Linux? That's all I care about.
tarruda 24 hours ago [-]
I don't think there's any incentive for Nvidia to make this a Windows-only device, so most likely it will be fully supported on Linux, just like their GPUs are.
Matl 23 hours ago [-]
> just like their GPUs are
So with proprietary blobs that give you more trouble than they're worth?
wmf 23 hours ago [-]
Those blobs are worth $5T; show some respect.
Danox 2 hours ago [-]
Bubble incoming....
cromka 19 hours ago [-]
Kids these days, amirite?
HerbManic 21 hours ago [-]
Depends. It is the typical Nvidia problem. Everything is a black box but when it all works it is the best option available. But when it breaks, you hate them with a passion.
bigstrat2003 16 hours ago [-]
I've never had a single problem with my Nvidia GPUs on Linux.
Matl 7 hours ago [-]
It can work quite well in the desktop GPU 'happy path' (single monitor etc.) if you don't care about the proprietary nature of it.
But once we're talking about laptops, hybrid graphics etc. it quickly shows that this is not a platform Nvidia cares about.
raudette 4 hours ago [-]
Agree - but even for the basic use case, it has not been trouble free for me. With a simple 1080p display on a desktop running LTS Ubuntu on an older 3060:
- I've had updates where stuff just stopped working and I had to futz around with drivers
- Just the fact that you have to 'pick' from a selection of drivers (which one won't you hit issues with for your use case?)
- At least on mine, there have been display glitches on suspend/resume - as it's a desktop, I just leave it running
Just anecdotal, but I never had these issues with the desktop AMD APU I had before it or Intel on board graphics on numerous laptops.
lmm 17 hours ago [-]
What trouble? If you want a GPU that works on Linux, let alone FreeBSD, you buy nVidia, install their drivers and get on with your life (and sure, maybe you can't use Wayland, but why would you want to?). I'm all for open-source in theory, but in practice the AMD drivers cause far more trouble than the nVidia ones ever do.
abenga 5 hours ago [-]
After I switched from Nvidia to AMD GPUs on my main rig, I can now run Sid without issue and upgrade my Kernel whenever I want to without getting a black screen with a blinking cursor on the next boot.
fulafel 14 hours ago [-]
There's also the precedent of the several ARM Linux systems they shipped that tended to have much worse support.
browningstreet 19 hours ago [-]
They worked with Microsoft to make this commercially viable. That’s possibly reason enough.
shmerl 21 hours ago [-]
I wouldn't trust it to have good upstream support. It's Nvidia. So not really interested.
PcChip 19 hours ago [-]
An nvidia engineer in a discord server said it should work fine
eitally 23 hours ago [-]
Sort of. It's the same chipset as in the DGX Spark & DGX Station, which run Ubuntu (NVIDIA's flavor).
acka 20 hours ago [-]
Sorry, but when it comes to chipsets, they're not even close.
The DGX Spark uses a GB10 with 128 GB unified LPDDR5X memory, while the DGX Station has a GB300 with 496 GB LPDDR5X (CPU) + 252 GB HBM3e (GPU) memory.
It's like Little League versus Major League, which is why the latter costs about 20 times more than the former.
The fact that both run Linux is just because they're part of the same DGX family.
chrsw 19 hours ago [-]
I took "same" to mean "compatible driver and software stack" not "compute perf", rightly or wrongly.
verdverm 23 hours ago [-]
DGX Spark comes with linux out of the box, it would be hard to imagine this device is not also compatible
kllrnohj 23 hours ago [-]
Doesn't it come with Nvidia's blend of Ubuntu with a custom kernel? Do other distros work as well as "DGX OS" or are nvidia's kernel changes pretty important to have?
cmrdporcupine 22 hours ago [-]
I've not noticed much in it that is NVIDIA specific.
But I would say that as an Ubuntu and Debian user for decades I have no incentive to use anything else on it and I'm just pleased to have a Linux on Aarch64 machine that is well supported for a change.
rnxrx 22 hours ago [-]
For some value of "well supported" - NVIDIA's own internal catalogs (libraries, NIMs, etc) are still spotty on aarch64 coverage.
verdverm 22 hours ago [-]
afaict, they have their own package repo mirrors and a few dedicated packages for nvidia stuff
tbh, I was rather unimpressed with the out-of-box experience for an "ai" computer, you couldn't even run a model locally with the common tools people use (no llama-cpp, ollama, vllm, etc). No huggingface CLI either, like come on!
I need to update that because I have a nice vllm setup on there now with 4 models running, but should be able to get anyone else going without having to muddle about as I did.
zipy124 22 hours ago [-]
Hopefully better than support on their Jetson or orin boards, where compiling anything is hard because of the outdated stack.
selicos 17 hours ago [-]
This plus the price difference had me buy an AMD Strix Halo board last week. It seems the work with vLLM and training models could make the Spark worth the price difference, but before today's news I had the same thought about support and did not want to lock myself into a cool paperweight, especially with 128gb of RAM on the line. AMD is x86 and I can repurpose that or run Linux forever.
fidotron 23 hours ago [-]
Honestly this looks like Microsoft must have thrown a pile of money at them to not mention it, as it's just too obviously the main question.
No one seriously cares about this running Windows. We want Steam and CUDA/Ollama, and Windows just gets in the way. nVidia are simply not that oblivious, but I have to admit in their position I'd have considered the Microsoft involvement more trouble than it's worth, which is among the many reasons I'm not a billionaire.
Maybe they think the RAM market is so terrible it will kill the whole initiative regardless.
dist-epoch 23 hours ago [-]
You misspelled llama.cpp
kennywinker 21 hours ago [-]
I’ve read all the stuff about how llama.cpp is much faster and better than ollama, and i believe it - but good god llama.cpp isn’t user friendly.
You’d think in an era where “code is free” there would be an easier story around running local ai than compiling llama.cpp by hand and then spending hours researching flags - only for it to crash from an oom error every ten prompts or so.
numberwan9 7 hours ago [-]
Please take a look at https://llama.app, all you need is 'llama serve'
greenavocado 21 hours ago [-]
You're supposed to use a cheap ChatGPT subscription to run optimization loops over llama.cpp flags with a self-contained reproducible benchmark script and just let it burn for hours/days until it is fully optimized ))))
pjmlp 23 hours ago [-]
WSL is the answer as far as most folks are concerned.
Has Steam finally started to push for native Linux games instead of translating Windows ones?
drakythe 22 hours ago [-]
Valve did that a little more than a decade ago, with the original Steam Machines. It didn't take, and despite the success of the Deck and current techy trends, Linux does not have the % to make the ROI worthwhile if it isn't simple for developers. Proton is a wedge in the door that will help Linux get there.
pjmlp 22 hours ago [-]
It is simple, Android NDK has all the same APIs for 3D rendering and audio, as do all major middleware engines.
The failure of business, only reinforces Windows as the platform most studios reach for.
Buy Windows, buy Visual Studio, pay game engines licenses, let Valve do the work.
This ignores that current Valve's management doesn't live forever, so who knows what happens afterwards.
thewebguyd 21 hours ago [-]
A potential change in Valve's culture/management aside, "let valve do the work" is a feature, not a bug. Studio spends all their budget targeting one platform (which still has ~90+% of the PC gaming market), and get Linux support for free.
Windows' monopoly on game dev isn't just market share either, since game dev isn't just code. You still need Photoshop, Maya, etc. and in smaller studios there's typically a crossover where some devs are doing art as well. Visual Studio's C++ debugger is still one of the best, and the tooling elsewhere hasn't caught up yet (compared to DX + PIX).
Then you also have to solve distribution and handling the fragmented display & audio stack. It's gotten a lot better, but it's still a factor.
I'm fine with most of the work going into Wine/Proton. A stable ABI for Linux is a boon, if it happens to be Win32 then so be it.
pjmlp 11 hours ago [-]
Hardly a monopoly, and the tooling has gotten there, on macOS, on PlayStation, on Switch, even on Android, it is better than GNU/Linux.
Valve isn't going to be around forever porting Windows games into Proton, which is actually hardly any different than if they started selling Nintendo games with Dolphin, if we ignore the legal implications for a moment.
fidotron 21 hours ago [-]
At this point Valve look more capable of running a platform business than Microsoft do.
Microsoft have spent the whole Nadella era in "oooo cloud" inspired wonder and actively screwed up everything else.
pjmlp 13 hours ago [-]
And yet where are the native games for the platform?
Note how game developers rather spend their working hours in Windows with all its issues, even if they happen to have Linux servers for running their MMOs.
fidotron 6 hours ago [-]
> And yet where are the native games for the platform?
My Steam library is full of old win32 games that run better on Linux than they do on Windows 11. There are some native games appearing because of the Steam Deck, but the fact is they aren't necessary.
Valve aren't simply better at running a platform business, they've thoroughly subverted Microsoft's old one and have done a better job at running that than Microsoft themselves.
Look at the absolute state of the XBox business: all native games, tens of billions spent on something, and yet it's just a trainwreck from top to bottom.
bigyabai 20 hours ago [-]
> Valve's management doesn't live forever, so who knows what happens afterwards.
Tens of thousands of Windows games would remain playable with ubiquitous Vulkan-capable hardware and a 500mb Proton runtime?
stubish 16 hours ago [-]
... after you watch these ads, or pay for our premium subscription. After all, those games aren't going to host themselves and your license doesn't allow an alternative.
t-writescode 22 hours ago [-]
If it runs faster than the windows ones, who cares?
pjmlp 22 hours ago [-]
The game developers that use Windows, with Visual Studio, to develop such games.
fidotron 21 hours ago [-]
This is, admittedly, the great anomaly.
In truth if AMD or nVidia put their mind to having decent profiling tooling on Linux, and the AI wave suggests they will have no option, then this could readily become a thing.
newsclues 24 hours ago [-]
This is strangely absent from the news.
wtallis 23 hours ago [-]
There are two new things being announced here: the GB10 chip being put into laptops, and GB10 running Windows. GB10 running Linux is not news, it's a product that's been shipping since last fall.
dismalaf 24 hours ago [-]
It's a collaboration with Microsoft so going to say no, probably not.
BoggleOhYeah 22 hours ago [-]
Kinda underwhelming. I was hoping to see that they improved their memory bandwidth to move toward competing with the M5 Max. But this is more akin to the Strix Halo.
kristianp 19 hours ago [-]
From what I'm reading it's probably the same chip that's used in the DGX Spark. The memory bandwidth at 300GB/s is equivalent to an M5 Pro; however, you can't get an M5 Pro with 128GB of RAM. Apple pushes you to the biggest M5 Max chip, which in the 14-inch form factor costs $5099. You can get an ASUS GB10 machine with 2TB storage for $4000, so I guess the RTX Spark laptops will cost more than that due to the battery, screen, etc.
Perhaps the next generation of the spark will improve on the bandwidth and RAM size numbers. Yes it's a lot like a Strix Halo, but this has CUDA, which will be of interest to developers who want that.
I was looking for AMD AI Max+ 395 laptops recently, and the only ones I've found were 13 inch models, which seems odd from a heat dumping standpoint. I'm looking for 16 inches, I guess the 13 inch form factor would make it easy for commutes where you're taking it to dock to a large monitor at work or home, but no 14 inch screens?
BoggleOhYeah 18 hours ago [-]
I've tried the Z13 Flow and I actually like the form factor except for the folio keyboard. I especially like that, since it's a tablet, it vents hot air out the top instead of into your lap/table. But the whole driver situation was very weird and things would randomly stop working. That may have improved since I tried one ~1 year ago.
raggles 19 hours ago [-]
128 GB memory is also lame. I'm hankering for a windows equivalent of the mac studio that came with 512 GB.
FireBeyond 14 hours ago [-]
The one that Apple discontinued not because of demand but memory pricing?
Danox 2 hours ago [-]
Isn't there a possibility they were killed because the M5 Ultra is coming? Why waste memory on an M3-series Ultra or any other high-memory Mac Studio when the next generation is coming within six months?
LTL_FTC 15 hours ago [-]
There's a photo here showing 600GBps memory bandwidth so maybe they have doubled it:
M5 Max beats it, but for the price of an M5 Max, you are better off just getting a desktop with 2 3090s, which will be cheaper even at current prices.
hs86 1 days ago [-]
I am wary of those ARM-based Windows machines because I am unsure how good the ongoing driver support for those SoCs will be. Will they even outlive the Windows version they currently ship with?
Looking at devices like the NVIDIA Shield gives me some hope that NVIDIA will be better than Qualcomm here. I just hope this is not a case where the OEM has to purchase X years of driver support from the chip vendor beforehand, and that NVIDIA will provide support directly itself.
jwlake 21 hours ago [-]
I would love a RTX Spark Shield. ;p
ilia-a 18 hours ago [-]
Looks like just rebranded DGX in laptop form, the biggest miss is the weak memory speed, 1/2 of the M5 laptop memory speed, and 1/3 of the M3 ultra that is now years old...
mbreese 18 hours ago [-]
I'm not sure that's such a bad thing. It's not going to challenge the Apple M5, but if you're specifically looking for something in the "not-Mac" market, having a laptop-sized version of the DGX is probably going to be pretty successful.
Danox 2 hours ago [-]
Then they better release it within the next two weeks.
mbreese 2 hours ago [-]
What's coming out in two weeks?
ActorNightly 14 hours ago [-]
The main bus is 300 GB/s, which is on par with MB Pro. MB Max has the 600 GB/s of unified memory (~500 or so in practice for token generation) only in the 40-core variant, which is like $7k+, ironically more expensive than a dual-3090 desktop. The 32-core variant, which is still wildly expensive, is ~400 GB/s.
The biggest thing where this will crush Apple is the initial prefill phase. 6000+ cores vs 32/40, + active cooling with fans. For local llm models, this matters quite a bit more than tokens/second.
In the end, neither are really worth it for llm use compared to just building a desktop and just port forwarding over ssh to ollama.
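For anyone who hasn't done the desktop-plus-tunnel setup, here is a minimal sketch. It assumes Ollama is listening on its default port 11434 on the desktop and that a tunnel such as `ssh -N -L 11434:localhost:11434 user@desktop` is already running from the laptop. The host name and the model name are placeholders, not anything from this thread.

```python
import json
import urllib.request

# Assumption: an SSH tunnel forwards local port 11434 to Ollama on the desktop.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "llama3") -> urllib.request.Request:
    """Build a non-streaming generate request for Ollama's HTTP API."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "llama3") -> str:
    """Send the prompt through the tunnel and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=120) as resp:
        return json.loads(resp.read())["response"]
```

With the tunnel up, `generate("Say hello in five words.")` on the laptop returns text produced by the desktop's GPUs; the model name has to match one already pulled on the desktop.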
ilia-a 2 hours ago [-]
Because of the memory costs lately, I doubt this will be much cheaper. Also this is quite a bit slower than even 4070 let alone *90 Nvidia variants albeit with much lower memory.
rnxrx 22 hours ago [-]
There are still a *lot* of sharp edges with the Spark: compatibility, overstated performance, power consumption/heat generation, etc. It's one thing to have that situation on a box explicitly aimed at developers and quite another with an actual consumer-focused laptop.
eigenspace 1 days ago [-]
I'm surprised they released this thing. Brand perception is probably a lot more important to Nvidia than whatever sales they could get from this thing, and if it's basically just DGX Spark, it's likely to underwhelm.
I've heard there's still a large backlog of both software problems, and hardware problems with the platform. The software problems could be fixed with time, but they'll still give a shitty first impression. I'd have thought Nvidia would just bury this and try again with a successor run of silicon with a new design.
This thing seems practically destined to just be a repeat of the Snapdragon laptop debacle.
easygenes 12 hours ago [-]
Speaking as someone who has had a DGX Spark all year and been active developing at the driver and kernel level for it and other ARM64 Linux devices the last couple of years, it's not bad now and certainly doesn't have any issues that I wouldn't expect to be fully fixed with the second-gen motherboards going into these. The main hardware issues are not with the core SoC. They're replaceable edge peripherals like the PD PMIC.
Danox 2 hours ago [-]
It has to work on day one. Whatever the Apple Mac M5 Ultra or Mac M6 Ultra turn out to be, they will work well on day one. A Spark laptop that probably costs thousands of dollars has to work from the start or it's dead.
fg137 1 days ago [-]
I cannot think why someone would run those workflows on a Windows laptop, unless someone has way too much money to spend.
bigfishrunning 1 days ago [-]
> someone has way too much money to spend.
that's what nvidia is hoping for
thinkingtoilet 1 days ago [-]
If the workload is offloaded to the chip, why would the host platform matter?
DGX Spark runs Linux, and nobody is going to install Windows on that machine. This laptop got it backwards.
If someone decides to run Ollama for local inference with this laptop, they fit perfectly into the "has too much money to waste" bracket, which is addressed by a few other comments in the discussion.
woctordho 12 hours ago [-]
There is vllm-windows, and it's just as fast as on Linux. BTW I'm the maintainer of triton-windows.
chris_money202 1 days ago [-]
WSL
fg137 24 hours ago [-]
It often works, but you always lose something compared to native Linux.
pjmlp 13 hours ago [-]
Nah, before WSL I was already using a mix of Virtual Box and VMware Workstation, between home and work computers.
Installing Linux natively on laptops has always had some specific features not working.
Even my Asus netbook, which came with Linux pre-installed, had wlan issues that I learned to work around with, and the driver never supported the same OpenGL version as the Windows one (3.3 vs 4.1).
fg137 5 hours ago [-]
My comment was saying you lose something (e.g. performance) when using WSL2 compared to native Linux on a proper workstation.
Linux drivers have always been an issue on laptops, but that's not the concern for running Python code.
rvz 1 days ago [-]
Believe it or not, Windows (WSL) is the best Linux distro and Nvidia knows that.
lostmsu 1 days ago [-]
vllm-windows works well enough
2001zhaozhao 1 days ago [-]
I think this is the first time an ARM windows device gets marketed for gaming. Would be interesting to see what kind of performance hit games have on the x86 to ARM translation layer.
fidotron 23 hours ago [-]
Rosetta on Mac was obviously impressive. There was also impressive Arm->Intel translation in the mobile ecosystem at one time.
One reason it works surprisingly well on modern systems is how much is offloaded to the GPU. You aren't going to get great power optimization or anything without it being truly native though.
There are games which are CPU limited though, and it will be interesting how those do. Curiously those also tend to be in engines with Arm support already.
HerbManic 21 hours ago [-]
There was a presentation from Valve about their FEX compatibility layer. They did something that seems so obvious in retrospect.
When you lay out the software stack it is essentially OS > Game code > APIs. Both the OS and APIs are native code, it is only that middle point that needs the real work.
This is why x86 to ARM doesn't have such a heavy performance cost. A game can be CPU heavy, but if it is heavy at the API end, that isn't a huge issue.
Very cool.
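A back-of-envelope sketch of that argument: if only the game's own code is translated while the OS and API/driver code stay native, the overall slowdown is weighted by the share of CPU time spent in translated code. The fractions and the 1.5x translation penalty below are made-up illustrative numbers, not measurements of any real translator.

```python
def overall_slowdown(translated_fraction: float, translation_penalty: float) -> float:
    """Amdahl-style estimate of the total CPU-time slowdown.

    translated_fraction: share of native CPU time spent in translated game code (0..1).
    translation_penalty: how many times slower translated code runs (>= 1).
    """
    if not 0.0 <= translated_fraction <= 1.0:
        raise ValueError("translated_fraction must be between 0 and 1")
    native_fraction = 1.0 - translated_fraction
    return native_fraction + translated_fraction * translation_penalty


# Hypothetical game spending 30% of CPU time in its own code, 70% in native APIs/drivers.
print(round(overall_slowdown(0.30, 1.5), 2))  # 1.15
# Hypothetical CPU-bound game spending 90% of CPU time in its own code.
print(round(overall_slowdown(0.90, 1.5), 2))  # 1.45
```

Under these assumed numbers the API-heavy game loses about 15% while the game that lives in its own code loses about 45%, which is the gap between "heavy at the API end" and genuinely CPU-limited titles.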
lowbloodsugar 1 days ago [-]
Apple Silicon has a special mode that modifies how the ARM chip handles memory transactions to be like x86. Does this Nvidia ARM chip have the same?
What would be interesting to me would be how quickly developers start targeting ARM64 directly.
For Apple, use of Rosetta 2 was only temporary as they moved the whole lineup to ARM. MS will not abandon x64 anytime soon, so I'm guessing they will try hard to convince developers to release for both architectures.
Tiberium 1 days ago [-]
For anyone curious to know how this will fare against Macbooks, at least in CPU perf: DGX Spark has the exact same GPU and CPU as the top RTX Spark laptops will, so you can just directly compare from that.
Of course, DGX Spark is a miniPC, so laptops will likely be slower due to power limits/throttling.
siquick 19 hours ago [-]
DGX Spark is really poor at inference due to the memory bandwidth so hopefully they’ve fixed that before touting this as a way to run local models.
wtallis 17 hours ago [-]
I think DGX Spark has poor memory bandwidth because these laptops were the plan all along. NVIDIA didn't want to commit to the extra costs of a 512-bit memory bus for their first laptop SoC, so they went with the more modest 256-bit bus, same as AMD did for Strix Halo.
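To put numbers on the bus-width tradeoff, here is a small sketch of the usual peak-bandwidth formula (bus width in bytes times transfer rate). The 8533 MT/s figure is an assumed LPDDR5X speed grade used for illustration; only the 256-bit and 512-bit widths come from the comment above.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, transfer_rate_mt_s: float) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (decimal gigabytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfer_rate_mt_s / 1000


RATE_MT_S = 8533  # assumed LPDDR5X speed grade, not taken from the thread

print(round(peak_bandwidth_gb_s(256, RATE_MT_S)))  # 273
print(round(peak_bandwidth_gb_s(512, RATE_MT_S)))  # 546
```

At the same memory speed, doubling the bus width doubles the theoretical peak, which is why the wider bus is the expensive but direct route to higher bandwidth.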
exabrial 22 hours ago [-]
UEFI, display panels, wifi, storage controllers, etc would be what I'm worried about. I doubt Microsoft is going to make it easy.
FuriouslyAdrift 23 hours ago [-]
DGX Spark is also $4700
comandillos 1 days ago [-]
So they have basically reused the same hardware as in the DGX Spark (GB10)...
That chip isn't great for LLM inference actually.
It is great for single-user/single-session inference. It is not a replacement for a GPU accelerator that runs several concurrent inference sessions in parallel.
Basically the same tradeoff as a Mac mini with unified memory.
general_reveal 1 days ago [-]
The RTX GPU laptops run very hot. Even though they are pound for pound better, they just run too hot for local LLM usage, for me at least. Prefer Macs for this. A lot of AMD cards also run cooler. I wonder if undervolting would help with smaller models and heat.
comandillos 1 days ago [-]
I mean the GB10 is pretty efficient for the power it has, but imho is nowhere near the power efficiency of Apple Silicon (it was never intended to be a chip used for mobile devices). I guess this is kind of the movement Apple did with the A12Z and the Mini but... the other way around?
I think it's gonna be another failure, as we are used to seeing in the PC market these days.
joe_mamba 1 days ago [-]
>That chip isn't great for LLM inference actually.
Why do I have the feeling it's been intentionally made to be bad in order to get you onto their most expensive datacenter gear?
ekidd 1 days ago [-]
It's probably more that LLM inference speed comes from having a large amount of fast RAM. And fast RAM is brutally expensive right now.
At this point, your cost-efficient options include used 3090s, "frankenrigs" using recycled data center cards, and a handful of "workstation" class cards, where the originally high margins and the long enterprise purchasing cycles have kept prices from going up too fast.
In contrast, a lot of these "personal" AI systems are basically a GPU-like core wired to larger amounts of slow RAM. Which is still semi-affordable. Generally speaking, they make for OK chatbots but extremely slow coding agents. Whereas you can run a modestly useful coding agent at reasonable speed on a 3090.
So yeah, a lot of these systems are a bit scammy. But not because it's a secret conspiracy to protect data center cards. Rather, there simply isn't enough fast RAM in the entire world. So they'll flog you disappointingly slow RAM instead.
TL;dr: Might be useful for some use cases, but benchmark very carefully.
PeterStuer 1 days ago [-]
It's been almost 30 years, and a single letter changed. When will we get the Sparkstation, the UltraSpark and the SuperSpark?
a1o 1 days ago [-]
SuperSpark and then UltraSpark. And then we can get SparkCube, Sparkii, and SparkiiU.
With competition from the MegaSpark and SparkGenesis.
m463 20 hours ago [-]
you forgot Xerox sPARC with guis, ethernet, laser printers, ...
happosai 22 hours ago [-]
Will we get enterprise-ready open firmware too, instead of this "we missed DOS so we invented UEFI" for boot firmware?
airstrike 23 hours ago [-]
This seems to be an attempt to compete with people running local models on Apple hardware—even though those local Mac Mini setups aren't really powerful.
I expect we'll get there in a few years, so perhaps this is Nvidia taking an early step in that direction.
In that case, this goes against Anthropic and OpenAI's business models. Which is a double whammy after Jensen Huang's recent comment about how agentic coding will only increase demand for software engineers, not reduce it.
So it also feels like a part of a budding shift in the competitive tension between the various parts of the AI supply chain.
thewebguyd 22 hours ago [-]
Local AI was/is bound to happen, eventually. It'd be smart of Nvidia to get ahead of it.
Non-techy consumers may never do it, but at some point businesses are going to start asking when do they stop paying per token and start running models themselves. Right now the hardware is cost prohibitive, but I doubt that'll always be the case. Eventually the hardware will get cheaper and more available, and Nvidia seems to be betting on that.
They don't care where inference happens, so long as it happens on Nvidia hardware.
h14h 22 hours ago [-]
IMO it's only a matter of time before "self-hosting local AI" is as complicated as installing an app and clicking a download button.
And when that happens, the pitch to non-techy users is "Free ChatGPT you can use offline with zero privacy risk". Once hardware accessibility and LLM efficiency advance to the point that this becomes feasible, I suspect it'll result in a much bigger hit to the cloud AI market than many expect.
ribosometronome 20 hours ago [-]
That workflow has been around for awhile now. I'm sure there are others but LM Studio has a model browser in app that effectively simplifies things to hitting download and hitting launch. The complexity tends to be in that there's a lot of models to choose from and also knowing how to set up whatever tool you're using with a local model. None of it's particularly hard, unless you start trying to customize settings.
I think the bigger hang up is that they're still slower and less capable than the frontier models, especially at the hardware specs most home users are likely to have.
h14h 37 minutes ago [-]
The performance hangup is definitely a barrier, but I think LM Studio and other similar apps are still too far on the "techy" end of the spectrum and have UX barriers that will need to be addressed. IMO for most people, exposing things even as "basic" as the official model name is a leaky abstraction that could be overwhelming.
If the first thing (for example) my mom sees upon installing the app is a dropdown model picker that contains things like "Qwen3.6-35b-a3b-mlx" she will 100% be bouncing off of it.
IMO the best version of this is a custom app/harness with a couple of pre-selected (and ideally fine-tuned) open models that immediately start downloading after checking the system's hardware specs. This would likely be a turn-off to most devs, but is absolutely essential if building an app for general consumers.
selicos 17 hours ago [-]
LM Studio Link is brilliant, outside their central login/auth requirement. Tailscale is the backbone, I think, so it makes sense but I'm sure a method with wireguard could exist and enable similar performance.
The current dilemma for me is: how do I install a model on a remote LM Studio device without bypassing LM Studio to SSH or remote in?
> lms link [servername] get model ?
> lms get [servername] model ?
> lms get model --link [servername] ?
Maybe I need to read the docs again but I swear the only way is remote or go to that device and download via the GUI, ssh in and use the local cli.
Maybe can copy/paste from one device's downloads dir to the server? Maybe I need to try hosting models on my NAS and see if I can download from device 1 then run on device 2 without install/setup?
adamrezich 21 hours ago [-]
Why is it only a matter of time? The AI-as-a-service companies are going to continue to improve their products by improving both the part that could be reproduced in a self-hosted setup, but also the “secret sauce” they put on top of that to make it a better product. There is no incentive for this “secret sauce” to be something that can be reproduced for self-hosting, is there?
thewebguyd 21 hours ago [-]
What secret sauce? We already have open source tooling for tool use, web browsing, and code execution/computer use. Open weight models will win in the end.
AIaaS might keep an edge with multi-modal agentic workflows, but for 80% of general use cases, no "secret sauce" needed, the open weight models are already there, and tooling is constantly getting better.
The bottleneck is the cost of local hardware right now.
Shitty-kitty 19 hours ago [-]
The "secret sauce" is vendor lock-in. A textbook case is the vmware broadcom situation. Vmware was cheap so corporations found little reason to use open source. Broadcom made vmware expensive but now those corporations are finding out that it is a lot of work (aka expensive) to switch infrastructure.
woctordho 12 hours ago [-]
The purpose of open source is exactly to fight against vendor lock-in. There is always a way to convert VMs from VMware to open source formats.
h14h 21 hours ago [-]
I think a major incentive could be to sell hardware. If Apple is able to get their hands on a local LLM capable of covering a significant % of what people use ChatGPT for, the pitch they can offer is:
"Free, private, offline ChatGPT so long as your laptop has X GB of RAM"
Beyond that, I wouldn't underestimate the incentive of "because I can". The "secret sauce" you refer to is effectively just a DB & a while loop that feeds text to a bunch of tensors. If an indie dev decides they want to release something that dismantles the OpenAI & Anthropic moats, there really isn't all that big of a technical barrier stopping them.
bigyabai 21 hours ago [-]
LLM inference decode is heavily dependent on memory speed, not just having lots of memory. You can't say "X amount of ram" because the memory bandwidth on an M1 is 68.3 GB/s versus the 614 GB/s of an M5 Max, or a 4090's 1.01 TB/s over GDDR6X.
This basically creates a bottleneck at the oldest/cheapest Apple Silicon machines, which are already crippled for context prefill.
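A rough sketch of why bandwidth dominates decode: for a dense model, each generated token has to stream roughly all the weights from memory once, so bandwidth divided by weight size gives an upper bound on tokens per second. The bandwidth figures are the ones quoted above; the 4 GB weight size is a hypothetical (roughly an 8B-parameter model at 4-bit), and the estimate ignores KV-cache reads, MoE sparsity and compute limits.

```python
def max_decode_tokens_per_s(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Upper bound on decode speed when every token streams all weights once."""
    if weights_gb <= 0:
        raise ValueError("weights_gb must be positive")
    return bandwidth_gb_s / weights_gb


WEIGHTS_GB = 4.0  # hypothetical: ~8B-parameter dense model quantized to 4 bits

# Bandwidth figures (GB/s) as quoted in the comment above.
for name, bandwidth in [("M1", 68.3), ("M5 Max", 614.0), ("RTX 4090", 1010.0)]:
    bound = max_decode_tokens_per_s(bandwidth, WEIGHTS_GB)
    print(f"{name}: at most ~{bound:.1f} tokens/s")
```

Real throughput lands below these bounds, but the ratios between machines track the bandwidth ratios, which is the point about "X GB of RAM" not being enough of a spec.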
h14h 20 hours ago [-]
Thanks for clarifying -- I was oversimplifying.
But honestly, obsoleting a huge number of otherwise great Apple Silicon machines is something Apple would consider a major "pro" of building a compelling local AI stack.
With how much speculation there is about the difficult time Apple has had getting people to upgrade from M1, I'm sure they'd jump at such an opportunity.
bijowo1676 20 hours ago [-]
this might be a way for Apple to milk product revenue for many years.
- Please buy our new Macbook pro M5 that gives you 20 tokens/s on local 80B LLM
next year
- Please buy our new Macbook pro M6 that gives you 25 tokens/s on local 80B LLM
Milking product revenue in perpetuity by offering meaningful marginal improvements while keeping the same architecture will be the golden goose for Apple.
Plus, if it allows them to segment the market by wallet size into poor/middle/rich classes, that's even better.
artyom 22 hours ago [-]
I'm from the times when you had to purchase a separate chip to perform floating point math. It was called a math co-processor. [1]
After a few generations (and over a decade) that was indistinguishable from the CPU chip itself.
It's a long hyperbole, I know, but I think local inference is inevitable; and the big fishes know it.
Will that be a complex technical setup? An appliance? An additional chip in your motherboard? So transparent it's burned right into the CPU? Those are just implementation details. We're probably just one generational breakthrough away from it.
Like the math co-processor it might end up just being new instructions for the cpu to handle ai related math.
woctordho 12 hours ago [-]
And here comes ACE (AI Compute Extensions) on the latest CPUs
CuriouslyC 19 hours ago [-]
I think non-techy users will get subsidized hardware with compute workloads running in the background on idle to recover the cost (and lots of ads).
smrtinsert 21 hours ago [-]
> Non-techy consumers may never do it
They will. At some point in the future, people will want everything; they'll prompt full movies because they're bored and want to watch something.
hdgvhicv 20 hours ago [-]
You’re assuming that owning compute will be possible.
bityard 21 hours ago [-]
I don't believe Anthropic and OpenAI are any more fearful of local AI than Google or Microsoft are of people hosting their own email.
Local AI capabilities are growing at a rapid pace, but so is hosted AI. While you can do a surprising amount of useful work with a model occupying a few to a few hundred gigs of VRAM, the hosted models are going to be way ahead for a long time.
mindwok 14 hours ago [-]
The fundamental difference is that email you host yourself requires ongoing maintenance and expertise to work at a basic level, and people would rather outsource it.
AI inference is different. You get the outcome by passing text through some weights at the time you need it. There's no ongoing work besides training and releasing new models. If I had something that rivalled Opus 4+ I could use locally, I would switch in a heartbeat.
qdotme 17 hours ago [-]
I fear the same thing, but still am unsure why or how :)
Google/Microsoft and hosting your own email is a byproduct of how difficult (socially, not technically) hosting your own email has become, mostly because the SMTP protocol is inherently broken by spam and patched by social constructs (trusted nodes, abuse@, 3+ DNS entries and counting, etc). Purely technical solutions, such as Hashcash, got discontinued in exchange for social ones. Central providers made self-hosting socially hard (sometimes in exchange for spam protection, sometimes using it as an excuse).
Now I wonder if, and how, Anthropic and OpenAI could hamstring local AI once they need to demonstrate profitability. Local AI has /so far/ been very valuable for me in doing things that hosted providers don't want liability for and align against (even if totally lawful and fair use!).
selicos 16 hours ago [-]
If it's something like:
- v4.5: 1x cost, 100% quality, 100% speed but maybe sometimes 80% speed because of load
- v4.6: 3x cost, 105% quality, 80% speed most of the time depends
- v4.7: 9x cost, 115% quality, 90% speed most of the time
Then people will either stick with v4.5 for everything it can do and, if knowledgeable, use v4.7+ for critical or specific tasks.
But if we add the option of:
LocalLLM: one time hardware + electricity cost, good enough quality for 90% of work, good enough speed for 90% of work, no vendor lock in/sudden cost spikes...
Then there is an edge to running it yourself unless you can burn investor cash to get to the next level.
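The tradeoff above can be sketched as a simple break-even calculation. Every number here is a hypothetical placeholder (hardware price, monthly cloud spend, electricity), not a quote from any vendor, and it assumes the local model is good enough for the work being moved off the cloud.

```python
from typing import Optional


def break_even_months(hardware_cost: float,
                      cloud_cost_per_month: float,
                      electricity_per_month: float) -> Optional[float]:
    """Months until one-time hardware cost is repaid by avoided cloud spend.

    Returns None when running locally never pays off, i.e. electricity
    costs at least as much per month as the cloud bill it replaces.
    """
    monthly_saving = cloud_cost_per_month - electricity_per_month
    if monthly_saving <= 0:
        return None
    return hardware_cost / monthly_saving


# Hypothetical: $2500 box, $100/month of cloud usage replaced, $15/month electricity.
months = break_even_months(2500, 100, 15)
print(f"break-even after about {months:.1f} months")
```

If the cloud price for the same work triples, as in the 1x/3x/9x progression above, the break-even time drops sharply, which is the "sudden cost spikes" part of the argument.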
I think the recent headlines on org token spend plus my own experience just today (June 1) with the new Copilot Pro limits is going to push those with the compute to run locally.
As of about 1pm today I did something to hit 47% of my entire June premium requests (copilot Pro, not converted).
As of 2pm I'm using Gemma 4 E4B on a 12gb GPU (with large context window) off my desktop to power VS Code with Copilot on my laptop. I'm going to build an AMD Strix Halo system next week when parts arrive so I can queue up a few models in parallel or work with something I need that much RAM for.
I'm not lifting the earth with my LLM setup. Gemma 4 E4B is solid for accelerating my current projects, and it's costing me pennies more per hour vs blowing half my Copilot Pro plan in a distracted morning.
I'm at a vendor conference this weekend that is showing off their Agent/Agentic workflows. Nobody can tell me how they balance the cost long term. Hopefully whoever the vendor is paying for their cloud LLM token usage doesn't spike cost in a year (or the vendor themselves) after companies convert and are trapped VMware style with these agent processes. You can bring your own (cloud) model subscription. I need to find out if we can point it back to our own local LLM endpoint and try local models for the same processes. Even if it takes 5x longer, it could be cheaper and more secure.
h14h 21 hours ago [-]
One can only hope.
That said, Apple's vertical integration is a massive competitive advantage here, IMO. Nvidia's reliance on Microsoft & Windows for software support likely makes competing w/ Apple an uphill battle.
If/when Local AI gets good enough to compete with Cloud AI on most inference workloads, Apple starts to look like Nvidia's biggest competitor.
While this is admittedly a dream scenario, the biggest downside would be Apple effectively having a monopoly in "Agent-ready" consumer electronics. Hopefully local AI both becomes the norm, and there is sufficient competition among the consumer platforms.
Side-note: I would love to see an "RTX Spark" Framework 13 mainboard at some point.
bigyabai 21 hours ago [-]
I don't understand this stance. Microsoft is reliant on Nvidia, they don't have a good ARM SOC to ship with without them. They will bend over backwards to accommodate these SOCs on Windows, and probably don't have much work to do in the first place.
Apple's vertical integration has led to a Siri overhaul that took half a decade to roll out, and it won't even run locally. They built an NPU coprocessor that's basically dark silicon for expensive inference, and then shipped MLX to stop Tensorflow and Pytorch from replacing Apple's role in the stack entirely. Mac owners are pleading for signed CUDA drivers for the PCIe or Thunderbolt in their $5,000+ Mac Pros. Apple's ecosystem is pure liability for AI, they're not moving any product for datacenter inference and can't even sell the hardware to themselves: https://9to5mac.com/2026/03/02/some-apple-ai-servers-are-rep...
Nvidia's profit margins are safe. Even if the RTX Spark is a completely failed product, Apple is not encroaching on the markets that Nvidia dominates.
Danox 2 hours ago [-]
Apple has kicked out Motorola, Intel, Nvidia, Google Maps, AMD, Broadcom, Samsung (chip division), and probably Qualcomm in 2027-2028. Apple going back to Nvidia and relying on CUDA is doubtful; even the current Gemini use will be short-lived.
See a pattern there? Even the memory companies will probably get designed around. The Chinese are in the process of doing it now, and so will Apple, probably, for long-range survival.
h14h 21 hours ago [-]
Fair points all around. Ultimately it all comes down to execution.
In theory, Apple SHOULD have an advantage given they have everything they need in house and can all pull in a unified direction. In practice, it's not always the case that all the teams in a large corporation are all that much better at pulling in the same direction than multiple different corporations in a partnership. And all this will be moot if Local LLMs never catch up to cloud LLMs in terms of quality.
Regardless, it'll be very interesting to see how Nvidia's partnerships with Microsoft & hardware OEMs play out. If the AI inference compute share shifts appreciably to local consumer hardware, I'll want to see strong competition.
bigyabai 20 hours ago [-]
I'd argue that Apple had the upper hand, but they folded super early. They abandoned OpenCL, which was the most promising CUDA competitor with industry-wide buy-in from dozens of companies. Then they transitioned to an ecosystem-first mindset that prevented Apple from cooperating to take down Nvidia, and their locked-down software stopped the industry's first high-speed ARM servers from reaching their audience. Nvidia capitalized on both opportunities to the tune of trillions in valuation.
Without Khronos involved, I don't think that Apple has the buy-in to create a real industry-scale CUDA alternative. At this point, it might just be most profitable to support CUDA in macOS and give the people what they want.
c7b 20 hours ago [-]
It's not even anything new, it's basically the mobile version of the DGX Spark. The two chips (N1X/GB10) are pretty similar in terms of architecture and specs. I don't get why this seems to be getting so much attention now.
But I like it. It's a copy of Apple's SoC design philosophy, same as AMD's Strix Halo, which I always thought was really cool both for laptops and home PCs. NVidia's traditional consumer cards pull way too much power and are too noisy to comfortably put them in a living or office environment.
grahamburger 21 hours ago [-]
You can do a lot with existing devices in a medium to decent gaming PC (or probably phone/laptop, I haven't tried.) I think HN tends to skew toward only thinking of LLM as useful for coding, but they are very useful for many non-coding things, and existing local LLMs are quite capable. I imagine it won't be long before apps with LLM-based features will try to run locally first and fall back to cloud LLMs just to save token costs. Actually I'd be surprised if some apps aren't doing this already.
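The "local first, cloud as fallback" idea could look something like the sketch below. The backends are plain callables so nothing here depends on a particular SDK; `local`, `cloud` and `is_good_enough` are hypothetical names for illustration, and a real app would need a better quality check than "non-empty text".

```python
from typing import Callable, Tuple

Backend = Callable[[str], str]


def _non_empty(text: str) -> bool:
    """Placeholder quality check: accept any non-blank answer."""
    return bool(text.strip())


def local_first(prompt: str,
                local: Backend,
                cloud: Backend,
                is_good_enough: Callable[[str], bool] = _non_empty) -> Tuple[str, str]:
    """Try the local model; fall back to the cloud on error or unusable output.

    Returns (source, text) where source is "local" or "cloud".
    """
    try:
        answer = local(prompt)
        if is_good_enough(answer):
            return "local", answer
    except Exception:
        # Local model missing, out of memory, timed out, etc.: use the cloud.
        pass
    return "cloud", cloud(prompt)
```

Only the prompts that fall through to `cloud` cost tokens, so the saving scales with how often the local model's answer passes the check.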
spamizbad 21 hours ago [-]
Might be aimed at people who spec out the $5100 Macbook Pros with M5 Maxes and 128GB.
spullara 21 hours ago [-]
definitely! it has the advantage that it can run CUDA kernels but on the other hand it has lower memory bandwidth and probably loses a token/s fight for many LLMs.
mschuster91 21 hours ago [-]
> In that case, this goes against Anthropic and OpenAI's business models. Which is a double whammy after Jensen Huang's recent comment about how agentic coding will only increase demand for software engineers, not reduce it.
The writing is on the wall, neither Anthropic nor OpenAI are anywhere near close to sustainability and if one or, worse, both fail the entire demand bubble for NVDA crashes.
It's smart to set up alternative destination markets while they can do so in peace.
minraws 1 days ago [-]
Awesome, won't be buying it at all at current prices, but once they calm down I would very much like to get one.
Around 2-3K USD something with a good GPU + CPU + 128GB of integrated RAM is just going to be an awesome experience.
Considering Mac options are north of 5K+ even on a regular day.
Tiberium 1 days ago [-]
DGX Spark is $4700, so I kind of doubt that RTX Spark's top configs will be cheaper than that.
KeplerBoy 1 days ago [-]
The DGX also contains the 200 GbE networking and linux support.
fmajid 1 days ago [-]
The ConnectX 7 2x200 Gbps networking card in the DGX Spark alone is worth $700
KeplerBoy 1 days ago [-]
To be fair the connectx-7 in the spark can't even push 2x200 Gbps since it is connected via 4 pcie lanes.
fmajid 7 hours ago [-]
Yes, ConnectX-7 itself is capable of 32x PCIe5 lanes, but the lane limitations of the GB10 SoC/chipset throttle it:
Technically it's connected via 8 PCIe gen 5 lanes (two 4x connections), allowing ~100Gbps per port.
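Back-of-the-envelope check of the lane math, in Python. This only uses the PCIe 5.0 signalling rate (32 GT/s per lane) and 128b/130b encoding; packet and protocol overhead are ignored, so real throughput lands somewhat lower, which is roughly consistent with the ~100Gbps per port figure.

```python
PCIE5_GT_PER_S = 32.0      # PCIe 5.0 raw rate per lane, in GT/s
ENCODING = 128 / 130       # 128b/130b line encoding efficiency

def pcie5_gbps(lanes: int) -> float:
    """Link bandwidth in Gbit/s after encoding, before packet/protocol overhead."""
    return lanes * PCIE5_GT_PER_S * ENCODING

for lanes in (4, 8):
    print(f"x{lanes}: {pcie5_gbps(lanes):.0f} Gbit/s")
# prints:
# x4: 126 Gbit/s
# x8: 252 Gbit/s
```

So a single x4 link tops out well under one 200 Gbps port, and even both x4 links together cannot feed 2x200 Gbps.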
KeplerBoy 1 days ago [-]
Thanks for the correction. I should have looked it up; I only remembered it being somewhat odd.
Tiberium 1 days ago [-]
Laptops will also have to contain a much tighter configuration, display, keyboard, camera, etc ;)
minraws 1 days ago [-]
there is desktop variant as well
minraws 1 days ago [-]
isn't dgx ai first and rtx prosumer first? I think it will be cheaper longer term, just not atm with component inflation
thot_experiment 22 hours ago [-]
Love seeing AMD forcing Novideo to catch up for once rather than the other way around.
timpera 2 days ago [-]
We'll need to wait for the benchmarks, but this looks great! Windows 11 ARM64 is already amazing, and if these really are an upgrade from the Qualcomm chips we're going to have even better laptops on the market.
aseipp 1 days ago [-]
The GB10 itself is pretty good and I love using mine for broad Linux development. But it's too expensive for consumer level pricing, and even for the "prosumer" the price is pretty stiff. Even if they dropped the CX-7 and halved the RAM and shipped a smaller hard drive, would it be below, say, $2500 USD? I guess we'll see, but this variant is coming out pretty late so maybe it's just best to wait for the 2nd generation.
h14h 1 days ago [-]
This feels like getting a foot in the door to ensure Apple doesn't entirely eat Nvidia's lunch if AI inference workloads start to shift from cloud to local.
With MLX, Apple is building an answer to CUDA, and if people start switching from ChatGPT & Claude to some app that runs on their M5, suddenly Apple starts to look like Nvidia's biggest competitor.
If Nvidia doesn't have a pathway towards getting hardware into the hands of consumers, it could be a really difficult road ahead for them.
selicos 16 hours ago [-]
Apple seems to still own the creative space. If those tools are able to run local models for any AI workflows suddenly anthropic/etc could lose a massive segment. Or at least demonstrate to others wanting a slice of the cloud AI profits it can be done.
I'm here for it. Local models can do a lot of what I need at almost no cost, plus the fun of making them work better or building a new system to handle that aspect of my home lab. A Strix Halo system may not be amazingly fast but at 128gb of RAM it can keep up with most open models worth exploring.
Based on June 1 Copilot Pro plan premium token burn and cost, unless you REALLY know how to use cloud AI efficiently and are tooled up to do so a local LLM on hardware you may already own is very appetizing.
I converted a lot of work today to a 6.5gb local LLM on a 12gb GPU and no, it's not as good. But it is 'free' or at least feels that way, especially when I need to redo something and my copilot premium request % doesn't change.
analogpixel 23 hours ago [-]
Oh, btw, we are only making 10 of these, the rest of our capacity has been sold off to the large AI firms.
boredatoms 2 days ago [-]
Is this just dgx spark, but a laptop?
pella 2 days ago [-]
yes, same chip
+ Windows
+ Screen
- ConnectX-7 Smart NIC
pedrocr 1 days ago [-]
+ battery too. I've wondered if a mini pc with battery would make for a good form factor. I often move between places where I have a desk with a screen but still use a laptop because I want to just suspend and resume. If a mini pc had a small battery just to hold its RAM while suspended I could move between places and just plug in a single USB-C cable and have my full workstation up and running. The thermals could be better than in a laptop and having a built-in UPS better than with a desktop. But last time I checked no one packaged things like that.
pbadams 1 days ago [-]
There's the Khadas Mind series of mini pcs. They have a proprietary docking interface though. Agree that it would be great if this form-factor was more common.
throw0101c 24 hours ago [-]
> - ConnectX-7 Smart NIC
Can the link type be toggled between Ethernet and Infiniband? (Don't think I've ever heard of a laptop with IB.)
zer0zzz 1 days ago [-]
What about the desktop version? It seemed like it is not a dgx since it has the CPU cores done by mediatek
cpgxiii 1 days ago [-]
The DGX Spark/GB10 has CPU cores from Mediatek (in a pretty odd cluster configuration, too).
Bulat_Ziganshin 1 days ago [-]
They didn't say that Mediatek made the cpu cores. Grace is NVidia's own cpu arm cores. I bet that Mediatek made other parts of SoC necessary for a notebook
Well, MediaTek actually said they made most of the SoC in fact. But the actual CPU cores themselves are all but certainly off-the-shelf Cortex parts, since MediaTek doesn't have a custom core design at all afaik.
wtallis 1 days ago [-]
NVIDIA hasn't done custom CPU cores for anything they've yet branded "Grace". The original Grace data center CPU (paired with the Hopper data center GPU) used ARM Neoverse V2 cores. The "GB10" chip shipped in DGX Spark and announced here for RTX Spark uses Cortex X925 and Cortex A725 CPU cores.
Physically, NVIDIA did the GPU chiplet and Mediatek did the other chiplet that has the CPU, DRAM controller, and IO.
pipyakas 1 days ago [-]
desktop is GB300, not GB10 like Spark
wtallis 1 days ago [-]
GB300 is nominally "available" in desktop form factor workstations priced around $100k. That's a few orders of magnitude away from the ordinary desktop PC market that consumers participate in.
zer0zzz 3 hours ago [-]
Yeah this is why it’s important to get something with similar programmability for less money. I don’t need the power of a gb300 just to do experiments with tma or “tcgen05” instructions
KeplerBoy 1 days ago [-]
they also announced a GB10/N1X windows desktop mini PC.
dvhh 14 hours ago [-]
I might be a niche user, but what I am mostly looking forward to in an ARM laptop is for it to be silent, with preferably passive heat management or as little active heat management as possible (all-day battery usage is a given).
Schlagbohrer 12 hours ago [-]
They show premium skinny laptops which will have this. I wonder how much the lack of heat dissipation capability will limit its compute capabilities?
In university a friend of mine had a large hardcover book she kept in her dorm freezer. I asked her WTF she had a big book in there. She said it was for minecraft - she'd place her laptop on top of it while playing. The book was cold but also quite dry. I wonder how well it worked.
bananadonkey 11 hours ago [-]
I had a high end spec gaming laptop with 0 airflow back in the 2000s, and used to raid dorm LAN party freezers for frozen meat + towel = heatsink.
I was lucky that iteration 1 (sans towel) didn't ruin the laptop...
easygenes 12 hours ago [-]
Looks like RTX Spark desktop is the DGX Spark desktop, minus the expensive 200GbE Connect-X NIC. Only since the DGX Spark released, memory and nand prices have jumped, so it will likely retail for the same amount as the DGX Spark did on release (which has since gone up significantly).
rsolva 1 days ago [-]
Will NVIDIA get a monopoly on providing laptops and desktops with a lot of RAM going forward?
nycdatasci 1 days ago [-]
No. You can get a PowerBook today with 128 GB ram.
Bosgame M5 AI Mini Desktop Ryzen AI Max+ 395 96GB variant €1.800,95 (sold out)
128GB+2TB variant €2.401,95 (in stock)
I have the latter, it's fantastic
artificialLimbs 1 days ago [-]
$600 for 32GB ram seems bananas
riknos314 22 hours ago [-]
Unfortunately in the current market 32GB of ddr5 seems to run about $400 as 2x16gb DIMMS, and even more for 1x32GB DIMM (higher density chips are more expensive). So $600 really isn't much over market price, especially considering strix halo uses 8000MHz ram instead of the typical 6000 found in consumer dimms.
Running local agents 24/7, I get that it's a powerful CPU or GPU or whatever it is, but still, isn't it going to be constantly loud and 95C hot, that can't be good for the laptop if it's like that 24/7
nokeya 1 days ago [-]
It was wintel (windows + intel) before. This will be what? Windia? Wintek?
Geee 1 days ago [-]
Winvidia
MoonWalk 1 days ago [-]
Nvideous
igravious 1 days ago [-]
Nvidiows
grassfedgeek 1 days ago [-]
Nvindows
smcleod 9 hours ago [-]
Pretty low bandwidth and given how terrible Nvidia's software has been with the DGX et al., I would not put much faith into this.
jqbd 2 days ago [-]
They made their own x86 CPU? Or was that part outsourced? Ok ARM MediaTek.
try-working 2 days ago [-]
ARM cpu made by MediaTek.
zamadatix 2 days ago [-]
But probably worth clarifying it's not a typical "MediaTek CPU" some might assume by that. It has Nvidia's customized ARM CPU implementation + their GPU.
TiredOfLife 1 days ago [-]
This has off-the-shelf Arm cores.
Bulat_Ziganshin 1 days ago [-]
I think that Nvidia made GPU and CPU, and Mediatek made other parts of SoC necessary for a notebook. Grace is Nvidia's own CPU ARM core
SomeHacker44 1 days ago [-]
I believe Grace is an ARM designed core. Vera is the nVidia designed core.
modeless 15 hours ago [-]
Single-core CPU performance is going to be fully 20% slower than Snapdragon X2 Elite Extreme. People are sleeping on Qualcomm's latest. It's the only chip out there to approach Apple's single core CPU performance and power efficiency.
KetoManx64 23 hours ago [-]
I really hope these take off and succeed and they support Linux. Qualcomm is seriously holding back the Linux ARM adoption with their continuous missteps.
jmyeet 1 days ago [-]
I didn't see this in the article but elsewhere I've seen the memory bandwidth quoted as 600GB/s [1]. For comparison:
- 5090/6000 Pro: 1792GB/s
- 5080: 960GB/s
- 5070Ti: 896GB/s
- M3 Ultra: 819GB/s
- DGX Spark: 273GB/s (less than an M5 Pro at 307GB/s)
Memory bandwidth isn't everything but it will cap inference rate pretty heavily. Also, the M3 Ultra figure is for an almost 2 year old Mac Studio. It's widely expected that it'll be refreshed in Q3 with a likely M5 or M4 Ultra with >1000GB/s. I really hope Apple realizes what a market opportunity it has here.
The above shows just how good value the 5090 really is. It's basically an RTX 6000 Pro with less RAM (and ~12% fewer CUDA units), which is a ~$10k card, for 20-30% of the price. This also demonstrates how NVidia uses VRAM for market segmentation. As an aside, the true data center cards (eg B100, H100) use HBM memory at ~3.2TB/s.
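To make the "bandwidth caps inference rate" point concrete, a minimal Python sketch. It assumes a dense model at batch size 1 where every generated token has to stream all the weights from memory once, so tokens/s can't exceed bandwidth divided by model size. The bandwidth numbers are the ones listed above; the 20 GB model size is a made-up example, and real speeds will be lower (and MoE models or batching change the picture).

```python
# Bandwidth figures (GB/s) as quoted in the list above; 20 GB of weights is a hypothetical example.
BANDWIDTH_GB_S = {"5090": 1792, "M3 Ultra": 819, "DGX Spark": 273}

def max_tokens_per_s(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Upper bound on decode speed if each token streams all weights from memory once."""
    return bandwidth_gb_s / weights_gb

for name, bw in BANDWIDTH_GB_S.items():
    print(f"{name}: at most ~{max_tokens_per_s(bw, 20):.0f} tok/s for 20 GB of weights")
```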
Spark memory bandwidth is ~300 GB/s. Internal bandwidth is 600 GB/s but that doesn't matter.
dist-epoch 23 hours ago [-]
128 GB at 600 GB/s for this versus 32 GB at 1800 GB/s for 5090.
This is much better value than 5090, you can run much bigger models.
jmyeet 22 hours ago [-]
Here's a pretty detailed breakdown of this [1]:
> tl;dr - For software development, Qwen3.6 27B, 5090 gives you ~3x speed over M5 Max, letting you plow through code, while M5 Max gives you ~4x memory, letting you use higher quantization and bigger context. Which would you choose and why?
I've read a number of things from which the consensus seems to be that yes you can run a larger model and/or have more context with a 128GB+ Mac but the performance gap is still massive and with current hardware we're still talking about inference rates that matter. By this I mean there's a big difference between 10tok/s vs 30. Once we get to a point where it's 100 vs 300, it won't be as big of a deal, a bit like FPS in games.
Oh and there are similar concerns with the DGX Spark [2].
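For the memory side of that trade-off, a quick Python sketch of how big the weights are at different quantization levels. The formula is just parameters times bits per weight; it ignores KV cache, context and runtime overhead, so treat the "fits" answers as optimistic. The 27B size matches the model in the quote, and the bit widths are illustrative assumptions.

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight footprint in GB (1 GB = 1e9 bytes); ignores KV cache and overhead."""
    return params_billion * bits_per_weight / 8

for bits in (4, 8, 16):
    size = weights_gb(27, bits)
    print(f"27B @ {bits}-bit: {size:.1f} GB | fits 32 GB: {size < 32} | fits 128 GB: {size < 128}")
# prints:
# 27B @ 4-bit: 13.5 GB | fits 32 GB: True | fits 128 GB: True
# 27B @ 8-bit: 27.0 GB | fits 32 GB: True | fits 128 GB: True
# 27B @ 16-bit: 54.0 GB | fits 32 GB: False | fits 128 GB: True
```

The 8-bit case fits in 32 GB only on paper, since there is almost nothing left for context.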
Wow, specs seem amazing. So want one of those once you can run linux on it, especially for the ability to run local models on a laptop.
atilimcetin 20 hours ago [-]
With 128GB ram, the price tag would be pretty high. And lots of applications do not work on Windows on Arm. Even though Microsoft provides something like Rosetta 2 for Windows, x86 would still be the most popular architecture for Windows for a looong time.
That said, I think this product is kinda dead on arrival.
dom96 1 days ago [-]
I’m getting more and more convinced that we will end up running LLMs on our personal computers. Which makes me wonder where Anthropic/OpenAI's moats will come from.
VMG 1 days ago [-]
Convince me
1. in order to run LLMs, especially the best ones, you need complicated devices which are expensive
2. if you buy one for your personal use, you are probably not going to utilize it all the time and it will be idle a lot
It seems to me that it will always be more economical that the LLM-running devices are in a datacenter where it is easier to make sure they are always utilized
OtherShrezzing 1 days ago [-]
If a model is substantially better than most humans at most tasks, the human isn't going to be able to perceive the difference between Claude Opus 7.7 and 8.7. Humans at some point aren't going to be able to perceive the difference on benchmarks either, because they are going to get wildly abstract.
AI vendors are really going to struggle to shift tokens far beyond the frontier of human capabilities. It's reasonable (not guaranteed) to assume that, if the trend of frontier models (doubling capabilities on benchmarks every n months) holds, then the same trend will hold for local models, and those local models will meet and exceed the perception frontier. This would mean a human cannot tell the difference between Mistral-Open-2030 and Claude Opus 2030.
That's a bunch of "ifs", but there's nothing exceptional about those "ifs". They're basically the scenario if nothing changes between now and ~2030 with regards to capabilities trend attainment.
Mordisquitos 1 days ago [-]
The trend over the past three decades of personal computing has been for devices to become exponentially more powerful regardless of the actual computing needs of users. The excess computing power has famously been requested by projects such as SETI@Home and Folding@Home, and been exploited by bad actors for crypto mining. The most basic laptop today used only for web browsing and word processing would be a powerful workstation 20 years ago, when the most basic laptop was also used only for web browsing and word processing (and arguably for more things, as it was all mostly local software).
There is no ceiling to the power of consumer hardware. If it's cheap enough, it will be bought.
VMG 23 hours ago [-]
most crypto mining has moved to specialists, even where there were deliberate attempts to make it ASIC-resistant
SETI@Home is a very niche use case
and web browsing still happens by connecting to data centers and server farms, not by connecting to another laptop
Mordisquitos 23 hours ago [-]
I think you missed the point of my message. Web browsing still happens by connecting to data centres, so why are consumer laptops so much more (unnecessarily) powerful today than they were 20 years ago? All the more so given that, at that time, you were running MS Office locally rather than using Office 365 or Google Docs remotely.
fg137 1 days ago [-]
This.
Even two or three years ago people were pointing out "The ChatGPT subscriptions you can buy with $2000 give you much more compute than whatever home setup you come up with" on r/LocalLLM. I did my own elementary school maths and came to the same conclusion.
Yet to this day people still boast how their beefy M4 Pro/Max machine with 32+GB RAM (which is not at all a "normal person's setup" and costs $2000+) runs LLMs smoothly, and "that's the future".
Someone needs to re-learn basic maths and take a walk around Best Buy to understand what "consumer laptop" looks like.
nemomarx 1 days ago [-]
If there end up being useful workflows where you keep stuff running in the background or overnight that's one advantage, compared to a data center that might cut off your access during peak hours or etc.
Think of it like having a graphics card at home versus using a cloud gaming stream? Technically subscribing to GeForce is much cheaper up front than getting a card, but people still do that. So will the audience of people running agents at home be as large as PC gaming? I think that's kind of plausible.
VMG 1 days ago [-]
> if there end up being useful workflows where you keep stuff running in the background or overnight that's one advantage
That is not how LLMs are typically used though in my experience
> Think of it like having a graphics card at home versus using a cloud gaming stream?
Latency seems to be much more important in that use case
OtherShrezzing 1 days ago [-]
>2. if you buy one for your personal use, you are probably not going to utilize it all the time and it will be idle a lot
I think consumers are primed for that type of behaviour though. I have an iPhone on my desk. It has something like 2-3tflops CPU+GPU, which is double that of the largest super computer on earth when Jurassic Park came out, and is probably more computing power than existed on earth when I was born in the 80s.
I use this device for around 1hr per day to write text messages.
KeplerBoy 1 days ago [-]
It's inevitable. What might be a prosumer device today priced at 4000$ will be a regular consumer device in 10 years and models only get better.
Local models today are fine for a lot of mundane tasks and will continue to be so. The use cases where paying for frontier models is worth it, will continue to shrink for folks not doing frontier work.
parineum 1 days ago [-]
> models only get better.
Or stall. Acceleration has been slowing significantly and gains seem to be tied to huge memory footprints.
davebren 1 days ago [-]
Uploading your IP to the biggest IP thieves in human history seems bad idk.
2. Eventually we'll get to where local models that don't have sycophancy and slot-machine mechanics trained into them will perform better.
Guillaume86 1 days ago [-]
3. If your device runs on battery, why not use a relatively cheap network call in place of a very power hungry local inference call?
mejutoco 23 hours ago [-]
Privacy and offline use would affect the choice as well. How niche are they, I am not sure.
nerbert 1 days ago [-]
Just like cloud vs private server. It'll be based on use case.
adrian_b 1 days ago [-]
While I agree with that in principle, it is very worrisome that the prices of personal computers, especially of any personal computer that is not a big desktop, have been increasing continuously.
The price of a mini-PC with Intel Panther Lake is at least double in comparison with the price of a mini-PC with Arrow Lake H having similar specifications, and I am talking about barebones, before adding DRAM and SSDs, whose prices have risen even more.
The rise in prices is somewhat obfuscated by the confusing names of CPUs, i.e. some old and new CPUs may seem to be at similar prices and they have similar names, but the new CPU actually corresponds to a lower segment of the market, by having e.g. a smaller GPU and a lower clock frequency, while the CPU model that really corresponds to the old is named such that it seems to belong to the class corresponding to its present price.
As a concrete example of this obfuscation, which may confuse the buyers of laptops or mini-PCs, I have an ASUS 15 Pro with "Core Ultra 5 225H". If I would buy an ASUS 16 Pro now, the corresponding CPU model, the cheapest which is not worse than what I have, would be "Core Ultra X7 358H".
fg137 1 days ago [-]
The best open weight LLMs don't run on this computer, or almost any consumer grade computer. Even the memory requirement for Gemma 4 is out of reach for most consumers (by which I mean those who are not on HN). Unless there is some magic that would make high quality LLMs consume no more than 8GB RAM which makes them usable on a 16GB laptop (which is the norm these days), "local LLM for personal computing" is mostly just a myth.
xnx 1 days ago [-]
We're hitting the atomic limits of what's possible with minimum feature size in silicon. It's also very hard to remove 1 kW of heat from a laptop, let alone do it quietly or on battery.
itake 1 days ago [-]
My biggest concern with local LLMs is there just isn't enough RAM or HD space to run multiple models, and the generic LLMs are too generic...
xdertz 1 days ago [-]
I find it hard to see how that would ever be economical. LLMs need very expensive power hungry chips and datacenters have
- bulk discounts
- cheaper electricity
- high utilisation to spread the costs among many users
I don't see how PCs could ever compete against it. Most users AI demands would probably result in >90% idle time on the GPU.
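The utilisation argument can be put into a small Python sketch. All numbers are invented for illustration (a $4000 box, 4-year lifetime, 10% vs 70% busy); the function only amortizes the purchase price over the hours the hardware is actually doing work, plus an optional electricity term.

```python
def cost_per_busy_hour(
    hardware_usd: float,
    lifetime_years: float,
    utilization: float,        # fraction of time the device is actually busy, 0 < u <= 1
    power_kw: float = 0.0,     # draw while busy (hypothetical)
    usd_per_kwh: float = 0.0,  # electricity price (hypothetical)
) -> float:
    """Amortized hardware cost plus energy, per hour of real work."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    busy_hours = lifetime_years * 365 * 24 * utilization
    return hardware_usd / busy_hours + power_kw * usd_per_kwh

home = cost_per_busy_hour(4000, 4, 0.10)        # mostly idle personal machine
datacenter = cost_per_busy_hour(4000, 4, 0.70)  # same hardware, shared by many users
print(f"home: ${home:.2f}/busy hour, datacenter: ${datacenter:.2f}/busy hour")
```

With the same hardware price, the mostly idle machine costs 7x more per hour of actual work, before bulk discounts or cheaper electricity are even considered.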
pjmlp 1 days ago [-]
First we need to actually still be employed, and have them at affordable price.
eigenspace 1 days ago [-]
If we do, it won't be on this chip.
wasmitnetzen 1 days ago [-]
It'll be just another round of the client-side vs server-side processing rounds. We've been through them, we will keep going through them.
notepad0x90 1 days ago [-]
i think a lot of that is for government and enterprise use. even for personal computers themselves (i.e.: laptops) they're usually loss leaders, they don't turn profit. You can run a server (and many do) on laptops, but that didn't replace cloud services or server hosting. You can't store enormous amounts of data on your laptop/phone for the llm to use, or access tools the app dev wouldn't want exposed on untrusted devices.
The whole replacing people angle is just the short term use case the more ghoulish executives are thinking about. In practice, lots and lots of new use cases have been made possible by LLMs. A lot of which can be done locally. But whatever capacity you have locally, they can have more of and for cheaper, and they manage the model instead of you doing it yourself. I think you put it nicely though, their moat will be thinned, and I doubt they'll be as profitable as their funding suggests, but at the same time the demand for them won't go away either. I don't know if OpenAI and Anthropic will be viable, but I'm nearly certain Deepseek is.
The tipping point will be power usage: if a local llm can run the same workload for less power, that would be a game changer. Nvidia might get decimated, but even Google and others have moved on from GPUs already; they have faster and more power efficient TPUs. Add to that network bandwidth and availability issues, and their moat remains. Also consider that even for graphics capabilities, user devices just don't have a consistent spec to make things like widespread 3d graphics and webgl usage viable. Someone's cheap android phone will never run a local llm reliably, same as it won't run a 3d game. Even if they have a high-end iphone, network providers aren't always as performant as they are in western countries, and then there are people that won't want to install your app or local software, and then browser based exposure of the capability to sites, which will have similar hardware spec issues, OS instabilities, competing tabs, etc...
hgoel 1 days ago [-]
Looks like the MSI one might be a 2-in-1, if it has good stylus support I might have a good candidate for an upgrade, though my ~3-4 year old Galaxy Book is holding up alright for now.
amelius 21 hours ago [-]
"Notify me" -> i.e. when we finally have the DRAM to build this SoC.
donkeylazy456 1 days ago [-]
hope nvidia supports drivers better than qualcomm. also hope they support linux soon.
orthoxerox 1 days ago [-]
Well, it was only a matter of time, since both AMD and now Intel are switching to APUs. Nvidia could either cede the desktop GPU market to them, going all-in into AI datacenter chips, or it could challenge them.
Maybe the Nth time's the charm and Microsoft+Nvidia will manage to make Windows on ARM a viable platform.
synergy20 19 hours ago [-]
can it run linux as I am not a windows nor a macos user
mastermage 2 days ago [-]
Is this finally Macbook Chip Efficiency coming to Windows or will it just be shittier compatibility for slightly better battery life?
zer0zzz 1 days ago [-]
I heard leaked geekbench putting it behind the m3, which is a couple of years old now.
All I care about is if I can get one of these for significantly less than a dgx and get Linux on it for some cuda Blackwell kerneling.
perarneng 23 hours ago [-]
No thunderbolt is a big no for me. It's one of the greatest features of the MacBook Pro, one that makes it dockable and expandable as a desktop with a good thunderbolt dock.
Almondioco 23 hours ago [-]
That's also possible with usb-c.
gillesjacobs 23 hours ago [-]
With some caveats, you wouldn't be able to connect two 4k monitors to a dock without TB5.
kllrnohj 23 hours ago [-]
USB 4 v2 has the same display capabilities as TB5. In fact, TB5 gets its display capabilities from USB 4 v2
ma2kx 1 days ago [-]
Unified RAM means it's soldered to the mainboard, right?
I'm not sure if I like this. Sure for a laptop this might be not a big problem but if this ARM ecosystem is a success it will spread to desktop computers and I fear we could lose the existing modularity.
Skinney 1 days ago [-]
"Unified" means that it's shared between CPU and GPU, I believe.
But yes, it tends to be soldered on.
Bulat_Ziganshin 1 days ago [-]
No, but LPDDR means soldered, there are no LPDDR dimms
debugnik 1 days ago [-]
There's LPCAMM2, but it's very recent. The Framework Pro laptop supports it, for example, although only on the Intel variant.
phcreery 24 hours ago [-]
I think unified RAM means soldered to the SoC, which is in turn soldered to the mainboard
Rekindle8090 1 days ago [-]
[dead]
throwa356262 1 days ago [-]
I have no idea how powerful or power efficient these guys are, but this seems to be the first step in a bigger push towards Windows on ARM (without losing gaming).
I think more announcements will follow soon from other companies.
fmajid 1 days ago [-]
My DGX Sparks are the first and only devices I have with 200W USB-C PD. Low power by AI workstation standards, but intolerable in a laptop.
MoonWalk 1 days ago [-]
Intolerable? Why?
fmajid 7 hours ago [-]
Your lap cooking. They generate enough to noticeably heat a room.
ternaryoperator 1 days ago [-]
Battery life
MoonWalk 24 hours ago [-]
The comment I'm replying to appears to be talking about power DELIVERY, not consumption. Why would extra power-delivery capacity be intolerable?
kllrnohj 23 hours ago [-]
The DGX Spark doesn't have a battery. If it comes with 200W delivery (actually 240W), it's because it plans on consuming close to that amount.
Although I'm kinda surprised the DGX Spark used USB-C at all for power instead of just like a DC jack or whatever. But whatever.
jauntywundrkind 1 days ago [-]
It's worth noting that Nvidia power management on Linux has been abysmal. There also aren't any of the usual power management options to see how much power things are using, which is quite atypical for a modern system.
Nvidia really threw stuff over the wall with the DGX Spark release. They don't seem to really care. I sort of think they'll spend a little more time on Windows, where there's no pesky upstreaming to do and they can just do whatever, but man, it's such typical hubris from Nvidia to build such an expensive box with good chips but make it basically unsupportable and roasty hot all the time.
You also generally have to run an ever more stale two year old Ubuntu derived DGX OS to get anywhere, with bespoke kernel and drivers all. None of it is well supported, none of it just works like a comparable PC or even well behaved arm system would.
As for other ARM, there were rumors AMD Sound Wave is/was going to be a ~10W arm APU, but there hasn't been much said about it lately. Honestly given the ram crunch, it's maybe just not worth trying to build a system with a cheap core, if the rest of your costs are going to stay so stratospheric.
https://www.techpowerup.com/341848/amd-sound-wave-arm-powere...
awesomeusername 18 hours ago [-]
I've been daily driving a dgx spark. Once you start there is no going back.
NVIDIA nailed it
not_a_bot_4sho 18 hours ago [-]
Mind sharing more details about your use and experience with DGX? I'm just curious
zmk5 2 days ago [-]
I really like this, but I think the reason Apple Silicon took off was that Apple sort of forced devs to support ARM. Not sure if Microsoft can do the same for Windows…
supersing 1 days ago [-]
Developers weren’t really “forced” to support ARM. They simply recognized that all future Macs would be ARM, whereas most new PCs would continue to run on x86. So the incentive to adopt ARM was much weaker on the PC side.
ryukoposting 14 hours ago [-]
> Developers weren’t really “forced” to support ARM. They simply recognized that all future Macs would be ARM
One might call this "forcing"
trvz 1 days ago [-]
They didn’t though. Rosetta 2.
ptole_my 1 days ago [-]
rosetta is a relatively short term solution. will be supported up to macOS 28
aa-jv 1 days ago [-]
Microsoft can do the same for windows - they need to address the fat bundle solution that Apple came up with, but for Windows, though ..
chris_money202 1 days ago [-]
It’s a step in the right direction, but there’s still a long way to go in terms of smaller LLMs' abilities and hardware costs
tonoto 1 days ago [-]
What is this product anyway? Is it a general purpose CPU or is it specifically designed for MS Windows? Nvidia stepping back from the open source?
"Introducing the NVIDIA RTX Spark™ Superchip. The fusion of NVIDIA AI and RTX graphics in a single chip redefines Windows PCs and delivers amazing creating, AI development, and gaming—on the slimmest, most beautiful RTX laptops ever and small, ultra-efficient desktops."
mingus88 1 days ago [-]
It’s Nvidia attempting to compete with Apple’s M-series
Almondioco 24 hours ago [-]
It's Nvidia's attempt to gain additional market share, and an expected one as well. If the whole ecosystem is built around Nvidia and it's the easiest way of running stuff, Nvidia offering more enterprise infrastructure allows companies to just buy directly from Nvidia.
Nvidia is also very, very rich and pushes the boundaries of stuff. They stopped waiting for industry standards. You can see this in their networking stuff. All Nvidia.
The next logical step (at least now, not something I had thought about) was their own CPU for their GPU racks/clusters/systems.
Now they have everything anyway, RTX Spark is just logical.
I don't think it's specifically targeted at Apple at all.
Apple has like 10-15% market share and just because some IT nerds buy themselves a mac mini doesn't mean much.
Plenty of them actually just run openclaw without local models. Something which surprised me quite a lot.
But I have two 4090s at home. They consume a lot of power, and I had to research the proper mainboard model and had to mod one 4090 to use water cooling because they run too hot.
Their Spark setup was at 3k, way too expensive for normal people. If they can get this down and sell more, great for their ecosystem (strengthening it) and getting more money from people.
It does surprise me though that they have enough capacity for this chip and aren't just putting everything into Rubin, but perhaps the build-out has slowed down a little or they are starting to diversify already for economic safety.
FuriouslyAdrift 23 hours ago [-]
Their target competition is the AMD Strix Halo, which is eating the Spark's lunch right now.
giancarlostoro 24 hours ago [-]
Also sounds like they are ditching the discrete GPU altogether.
dawnerd 1 days ago [-]
All the news articles in my feed mentioned Nvidia reinventing personal computing which is laughable given the specs are worse than the m series. I’m guessing they saw how well Apple devices were selling and rushed to get something similar out so they can ride the hype train and have something to fall back on if ai DC spend slows down.
AlotOfReading 24 hours ago [-]
There's a lot of companies trying to support datacenter systems like GH and Rubin that don't have dev hardware remotely resembling it. M-series isn't a good option, speaking from the personal experience of currently using one for this exact purpose.
hasteg 1 days ago [-]
I wouldn't say it's Nvidia stepping back from open source... if anything this is doubling down on it, as one of the selling points of this is the 128GB of unified memory which will allow for hosting local models (i.e, nvidia's new open model they just released). I guess it's pretty cool, I'm a big supporter of local LLMs/open weight models so seems enticing to me, although I'm not sure this will be super applicable to a lot of regular consumers. Seems like a pretty niche product.
wmf 23 hours ago [-]
Linux works but MS is just paying them not to mention it.
bch 13 hours ago [-]
"The performance is off the chart!" <basic chart is displayed to illustrate(?) the point>
More seriously, obviously a ton of work in an incredibly competitive space, and an incredible machine (without getting into competitive comparisons/minutiae). Was watching a techtechpotato[0] quick post pre-launch about "why is this even being tried?", which was also interesting. What an age we live in.
Great! More pressure on fabs, price of standard GPU will again rise.
Guess I need to postpone my gamer PC renewal to end 2030.
cultofmetatron 22 hours ago [-]
can these do training or only inference? currently working on learning machine learning and I'd love to have a physical machine I could aim to build real workloads on in a few years.
LogicFailsMe 22 hours ago [-]
They're Turing complete. What else do you need?
Npovview 21 hours ago [-]
There is a reason why Google has tpu8i and tpu8t
porphyra 22 hours ago [-]
technically in order for something to be turing complete it needs infinite memory
LogicFailsMe 21 hours ago [-]
The more (memory) you buy, the more you save!
Our_Benefactors 22 hours ago [-]
It’s possible (likely, even) to have a chip fast enough for inference, but not fast enough or with enough memory to do meaningful training runs. Like the current DGX spark.
airjason 22 hours ago [-]
not for llm full training, but can do some finetuning for sure.
the_real_cher 22 hours ago [-]
I believe training is way more processor intensive than inference.
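A back-of-the-envelope way to see this, using the common rule of thumb (an approximation, not a measurement) that a dense transformer with N parameters needs roughly 2N FLOPs per token for a forward pass and roughly 6N FLOPs per token for a training step (forward plus backward). The parameter count and token counts below are hypothetical values picked for illustration; they do not describe any specific chip or model from this thread.

```python
def inference_flops(n_params: float, n_tokens: float) -> float:
    """Approximate FLOPs to run n_tokens through the model (forward only)."""
    return 2.0 * n_params * n_tokens


def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate FLOPs to train on n_tokens (forward + backward)."""
    return 6.0 * n_params * n_tokens


# Hypothetical numbers, for illustration only.
N_PARAMS = 7e9          # a 7B-parameter dense model
CHAT_TOKENS = 1_000     # tokens processed for one chat reply
FINETUNE_TOKENS = 1e8   # tokens in a small fine-tuning run

per_token_ratio = training_flops(N_PARAMS, 1) / inference_flops(N_PARAMS, 1)
workload_ratio = training_flops(N_PARAMS, FINETUNE_TOKENS) / inference_flops(N_PARAMS, CHAT_TOKENS)

print(f"training vs inference, per token: {per_token_ratio:.0f}x")
print(f"small fine-tune vs one chat reply: {workload_ratio:.0e}x")
```

Under this rule of thumb the per-token gap is modest; most of the difference comes from how many tokens a training run has to process, plus the extra memory for gradients and optimizer state, which this sketch does not model.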
locusm 14 hours ago [-]
Jensen Huang delivers the absolute worst keynotes - I wasn't ready for that level of cringe and stupidity.
agnosticmantis 1 days ago [-]
How would these compare to a MacBook Pro M5 in terms of performance and price?
lanycrost 1 days ago [-]
I'm waiting for powerful on-device LLMs; until then it's not worth it.
Hugsun 1 days ago [-]
Have you tried Qwen 3.6 or Gemma 4? They're not frontier level but certainly have their uses.
lanycrost 7 hours ago [-]
Yes, they work great for small tasks, but they're not smart and powerful enough to compete with frontier models. Hope they will become better.
PunchyHamster 1 days ago [-]
The fact they advertise it as some step forward in PCs is outright bizarre.
It's just a worse Strix Halo, as you land squarely in the middle of Windows on ARM problems.
Iolaum 1 days ago [-]
Strix Halo chips have around 210+ GB/sec GPU memory bandwidth, and announcements put the new Nvidia chip at around 300 GB/sec GPU memory bandwidth.
I'd say that is an improvement if you want to run local LLM inference. Still well below what you can achieve with Apple chips though.
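As a rough illustration of why these bandwidth figures matter for local inference: during token-by-token decoding, a dense model has to stream roughly all of its weights from memory for each generated token, so bandwidth divided by model size gives an upper bound on tokens per second. The sketch below uses the 210 GB/s and 300 GB/s figures from this comment; the 20 GB model footprint is a hypothetical value chosen for illustration, and real throughput will be lower (MoE models, which read only part of their weights per token, behave differently).

```python
def decode_upper_bound_tok_s(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Upper bound on decode tokens/s if every token streams all weights once."""
    if weights_gb <= 0:
        raise ValueError("weights_gb must be positive")
    return bandwidth_gb_s / weights_gb


# Bandwidth figures quoted in the comment above (GB/s).
BANDWIDTH_GB_S = {"Strix Halo": 210.0, "RTX Spark (announced)": 300.0}

# Hypothetical model footprint: a quantized dense model of about 20 GB.
MODEL_WEIGHTS_GB = 20.0

bounds = {
    name: decode_upper_bound_tok_s(bw, MODEL_WEIGHTS_GB)
    for name, bw in BANDWIDTH_GB_S.items()
}

for name, tok_s in bounds.items():
    print(f"{name}: at most ~{tok_s:.1f} tok/s")
```

Because the bound scales linearly with bandwidth, the ratio between two chips is the same whatever model size is assumed.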
koolkao 1 days ago [-]
Very exciting! sounds like we're finally leaving x86 behind
numron-dev 24 hours ago [-]
Yeaaaah . But at what Cost though.
t_mahmood 1 days ago [-]
After Nvidia's many years of neglecting Linux, paired with Microsoft's direct involvement? Are we going to trust them to allow installing Linux on these easily?
I don't think so.
This will most likely be a winmodem situation again.
TiredOfLife 1 days ago [-]
DGX Spark has the same SoC and ships with Ubuntu.
t_mahmood 24 hours ago [-]
Okay, but I'm still highly skeptical about trusting MS and NVIDIA.
lern_too_spel 23 hours ago [-]
It ships with DGX OS 7, which includes Ubuntu's 24.04 repos. It is not using mainline Ubuntu, and if you want to run Ubuntu 26.04, you'll have to do some work.
bigyabai 1 days ago [-]
[dead]
cyanydeez 2 days ago [-]
A competitor is already on the market and is x86: AMD AI 395+.
Benchmarks with the DGX aren't spectacular, given NVIDIA's software and CUDA lead.
I wouldn't count on this being a price/compute challenger, especially with overpriced VRAM.
porphyra 2 days ago [-]
Strix halo's 8060S gpu is very weak, and is roughly equivalent to a 4060 laptop GPU, whereas GB10's gpu is equivalent to a desktop 5070. For LLM throughput, tok/s is similar due to bottleneck by memory bandwidth, but the GB10 has 3x faster prefill. People have also been able to squeeze out much better performance on GB10 using NVFP4 and other improvements in the months after the DGX Spark launch, so don't be misled by early lackluster benchmarks. For the RTX Spark, which also targets gaming and creative applications, the 3x faster GPU is quite nice.
xyzzy123 2 days ago [-]
Or like a m4 max? This thing has <300GB/s vs the max with 550GB/s
All those CUDA cores in the sparks but they're starved for memory bandwidth.
I am still waiting for NVidia to release a system that legit beats 3090 maxxing for the home gamer...
moondev 2 days ago [-]
Spark:
OS: Windows/Ubuntu
Mbw: 300GB/s
Cuda cores: 6000
GPU accelerated containers: yes
M5 max:
OS: macOS
Mbw: 600GB/s
Cuda cores: 0
GPU accelerated containers: no
xyzzy123 1 days ago [-]
I feel like the shape of the market right now for "home lab" inference is:
The sparks are good if your ultimate plan is to spend even more on NVidia hardware in future to run your dev setups at usable speeds. Or, you're developing for a work cluster.
If you mainly want to run local models at acceptable speeds portably, buy a mac with lots of RAM. If you’re happy with non-portable / racked, buy 3090s (dense) or mac studios (MoEs). Buy newer cards if you are restricted on power or slots. If you are rich, buy a6000 blackwells.
zer0zzz 1 days ago [-]
The only question is: is it worth suffering HIP and x86? I suspect a lot of folks might like a machine that mimics their GB300 but costs less than a DGX.
Also I heard the tensor core instructions on the dgx are gimped and you’re better off with a rtx pro x000. Is that the same with these machines?
SilverElfin 2 days ago [-]
Is CUDA really a lead for long? Aren’t all the latest competitive approaches avoiding all the standard software stacks and writing deeply customized software that is very directly tied to whatever hardware they use?
And is it really a way to lock in people? With AI coding tools, isn’t it trivial to write software on top of CUDA and rewrite it to target some other hardware?
ptole_my 1 days ago [-]
yes.
no.
zer0zzz 13 hours ago [-]
What are these things going to cost? I hope not the same as a mac equivalent or as much as a dgx.
Geekbench cpu bench leaks indicate they aren’t as good as m3 at single core even.
Will they support booting into a Linux installer?
ChrisArchitect 1 days ago [-]
Related:
A powerful new chapter for Windows PCs, accelerated by Nvidia RTX Spark
The thing I think is really funny is that if this takes off, frontier model companies and datacenters will end up holding the bag, and as per usual after the last few tech hype cycles, NVIDIA will still be selling.
Eventually a lot of inference will get right-sized into something you affordably run yourself.
LatencyKills 1 days ago [-]
First:
> "Our goal is to deliver unmetered intelligence to every home and every desk with Windows," said Satya Nadella, chairman and head of Microsoft.
Then:
> However, Ian Fogg, Research Director at industry analyst firm FDM CCS Insight said the change was "likely to come with a significant price tag" and Nvidia would be targeting "those looking for workstation-class performance".
So... not every desk with Windows.
pitched 1 days ago [-]
First, make it possible. Then, expand the market. The early adopters help pay R&D for later efforts. Every desk is a good goal, even if not hit by the first doodad.
It just feels too much like what they said about Apple II and early Windows. A play at nostalgia instead of putting real thought into it.
LatencyKills 1 days ago [-]
I was an engineer at both MS and Apple, and wholeheartedly agree with you.
My question is, what happens to the people who use RTX cards for gaming? This new solution isn't meant for that. Do they need an "AI accelerator" and a gaming-centric GPU?
cryo32 1 days ago [-]
I don’t know anyone other than a very small but vocal minority who will give a shit about this.
Even in the analytics side most of the stuff is some shonky ass numpy or excel gank.
I don’t know what the market is. I just can’t see it.
netdevphoenix 1 days ago [-]
The constant deliberate conflation of LLMs with general intelligence is so grating.
atlgator 19 hours ago [-]
Anything they can do to avoid producing more 5090 FEs.
lowbloodsugar 1 days ago [-]
ARM64+GPU sure seems like the future. I'm still using my M1 and even that can handle models well, has decent graphics, M5 is a beast, and M6 must surely go even bigger on LLM compute. Now Microsoft has a compelling ARM64+GPU future too.
What does AMD or Intel have here?
throwa356262 22 hours ago [-]
Don't know about intel, but AMD has Strix Halo with unified memory and really impressive performance.
I think the future will be 50/50 x64 vs arm64 for PCs.
SilverElfin 2 days ago [-]
Some other relevant discussions and sources …
NVIDIA and Microsoft Reinvent Windows PCs for the Age of Personal AI
It all sounds good on paper. But I have trouble believing Windows can be a good platform for this. Microsoft has lost all trust after inserting ads into windows, slowly removing power user features, and exploiting every dark pattern they can. And for years, the ARM based Windows laptops have been useless due to app compatibility issues. Why would this change now? Is it priced to be a lot cheaper than Apple’s laptops? Or is this a niche product for AI developers basically?
bentcorner 1 days ago [-]
Anecdotally Windows ARM works fine for me, although to be honest most of my work is command line + browser anyway. WSL works like a treat. Steam installs and most lower end games also play fine on my ARM laptop too. Games that require kernel anticheat don't work.
I think they make a great "second device" where you have something meatier to fall back to if something doesn't quite work right. I'm not sure if it's ready to take on the "main device" role just yet. But it's a far far better experience than the Surface RT days.
__atx__ 2 days ago [-]
The "gaming" take is a strange one indeed for an ARM platform. Hopefully they (Microsoft or Nvidia?) put some real effort into the translation layer. They claim modern AAA games, but it is possible they strongarmed the developers to make them an ARM build for a few select titles...
satvikpendem 1 days ago [-]
It's clear gaming was not a major concern, it's just "good enough" for someone running AI models and occasionally wants to play some games, not made to primarily play games.
SilverElfin 1 days ago [-]
Yep. I noticed the press releases talk about all the partners they have. It seems like a desperate attempt to manufacture a consensus to invest in this new hardware instead of leaving it sort of abandoned like the other Windows ARM stuff. But the problem is that these attempts end up having a few very visible apps working on the architecture and others not actually doing anything substantial.
Sure the graphics capabilities are probably very good. But if you’re a game developer who has traditionally built on Windows on x86 chips, would you want to invest in this new chip or invest in making games for the Apple ecosystem? Aren’t there more new customers to reach in the Apple world than this new Nvidia world?
andsoitis 1 days ago [-]
> But if you’re a game developer who has traditionally built on Windows on x86 chips, would you want to invest in this new chip or invest in making games for the Apple ecosystem?
Windows and the new chip. Higher developer productivity and higher chances of a substantial audience.
satvikpendem 1 days ago [-]
Who cares about Windows, the goal is to run local AI models similar to AMD Strix Halo and Apple Silicon machines. The OS is honestly a distant last concern as long as the models work well, as you could put Linux on these too, but not sure how well wake lock works.
try-working 2 days ago [-]
Hopefully MSFT would look at this as a do or die system, and go all in on improving the user and ownership experience. Will they? Not so sure.
Gigachad 1 days ago [-]
Microsoft sees windows purely as a platform to sell AI products these days.
jfim 2 days ago [-]
That's what they're working on, in theory, with Windows K2.
fhn 1 days ago [-]
I would never trust Microsoft. Their next drama is revoking Office 2019 perpetual licenses https://www.youtube.com/watch?v=KRnno9VIZx0. It never ends with them because they know they have you by the balls.
twilo 1 days ago [-]
I trust them on a daily basis. No issues thus far..
TreeInBuxton 1 days ago [-]
A lot of the app compatibility issues on current machines are down to Qualcomm's poor drivers - the actual core bits are mostly okay.
sorry_outta_gas 2 days ago [-]
[dead]
yobid20 19 hours ago [-]
how will this compare to having an rtx pro 6000 for inference? (not training)
exabrial 22 hours ago [-]
Yeah, there is zero chance I'm ever running Windows ROFL.
However, I'd jump from Mac in a Heartbeat if this supported Linux.
toksum 1 days ago [-]
[flagged]
lucamark 1 days ago [-]
[flagged]
sylware 1 days ago [-]
[flagged]
twobitshifter 1 days ago [-]
Right, the export controls are only forcing Chinese AI to innovate, build their own fabs, and make training and inference more efficient. The end game of this will be that NVIDIA chips won’t be wanted because you can get a $50 Chinese chip running a ternary model that is competitive with Claude in English and is much better in Mandarin.
adrian_b 1 days ago [-]
The US government has failed to learn from its own history.
60 years ago the US government had forbidden the export of fast computers to France, with the hope that this sanction will prevent the French from developing thermonuclear bombs.
The result was that the French state (which at that time was led by de Gaulle, not much less autocratically than China) subsidized some of their computer manufacturers, which previously could not compete with the American companies like IBM and CDC, and also their semiconductor manufacturing industry, which had to provide the components for the locally-made computers.
Eventually, the French produced TTL circuits and mainframe computers made with them, and finally they also made thermonuclear bombs.
So the American "sanctions" against France have been a complete failure and have been great for the French industry of semiconductors and computers.
Many years later, when USA no longer had export restrictions towards France and the French state no longer protected their industry, the French industries of integrated circuits and computers have been greatly reduced, their companies either becoming bankrupt or being bought or merged into multinational companies.
sylware 7 hours ago [-]
"De Gaulle, not much less autocratically"?
When De Gaulle asked the French via a poll if they wanted him to leave, they said yes, and he left. He is also the guy who set up the balance between the various political powers, which has been kind of working... until now (currently the government can hardly get laws from the parliament, because few of the people's representatives are on the government's side, and they won't die or disappear if they disagree, bugger!). The fact that the president must leave after 10 years is kind of recent though.
France has always been a very strong US ally, in an honest relationship, namely without agreeing or being on board with everything.
And France never had the intention to nuke the US... unlike some other country we talk about all the time in news (that said, France is not far from the US on their list...).
And compared to the rest of the world, don't forget the 'western world' (which is not 'western' only anymore...) has very, very close core values. A good way to think about it: a big dysfunctional family.
On the software side, aka the 'silicon master control' side: currently, the French are just Big Tech slaves. To be more current, President Hollande and Prime Minister Valls put in place a document (2015/2016), which has been "law" since, that literally "pushes" (hard) administration online services to be hardcore dependent on Big Tech (mostly the WHATWG cartel) without any reasonable technical way out (unless noscript/basic HTML web sites are brought back into the security infrastructure, like they were a few years back). This document is out of reach of even the parliament, namely only the president and prime minister have control over it; in other words, to interact with this document you need the same level of power required to decide to increase the number of atomic bombs (huh). The following president and prime ministers did nothing and kept increasing the French administration's dependency; I guess they were/are as guilty OR BRAINWASHED as Hollande and Valls.
Open source does not matter anymore (look at how big tech controls open source software via often-non-pertinent complexity and size), _LEAN_ open source does, and that includes the SDK (aka the computer languages: if you need a giga huge and complex compiler, you already lost).
On the hardware side, a state-of-the-art chip is an international effort with an insane supply chain. This is mostly 'driven'/hogged by US chip designers. State-of-the-art foundries are currently in TW (the US is working at getting some back), EUV is from EU-ish (the EUV light is from the US), and many, many more high-tech tools are from the US/JP/TW/etc.
What I am wondering: did Hollande and Valls "give" France to Big Tech... or "sell" it, if you see what I mean, because it is very easy to set up public money channels using 'Big Tech' which look "clean", aka hidden behind a technologi-blablublo smoke screen, since most people are scared of tech and/or don't understand the fine details.
It is all about simple file formats and network protocols, good enough to do the job and stable in time. A good compromise is to use a strongly and dynamically defined subset of Big Tech stuff, which you know can be locally implemented with reasonable effort (by citizens, small companies, state administrations, etc). That will foster alternatives (good I guess). That's why I am talking about web sites, and not web apps (noscript/basic HTML), and we could talk about a strongly defined subset of PDF.
Ofc, the devil hides in the details, this is a very coarse overview: you have to basically decide in a fine-grained case by case, mistakes will be made and will have to painfully be fixed. You cannot get it all in one shot, it is module per module, back and forth, and probably slowly.
pitched 1 days ago [-]
I would order that in a heartbeat. Even if it required proprietary Chinese-government drivers. I would try to segregate in a VM without internet or something. Please make this happen! Tokens cost too much in the current system.
throw0101c 24 hours ago [-]
Imagine a Beowulf cluster of these. /slashdot
pseudosavant 1 days ago [-]
This may finally be the chip family ARM on Windows has always needed. Qualcomm's chips have always been dogs with slow off-the-shelf ARM CPU cores that have pathetic single-threaded performance compared to x86 AMD/Intel or ARM Apple Silicon designs.
pseudosavant 23 hours ago [-]
For reference, this is just a single benchmark, but as an idea of each vendor's top mobile CPU single-threaded performance:
Geekbench Single Thread Score:
- DGX Spark (same CPU as RTX Spark): 3125
- Snapdragon X1 Elite: 2950
- Snapdragon X2 Elite Extreme: 4050
- AMD Ryzen 9 9955HX: 3225
- Intel Core Ultra 9 290HX Plus: 3175
- Apple M5 Max: 4350
I'm happy to be wrong about Qualcomm's latest X2 chip performance, even if it is shipping in only a single product so far. Their previous best was the lowest in this list.
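To make the gaps in that list easier to read, here is a small sketch that expresses each score as a fraction of the top entry. The scores are copied from the list above; the helper name `relative_to` is just an illustrative choice, and a single benchmark is of course only a rough indicator.

```python
# Geekbench single-thread scores as listed in the comment above.
SCORES = {
    "DGX Spark (same CPU as RTX Spark)": 3125,
    "Snapdragon X1 Elite": 2950,
    "Snapdragon X2 Elite Extreme": 4050,
    "AMD Ryzen 9 9955HX": 3225,
    "Intel Core Ultra 9 290HX Plus": 3175,
    "Apple M5 Max": 4350,
}


def relative_to(scores: dict[str, int], baseline: str) -> dict[str, float]:
    """Return each score divided by the baseline entry's score."""
    base = scores[baseline]
    return {name: score / base for name, score in scores.items()}


relative = relative_to(SCORES, "Apple M5 Max")

# Sort from fastest to slowest for display.
ranked = sorted(relative.items(), key=lambda kv: kv[1], reverse=True)
for name, ratio in ranked:
    print(f"{name:35s} {ratio:6.1%}")
```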
kcb 1 days ago [-]
This will likely have worse single threaded performance than recent Qualcomm CPUs.
BoggleOhYeah 1 days ago [-]
These chips also appear to be using off-the-shelf ARM cores.
TiredOfLife 1 days ago [-]
Qualcomm Snapdragon x1 and upcoming x2 use their Oryon core and have much faster single-thread performance than Intel/Amd and this nvidia soc that uses off-the-shelf arm cores
pseudosavant 1 days ago [-]
That wasn't true of the X1, but apparently the X2 (which is only in a single device so far) does appear to finally be fast. The first Windows ARM CPU to be faster than any of its x86 rivals. Competitive with Apple Silicon single-thread performance even.
I was disappointed to see that the RTX Spark has the ARM cores from the DGX Spark. I was hoping it had their new in-house developed cores that Nvidia is starting to use on their latest gen server parts. They look really fast. That said, if RTX Spark has CPU performance like the DGX Spark, it will be almost as fast as the top AMD/Intel parts.
renoir 2 days ago [-]
So basically Cerebras style?
KeplerBoy 1 days ago [-]
Not at all. This is more like what Apple has been doing the past few years. A bunch of decent Arm cores paired with a beefy integrated GPU.
trvz 1 days ago [-]
No.
babhishek21 24 hours ago [-]
Question is: "Can it run Doom?"
officerk 1 days ago [-]
This will crush the M5 Max going by the numbers. I'm curious to see how much they end up costing
Tiberium 1 days ago [-]
It won't, the top tier RTX Spark has the same exact CPU and GPU as DGX Spark, so you can check DGX Spark CPU benchmarks to see how it fares. Spoiler: it's about M3 Max level. And they're only coming this fall.
aenis 1 days ago [-]
Nah, still ~300GB/s memory bandwidth. That will be slower than the M5 max, by a wide margin for LLM inference.
Rekindle8090 1 days ago [-]
M5 max is 3x stronger and 50% more power efficient. nice try though.
spwa4 1 days ago [-]
... but you'll be rewriting inference for any model that isn't a well-known LLM. Yourself.
wbolt 1 days ago [-]
AI coding agents can do that pretty nicely already and it will only (slowly) improve over time.
seanalltogether 24 hours ago [-]
"Unified Memory" still means divided address space right? You have to pre-allocate system vs gpu and copy from one to the other?
asimovDev 1 days ago [-]
>Lenovo, HP, Dell and Apple accounted for almost 75% of the world's PC market in the first three months of this year, according to research firm Gartner.
[0] https://nvidianews.nvidia.com/news/nvidia-microsoft-windows-...
[1] https://www.windowslatest.com/2026/06/01/microsoft-builds-it...
Only the ones which explicitly list something like the Riot Games mention are really related to the device/Nvidia. The thing which really pushes this along is user adoption/market share, not big names. This device will help that, especially in the gaming space, but it's easy to get over eager as it being from Nvidia means everyone else who has been waiting will just now jump on board too because of that.
Microsoft pulls their weight as well, so this seems like it has a decent chance of getting industry support.
Actually, I went to the Mac Studio configure page on Apple.com and you can't do higher than 96GB now...
Vendors didn’t have to do shit to support the platform, they just got better performance if they did (like factorio).
There is something of a difference between “all your stuff will still work, at comparable performance” and arm windows which (as evidenced by all the vendor promises) you can’t really currently say with prism.
I would describe prism as “surprisingly rubbish considering they had an example of how to do it right” and “your app probably doesn’t run because of drivers or some ??? compatibility thing”.
Am I misremembering? I remember being blown away by Rosetta.
Prism… yeah. Toggle the settings. Disable jit. Disable FP. …bin laptop. Get an intel laptop.
For decades, Windows made it too easy for games and even some applications to install drivers. Windows games use drivers for anti-cheat (and historically for copy protection too). Neither Apple Rosetta nor Microsoft Prism can translate/emulate drivers, but since drivers have been much more prevalent on Windows, Windows now has a much bigger compatibility problem.
So if anything, we need to push more game studios to use open source dependencies which will make porting easier.
Intel has closed things down: some wifi and webcam firmwares are poop and a massive pain to get working on newer chips (if at all). Their wifi firmwares also don't respect certain kernel overrides (which is why I replaced my Intel Wifi 7 chip with a mediatek Wifi 6 one). Blame is 100% on intel and not linux. Broadcom is also pretty bad at being a team player in this regard.
I basically recommend everyone to stick with AMD chipset & GPU's where possible, because they have mainline kernel support nailed down 95% of the time.
Again, ARM works fine, their extra firmwares for extra devices on SoC's are to blame if you struggle.
This isn't fully the reason. Linux is infamous for requiring a specific build for each SoC (and usually each board of said SoC), whereas Windows on ARM uses ACPI, which Linux doesn't support to the same level. Linux prefers the landfill-promoting approach of a device tree for each device.
What would push more games would be Valve actually making it worthwhile to natively target Linux.
But I really do question how well Windows on Arm is really going to work out long term.
For Apple it worked because they were able to force the issue. If you wanted a new Mac it was going to be Arm and we all knew eventually (this year or is it next year?) Intel support would drop. Over time we have seen M series exclusive features.
Developers were forced to update or abandon Mac which gave users a great experience (with some early growing pains).
This is something that Windows will never be able to do. They will always be stuck maintaining an emulator and a likely large subset of apps only supporting one over the other. (Also, does this work the other way around, with an Arm-only app working on x86?)
This seems like a repeat of when it was not uncommon for games to only support Intel or AMD, or only NVIDIA or AMD. But worse, since the two are not both x86. Sure, at least we have emulation, but just like with Rosetta 2 it shouldn't ever be the long-term solution.
Qualcomm is also working on a really good ARM ISA CPU with their acquisition of NuVia and subsequent Oryon architecture.
Meanwhile this is just using off-the-shelf ARM CPUs in a MediaTek SoC with blackwell bolted to the side of it. ARM's CPUs so far have been subpar for laptop-class chips. Hence why neither Apple nor Qualcomm are using them.
MediaTek is involved in the SoC but both the CPU & GPU from Nvidia are bolted on to it. I.e. it's not a standard MediaTek CPU with an Nvidia GPU added.
tbh, I always read this as Intel doing some sales magic here.
Apple: "Hey, we're making a product that has a 15w thermal envelope, do you have anything?"
Intel: "Yes!"
(Unspoken: their products will throttle down to fit, in fact, they will try to run always at 99ºC so you always get the best performance! FEATURE!)
Apple: "uhhhh..."
Consumers: "HEH IS IT EVEN A PRO DEVICE IF IT DOESN'T HAVE <INTEL MARKETING BRAND TERM>?"
Apple: "UHHHH... Guess we'll do it ourselves"
Possibly, but Apple choosing a new, thicker chassis the same generation that they introduce their more power efficient replacement is certainly a thing. Even if Intel failed to achieve the TDP they told Apple, Apple also seems to no longer believe the thinness they were doing was viable for that TDP anyway.
Intel's product offering certainly wasn't as compelling towards the end there, but it also looked almost uniquely bad in Apple's chassis vs everyone else's
Ultimately, Apple won that fight when they decided to stop letting Intel control their hardware roadmap and it's been a great change for the entire industry. Intel is finally seeing some changes in their own products, largely in response to Apple dropping them. Now Nvidia is getting into the game which means more competition which is also good.
But the bigger problem in my opinion: How much of the Windows userbase actually sticks to Windows because of its backwards-compatibility?
--> What would happen if they break this model and the OS is only judged based on its user experience and available applications...?
I'm not sure it would stand any chance to compete in the B2C space. If I think about it, there's not a single new feature in Windows of the last ~20 years I particularly care about.
Without backwards compatibility, there's barely any ecosystem. MacOS on the other hand is full of ecosystem features, improving collaboration, connectivity, handoff across devices, etc.
True, but if you're only in the ecosystem as a mac user, in many ways it's felt like a mixed bag. I still wildly prefer mac over other operating systems, but if upgrades had a price, I think those sales would mostly go to iPhone users. Even at free, I'm yet to find a compelling reason to install Tahoe, and will probably just continue waiting until the next one.
But despite that, as a Windows user I acknowledge that any kind of interaction with another Mac from within MacOS (Handoff, Sidecar, Universal Control, Bluetooth-pairing to Apple-ID instead of Hardware MAC-ID,...) is leaps ahead of what Microsoft was doing with their OS for the past years.
Just the scenario of an employee getting a Windows laptop as a work-PC, there's barely any halo-effect if he/she also uses Windows at home. No easier handoff, no interaction, hardly any "just-works" connectivity.
Windows is mostly a vessel for the (legacy) applications it can run, and for these Browser-based Microsoft Online-Applications (which work equally-well on other platforms)
They didn't invest in creating "just works" frameworks for their PCs which amplify the ecosystem the more compatible devices you have, instead most of their focus is now on "just-works" stuff in the cloud.
So if Microsoft would make a clean cut on backwards-compatibility, I'm not sure there would be a reason left for most B2C users to even stay with Windows.
The "you can make it work if you invest a bit of time or google it" paradigm is nowadays well-covered by Linux already, and it's getting even harder for brands to compete on price/quality with Apple's scale, for almost any portable device...
Recently I upgraded my motherboard and tried reinstalling Win10 Pro, but couldn't activate it despite saving the product key. They have at least THREE obscure flows for re-activation depending on how it was originally activated. The license in my flow needed to have been bound to a Microsoft account that I never previously needed, because it ties itself to the hardware. I had to dismantle and rebuild with my old installation, activate it with my old motherboard on a Microsoft account that I wasn't planning to use to login with, then rebuild again with my new components, sign in to activate, and then disable sign in to be able to use a local user account. Insane.
They only know Apple, Windows and Chromebooks.
I can easily run Qwen3.6 35B-A3B with Q5_K_M with a 260k+ context window with some vram to spare. It easily runs probably 80tps. It took me quite a while to find the
Compared to GHCP Claude Sonnet 4.5 or 4.6, I have full parity. The wall clock time is faster for agentic workflows, and rule following is about on par.
With either, doing something kind of novel or obscure takes more hand holding compared to just generate a GUI or crud app. For example, trying to build an actual program that performs a complicated process correctly requires quite a bit of hand holding to get it to properly help.
Sure, it isn't Opus or something, but I think with the right harness, it probably can get close. I think most of the issues these days is the harnesses are lacking.
It was suspected to come soon enough, but it was a nice cheap road for my small hobby stuff. When they announced the price changes, I started to explore alternatives, and with the news of Qwen3.6 35B being both and having quality, I figured it was worth a try out, and self-hosting made the most sense to me, since that meant I was free from being a forever-renter.
I'll probably try to figure that problem out in about a month. Worst case is I move it to another even older desktop to replace the 9800 GTX+ inside of that one.
I’ve got both (single R9700, dual B70) and they do nicely for about anything I throw at them, such that the latter has a visible improvement when the model is well-cached.
So with proprietary blobs that give you more trouble than they're worth?
But once we're talking about laptops, hybrid graphics etc. it quickly shows that this is not a platform Nvidia cares about.
Just anecdotal, but I never had these issues with the desktop AMD APU I had before it or Intel on board graphics on numerous laptops.
But I would say that as an Ubuntu and Debian user for decades I have no incentive to use anything else on it and I'm just pleased to have a Linux on Aarch64 machine that is well supported for a change.
tbh, I was rather unimpressed with the out-of-box experience for an "ai" computer, you couldn't even run a model locally with the common tools people use (no llama-cpp, ollama, vllm, etc). No huggingface CLI either, like come on!
I did put together my eventual setup in a repo https://github.com/verdverm/sparky
I need to update that because I have a nice vllm setup on there now with 4 models running, but should be able to get anyone else going without having to muddle about as I did.
No one seriously cares about this running Windows. We want Steam and CUDA/Ollama, and Windows just gets in the way. nVidia are simply not that oblivious, but I have to admit in their position I'd have considered the Microsoft involvement more trouble than it's worth, which is among the many reasons I'm not a billionaire.
Maybe they think the RAM market is so terrible it will kill the whole initiative regardless.
You’d think in an era where “code is free” there would be an easier story around running local ai than compiling llama.cpp by hand and then spending hours researching flags - only for it to crash from an oom error every ten prompts or so.
Has Steam finally started to push for native Linux games instead of translating Windows ones?
The failure of business, only reinforces Windows as the platform most studios reach for.
Buy Windows, buy Visual Studio, pay game engines licenses, let Valve do the work.
This is ignoring that Valve's current management doesn't live forever, so who knows what happens afterwards.
Windows' monopoly on game dev isn't just market share either, since game dev isn't just code. You still need Photoshop, Maya, etc. and in smaller studios there's typically a crossover where some devs are doing art as well. Visual Studio's C++ debugger is still one of the best, and the tooling elsewhere hasn't caught up yet (compared to DX + PIX).
Then you also have to solve distribution and handling the fragmented display & audio stack. It's gotten a lot better, but it's still a factor.
I'm fine with most of the work going into Wine/Proton. A stable ABI for Linux is a boon, if it happens to be Win32 then so be it.
Valve isn't going to be around forever porting Windows games into Proton, which is actually hardly any different if they would start selling Nintendo games with Dolphin, if we ignore the legal implications for a moment.
Microsoft have spent the whole Nadella era in "oooo cloud" inspired wonder and actively screwed up everything else.
Note how game developers rather spend their working hours in Windows with all its issues, even if they happen to have Linux servers for running their MMOs.
My Steam library is full of old win32 games that run better on Linux than they do on Windows 11. There are some native games appearing because of the Steam Deck, but the fact is they aren't necessary.
Valve aren't simply better at running a platform business, they've thoroughly subverted Microsoft's old one and have done a better job at running that than Microsoft themselves.
Look at the absolute state of the XBox business: all native games, tens of billions spent on something, and yet it's just a trainwreck from top to bottom.
Tens of thousands of Windows games would remain playable with ubiquitous Vulkan-capable hardware and a 500mb Proton runtime?
In truth if AMD or nVidia put their mind to having decent profiling tooling on Linux, and the AI wave suggests they will have no option, then this could readily become a thing.
Perhaps the next generation of the spark will improve on the bandwidth and RAM size numbers. Yes it's a lot like a Strix Halo, but this has CUDA, which will be of interest to developers who want that.
I was looking for AMD AI Max+ 395 laptops recently, and the only ones I've found were 13 inch models, which seems odd from a heat dumping standpoint. I'm looking for 16 inches, I guess the 13 inch form factor would make it easy for commutes where you're taking it to dock to a large monitor at work or home, but no 14 inch screens?
https://www.servethehome.com/nvida-introduces-rtx-spark-an-a...
M5 Max beats it, but for the price of an M5 Max, you are better off just getting a desktop with 2 3090s, which will be cheaper even at current prices.
Looking at devices like the NVIDIA Shield gives me some hope that NVIDIA will be better than Qualcomm here. I just hope this is not a case where the OEM has to purchase X years of driver support from the chip vendor beforehand, and that NVIDIA will provide support directly itself.
The biggest thing where this will crush Apple is the initial prefill phase. 6000+ cores vs 32/40, + active cooling with fans. For local llm models, this matters quite a bit more than tokens/second.
In the end, neither are really worth it for llm use compared to just building a desktop and just port forwarding over ssh to ollama.
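To put rough numbers on the prefill point above: a minimal sketch that splits one request into prompt ingestion (prefill) and generation (decode). The token counts and speeds are made up for illustration, not benchmarks of any of the machines discussed.

```python
def response_seconds(prompt_tokens, output_tokens, prefill_tps, decode_tps):
    """Return (prefill_seconds, decode_seconds) for a single request."""
    prefill_s = prompt_tokens / prefill_tps
    decode_s = output_tokens / decode_tps
    return prefill_s, decode_s


# Hypothetical agentic-coding request: large context in, short patch out.
PROMPT_TOKENS = 60_000
OUTPUT_TOKENS = 500

# Two hypothetical machines with the same decode speed but a 10x prefill gap.
slow_prefill = response_seconds(PROMPT_TOKENS, OUTPUT_TOKENS, prefill_tps=300.0, decode_tps=25.0)
fast_prefill = response_seconds(PROMPT_TOKENS, OUTPUT_TOKENS, prefill_tps=3000.0, decode_tps=25.0)

for name, (prefill_s, decode_s) in [("slow prefill", slow_prefill), ("fast prefill", fast_prefill)]:
    print(f"{name}: {prefill_s:.0f}s reading the prompt + {decode_s:.0f}s generating")
```

With long prompts and short replies, the prefill term dominates the wait even though the tokens/second figure that usually gets quoted is identical on both machines.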
I've heard there's still a large backlog of both software problems, and hardware problems with the platform. The software problems could be fixed with time, but they'll still give a shitty first impression. I'd have thought Nvidia would just bury this and try again with a successor run of silicon with a new design.
This thing seems practically destined to just be a repeat of the Snapdragon laptop debacle.
that's what nvidia is hoping for
DGX Spark runs Linux, and nobody is going to install Windows on that machine. This laptop got it backwards.
If someone decides to run Ollama for local inference with this laptop, they fit perfectly into the "has too much money to waste" bracket, which is addressed by a few other comments in the discussion.
Installing Linux natively on laptops has always had some specific features not working.
Even my Asus netbook, which came with Linux pre-installed, had wlan issues that I learned to work around with, and the driver never supported the same OpenGL version as the Windows one (3.3 vs 4.1).
Linux driver has always been an issue on laptops, but that's not the concern for running Python code.
One reason it works surprisingly well on modern systems is how much is offloaded to the GPU. You aren't going to get great power optimization or anything without it being truly native though.
There are games which are CPU limited though, and it will be interesting how those do. Curiously those also tend to be in engines with Arm support already.
When you lay out the software stack it is essentially OS > Game code > APIs. Both the OS and APIs are native code, it is only that middle point that needs the real work.
This is why x86 to ARM doesn't have such a heavy performance cost. So games can be CPU heavy but if it is heavy at the API end, that isn't a huge issue.
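A rough way to put numbers on the layering argument above is an Amdahl-style estimate: only the share of CPU time spent in the translated game code pays the emulation penalty, while OS and API code stays native. The fractions and the 1.5x penalty below are invented for illustration, not measurements of any translation layer.

```python
def emulated_frame_time(native_ms, game_code_fraction, emulation_slowdown):
    """Estimate CPU time per frame when only the game's own code is translated.

    native_ms: CPU time per frame on a fully native build (hypothetical).
    game_code_fraction: share of that time spent in the game's x86 code,
        as opposed to native OS/driver/API code (hypothetical).
    emulation_slowdown: how much slower translated code runs (hypothetical).
    """
    if not 0.0 <= game_code_fraction <= 1.0:
        raise ValueError("game_code_fraction must be between 0 and 1")
    native_part = native_ms * (1.0 - game_code_fraction)
    translated_part = native_ms * game_code_fraction * emulation_slowdown
    return native_part + translated_part


# API-heavy game: most CPU time is already spent in native OS/API code.
api_heavy = emulated_frame_time(10.0, game_code_fraction=0.2, emulation_slowdown=1.5)
# Simulation-heavy game: most CPU time is spent in the translated game code.
sim_heavy = emulated_frame_time(10.0, game_code_fraction=0.8, emulation_slowdown=1.5)
print(f"API-heavy: {api_heavy:.1f} ms, simulation-heavy: {sim_heavy:.1f} ms")
```

The same penalty costs the API-heavy case far less than the simulation-heavy one, which is the distinction the comment is drawing.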
Very cool.
What would be interesting to me would be how quickly developers start targeting ARM64 directly.
https://docs.nvidia.com/dgx/dgx-spark-porting-guide/porting/...
Of course, DGX Spark is a miniPC, so laptops will likely be slower due to power limits/throttling.
https://www.techpowerup.com/gpu-specs/gb10.c4342 https://www.nvidia.com/en-us/products/rtx-spark/
Basically the same tradeoff as macmini with unified memory.
I think it's gonna be another failure, as we're used to seeing in the PC market these days.
Why do I have the feeling it's been intentionally made to be bad in order to get you on to their most expensive datacenter gear?
At this point, your cost-efficient options include used 3090s, "frankenrigs" using recycled data center cards, and a handful of "workstation" class cards, where the originally high margins and the long enterprise purchasing cycles have kept prices from going up too fast.
In contrast, a lot of these "personal" AI systems are basically a GPU-like core wired to larger amounts of slow RAM. Which is still semi-affordable. Generally speaking, they make for OK chatbots but extremely slow coding agents. Whereas you can run a modestly useful coding agent at reasonable speed on a 3090.
So yeah, a lot of these systems are a bit scammy. But not because it's a secret conspiracy to protect data center cards. Rather, there simply isn't enough fast RAM in the entire world. So they'll flog you disappointingly slow RAM instead.
TL;dr: Might be useful for some use cases, but benchmark very carefully.
I expect we'll get there in a few years, so perhaps this is Nvidia taking an early step in that direction.
In that case, this goes against Anthropic and OpenAI's business models. Which is a double whammy after Jensen Huang's recent comment about how agentic coding will only increase demand for software engineers, not reduce it.
So it also feels like a part of a budding shift in the competitive tension between the various parts of the AI supply chain.
Non-techy consumers may never do it, but at some point businesses are going to start asking when do they stop paying per token and start running models themselves. Right now the hardware is cost prohibitive, but I doubt that'll always be the case. Eventually the hardware will get cheaper and more available, and Nvidia seems to be betting on that.
They don't care where inference happens, so long as it happens on Nvidia hardware.
And when that happens, the pitch to non-techy users is "Free ChatGPT you can use offline with zero privacy risk". Once hardware accessibility and LLM efficiency advance to the point that this becomes feasible, I suspect it'll result in a much bigger hit to the cloud AI market than many expect.
I think the bigger hang up is that they're still slower and less capable than the frontier models, especially at the hardware specs most home users are likely to have.
If the first thing (for example) my mom sees upon installing the app is a dropdown model picker that contains things like "Qwen3.6-35b-a3b-mlx" she will 100% be bouncing off of it.
IMO the best version of this is a custom app/harness with a couple of pre-selected (and ideally fine-tuned) open models that immediately start downloading after checking the system's hardware specs. This would likely be a turn-off to most devs, but is absolutely essential if building an app for general consumers.
the current dilemma for me is how do I install a model on a remote LM Studio device without bypassing LM Studio to SSH or remote in?
> lms link [servername] get model ?
> lms get [servername] model ?
> lms get model --link [servername] ?
Maybe I need to read the docs again but I swear the only way is remote or go to that device and download via the GUI, ssh in and use the local cli.
Maybe can copy/paste from one device's downloads dir to the server? Maybe I need to try hosting models on my NAS and see if I can download from device 1 then run on device 2 without install/setup?
AIaaS might keep an edge with multi-modal agentic workflows, but for 80% of general use cases, no "secret sauce" needed, the open weight models are already there, and tooling is constantly getting better.
The bottleneck is the cost of local hardware right now.
"Free, private, offline ChatGPT so long as your laptop has X GB of RAM"
Beyond that, I wouldn't underestimate the incentive of "because I can". The "secret sauce" you refer to is effectively just a DB & a while loop that feeds text to a bunch of tensors. If an indie dev decides they want to release something that dismantles the OpenAI & Anthropic moats, there really isn't all that big of a technical barrier stopping them.
This basically creates a bottleneck at the oldest/cheapest Apple Silicon machines, which are already crippled for context prefill.
But honestly, obsoleting a huge number of otherwise great Apple Silicon machines is something Apple would consider a major "pro" of building a compelling local AI stack.
With how much speculation around the difficult time Apple has had getting people to upgrade from M1, I'm sure they'd jump at such an opportunity.
- Please buy our new Macbook pro M5 that gives you 20 tokens/s on local 80B LLM
next year - Please buy our new Macbook pro M6 that gives you 25 tokens/s on local 80B LLM
milking product revenue in perpetuity by offering meaningful marginal improvements, while keeping same architecture will be the golden goose for Apple
+plus if it allows them to segment the market by wallet size into poor/middle/rich classes, that's even better
After a few generations (and over a decade) that was indistinguishable from the CPU chip itself.
It's a long hyperbole, I know, but I think local inference is inevitable; and the big fishes know it.
Will that be a complex technical setup? An appliance? An additional chip in your motherboard? So transparent it's burned right into the CPU? Those are just implementation details. We're probably just one generational breakthrough away from it.
[1] https://en.wikipedia.org/wiki/X87
They will. At some point in the future, people will want everything, they'll prompt full movies because they're bored and want to watch something.
Local AI capabilities are growing at a rapid pace, but so is hosted AI. While you can do a surprising amount of useful work with a model occupying a few to a few hundred gigs of VRAM, the hosted models are going to be way ahead for a long time.
AI inference is different. You get the outcome by passing text through some weights at the time you need it. There's no ongoing work besides training and releasing new models. If I had something that rivalled Opus 4+ I could use locally, I would switch in a heartbeat.
Google/Microsoft and hosting your own email is a byproduct of how difficult (socially, not technically) hosting your own email has become - mostly because the SMTP protocol is inherently broken by spam and patched by social constructs (trusted nodes, abuse@, 3+ DNS entries and counting, etc). Purely technical solutions, such as Hashcash, got discontinued in exchange for social ones. Central providers made (sometimes in exchange for, sometimes as an excuse for, spam protection) self-hosting socially hard.
Now, I wonder if, and how, once Anthropic and OpenAI need to demonstrate profitability, could hamstring local AI. Which has been /so far/ very valuable for me in doing things that hosted providers don't want liability for, and align against (even if totally lawful and fair use!).
- v4.5: 1x cost, 100% quality, 100% speed but maybe sometimes 80% speed because of load
- v4.6: 3x cost, 105% quality, 80% speed most of the time depends
- v4.7: 9x cost, 115% quality, 90% speed most of the time
Then people will either stick with v4.5 for everything it can do and, if knowledgeable, use v4.7+ for critical or specific tasks.
But if we add the option of:
LocalLLM: one time hardware + electricity cost, good enough quality for 90% of work, good enough speed for 90% of work, no vendor lock in/sudden cost spikes...
Then there is an edge to running it yourself unless you can burn investor cash to get to the next level.
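The trade-off above can be sketched as a break-even calculation: a one-time hardware purchase plus electricity against a recurring hosted bill that scales like the 1x/3x/9x tiers. Every dollar figure here is a placeholder assumption, not a real price.

```python
def breakeven_months(hardware_cost, local_monthly_power_cost, hosted_monthly_cost):
    """Months until a one-time hardware purchase beats a recurring hosted bill.

    Returns None when running locally does not save money each month,
    because the purchase would then never pay for itself.
    """
    monthly_saving = hosted_monthly_cost - local_monthly_power_cost
    if monthly_saving <= 0:
        return None
    return hardware_cost / monthly_saving


# Hypothetical figures: a $2,400 box, $15/month of electricity, and hosted
# plans priced at 1x, 3x and 9x of a $40/month baseline.
HARDWARE_COST = 2400.0
POWER_PER_MONTH = 15.0

for label, hosted in [("1x tier", 40.0), ("3x tier", 120.0), ("9x tier", 360.0)]:
    months = breakeven_months(HARDWARE_COST, POWER_PER_MONTH, hosted)
    print(f"{label}: break-even after {months:.1f} months")
```

The sketch ignores quality and speed differences, which is exactly the "good enough for 90% of work" caveat: the numbers only matter for the workloads a local model can actually handle.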
I think the recent headlines on org token spend plus my own experience just today (June 1) with the new Copilot Pro limits is going to push those with the compute to run locally.
As of about 1pm today I did something to hit 47% of my entire June premium requests (copilot Pro, not converted).
As of 2pm I'm using Gemma 4 E4B on a 12gb GPU (with large context window) off my desktop to power VS Code with Copilot on my laptop. I'm going to build an AMD Strix Halo system next week when parts arrive so I can queue up a few models in parallel or work with something I need that much RAM for.
I'm not lifting the earth with my LLM setup. Gemma 4 E4B is solid for accelerating my current projects, and it's costing me pennies more per hour vs blowing half my Copilot Pro plan in a distracted morning.
I'm at a vendor conference this weekend that is showing off their Agent/Agentic workflows. Nobody can tell me how they balance the cost long term. Hopefully whoever the vendor is paying for their cloud LLM token usage doesn't spike cost in a year (or the vendor themselves) after companies convert and are trapped VMware style with these agent processes. You can bring your own (cloud) model subscription. I need to find out if we can point it back to our own local LLM endpoint and try local models for the same processes. Even if it takes 5x longer, it could be cheaper and more secure.
That said, Apple's vertical integration is a massive competitive advantage here, IMO. Nvidia's reliance on Microsoft & Windows for software support likely makes competing w/ Apple an uphill battle.
If/when Local AI gets good enough to compete with Cloud AI on most inference workloads, Apple starts to look like Nvidia's biggest competitor.
While this is admittedly a dream scenario, the biggest downside would be Apple effectively having a monopoly in "Agent-ready" consumer electronics. Hopefully local AI both becomes the norm, and there is sufficient competition among the consumer platforms.
Side-note: I would love to see an "RTX Spark" Framework 13 mainboard at some point.
Apple's vertical integration has led to a Siri overhaul that took half a decade to roll out, and it won't even run locally. They built an NPU coprocessor that's basically dark silicon for expensive inference, and then shipped MLX to stop Tensorflow and Pytorch from replacing Apple's role in the stack entirely. Mac owners are pleading for signed CUDA drivers for the PCIe or Thunderbolt in their $5,000+ Mac Pros. Apple's ecosystem is pure liability for AI, they're not moving any product for datacenter inference and can't even sell the hardware to themselves: https://9to5mac.com/2026/03/02/some-apple-ai-servers-are-rep...
Nvidia's profit margins are safe. Even if the RTX Spark is a completely failed product, Apple is not encroaching on the markets that Nvidia dominates.
See a pattern there? Even the memory companies will probably get designed around. The Chinese are in the process of doing it now, and so will Apple, probably, for long-range survival.
In theory, Apple SHOULD have an advantage given they have everything they need in house and can all pull in a unified direction. In practice, it's not always the case that all the teams in a large corporation are all that much better at pulling in the same direction than multiple different corporations in a partnership. And all this will be moot if Local LLMs never catch up to cloud LLMs in terms of quality.
Regardless, it'll be very interesting to see how Nvidia's partnerships with Microsoft & hardware OEMs play out. If the AI inference compute share shifts appreciably to local consumer hardware, I'll want to see strong competition.
Without Khronos involved, I don't think that Apple has the buy-in to create a real industry-scale CUDA alternative. At this point, it might just be most profitable to support CUDA in macOS and give the people what they want.
But I like it. It's a copy of Apple's SoC design philosophy, same as AMD's Strix Halo, which I always thought was really cool both for laptops and home PCs. NVidia's traditional consumer cards pull way too much power and are too noisy to comfortably put them in a living or office environment.
The writing is on the wall, neither Anthropic nor OpenAI are anywhere near close to sustainability and if one or, worse, both fail the entire demand bubble for NVDA crashes.
It's smart to set up alternative destination markets while they can do so in peace.
Around 2-3K USD something with a good GPU + CPU + 128GB of integrated RAM is just going to be an awesome experience.
Considering Mac options are north of 5K+ even on a regular day.
https://www.nvidia.com/content/dam/en-zz/Solutions/networkin....
With MLX, Apple is building an answer to CUDA, and if people start switching from ChatGPT & Claude to some app that runs on their M5, suddenly Apple starts to look like Nvidia's biggest competitor.
If Nvidia doesn't have a pathway towards getting hardware into the hands of consumers, it could be a really difficult road ahead for them.
I'm here for it. Local models can do a lot of what I need at almost no cost, plus the fun of making them work better or building a new system to handle that aspect of my home lab. A Strix Halo system may not be amazingly fast but at 128gb of RAM it can keep up with most open models worth exploring.
Based on June 1 Copilot Pro plan premium token burn and cost, unless you REALLY know how to use cloud AI efficiently and are tooled up to do so, a local LLM on hardware you may already own is very appetizing.
I converted a lot of work today to a 6.5gb local LLM on a 12gb GPU and no, it's not as good. But it is 'free' or at least feels that way, especially when I need to redo something and my copilot premium request % doesn't change.
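For anyone sizing a similar setup, a minimal sketch of the usual back-of-the-envelope check: weights plus KV cache plus some runtime overhead have to fit in VRAM. The layer and head counts below are hypothetical and not the architecture of any model named in this thread; only the 6.5 and 12 figures come from the comment above (treated loosely as GiB).

```python
GIB = 1024 ** 3


def kv_cache_gib(n_layers, n_kv_heads, head_dim, bytes_per_value, context_tokens):
    """Size of a standard attention KV cache: one K and one V vector per
    layer, per KV head, per token."""
    per_token_bytes = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value
    return per_token_bytes * context_tokens / GIB


def fits_in_vram(weights_gib, kv_gib, vram_gib, overhead_gib=1.0):
    """True if weights + KV cache + an assumed runtime overhead fit in VRAM."""
    return weights_gib + kv_gib + overhead_gib <= vram_gib


# Hypothetical architecture: 32 layers, 8 KV heads, head_dim 128, fp16 cache.
kv_32k = kv_cache_gib(32, 8, 128, bytes_per_value=2, context_tokens=32_768)
kv_64k = kv_cache_gib(32, 8, 128, bytes_per_value=2, context_tokens=65_536)

print(f"32k context: {kv_32k:.1f} GiB cache, fits: {fits_in_vram(6.5, kv_32k, 12.0)}")
print(f"64k context: {kv_64k:.1f} GiB cache, fits: {fits_in_vram(6.5, kv_64k, 12.0)}")
```

Real runtimes can shrink the cache with quantized KV storage or sliding-window attention, so treat this as an upper-bound style estimate rather than what any particular tool will report.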
+ Windows
+ Screen
- ConnectX-7 Smart NIC
Can the link type be toggled between Ethernet and Infiniband? (Don't think I've ever heard of a laptop with IB.)
Well, MediaTek actually said they made most of the SoC in fact. But the actual CPU cores themselves are all but certainly off-the-shelf Cortex parts, since MediaTek doesn't have a custom core design at all afaik.
Physically, NVIDIA did the GPU chiplet and Mediatek did the other chiplet that has the CPU, DRAM controller, and IO.
In university a friend of mine had a large hardcover book she kept in her dorm freezer. I asked her WTF she had a big book in there. She said it was for minecraft - she'd place her laptop on top of it while playing. The book was cold but also quite dry. I wonder how well it worked.
I was lucky that iteration 1 (sans towel) didn't ruin the laptop...
https://www.bhphotovideo.com/c/product/1957120-REG/apple_mbp...
Bosgame M5 AI Mini Desktop Ryzen AI Max+ 395 96GB variant €1.800,95 (sold out)
128GB+2TB variant €2.401,95 (in stock)
I have the latter, it's fantastic
$3649 with 128GB of ram
- 5090/6000 Pro: 1792GB/s
- 5080: 960GB/s
- 5070Ti: 896GB/s
- M3 Ultra: 819GB/s
- DGX Spark: 273GB/s (less than an M5 Pro at 307GB/s)
Memory bandwidth isn't everything but it will cap inference rate pretty heavily. Also, the M3 Ultra is for an almost 2 year old Mac Studio. It's widely expected that it'll be refreshed in Q3 with a likely M5 or M4 Ultra with >1000GB/s. I really hope Apple realizes what a market opportunity Apple has here.
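Here is a rough sketch of why bandwidth caps the inference rate: during decode, each generated token has to stream roughly the full set of active weights from memory, so bandwidth divided by bytes read per token gives a ceiling on tokens per second. The bandwidth figures are the ones listed above; the 20 GB weight size is an arbitrary assumption, and real throughput lands below the ceiling.

```python
def decode_ceiling_tps(bandwidth_gb_s, gb_read_per_token):
    """Upper bound on decode tokens/second for a memory-bandwidth-bound model."""
    return bandwidth_gb_s / gb_read_per_token


# Bandwidth figures quoted in the list above (GB/s).
BANDWIDTH_GB_S = {
    "5090/6000 Pro": 1792,
    "M3 Ultra": 819,
    "DGX Spark": 273,
}

# Assumption: a dense model whose quantized weights take 20 GB, all of which
# are read once per generated token.
WEIGHTS_GB = 20.0

ceilings = {name: decode_ceiling_tps(bw, WEIGHTS_GB) for name, bw in BANDWIDTH_GB_S.items()}
for name, tps in ceilings.items():
    print(f"{name}: at most ~{tps:.1f} tok/s")
```

Mixture-of-experts models only read their active parameters per token, so the same arithmetic with a smaller `gb_read_per_token` explains why they decode much faster than their total size suggests.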
The above shows just how good value the 5090 really is. It's basically an RTX 6000 Pro with less RAM (and ~12% fewer CUDA units), which is a ~$10k card, for 20-30% of the price. This also demonstrates how NVidia uses VRAM for market segmentation. As an aside, the true data center cards (eg B100, H100) use HBM memory at ~3.2TB/s.
[1]: https://wccftech.com/nvidia-enters-pc-space-with-rtx-spark/
This is much better value than 5090, you can run much bigger models.
> tl;dr - For software development, Qwen3.6 27B, 5090 gives you ~3x speed over M5 Max, letting you plow through code, while M5 Max gives you ~4x memory, letting you use higher quantization and bigger context. Which would you choose and why?
I've read a number of things from which the consensus seems to be that yes you can run a larger model and/or have more context with a 128GB+ Mac but the performance gap is still massive and with current hardware we're still talking about inference rates that matter. By this I mean there's a big difference between 10tok/s vs 30. Once we get to a point where it's 100 vs 300, it won't be as big of a deal, a bit like FPS in games.
Oh and there are similar concerns with the DGX Spark [2].
[1]: https://www.reddit.com/r/LocalLLaMA/comments/1t5v2gr/need_ad...
[2]: https://www.reddit.com/r/LocalLLaMA/comments/1sqk333/dgx_spa...
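The "10 vs 30 matters more than 100 vs 300" point above is easy to check with arithmetic: the ratio is the same 3x, but the absolute time saved per reply shrinks. A quick sketch, with the 600-token reply length picked arbitrarily:

```python
def seconds_to_generate(output_tokens, tokens_per_second):
    """Time spent waiting for a reply of the given length."""
    return output_tokens / tokens_per_second


ANSWER_TOKENS = 600  # hypothetical reply length

# Compare the two speed pairs from the comment above.
for slow, fast in [(10, 30), (100, 300)]:
    slow_s = seconds_to_generate(ANSWER_TOKENS, slow)
    fast_s = seconds_to_generate(ANSWER_TOKENS, fast)
    print(f"{slow} vs {fast} tok/s: {slow_s:.0f}s vs {fast_s:.0f}s, saves {slow_s - fast_s:.0f}s")
```

Going from a minute to twenty seconds changes how you work; going from six seconds to two mostly doesn't, which is the FPS analogy in a nutshell.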
The larger memory also allows for pre-training / finetuning models, hence why it's aimed at developers.
Saying that, I think this product is kinda dead on arrival.
1. in order to run LLMs, especially the best ones, you need complicated devices which are expensive
2. if you buy one for your personal use, you are probably not going to utilize it all the time and it will be idle a lot
It seems to me that it will always be more economical that the LLM-running devices are in a datacenter where it is easier to make sure they are always utilized
AI vendors are really going to struggle to shift tokens far beyond the frontier of human capabilities. It's reasonable (not guaranteed) to assume that, if the trend of frontier models (doubling capabilities on benchmarks every n months) holds, then the same trend will hold for local models, and those local models will meet and exceed the perception frontier. This would mean a human cannot tell the difference between Mistral-Open-2030 and Claude Opus 2030.
That's a bunch of "ifs", but there's nothing exceptional about those "ifs". They're basically the scenario if nothing changes between now and ~2030 with regards to capabilities trend attainment.
There is no ceiling to the power of consumer hardware. If it's cheap enough, it will be bought.
SETI@Home is a very niche use case
and web browsing still happens by connecting to data centers and server farms, not by connecting to another laptop
Even two or three years ago people were pointing out "The ChatGPT subscriptions you can buy with $2000 give you much more compute than whatever home setup you come up with" on r/LocalLLM. I did my own elementary school maths and came to the same conclusion.
Yet till this day people still boast how their beefy M4 Pro/Max machine with 32+GB RAM (which is not at all a "normal person's setup" and costs $2000+) runs LLMs smoothly, and "that's the future".
Someone needs to re-learn basic maths and take a walk around Best Buy to understand what "consumer laptop" looks like.
Think of it like having a graphics card at home versus using a cloud gaming stream? Technically subscribing to GeForce is much cheaper up front than getting a card, but people still do that. So will the audience of people running agents at home be as large as PC gaming? I think that's kind of plausible.
That is not how LLMs are typically used though in my experience
> Think of it like having a graphics card at home versus using a cloud gaming stream?
Latency seems to be much more important in that use case
I think consumers are primed for that type of behaviour though. I have an iPhone on my desk. It has something like 2-3tflops CPU+GPU, which is double that of the largest super computer on earth when Jurassic Park came out, and is probably more computing power than existed on earth when I was born in the 80s.
I use this device for around 1hr per day to write text messages.
Local models today are fine for a lot of mundane tasks and will continue to be so. The use cases where paying for frontier models is worth it, will continue to shrink for folks not doing frontier work.
Or stall. Acceleration has been slowing significantly and gains seem to be tied to huge memory footprints.
2. Eventually we'll get to where local models that don't have sycophancy and slot-machine mechanics trained into them will perform better.
The price of a mini-PC with Intel Panther Lake is at least double in comparison with the price of a mini-PC with Arrow Lake H having similar specifications, and I am talking about barebones, before adding DRAM and SSDs, whose prices have risen even more.
The rise in prices is somewhat obfuscated by the confusing names of CPUs, i.e. some old and new CPUs may seem to be at similar prices and they have similar names, but the new CPU actually corresponds to a lower segment of the market, by having e.g. a smaller GPU and a lower clock frequency, while the CPU model that really corresponds to the old is named such that it seems to belong to the class corresponding to its present price.
As a concrete example of this obfuscation, which may confuse the buyers of laptops or mini-PCs, I have an ASUS 15 Pro with "Core Ultra 5 225H". If I would buy an ASUS 16 Pro now, the corresponding CPU model, the cheapest which is not worse than what I have, would be "Core Ultra X7 358H".
- bulk discounts
- cheaper electricity
- high utilisation to spread the costs among many users
I don't see how PCs could ever compete against it. Most users' AI demands would probably result in >90% idle time on the GPU.
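The utilisation argument above can be sketched by amortising hardware over the hours it is actually busy. All prices, lifetimes and utilisation rates below are assumptions chosen to mirror the three bullet points, not real datacenter or retail figures.

```python
HOURS_PER_YEAR = 24 * 365


def cost_per_busy_hour(hardware_cost, lifetime_years, utilization, power_cost_per_busy_hour):
    """Hardware cost spread over the hours the device is doing useful work,
    plus the electricity for each of those hours."""
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    busy_hours = lifetime_years * HOURS_PER_YEAR * utilization
    return hardware_cost / busy_hours + power_cost_per_busy_hour


# Hypothetical home box: retail price, busy 5% of the time, retail electricity.
home = cost_per_busy_hour(3000.0, lifetime_years=3, utilization=0.05, power_cost_per_busy_hour=0.06)
# Hypothetical datacenter share: bulk discount, 70% busy, cheaper electricity.
datacenter = cost_per_busy_hour(2400.0, lifetime_years=3, utilization=0.70, power_cost_per_busy_hour=0.03)

print(f"home: ${home:.2f} per busy hour, datacenter: ${datacenter:.2f} per busy hour")
```

Utilisation is the dominant term here: the discount and the electricity rate move the result a little, the idle time moves it by an order of magnitude.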
The whole replacing people angle is just the short term use case the more ghoulish executives are thinking about. In practice, lots and lots of new use cases have been made possible by LLMs. A lot of which can be done locally. But whatever capacity you have locally, they can have more of and for cheaper, and they manage the model instead of you doing it yourself. I think you put it nicely though, their moat will be thinned, and I doubt they'll be as profitable as their funding suggests, but at the same time the demand for them won't go away either. I don't know if OpenAI and Anthropic will be viable, but I'm nearly certain Deepseek is.
The tipping point will be power usage: if a local LLM can run the same workload for less power, that would be a game changer. Nvidia might get decimated, but even Google and others have moved on from GPUs already; they have faster and more power-efficient TPUs. Add to that network bandwidth and availability issues, and their moat remains. Also consider that even for graphics capabilities, user devices just don't have a consistent spec to make things like widespread 3D graphics and WebGL usage viable. Someone's cheap Android phone will never run a local LLM reliably, same as it won't run a 3D game. Even if they have a high-end iPhone, network providers aren't always as performant as they are in Western countries, and then there are people who won't want to install your app or local software, and then browser-based exposure of the capability to sites, which will have similar hardware spec issues, OS instabilities, competing tabs, etc.
Maybe the Nth time's the charm and Microsoft+Nvidia will manage to make Windows on ARM a viable platform.
All I care about is if I can get one of these for significantly less than a DGX and get Linux on it for some CUDA Blackwell kerneling.
I'm not sure if I like this. Sure, for a laptop this might not be a big problem, but if this ARM ecosystem is a success it will spread to desktop computers and I fear we could lose the existing modularity.
But yes, it tends to be soldered on.
I think more announcements will follow soon from other companies.
Although I'm kinda surprised the DGX Spark used USB-C at all for power instead of just like a DC jack or whatever. But whatever.
Nvidia really threw stuff over the wall with the DGX Spark release. They don't seem to really care. I sort of think they'll spend a little more time on Windows, where there's no pesky upstreaming to do and they can just do whatever, but man, it's such typical hubris from Nvidia to build such an expensive box with good chips but make it basically unsupportable and roasty hot all the time.
You also generally have to run an ever more stale two-year-old Ubuntu-derived DGX OS to get anywhere, with a bespoke kernel, drivers and all. None of it is well supported; none of it just works like a comparable PC or even a well-behaved Arm system would.
As for other ARM, there were rumors AMD Sound Wave is/was going to be a ~10W Arm APU, but there hasn't been much said about it lately. Honestly given the RAM crunch, it's maybe just not worth trying to build a system with a cheap core, if the rest of your costs are going to stay so stratospheric. https://www.techpowerup.com/341848/amd-sound-wave-arm-powere...
NVIDIA nailed it
One might call this "forcing"
"Introducing the NVIDIA RTX Spark™ Superchip. The fusion of NVIDIA AI and RTX graphics in a single chip redefines Windows PCs and delivers amazing creating, AI development, and gaming—on the slimmest, most beautiful RTX laptops ever and small, ultra-efficient desktops."
Nvidia is also very, very rich and pushes the boundaries of stuff. They stopped waiting for industry standards. You can see this in their networking stuff. All Nvidia.
The next logical step (at least now, not something I thought about before) was their own CPU for their GPU racks/clusters/systems.
Now they have everything anyway, RTX Spark is just logical.
I don't think it's specifically targeted at Apple at all.
Apple has like 10-15% market share and just because some IT nerds buy themselves a mac mini doesn't mean much.
Plenty of them actually just run openclaw without local models. Something which surprised me quite a lot.
But I have two 4090s at home. They consume a lot of power, and I had to research the proper mainboard model and had to mod one 4090 to use water cooling because they run too hot.
Their Spark setup was at 3k, way too expensive for normal people. If they can get this down and sell more, great for their ecosystem (strengthening it) and for getting more money from people.
It does surprise me though that they have enough capacity for this chip and aren't just putting everything into Rubin, but perhaps the build-out has slowed down a little or they are starting to diversify already for economic safety.
More seriously, obviously a ton of work in an incredibly competitive space, and an incredible machine (without getting into competitive comparisons/minutiae). Was watching a TechTechPotato[0] quick post pre-launch about "why is this even being tried?", which was also interesting. What an age we live in.
[0] https://youtu.be/JdB722MK380?si=GnLAYqT9ZecMhWCS
Guess I need to postpone my gamer PC renewal to end 2030.
It's just a worse Strix Halo, as you land squarely in the middle of Windows on ARM problems.
I'd say that is an improvement if you want to run local LLM inference. Still well below what you can achieve with Apple chips though.
I don't think so.
This will most likely be a winmodem situation, again.
Benchmarks with the DGX aren't spectacular, given NVIDIA's software and CUDA lead.
Wouldn't count on this being a price/compute challenger, especially with overpriced VRAM.
All those CUDA cores in the sparks but they're starved for memory bandwidth.
I am still waiting for NVidia to release a system that legit beats 3090 maxxing for the home gamer...
The sparks are good if your ultimate plan is to spend even more on NVidia hardware in future to run your dev setups at usable speeds. Or, you're developing for a work cluster.
If you mainly want to run local models at acceptable speeds portably, buy a mac with lots of RAM. If you’re happy with non-portable / racked, buy 3090s (dense) or mac studios (MoEs). Buy newer cards if you are restricted on power or slots. If you are rich, buy a6000 blackwells.
Also I heard the tensor core instructions on the DGX are gimped and you’re better off with an RTX Pro x000. Is that the same with these machines?
And is it really a way to lock in people? With AI coding tools, isn’t it trivial to write software on top of CUDA and rewrite it to target some other hardware?
no.
Geekbench CPU bench leaks indicate they aren’t even as good as the M3 at single core.
Will they support booting into a Linux installer?
A powerful new chapter for Windows PCs, accelerated by Nvidia RTX Spark
https://news.ycombinator.com/item?id=48352693
Surface Laptop Ultra: Made for World Makers
https://news.ycombinator.com/item?id=48352627
Eventually a lot of inference will get right-sized into something you affordably run yourself.
> "Our goal is to deliver unmetered intelligence to every home and every desk with Windows," said Satya Nadella, chairman and head of Microsoft.
Then:
> However, Ian Fogg, Research Director at industry analyst firm FDM CCS Insight said the change was "likely to come with a significant price tag" and Nvidia would be targeting "those looking for workstation-class performance".
So... not every desk with Windows.
It just feels too much like what they said about Apple II and early Windows. A play at nostalgia instead of putting real thought into it.
My question is, what happens to the people who use RTX cards for gaming? This new solution isn't meant for that. Do they need an "AI accelerator" and a gaming-centric GPU?
Even in the analytics side most of the stuff is some shonky ass numpy or excel gank.
I don’t know what the market is. I just can’t see it.
What does AMD or Intel have here?
I think the future will be 50/50 x64 vs arm64 for PCs.
NVIDIA and Microsoft Reinvent Windows PCs for the Age of Personal AI
https://news.ycombinator.com/item?id=48352705
NVIDIA DGX Station for Windows Puts a Trillion-Parameter AI Supercomputer on Every Enterprise Desk
https://news.ycombinator.com/item?id=48352691
Introducing Surface Laptop Ultra: Made for world makers
https://news.ycombinator.com/item?id=48352627
Introducing a powerful new chapter for Windows PCs, accelerated by NVIDIA RTX Spark
https://news.ycombinator.com/item?id=48352693
I think they make a great "second device" where you have something meatier to fall back to if something doesn't quite work right. I'm not sure if it's ready to take on the "main device" role just yet. But it's a far far better experience than the Surface RT days.
Sure the graphics capabilities are probably very good. But if you’re a game developer who has traditionally built on Windows on x86 chips, would you want to invest in this new chip or invest in making games for the Apple ecosystem? Aren’t there more new customers to reach in the Apple world than this new Nvidia world?
Windows and the new chip. Higher developer productivity and higher chances of a substantial audience.
However, I'd jump from Mac in a Heartbeat if this supported Linux.
60 years ago the US government forbade the export of fast computers to France, in the hope that this sanction would prevent the French from developing thermonuclear bombs.
The result was that the French state (which at that time was led by de Gaulle, not much less autocratically than China) subsidized some of their computer manufacturers, which previously could not compete with the American companies like IBM and CDC, and also their semiconductor manufacturing industry, which had to provide the components for the locally-made computers.
Eventually, the French produced TTL circuits and mainframe computers made with them, and finally they also made thermonuclear bombs.
So the American "sanctions" against France were a complete failure and were great for the French semiconductor and computer industries.
Many years later, when the USA no longer had export restrictions towards France and the French state no longer protected its industry, the French integrated circuit and computer industries were greatly reduced, their companies either going bankrupt or being bought or merged into multinational companies.
When de Gaulle asked the French via a poll whether they wanted him to leave, they said yes, and he left. He is also the guy who set up the balance between the various political powers, which has been kind of working... until now (currently the government can hardly get laws through parliament, because few of the people's representatives are on the government's side, and they won't die or disappear if they disagree, bugger!). The fact that the president must leave after 10 years is kind of recent though.
France has always been a very strong US ally, in an honest relationship, namely without agreeing or being on board with everything. And France never had the intention to nuke the US... unlike some other country we talk about all the time in news (that said, France is not far from the US on their list...). And compared to the rest of the world, don't forget the 'western world' (which is not 'western' only anymore...) has very, very close core values. A good way to think about it: a big dysfunctional family.
On the software side, aka the 'silicon master control' side: currently, the French are just Big Tech slaves. To be more current, President Hollande and Prime Minister Valls put in place a document (2015/2016), which has been "law" since, that literally "pushes" (hard) the administration's online services to be hardcore dependent on Big Tech (mostly the WHATWG cartel) without any reasonable technical way out (unless noscript/basic HTML web sites are brought back into the security infrastructure, like they were a few years back). This document is out of reach of even the parliament: only the president and prime minister have control over it. In other words, to interact with this document you need the same level of power required to decide to increase the number of atomic bombs (huh). The following president and prime ministers did nothing and kept increasing the French administration's dependency; I guess they were/are as guilty OR BRAINWASHED as Hollande and Valls.
Open source does not matter anymore (look at how big tech controls open source software via often-non-pertinent complexity and size), _LEAN_ open source does, and that includes the SDK (aka the computer languages: if you need a giga huge and complex compiler, you already lost).
On the hardware side, a state-of-the-art chip is an international effort with an insane supply chain. This is mostly 'driven'/hogged by US chip designers. State-of-the-art foundries are currently in TW (the US is working on getting some back), EUV is from EU-ish (the EUV light is from the US), and many, many more high-tech tools are from the US/JP/TW/etc.
What I am wondering: did Hollande and Valls "give" France to Big Tech... or "sell" it, if you see what I mean, because it is very easy to set up public money channels using 'Big Tech' which look "clean", aka hidden behind a technologi-blablublo smoke screen, since most people are scared of tech and/or don't understand the fine details.
It is all about simple file formats and network protocols, good enough to do the job and stable over time. A good compromise is to use a strongly and dynamically defined subset of Big Tech stuff, which you know can be locally implemented with reasonable effort (by citizens, small companies, state administrations, etc). That will foster alternatives (good I guess). That's why I am talking about web sites, and not web apps (noscript/basic HTML), and we could talk about a strongly defined subset of PDF.
Ofc, the devil hides in the details, this is a very coarse overview: you have to basically decide in a fine-grained case by case, mistakes will be made and will have to painfully be fixed. You cannot get it all in one shot, it is module per module, back and forth, and probably slowly.
Geekbench Single Thread Score:
- DGX Spark (same CPU as RTX Spark): 3125
- Snapdragon X1 Elite: 2950
- Snapdragon X2 Elite Extreme: 4050
- AMD Ryzen 9 9955HX: 3225
- Intel Core Ultra 9 290HX Plus: 3175
- Apple M5 Max: 4350
I'm happy to be wrong about Qualcomm's latest X2 chip performance, even if it is shipping in only a single product so far. Their previous best was the lowest in this list.
I was disappointed to see that the RTX Spark has the ARM cores from the DGX Spark. I was hoping it had their new in-house developed cores that Nvidia is starting to use on their latest gen server parts. They look really fast. That said, if RTX Spark has CPU performance like the DGX Spark, it will be almost as fast as the top AMD/Intel parts.
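For anyone who wants the "almost as fast" claim as percentages, here's a small sketch that normalises the single-thread scores quoted above against the DGX Spark. The scores are copied from the list in this thread; the function and variable names are just illustrative.

```python
# Geekbench single-thread scores as quoted in the list above.
SCORES = {
    "DGX Spark (same CPU as RTX Spark)": 3125,
    "Snapdragon X1 Elite": 2950,
    "Snapdragon X2 Elite Extreme": 4050,
    "AMD Ryzen 9 9955HX": 3225,
    "Intel Core Ultra 9 290HX Plus": 3175,
    "Apple M5 Max": 4350,
}


def relative_to(baseline, scores):
    """Return each score as a ratio of the baseline chip's score."""
    base = scores[baseline]
    return {name: score / base for name, score in scores.items()}


BASELINE = "DGX Spark (same CPU as RTX Spark)"
relative = relative_to(BASELINE, SCORES)

# Print fastest first, with the gap to the baseline as a signed percentage.
for name, ratio in sorted(relative.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:36s} {SCORES[name]:5d}  {ratio - 1:+.1%}")
```

On these numbers the Ryzen 9 9955HX is about 3% ahead of the DGX Spark and the Core Ultra 9 about 2% ahead, while the M5 Max is about 39% ahead.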
https://www.gartner.com/en/newsroom/press-releases/2026-4-10...