Intense Analytics · AI Hardware

An analysis of DGX Spark ownership: when it makes sense

When a DGX Spark beats renting an H100, and when it doesn't.

By Tejas Patel · intenseanalytics.com

Every few weeks someone asks me whether they should buy an NVIDIA DGX Spark or just rent H100s in the cloud. It sounds like a spec-sheet question, one box against one card. It isn't. The real decision underneath it is older and simpler: should you own your AI compute, or rent it? The DGX Spark is the cleanest case for owning. A cloud H100 is the cleanest case for renting. Frame it that way, and the choice mostly makes itself.

Here is the whole article in one idea, and then the scenarios.

First: you probably don't need full fine-tuning

The most expensive assumption in this whole decision is that you need a full fine-tune. You usually don't. QLoRA, a parameter-efficient method, adapts a 70B model on a single GPU by training small adapters on top of a frozen 4-bit base. For most domain adaptation it gets you most of the way there, at a tiny fraction of the cost and with none of the multi-GPU complexity. Full fine-tuning retrains every weight, which needs a whole node of GPUs and runs into the hundreds or thousands of dollars per run.

So the honest first step is not "which machine?" It is "can QLoRA clear my quality bar?" If it can, and it usually can, the hardware question gets a lot smaller and a lot cheaper.

A note on wording: When I say H100 here, I mean one you rent in the cloud, not one you buy. You can buy an H100, but at $25,000 to $40,000 that is a different conversation. Throughout this piece the H100 stands in for renting, and the Spark stands in for owning.

The real question: own or rent

The cleanest way to see this is not as a hardware fight. It is a financing choice. Owning means capital up front, a box that depreciates, and compute that sits idle between jobs. Renting means paying only while you train, on hardware you never have to resell or watch go obsolete. The DGX Spark is the best version of owning. A cloud H100 is the best version of renting.

Underneath the money sits one technical fact that shapes everything: fit versus speed.

Memory capacity decides whether a model runs at all. Memory bandwidth decides how fast.

The DGX Spark has 128GB of unified memory at 273 GB/s. A single cloud H100 has 80GB of HBM3 at about 3.35 TB/s, roughly twelve times the bandwidth. So an owned Spark holds bigger models, and a rented H100 runs them far faster. Owning leans on capacity; renting leans on speed and elasticity.
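To put a number on "how fast", here is a minimal sketch of a bandwidth-bound ceiling on decode speed. It assumes every generated token streams the full set of weights from memory once, which is a common rule of thumb for dense models and not a benchmark. The function names are mine, and real throughput will land below these ceilings.

```python
# Rough, bandwidth-bound ceiling on decode speed for a dense model.
# Assumption: each generated token reads all weights from memory once,
# so tokens/s <= bandwidth / weight_bytes. Real throughput is lower.

SPARK_BW_GBPS = 273.0    # DGX Spark unified memory, GB/s (from the text)
H100_BW_GBPS = 3350.0    # cloud H100 HBM3, about 3.35 TB/s (from the text)


def weight_gb(params_billion: float, bits_per_param: float) -> float:
    """Weight footprint in GB: parameters times bytes per parameter."""
    return params_billion * bits_per_param / 8.0


def decode_ceiling(bandwidth_gbps: float, params_billion: float, bits: float) -> float:
    """Upper bound on tokens per second when decode is memory-bound."""
    return bandwidth_gbps / weight_gb(params_billion, bits)


if __name__ == "__main__":
    print(f"bandwidth ratio: {H100_BW_GBPS / SPARK_BW_GBPS:.1f}x")
    for name, bw in (("Spark", SPARK_BW_GBPS), ("H100", H100_BW_GBPS)):
        ceiling = decode_ceiling(bw, 70, 4)
        print(f"{name}: 70B at 4-bit <= {ceiling:.1f} tok/s")
```

For a 70B at 4-bit (about 35GB of weights) the Spark's ceiling works out to under eight tokens per second, and the H100's to roughly twelve times that. Treat both as upper bounds, not predictions.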

Figure: Head-to-head spec comparison of DGX Spark and a cloud H100. The Spark wins capacity, power, and price; the H100 wins compute and bandwidth.

When owning a DGX Spark wins

Owning makes sense when the box works for you often enough, or holds something valuable enough, that paying once beats renting again and again.

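You need to fit a large model on hardware you own. 128GB lets you load a 70B at 8-bit (about 70GB of weights) with ample room left for the KV cache, which is a squeeze on a single 80GB H100, and hold models up to the 200B range when quantized to 4-bit. Full FP16 for a 70B is about 140GB of weights, which fits neither machine.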
You are prototyping and exploring, not racing the clock. For building a workflow and iterating, your speed as a developer matters more than tokens per second. The Spark is a capable sandbox with the full CUDA stack.
Your data cannot leave the building. For regulated, healthcare, or federal work, the premium buys sovereignty, not performance. The cloud's cost advantage simply does not apply when the cloud is off the table.
You want a CUDA box that is always on, with no meter. No provisioning, no idle billing, no setup. It draws about 140W and sits quietly on a desk.

The honest caveat: the Spark decodes a dense 70B at single-digit tokens per second. It is a develop-and-validate machine, not a speed demon. Buy it to fit models and own your environment, not to win benchmarks.
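The capacity side of the argument is simple enough to check by hand. The sketch below counts weights only: parameters times bytes per parameter, set against each machine's memory. It ignores the KV cache, activations, and runtime overhead, so a small positive number means "tight", not "comfortable". The helper name `spare_gb` is my own.

```python
# Weights-only fit check: how much memory is left after loading the model.
# Assumption: KV cache, activations, and runtime overhead are NOT counted,
# so small positive results are tight in practice.

SPARK_GB = 128.0  # DGX Spark unified memory (from the text)
H100_GB = 80.0    # single cloud H100 (from the text)


def weight_gb(params_billion: float, bits_per_param: float) -> float:
    """Weight footprint in GB: parameters times bytes per parameter."""
    return params_billion * bits_per_param / 8.0


def spare_gb(params_billion: float, bits_per_param: float, capacity_gb: float) -> float:
    """Memory left over after weights; negative means the model does not fit."""
    return capacity_gb - weight_gb(params_billion, bits_per_param)


if __name__ == "__main__":
    cases = [(70, 16), (70, 8), (70, 4), (200, 4)]
    for params, bits in cases:
        spark = spare_gb(params, bits, SPARK_GB)
        h100 = spare_gb(params, bits, H100_GB)
        print(f"{params}B @ {bits}-bit: Spark {spark:+.0f} GB, H100 {h100:+.0f} GB")
```

At 16-bit a 70B needs about 140GB for weights alone, so the figures that matter for a single box are the 8-bit and 4-bit rows.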

When renting the cloud wins

Renting makes sense when your need for compute is spiky, or when speed matters more than ownership.

Throughput is the point. Fine-tuning runs, batch inference, anything measured in wall-clock time. The twelve-times-faster memory means the same job finishes far sooner.
Your workload is bursty. Rent by the hour, or by the second on spot and serverless tiers, and pay only while the GPU works. The price of one Spark buys roughly 1,600 H100 hours in the cloud.
You need the best speed per dollar for a specific run. A 70B QLoRA fits on a single 80GB H100 and finishes far faster than on the Spark. Rent the card, run the job, shut it down.
You would rather not own a depreciating asset. GPU hardware holds its price for about a year, then slides as new generations ship. Renting hands that risk to someone else.

The fine-tuning detail people get wrong

"Can you fine-tune a 70B?" hides two very different answers.

QLoRA fits a 70B on a single 80GB H100, or on the Spark. The 4-bit base is about 35GB, the adapters are tiny, and it is the standard, affordable path. This is the one most people actually want.

Full fine-tuning is a different animal. It holds weights, gradients, and optimizer states, roughly sixteen bytes per parameter with Adam in mixed precision, with activations on top. That is over a terabyte for a 70B. That is not one card. It is an eight-GPU node, sharded with FSDP or DeepSpeed, at hundreds to low-thousands of dollars per run. An 8×H200 has about 1.1TB of HBM in total; an 8×H100 has 640GB, so it needs offloading or a leaner optimizer to get there. Neither a single H100 nor the Spark comes close. Test whether QLoRA clears your bar before reaching for a full fine-tune.
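The two answers are easier to compare as arithmetic. This is a rough sketch using the figures above: sixteen bytes per parameter for a full fine-tune with Adam, and four bits per parameter for a QLoRA base. Activations and the LoRA adapters are left out, so both results are lower bounds. The 141GB per-card figure for the H200 is my addition, not something stated earlier in this piece.

```python
import math

# Rough training-memory arithmetic for a dense model.
# From the text: full fine-tuning with Adam ~ 16 bytes per parameter;
# a QLoRA base at 4 bits ~ 0.5 bytes per parameter.
# Assumption: activations and LoRA adapters are NOT modeled (lower bounds).

FULL_FT_BYTES_PER_PARAM = 16.0
QLORA_BASE_BITS = 4.0
H100_GB = 80.0    # from the text
H200_GB = 141.0   # assumption: per-card H200 memory, not from the text


def full_finetune_gb(params_billion: float) -> float:
    """Weights + gradients + Adam states, in GB."""
    return params_billion * FULL_FT_BYTES_PER_PARAM


def qlora_base_gb(params_billion: float) -> float:
    """Frozen 4-bit base weights, in GB."""
    return params_billion * QLORA_BASE_BITS / 8.0


def min_gpus(total_gb: float, per_gpu_gb: float) -> int:
    """Fewest cards whose combined memory covers total_gb."""
    return math.ceil(total_gb / per_gpu_gb)


if __name__ == "__main__":
    full = full_finetune_gb(70)
    base = qlora_base_gb(70)
    print(f"70B full fine-tune state: {full:.0f} GB")
    print(f"70B QLoRA 4-bit base:     {base:.0f} GB")
    print(f"80 GB cards for full state:  {min_gpus(full, H100_GB)}")
    print(f"141 GB cards for full state: {min_gpus(full, H200_GB)}")
```

The gap between the two totals, about 1,120GB against 35GB, is the whole reason the QLoRA question comes first.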

The money, over five years

This is where own versus rent gets concrete. The purchase price is not the whole cost of owning. A Spark is $4,699 plus power, and as an asset it depreciates toward almost nothing while it sits there. Renting only charges you when you train.

At a moderate cadence, around forty GPU-hours a month, renting stays cheaper than owning for most of the five-year window. At that cadence, on-demand H100 use crosses the Spark's total cost at roughly year four; before that, renting wins. An earlier break-even only arrives if you run premium GPUs far more heavily, up to almost constantly.
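Here is a minimal sketch of that break-even, so you can plug in your own cadence. The hourly rate is the one implied earlier (one Spark buys roughly 1,600 H100 hours). The electricity price and the always-on duty cycle are my assumptions, clearly not figures from this piece, and resale value is treated as zero.

```python
# Own-vs-rent break-even, under stated assumptions.
# From the text: Spark price, ~1,600 H100 hours per Spark, ~140 W draw,
# ~40 GPU-hours a month. Assumptions: electricity price, always-on duty
# cycle, and zero resale value.

SPARK_PRICE = 4699.0
H100_RATE = SPARK_PRICE / 1600.0   # implied on-demand $/hour
SPARK_WATTS = 140.0
KWH_PRICE = 0.15                   # assumption: $ per kWh
HOURS_PER_YEAR = 24 * 365


def annual_rent(hours_per_month: float, rate: float = H100_RATE) -> float:
    """Yearly rental spend at a given monthly cadence."""
    return hours_per_month * 12 * rate


def annual_power(watts: float = SPARK_WATTS, kwh_price: float = KWH_PRICE) -> float:
    """Yearly electricity cost for a box left on all year."""
    return watts / 1000.0 * HOURS_PER_YEAR * kwh_price


def breakeven_years(hours_per_month: float):
    """Years until cumulative rent exceeds purchase plus power, or None."""
    net_saving = annual_rent(hours_per_month) - annual_power()
    if net_saving <= 0:
        return None  # renting never costs more than just powering the box
    return SPARK_PRICE / net_saving


if __name__ == "__main__":
    for hours in (10, 40, 160):
        years = breakeven_years(hours)
        label = "never" if years is None else f"{years:.1f} years"
        print(f"{hours:>3} GPU-hours/month: break-even in {label}")
```

Under these assumptions forty hours a month lands a little under four years. A comparison of cost only: it says nothing about the rented H100 finishing each job far sooner.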

A quick way to choose

What you need	Own or rent
Fit a 200B model on your own desk	Own · DGX Spark
Keep regulated or on-prem data in-house	Own · DGX Spark
Prototype and explore large models locally	Own · DGX Spark
A CUDA box always on, no meter	Own · DGX Spark
A fast 70B QLoRA run	Rent · cloud H100
Bursty training you pay for by the hour	Rent · cloud H100 · spot
A full 70B fine-tune	Rent · 8×H200 node
Lowest cost with near-constant use	Own the box
Lowest cost with occasional use	Rent the cloud

One number not to compare: You will see the Spark's "1 PFLOP" set against the H100's tensor numbers. Do not read that as a head-to-head. The Spark's figure is FP4 with sparsity, and Hopper does not even do FP4, so its FP8 number is a different unit entirely. The honest determinant of real-world speed is memory bandwidth, and there the H100 is about twelve times ahead.

Bottom line

This was never really Spark versus H100. It is own versus rent. Owning wins when you need to fit big models, keep data in-house, or run near-constantly. Renting wins when you need speed, when work is bursty, or when you would rather not hold a depreciating asset. The DGX Spark is the best box to own; a cloud H100 is the best engine to rent.

Match the choice to how your work actually flows. And for the fine-tuning most teams do, a single rented card handles it, so you may not need to buy anything at all.
