- cross-posted to:
- technology@lemmy.zip
NVIDIA just trained a 12B-parameter language model on 10 trillion tokens entirely in 4-bit precision.
Here’s why this matters:
- NVFP4 delivers 2–3× faster math throughput and 50% less memory vs FP8
- Accuracy? Practically identical. (MMLU-Pro: FP8 = 62.62%, NVFP4 = 62.58%)
- Stability issues have been solved using Random Hadamard transforms, stochastic rounding, and 2D scaling
This is the first successful demonstration of large-scale 4-bit pretraining without losing accuracy.
The next generation of frontier models will be faster and cheaper, without compromise.
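One of the stability tricks the post lists, stochastic rounding, is easy to illustrate. A minimal sketch, assuming the positive magnitude levels of an E2M1 (FP4) format; real NVFP4 also applies per-block scale factors, which this toy omits:

```python
import random

# Positive representable magnitudes of an E2M1 (FP4) format.
# NVFP4 additionally uses per-block scaling; omitted here for clarity.
FP4_LEVELS = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def stochastic_round_fp4(x: float) -> float:
    """Round |x| to one of the two bracketing FP4 levels, with probability
    proportional to proximity, so the rounding is unbiased in expectation."""
    sign = -1.0 if x < 0 else 1.0
    mag = min(abs(x), FP4_LEVELS[-1])  # clamp to the max representable value
    for lo, hi in zip(FP4_LEVELS, FP4_LEVELS[1:]):
        if lo <= mag <= hi:
            p_up = (mag - lo) / (hi - lo)  # closer to hi -> higher chance of hi
            return sign * (hi if random.random() < p_up else lo)
    return sign * mag
```

The point: E[round(x)] = x, so tiny gradient updates still move the weights on average, instead of being zeroed out the way deterministic round-to-nearest would in such a coarse format.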
The math is 62% accurate? Is that what that’s saying?
In this context, accuracy is a metric that measures the percentage of questions the model answered correctly on the MMLU-Pro benchmark. So, it’s not math specifically being 62% accurate, but the overall ability of the model to converge on a correct answer.
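The metric described above is just the fraction of benchmark questions answered correctly; a minimal sketch (function name and data are illustrative, not from MMLU-Pro itself):

```python
def accuracy(predictions, answers):
    """Fraction of questions where the model's answer matches the key."""
    correct = sum(p == a for p, a in zip(predictions, answers))
    return correct / len(answers)

# A 62.58% score means the model picked the correct choice on roughly
# 62.58% of the benchmark's multiple-choice questions.
print(accuracy(["A", "B", "C", "D"], ["A", "B", "C", "A"]))  # 0.75
```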
I thought fp4 was for quantization only. Is it for training now too?
Looks like it, and supposedly without loss of quality.
> next generation of frontier models

lol. Too much grifter speak for me. Slow down on that Kool-Aid.
People building their whole identity around hating LLM tech will never stop being hilarious.
iTs jUsT a PaTtErN mAcHiNe