- cross-posted to:
- technology@lemmy.zip
NVIDIA just trained a 12B-parameter language model on 10 trillion tokens entirely in 4-bit precision.
Here’s why this matters:
- NVFP4 delivers 2–3× faster math throughput and 50% less memory vs FP8
- Accuracy? Practically identical. (MMLU-Pro: FP8 = 62.62%, NVFP4 = 62.58%)
- Stability issues have been solved using Random Hadamard transforms, stochastic rounding, and 2D scaling
This is the first successful demonstration of large-scale 4-bit pretraining without losing accuracy.
The next generation of frontier models will be faster and cheaper, without compromise.
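One of the stability tricks the post lists, stochastic rounding, is easy to illustrate. A minimal sketch, assuming the positive magnitude levels of an E2M1 (FP4) format; real NVFP4 also applies per-block scale factors, which this toy omits:

```python
import random

# Positive representable magnitudes of an E2M1 (FP4) format.
# NVFP4 additionally uses per-block scaling; omitted here for clarity.
FP4_LEVELS = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def stochastic_round_fp4(x: float) -> float:
    """Round |x| to one of the two bracketing FP4 levels, with probability
    proportional to proximity, so the rounding is unbiased in expectation."""
    sign = -1.0 if x < 0 else 1.0
    mag = min(abs(x), FP4_LEVELS[-1])  # clamp to the max representable value
    for lo, hi in zip(FP4_LEVELS, FP4_LEVELS[1:]):
        if lo <= mag <= hi:
            p_up = (mag - lo) / (hi - lo)  # closer to hi -> higher chance of hi
            return sign * (hi if random.random() < p_up else lo)
    return sign * mag
```

The point: E[round(x)] = x, so tiny gradient updates still move the weights on average, instead of being zeroed out the way deterministic round-to-nearest would in such a coarse format.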
The math is 62% accurate? Is that what that’s saying?
In this context, accuracy is a metric that measures the percentage of questions the model answered correctly on the MMLU-Pro benchmark. So, it’s not math specifically being 62% accurate, but the overall ability of the model to converge on a correct answer.
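The metric described above is just the fraction of benchmark questions answered correctly; a minimal sketch (function name and data are illustrative, not from MMLU-Pro itself):

```python
def accuracy(predictions, answers):
    """Fraction of questions where the model's answer matches the key."""
    correct = sum(p == a for p, a in zip(predictions, answers))
    return correct / len(answers)

# A 62.58% score means the model picked the correct choice on roughly
# 62.58% of the benchmark's multiple-choice questions.
print(accuracy(["A", "B", "C", "D"], ["A", "B", "C", "A"]))  # 0.75
```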
I thought fp4 was for quantization only. Is it for training now too?
Looks like it, and supposedly without loss of quality.
> next generation of frontier models

lol. Too much grifter speak for me. Slow down on that Kool-Aid.
People building their whole identity around hating LLM tech will never stop being hilarious.
iTs jUsT a PaTtErN mAcHiNe