In the middle of the desert you can say anything you want

05 Dec 2022


Previously: [[garden/it/221119-2306 LM paper garden]] has more context about such metrics, [[garden/it/221204-2349 Interesting block with explanations of ML stuff]] has the compression angle for it.

Dumping these here for now.

The GPT21 paper puts it like this:

“Results on language modeling datasets are commonly reported in a quantity which is a scaled or ex- ponentiated version of the average negative log probability per canonical prediction unit - usually a character, a byte, or a word.”

GPT-2 (Metrics : PPL, BPB, BPC) led me to:

Evaluation Metrics for Language Modeling is really detailed.

  1. Radford: Language models are unsupervised multitask learners - Google Scholar / ↩︎

Nel mezzo del deserto posso dire tutto quello che voglio.