Untitled
Previously: 221119-2306 LM paper garden has more context about such metrics, 221204-2349 Interesting block with explanations of ML stuff has the compression angle for it.
Dumping these here for now.
The GPT-2 paper puts it like this:
“Results on language modeling datasets are commonly reported in a quantity which is a scaled or exponentiated version of the average negative log probability per canonical prediction unit - usually a character, a byte, or a word.”
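A minimal sketch of how those quantities relate, assuming the average negative log probability is measured in nats per token (the function name and the `chars_per_token` parameter are my own; the metrics are just unit conversions of the same number):

```python
import math

def metrics_from_nll(avg_nll_nats: float, chars_per_token: float = 1.0):
    """Convert mean negative log prob per token (in nats) into common LM metrics."""
    ppl = math.exp(avg_nll_nats)          # perplexity: exponentiated average NLL
    bpt = avg_nll_nats / math.log(2)      # bits per token: NLL rescaled to log base 2
    bpc = bpt / chars_per_token           # bits per character (per byte gives BPB)
    return ppl, bpt, bpc

# An average NLL of ln(2) nats per token is exactly 1 bit per token,
# so with 1 char per token: PPL = 2, BPT = 1, BPC = 1.
ppl, bpt, bpc = metrics_from_nll(math.log(2))
```

So “scaled or exponentiated” in the quote literally means: exponentiate for perplexity, rescale (change log base and normalize by unit count) for bits-per-X.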
GPT-2 (Metrics: PPL, BPB, BPC) led me to:
- python - How to calculate bits per character of a string? (bpc) - Stack Overflow
Evaluation Metrics for Language Modeling is really detailed.