serhii.net

In the middle of the desert you can say anything you want

28 Sep 2023

LM Benchmarks notes

Context: 230928-1527 Evaluation benchmark for DE-UA text

Here I’ll keep random interesting benchmarks I find.

HELM

GLUECoS

code: GLUECoS/Code at master · microsoft/GLUECoS

Cross-lingual

XGLUE

Other

BIGBench

Leaderboards

‘evaluation harness’es

Random relevant code
