serhii.net

In the middle of the desert you can say anything you want

28 Sep 2023

Masterarbeit draft

This will be the Markdown draft, I’ll jot things down and then expand.

Introduction

Наукова новизна

These guys trained an UA LM(youscan/ukr-roberta-base · Hugging Face), but tested it on their internal tasks and they say it’s better than bert-base-multilingual-cased : How to Train a New Language Model for NLP | YouScan

LM Benchmarking

Theory

Terminology

  • from my first paper - task / dataset / benchmark / …

Kinds of benchmark tasks

Notable benchmarks

Similar work

Ukrainian LM benchmark tasks

Basic description

Single-input understanding tasks1

POS tagging

POS tagging

News classification (NC)

News classification

Pair input understanding tasks

UA-SQuAD

SQuAD

Word Sense Disambiguation

Contextual Embeddings for Ukrainian: A Large Language Model Approach to Word Sense Disambiguation - ACL Anthology<@labaContextualEmbeddingsUkrainian2023 Contextual Embeddings for Ukrainian (2023) z/d/>

Children’s book test

Original: <@taskCBT (2015) z/d/> Get Ukrainian book, POS-tag, generate questions

Benchmarks

Benchmark data contamination

Canary GUID strings

  • My own benchmark tasks have a canary string
  • The three ones from ua-datasets don’t, and are too available online - they might have become part of some LLM training data

  1. XGLUE calls them that, todo better sourec: <@liangXGLUENewBenchmark2020 XGLUE (2020) z/d↩︎

Nel mezzo del deserto posso dire tutto quello che voglio.