02 Oct 2023

Masterarbeit toread stack

Also: 231002-2311 Meta about writing a Masterarbeit

Relevant papers in Zotero will have a 'toread' tag.

When can we trust model evaluations? — LessWrong

How truthful is GPT-3? A benchmark for language models — LessWrong
- paper: [2109.07958] TruthfulQA: Measuring How Models Mimic Human Falsehoods
  - especially the bits about constructing and validating!
- sylinrl/TruthfulQA: TruthfulQA: Measuring How Models Imitate Human Falsehoods
Code:
- spacy:
  - the entire site: Finding linguistic patterns using spaCy
lists: AI Evaluations - LessWrong
Datasets - The Best Ukrainian Language Datasets of 2022 | Twine (some aren’t ones I added)
Victoria Amelina: Ukraine and the meaning of home | Ukraine | The Guardian
Ukrainian and Russian: Two Separate Languages and Peoples – Ukrainian Institute of America
Bender and friends:
- Unsupervised Cross-lingual Representation Learning & Unsupervised Cross-lingual Learning - Google Slides
- The cool state and fate paper¹, especially the last bits about language typology.
Eval
- https://x.com/omarsar0/status/1719351676828602502?t=DzOSIX8j5Nozy0xoVD9zXg&s=31
  - tjunlp-lab/Awesome-LLMs-Evaluation-Papers: The papers are organized according to our survey: Evaluating Large Language Models: A Comprehensive Survey. Also read the survey linked there.

Python stuff

“Python for the advanced group of linguists, 2020-2021” (lecture, in Russian): klyshinsky/AdvancedPyhon_2020_21

I should read through everything here: A quick tour

HF, LLM etc.: Hamel’s Blog - Dataset Basics

¹ Pratik Joshi, Sebastin Santy, Amar Budhiraja, Kalika Bali, Monojit Choudhury (2020). “The state and fate of linguistic diversity and inclusion in the NLP world”. https://arxiv.org/abs/2004.09095

In the middle of the desert I can say anything I want.
