04 Jan 2024

Masterarbeit current status

Motivational accountability!

Status

# Theory
Intro:       [##                   ]
Grammar:     [##################   ]
LM theory:   [##                   ]
Task desc.:  [#######              ]
Literature:  [#####                ]
# Code
LMentry:     [############         ]
UA-CBT:      [####                 ]
Eval code:   [                     ]
Run exps:    [                     ]

Plan

1. Finish UA-CBT to a reasonable extent
2. Dig deep into formats / eval harnesses / benchmarks code:
   1. write the relevant theory as I go
   2. find cool UA LMs to use for my tests
3. Finish basic code for evaluation and experiments to have it ready
4. Finish the existing tasks, LMentry and UA-CBT as the key ones
   1. (they alone would be enough, honestly)
5. Run experiments, and hopefully write the paper based on what I have!
6. Finish the Pravda dataset eval task code
7. Properly solve the pandoc issues etc. and have code for camera-ready citations, glosses, etc.
8. Write the additional tasks if I have any time left at this point
9. Run all experiments

Stack

231002-2325 Masterarbeit toread stack:

CBT paper¹
LM eval paper²

¹ (@taskCBT) Felix Hill, Antoine Bordes, Sumit Chopra, Jason Weston (2015). “The Goldilocks Principle: Reading Children’s Books with Explicit Memory Representations”. https://arxiv.org/abs/1511.02301, doi:10.48550/ARXIV.1511.02301
² (@Guo2023) Zishan Guo, Renren Jin, Chuang Liu, Yufei Huang, Dan Shi, Supryadi, Linhao Yu, Yan Liu, Jiaxuan Li, Bojian Xiong, Deyi Xiong (2023). “Evaluating Large Language Models: A Comprehensive Survey”. http://arxiv.org/abs/2310.19736

In the middle of the desert I can say anything I want.

serhii.net
