serhii.net

In the middle of the desert you can say anything you want

04 Jan 2024

Master's thesis: current status

Motivational accountability!

Status

# Theory
Intro:       [##                   ]
Grammar:     [##################   ]
LM theory:   [##                   ]
Task desc.:  [#######              ]
Literature:  [#####                ]
# Code
LMentry:     [############         ]
UA-CBT:      [####                 ]
Eval code:   [                     ]
Run exps:    [                     ]

Plan

  1. Finish UA-CBT to a reasonable extent
  2. Dig deep into formats, eval harnesses, and benchmark code:
    1. write the relevant theory as I go
    2. find cool UA LMs to use for my tests
  3. Finish basic code for evaluation and experiments to have it ready
  4. Finish the existing tasks, with LMentry and UA-CBT as the key ones
    1. (they alone would be enough honestly)
  5. Run experiments, and hopefully write the paper based on what I have!
  6. Finish the Pravda dataset eval task code
  7. Properly solve the pandoc issues and have code for camera-ready citations, glosses, etc.
  8. Write the additional tasks if I have any time left at this point
  9. Run all experiments

Stack

231002-2325 Master's thesis to-read stack:

  • CBT paper [1]
  • LM eval paper [2]

  1. Felix Hill, Antoine Bordes, Sumit Chopra, Jason Weston (2015). “The Goldilocks Principle: Reading Children’s Books with Explicit Memory Representations”. https://arxiv.org/abs/1511.02301, doi:10.48550/ARXIV.1511.02301

  2. Zishan Guo, Renren Jin, Chuang Liu, Yufei Huang, Dan Shi, Supryadi, Linhao Yu, Yan Liu, Jiaxuan Li, Bojian Xiong, Deyi Xiong (2023). “Evaluating Large Language Models: A Comprehensive Survey”. http://arxiv.org/abs/2310.19736
