serhii.net

In the middle of the desert you can say anything you want

14 Dec 2023

Master's thesis benchmark task for Russian-Ukrainian interference

  • [[231213-1710 Ukrainska Pravda dataset#Can I also use this to generate tasks for the UA-CBT ( 231024-1704 Master thesis task CBT ) task?]]: during summarization, both ChatGPT 3.5 and 4 use phrases that are definitely Russian-inspired:

  • In the news summarization bit, it magically changed Євген -> Евген, swapping the Ukrainian Є for Е (https://chat.openai.com/share/2f6cf1f3-caf5-4e55-9c1b-3dbd6b73ba29)

  • Та подивись, баране, як я виглядаю з цим стильним сурдутом¹

Вертить хвостиком і крутить рогами. Цап робить враження².
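A name change like Євген -> Евген could be flagged automatically by comparing the source text with the model's summary. The sketch below is only a crude heuristic under my own assumptions: `UK_TO_RU` is a tiny illustrative letter mapping (not a real transliteration table), `russified_words` is a hypothetical helper name, and inflected forms are not handled.

```python
import re

# Naive Ukrainian -> Russian letter substitutions.
# Illustrative assumption only, not a complete or linguistically careful table.
UK_TO_RU = str.maketrans({
    "є": "е", "Є": "Е",
    "і": "и", "І": "И",
    "ї": "и", "Ї": "И",
    "ґ": "г", "Ґ": "Г",
})

WORD_RE = re.compile(r"[^\W\d_]+")  # runs of letters, Unicode-aware


def russified_words(source: str, summary: str) -> list[tuple[str, str]]:
    """Return (original, variant) pairs where a source word is missing from
    the summary but its naively Russified spelling shows up there."""
    summary_words = set(WORD_RE.findall(summary))
    hits = []
    # dict.fromkeys keeps first-seen order and drops repeated words
    for word in dict.fromkeys(WORD_RE.findall(source)):
        variant = word.translate(UK_TO_RU)
        if variant != word and word not in summary_words and variant in summary_words:
            hits.append((word, variant))
    return hits


hits = russified_words("Євген приїхав до Києва.", "Евген приїхав.")
for original, variant in hits:
    print(f"{original} -> {variant}")
```

This will produce false positives whenever both spellings are valid Ukrainian words (кіт / кит), so the hits are candidates for manual review, not a verdict.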

(from 230928-1630 Ideas for Ukrainian LM eval tasks)

  1. Frame as a multiple-choice task! Or boolean? Or “Is this a correct sentence”?
  2. I really like this: `“Цей студент [взявся за/почав] дослідження важкої теми.”`
  3. For fun, here’s ChatGPT lying about prefixes: https://chat.openai.com/share/0eda9061-d2cf-46bc-ad45-38cc6e58934a
  4. False friends!
  5. Here’s an itemized list: Фальшиві друзі перекладача — Вікіпедія
    1. сир/сыр, неділя/неделя/…
    2. False Friends of the Slavist/Russian-Ukrainian - Wikibooks, open books for an open world
  6. ChatGPT ideas:
  7. On the semantic front, exploit polysemy and homonymy differences. Formulate sentences with words that have multiple meanings in Russian, but those meanings have distinct equivalents in Ukrainian. This will challenge the model to accurately discern the intended sense based on context.
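The `[взявся за/почав]` notation from the list above maps quite directly onto a multiple-choice (or per-sentence boolean) item. A minimal sketch, assuming a single `[a/b/...]` slot per sentence; `ChoiceItem` and `expand_template` are hypothetical names, and which option counts as correct is deliberately left to a human annotator.

```python
import re
from dataclasses import dataclass

SLOT_RE = re.compile(r"\[([^\[\]]+)\]")  # first [option/option/...] slot


@dataclass
class ChoiceItem:
    template: str         # sentence with the slot replaced by a blank
    options: list[str]    # candidate fillers, in the order written
    sentences: list[str]  # full sentence for each option ("is this correct?" framing)


def expand_template(text: str, blank: str = "___") -> ChoiceItem:
    """Turn 'A [x/y] B' into a blanked template plus one sentence per option."""
    match = SLOT_RE.search(text)
    if match is None:
        raise ValueError("no [a/b] slot found")
    options = [opt.strip() for opt in match.group(1).split("/")]
    before, after = text[:match.start()], text[match.end():]
    return ChoiceItem(
        template=before + blank + after,
        options=options,
        sentences=[before + opt + after for opt in options],
    )


item = expand_template("Цей студент [взявся за/почав] дослідження важкої теми.")
print(item.template)
for sentence in item.sentences:
    print(sentence)
```

The `sentences` field covers the boolean framing (judge each sentence separately), while `template` plus `options` covers the multiple-choice one.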

More ideas

Using the correct English spelling of Ukrainian cities etc. (e.g. Kyiv rather than the Russian-derived Kiev)!
