serhii.net

In the middle of the desert you can say anything you want

21 Oct 2022

HF token-classification pipeline prediction text

Pipelines: in the predictions, p['word'] is not the exact string from the input text! It's reconstructed from the subtokens, so it might have extra spaces etc. For the exact string, use the offsets: slice the input text with p['start'] and p['end'].
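A minimal sketch of the offsets approach. To keep it runnable without downloading a model, the predictions list is hand-written and only shaped like the pipeline's output: the text, the score and the 'word' value are made up for illustration, and exact_spans is a hypothetical helper name, not part of transformers.

```python
# Hypothetical input text (note the double space before "Picard").
text = "My name is Jean-Luc  Picard."

# Made-up predictions, shaped like the dicts a token-classification
# pipeline returns: 'word' is rebuilt from subtokens, while
# 'start'/'end' are character offsets into the original text.
predictions = [
    {
        "entity_group": "PER",
        "score": 0.99,
        "word": "Jean - Luc Picard",  # illustrative reconstruction, not real model output
        "start": 11,
        "end": 27,
    },
]


def exact_spans(text, predictions):
    """Return the exact input substrings for each prediction, using the offsets."""
    return [text[p["start"]:p["end"]] for p in predictions]


for p, span in zip(predictions, exact_spans(text, predictions)):
    # Show the reconstructed word next to the exact slice of the input.
    print(repr(p["word"]), "vs", repr(span))
```

With a real pipeline the same slicing should apply to each prediction, assuming its start and end offsets are not None.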

EDIT - I did another good deed today: Fix error/typo in docstring of TokenClassificationPipeline by pchr8 · Pull Request #19798 · huggingface/transformers

In the middle of the desert I can say anything I want.