serhii.net

In the middle of the desert you can say anything you want

01 Oct 2025

OlmOCR for pdf-png-xxx to text

allenai/olmocr: Toolkit for linearizing PDFs for LLM datasets/training

Online demo: https://olmocr.allenai.org/

  • can be exposed through eg vllm
  • really cool results on messy docs
Nel mezzo del deserto posso dire tutto quello che voglio.
comments powered by Disqus