OlmOCR for pdf-png-xxx to text
allenai/olmocr: Toolkit for linearizing PDFs for LLM datasets/training
Online demo: https://olmocr.allenai.org/
- can be exposed through eg vllm
- really cool results on messy docs
Nel mezzo del deserto posso dire tutto quello che voglio.
comments powered by Disqus