LightOnOCR-2-1B

by LightOn

Open SourceSelf-HostedApache-2.0

SOTA 1B parameter end-to-end OCR (83.2 on OlmOCR-Bench) that beats models 9x larger while being 5x faster

OCRLayout AnalysisTable ExtractionDocument Conversion

Overview

LightOnOCR-2-1B is a fully differentiable end-to-end vision-language model for OCR. It processes PDFs and images directly to clean, naturally ordered Markdown without requiring external pipelines. With only 1B parameters, it achieves state-of-the-art 83.2 on OlmOCR-Bench, outperforming Chandra-9B by 1.5+ points.

The model handles tables, receipts, forms, multi-column layouts, and math notation with high accuracy. It supports 11 languages (with emphasis on European languages) and offers optional bounding box variants for image localization. With throughput of 5.71 pages/second on a single H100 (~493k pages/day), it costs less than $0.01 per 1,000 pages to run.

LightOnOCR comes in multiple variants: the main model for best OCR quality, base checkpoints for fine-tuning with LoRA/PEFT, bbox variants for image detection, and merged soup variants balancing OCR and localization. Can be served with vLLM for production or Hugging Face Transformers for flexibility.

Strengths

  • SOTA on OlmOCR-Bench (83.2) - beats Chandra-9B by 1.5+ points
  • 5.71 pages/s on single H100 (~493k pages/day)
  • 5x faster than dots.ocr, 2x faster than PaddleOCR-VL
  • 3.3x faster than Chandra OCR, 1.7x faster than OlmOCR
  • Less than $0.01 per 1,000 pages
  • End-to-end: PDF/image in, clean Markdown out
  • Day-0 Hugging Face Transformers support
  • Servable via vLLM or Transformers
  • Base checkpoints for domain-specific fine-tuning (LoRA/PEFT)
  • Optional bbox variants for image localization
  • 11 language support with European emphasis

Limitations

  • Optimal at 200 DPI with longest dimension 1540px
  • 11 languages vs 100+ in some alternatives
  • Newer model - community ecosystem still developing

Best Use Cases

  • High-volume document processing pipelines
  • RAG systems requiring clean Markdown from PDFs
  • Invoice and receipt processing
  • Scientific document digitization
  • Multi-column layout extraction
  • Domain-specific OCR via fine-tuning (legal, medical, financial)