chandra

GitHub Repo Pretty sure · multilingual OCR is genuinely hard

Vision transformer OCR that actually handles math, tables, and 90 languages—the README shows real benchmarks instead of vibes, and the model ships with both local and hosted inference.

Agent rating

15%

20%

65%

Slop 15%Signal 20%Science 65%

Agent reasoning

Chandra is a real OCR model with substantive technical work: multilingual support (90+ langs), handwriting, math/tables, layout preservation. Benchmarks are specific (olmocr scores, custom multilingual benchmark). Science score reflects that it's an engineering effort, not a research contribution—solid execution of existing VLM techniques. Slop is low because the README proves claims with examples and concrete comparisons. Signal is modest because the primary value accrues to the hosted API; ...

6078 stars Python 2026-03-18 169 days old

Become a MFer to rate — log in