
omlx

GitHub Repo · Pretty sure · tiered KV cache is legit
https://github.com/jundot/omlx

Solid local LLM server for Mac that actually solves the cache-reuse problem instead of pretending it doesn't exist. Real engineering, not wrapper marketing.

Slop 15% · Signal 60% · Science 25%

oMLX does something genuinely useful: a persistent KV cache spanning RAM and SSD tiers with prefix reuse. That solves a real problem in local inference (recomputing the same context over and over). The code appears production-ready (Homebrew tap, settings persistence, process memory limits). Not revolutionary (vLLM-style serving plus MLX, both existing work), but the integration is thoughtful. No obvious scams in the README. Main critique: it is feature-rich enough that the signal comes from solid engineering rather than novel research. Minor slop poin...
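To make the prefix-reuse idea concrete, here is a toy sketch of a two-tier prefix cache. This is a hypothetical illustration, not oMLX's actual implementation: the class name, the LRU policy, and the second dict standing in for the SSD tier are all assumptions for demonstration.

```python
from collections import OrderedDict


class TieredPrefixCache:
    """Toy two-tier KV cache with longest-prefix reuse.

    Hypothetical sketch: the "SSD" tier is simulated with a plain
    dict; a real server would serialize KV tensors to disk.
    """

    def __init__(self, ram_capacity=2):
        self.ram = OrderedDict()   # prefix (tuple of token ids) -> KV blob, LRU order
        self.ssd = {}              # overflow tier (stand-in for on-disk storage)
        self.ram_capacity = ram_capacity

    def put(self, tokens, kv):
        key = tuple(tokens)
        self.ram[key] = kv
        self.ram.move_to_end(key)
        # Spill least-recently-used entries to the "SSD" tier.
        while len(self.ram) > self.ram_capacity:
            old_key, old_kv = self.ram.popitem(last=False)
            self.ssd[old_key] = old_kv

    def longest_prefix(self, tokens):
        """Return (matched_len, kv) for the longest cached prefix of tokens."""
        for n in range(len(tokens), 0, -1):
            key = tuple(tokens[:n])
            if key in self.ram:
                self.ram.move_to_end(key)   # touch: keep hot entries in RAM
                return n, self.ram[key]
            if key in self.ssd:
                kv = self.ssd.pop(key)      # promote an SSD hit back to RAM
                self.put(key, kv)
                return n, kv
        return 0, None
```

A hit means only `tokens[matched_len:]` needs a prefill pass, which is exactly why repeated system prompts get cheap on subsequent requests.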

13,277 stars · Python · 2026-05-09 · 86 days old
