LLM tokens per second — live multi-provider benchmark

Real inference speed, measured continuously. Every row is a live model from Ollama, OpenCode Zen, or OpenCode Go. Rows are sorted by tokens per second, and each model is re-benchmarked roughly every 10 minutes.
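The page does not say how each figure is computed, so the following is only a minimal sketch of one common way to derive a tokens-per-second number and a first-token latency from timestamps recorded around a streamed response, and then to sort rows by speed. Everything named here (`BenchmarkSample`, its fields, `tokens_per_second`, `first_token_latency_ms`, `rank_by_speed`, and the `model-a`/`model-b` sample data) is hypothetical and not taken from the benchmark itself. Timing from first token to last token is also an assumption; a benchmark could just as well time the whole request.

```python
from dataclasses import dataclass


@dataclass
class BenchmarkSample:
    """One hypothetical benchmark run (field names are assumptions)."""
    model: str
    provider: str
    request_sent: float   # seconds, when the request was sent
    first_token: float    # seconds, when the first token arrived
    last_token: float     # seconds, when the last token arrived
    output_tokens: int    # number of generated tokens


def tokens_per_second(sample: BenchmarkSample) -> float:
    """Generated tokens divided by generation time (first to last token)."""
    duration = sample.last_token - sample.first_token
    if duration <= 0 or sample.output_tokens <= 0:
        return 0.0  # treat empty or instantaneous runs as unmeasurable
    return sample.output_tokens / duration


def first_token_latency_ms(sample: BenchmarkSample) -> float:
    """Time from sending the request to receiving the first token, in ms."""
    return (sample.first_token - sample.request_sent) * 1000.0


def rank_by_speed(samples: list[BenchmarkSample]) -> list[BenchmarkSample]:
    """Fastest first, mirroring a leaderboard sorted by tokens per second."""
    return sorted(samples, key=tokens_per_second, reverse=True)


if __name__ == "__main__":
    # Made-up samples for illustration only, not real measurements.
    runs = [
        BenchmarkSample("model-b", "provider-y", 0.0, 0.25, 4.25, 200),
        BenchmarkSample("model-a", "provider-x", 0.0, 0.5, 2.5, 400),
    ]
    for run in rank_by_speed(runs):
        print(run.model, run.provider,
              round(tokens_per_second(run), 1),
              f"{first_token_latency_ms(run):.0f}ms")
```

Separating generation time from first-token latency keeps a slow start from dragging down the throughput figure, which is one reason a row can show high tokens per second alongside a long latency.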

● live — last benchmark 34s ago
Model | Provider | Tokens/s | | | | Intelligence Index | Last benchmark
nemotron-3-nano:30b (non-reasoning) | Ollama | 304.8 | 267.3 | 433ms | 100% | 7.4 | 1m ago
ministral-3:3b (non-reasoning) | Ollama | 292.3 | 192.0 | 428ms | 100% | 5.6 | 1m ago
deepseek-v4-flash | Ollama | 259.5 | 194.9 | 487ms | 100% | 40.3 | 5m ago
North Mini Code (Free) | OpenCode Zen | 197.9 | 369.1 | 243ms | 84% | 20.6 | 6m ago
glm-5.2 | Ollama | 174.7 | 108.6 | 401ms | 100% | 50.7 | 3m ago
kimi-k2.7-code | Ollama | 170.8 | 141.1 | 696ms | 100% | 41.9 | 2m ago
minimax-m2.1 | Ollama | 170.5 | 131.0 | 876ms | 100% | 31.4 | 2m ago
kimi-k2.5 | Ollama | 170.0 | 130.2 | 652ms | 100% | 38.1 | 2m ago
ministral-3:8b (non-reasoning) | Ollama | 160.7 | 131.6 | 427ms | 100% | 8.9 | 1m ago
glm-5.1 | Ollama | 151.0 | 145.4 | 890ms | 100% | 40.2 | 3m ago
gpt-oss:20b | Ollama | 142.6 | 110.6 | 6.4s | 100% | 14.9 | 2m ago
gemma4:31b | Ollama | 135.8 | 119.1 | 300ms | 100% | 29.4 | 3m ago
gpt-oss:120b | Ollama | 133.3 | 140.9 | 444ms | 100% | 23.8 | 3m ago
rnj-1:8b | Ollama | 127.5 | 126.3 | 414ms | 100% | | 11m ago
gemini-3-flash-preview | Ollama | 120.2 | 113.8 | 1.7s | 100% | 37.8 | 4m ago
ministral-3:14b (non-reasoning) | Ollama | 115.7 | 101.9 | 444ms | 100% | 10 | 1m ago
GLM-5.2 | OpenCode Go | 105.6 | 79.8 | 1.2s | 56% | 50.7 | 10m ago
qwen3.5:397b | Ollama | 101.0 | 94.4 | 509ms | 100% | 33.7 | 11m ago
DeepSeek V4 Flash | OpenCode Go | 96.9 | 91.5 | 1.0s | 54% | 40.3 | 11m ago
gemma3:4b (non-reasoning) | Ollama | 96.9 | 50.2 | 649ms | 97% | 1.1 | 3m ago
glm-5 | Ollama | 91.2 | 103.5 | 2.7s | 100% | 39.5 | 3m ago
devstral-small-2:24b (non-reasoning) | Ollama | 88.9 | 58.7 | 1.0s | 100% | 13.1 | 4m ago
Big Pickle | OpenCode Zen | 88.7 | 87.1 | 1.1s | 48% | | 7m ago
deepseek-v4-pro | Ollama | 88.2 | 155.2 | 676ms | 100% | 44.3 | 4m ago
DeepSeek V4 Flash (Free) | OpenCode Zen | 86.2 | 163.5 | 1.0s | 48% | 40.3 | 7m ago
qwen3-coder-next (non-reasoning) | Ollama | 80.6 | 93.8 | 317ms | 100% | 21.2 | 11m ago
GLM-5.1 | OpenCode Go | 79.9 | 98.2 | 520ms | 56% | 40.2 | 10m ago
devstral-2:123b (non-reasoning) | Ollama | 79.4 | 61.1 | 567ms | 100% | 15.5 | 4m ago
MiMo V2.5 (Free) | OpenCode Zen | 78.9 | 289.1 | 4.7s | 91% | | 6m ago
minimax-m2.5 | Ollama | 73.7 | 76.6 | 295ms | 100% | 33.7 | 2m ago
DeepSeek V4 Pro | OpenCode Go | 72.9 | 72.9 | 1.1s | 54% | 44.3 | 10m ago
qwen3-coder:480b (non-reasoning) | Ollama | 66.0 | 90.3 | 566ms | 100% | 18 | 11m ago
MiMo V2.5 | OpenCode Go | 64.1 | 142.6 | 1.6s | 85% | 40.3 | 10m ago
kimi-k2.6 | Ollama | 56.8 | 89.9 | 2.4s | 100% | 42.8 | 2m ago
Qwen3.6 Plus | OpenCode Go | 55.6 | 55.6 | 1.0s | 17% | 39.6 | 40m ago
nemotron-3-super | Ollama | 54.2 | 54.1 | 599ms | 100% | 25.4 | 1m ago
MiniMax M3 | OpenCode Go | 52.9 | 50.3 | 987ms | 67% | 44.4 | 9m ago
minimax-m3 | Ollama | 52.1 | 61.0 | 880ms | 100% | 44.4 | 1m ago
Kimi K2.7 Code | OpenCode Go | 50.4 | 55.7 | 1.1s | 88% | 41.9 | 10m ago
gemma3:12b (non-reasoning) | Ollama | 49.5 | 37.7 | 523ms | 100% | 3.4 | 4m ago
mistral-large-3:675b (non-reasoning) | Ollama | 47.6 | 59.1 | 632ms | 100% | 16.2 | 1m ago
minimax-m2.7 | Ollama | 38.2 | 39.8 | 1.7s | 100% | 38.1 | 2m ago
MiMo V2.5 Pro | OpenCode Go | 36.2 | 56.4 | 2.3s | 87% | 42.2 | 10m ago
MiniMax M2.5 | OpenCode Go | 32.7 | 44.4 | 2.2s | 67% | 33.7 | 9m ago
MiniMax M2.7 | OpenCode Go | 28.0 | 38.6 | 1.7s | 67% | 38.1 | 9m ago
glm-4.7 | Ollama | 24.3 | 88.1 | 1.8s | 100% | 33.8 | 3m ago
gemma3:27b (non-reasoning) | Ollama | 16.7 | 16.1 | 567ms | 100% | 4.8 | 4m ago
deepseek-v3.2 | Ollama | 12.6 | 27.1 | 749ms | 99% | 33.4 | 5m ago
nemotron-3-ultra | Ollama | 12.2 | 18.7 | 4.2s | 98% | 37.8 | 12m ago
Nemotron 3 Ultra (Free) | OpenCode Zen | 9.1 | 15.3 | 32.9s | 55% | | 6m ago
deepseek-v3.1:671b (non-reasoning) | Ollama | 8.7 | 10.1 | 982ms | 100% | 21 | 6m ago

Intelligence Index scores from Artificial Analysis.

3 models currently unavailable