Model Intelligence — 2026-05-31

🤖 Model Landscape

Qwen3.6 family continues growing on HuggingFace:

Notable on HuggingFace trending:

Gemma 4 family (updated counts, Google source):

⚙️ Inference Engine Updates

vLLM v0.22.0 (released May 29 — NEW since last scan):

SGLang v0.5.12.post1 (May 26, release notes):

llama.cpp — 4 releases in one day (May 31, releases):

Ollama v0.30.0-rc31 (May 13, releases):

📊 Worth Noting

🖥️ Hardware Sweet Spots

GPU                | Best Model                 | Notes
RTX 3060 12GB      | Qwen3.6-35B-A3B            | MoE efficiency shines here
RTX 3090/4090 24GB | Qwen3.6-27B or Gemma 4-31B | Full Q4 fit, good performance
Dual 24GB          | DeepSeek-R1, gpt-oss-120b  | Multi-GPU needed for large models
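
The "Full Q4 fit" note can be sanity-checked with rough arithmetic. The sketch below is a back-of-envelope estimate, not a benchmark. It assumes about 4.5 bits per weight for a Q4-style quantization and a flat 2 GiB allowance for KV cache and runtime buffers; both figures are illustrative assumptions, and the parameter counts are read off the model names. The helper names `weight_gib` and `fits` are hypothetical.

```python
# Rough VRAM check for quantized model weights.
# Assumptions (not from the source): ~4.5 bits/weight for Q4-style quants,
# and a flat 2 GiB allowance for KV cache and runtime buffers.

def weight_gib(params_billion: float, bits_per_weight: float = 4.5) -> float:
    """Approximate size of the quantized weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits(params_billion: float, vram_gib: float,
         overhead_gib: float = 2.0, bits_per_weight: float = 4.5) -> bool:
    """True if weights plus the assumed overhead fit in the given VRAM."""
    return weight_gib(params_billion, bits_per_weight) + overhead_gib <= vram_gib


if __name__ == "__main__":
    # (label, total parameters in billions, VRAM in GiB)
    candidates = [
        ("Qwen3.6-27B on 24GB", 27, 24),
        ("Gemma 4-31B on 24GB", 31, 24),
        ("Qwen3.6-35B-A3B on 12GB", 35, 12),
    ]
    for label, params, vram in candidates:
        size = weight_gib(params)
        verdict = "fits" if fits(params, vram) else "does not fit fully"
        print(f"{label}: ~{size:.1f} GiB weights, {verdict}")
```

Under these assumptions the 27B and 31B models come out at roughly 14 and 16 GiB of weights, which leaves headroom on a 24GB card. The 35B-A3B total does not fit in 12GB by this estimate, so the 12GB recommendation presumably relies on the MoE design (about 3B active parameters per token, going by the A3B suffix) with part of the model held in system RAM; that reading is an inference, not something the table states.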

Data sourced from HuggingFace API, vLLM GitHub, SGLang GitHub, llama.cpp GitHub, Ollama GitHub

Tags: model-releases, inference, vllm, llama.cpp, qwen, gemma