whichllm: a hardware-aware recommendation engine that ranks the best local LLMs based on system specs and real-world benchmarks
What it solves
Finding the best local Large Language Model (LLM) for specific hardware is often difficult because simply fitting a model into VRAM doesn't guarantee it is the highest quality or fastest option. whichllm solves this by automatically detecting system hardware and ranking models from HuggingFace based on actual benchmark performance, estimated speed, and hardware compatibility rather than just size.
How it works
The tool analyzes the user's GPU, CPU, and RAM to estimate VRAM requirements (including weights, KV cache, and overhead) and generation speed based on memory bandwidth. It fetches live model data from the HuggingFace API and merges it with multiple benchmark sources (such as LiveBench, Chatbot Arena, and Open LLM Leaderboard). A scoring engine then ranks models by combining benchmark quality, model size, quantization penalties, and runtime fit (e.g., full GPU vs. partial offload).
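The memory and speed estimates described above can be sketched as follows. This is a minimal illustration under stated assumptions, not whichllm's actual code: the `ModelSpec` fields, the function names, the fixed 1 GB overhead, and the 0.6 bandwidth-efficiency factor are all hypothetical. It assumes an FP16 KV cache and that decoding is memory-bound, so each generated token requires reading the full set of weights once.

```python
from dataclasses import dataclass


@dataclass
class ModelSpec:
    """Hypothetical model description used only for this sketch."""
    params_billions: float  # parameter count, in billions
    bits_per_weight: float  # 16 for FP16, roughly 4 to 5 for a 4-bit quant
    n_layers: int
    n_kv_heads: int
    head_dim: int


def weights_gb(m: ModelSpec) -> float:
    # 1e9 params * (bits / 8) bytes per param, expressed in GB (1e9 bytes)
    return m.params_billions * m.bits_per_weight / 8


def estimate_vram_gb(m: ModelSpec, context_len: int = 4096,
                     kv_bytes: int = 2, overhead_gb: float = 1.0) -> float:
    # KV cache: one K and one V tensor per layer, per token in the context
    kv_gb = (2 * m.n_layers * m.n_kv_heads * m.head_dim
             * kv_bytes * context_len) / 1e9
    # overhead_gb is an assumed flat allowance for runtime buffers
    return weights_gb(m) + kv_gb + overhead_gb


def estimate_tokens_per_sec(m: ModelSpec, bandwidth_gb_s: float,
                            efficiency: float = 0.6) -> float:
    # Memory-bound decoding: throughput is bounded by how many times per
    # second the weights can be streamed from memory.
    return efficiency * bandwidth_gb_s / weights_gb(m)
```

For example, an 8B model at 4 bits per weight with 32 layers, 8 KV heads, and a head dimension of 128 comes out at about 5.5 GB for a 4096-token context under these assumptions. Real tools typically refine the overhead and efficiency terms per backend and per device.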
Who it’s for
It is designed for users running local LLMs who want to optimize their hardware usage, as well as people planning hardware purchases who want to simulate specific GPUs to see which models they could run.
Highlights
- Hardware Auto-detection: Supports NVIDIA, AMD, Intel, and Apple Silicon.
- Evidence-Based Ranking: Uses real benchmark scores with confidence-based dampening instead of simple size heuristics.
- One-Command Execution: Includes `whichllm run` to instantly download and chat with a recommended model and `whichllm snippet` to generate ready-to-use Python code.
- Hardware Planning: Features `plan` and `upgrade` commands to determine the GPU needed for a specific model or compare current hardware against potential upgrades.
- Live Data: Integrates directly with the HuggingFace API for up-to-date model recommendations.
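To make the idea of confidence-based dampening and runtime fit concrete, here is one possible way such a score could be combined. It is a sketch under assumptions, not the project's real formula: the neutral prior of 50, the fit multipliers, and the function names are invented for illustration. A benchmark score is pulled toward the prior when confidence in it is low, then scaled down by a quantization penalty and by how well the model fits the hardware.

```python
# Assumed multipliers for how the model would run on the detected hardware.
FIT_MULTIPLIER = {"full_gpu": 1.0, "partial_offload": 0.7, "cpu_only": 0.4}


def dampened_quality(score: float, confidence: float,
                     prior: float = 50.0) -> float:
    # confidence in [0, 1]: 1 keeps the score as-is, 0 falls back to the prior
    return confidence * score + (1.0 - confidence) * prior


def rank_score(benchmark: float, confidence: float,
               quant_penalty: float, fit: str) -> float:
    # quant_penalty in [0, 1): fraction of quality assumed lost to quantization
    quality = dampened_quality(benchmark, confidence)
    return quality * (1.0 - quant_penalty) * FIT_MULTIPLIER[fit]


# Hypothetical candidates: (name, benchmark, confidence, quant_penalty, fit)
candidates = [
    ("small-q4", 70.0, 0.9, 0.05, "full_gpu"),
    ("large-fp16", 85.0, 0.9, 0.0, "partial_offload"),
]
ranked = sorted(candidates, key=lambda c: rank_score(*c[1:]), reverse=True)
```

With these made-up numbers, the smaller quantized model that fits entirely on the GPU ranks above the larger model that needs partial offload, which is the kind of trade-off a size-only heuristic would miss.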
Sources
- Andyyyy64/whichllm