

Hyprcore is designed to be extensible. If you have a fine-tuned Whisper model, want to plug in a community variant like Moonshine or SenseVoice, or run an LLM fully on-device, you can.

Custom Whisper models

Hyprcore auto-discovers any Whisper GGML .bin model placed in your models directory.
1. Get a model

Download a Whisper GGML .bin file from anywhere—Hugging Face is the most common source. Many fine-tuned variants exist for specific languages or domains.
2. Drop it in the models directory

The directory is ~/Library/Application Support/ai.hyprcore.desktop/models/. Place the .bin file directly inside.
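Installing a model is just a file copy into that directory; a minimal sketch (the Downloads location and the `my-model.bin` filename are placeholders for whatever you actually downloaded):

```shell
# Hyprcore's models directory on macOS (path from the docs above).
MODELS_DIR="$HOME/Library/Application Support/ai.hyprcore.desktop/models"
mkdir -p "$MODELS_DIR"

# Placeholder: move whichever .bin you downloaded into place.
# mv "$HOME/Downloads/my-model.bin" "$MODELS_DIR/"
```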
3. Restart Hyprcore

On the next launch, your model appears in Settings → Models under “Custom Models”. The display name comes from the filename: my-model.bin becomes “My Model”.
Custom models are user-provided and unsupported. If a model crashes Hyprcore, swap it out and report it via Discord, but don’t expect official help. The model must be in valid Whisper GGML format; otherwise it simply won’t load.
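The filename-to-display-name rule can be sketched as a small shell function. This is assumed behavior reconstructed from the my-model.bin → “My Model” example above, not Hyprcore’s actual code:

```shell
# Assumed rule: strip ".bin", turn dashes into spaces, capitalize each word.
display_name() {
  basename "$1" .bin | tr '-' ' ' |
    awk '{ for (i = 1; i <= NF; i++) $i = toupper(substr($i, 1, 1)) substr($i, 2); print }'
}

display_name my-model.bin             # prints "My Model"
display_name ggml-large-v3-turbo.bin  # prints "Ggml Large V3 Turbo"
```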

Specialized speech models

Hyprcore ships several specialized models beyond Whisper and Parakeet:
  • NVIDIA Canary. Multilingual NeMo model with strong English / Spanish / French / German accuracy.
  • Moonshine. Small and fast English-only models, optimized for low-latency dictation.
  • SenseVoice. Multi-language model with emotion recognition.
  • Breeze. Optimized for Traditional Chinese.
  • GigaAM. Russian-language models.
These appear as alternative download targets in Settings → Models. Pick whichever fits the language or use case you record most.

Ollama for LLMs

For a fully local LLM:
1. Install Ollama

Download the installer from ollama.com/download. Ollama runs as a background service.
2. Pull a model

From a terminal:
ollama pull llama3.1:8b      # 4.7 GB, decent quality
ollama pull qwen2.5:14b      # 9 GB, good for summaries
ollama pull llama3.1:70b     # 40 GB, near-cloud quality
3. Configure in Hyprcore

Settings → Models → Post-processing → Provider: Ollama. The dropdown lists every model you’ve pulled. Pick one and Hyprcore will test the connection.
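Before pointing Hyprcore at Ollama, you can confirm the service is up yourself. This sketch assumes Ollama’s default port 11434 and its /api/tags endpoint, which lists the models you’ve pulled:

```shell
# Prints whether an Ollama server responds on the default port.
check_ollama() {
  if curl -fsS "http://localhost:11434/api/tags" >/dev/null 2>&1; then
    echo "ollama is running"
  else
    echo "ollama is not reachable"
  fi
}

check_ollama
```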
Ollama also serves embedding models for semantic search:
ollama pull nomic-embed-text       # fast and good
ollama pull mxbai-embed-large      # higher quality, slower
Configure in Settings → Intelligence → Embedding provider → Ollama.
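To try an embedding model outside Hyprcore, you can call Ollama’s REST API directly; a sketch assuming the default port and that nomic-embed-text has already been pulled (the prompt text is an arbitrary example):

```shell
# POST /api/embeddings returns a JSON body containing an "embedding" array.
resp=$(curl -s http://localhost:11434/api/embeddings \
  -d '{"model":"nomic-embed-text","prompt":"meeting notes about Q3 planning"}' \
  || echo "ollama is not reachable")
echo "$resp"
```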

LM Studio

LM Studio is a GUI alternative to Ollama. The setup is the same: install it, download a model, start the local server, and point Hyprcore at http://localhost:1234. Compared to Ollama, it offers a nicer UI for browsing models and supports more model formats out of the box.

OpenRouter

If you want one API to access dozens of models (Anthropic, OpenAI, Google, Mistral, Meta, …) without juggling keys, use OpenRouter. Settings → Models → Post-processing → Provider: OpenRouter and paste your key. This is also what powers Hyprcore Cloud under the hood, so picking OpenRouter directly gives you the same model selection at OpenRouter’s pricing.
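OpenRouter speaks the OpenAI chat-completions format. The sketch below only builds the request body; the model slug and prompt are placeholder values, and OPENROUTER_API_KEY stands in for your own key:

```shell
# Build an OpenAI-style chat request body for OpenRouter.
openrouter_body() {
  printf '{"model":"%s","messages":[{"role":"user","content":"%s"}]}' "$1" "$2"
}

openrouter_body "anthropic/claude-3.5-sonnet" "Summarize this transcript"

# To actually send it (requires a key):
# curl -s https://openrouter.ai/api/v1/chat/completions \
#   -H "Authorization: Bearer $OPENROUTER_API_KEY" \
#   -H "Content-Type: application/json" \
#   -d "$(openrouter_body anthropic/claude-3.5-sonnet 'Summarize this transcript')"
```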