Type what you want in natural language. Kai translates it to the right shell command — entirely on your machine. Inference runs in-process via Candle, which is compiled into the binary; the default model is Qwen 3 1.7B. No accounts. No API keys. No daemons. No data leaves your device.
Modern small models are good enough for shell commands — and running them locally beats sending every keystroke to someone else's server.
Every prompt, every directory listing, every command stays on your machine. There is no upstream to leak to.
No subscriptions, no rate limits, no per-token fees. Use it on a plane, on flaky café Wi-Fi, or on an air-gapped box.
Qwen 3 0.6B / 1.7B / 4B today; more as Candle adds them. Edit one line in config.toml and the next launch downloads a new brain.
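For example, moving to the 4B variant is a one-line change (the full config file appears below):

```toml
# ~/.config/kai/config.toml
model = "qwen3:4b"   # was "qwen3:1.7b"; downloaded on next launch
```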
```
You type          Kai (Rust binary)          Candle (in-process)
─────────         ─────────────────          ───────────────────
"list rust  ──>   classify input      ──>    Qwen 3 1.7B (Q4)
 files"           + gather context           running on Metal/CPU
                  (cwd, git, OS)
                                      <──    "find . -name '*.rs'"
                  show confirm UI
[Enter]     ──>   PTY ──> your shell runs it
```
Kai wraps your existing shell (zsh / fish / bash) in a PTY. Plain commands like ls pass through untouched; natural language is routed to the in-process model. No daemon.
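A session might look like this (the directory contents and the confirm prompt's exact rendering are illustrative, not a verbatim capture):

```
$ ls                       # plain command: passes through untouched
Cargo.toml  README.md  src
$ list rust files          # natural language: routed to the model
  → find . -name '*.rs'    [Enter] run · [Esc] cancel
```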
Prebuilt binaries are provided for macOS (Apple Silicon / Intel) and Linux. A Homebrew tap is in preparation.
First launch downloads the model (~1.1 GB) from Hugging Face and caches it in ~/.cache/huggingface/hub/; subsequent launches start instantly. On an M-series Mac, inference runs at 30–80 tok/s via Metal, fast enough that generated commands feel immediate.
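To verify the weights landed in the local cache (output shown here is illustrative; the size reflects the 1.7B model on an otherwise empty cache):

```
$ du -sh ~/.cache/huggingface/hub/
1.1G    /Users/you/.cache/huggingface/hub/
```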
Defaults work out of the box. To change the model or compute device, edit ~/.config/kai/config.toml:
```toml
model  = "qwen3:1.7b"   # or "qwen3:0.6b" / "qwen3:4b"
device = "auto"         # or "cpu" / "metal" / "cuda"
```
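For the curious, here is a rough sketch of how the device value could map onto Candle's device constructors; this is an illustration, not Kai's actual source, and pick_device is a hypothetical helper:

```rust
use candle_core::{Device, Result};

/// Hypothetical helper, not Kai's actual code: one plausible way the
/// `device` setting could resolve to a Candle device.
fn pick_device(setting: &str) -> Result<Device> {
    match setting {
        "cpu"   => Ok(Device::Cpu),
        "metal" => Device::new_metal(0), // first Metal GPU (macOS)
        "cuda"  => Device::new_cuda(0),  // first CUDA GPU (Linux/NVIDIA)
        // "auto": prefer an accelerator, fall back to the CPU.
        _ => Device::new_metal(0)
            .or_else(|_| Device::new_cuda(0))
            .or_else(|_| Ok(Device::Cpu)),
    }
}
```

Trying Metal before CUDA mirrors the supported platforms: Apple Silicon Macs get Metal, Linux boxes with NVIDIA GPUs get CUDA, and everything else runs on CPU.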