100% local · no cloud · free

Your terminal,
powered by a local LLM.

Type what you want in natural language. Kai translates it into the right shell command, entirely on your machine. The model (Qwen 3 1.7B) runs inside the binary via Candle and is downloaded once on first launch. No accounts. No API keys. No daemons. No data leaves your device.

kai — localhost
$ show me disk usage sorted by size
> du -sh * | sort -rh
[Enter] execute   [Tab] edit   [Esc] cancel

Why local?

Modern small models are good enough for shell commands — and running them locally beats sending every keystroke to someone else's server.

Private by default

Every prompt, every directory listing, every command stays on your machine. There is no upstream to leak to.

No cost, no quota

No subscriptions, no rate limits, no per-token fees. Use it on a plane, on flaky cafe Wi-Fi, or on an air-gapped box.

Swap models freely

Qwen 3 0.6B / 1.7B / 4B today; more as Candle adds them. Edit one line in config.toml and the next launch downloads a new brain.

How it works

You type        Kai (Rust binary)               Candle (in-process)
─────────       ─────────────────────           ───────────────────
"list rust  ──> classify input          ──>     Qwen 3 1.7B (Q4)
 files"          + gather context                running on Metal/CPU
                 (cwd, git, OS)
                                         <──    "find . -name '*.rs'"
                show confirm UI
                [Enter]  ──> PTY  ──>           your shell runs it
        

Kai runs your existing shell (zsh / fish / bash) inside a PTY. Plain commands like ls pass through untouched; natural language is routed to the in-process model. No daemon.
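The routing step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the page does not say how Kai decides what counts as a plain command, so the `Route` enum, the `classify` function, and the "first word is a known command" heuristic are all hypothetical.

```rust
use std::collections::HashSet;

/// Where a line of input should go (hypothetical type, not Kai's own).
#[derive(Debug, PartialEq)]
enum Route {
    /// Hand the line to the wrapped shell unchanged.
    Passthrough,
    /// Send the line to the local model for translation.
    Model,
}

/// Assumed heuristic: a line is a plain command when its first word is a
/// known executable or builtin; anything else is treated as natural language.
fn classify(line: &str, known_commands: &HashSet<&str>) -> Route {
    match line.split_whitespace().next() {
        Some(first) if known_commands.contains(first) => Route::Passthrough,
        Some(_) => Route::Model,
        // An empty line goes to the shell, which simply prints a new prompt.
        None => Route::Passthrough,
    }
}

fn main() {
    // In practice this set would be built from $PATH and the shell's builtins.
    let known: HashSet<&str> = ["ls", "cd", "git", "du"].iter().copied().collect();

    for line in ["ls -la", "list rust files", "show me disk usage sorted by size"] {
        println!("{:?} -> {:?}", line, classify(line, &known));
    }
}
```

A first-word check like this would misroute a request such as "find all rust files", because find is itself a command. That kind of ambiguity is one reason a confirm step before execution is useful.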

Install

1. Install with Homebrew

$ brew install kaishell/tap/kai  (coming soon)

Prebuilt binaries will be provided for macOS (Apple Silicon / Intel) and Linux. The Homebrew tap is being prepared for release.

2. Run it

$ kai

The first launch downloads the model (~1.1 GB) from Hugging Face and caches it in ~/.cache/huggingface/hub/. Later launches reuse the cache and skip the download. On an M-series Mac, inference runs at 30–80 tok/s via Metal, so a short command appears almost immediately.

Configuration

Defaults work out of the box. To use a different model, edit ~/.config/kai/config.toml:

model  = "qwen3:1.7b"   # or "qwen3:0.6b" / "qwen3:4b"
device = "auto"         # or "cpu" / "metal" / "cuda"
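As a rough illustration of how these two keys could be read, here is a standard-library-only sketch. It is not Kai's actual loader: the `Config` struct and `parse_config` function are hypothetical, the fallback values assume the settings shown above are the defaults, and the parser handles only flat `key = "value"` lines rather than full TOML. A real implementation would more likely use a TOML library.

```rust
/// Settings mirrored from the two keys shown above (hypothetical struct).
#[derive(Debug, PartialEq)]
struct Config {
    model: String,
    device: String,
}

impl Default for Config {
    fn default() -> Self {
        // Assumption: the values in the example file are the built-in defaults.
        Config { model: "qwen3:1.7b".to_string(), device: "auto".to_string() }
    }
}

/// Parse flat `key = "value"` lines. Blank lines, comment lines, unknown keys,
/// and values without a closing quote are ignored and leave the default in place.
fn parse_config(text: &str) -> Config {
    let mut cfg = Config::default();
    for line in text.lines() {
        if let Some((key, rest)) = line.split_once('=') {
            // Split on the first two double quotes: before, value, trailing comment.
            let mut parts = rest.splitn(3, '"');
            if let (Some(_), Some(value), Some(_)) = (parts.next(), parts.next(), parts.next()) {
                match key.trim() {
                    "model" => cfg.model = value.to_string(),
                    "device" => cfg.device = value.to_string(),
                    _ => {}
                }
            }
        }
    }
    cfg
}

fn main() {
    let text = "model  = \"qwen3:4b\"   # larger model\ndevice = \"cpu\"\n";
    println!("{:?}", parse_config(text));
}
```

Falling back to defaults for anything missing or malformed matches the "defaults work out of the box" behavior: an empty or absent file still yields a usable configuration.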