Troubleshooting 🔧

When Bot misbehaves, here's how to fix it.

Start with the FAQ’s First 60 seconds if you just want a quick triage recipe. This page goes deeper on runtime failures and diagnostics.

Provider-specific shortcuts: /channels/troubleshooting

Status & Diagnostics

Quick triage commands (in order):

| Command | What it tells you | When to use it |
| --- | --- | --- |
| `bot status` | Local summary: OS + update, gateway reachability/mode, service, agents/sessions, provider config state | First check, quick overview |
| `bot status --all` | Full local diagnosis (read-only, pasteable, safe-ish) incl. log tail | When you need to share a debug report |
| `bot status --deep` | Runs gateway health checks (incl. provider probes; requires reachable gateway) | When “configured” doesn’t mean “working” |
| `bot gateway probe` | Gateway discovery + reachability (local + remote targets) | When you suspect you’re probing the wrong gateway |
| `bot channels status --probe` | Asks the running gateway for channel status (and optionally probes) | When gateway is reachable but channels misbehave |
| `bot gateway status` | Supervisor state (launchd/systemd/schtasks), runtime PID/exit, last gateway error | When the service “looks loaded” but nothing runs |
| `bot logs --follow` | Live logs (best signal for runtime issues) | When you need the actual failure reason |

Sharing output: prefer bot status --all, which redacts tokens. If you paste plain bot status output instead, consider setting BOT_SHOW_SECRETS=0 first so that token previews are hidden.

Common Issues

No API key found for provider "anthropic"

This means the agent’s auth store is empty or missing Anthropic credentials. Auth is per agent, so a new agent won’t inherit the main agent’s keys.

Fix options:

Re-run onboarding and choose Anthropic for that agent.

Or paste a setup-token on the gateway host:

bot models auth setup-token --provider anthropic

Or copy auth-profiles.json from the main agent dir to the new agent dir.

Verify:

bot models status

OAuth token refresh failed (Anthropic Claude subscription)

This means the stored Anthropic OAuth token expired and the refresh failed. If you’re on a Claude subscription (no API key), the most reliable fix is to switch to a Claude Code setup-token or re-sync Claude Code CLI OAuth on the gateway host.

Recommended (setup-token):

# Run on the gateway host (runs Claude Code CLI)
bot models auth setup-token --provider anthropic
bot models status

If you generated the token elsewhere:

bot models auth paste-token --provider anthropic
bot models status

If you want to keep OAuth reuse: log in with Claude Code CLI on the gateway host, then run bot models status to sync the refreshed token into Bot’s auth store.

More detail: Anthropic and OAuth.

Control UI fails on HTTP ("device identity required" / "connect failed")

If you open the dashboard over plain HTTP (e.g. http://<lan-ip>:18789/ or http://<tailscale-ip>:18789/), the browser runs in a non-secure context and blocks WebCrypto, so device identity can’t be generated.

Fix:

Prefer HTTPS via Tailscale Serve.
Or open locally on the gateway host: http://127.0.0.1:18789/.
If you must stay on HTTP, enable gateway.controlUi.allowInsecureAuth: true and use a gateway token (token-only; no device identity/pairing). See Control UI.

CI Secrets Scan Failed

This means detect-secrets found new candidates not yet in the baseline. Follow Secret scanning.

Service Installed but Nothing is Running

If the gateway service is installed but the process exits immediately, the service can appear “loaded” while nothing is running.

Check:

bot gateway status
bot doctor

Both commands show the runtime state (PID, last exit) and log hints.

Logs:

Preferred: bot logs --follow
File logs (always): /tmp/bot/bot-YYYY-MM-DD.log (or your configured logging.file)
macOS LaunchAgent (if installed): $BOT_STATE_DIR/logs/gateway.log and gateway.err.log
Linux systemd (if installed): journalctl --user -u bot-gateway[-<profile>].service -n 200 --no-pager
Windows: schtasks /Query /TN "Bot Gateway (<profile>)" /V /FO LIST

Enable more logging:

Bump file log detail (persisted JSONL):
```
{ "logging": { "level": "debug" } }
```

Bump console verbosity (TTY output only):

{ "logging": { "consoleLevel": "debug", "consoleStyle": "pretty" } }

Quick tip: --verbose affects console output only. File logs remain controlled by logging.level.

See /logging for a full overview of formats, config, and access.

"Gateway start blocked: set gateway.mode=local"

This means the config exists but gateway.mode is unset (or not local), so the Gateway refuses to start.

Fix (recommended):

Run the wizard and set the Gateway run mode to Local:
```
bot configure
```
Or set it directly:
```
bot config set gateway.mode local
```

If you meant to run a remote Gateway instead:

Set a remote URL and keep gateway.mode=remote:

bot config set gateway.mode remote
bot config set gateway.remote.url "wss://gateway.example.com"

Ad-hoc/dev only: pass --allow-unconfigured to start the gateway without gateway.mode=local.

No config file yet? Run bot setup to create a starter config, then rerun the gateway.

Service Environment (PATH + runtime)

The gateway service runs with a minimal PATH to avoid shell/manager cruft:

macOS: /opt/homebrew/bin, /usr/local/bin, /usr/bin, /bin
Linux: /usr/local/bin, /usr/bin, /bin

This intentionally excludes version managers (nvm/fnm/volta/asdf) and package managers (pnpm/npm), because the service does not load your shell init. Runtime variables like DISPLAY should live in ~/.bot/.env (loaded early by the gateway). Exec runs on host=gateway merge your login-shell PATH into the exec environment, so a missing tool usually means your shell init isn’t exporting a PATH that includes it; alternatively, set tools.exec.pathPrepend. See /tools/exec.

WhatsApp + Telegram channels require Node; Bun is unsupported. If your service was installed with Bun or a version-managed Node path, run bot doctor to migrate to a system Node install.
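
To see whether a tool such as node would be visible to the service, you can resolve it against the minimal PATH by hand. The sketch below assumes the Linux list above (on macOS, prepend /opt/homebrew/bin). resolves_on_service_path is a hypothetical helper, not a bot command.

```shell
# Minimal PATH the gateway service is described as using on Linux.
# Assumption: adjust by hand on macOS (prepend /opt/homebrew/bin).
SERVICE_PATH="/usr/local/bin:/usr/bin:/bin"

# resolves_on_service_path <command>  (hypothetical helper)
# Prints where <command> resolves under SERVICE_PATH and returns 0,
# or prints nothing and returns non-zero when it is not found there.
# The body runs in a subshell, so your own PATH is left untouched.
resolves_on_service_path() (
  PATH="$SERVICE_PATH"
  command -v "$1"
)
```

For example, `resolves_on_service_path node || echo "node is not on the service PATH"` flags a Node install that only exists under a version manager. Shell builtins also resolve, so this check is only meaningful for external tools.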

Skill missing API key in sandbox

Symptom: Skill works on host but fails in sandbox with missing API key.

Why: sandboxed exec runs inside Docker and does not inherit host process.env.

Fix:

set agents.defaults.sandbox.docker.env (or per-agent agents.list[].sandbox.docker.env)
or bake the key into your custom sandbox image
then run bot sandbox recreate --agent <id> (or --all)

Service Running but Port Not Listening

If the service reports running but nothing is listening on the gateway port, the Gateway likely refused to bind.

What "running" means here

Runtime: running means your supervisor (launchd/systemd/schtasks) thinks the process is alive.
RPC probe means the CLI could actually connect to the gateway WebSocket and call status.
Treat the Probe target: and Config (service): lines as the record of what the CLI actually tried.

Check:

gateway.mode must be local for bot gateway and the service.
If you set gateway.mode=remote, the CLI defaults to a remote URL. The service can still be running locally, but your CLI may be probing the wrong place. Use bot gateway status to see the service’s resolved port + probe target (or pass --url).
bot gateway status and bot doctor surface the last gateway error from logs when the service looks running but the port is closed.
Non-loopback binds (lan/tailnet/custom, or auto when loopback is unavailable) require auth: gateway.auth.token (or BOT_GATEWAY_TOKEN).
gateway.remote.token is for remote CLI calls only; it does not enable local auth.
gateway.token is ignored; use gateway.auth.token.

If bot gateway status shows a config mismatch

Config (cli): ... and Config (service): ... should normally match.
If they don’t, you’re almost certainly editing one config while the service is running another.
Fix: rerun bot gateway install --force from the same --profile / BOT_STATE_DIR you want the service to use.

If bot gateway status reports service config issues

The supervisor config (launchd/systemd/schtasks) is missing current defaults.
Fix: run bot doctor to update it (or bot gateway install --force for a full rewrite).

If Last gateway error: mentions “refusing to bind … without auth”

You set gateway.bind to a non-loopback mode (lan/tailnet/custom, or auto when loopback is unavailable) but left auth off.
Fix: set gateway.auth.mode + gateway.auth.token (or export BOT_GATEWAY_TOKEN) and restart the service.
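
A token for a non-loopback bind should be long and random. As a rough sketch, the hypothetical helper below (not part of the bot CLI) reads 32 bytes from /dev/urandom and prints them as hex. The commented usage relies only on the bot config set form and the gateway.auth.token key that appear on this page. gateway.auth.mode still needs a value, and this page does not list the accepted ones.

```shell
# new_gateway_token  (hypothetical helper)
# Prints 32 random bytes as 64 lowercase hex characters.
new_gateway_token() {
  od -An -N32 -tx1 /dev/urandom | tr -dc '0-9a-f'
}

# Usage (run on the gateway host, then restart the service):
#   bot config set gateway.auth.token "$(new_gateway_token)"
```

Storing the token in config rather than exporting BOT_GATEWAY_TOKEN keeps it out of your shell history and environment. Either route satisfies the bind check described above.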

If bot gateway status says bind=tailnet but no tailnet interface was found

The gateway tried to bind to a Tailscale IP (100.64.0.0/10) but none were detected on the host.
Fix: bring up Tailscale on that machine (or change gateway.bind to loopback/lan).
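
If you are unsure whether an address on the host counts as a tailnet address, the range check is simple: 100.64.0.0/10 covers 100.64.0.0 through 100.127.255.255, so the first octet is 100 and the second is between 64 and 127. A minimal sketch, with is_tailnet_ip as a hypothetical helper name:

```shell
# is_tailnet_ip <ipv4>  (hypothetical helper)
# Returns 0 when the address falls inside 100.64.0.0/10,
# i.e. first octet 100 and second octet 64..127.
# It does not validate the third and fourth octets.
is_tailnet_ip() {
  case "$1" in
    100.*.*.*) ;;
    *) return 1 ;;
  esac
  _second=${1#100.}          # drop the leading "100."
  _second=${_second%%.*}     # keep only the second octet
  case "$_second" in
    ''|*[!0-9]*) return 1 ;; # reject empty or non-numeric octets
  esac
  [ "$_second" -ge 64 ] && [ "$_second" -le 127 ]
}
```

On a host where Tailscale is up, `tailscale ip -4` should print an address that passes this check. If nothing on the machine does, the gateway has no tailnet interface to bind to.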

If Probe note: says the probe uses loopback

That’s expected for bind=lan: the gateway listens on 0.0.0.0 (all interfaces), and loopback should still connect locally.
For remote clients, use a real LAN IP (not 0.0.0.0) plus the port, and ensure auth is configured.

Address Already in Use (Port 18789)

This means something is already listening on the gateway port.

Check:

bot gateway status

It will show the listener(s) and likely causes (gateway already running, SSH tunnel). If needed, stop the service or pick a different port.

Extra Workspace Folders Detected

If you upgraded from older installs, you might still have ~/bot on disk. Multiple workspace directories can cause confusing auth or state drift because only one workspace is active.

Fix: keep a single active workspace and archive/remove the rest. See Agent workspace.

Main chat running in a sandbox workspace

Symptoms: pwd or file tools show ~/.bot/sandboxes/... even though you expected the host workspace.

Why: agents.defaults.sandbox.mode: "non-main" keys off session.mainKey (default "main"). Group/channel sessions use their own keys, so they are treated as non-main and get sandbox workspaces.

Fix options:

If you want host workspaces for an agent: set agents.list[].sandbox.mode: "off".
If you want host workspace access inside sandbox: set workspaceAccess: "rw" for that agent.

"Agent was aborted"

The agent was interrupted mid-response.

Causes:

User sent stop, abort, esc, wait, or exit
Timeout exceeded
Process crashed

Fix: Just send another message. The session continues.

"Agent failed before reply: Unknown model: anthropic/claude-haiku-3-5"

Bot intentionally rejects older/insecure models (especially those more vulnerable to prompt injection). If you see this error, the model name is no longer supported.

Fix:

Pick a current model for the provider and update your config or model alias.
If you’re unsure which models are available, run bot models list or bot models scan and choose a supported one.
Check gateway logs for the detailed failure reason.

See also: Models CLI and Model providers.

Messages Not Triggering

Check 1: Is the sender allowlisted?

bot status

Look for AllowFrom: ... in the output.

Check 2: For group chats, is mention required?

# The message must match mentionPatterns or explicit mentions; defaults live in channel groups/guilds.
# Multi-agent: `agents.list[].groupChat.mentionPatterns` overrides global patterns.
grep -nE 'agents|groupChat|mentionPatterns|channels\.whatsapp\.groups|channels\.telegram\.groups|channels\.imessage\.groups|channels\.discord\.guilds' \
  "${BOT_CONFIG_PATH:-$HOME/.bot/bot.json}"

Check 3: Check the logs

bot logs --follow
# or if you want quick filters:
tail -f "$(ls -t /tmp/bot/bot-*.log | head -n 1)" | grep -E "blocked|skip|unauthorized"
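
The file-log fallback above works by picking the newest log file by modification time. If you use that pattern often, it can be wrapped in a small function. This is a sketch, and latest_bot_log is a hypothetical name. The default directory matches the /tmp/bot location used on this page, so pass a different directory if you changed logging.file.

```shell
# latest_bot_log [dir]  (hypothetical helper)
# Prints the most recently modified bot-*.log in dir (default: /tmp/bot).
# Prints nothing if no log file exists there.
latest_bot_log() {
  ls -t "${1:-/tmp/bot}"/bot-*.log 2>/dev/null | head -n 1
}
```

Example: `tail -f "$(latest_bot_log)" | grep -E "blocked|skip|unauthorized"`.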

Pairing Code Not Arriving

If dmPolicy is pairing, unknown senders should receive a code and their message is ignored until approved.

Check 1: Is a pending request already waiting?

bot pairing list <channel>

Pending DM pairing requests are capped at 3 per channel by default. If the list is full, new requests won’t generate a code until one is approved or expires.

Check 2: Did the request get created but no reply was sent?

bot logs --follow | grep "pairing request"

Check 3: Confirm dmPolicy isn’t open/allowlist for that channel.

Image + Mention Not Working

Known issue: When you send an image with ONLY a mention (no other text), WhatsApp sometimes doesn't include the mention metadata.

Workaround: Add some text with the mention:

❌ @bot + image
✅ @bot check this + image

Session Not Resuming

Check 1: Is the session file there?

ls -la ~/.bot/agents/<agentId>/sessions/

Check 2: Is the reset window too short?

{
  "session": {
    "reset": {
      "mode": "daily",
      "atHour": 4,
      "idleMinutes": 10080  // 7 days
    }
  }
}

Check 3: Did someone send /new, /reset, or a reset trigger?

Agent Timing Out

Default timeout is 30 minutes. For long tasks:

{
  "reply": {
    "timeoutSeconds": 3600  // 1 hour
  }
}

Or use the process tool to background long commands.

WhatsApp Disconnected

# Check local status (creds, sessions, queued events)
bot status
# Probe the running gateway + channels (WA connect + Telegram + Discord APIs)
bot status --deep

# View recent connection events
bot logs --limit 200 | grep -E "connection|disconnect|logout"

Fix: Usually reconnects automatically once the Gateway is running. If you’re stuck, restart the Gateway process (however you supervise it), or run it manually with verbose output:

bot gateway --verbose

If you’re logged out / unlinked:

bot channels logout
trash "${BOT_STATE_DIR:-$HOME/.bot}/credentials" # if logout can't cleanly remove everything
bot channels login --verbose       # re-scan QR

Media Send Failing

Check 1: Is the file path valid?

ls -la /path/to/your/image.jpg

Check 2: Is it too large?

Images: max 6MB
Audio/Video: max 16MB
Documents: max 100MB
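
These limits can be checked before sending. The sketch below is a hypothetical helper (media_size_ok is not a bot command) that compares a file's size with the limit for its kind. It assumes 1 MB means 1024 × 1024 bytes, which this page does not specify.

```shell
# media_size_ok <kind> <file>  (hypothetical helper)
# kind: image | audio | video | document
# Returns 0 if the file fits the limit, 1 if it is too large,
# and 2 for an unknown kind or a missing file.
media_size_ok() {
  case "$1" in
    image)       _limit_mb=6 ;;
    audio|video) _limit_mb=16 ;;
    document)    _limit_mb=100 ;;
    *) echo "unknown kind: $1" >&2; return 2 ;;
  esac
  [ -f "$2" ] || { echo "not a file: $2" >&2; return 2; }
  # Arithmetic expansion strips the padding some wc implementations add.
  _size_bytes=$(( $(wc -c < "$2") ))
  # Assumption: 1 MB = 1024 * 1024 bytes.
  [ "$_size_bytes" -le $(( _limit_mb * 1024 * 1024 )) ]
}
```

Example: `media_size_ok image /path/to/your/image.jpg || echo "too large for an image send"`.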

Check 3: Check media logs

grep -E "media|fetch|download" "$(ls -t /tmp/bot/bot-*.log | head -n 1)" | tail -n 20

High Memory Usage

Bot keeps conversation history in memory.

Fix: Restart periodically or set session limits:

{
  "session": {
    "historyLimit": 100  // Max messages to keep
  }
}

Common troubleshooting

“Gateway won’t start — configuration invalid”

Bot now refuses to start when the config contains unknown keys, malformed values, or invalid types. This is intentional for safety.

Fix it with Doctor:

bot doctor
bot doctor --fix

Notes:

bot doctor reports every invalid entry.
bot doctor --fix applies migrations/repairs and rewrites the config.
Diagnostic commands like bot logs, bot health, bot status, bot gateway status, and bot gateway probe still run even if the config is invalid.

“All models failed” — what should I check first?

Credentials present for the provider(s) being tried (auth profiles + env vars).
Model routing: confirm agents.defaults.model.primary and fallbacks are models you can access.
Gateway logs in /tmp/bot/… for the exact provider error.
Model status: use /model status (chat) or bot models status (CLI).

I’m running on my personal WhatsApp number — why is self-chat weird?

Enable self-chat mode and allowlist your own number:

{
  channels: {
    whatsapp: {
      selfChatMode: true,
      dmPolicy: "allowlist",
      allowFrom: ["+15555550123"]
    }
  }
}

See WhatsApp setup.

WhatsApp logged me out. How do I re‑auth?

Run the login command again and scan the QR code:

bot channels login

Build errors on `main` — what’s the standard fix path?

git pull origin main && pnpm install
bot doctor
Check GitHub issues or Discord
Temporary workaround: check out an older commit

npm install fails (allow-build-scripts / missing tar or yargs). What now?

If you’re running from source, use the repo’s package manager: pnpm (preferred). The repo declares packageManager: "pnpm@…".

Typical recovery:

git status   # ensure you’re in the repo root
pnpm install
pnpm build
bot doctor
bot gateway restart

Why: pnpm is the configured package manager for this repo.

How do I switch between git installs and npm installs?

Use the website installer and select the install method with a flag. It upgrades in place and rewrites the gateway service to point at the new install.

Switch to git install:

curl -fsSL https://hanzo.bot/install.sh | bash -s -- --install-method git --no-onboard

Switch to npm global:

curl -fsSL https://hanzo.bot/install.sh | bash

Notes:

The git flow only rebases if the repo is clean. Commit or stash changes first.
After switching, run:
```
bot doctor
bot gateway restart
```

Telegram block streaming isn’t splitting text between tool calls. Why?

Block streaming only sends completed text blocks. Common reasons you see a single message:

agents.defaults.blockStreamingDefault is still "off".
channels.telegram.blockStreaming is set to false.
channels.telegram.streamMode is partial or block and draft streaming is active (private chat + topics). Draft streaming disables block streaming in that case.
Your minChars / coalesce settings are too high, so chunks get merged.
The model emits one large text block (no mid‑reply flush points).

Fix checklist:

Put block streaming settings under agents.defaults, not the root.
Set channels.telegram.streamMode: "off" if you want real multi‑message block replies.
Use smaller chunk/coalesce thresholds while debugging.

See Streaming.

Discord doesn’t reply in my server even with `requireMention: false`. Why?

requireMention only controls mention‑gating after the channel passes allowlists. By default channels.discord.groupPolicy is allowlist, so guilds must be explicitly enabled. If you set channels.discord.guilds.<guildId>.channels, only the listed channels are allowed; omit it to allow all channels in the guild.

Fix checklist:

Set channels.discord.groupPolicy: "open" or add a guild allowlist entry (and optionally a channel allowlist).
Use numeric channel IDs in channels.discord.guilds.<guildId>.channels.
Put requireMention: false under channels.discord.guilds (global or per‑channel). Top‑level channels.discord.requireMention is not a supported key.
Ensure the bot has Message Content Intent and channel permissions.
Run bot channels status --probe for audit hints.

Docs: Discord, Channels troubleshooting.

Cloud Code Assist API error: invalid tool schema (400). What now?

This is almost always a tool schema compatibility issue. The Cloud Code Assist endpoint accepts a strict subset of JSON Schema. Bot scrubs/normalizes tool schemas in current main, but the fix is not in the last release yet (as of January 13, 2026).

Fix checklist:

Update Bot:
- If you can run from source, pull main and restart the gateway.
- Otherwise, wait for the next release that includes the schema scrubber.
Avoid unsupported keywords like anyOf/oneOf/allOf, patternProperties, additionalProperties, minLength, maxLength, format, etc.
If you define custom tools, keep the top‑level schema as type: "object" with properties and simple enums.
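
For custom tools, a quick text scan can flag the keywords listed above before you hit the 400. This is only a sketch: find_unsupported_keywords is a hypothetical helper, it covers just the keywords named in this checklist (the "etc." means there may be more), and because it matches text rather than parsing JSON, a property that is itself named format would also be flagged.

```shell
# find_unsupported_keywords <schema.json>  (hypothetical helper)
# Prints each flagged keyword found as a JSON key, one per line,
# sorted and de-duplicated. Returns 0 if any were found, non-zero if none.
find_unsupported_keywords() {
  grep -oE '"(anyOf|oneOf|allOf|patternProperties|additionalProperties|minLength|maxLength|format)"[[:space:]]*:' "$1" \
    | sed -E 's/^"([A-Za-z]+)".*/\1/' \
    | sort -u \
    | grep .
}
```

Example: `find_unsupported_keywords my-tool-schema.json && echo "simplify the schema before retrying"`, where my-tool-schema.json stands in for your own schema file.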

See Tools and TypeBox schemas.

macOS Specific Issues

App Crashes when Granting Permissions (Speech/Mic)

If the app disappears or shows "Abort trap 6" when you click "Allow" on a privacy prompt:

Fix 1: Reset TCC Cache

tccutil reset All com.bot.mac.debug

Fix 2: Force New Bundle ID

If resetting doesn't work, change the BUNDLE_ID in scripts/package-mac-app.sh (e.g., add a .test suffix) and rebuild. This forces macOS to treat it as a new app.

Gateway stuck on "Starting..."

The app connects to a local gateway on port 18789. If it stays stuck:

Fix 1: Stop the supervisor (preferred)

If the gateway is supervised by launchd, killing the PID will just respawn it. Stop the supervisor first:

bot gateway status
bot gateway stop
# Or: launchctl bootout gui/$UID/com.bot.gateway (replace with com.bot.<profile> if needed)

Fix 2: Port is busy (find the listener)

lsof -nP -iTCP:18789 -sTCP:LISTEN

If it’s an unsupervised process, try a graceful stop first, then escalate:

kill -TERM <PID>
sleep 1
kill -9 <PID> # last resort

Fix 3: Check the CLI install

Ensure the global bot CLI is installed and matches the app version:

bot --version
npm install -g bot@<version>

Debug Mode

Get verbose logging:

# Turn on trace logging in config:
#   ${BOT_CONFIG_PATH:-$HOME/.bot/bot.json} -> { logging: { level: "trace" } }
#
# Then run verbose commands to mirror debug output to stdout:
bot gateway --verbose
bot channels login --verbose

Log Locations

| Log | Location |
| --- | --- |
| Gateway file logs (structured) | `/tmp/bot/bot-YYYY-MM-DD.log` (or `logging.file`) |
| Gateway service logs (supervisor) | macOS: `$BOT_STATE_DIR/logs/gateway.log` + `gateway.err.log` (default: `~/.bot/logs/...`; profiles use `~/.bot-<profile>/logs/...`); Linux: `journalctl --user -u bot-gateway[-<profile>].service -n 200 --no-pager`; Windows: `schtasks /Query /TN "Bot Gateway (<profile>)" /V /FO LIST` |
| Session files | `$BOT_STATE_DIR/agents/<agentId>/sessions/` |
| Media cache | `$BOT_STATE_DIR/media/` |
| Credentials | `$BOT_STATE_DIR/credentials/` |

Health Check

# Supervisor + probe target + config paths
bot gateway status
# Include system-level scans (legacy/extra services, port listeners)
bot gateway status --deep

# Is the gateway reachable?
bot health --json
# If it fails, rerun with connection details:
bot health --verbose

# Is something listening on the default port?
lsof -nP -iTCP:18789 -sTCP:LISTEN

# Recent activity (RPC log tail)
bot logs --follow
# Fallback if RPC is down
tail -n 20 "$(ls -t /tmp/bot/bot-*.log | head -n 1)"

Reset Everything

Nuclear option:

bot gateway stop
# If you installed a service and want a clean install:
# bot gateway uninstall

trash "${BOT_STATE_DIR:-$HOME/.bot}"
bot channels login         # re-pair WhatsApp
bot gateway restart           # or: bot gateway

⚠️ This loses all sessions and requires re-pairing WhatsApp.

Getting Help

Check logs first: /tmp/bot/ (default: bot-YYYY-MM-DD.log, or your configured logging.file)
Search existing issues on GitHub
Open a new issue with:
- Bot version
- Relevant log snippets
- Steps to reproduce
- Your config (redact secrets!)

"Have you tried turning it off and on again?" — Every IT person ever

🤖🔧

Browser Not Starting (Linux)

If you see "Failed to start Chrome CDP on port 18800":

Most likely cause: Snap-packaged Chromium on Ubuntu.

Quick fix: Install Google Chrome instead:

wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
sudo dpkg -i google-chrome-stable_current_amd64.deb

Then set in config:

{
  "browser": {
    "executablePath": "/usr/bin/google-chrome-stable"
  }
}

Full guide: See browser-linux-troubleshooting
