Troubleshooting 🔧
Quick troubleshooting guide for common Bot failures
When Bot misbehaves, here's how to fix it.
Start with the FAQ's First 60 seconds if you just want a quick triage recipe. This page goes deeper on runtime failures and diagnostics.
Provider-specific shortcuts: /channels/troubleshooting
Status & Diagnostics
Quick triage commands (in order):
| Command | What it tells you | When to use it |
|---|---|---|
| bot status | Local summary: OS + update, gateway reachability/mode, service, agents/sessions, provider config state | First check, quick overview |
| bot status --all | Full local diagnosis (read-only, pasteable, safe-ish) incl. log tail | When you need to share a debug report |
| bot status --deep | Runs gateway health checks (incl. provider probes; requires reachable gateway) | When "configured" doesn't mean "working" |
| bot gateway probe | Gateway discovery + reachability (local + remote targets) | When you suspect you're probing the wrong gateway |
| bot channels status --probe | Asks the running gateway for channel status (and optionally probes) | When gateway is reachable but channels misbehave |
| bot gateway status | Supervisor state (launchd/systemd/schtasks), runtime PID/exit, last gateway error | When the service "looks loaded" but nothing runs |
| bot logs --follow | Live logs (best signal for runtime issues) | When you need the actual failure reason |
Sharing output: prefer bot status --all (it redacts tokens). If you paste bot status, consider setting BOT_SHOW_SECRETS=0 first (token previews).
See also: Health checks and Logging.
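If you run these checks often, the table above can be wrapped in a small script. The following is a minimal sketch, not part of Bot itself: `collect_triage` and the `BOT_CMD` override are hypothetical names introduced here, while the subcommands are the ones listed in the table. `bot logs --follow` is left out because it does not exit on its own. Review the file before sharing it, since only `bot status --all` is described above as redacting tokens.

```shell
# Hypothetical helper: run the read-only triage commands in table order
# and append their output to a single report file.
collect_triage() {
  out="${1:-bot-triage.txt}"
  bot_cmd="${BOT_CMD:-bot}"   # assumed override, handy for dry runs
  : > "$out"                  # start with an empty report
  for args in "status --all" "status --deep" "gateway probe" \
              "channels status --probe" "gateway status"; do
    printf '\n=== %s %s ===\n' "$bot_cmd" "$args" >> "$out"
    # $args is split on spaces on purpose. A failing command is recorded
    # in the report instead of aborting the remaining checks.
    $bot_cmd $args >> "$out" 2>&1 || printf '(exit status %s)\n' "$?" >> "$out"
  done
  printf '%s\n' "$out"
}
```

Usage: `collect_triage /tmp/bot-triage.txt` writes the report and prints its path.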
Common Issues
No API key found for provider "anthropic"
This means the agent's auth store is empty or missing Anthropic credentials. Auth is per agent, so a new agent won't inherit the main agent's keys.
Fix options:
- Re-run onboarding and choose Anthropic for that agent.
- Or paste a setup-token on the gateway host:
bot models auth setup-token --provider anthropic
- Or copy auth-profiles.json from the main agent dir to the new agent dir.
Verify:
bot models status
OAuth token refresh failed (Anthropic Claude subscription)
This means the stored Anthropic OAuth token expired and the refresh failed. If you're on a Claude subscription (no API key), the most reliable fix is to switch to a Claude Code setup-token or re-sync Claude Code CLI OAuth on the gateway host.
Recommended (setup-token):
# Run on the gateway host (runs Claude Code CLI)
bot models auth setup-token --provider anthropic
bot models status
If you generated the token elsewhere:
bot models auth paste-token --provider anthropic
bot models status
If you want to keep OAuth reuse:
log in with Claude Code CLI on the gateway host, then run bot models status
to sync the refreshed token into Bot's auth store.
More detail: Anthropic and OAuth.
Control UI fails on HTTP ("device identity required" / "connect failed")
If you open the dashboard over plain HTTP (e.g. http://<lan-ip>:18789/ or
http://<tailscale-ip>:18789/), the browser runs in a non-secure context and
blocks WebCrypto, so device identity can't be generated.
Fix:
- Prefer HTTPS via Tailscale Serve.
- Or open locally on the gateway host: http://127.0.0.1:18789/.
- If you must stay on HTTP, enable gateway.controlUi.allowInsecureAuth: true and use a gateway token (token-only; no device identity/pairing). See Control UI.
CI Secrets Scan Failed
This means detect-secrets found new candidates not yet in the baseline.
Follow Secret scanning.
Service Installed but Nothing is Running
If the gateway service is installed but the process exits immediately, the service can appear "loaded" while nothing is running.
Check:
bot gateway status
bot doctor
Doctor/service will show runtime state (PID/last exit) and log hints.
Logs:
- Preferred: bot logs --follow
- File logs (always): /tmp/bot/bot-YYYY-MM-DD.log (or your configured logging.file)
- macOS LaunchAgent (if installed): $BOT_STATE_DIR/logs/gateway.log and gateway.err.log
- Linux systemd (if installed): journalctl --user -u bot-gateway[-<profile>].service -n 200 --no-pager
- Windows: schtasks /Query /TN "Bot Gateway (<profile>)" /V /FO LIST
Enable more logging:
- Bump file log detail (persisted JSONL): { "logging": { "level": "debug" } }
- Bump console verbosity (TTY output only): { "logging": { "consoleLevel": "debug", "consoleStyle": "pretty" } }
- Quick tip: --verbose affects console output only. File logs remain controlled by logging.level.
See /logging for a full overview of formats, config, and access.
"Gateway start blocked: set gateway.mode=local"
This means the config exists but gateway.mode is unset (or not local), so the
Gateway refuses to start.
Fix (recommended):
- Run the wizard and set the Gateway run mode to Local: bot configure
- Or set it directly: bot config set gateway.mode local
If you meant to run a remote Gateway instead:
- Set a remote URL and keep gateway.mode=remote:
bot config set gateway.mode remote
bot config set gateway.remote.url "wss://gateway.example.com"
Ad-hoc/dev only: pass --allow-unconfigured to start the gateway without
gateway.mode=local.
No config file yet? Run bot setup to create a starter config, then rerun
the gateway.
Service Environment (PATH + runtime)
The gateway service runs with a minimal PATH to avoid shell/manager cruft:
- macOS: /opt/homebrew/bin, /usr/local/bin, /usr/bin, /bin
- Linux: /usr/local/bin, /usr/bin, /bin
This intentionally excludes version managers (nvm/fnm/volta/asdf) and package
managers (pnpm/npm) because the service does not load your shell init. Runtime
variables like DISPLAY should live in ~/.bot/.env (loaded early by the
gateway).
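To see whether a tool would be found by the service, you can look for it in those directories only. This is a rough sketch under the assumption that the lists above are the complete service PATH; `service_path_has` is a hypothetical helper name, not a Bot command.

```shell
# Hypothetical helper: succeed (and print the path) when a tool exists
# as an executable in one of the minimal service PATH directories.
service_path_has() {
  tool="$1"
  case "$(uname -s)" in
    Darwin) dirs="/opt/homebrew/bin /usr/local/bin /usr/bin /bin" ;;
    *)      dirs="/usr/local/bin /usr/bin /bin" ;;
  esac
  for dir in $dirs; do
    # -x follows symlinks; the -d test skips directories with the same name.
    if [ -x "$dir/$tool" ] && [ ! -d "$dir/$tool" ]; then
      printf '%s\n' "$dir/$tool"
      return 0
    fi
  done
  return 1
}
```

Usage: `service_path_has node || echo "node is not on the service PATH"`. A tool that only resolves through nvm, fnm, volta, or asdf shims will fail this check even though it works in your interactive shell.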
Exec runs on host=gateway merge your login-shell PATH into the exec environment,
so missing tools usually mean your shell init isn't exporting them (or set
tools.exec.pathPrepend). See /tools/exec.
WhatsApp + Telegram channels require Node; Bun is unsupported. If your
service was installed with Bun or a version-managed Node path, run bot doctor
to migrate to a system Node install.
Skill missing API key in sandbox
Symptom: Skill works on host but fails in sandbox with missing API key.
Why: sandboxed exec runs inside Docker and does not inherit host process.env.
Fix:
- set agents.defaults.sandbox.docker.env (or per-agent agents.list[].sandbox.docker.env)
- or bake the key into your custom sandbox image
- then run bot sandbox recreate --agent <id> (or --all)
Service Running but Port Not Listening
If the service reports running but nothing is listening on the gateway port, the Gateway likely refused to bind.
What "running" means here
- Runtime: running means your supervisor (launchd/systemd/schtasks) thinks the process is alive.
- RPC probe means the CLI could actually connect to the gateway WebSocket and call status.
- Always trust Probe target: + Config (service): as the "what did we actually try?" lines.
Check:
- gateway.mode must be local for bot gateway and the service.
- If you set gateway.mode=remote, the CLI defaults to a remote URL. The service can still be running locally, but your CLI may be probing the wrong place. Use bot gateway status to see the service's resolved port + probe target (or pass --url).
- bot gateway status and bot doctor surface the last gateway error from logs when the service looks running but the port is closed.
- Non-loopback binds (lan/tailnet/custom, or auto when loopback is unavailable) require auth: gateway.auth.token (or BOT_GATEWAY_TOKEN).
- gateway.remote.token is for remote CLI calls only; it does not enable local auth.
- gateway.token is ignored; use gateway.auth.token.
If bot gateway status shows a config mismatch
- Config (cli): ... and Config (service): ... should normally match.
- If they don't, you're almost certainly editing one config while the service is running another.
- Fix: rerun bot gateway install --force from the same --profile / BOT_STATE_DIR you want the service to use.
If bot gateway status reports service config issues
- The supervisor config (launchd/systemd/schtasks) is missing current defaults.
- Fix: run bot doctor to update it (or bot gateway install --force for a full rewrite).
If Last gateway error: mentions "refusing to bind … without auth"
- You set gateway.bind to a non-loopback mode (lan/tailnet/custom, or auto when loopback is unavailable) but left auth off.
- Fix: set gateway.auth.mode + gateway.auth.token (or export BOT_GATEWAY_TOKEN) and restart the service.
If bot gateway status says bind=tailnet but no tailnet interface was found
- The gateway tried to bind to a Tailscale IP (100.64.0.0/10) but none were detected on the host.
- Fix: bring up Tailscale on that machine (or change gateway.bind to loopback/lan).
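The 100.64.0.0/10 range mentioned above covers 100.64.0.0 through 100.127.255.255, so the first octet is 100 and the second is between 64 and 127. A minimal sketch for checking an address by hand follows; `is_tailnet_ip` is a hypothetical helper, it only looks at the first two octets, and it is not how the gateway itself detects interfaces.

```shell
# Hypothetical helper: succeed when an IPv4 address lies in 100.64.0.0/10.
is_tailnet_ip() {
  ip="$1"
  # Accept only digits and dots, with at least three dots.
  case "$ip" in
    ''|*[!0-9.]*) return 1 ;;
    *.*.*.*) ;;
    *) return 1 ;;
  esac
  first="${ip%%.*}"      # text before the first dot
  rest="${ip#*.}"        # everything after the first dot
  second="${rest%%.*}"   # second octet
  [ -n "$first" ] && [ -n "$second" ] || return 1
  [ "$first" -eq 100 ] && [ "$second" -ge 64 ] && [ "$second" -le 127 ]
}
```

Usage: `is_tailnet_ip 100.101.102.103 && echo "looks like a tailnet address"`.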
If Probe note: says the probe uses loopback
- That's expected for bind=lan: the gateway listens on 0.0.0.0 (all interfaces), and loopback should still connect locally.
- For remote clients, use a real LAN IP (not 0.0.0.0) plus the port, and ensure auth is configured.
Address Already in Use (Port 18789)
This means something is already listening on the gateway port.
Check:
bot gateway status
It will show the listener(s) and likely causes (gateway already running, SSH tunnel). If needed, stop the service or pick a different port.
Extra Workspace Folders Detected
If you upgraded from older installs, you might still have ~/bot on disk.
Multiple workspace directories can cause confusing auth or state drift because
only one workspace is active.
Fix: keep a single active workspace and archive/remove the rest. See Agent workspace.
Main chat running in a sandbox workspace
Symptoms: pwd or file tools show ~/.bot/sandboxes/... even though you
expected the host workspace.
Why: agents.defaults.sandbox.mode: "non-main" keys off session.mainKey (default "main").
Group/channel sessions use their own keys, so they are treated as non-main and
get sandbox workspaces.
Fix options:
- If you want host workspaces for an agent: set agents.list[].sandbox.mode: "off".
- If you want host workspace access inside sandbox: set workspaceAccess: "rw" for that agent.
"Agent was aborted"
The agent was interrupted mid-response.
Causes:
- User sent stop, abort, esc, wait, or exit
- Timeout exceeded
- Process crashed
Fix: Just send another message. The session continues.
"Agent failed before reply: Unknown model: anthropic/claude-haiku-3-5"
Bot intentionally rejects older/insecure models (especially those more vulnerable to prompt injection). If you see this error, the model name is no longer supported.
Fix:
- Pick a current model for the provider and update your config or model alias.
- If you're unsure which models are available, run bot models list or bot models scan and choose a supported one.
- Check gateway logs for the detailed failure reason.
See also: Models CLI and Model providers.
Messages Not Triggering
Check 1: Is the sender allowlisted?
bot status
Look for AllowFrom: ... in the output.
Check 2: For group chats, is mention required?
# The message must match mentionPatterns or explicit mentions; defaults live in channel groups/guilds.
# Multi-agent: `agents.list[].groupChat.mentionPatterns` overrides global patterns.
grep -n "agents\\|groupChat\\|mentionPatterns\\|channels\\.whatsapp\\.groups\\|channels\\.telegram\\.groups\\|channels\\.imessage\\.groups\\|channels\\.discord\\.guilds" \
"${BOT_CONFIG_PATH:-$HOME/.bot/bot.json}"
Check 3: Check the logs
bot logs --follow
# or if you want quick filters:
tail -f "$(ls -t /tmp/bot/bot-*.log | head -1)" | grep "blocked\\|skip\\|unauthorized"
Pairing Code Not Arriving
If dmPolicy is pairing, unknown senders should receive a code and their message is ignored until approved.
Check 1: Is a pending request already waiting?
bot pairing list <channel>
Pending DM pairing requests are capped at 3 per channel by default. If the list is full, new requests won't generate a code until one is approved or expires.
Check 2: Did the request get created but no reply was sent?
bot logs --follow | grep "pairing request"
Check 3: Confirm dmPolicy isn't open/allowlist for that channel.
Image + Mention Not Working
Known issue: When you send an image with ONLY a mention (no other text), WhatsApp sometimes doesn't include the mention metadata.
Workaround: Add some text with the mention:
- ❌ @bot + image
- ✅ @bot check this + image
Session Not Resuming
Check 1: Is the session file there?
ls -la ~/.bot/agents/<agentId>/sessions/
Check 2: Is the reset window too short?
{
"session": {
"reset": {
"mode": "daily",
"atHour": 4,
"idleMinutes": 10080 // 7 days
}
}
}
Check 3: Did someone send /new, /reset, or a reset trigger?
Agent Timing Out
Default timeout is 30 minutes. For long tasks:
{
"reply": {
"timeoutSeconds": 3600 // 1 hour
}
}
Or use the process tool to background long commands.
WhatsApp Disconnected
# Check local status (creds, sessions, queued events)
bot status
# Probe the running gateway + channels (WA connect + Telegram + Discord APIs)
bot status --deep
# View recent connection events
bot logs --limit 200 | grep "connection\\|disconnect\\|logout"
Fix: Usually reconnects automatically once the Gateway is running. If you're stuck, restart the Gateway process (however you supervise it), or run it manually with verbose output:
bot gateway --verbose
If you're logged out / unlinked:
bot channels logout
trash "${BOT_STATE_DIR:-$HOME/.bot}/credentials" # if logout can't cleanly remove everything
bot channels login --verbose # re-scan QR
Media Send Failing
Check 1: Is the file path valid?
ls -la /path/to/your/image.jpg
Check 2: Is it too large?
- Images: max 6MB
- Audio/Video: max 16MB
- Documents: max 100MB
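A quick local check against these limits can save a round trip through the logs. The sketch below is an illustration only: `media_size_ok` is a hypothetical helper, the three kind names are made up for this example, and treating 1 MB as 1024 × 1024 bytes is an assumption, since the list above does not say which definition applies.

```shell
# Hypothetical helper: compare a file's size with the limits listed above.
# Returns 0 when within the limit, 1 when too large, 2 on bad input.
media_size_ok() {
  kind="$1"
  file="$2"
  case "$kind" in
    image)       limit_mb=6 ;;
    audio|video) limit_mb=16 ;;
    document)    limit_mb=100 ;;
    *) echo "unknown media kind: $kind" >&2; return 2 ;;
  esac
  [ -f "$file" ] || { echo "not a file: $file" >&2; return 2; }
  # Arithmetic expansion strips the padding some wc implementations add.
  size=$(( $(wc -c < "$file") ))
  limit=$(( limit_mb * 1024 * 1024 ))   # assumption: binary megabytes
  if [ "$size" -gt "$limit" ]; then
    echo "$file: $size bytes exceeds the ${limit_mb}MB $kind limit" >&2
    return 1
  fi
  return 0
}
```

Usage: `media_size_ok image /path/to/your/image.jpg && echo "size is fine"`.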
Check 3: Check media logs
grep "media\\|fetch\\|download" "$(ls -t /tmp/bot/bot-*.log | head -1)" | tail -20
High Memory Usage
Bot keeps conversation history in memory.
Fix: Restart periodically or set session limits:
{
"session": {
"historyLimit": 100 // Max messages to keep
}
}
Common troubleshooting
"Gateway won't start - configuration invalid"
Bot now refuses to start when the config contains unknown keys, malformed values, or invalid types. This is intentional for safety.
Fix it with Doctor:
bot doctor
bot doctor --fix
Notes:
- bot doctor reports every invalid entry.
- bot doctor --fix applies migrations/repairs and rewrites the config.
- Diagnostic commands like bot logs, bot health, bot status, bot gateway status, and bot gateway probe still run even if the config is invalid.
"All models failed" - what should I check first?
- Credentials present for the provider(s) being tried (auth profiles + env vars).
- Model routing: confirm agents.defaults.model.primary and fallbacks are models you can access.
- Gateway logs in /tmp/bot/… for the exact provider error.
- Model status: use /model status (chat) or bot models status (CLI).
I'm running on my personal WhatsApp number - why is self-chat weird?
Enable self-chat mode and allowlist your own number:
{
channels: {
whatsapp: {
selfChatMode: true,
dmPolicy: "allowlist",
allowFrom: ["+15555550123"]
}
}
}
See WhatsApp setup.
WhatsApp logged me out. How do I re-auth?
Run the login command again and scan the QR code:
bot channels login
Build errors on main - what's the standard fix path?
- git pull origin main && pnpm install
- bot doctor
- Check GitHub issues or Discord
- Temporary workaround: check out an older commit
npm install fails (allow-build-scripts / missing tar or yargs). What now?
If you're running from source, use the repo's package manager: pnpm (preferred).
The repo declares packageManager: "pnpm@…".
Typical recovery:
git status # ensure you're in the repo root
pnpm install
pnpm build
bot doctor
bot gateway restart
Why: pnpm is the configured package manager for this repo.
How do I switch between git installs and npm installs?
Use the website installer and select the install method with a flag. It upgrades in place and rewrites the gateway service to point at the new install.
Switch to git install:
curl -fsSL https://hanzo.bot/install.sh | bash -s -- --install-method git --no-onboard
Switch to npm global:
curl -fsSL https://hanzo.bot/install.sh | bash
Notes:
- The git flow only rebases if the repo is clean. Commit or stash changes first.
- After switching, run:
bot doctor
bot gateway restart
Telegram block streaming isn't splitting text between tool calls. Why?
Block streaming only sends completed text blocks. Common reasons you see a single message:
- agents.defaults.blockStreamingDefault is still "off".
- channels.telegram.blockStreaming is set to false.
- channels.telegram.streamMode is partial or block and draft streaming is active (private chat + topics). Draft streaming disables block streaming in that case.
- Your minChars / coalesce settings are too high, so chunks get merged.
- The model emits one large text block (no mid-reply flush points).
Fix checklist:
- Put block streaming settings under agents.defaults, not the root.
- Set channels.telegram.streamMode: "off" if you want real multi-message block replies.
- Use smaller chunk/coalesce thresholds while debugging.
See Streaming.
Discord doesn't reply in my server even with requireMention: false. Why?
requireMention only controls mention-gating after the channel passes allowlists.
By default channels.discord.groupPolicy is allowlist, so guilds must be explicitly enabled.
If you set channels.discord.guilds.<guildId>.channels, only the listed channels are allowed; omit it to allow all channels in the guild.
Fix checklist:
- Set channels.discord.groupPolicy: "open" or add a guild allowlist entry (and optionally a channel allowlist).
- Use numeric channel IDs in channels.discord.guilds.<guildId>.channels.
- Put requireMention: false under channels.discord.guilds (global or per-channel). Top-level channels.discord.requireMention is not a supported key.
- Ensure the bot has Message Content Intent and channel permissions.
- Run bot channels status --probe for audit hints.
Docs: Discord, Channels troubleshooting.
Cloud Code Assist API error: invalid tool schema (400). What now?
This is almost always a tool schema compatibility issue. The Cloud Code Assist
endpoint accepts a strict subset of JSON Schema. Bot scrubs/normalizes tool
schemas in current main, but the fix is not in the last release yet (as of
January 13, 2026).
Fix checklist:
- Update Bot:
  - If you can run from source, pull main and restart the gateway.
  - Otherwise, wait for the next release that includes the schema scrubber.
- Avoid unsupported keywords like anyOf/oneOf/allOf, patternProperties, additionalProperties, minLength, maxLength, format, etc.
- If you define custom tools, keep the top-level schema as type: "object" with properties and simple enums.
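If your custom tool schemas are available as JSON files, a plain text search can flag the keywords named above before you send anything to the endpoint. This is a rough sketch: `find_unsupported_schema_keywords` is a hypothetical helper, the keyword list is only the one from the checklist (which ends in "etc.", so it is not exhaustive), and a property that happens to be named `format` will show up as a false positive.

```shell
# Hypothetical helper: print lines of a JSON schema file that use one of
# the keywords listed above as an object key. Exit status is grep's:
# 0 when something was found, 1 when the file looks clean.
find_unsupported_schema_keywords() {
  schema_file="$1"
  grep -nE '"(anyOf|oneOf|allOf|patternProperties|additionalProperties|minLength|maxLength|format)"[[:space:]]*:' "$schema_file"
}
```

Usage: `find_unsupported_schema_keywords my-tool.schema.json`, where the file name is a placeholder for your own schema.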
See Tools and TypeBox schemas.
macOS Specific Issues
App Crashes when Granting Permissions (Speech/Mic)
If the app disappears or shows "Abort trap 6" when you click "Allow" on a privacy prompt:
Fix 1: Reset TCC Cache
tccutil reset All com.bot.mac.debug
Fix 2: Force New Bundle ID
If resetting doesn't work, change the BUNDLE_ID in scripts/package-mac-app.sh (e.g., add a .test suffix) and rebuild. This forces macOS to treat it as a new app.
Gateway stuck on "Starting..."
The app connects to a local gateway on port 18789. If it stays stuck:
Fix 1: Stop the supervisor (preferred)
If the gateway is supervised by launchd, killing the PID will just respawn it. Stop the supervisor first:
bot gateway status
bot gateway stop
# Or: launchctl bootout gui/$UID/com.bot.gateway (replace with com.bot.<profile> if needed)
Fix 2: Port is busy (find the listener)
lsof -nP -iTCP:18789 -sTCP:LISTEN
If it's an unsupervised process, try a graceful stop first, then escalate:
kill -TERM <PID>
sleep 1
kill -9 <PID> # last resort
Fix 3: Check the CLI install
Ensure the global bot CLI is installed and matches the app version:
bot --version
npm install -g bot@<version>
Debug Mode
Get verbose logging:
# Turn on trace logging in config:
# ${BOT_CONFIG_PATH:-$HOME/.bot/bot.json} -> { logging: { level: "trace" } }
#
# Then run verbose commands to mirror debug output to stdout:
bot gateway --verbose
bot channels login --verbose
Log Locations
| Log | Location |
|---|---|
| Gateway file logs (structured) | /tmp/bot/bot-YYYY-MM-DD.log (or logging.file) |
| Gateway service logs (supervisor) | macOS: $BOT_STATE_DIR/logs/gateway.log + gateway.err.log (default: ~/.bot/logs/...; profiles use ~/.bot-<profile>/logs/...); Linux: journalctl --user -u bot-gateway[-<profile>].service -n 200 --no-pager; Windows: schtasks /Query /TN "Bot Gateway (<profile>)" /V /FO LIST |
| Session files | $BOT_STATE_DIR/agents/<agentId>/sessions/ |
| Media cache | $BOT_STATE_DIR/media/ |
| Credentials | $BOT_STATE_DIR/credentials/ |
Health Check
# Supervisor + probe target + config paths
bot gateway status
# Include system-level scans (legacy/extra services, port listeners)
bot gateway status --deep
# Is the gateway reachable?
bot health --json
# If it fails, rerun with connection details:
bot health --verbose
# Is something listening on the default port?
lsof -nP -iTCP:18789 -sTCP:LISTEN
# Recent activity (RPC log tail)
bot logs --follow
# Fallback if RPC is down
tail -20 /tmp/bot/bot-*.log
Reset Everything
Nuclear option:
bot gateway stop
# If you installed a service and want a clean install:
# bot gateway uninstall
trash "${BOT_STATE_DIR:-$HOME/.bot}"
bot channels login # re-pair WhatsApp
bot gateway restart # or: bot gateway
⚠️ This loses all sessions and requires re-pairing WhatsApp.
Getting Help
- Check logs first: /tmp/bot/ (default: bot-YYYY-MM-DD.log, or your configured logging.file)
- Open a new issue with:
- Bot version
- Relevant log snippets
- Steps to reproduce
- Your config (redact secrets!)
"Have you tried turning it off and on again?" - Every IT person ever
🤖🔧
Browser Not Starting (Linux)
If you see "Failed to start Chrome CDP on port 18800":
Most likely cause: Snap-packaged Chromium on Ubuntu.
Quick fix: Install Google Chrome instead:
wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
sudo dpkg -i google-chrome-stable_current_amd64.deb
Then set in config:
{
"browser": {
"executablePath": "/usr/bin/google-chrome-stable"
}
}
Full guide: See browser-linux-troubleshooting