OpenClaw gateway crash loop: CIAO PROBING fix guide
Problem statement: the OpenClaw gateway starts, prints a ready message, and then disappears. The dashboard stops loading. The chat client says it is not connected. Logs mention CIAO PROBING CANCELLED, CIAO ANNOUNCEMENT CANCELLED, Bonjour, mDNS, ECONNREFUSED on 127.0.0.1:18789, or a process manager that keeps bringing the gateway back. This guide shows how to diagnose the real failure, recover safely, and prove the gateway is stable before trusting it with browser, cron, or channel workflows again.
- Issue #71799 describes a gateway that exited roughly 25 seconds after startup while Bonjour was announcing or probing the service.
- Issue #71918 shows an Ubuntu system where the gateway appeared to bind to 127.0.0.1:18789, then crashed; clients saw ECONNREFUSED even though Avahi and multicast checks looked healthy.
- Issue #72561 reports a Linux VM where the gateway crashed 12–15 seconds after startup with Unhandled promise rejection: CIAO PROBING CANCELLED, after which the service restarted repeatedly.
- Issue #72686 adds a proxy/VPN-shaped case: Bonjour was stuck in probing, then cancelled, while attempted plugin-disabling edits introduced invalid legacy config keys.
- Issue #72689 shows the misleading version of the problem: the gateway reported ready after 8.6 seconds, then Bonjour failed and the process crashed.
- Our own baseline check on a healthy OpenClaw Setup host returned Connectivity probe: ok and Listening: *:18789. The same check also warned that user-level systemd was unavailable, which is a useful reminder: process-manager warnings matter, but they are not the same thing as a failed gateway probe.
What the CIAO message actually tells you
CIAO is the library behind the gateway's Bonjour-style local service discovery. OpenClaw advertises the gateway locally so tools, paired devices, and browser-related workflows can find it without every user hand-writing URLs. When you see CIAO PROBING CANCELLED, the advertisement attempt was cancelled during probing, the step where the advertiser checks that its local name is not already in use before it announces the service.
That does not automatically mean your entire network is broken. It also does not mean the fix is always to reinstall OpenClaw. Treat the log line as a crash-loop marker. The user-visible failure is one of these: the dashboard does not load, the gateway port does not stay open, the process exits after a ready line, or a supervisor keeps restarting the same failing process.
First rule: stop creating restart noise
A restart can help when a service is stuck. A restart loop makes diagnosis worse. If your service manager restarts OpenClaw every 15, 30, or 60 seconds, pause long enough to capture the first clean startup log. You are looking for the first hard failure after the gateway says it is ready, not the tenth repeated copy of the same rejection.
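One way to find that first hard failure is to scan a saved log instead of reading it by eye. The sketch below is only an illustration: the failure strings are the ones quoted in this guide, while the ready marker (any line containing the word "ready") and the helper name first_failure_after_ready are assumptions that you should adjust to your own log format.

```python
from typing import Iterable, Optional, Tuple

# Failure strings quoted in this guide; extend the tuple for your own logs.
FAILURE_MARKERS = (
    "CIAO PROBING CANCELLED",
    "CIAO ANNOUNCEMENT CANCELLED",
    "Unhandled promise rejection",
    "ECONNREFUSED",
    "EADDRINUSE",
)

# ASSUMPTION: the ready line contains the word "ready" (matched case-insensitively).
READY_MARKER = "ready"


def first_failure_after_ready(lines: Iterable[str]) -> Optional[Tuple[int, str]]:
    """Return (line_number, text) of the first failure logged after the first ready line."""
    seen_ready = False
    for number, line in enumerate(lines, start=1):
        if not seen_ready:
            # Ignore everything until the gateway claims to be ready.
            if READY_MARKER in line.lower():
                seen_ready = True
            continue
        if any(marker in line for marker in FAILURE_MARKERS):
            return number, line.rstrip("\n")
    return None
```

Pass it an open log file or a list of lines. A result of None means no known failure string appeared after the ready line in that capture, which is not the same as proof that the gateway stayed up.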
Safe triage sequence
- Record the timing. Does the process die immediately, 12–30 seconds after ready, or only when a client connects?
- Capture logs before running repair commands. Save the gateway log, the stability bundle if one is written, and the exact command or service unit that launched the process.
- Check whether the port is actually listening. A ready line is not enough. Confirm that 127.0.0.1:18789 or your configured bind address remains open after the suspected crash window.
- Probe the dashboard and gateway separately. A browser cache or failed client can mislead you. Test the dashboard URL, then run the gateway status probe if the CLI is available.
- Look for duplicate supervisors. WSL2, foreground shells, user services, and system services can fight over the same port and turn one failure into an EADDRINUSE or split-brain loop.
- Check network discovery interference. Proxy tools, VPN clients, firewall rules, multicast restrictions, and mDNS name conflicts can make Bonjour look broken while normal HTTP still works.
- Separate gateway connectivity from process-manager health. A warning about systemd, launch agents, or lingering sessions may be important, but the immediate question is whether the gateway remains reachable.
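The port check in the sequence above is easy to get wrong with a single poll. A minimal sketch of a repeated check follows, assuming a plain TCP connect is a good enough signal that the gateway is listening. The function names are hypothetical, and the host and port are whatever your gateway is configured to use.

```python
import socket
import time
from typing import List, Tuple


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


def watch_port(host: str, port: int, duration: float,
               interval: float = 1.0) -> List[Tuple[float, bool]]:
    """Sample the port until duration seconds pass; return (elapsed, is_open) pairs."""
    samples = []
    start = time.monotonic()
    while True:
        elapsed = time.monotonic() - start
        samples.append((round(elapsed, 1), port_is_open(host, port)))
        if elapsed >= duration:
            break
        time.sleep(interval)
    return samples


def stayed_open(samples: List[Tuple[float, bool]]) -> bool:
    """True only if there is at least one sample and every sample saw the port open."""
    return bool(samples) and all(is_open for _, is_open in samples)
```

For the crash windows described in this guide, something like watch_port("127.0.0.1", 18789, duration=60) started right after launch covers the 12–30 second range with margin. The first sample where is_open flips to False gives you the timing to record.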
Common causes
- mDNS name conflict: another advertised service uses the same local name, so the advertiser retries, renames, or cancels probing.
- Proxy or VPN interception: local discovery traffic behaves differently when Clash, Mihomo, a corporate VPN, or a firewall policy rewrites network behavior.
- Supervisor split-brain: two OpenClaw launch paths are active, so one process sees another process or a stale port state and exits.
- Invalid recovery edits: trying to disable plugins through old config keys can create a second startup error that masks the original Bonjour failure.
- Current working directory pollution: running the gateway from a directory with an unrelated package.json can interact badly with dependency repair flows. Start it from a clean directory or the expected service environment.
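The working-directory cause in the list above is the cheapest one to rule out. A small pre-flight sketch, assuming you launch the gateway manually from a shell: package.json is the file named in this guide, while the other two names and the helper node_files_in are illustrative additions.

```python
from pathlib import Path
from typing import List, Optional

# package.json comes from this guide; the other entries are illustrative
# signs that the directory belongs to an unrelated Node project.
NODE_PROJECT_FILES = ("package.json", "package-lock.json", "node_modules")


def node_files_in(directory: Optional[Path] = None) -> List[str]:
    """Return the Node project files found in directory (default: current directory)."""
    directory = directory or Path.cwd()
    return [name for name in NODE_PROJECT_FILES if (directory / name).exists()]
```

An empty list means the directory looks clean. Anything else is a hint to change directory before starting the gateway by hand.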
Recovery playbook
1. Start from a quiet launch path
Stop duplicate foreground sessions and extra service managers. If you normally use a service, use the service. If you are testing manually, stop the service first. Do not run a foreground gateway and a supervised gateway against the same state directory and port while debugging.
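To confirm the launch path really is quiet, it helps to list every process that looks like a gateway. The sketch below assumes a Linux or macOS host with a POSIX ps, and assumes gateway processes carry the string "openclaw" somewhere in their command line. Both the hint and the function names are hypothetical.

```python
import subprocess
from typing import List, Tuple

# ASSUMPTION: gateway processes have "openclaw" in their command line.
PROCESS_HINT = "openclaw"


def parse_ps(output: str, hint: str = PROCESS_HINT) -> List[Tuple[int, str]]:
    """Extract (pid, command) pairs whose command line contains hint."""
    matches = []
    for line in output.splitlines()[1:]:  # skip the header row
        parts = line.strip().split(None, 1)
        if len(parts) == 2 and parts[0].isdigit() and hint in parts[1].lower():
            matches.append((int(parts[0]), parts[1]))
    return matches


def running_gateways() -> List[Tuple[int, str]]:
    """List candidate gateway processes using POSIX ps (-A: all, -o: columns)."""
    result = subprocess.run(
        ["ps", "-A", "-o", "pid,args"],
        capture_output=True, text=True, check=True,
    )
    return parse_ps(result.stdout)
```

More than one entry from running_gateways() while you are debugging is a strong hint that two launch paths are active. The match is a plain substring test, so editors or shells opened inside an openclaw directory can show up as false positives.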
2. Use a known bind while you diagnose
If the gateway is flapping on local discovery, temporarily prefer a simple loopback or known LAN bind while you collect evidence. The goal is not to permanently weaken access. The goal is to reduce variables: one bind address, one port, one launch path, one log stream.
3. Remove obvious network interference
Temporarily disable proxy/VPN tools only if that is safe for your environment, then restart once and watch the same 30-second window. If the gateway survives without the proxy, you have a network discovery conflict to solve. If it still crashes, move on; do not keep toggling random network settings.
4. Avoid unsupported config surgery
The fastest way to turn a recoverable mDNS problem into a messy outage is to paste old plugin keys into the current config format. If config validation reports unrecognized plugin keys, revert that edit and return to logs. Use documented configuration paths or the dashboard controls rather than guessing at nested JSON.
5. Run doctor after evidence is saved
openclaw doctor is useful, but run it after you save the failing logs. Repair commands can normalize config, update dependency state, or change warnings, which is helpful for recovery but bad if you still need the original failure for comparison.
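Saving the evidence first can be as simple as copying the relevant files into a timestamped folder. This is a sketch under stated assumptions: the guide does not say where the gateway log or stability bundle live, so the caller supplies the paths, and save_evidence is a hypothetical helper name.

```python
import shutil
import time
from pathlib import Path
from typing import Iterable, List


def save_evidence(sources: Iterable[Path], dest_root: Path) -> List[Path]:
    """Copy existing files into a new timestamped directory and return the copies."""
    bundle = dest_root / time.strftime("evidence-%Y%m%d-%H%M%S")
    # exist_ok=False: refuse to reuse a directory and risk overwriting evidence.
    bundle.mkdir(parents=True, exist_ok=False)
    copied = []
    for source in sources:
        if source.is_file():  # silently skip paths that do not exist
            target = bundle / source.name
            shutil.copy2(source, target)  # copy2 also preserves modification times
            copied.append(target)
    return copied
```

Files are copied rather than moved, so the originals stay in place for the gateway and for doctor. Two sources with the same file name would overwrite each other inside the bundle, so rename one of them first if that applies to your setup.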
Edge cases that mislead operators
Avahi is healthy, but OpenClaw still dies. This can happen. Host-level multicast health only proves that one layer works. The gateway can still fail in its own advertisement lifecycle, name conflict handling, or process recovery path.
The port exists for a moment, then disappears. Polling once can give false confidence. Watch the port through the crash window reported in your logs. If the gateway dies after it says ready, the first successful check is not proof of recovery.
WSL2 reports an address conflict with no obvious listener. Treat that as a split-brain problem until proven otherwise. Check system-level units, user-level units, foreground shells, and stale process groups before changing the gateway configuration.
The status command warns about systemd. That warning may explain why persistence is fragile, but it is not the same as a failed connectivity probe. Fix the gateway first, then harden supervision.
How to verify the fix
- The configured port stays listening for several minutes after startup.
- The dashboard loads from the expected URL without refreshing into an offline state.
- The gateway status probe reports healthy connectivity.
- The same CIAO PROBING CANCELLED or Bonjour watchdog line does not repeat.
- No new stability bundle appears for the same unhandled rejection.
- An agent can complete one real test message through your normal channel.
- If you use browser workflows, a simple tab attach or browser snapshot works after the gateway has stayed up.
Typical mistakes
- Restarting repeatedly before saving logs.
- Assuming a ready line means the gateway survived the crash window.
- Changing plugin config with unsupported keys.
- Debugging from a random shell directory with unrelated Node files.
- Ignoring proxy, VPN, and mDNS name-conflict behavior.
- Fixing the process manager before proving the gateway itself is healthy.
When managed hosting is the cleaner fix
If this is a one-off local networking issue, the playbook above is usually enough. If the same class of failure keeps interrupting scheduled jobs, browser access, or customer-facing agents, the operational answer may be to move OpenClaw to an environment built for stable supervision and remote access. Review OpenClaw cloud hosting, compare your options in the hosting comparison, or use OpenClaw Setup when you want the gateway, dashboard, channels, and recovery path managed together.
FAQ
Is CIAO PROBING CANCELLED always fatal?
Not by itself. It becomes urgent when it is followed by an unhandled rejection, process exit, unreachable dashboard, or repeated service restarts.
Should I disable Bonjour entirely?
Do not start by disabling pieces blindly. First confirm whether Bonjour is the crashing path, whether a duplicate gateway exists, and whether a proxy or mDNS name conflict is involved. Unsupported config edits often make recovery harder.
What if the dashboard works but my chat client says disconnected?
Test both paths. The dashboard can load while a channel plugin, WebSocket client, or stale session is broken. A real verification pass includes dashboard, gateway probe, and one completed agent turn.
Where should I go next if the gateway keeps failing?
Preserve the log bundle, reduce the launch path to one supervisor, and compare your current setup against a managed option. Recurring gateway instability is usually an operations problem, not just a command-line problem.