
OpenClaw webhook 405 fix: diagnose and recover from Control UI interception

Problem statement: inbound webhook channels stop working right after an upgrade. POST /webhooks/... requests return 405 Method Not Allowed, and some GET checks return Control UI HTML instead of webhook responses. This is a high-impact integration outage.

Recent reports
  • GitHub issue #31448 (2026-03-02) details a route precedence bug: the method guard runs before the basePath exclusion.
  • A duplicate field report in #31462 shows a BlueBubbles webhook regression after an upgrade to v2026.3.1.
  • A confirmed upstream fix exists on main in commit 93a37213b, but it was not yet included in that release.

Who should care

If you run OpenClaw with webhook-based channels (BlueBubbles, custom plugins, and potentially webhook-mode connectors), this issue can silently kill inbound events. The platform appears "up" while message intake is down.

How to identify this specific regression in under 5 minutes

  1. Check your version. If you recently upgraded to v2026.3.1 and webhook failures started immediately, suspect this first.
  2. Run a direct POST test against the webhook endpoint. If the response is an immediate 405, continue the diagnosis.
  3. Run a GET against the same path. If you receive Control UI HTML, the route is being captured by the UI catch-all.
  4. Check the plugin registration logs. If they show "webhook listening on ..." but no payload handling entries, the request never reaches the plugin handler.
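Steps 2 and 3 can be scripted so the verdict does not depend on reading curl output by eye. The following is a minimal sketch, not an official OpenClaw tool: the function names classify_response and probe are made up for illustration, and the HTML check assumes the Control UI page begins with a doctype declaration.

```shell
# classify_response STATUS BODY_FILE
# Maps an HTTP status code and a saved response body to a short verdict.
classify_response() {
  status=$1
  body_file=$2
  if [ "$status" = "405" ]; then
    echo "intercepted: 405 Method Not Allowed"
  elif grep -qi '<!doctype html' "$body_file" 2>/dev/null; then
    # Assumption: the Control UI page starts with an HTML doctype.
    echo "intercepted: Control UI HTML returned"
  elif [ "$status" = "000" ]; then
    # curl reports 000 when it received no HTTP response at all.
    echo "unreachable: no HTTP response"
  else
    echo "passed through: status $status"
  fi
}

# probe METHOD URL
# Sends one request with curl and classifies the result.
probe() {
  probe_body=$(mktemp)
  probe_status=$(curl -s -o "$probe_body" -w '%{http_code}' -X "$1" "$2")
  classify_response "$probe_status" "$probe_body"
  rm -f "$probe_body"
}

# Usage (same loopback address as the curl test in this post):
#   probe POST 'http://127.0.0.1:18789/webhooks/bluebubbles?password=<pw>'
#   probe GET  'http://127.0.0.1:18789/webhooks/bluebubbles?password=<pw>'
```

An "intercepted" verdict on either request points at the gateway routing problem described in this post; "unreachable" points elsewhere (gateway down, wrong port).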

Minimal reproducible test

curl -sv 'http://127.0.0.1:18789/webhooks/bluebubbles?password=<pw>' \
  -X POST \
  -H 'Content-Type: application/json' \
  -d '{"type":"new-message","data":{"text":"test"}}'

# Symptom of this regression:
# HTTP/1.1 405 Method Not Allowed

This test is useful because it removes third-party uncertainty. If a local curl already fails with 405, the problem is in the gateway's HTTP request routing, not in the BlueBubbles server, DNS, or remote transport reliability.

Technical root cause

In the impacted release, Control UI HTTP handling performs method rejection for non-GET/HEAD too early. Because the control handler is evaluated before plugin webhook handlers in the chain, it can claim requests globally and return 405, even when the path belongs to plugin webhooks.

The correct behavior is to first determine whether the request is inside the Control UI namespace (basePath), and only then apply method constraints. The upstream fix moved the method guard after the path checks, restoring pass-through for non-UI paths.
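The ordering difference is easier to see in a toy model. The sketch below is not OpenClaw source code: it only mimics the two orderings with shell functions, the basePath value /ui is a hypothetical placeholder, and the result strings "405", "pass" and "ui" are invented labels. It models the method guard only, not the catch-all behavior behind the GET symptom.

```shell
UI_BASE_PATH="/ui"   # hypothetical basePath, for illustration only

# True when the request path is the UI base path or sits below it.
in_ui_namespace() {
  case "$1" in
    "$UI_BASE_PATH"|"$UI_BASE_PATH"/*) return 0 ;;
    *) return 1 ;;
  esac
}

# True for the methods the UI handler is meant to serve.
is_read_method() {
  case "$1" in
    GET|HEAD) return 0 ;;
    *) return 1 ;;
  esac
}

# Affected ordering: the method guard runs before the path check,
# so a POST to any path is rejected before plugins can see it.
control_ui_affected() {  # METHOD PATH
  is_read_method "$1" || { echo "405"; return; }
  in_ui_namespace "$2" || { echo "pass"; return; }
  echo "ui"
}

# Fixed ordering: the path check runs first, so non-UI paths pass
# through to later handlers regardless of method.
control_ui_fixed() {  # METHOD PATH
  in_ui_namespace "$2" || { echo "pass"; return; }
  is_read_method "$1" || { echo "405"; return; }
  echo "ui"
}

# Usage:
#   control_ui_affected POST /webhooks/bluebubbles
#   control_ui_fixed    POST /webhooks/bluebubbles
```

In this model, the same POST to a webhook path is rejected by the first function and handed on by the second, while a POST inside the UI namespace is still rejected by both.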

Fast mitigation options

Option A: disable Control UI temporarily

If your team can operate via channel clients/CLI short term, disabling Control UI avoids path capture and allows webhook routes to receive POST traffic.

{
  "gateway": {
    "controlUi": {
      "enabled": false
    }
  }
}

Option B: rollback to last known-good version

Issue #31462 confirms that rolling back from v2026.3.1 to v2026.2.26 restored webhook behavior. Use this option if the UI is required and you prefer a stable release over local patching.

Option C: patch locally if you run custom build operations

For teams comfortable maintaining temporary patch drift, apply the method-check ordering fix in your local deployment and track upstream release to remove patch debt later.

Production incident runbook

  1. Freeze non-critical upgrades and communicate integration incident status.
  2. Run the reproducible local POST test and capture the response headers and body.
  3. Apply one mitigation path (disable UI or rollback).
  4. Restart gateway and replay queued webhook events where possible.
  5. Validate message ingress with three independent test payloads.
  6. Document blast radius: which channels were affected, how long, how many messages lost/delayed.
  7. Create release gate requiring synthetic webhook tests before future upgrades.
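For step 2 of the runbook, it helps to save headers and body to timestamped files so the evidence survives the mitigation. This is a rough sketch under stated assumptions: the function names evidence_prefix and capture_evidence and the output directory are hypothetical, and the smoke payload is a placeholder rather than a payload any plugin is known to accept.

```shell
# evidence_prefix DIR WEBHOOK_PATH TIMESTAMP
# Builds a file prefix such as DIR/_webhooks_bluebubbles-TIMESTAMP.
evidence_prefix() {
  printf '%s/%s-%s' "$1" "$(printf '%s' "$2" | tr '/' '_')" "$3"
}

# capture_evidence BASE_URL WEBHOOK_PATH DIR
# POSTs a placeholder payload and stores response headers and body.
capture_evidence() {
  ev_prefix=$(evidence_prefix "$3" "$2" "$(date -u +%Y%m%dT%H%M%SZ)")
  mkdir -p "$3"
  # -D writes the response headers, -o writes the response body.
  curl -sS -X POST \
    -H 'Content-Type: application/json' \
    -d '{"type":"smoke"}' \
    -D "$ev_prefix.headers" -o "$ev_prefix.body" \
    "$1$2" || echo "curl failed for $2" >&2
  echo "$ev_prefix"
}

# Usage (directory name is an example):
#   capture_evidence http://127.0.0.1:18789 /webhooks/bluebubbles ./incident-evidence
```

Run it once before the mitigation and once after, so the incident record contains both the failing and the recovered response.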

Common misdiagnoses that waste hours

  • Blaming API provider keys (irrelevant when request never reaches channel plugin).
  • Debugging remote tunnel before local loopback reproduction.
  • Assuming firewall because of 405 status code.
  • Restarting dependent services repeatedly while gateway route precedence remains unchanged.

Edge cases and nuance

1) You see mixed behavior by endpoint

Some plugin paths may appear healthy if they avoid captured route patterns, while others fail. Test each critical webhook path explicitly; do not assume one passing path means full recovery.

2) Reverse proxy can mask response origin

If a proxy rewrites responses, you may misread where 405 originates. Compare upstream gateway response directly on loopback to separate proxy behavior from gateway behavior.
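One way to make that comparison mechanical is to fetch the status code from both sides and let a small function name the likely origin. The sketch below is illustrative only: locate_405 and http_status are made-up names, the public hostname in the usage comment is a placeholder, and the smoke payload is an assumption.

```shell
# locate_405 LOOPBACK_STATUS PUBLIC_STATUS
# Names the layer that most likely produced a 405.
locate_405() {
  if [ "$1" = "405" ]; then
    echo "gateway: loopback already returns 405"
  elif [ "$2" = "405" ]; then
    echo "proxy: loopback is $1 but public URL returns 405"
  else
    echo "no 405: loopback $1, public $2"
  fi
}

# http_status URL
# Prints only the HTTP status code of a POST with a placeholder payload.
http_status() {
  curl -s -o /dev/null -w '%{http_code}' -X POST \
    -H 'Content-Type: application/json' \
    -d '{"type":"smoke"}' "$1"
}

# Usage (replace the public URL with your own):
#   locate_405 \
#     "$(http_status http://127.0.0.1:18789/webhooks/bluebubbles)" \
#     "$(http_status https://gateway.example.com/webhooks/bluebubbles)"
```

If the loopback request already returns 405, the proxy can be ruled out and the mitigation options in this post apply.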

3) "Issue closed" does not mean your deployment is fixed

Both cited issues were closed, either because the fix exists on main or as a duplicate. If your binary is still on the affected release, you remain affected until you upgrade, roll back, or patch.

Verification checklist after mitigation

  • POST to every active webhook endpoint returns expected non-405 behavior.
  • Inbound messages from real channel source appear in OpenClaw session logs.
  • No Control UI HTML returned for webhook URLs.
  • Synthetic webhook canary test added to deploy pipeline.
  • Upgrade playbook now includes route precedence regression checks.

How to prevent this class of outage long term

Three controls matter most. First, maintain pre-production smoke tests that send real POST payloads to each webhook path. Second, pin versions and promote by stages (dev -> staging -> prod) with explicit pass criteria. Third, define a standard rollback protocol so operators can recover in minutes, not hours.

If your team lacks bandwidth to continuously own this operational layer, managed hosting is often the better tradeoff: you focus on agent outcomes while platform maintenance, regression checks, and runtime hardening are handled upstream.

Fix once. Stop recurring webhook routing regressions.

If this keeps coming back, you can move your existing setup to managed OpenClaw cloud hosting instead of rebuilding the same stack. Import your current instance, keep your context, and move onto a runtime with lower ops overhead.

  • Import flow in ~1 minute
  • Keep your current instance context
  • Run with managed security and reliability defaults

If you would rather compare options first, review OpenClaw cloud hosting or see the best OpenClaw hosting options before deciding.


Channel-specific validation scenarios

Do not stop at one generic curl check. Different channels format payloads differently, and some failures appear only with real provider headers or query params. Run at least one synthetic plus one real-event test per active channel.

  • BlueBubbles: POST test payload plus real message from iMessage side.
  • Custom webhook plugins: test signed/unsigned payload behavior and auth path.
  • Proxy-fronted deployments: verify that both the direct loopback URL and the public URL route correctly.
  • Failover path: confirm retries do not amplify duplicate message processing.
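For the failover check, a quick way to spot amplified retries is to look for message IDs that were processed more than once. The sketch below assumes you can extract one message ID per line from your own logs; the function name duplicate_ids and the file name in the usage comment are hypothetical, and nothing here relies on a specific OpenClaw log format.

```shell
# duplicate_ids
# Reads one message ID per line on stdin and prints each ID that
# appears more than once (uniq -d needs sorted input).
duplicate_ids() {
  sort | uniq -d
}

# Usage (ids.txt is a file you produce from your own logs):
#   duplicate_ids < ids.txt
```

Empty output means no ID was seen twice in the sample; any printed ID deserves a closer look at retry handling.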

Upgrade safety gate you can automate

Add a release job that spins up a temporary gateway, registers one webhook plugin, sends POST fixtures, and fails the build if any endpoint returns 405 or an HTML fallback. This converts an outage class into a CI failure class.

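# Route precedence smoke check; replace the path list with your active webhook paths.
body=$(mktemp)
trap 'rm -f "$body"' EXIT
for path in /webhooks/bluebubbles /webhooks/custom-a /webhooks/custom-b; do
  # -w prints only the status code; the response body goes to the temp file.
  code=$(curl -s -o "$body" -w "%{http_code}" -X POST \
    -H 'Content-Type: application/json' \
    -d '{"type":"smoke"}' "http://127.0.0.1:18789$path")
  # A 405 or an HTML body means the Control UI handler captured the request.
  if [ "$code" = "405" ] || grep -qi "<!doctype html" "$body"; then
    echo "FAILED route precedence check on $path"
    exit 1
  fi
done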

FAQ

Will this affect only BlueBubbles?

No. BlueBubbles was the cleanest reproduced case, but any plugin endpoint relying on POST pass-through could be affected by the same precedence bug.

Can I keep Control UI enabled and still avoid this?

Yes, once you run a version that includes the upstream fix or apply an equivalent patch. Validate with endpoint-level tests before declaring the incident resolved.

What if I cannot downgrade quickly?

Disable the Control UI temporarily to preserve webhook functionality, and schedule a controlled migration to a fixed release.
