
OpenClaw Telegram dmScope=main bug: how to stop cross-chat broadcast leaks

Problem statement: your OpenClaw reply is generated in the TUI or gateway, but it is unexpectedly delivered to Telegram. This is one of the highest-risk failure modes because it blends reliability risk with privacy risk: wrong channel, wrong audience, wrong context. The fix is not a single command. You need isolation-first triage, routing validation, and a hard verification checklist before re-enabling traffic.

Recent reports
  • GitHub issue #35789 updated 2026-03-05: TUI/Gateway replies broadcast to Telegram when dmScope is main.
  • Related delivery-noise reports this week show growing operator concern about channel boundaries.
  • Teams searching this query are usually already running production bots and need safe operations now.

Why this bug is operationally expensive

Most incidents in agent operations are annoying but contained. This one is different. A routing leak affects trust, not just uptime. If your assistant posts an internal debug reply into Telegram, you now have a messaging incident, not a mere bug. The hidden cost is context switching: engineering, support, and possibly compliance all get pulled in. Teams with multiple channels (Telegram + Discord + local TUI) are especially exposed because traffic patterns look normal until a message lands in the wrong place.

That is why the right response is layered. You are not just "fixing Telegram." You are restoring message-domain separation. In practical terms: identify trigger path, lock down outbound surfaces, verify routing metadata, and only then reopen automated sends. This playbook is built for exactly that pattern.

Root-cause model: where the leak usually starts

In a healthy setup, dmScope determines which conversations can emit to channel plugins. With dmScope=main, responses from local contexts should remain local. The leak pattern suggests a boundary mismatch between session context and channel output hinting. Typical triggers include: stale session metadata from previous interactions, a fallback path treating main scope as routable, or sub-agent completion handlers that ignore message tool hints.

  • Context confusion: the output pipeline reuses a previous channel envelope even when the current reply should stay in the gateway/TUI.
  • Plugin fallback: default outbound plugin selected when no explicit route is intended.
  • Concurrency edge case: fast consecutive turns and completion callbacks crossing session boundaries.
  • Policy drift: config changed, process not restarted cleanly, leaving mixed runtime state.
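The first two triggers above share one remedy: outbound selection should never fall back to a cached envelope for main-scope replies. A minimal sketch of that gate, using hypothetical names (`Reply`, `select_destination`, the route strings) that are illustrative only and not OpenClaw's actual internals:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Reply:
    session_id: str
    dm_scope: str                   # e.g. "main"
    explicit_route: Optional[str]   # channel explicitly requested, or None
    cached_route: Optional[str]     # stale envelope left over from a prior turn

def select_destination(reply: Reply) -> Optional[str]:
    """Return the outbound channel, or None to keep the reply local.

    Rule: main-scope replies go out only on an explicit route; a cached
    envelope from a previous turn is never reused as an implicit fallback.
    """
    if reply.explicit_route:
        return reply.explicit_route
    if reply.dm_scope == "main":
        return None  # stay in TUI/gateway; no implicit fallback
    return reply.cached_route  # non-main scopes may reuse prior routing
```

With this shape, the "context confusion" leak becomes structurally impossible: a main-scope reply with only a stale `cached_route` resolves to local delivery.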

Immediate containment (first 15 minutes)

  1. Freeze risky outbound actions: temporarily disable non-essential Telegram automations and high-volume scheduled tasks.
  2. Create an incident test room: use a dedicated private Telegram chat for validation traffic only.
  3. Snapshot current config and state: export your OpenClaw config and capture current openclaw status output for diffing.
  4. Record reproduction path: document exact command, originating interface (TUI vs gateway), and destination chat that received the leak.
  5. Set safety comms: if the bot serves a team group, announce temporary degraded mode to avoid trust erosion.

Deep diagnosis workflow (step-by-step)

Step 1 — Verify scope assumptions

Confirm your active dmScope is actually loaded at runtime. Teams often edit config but forget that long-lived processes keep older state. Restarting alone is not enough; you need post-restart verification. Compare effective runtime output with file state. If values disagree, treat the environment as untrusted until reconciled.
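The file-vs-runtime comparison can be automated. A sketch, assuming you can obtain both the on-disk config and the effective runtime values as flat dictionaries (the key names below are illustrative):

```python
def config_drift(file_cfg: dict, runtime_cfg: dict) -> dict:
    """Return {key: (file_value, runtime_value)} for every key that disagrees."""
    keys = set(file_cfg) | set(runtime_cfg)
    return {
        k: (file_cfg.get(k), runtime_cfg.get(k))
        for k in keys
        if file_cfg.get(k) != runtime_cfg.get(k)
    }

# Example: a long-lived process still holding pre-edit state
file_cfg = {"dmScope": "main", "telegram.enabled": True}
runtime_cfg = {"dmScope": "session", "telegram.enabled": True}
drift = config_drift(file_cfg, runtime_cfg)
```

A non-empty `drift` result means the environment is untrusted: reconcile and restart before drawing any conclusions from route tests.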

Step 2 — Reproduce with controlled prompts

Send deterministic prompts from the TUI and from the gateway (for example, unique marker tokens) and observe the destinations. Use one marker per path so you can identify which origin leaks. Avoid real business content during this phase. You are mapping route behavior, not testing assistant quality.
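The marker technique can be sketched as follows (hypothetical helper names; you supply the observation data from your test chats and logs):

```python
import uuid

def make_markers(origins):
    """Generate one unique, greppable marker token per origin path."""
    return {o: f"route-probe-{o}-{uuid.uuid4().hex[:8]}" for o in origins}

def leaking_origins(markers, observed):
    """markers: {origin: token}. observed: {destination: set of tokens seen}.
    Return the origins whose marker landed anywhere other than 'local'."""
    leaks = set()
    for dest, tokens in observed.items():
        if dest == "local":
            continue
        for origin, token in markers.items():
            if token in tokens:
                leaks.add(origin)
    return leaks
```

For example, if the gateway marker stays local but the TUI marker shows up in the test chat, `leaking_origins` returns only `tui`, and you know which path to dig into.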

Step 3 — Inspect routing metadata and session lineage

Check session identifiers, parent/sub-agent links, and any message tool hints attached to completion events. The key question: does outbound selection inherit a stale channel target? If yes, purge or rotate affected sessions and rerun tests from a fresh session namespace.
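The "inherited stale target" question can be expressed as a predicate. This is a sketch over an assumed session shape (`channel_target`, `target_set_locally`, `parent` are hypothetical field names, not OpenClaw's schema):

```python
def inherits_stale_target(session: dict, sessions: dict) -> bool:
    """True if this session carries a channel target it never set itself,
    i.e. the target appears to be inherited from its parent session."""
    target = session.get("channel_target")
    if target is None or session.get("target_set_locally"):
        return False
    parent = sessions.get(session.get("parent"), {})
    return parent.get("channel_target") == target

sessions = {
    "parent-1": {"channel_target": "telegram:group-42", "target_set_locally": True},
}
child = {"parent": "parent-1", "channel_target": "telegram:group-42",
         "target_set_locally": False}
```

Any session flagged by a check like this is a purge/rotate candidate before you rerun the marker tests.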

Step 4 — Isolate plugin behavior

Temporarily disable all but one channel plugin to eliminate multi-plugin ambiguity. If the leak disappears in single-plugin mode and returns in mixed mode, you have a cross-plugin routing-arbitration issue, not a Telegram-only defect.
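If your plugin config is data you can transform, single-plugin mode is a one-liner. A sketch assuming a simple `{name: {"enabled": bool}}` layout (illustrative, not OpenClaw's actual config format):

```python
def isolate_plugin(plugins: dict, keep: str) -> dict:
    """Return a copy of the plugin config with only `keep` enabled.
    The original config is left untouched so you can restore it after the test."""
    return {name: {**cfg, "enabled": name == keep} for name, cfg in plugins.items()}

plugins = {
    "telegram": {"enabled": True},
    "discord": {"enabled": True},
}
isolated = isolate_plugin(plugins, "telegram")
```

Producing the isolated config as a derived copy, rather than editing in place, keeps the containment step reversible.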

Step 5 — Validate regression boundary

Compare with a known stable environment (staging or prior deployment snapshot). If behavior differs, capture deployment delta: config commit, package update window, and runtime flags. This delta becomes your quickest path to mitigation and clean rollback criteria.

Practical fixes that work in real teams

  • Session reset strategy: rotate active session IDs after containment, then test from clean sessions only.
  • Outbound gating: enforce explicit route requirement for channel sends; no implicit fallback for main-scope replies.
  • Completion-event hardening: require completion handlers to honor messageToolHints before any outbound delivery.
  • Least-privilege channels: keep production groups on strict allowlists and isolate testing bots by token.
  • Canary checks: run a preflight command sequence after restart and block full traffic until route checks pass.
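The canary-check idea in the last bullet can be sketched as a small preflight harness. The check names and lambdas below are stubs; in practice each would run a real marker probe:

```python
def run_preflight(checks):
    """Run each named check; return (passed, failing_names).
    Traffic should stay blocked until passed is True."""
    failures = [name for name, check in checks if not check()]
    return (not failures, failures)

checks = [
    ("local_reply_stays_local", lambda: True),            # stub: wire to real probe
    ("telegram_reply_from_telegram_only", lambda: True),  # stub
    ("no_unexpected_destinations", lambda: True),         # stub
]
passed, failures = run_preflight(checks)
traffic_enabled = passed  # only reopen automated sends when every check passes
```

Run this after every restart and config change; a single failing check should keep `traffic_enabled` false.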

Edge cases you should actively test

Most teams verify only one happy path and miss the edge cases that trigger repeat incidents. Do not skip these:

  • Sub-agent completion message after long run timeout.
  • Reply tags present vs absent in channel responses.
  • Back-to-back prompts from different interfaces within one minute.
  • Manual message sends via tooling and automatic responses in same test cycle.
  • Restarts under load when pending outbound queue exists.

Verification checklist before returning to normal traffic

  1. 10 consecutive local replies remain local (no Telegram leakage).
  2. Telegram replies are emitted only from intended Telegram-origin prompts.
  3. No unrelated group receives any test markers.
  4. Incident runbook updated with exact reproduction + fix steps.
  5. Monitoring alert in place for unexpected outbound destination changes.
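Checklist item 1 is mechanical enough to script. A sketch, where `send_local_probe` is a hypothetical callable you supply that sends one local-origin marker and reports the observed destination:

```python
def verify_local_isolation(send_local_probe, n=10):
    """Send n consecutive local probes; pass only if every one stays local."""
    return all(send_local_probe() == "local" for _ in range(n))
```

Note the check is strict: a single leaked probe out of ten fails the whole run, which is the right bias when the failure mode is a privacy leak.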

Next step: harden routing in a managed environment

If your team is repeatedly firefighting channel-routing regressions, move this class of risk out of your critical path. Use managed OpenClaw operations with update guardrails and repeatable routing validation so you can keep your instance updated without surprise channel leaks.

Operational playbook template you can reuse

Teams that recover once but fail later usually skip institutionalization. To prevent repeat incidents, convert this fix into a reusable operating playbook. Start with ownership: assign one incident commander for routing events, one operator for evidence capture, and one reviewer for post-incident hardening decisions. During triage, enforce message discipline: every action logged with timestamp, actor, command/result pair, and observed destination. This keeps timelines clear when multiple people are debugging in parallel.

Build your runbook around four lanes: containment, diagnosis, remediation, and verification. Containment actions should be reversible and low-risk, such as temporarily pausing specific automations and isolating validation traffic. Diagnosis should avoid speculative edits; instead, collect reproducible evidence until you can point to one likely route path. Remediation should be idempotent where possible, so operators can rerun safely if interrupted. Verification should include negative testing: prove that non-target channels do not receive output, not only that target channels work.

Add explicit rollback criteria before applying any change. Example criteria include message loss above threshold, inability to complete route canary checks, or unexplained destination variance after restart. A clean rollback plan prevents “hero debugging” in production and keeps customer communication consistent. Also add a communications template for stakeholder updates: what happened, what was impacted, what is now contained, when next update arrives. This protects trust while engineering focuses on technical work.

Finally, turn this incident into prevention by codifying pre-deploy route tests. Every deployment should pass a short suite that checks local-only replies, channel-specific replies, sub-agent completion behavior, and restart persistence. If any test fails, deployment halts automatically. This one change typically reduces recurring routing incidents more than ad-hoc patching ever will.
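The deploy gate described above can be sketched as a hard stop over the four test lanes (test names are illustrative):

```python
ROUTE_TESTS = [
    "local_only_replies",
    "channel_specific_replies",
    "subagent_completion_routing",
    "restart_persistence",
]

def deploy_gate(results: dict) -> bool:
    """Halt deployment unless every required route test reported success.
    results maps test name -> bool; missing tests count as failures."""
    missing = [t for t in ROUTE_TESTS if not results.get(t)]
    if missing:
        print(f"deployment halted; failing route tests: {missing}")
        return False
    return True
```

Wiring this into CI means a routing regression blocks the deploy automatically instead of surfacing as a production leak.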

Detailed troubleshooting matrix (symptom → likely cause → action)

  • Symptom: local TUI reply appears in one Telegram group only. Likely cause: stale channel envelope from prior session context. Action: rotate sessions, clear cached route metadata, retest markers.
  • Symptom: leaks happen only after restart. Likely cause: startup path loading fallback output policy. Action: diff effective config pre/post restart and enforce explicit outbound policy key.
  • Symptom: only sub-agent completion messages leak. Likely cause: completion handler ignoring message hints. Action: gate completion sends behind hint validation and destination allowlist.
  • Symptom: leak appears random under load. Likely cause: concurrent turns crossing session boundaries. Action: introduce serialized completion queue for affected flows, then optimize safely.
  • Symptom: no leak in test chat, leak in production group. Likely cause: group-specific route rule or plugin config variance. Action: compare channel config objects directly; do not trust visual UI parity.

Use this matrix during on-call instead of free-form guessing. The goal is speed with correctness. Every ambiguous incident burns attention and confidence. Structured troubleshooting improves both.
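To make the matrix usable under pressure, encode it next to your on-call tooling as a plain lookup table (a sketch; extend the entries with your own incident history):

```python
MATRIX = {
    "local reply appears in one telegram group": (
        "stale channel envelope from prior session",
        "rotate sessions, clear cached route metadata, retest markers"),
    "leaks only after restart": (
        "startup path loading fallback output policy",
        "diff effective config pre/post restart, enforce explicit outbound policy"),
    "only sub-agent completion messages leak": (
        "completion handler ignoring message hints",
        "gate completion sends behind hint validation and destination allowlist"),
}

def triage(symptom: str):
    """Return (likely_cause, action) for a known symptom, else None."""
    return MATRIX.get(symptom.lower())
```

A `None` result signals an unrecognized symptom, which is your cue to capture fresh evidence rather than guess.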

FAQ

Should I fully disable Telegram until upstream fix lands?

Not always. For many teams, strict containment plus route canaries is enough. If you handle sensitive group traffic, disabling Telegram outbound temporarily is safer.

Can this happen even if my bot token is correct and healthy?

Yes. Token health only confirms delivery capability. This incident is about route selection, not authentication.

What metric should I watch after mitigation?

Track destination mismatch rate: responses whose destination channel differs from originating channel policy. Keep it at zero in steady state.
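The destination mismatch rate can be computed from a stream of (origin, destination) events; this sketch assumes origin policy means "destination must equal originating channel":

```python
def mismatch_rate(events):
    """events: iterable of (origin_channel, destination_channel) pairs.
    Fraction of responses whose destination differs from their origin."""
    events = list(events)
    if not events:
        return 0.0
    mismatches = sum(1 for origin, dest in events if origin != dest)
    return mismatches / len(events)
```

Alert on any value above zero; in steady state a single mismatch is an incident, not a statistic.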

Further reading

Looking for architecture context before changing production settings? Read our deep guide on OpenClaw Chrome Extension Relay, then use OpenClaw setup best practices for safer rollout sequencing.
