OpenClaw Telegram topics: how to stop raw tool JSON leaks
Problem statement: a user asks a normal question in a Telegram forum topic, OpenClaw runs the tool call, and then Telegram briefly shows a raw JSON wrapper such as {"tool_uses":[...]} before the normal answer arrives. The assistant may still return the correct result, but the damage is already done: users saw internal payload structure, the topic looks broken, and trust drops immediately.
- GitHub issue #71475 (opened 2026-04-25) documents Telegram forum topics receiving a visible raw tool_uses wrapper before the final answer.
- An earlier OpenClaw regression recovery runbook in this repository treated one acceptance test as non-negotiable: no raw tool-call payloads appear in user-visible messages. That is the right standard here too.
- Telegram topic incidents are especially dangerous because they can look like success at first glance. The tool executes, the final answer eventually appears, and yet the conversation is still unsafe because the wrong intermediate output was exposed.
Why this bug matters more than it first appears
Plenty of chat bugs are annoying but survivable. This one is different because it breaks the boundary between internal execution and user-visible output. Users should only ever see the final answer or a deliberate, human-readable progress message. They should never see routing payloads, wrapper JSON, or tool-call structure that belongs inside the assistant runtime.
In a Telegram forum topic, that boundary matters even more. Topics are used for narrow workflows: support triage, project threads, ops notifications, shared team discussions, or customer-facing channels. A raw JSON leak in that setting does not look like a small formatting mistake. It looks like the system lost control of what it sends. Teams stop trusting automation after incidents like this, even when the answer itself was technically correct.
The practical consequence is simple: you cannot close the incident just because the tool worked. The incident ends only when the topic shows one clean, natural-language response and nothing else.
What users usually see
The visible symptom is usually a JSON-shaped message that arrives as its own Telegram post in the topic. It can look like a tool wrapper, an array of tool calls, or a payload with recipient_name and parameter fields. After that, OpenClaw may still send the normal answer. That split behavior confuses operators because the workflow appears half-broken: execution succeeded, but the presentation layer failed.
- The leaked message appears before the final answer.
- The tool still runs successfully in the background.
- The final answer often uses the tool result correctly.
- The bug is visible in the Telegram topic, not just logs or debug output.
That last point is what changes the priority. This is not a private diagnostic artifact. It is a message your users can see and quote back to you.
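To make the symptom checkable rather than eyeballed, the following is a minimal sketch of a detector for JSON-shaped tool wrappers. It is not OpenClaw code: the function names are hypothetical, and the key set is an assumption drawn only from the fields mentioned above (tool_uses, recipient_name).

```python
import json

# Assumption: keys taken from the symptoms described in this article.
# A real payload may use other field names.
WRAPPER_KEYS = {"tool_uses", "recipient_name"}


def _contains_wrapper_key(node) -> bool:
    """Walk nested dicts and lists looking for any tool-wrapper key."""
    if isinstance(node, dict):
        if node.keys() & WRAPPER_KEYS:
            return True
        return any(_contains_wrapper_key(value) for value in node.values())
    if isinstance(node, list):
        return any(_contains_wrapper_key(item) for item in node)
    return False


def looks_like_tool_wrapper(text: str) -> bool:
    """Return True if text parses as JSON and contains a tool-call style key."""
    stripped = text.strip()
    if not stripped or stripped[0] not in "{[":
        return False
    try:
        payload = json.loads(stripped)
    except json.JSONDecodeError:
        return False
    return _contains_wrapper_key(payload)
```

A check like this is useful for reviewing exported topic history or test transcripts. It only recognizes payloads that are valid JSON, so a truncated or partially rendered wrapper would still need a manual look.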
Why this is not the same as a failed tool execution
When tools fail to execute, the usual symptoms are missing answers, generic failure text, timeouts, or obviously incomplete responses. Here the opposite can happen: the tool path works, and the assistant still lands on the right answer. The failure is the extra message that should never leave the system.
That distinction matters because it changes what you verify. If you only ask, “Did the tool run?” you can miss the real incident. The better question is, “What did the user actually see in the topic?” Output safety must be checked end-to-end at the channel level.
Likely causes behind raw JSON leaks in Telegram topics
1) Output-phase separation breaks during topic delivery
The clean model is simple: internal tool planning stays internal, then the final answer is rendered for the channel. When that separation breaks, the wrapper object can be treated like a normal outbound message instead of a private execution artifact. Telegram topics are sensitive because they add one more routing layer: the system must resolve the correct topic, attach the correct reply context, and still keep intermediate payloads out of the visible stream.
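One way to picture that separation is a delivery gate that only forwards messages explicitly marked as user-visible. This is an illustrative sketch, not a description of OpenClaw internals: the Phase enum, the Outbound type, and the deliverable function are all hypothetical names.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, List


class Phase(Enum):
    TOOL_PLAN = "tool_plan"  # internal tool-call wrapper, never user-visible
    PROGRESS = "progress"    # deliberate, human-readable progress message
    FINAL = "final"          # the final answer


# Only these phases may reach the channel.
USER_VISIBLE = {Phase.PROGRESS, Phase.FINAL}


@dataclass(frozen=True)
class Outbound:
    phase: Phase
    text: str


def deliverable(messages: Iterable[Outbound]) -> List[Outbound]:
    """Keep only messages whose phase is allowed to reach the user."""
    return [m for m in messages if m.phase in USER_VISIBLE]
```

The design point is the allow-list: anything not explicitly marked as progress or final output is dropped by default, so a new internal message type cannot leak just because nobody remembered to exclude it.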
2) Hot reload or restart windows expose the wrong message path
Several OpenClaw regressions over time have shown that reload timing matters. A gateway can stay “up” while one code path is on the new logic and another still behaves like an older output handler. That kind of version-skew bug produces mixed symptoms: one message is leaked, the final answer still follows, and a second attempt may or may not reproduce the problem.
This is why blind restart loops are a bad first reaction. If the issue is timing-sensitive, repeated restarts can make the bug seem random instead of helping you classify it.
3) Topic-specific routing adds a second place to be wrong
Standard Telegram chats already require correct message rendering and delivery. Forum topics add another requirement: route the answer into the right thread without letting low-level artifacts leak across that boundary. If your main chat path is clean but topic routing uses a different wrapper or transport helper, the problem can appear only in topics even when ordinary Telegram chats look healthy.
4) Mixed success hides the real severity
The most misleading variant is the “works anyway” incident. Teams see the right answer eventually appear and assume they can defer the fix. That is a mistake. The presence of the final answer does not downgrade the incident. The leak already proved that user-visible output rules were violated.
Containment-first response: what to do before any restart
1) Capture one clean reproduction
Keep the leaked Telegram message, the exact prompt that triggered it, the topic identifier, and the timestamp. If possible, capture the final answer too. You want evidence of the full sequence: prompt, leaked wrapper, final answer. That sequence is what distinguishes this incident from a simple tool failure.
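A small record type can keep that evidence together. The sketch below is one possible shape, with a hypothetical class name and placeholder values. The only real identifier is message_thread_id, which is the Telegram Bot API field that identifies a forum topic.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class LeakReproduction:
    prompt: str
    leaked_message: str
    chat_id: int
    message_thread_id: int  # Telegram Bot API field identifying the forum topic
    observed_at: datetime
    final_answer: Optional[str] = None

    def has_full_sequence(self) -> bool:
        """True when prompt, leaked wrapper, and final answer are all captured."""
        return bool(self.prompt and self.leaked_message and self.final_answer)


# Placeholder values for illustration only.
record = LeakReproduction(
    prompt="What is the status of order 1001?",
    leaked_message='{"tool_uses":[...]}',
    chat_id=-1001234567890,
    message_thread_id=42,
    observed_at=datetime(2026, 1, 1, 9, 30, tzinfo=timezone.utc),
    final_answer="Order 1001 has shipped.",
)
```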
2) Stop testing in a high-stakes topic
Move validation into a disposable forum topic or a private Telegram test surface. Do not keep probing production threads where customers or teammates can watch internal payloads appear. The goal is to reduce user exposure while you verify the behavior.
3) Confirm the blast radius
Test one simple prompt in a normal Telegram chat and one in the affected forum topic. If ordinary chats are clean while topics leak, you have a routing-specific incident. If both leak, the problem is broader and should be treated as a channel-wide output-safety regression.
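The two-surface comparison reduces to a small decision table. The sketch below uses a hypothetical function name and hypothetical labels. The chat-only branch is not covered by the guidance above and is included only so that every combination returns something.

```python
def classify_blast_radius(chat_leaks: bool, topic_leaks: bool) -> str:
    """Map the two test outcomes to an incident scope label."""
    if chat_leaks and topic_leaks:
        return "channel-wide output-safety regression"
    if topic_leaks:
        return "topic-routing incident"
    if chat_leaks:
        # Assumption: not described in the text; flagged for separate triage.
        return "chat-only leak (investigate separately)"
    return "not reproduced"
```

For example, classify_blast_radius(chat_leaks=False, topic_leaks=True) returns "topic-routing incident", which points the investigation at topic delivery rather than the whole Telegram channel.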
4) Freeze unrelated changes
Pause other config edits, plugin changes, and release work while you verify the channel behavior. When output safety is broken, changing multiple variables at once makes recovery much slower.
Step-by-step fix workflow
Step 1: Prove whether the leak is topic-only
Run the same low-risk prompt in two places: a standard Telegram chat and the affected topic. Keep the prompt simple enough that the tool result is easy to recognize. You are not testing answer quality here. You are testing message hygiene. If the raw wrapper appears only in the topic, prioritize topic-specific routing and rendering checks.
Step 2: Verify that execution and presentation are separate
Check whether the tool result was actually used in the final answer. If it was, then the tool path is probably healthy and the exposure happened when formatting or delivering the user-facing message. That means your repair effort should focus on output handling, channel routing, or reload state—not on the tool itself.
Step 3: Reproduce once after a controlled restart, not five times
After you have one clean reproduction and you understand the blast radius, perform a single controlled restart or reload of the affected gateway path. Then run one deterministic test in the disposable topic. Repeated restart spam creates new variables and can hide whether the restart actually changed anything.
Step 4: Use a deterministic acceptance test
The acceptance test for this class of bug is strict:
- Send a prompt that triggers the same tool path as the failing case.
- Confirm Telegram shows no raw JSON wrapper before the answer.
- Confirm the final answer still includes the expected tool result.
- Repeat once in the same topic and once in a normal Telegram chat.
- Close the incident only if all visible messages are clean in both cases.
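The checklist above can be expressed as a small pass/fail function over captured transcripts. This is a sketch under stated assumptions: each run is a list of the visible messages in order, expected_fragment is a string you expect in the final answer, and every function name here is hypothetical. The JSON check is deliberately strict and treats any message that parses as a JSON object or array as a failure.

```python
import json
from typing import List, Sequence


def is_raw_json(text: str) -> bool:
    """True if the whole message parses as a JSON object or array."""
    stripped = text.strip()
    if not stripped.startswith(("{", "[")):
        return False
    try:
        json.loads(stripped)
    except json.JSONDecodeError:
        return False
    return True


def run_passes(visible_messages: Sequence[str], expected_fragment: str) -> bool:
    """One run passes if nothing JSON-shaped was shown and the last
    message contains the expected tool result."""
    if not visible_messages:
        return False
    if any(is_raw_json(message) for message in visible_messages):
        return False
    return expected_fragment in visible_messages[-1]


def incident_can_close(
    topic_runs: List[Sequence[str]],
    chat_runs: List[Sequence[str]],
    expected_fragment: str,
) -> bool:
    """Require two clean topic runs and at least one clean chat run."""
    if len(topic_runs) < 2 or len(chat_runs) < 1:
        return False
    all_runs = list(topic_runs) + list(chat_runs)
    return all(run_passes(run, expected_fragment) for run in all_runs)
```

A transcript such as ['{"tool_uses":[]}', "Order 1001 has shipped."] fails even though the final answer is correct, which matches the close condition: the visible leak is the incident.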
Step 5: Keep the repaired path under observation
Watch the next several topic replies, especially after reloads, deploys, or gateway restarts. Timing-sensitive regressions often disappear for one turn and then return later. One clean reply is encouraging. It is not enough to prove the incident is over.
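The observation rule can be made explicit with a simple counter. In this sketch the class name and the threshold of five clean replies are assumptions, not values from OpenClaw. It treats the path as stable only after a restart has been observed and enough consecutive clean replies have followed it.

```python
from dataclasses import dataclass


@dataclass
class ObservationWindow:
    required_clean: int = 5  # hypothetical threshold; tune to your traffic
    clean_streak: int = 0
    restarts_seen: int = 0

    def record_reply(self, is_clean: bool) -> None:
        """Extend the streak on a clean reply, reset it on a leak."""
        if is_clean:
            self.clean_streak += 1
        else:
            self.clean_streak = 0

    def record_restart(self) -> None:
        """Replies seen before a restart no longer count as evidence."""
        self.restarts_seen += 1
        self.clean_streak = 0

    def stable(self) -> bool:
        return self.restarts_seen >= 1 and self.clean_streak >= self.required_clean
```

Resetting the streak on every restart is the conservative choice here: it forces fresh evidence after each reload window, which is where timing-sensitive regressions tend to resurface.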
How to verify the fix the right way
Verification is where many teams get sloppy. They confirm that the answer arrived and stop there. For this issue, answer correctness is necessary but not sufficient. A safe verification pass checks both content and presentation.
| Check | Pass condition |
|---|---|
| Topic message stream | Only human-readable output appears |
| Tool result usage | Final answer still reflects the tool data |
| Repeat test | Second prompt stays clean in the same topic |
| Non-topic comparison | Standard Telegram chat also stays clean |
| Post-restart behavior | No regression after controlled restart or reload |
Edge cases that trip people up
Forum topic only
The cleanest-looking false negative is when normal chats work but forum topics leak. If you only test outside the topic, you will miss the bug. Always reproduce in the same delivery surface where users saw it.
Hot reload looks like a fix once
A reload can temporarily align state and make one test pass. Treat one good result as provisional. Repeat after a short delay and again after the next restart window.
Final answer is correct, so the team downgrades severity
This is the most common operational mistake. Correct answers do not excuse exposed internal payloads. The visible leak is the incident.
Operators chase the wrong tool
If the answer uses the tool result correctly, the tool itself is often not where you win back time. Focus first on rendering and delivery. Only move into tool-level debugging if the final answer is also wrong or missing.
Typical mistakes that make this incident linger
- Restarting too early: you lose the clean sequence that proves what the user saw.
- Testing only in direct chat: the bug may be forum-topic specific.
- Checking only for answer correctness: output safety is the real close condition.
- Running noisy production tests: users keep seeing leaked wrappers while you experiment.
- Assuming one clean reply means done: timing-related regressions often return.
When to stop patching and change the operating model
If Telegram topic regressions keep returning around upgrades, hot reloads, or release windows, the real problem may no longer be one specific bug. It may be that your current operating setup makes output-safety regressions too easy to reintroduce and too expensive to verify every time.
At that point, the question is not just “How do we suppress this one wrapper leak?” It becomes “How much team attention are we burning on chat-safety checks?” If your assistant is customer-facing or tightly tied to business workflows, safer hosting and cleaner release control can be worth more than squeezing a little more life out of a brittle setup. If you want to compare that tradeoff, review OpenClaw cloud hosting, the broader deployment comparison, and the main setup guide before your next release window.
FAQ
Why does the JSON leak only in Telegram topics and not everywhere else?
Topics add extra routing and thread-resolution behavior. A delivery path can be correct for ordinary Telegram chats but still mishandle intermediate payloads in forum topics.
Can I hide the problem by filtering JSON-looking messages?
That may reduce exposure in the short term, but it is not a complete fix unless you also confirm the correct final answer still arrives cleanly. Suppression without validation can mask a deeper delivery bug.
What is the fastest safe test prompt?
Use a low-risk question that reliably triggers the same tool path as the failing case and produces a recognizable answer. The prompt should be easy to repeat in both the topic and a comparison chat.
Should I treat this like a security bug?
Treat it as an output-safety incident with user-visible trust impact. Even when the leaked payload does not contain sensitive data, it still proves that internal execution artifacts reached the channel.
Sources
- OpenClaw issue #71475 — Telegram topic leaks raw tool wrapper before final answer
- OpenClaw issue #39626 — earlier raw tool-call output exposure regression
- OpenClaw 2026.3.7 regressions recovery guide — prior acceptance-test standard for safe user-visible output