OpenClaw stop button missing during long tasks: fix

Problem statement: an OpenClaw task is running longer than expected, but the Stop button disappears or no longer works. The gateway may have restarted, the dashboard may look disconnected, or a model call, browser action, code command, or agent subprocess may continue without a visible cancel control. This is not just a UI annoyance. It is a control problem: the operator needs a safe way to stop work, preserve state, and prevent repeated runaway tasks.

Evidence from the field
  • GitHub issue #70660 reports that the Stop button can disappear during execution, and if the gateway stops, the user may be unable to terminate the running process from the interface.
  • The reported impact is concrete: a process can continue until the gateway is forcefully killed, which is especially risky for long-running code tasks and agent workflows that touch external systems.
  • The field report names a Windows desktop environment and a long-running request path, but the recovery pattern applies more broadly: once the cancellation control is gone, you must identify which layer still owns the work.
  • Recent public discussion also shows users experimenting with low-cost model routing and OpenClaw swarms. Cheap or parallel agent work makes interruption and budget controls more important, not less.

First decide what is still running

The missing Stop button is a symptom. The active work may live in several places: a model request waiting on a provider, a child shell command, a browser automation action, a background coding-agent session, a channel plugin, or the gateway process itself. Killing the wrong thing can lose logs, corrupt partial output, or leave a second process running in the background.

Your first goal is not to restart everything. Your first goal is to find the smallest safe interruption point. If the task is only a model response, a gateway restart may be enough. If a child process is editing files, you need to stop that child carefully. If browser automation is mid-action, you may need to close the specific browser task while keeping the gateway alive long enough to preserve logs.

Safe triage sequence

  1. Stop sending new prompts. Do not stack more requests onto a session that already has no visible cancel path.
  2. Capture the visible state. Note the task name, chat/session, elapsed time, model/provider, and whether the gateway appears connected.
  3. Check gateway health. If the dashboard is disconnected, verify whether the gateway process is alive, restarting, or completely stopped.
  4. Look for child processes. Long-running code tasks often spawn shells, package managers, test runners, browser drivers, or coding agents. These can outlive the UI request.
  5. Check external side effects. If the task can send messages, push code, call APIs, or modify files, identify whether it has already crossed that boundary before killing anything.
  6. Use the narrowest stop action. Cancel the child task if possible. If not, terminate the specific process group. Restart the whole gateway only after you have ruled out a narrower interruption.
  7. Save logs before cleanup. If this becomes a recurring problem, the timing and process tree are the evidence you need for a durable fix.
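
Steps 2 and 7 are easier to follow under pressure if the visible state goes into a small structured record before anything is killed. The sketch below is one possible shape for that record, not an OpenClaw feature: the IncidentSnapshot class, its field names, and the sample values are all hypothetical and simply mirror the items listed in step 2.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class IncidentSnapshot:
    """Hypothetical record of what was visible when Stop went missing."""
    task_name: str
    session: str
    elapsed_seconds: int
    model_provider: str
    gateway_connected: bool
    notes: list[str] = field(default_factory=list)
    # Capture time is recorded in UTC so it can be compared with log lines.
    captured_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def snapshot_to_json(snapshot: IncidentSnapshot) -> str:
    """Serialize the snapshot so it can be saved next to the logs."""
    return json.dumps(asdict(snapshot), indent=2, sort_keys=True)


# Example values are invented for illustration.
example = IncidentSnapshot(
    task_name="nightly QA pass",
    session="ops-chat",
    elapsed_seconds=2700,
    model_provider="example-provider",
    gateway_connected=False,
    notes=["Stop button missing", "dashboard shows disconnected"],
)
print(snapshot_to_json(example))
```

Writing the snapshot first means the later steps can be checked against a fixed starting point instead of memory.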

Common causes

  • UI state drift: the dashboard no longer reflects the running request correctly, so the Stop button disappears even though work continues.
  • Gateway restart during execution: the control plane resets while the subprocess or external request keeps running.
  • Child process ownership: a shell command, test runner, browser action, or coding agent was started by OpenClaw but is not cleanly tied to the visible cancel button.
  • Provider wait state: a slow model call, rate limit, or streaming issue leaves the session appearing active without useful progress.
  • Long browser workflow: a browser task hangs on a selector, navigation, upload, login, or modal and holds the turn open.
  • Overly broad agent instruction: “keep working until done” without budgets or checkpoints makes interruption harder when something goes wrong.

Recovery playbook

1. If the gateway is alive, cancel from the narrowest surface

If any session-level cancel, process-control panel, or task detail view still works, use that before killing the gateway. A clean cancellation can preserve logs and let OpenClaw mark the turn as interrupted instead of crashed.

2. If the UI is disconnected, inspect the process tree

When the visible button is gone, move one layer lower. Check which processes were started around the task time. Look for Node processes, shell commands, package managers, browsers, Python scripts, test runners, or coding-agent adapters. A runaway child should be stopped as a child, not by blindly rebooting the host.
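
As a rough illustration of inspecting the tree, the sketch below takes a process listing with PID, parent PID, and command columns and returns everything that sits underneath one starting process. On Linux or macOS such a listing can come from `ps -eo pid,ppid,comm`; on Windows the same three columns would have to come from another tool. The sample listing and the PID 100 standing in for the gateway are invented for the example.

```python
from collections import defaultdict


def parse_ps_output(text):
    """Parse 'PID PPID COMMAND' lines into (pid, ppid, command) tuples."""
    rows = []
    for line in text.strip().splitlines():
        parts = line.split(None, 2)
        if len(parts) < 3 or not (parts[0].isdigit() and parts[1].isdigit()):
            continue  # skip the header row and malformed lines
        rows.append((int(parts[0]), int(parts[1]), parts[2].strip()))
    return rows


def descendants(rows, root_pid):
    """Return every process below root_pid, parents before children."""
    children = defaultdict(list)
    for pid, ppid, command in rows:
        children[ppid].append((pid, command))
    found = []
    seen = {root_pid}
    queue = [root_pid]
    while queue:
        current = queue.pop(0)
        for pid, command in children[current]:
            if pid in seen:
                continue  # guard against odd listings that repeat a PID
            seen.add(pid)
            found.append((pid, command))
            queue.append(pid)
    return found


# Invented listing: PID 100 plays the role of the gateway process.
SAMPLE = """
  PID  PPID COMMAND
  100     1 node
  210   100 bash
  215   210 npm
  300     1 chrome
"""
print(descendants(parse_ps_output(SAMPLE), 100))
# → [(210, 'bash'), (215, 'npm')]
```

The output lists candidates only. Which of them is the runaway child is still a judgment call based on start time and what the task was doing.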

3. If the task can cause external changes, pause credentials first

For tasks that can send messages, submit forms, push commits, or call paid APIs, consider pausing the external credential or revoking the immediate action path before cleanup. That is especially important when the process may still be alive but the dashboard no longer shows progress.

4. Restart the gateway after the runaway work is stopped

Restarting the gateway first can hide the process that is actually doing the work. Stop the child task, confirm it is gone, then restart the gateway so the UI returns to a clean state. After restart, do not immediately re-run the same broad prompt. Reduce scope and add checkpoints.
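
If long commands are launched through a wrapper script you control, "stop the child, confirm it is gone" can be sketched as below. This is an assumption about how you might run such commands yourself, not a description of how OpenClaw starts its children. It is POSIX only: the child is started in its own session so that its process group ID equals its PID, and os.killpg is not available on Windows. The function names are hypothetical.

```python
import os
import signal
import subprocess


def start_in_own_group(args):
    """Start a command as the leader of a new session and process group."""
    return subprocess.Popen(args, start_new_session=True)


def stop_process_group(proc, grace_seconds=5.0):
    """Send SIGTERM to the child's group, escalate to SIGKILL if it lingers.

    Returns the child's exit code once it is confirmed gone.
    """
    if proc.poll() is not None:
        return proc.returncode  # already exited, nothing to stop
    try:
        os.killpg(proc.pid, signal.SIGTERM)  # polite request first
        return proc.wait(timeout=grace_seconds)
    except ProcessLookupError:
        return proc.wait()  # group vanished between poll() and killpg()
    except subprocess.TimeoutExpired:
        try:
            os.killpg(proc.pid, signal.SIGKILL)  # the group ignored SIGTERM
        except ProcessLookupError:
            pass
        return proc.wait()
```

Signalling the group rather than the single PID is the point of the design: a shell that has spawned a package manager or test runner is stopped together with what it spawned, and the gateway restart happens only after wait() has returned.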

5. Re-run with a budgeted prompt

Replace “do everything” with a bounded instruction: run one diagnostic, report before making changes, ask before external writes, stop after ten minutes, or produce a plan before executing. Long tasks become safer when the agent has explicit stop criteria.
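
The same stop criteria can be written down as data, so that a wrapper or a human reviewer can check them between steps. The sketch below is a hypothetical helper, not an OpenClaw setting: the RunBudget class, its field names, and the reason strings are invented to mirror the limits named above (a time limit, a step limit, and approval before external writes).

```python
import time
from dataclasses import dataclass, field


@dataclass
class RunBudget:
    """Hypothetical stop criteria for one bounded run."""
    max_seconds: float = 600.0           # "stop after ten minutes"
    max_steps: int = 1                   # "run one diagnostic"
    allow_external_writes: bool = False  # "ask before external writes"
    started_at: float = field(default_factory=time.monotonic)
    steps_done: int = 0

    def stop_reason(self, next_step_writes_externally=False, now=None):
        """Return why the run must stop before the next step, or None."""
        now = time.monotonic() if now is None else now
        if now - self.started_at >= self.max_seconds:
            return "time budget exhausted"
        if self.steps_done >= self.max_steps:
            return "step budget exhausted"
        if next_step_writes_externally and not self.allow_external_writes:
            return "external write needs approval"
        return None

    def record_step(self):
        """Count one completed step against the step budget."""
        self.steps_done += 1
```

A caller would check stop_reason() before each step and call record_step() after it. Any non-None answer is a checkpoint: report what was done and wait for the operator.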

Fix once. Stop recurring runaway long tasks.

If this keeps coming back, you can move your existing setup to managed OpenClaw cloud hosting instead of rebuilding the same stack. Import your current instance, keep your context, and move onto a runtime with lower ops overhead.

  • Import flow in ~1 minute
  • Keep your current instance context
  • Run with managed security and reliability defaults

If you would rather compare options first, review OpenClaw cloud hosting or see the best OpenClaw hosting options before deciding.

[Screenshot: OpenClaw import first screen in OpenClaw Setup dashboard]
1) Paste import payload
[Screenshot: OpenClaw import completed screen in OpenClaw Setup dashboard]
2) Review and launch

Design safer long-running OpenClaw workflows

Long-running work is not automatically bad. Some agent tasks need time: QA passes, code review, browser testing, report generation, multi-source research, and migration checks. The problem is running them without a safety envelope. A good long-running workflow has a budget, a checkpoint cadence, a cancellation path, and a clear definition of done.

For code tasks, ask the agent to make one change, run one verification gate, and stop with a summary. For browser tasks, ask it to capture the current page state before submitting forms. For paid model routes, set limits or route cheap drafting steps to lower-cost models while keeping high-consequence decisions on stronger models. For multi-agent or swarm-style work, define owner, scope, and max runtime before starting.

How to verify recovery

  1. The original runaway child process is no longer present.
  2. The gateway is healthy and does not immediately restart again.
  3. The affected chat/session shows a stopped, failed, or completed state instead of an endless active state.
  4. No new external messages, commits, file writes, or API calls appear after the stop time.
  5. A small follow-up prompt completes normally.
  6. The next long task includes a time budget, checkpoint, and stop rule.
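
Check 4 is the easiest one to get wrong by eye. A minimal sketch, assuming you can export side-effect events (messages, commits, file writes, API calls) as pairs of ISO 8601 timestamp and description: the function name and the sample events are invented, and the timestamps are expected to carry an explicit UTC offset.

```python
from datetime import datetime


def events_after_stop(events, stopped_at):
    """Return the (timestamp, description) events logged after the stop time."""
    cutoff = datetime.fromisoformat(stopped_at)
    return [
        (stamp, description)
        for stamp, description in events
        if datetime.fromisoformat(stamp) > cutoff
    ]


# Invented sample data for illustration.
events = [
    ("2025-01-01T10:00:00+00:00", "commit pushed"),
    ("2025-01-01T10:07:30+00:00", "API call"),
]
late = events_after_stop(events, "2025-01-01T10:05:00+00:00")
print(late)
# → [('2025-01-01T10:07:30+00:00', 'API call')]
```

An empty result supports the claim that the stop took effect. Any entry after the stop time means something is still running or was queued before the stop.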

Edge cases

Windows desktop processes: a visible app window, background Node process, and browser child process may all be separate. Check the process tree instead of assuming the desktop app owns everything cleanly.

Browser automation: if a browser task is stuck on upload, login, or navigation, closing only the browser tab may not stop the agent turn. Stop the automation process or session as well.

Streaming model responses: a provider can keep a stream open after useful output has stopped. Treat long silence as a failure mode and re-run with a smaller context or a different provider.
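
If you consume a stream from your own code, treating silence as a failure can be sketched as an idle timeout around each chunk. This is a generic asyncio pattern, not an OpenClaw or provider API: the function name, the StreamStalled exception, and the choice of idle limit are assumptions, and the stream is any async iterator of chunks.

```python
import asyncio


class StreamStalled(Exception):
    """Raised when a stream stays silent longer than the idle limit."""


async def read_with_idle_timeout(stream, idle_seconds):
    """Yield chunks from an async iterator, failing on long silence."""
    iterator = stream.__aiter__()
    while True:
        try:
            # The timer restarts for every chunk, so a slow but steady
            # stream passes while a silent one fails.
            chunk = await asyncio.wait_for(iterator.__anext__(), idle_seconds)
        except StopAsyncIteration:
            return  # the stream ended normally
        except asyncio.TimeoutError:
            raise StreamStalled(
                f"no output for {idle_seconds} seconds"
            ) from None
        yield chunk
```

The limit applies to the gap between chunks rather than to the whole response, so a long answer that keeps producing output is not cut off. What to do after StreamStalled (smaller context, different provider) remains the operator's decision.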

Parallel agents: parallel work multiplies the need for control. If several agents are running, stop the one with external side effects first, then clean up read-only tasks.

Typical mistakes

  • Clicking around the dashboard repeatedly instead of preserving the first failure state.
  • Killing the gateway while a child command continues in the background.
  • Restarting and immediately re-running the same unbounded prompt.
  • Ignoring browser tasks that remain open after the chat turn appears dead.
  • Using cheap model routing or parallel agents without budget and stop rules.
  • Allowing long-running tasks to perform external writes without an approval checkpoint.

When managed hosting is safer

If you only hit this once during a local experiment, the recovery steps above should be enough. If long-running OpenClaw work is now part of how your team works every day, control and observability matter more than raw convenience. Managed hosting gives you a clearer runtime boundary, easier recovery expectations, and a better place to enforce budgets, channels, browser access, and restart behavior.

Review OpenClaw cloud hosting if you want managed runtime defaults, compare self-hosted and hosted options on the comparison page, or start with OpenClaw Setup if you are deciding how much operational work your team should keep owning.

Make long-running agents easier to control

If your OpenClaw tasks now run for minutes or hours, treat cancellation, budgets, and recovery as product requirements. Move recurring work onto a runtime where you can supervise it deliberately.

FAQ

Is the missing Stop button only a visual bug?

No. The serious case is when the visible control disappears while work continues somewhere else. Treat it as a control-plane failure until you prove no process is still running.

Can I just restart my computer?

That may stop the work, but it is a last-resort recovery, not a diagnosis. You lose the process evidence needed to prevent the same failure next time.

What should I add to long prompts?

Add limits: maximum runtime, maximum files changed, one verification gate, no external writes without approval, and a checkpoint before continuing to the next phase.

Does this matter if I use cheaper models?

Yes. Lower model cost can encourage more parallel or longer-running work. That makes stop controls, budgets, and process ownership more important because the operational risk moves from token cost to uncontrolled execution.
