OpenClaw Internals

Why hosted OpenClaw Tailscale uses a sidecar design

Hosted OpenClaw needs secure private networking without exposing every instance to the public internet. Tailscale is the obvious solution, but the implementation matters. This guide explains why we use a sidecar architecture, what security tradeoffs that implies, and how the pieces fit together in production.

Problem statement: secure networking in hostile environments

A hosted AI agent running in a shared environment faces a networking dilemma. On one hand, you want private, secure access without public IPs. On the other, you cannot trust the host environment, and you want to minimize the attack surface of networking components. Running a full VPN client inside the same container as your agent runtime blurs security boundaries and makes permission scoping difficult.

The sidecar pattern addresses this by isolating networking concerns into a separate container with its own security context, while still sharing a network namespace with the main application. This means OpenClaw can reach Tailscale-provisioned endpoints without running Tailscale code in the same process space.

High-level architecture

The implementation consists of three cooperating pieces:

  1. Main OpenClaw container: runs non-root, with dropped capabilities, handling agent logic, tool execution, and gateway functions.
  2. Tailscale sidecar container: optional container that holds the elevated networking privileges, establishes the tailnet connection, and exposes networking to the main container via the pod's shared network namespace.
  3. PVC-backed state: persistent volume storing encrypted auth keys and Tailscale state, surviving pod restarts.
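
To make the layout concrete, here is a minimal sketch of the three pieces as a plain Python dict shaped like a Kubernetes Pod manifest. The container names, image references, mount path, claim name, and the choice of NET_ADMIN as the sidecar's capability are all assumptions for illustration, not the values the hosted platform actually uses.

```python
def build_pod_spec(instance_id: str, tailscale_enabled: bool = True) -> dict:
    """Return a Pod manifest dict with the main container and an optional sidecar.

    Every name, image, and path here is a hypothetical placeholder.
    """
    main = {
        "name": "openclaw",
        "image": "example.invalid/openclaw:latest",  # placeholder image
        "securityContext": {
            "runAsNonRoot": True,
            "allowPrivilegeEscalation": False,
            "capabilities": {"drop": ["ALL"]},
        },
    }
    containers = [main]
    volumes = []
    if tailscale_enabled:
        containers.append({
            "name": "tailscale",
            "image": "example.invalid/tailscale-sidecar:latest",  # placeholder image
            # Assumption: NET_ADMIN is the elevated privilege the sidecar needs.
            "securityContext": {"capabilities": {"add": ["NET_ADMIN"]}},
            "volumeMounts": [{"name": "ts-state", "mountPath": "/var/lib/tailscale"}],
        })
        volumes.append({
            "name": "ts-state",
            "persistentVolumeClaim": {"claimName": f"ts-state-{instance_id}"},
        })
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": f"openclaw-{instance_id}"},
        "spec": {"containers": containers, "volumes": volumes},
    }
```

Containers in the same pod share a network namespace by default, so nothing extra is needed in the manifest for the main container to see the interface the sidecar brings up. Only the sidecar mounts the state volume.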

Security boundary design

The sidecar design creates explicit security boundaries between components:

  • Main container: runs without root, without capabilities, without direct access to Tailscale state files. It only sees the network interface that the sidecar exposes.
  • Sidecar container: runs with the minimum privileges needed for Tailscale operation, isolated from the main application logic.
  • State isolation: auth keys and Tailscale node keys live in encrypted PVC storage, not in environment variables or configmaps that could leak to other containers.
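
These boundaries can be checked mechanically. The sketch below audits a Pod manifest dict against the three rules above. The default container name, the volume name, and the idea of running such an audit at all are assumptions for illustration. TS_AUTHKEY is the environment variable Tailscale's container image reads for an auth key, and it is used here only as an example of something that should not appear on the main container.

```python
def audit_boundaries(pod: dict, main_name: str = "openclaw",
                     state_volume: str = "ts-state") -> list:
    """Return a list of boundary violations found on the main container."""
    containers = {c["name"]: c for c in pod["spec"]["containers"]}
    main = containers.get(main_name)
    if main is None:
        return [f"main container {main_name!r} not found"]

    problems = []
    sc = main.get("securityContext", {})
    if not sc.get("runAsNonRoot"):
        problems.append("main container may run as root")

    caps = sc.get("capabilities", {})
    if "ALL" not in caps.get("drop", []):
        problems.append("main container does not drop all capabilities")
    if caps.get("add"):
        problems.append("main container adds capabilities")

    # The main container should never mount the Tailscale state volume.
    mounts = [m["name"] for m in main.get("volumeMounts", [])]
    if state_volume in mounts:
        problems.append("main container mounts the Tailscale state volume")

    # Auth keys should not reach the main container through its environment.
    for env in main.get("env", []):
        if env.get("name") == "TS_AUTHKEY":
            problems.append("auth key exposed to main container via env")
    return problems
```

An empty list means the main container respects all three boundaries. Any entry points at the specific rule that was broken.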

Why sidecar instead of in-process

Dimension | In-process Tailscale | Sidecar design
Privilege scoping | Main process needs elevated caps for networking | Only sidecar needs privileges; main stays non-root
Restart isolation | Networking restart restarts entire agent | Sidecar restart does not affect main container
Resource limits | Networking and agent share limits | Independent CPU/memory constraints per container
Security boundary | Tailscale code runs in same address space | Process isolation between networking and agent logic
Operational complexity | Simpler deployment | More containers to monitor, but clearer failure modes

Auth key storage and lifecycle

One of the trickiest parts of any VPN integration is credential handling. The sidecar implementation uses a specific pattern:

  • Encrypted PVC storage: auth keys are stored encrypted in a persistent volume claim, not in plaintext environment variables.
  • Runtime decryption: the sidecar decrypts the key at startup, establishing the tailnet connection.
  • State persistence: Tailscale node keys and state live alongside the auth key, so re-auth is not needed after every restart.
  • No key exposure to main container: the OpenClaw main process never sees the raw auth key, only the network interface that Tailscale exposes.

This pattern means that even if the main container is compromised, an attacker cannot directly extract Tailscale credentials without also breaching the sidecar and the encrypted storage layer.
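
The startup decision described above can be sketched as follows. This is a minimal illustration, not the sidecar's actual code: the file names tailscaled.state and authkey.enc, and the decrypt callable that stands in for whatever encryption scheme the platform uses, are assumptions.

```python
from pathlib import Path
from typing import Callable, Optional


def resolve_auth_key(state_dir: Path,
                     decrypt: Callable[[bytes], str]) -> Optional[str]:
    """Return an auth key only when the node has no persisted identity yet.

    state_dir is the PVC mount. File names are hypothetical placeholders.
    """
    state_file = state_dir / "tailscaled.state"
    if state_file.exists() and state_file.stat().st_size > 0:
        # Existing node state on the volume: reuse it, no re-auth needed.
        return None
    # First start (or state was lost): decrypt the stored key at runtime.
    encrypted = (state_dir / "authkey.enc").read_bytes()
    return decrypt(encrypted)
```

The decrypted key lives only in the sidecar's memory for the duration of the login. Because this runs inside the sidecar and reads from a volume the main container does not mount, the main process has no path to the key.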

Hostname stability across redeploys

A common pain point with dynamic networking is hostname churn. The sidecar implementation addresses this with a stable default:

  • TS_HOSTNAME default: defaults to a stable value based on your instance identifier, not a random pod name.
  • User override optional: you can set a custom hostname if needed, but the default prevents accidental breakage.
  • DNS consistency: your tailnet DNS entries do not shift with every redeploy, preserving external integrations.
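
As an illustration of what a stable default can look like, the sketch below derives a hostname from an instance identifier and sanitizes it into a valid DNS label. The openclaw- prefix and the exact sanitizing rules are assumptions. The source only states that the default is based on the instance identifier rather than the pod name.

```python
import re


def stable_hostname(instance_id: str, override: str = "") -> str:
    """Derive a DNS-safe hostname that does not change across redeploys.

    The "openclaw-" prefix is a hypothetical convention for this sketch.
    """
    raw = override or f"openclaw-{instance_id}"
    # Keep only lowercase letters, digits, and hyphens.
    label = re.sub(r"[^a-z0-9-]+", "-", raw.lower()).strip("-")
    # A DNS label is at most 63 characters and must not end with a hyphen.
    label = label[:63].rstrip("-")
    if not label:
        raise ValueError("hostname is empty after sanitizing")
    return label
```

Because the result depends only on the instance identifier (or the explicit override), a rescheduled pod registers under the same name as the one it replaces.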

UDP egress and NetworkPolicy

Tailscale relies on UDP for direct peer connections, but Kubernetes NetworkPolicies often restrict UDP egress. The sidecar implementation uses a label-scoped policy approach:

  • Label-scoped UDP egress: NetworkPolicies allow UDP traffic specifically for the sidecar, without opening UDP broadly for all pods.
  • DERP relay fallback: when UDP is blocked, Tailscale falls back to DERP relay over TCP/443.
  • Live status visibility: the Addons UI shows current UDP, relay, and connection status so you can diagnose networking issues.

This design allows you to maintain strict NetworkPolicy hygiene while still enabling direct peer connections when possible.
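
A label-scoped policy of this kind might look like the following, again written as a Python dict in the shape of a Kubernetes manifest. The label key and value and the policy name are hypothetical. Note that a NetworkPolicy selects pods, not individual containers, so in practice the label marks pods that have the sidecar enabled.

```python
def udp_egress_policy(namespace: str,
                      label_key: str = "openclaw.example/tailscale",
                      label_value: str = "enabled") -> dict:
    """Build a NetworkPolicy allowing UDP egress only for labeled pods.

    The label and policy name are placeholders for this sketch.
    """
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "tailscale-udp-egress", "namespace": namespace},
        "spec": {
            # Only pods carrying this label are affected by the policy.
            "podSelector": {"matchLabels": {label_key: label_value}},
            "policyTypes": ["Egress"],
            "egress": [
                # Omitting "port" matches every UDP port (direct peer paths).
                {"ports": [{"protocol": "UDP"}]},
                # TCP/443 keeps the DERP relay fallback reachable.
                {"ports": [{"protocol": "TCP", "port": 443}]},
            ],
        },
    }
```

Pods without the label keep whatever egress restrictions already apply to them, which is how UDP stays closed everywhere else.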

Netcheck JSON preamble handling

Networking diagnostics are hard. The implementation includes a netcheck preamble that surfaces connectivity state in JSON format:

  • Connectivity probe results: whether DERP relays are reachable, whether UDP is working, and current connection paths.
  • Structured output: JSON format allows programmatic consumption by monitoring tools and UI components.
  • Addons UI integration: live status indicators show connection health, relay fallback, and UDP status directly in the dashboard.
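
One plausible reading of "preamble handling" is that the diagnostic output can contain log lines before the JSON document, which a consumer has to skip before parsing. The sketch below shows that technique under that assumption. The sample output and the field names UDP and PreferredDERP in the usage note are illustrative and may not match what the sidecar actually emits.

```python
import json


def parse_json_after_preamble(output: str) -> dict:
    """Return the first JSON object found in output, skipping leading text."""
    decoder = json.JSONDecoder()
    pos = output.find("{")
    while pos != -1:
        try:
            # raw_decode parses one JSON value starting at pos and
            # ignores anything that follows it.
            obj, _end = decoder.raw_decode(output, pos)
            return obj
        except json.JSONDecodeError:
            # This brace was part of the preamble; try the next one.
            pos = output.find("{", pos + 1)
    raise ValueError("no JSON object found in output")
```

For example, given a log line followed by {"UDP": true, "PreferredDERP": 1}, the function returns the parsed object and ignores the log line, so a dashboard component can read the probe results without caring what was printed first.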

Live status in the Addons UI

Operations visibility matters. The sidecar implementation surfaces key status indicators:

  • Connection status: whether the sidecar is connected to your tailnet.
  • Current IP address: the Tailscale-assigned IP for this instance.
  • UDP status: whether direct peer connections are working or if you are in relay mode.
  • Relay risk: whether DERP fallback is in use, which can indicate networking restrictions.

These indicators help you distinguish between expected relay fallback and potential connectivity problems.
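
The distinction between expected relay fallback and a real problem can be expressed as a small decision rule over the four indicators. The class name, field names, and message wording below are assumptions for illustration and do not describe the Addons UI implementation.

```python
from dataclasses import dataclass


@dataclass
class TailscaleIndicators:
    """The four status indicators, as hypothetical fields."""
    connected: bool
    ip: str
    udp_ok: bool
    relay_in_use: bool

    def summary(self) -> str:
        """Turn the raw indicators into a one-line diagnosis."""
        if not self.connected:
            return "disconnected: sidecar is not on the tailnet"
        if self.relay_in_use and not self.udp_ok:
            return "relay: UDP appears blocked, check NetworkPolicy"
        if self.relay_in_use:
            return "relay: UDP works, a direct path may still be negotiating"
        return f"direct: peer connections working at {self.ip}"
```

Relay with working UDP is usually transient and expected. Relay with failing UDP is the combination that points at a networking restriction.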

Failure modes and diagnostics

Understanding common failure patterns helps with troubleshooting:

  • Auth key rotation: if you rotate or revoke the auth key for your tailnet, update the stored key in the sidecar configuration and restart the sidecar.
  • UDP blocked: if the connection stays on relay fallback indefinitely, check that your NetworkPolicy allows label-scoped UDP egress.
  • PV mount failures: if state cannot persist, the sidecar will fail to establish a stable identity. Check PVC provisioning.
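  • Hostname conflicts: if you override TS_HOSTNAME with a name that is already taken on your tailnet, the node may not come up under the name you expect (Tailscale commonly appends a numeric suffix to duplicate names), which breaks anything pinned to the original name. Use a unique hostname or revert to the default.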

When to use the Tailscale sidecar

  • Use it when you need private access to your OpenClaw instance without exposing it on the public internet.
  • Use it when you want to integrate OpenClaw with existing tailnet services and resources.
  • Use it when your security policy prohibits public IP exposure for agent endpoints.
  • Skip it if you only need browser-based access via extension relay or public gateway URLs.
  • Skip it if you do not have an existing tailnet or do not want the operational overhead of Tailscale management.

Comparison with other remote access methods

Method | Privacy | Setup complexity | Use case fit
Tailscale sidecar | Private, encrypted | Requires tailnet | Team access, private resources
Public gateway | Requires TLS/auth | Simple | Individual use, API access
Browser relay | Session-scoped | Moderate | Local browser automation