Why hosted OpenClaw's Tailscale integration uses a sidecar design
Hosted OpenClaw needs secure private networking without exposing every instance to the public internet. Tailscale is the obvious solution, but the implementation matters. This guide explains why we use a sidecar architecture, what security tradeoffs that implies, and how the pieces fit together in production.
Problem statement: secure networking in hostile environments
A hosted AI agent running in a shared environment faces a networking dilemma. On one hand, you want private, secure access without public IPs. On the other, you cannot trust the host environment, and you want to minimize the attack surface of networking components. Running a full VPN client inside the same container as your agent runtime blurs security boundaries and makes permission scoping difficult.
The sidecar pattern addresses this by isolating networking concerns into a separate container with its own security context, while still sharing a network namespace with the main application. This means OpenClaw can reach Tailscale-provisioned endpoints without running Tailscale code in the same process space.
High-level architecture
The implementation consists of three cooperating pieces:
- Main OpenClaw container: runs non-root, with dropped capabilities, handling agent logic, tool execution, and gateway functions.
- Tailscale sidecar container: an optional container with elevated privileges that establishes the tailnet connection and exposes networking to the main container through the shared network namespace.
- PVC-backed state: persistent volume storing encrypted auth keys and Tailscale state, surviving pod restarts.
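To make the three pieces concrete, here is a minimal sketch of a pod spec built as a plain Python dict. It is not the actual hosted manifest: the image names, the volume and claim names, the mount path, and the choice of `NET_ADMIN` as the sidecar's elevated capability are all assumptions made for illustration.

```python
def build_pod_spec(instance_id: str, tailscale_enabled: bool = True) -> dict:
    """Return a Kubernetes-style pod spec as a plain dict (illustrative only)."""
    main = {
        "name": "openclaw",
        "image": "example/openclaw:latest",  # placeholder image name
        "securityContext": {
            "runAsNonRoot": True,
            "allowPrivilegeEscalation": False,
            "capabilities": {"drop": ["ALL"]},
        },
    }
    containers = [main]
    volumes = []
    if tailscale_enabled:
        containers.append({
            "name": "tailscale",
            "image": "example/tailscale-sidecar:latest",  # placeholder image name
            # Assumption: NET_ADMIN stands in for whatever privileges the sidecar needs.
            "securityContext": {"capabilities": {"add": ["NET_ADMIN"]}},
            # Only the sidecar mounts the state volume.
            "volumeMounts": [{"name": "ts-state", "mountPath": "/var/lib/tailscale"}],
        })
        volumes.append({
            "name": "ts-state",
            "persistentVolumeClaim": {"claimName": f"ts-state-{instance_id}"},
        })
    # Containers in one pod share a network namespace, so no extra wiring is needed.
    return {"containers": containers, "volumes": volumes}
```

Calling `build_pod_spec("abc123", tailscale_enabled=False)` yields a spec with only the main container and no volumes, which mirrors the sidecar being optional.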
Security boundary design
The sidecar design creates explicit security boundaries between components:
- Main container: runs without root, without capabilities, without direct access to Tailscale state files. It only sees the network interface that the sidecar exposes.
- Sidecar container: runs with the minimum privileges needed for Tailscale operation, isolated from the main application logic.
- State isolation: auth keys and Tailscale node keys live in encrypted PVC storage, not in environment variables or configmaps that could leak to other containers.
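These boundaries can be checked mechanically. The sketch below is a hypothetical lint over a pod spec dict, not part of the hosted product; the container name `openclaw` and volume name `ts-state` are assumed defaults.

```python
def check_boundaries(pod_spec: dict, main_name: str = "openclaw",
                     state_volume: str = "ts-state") -> list:
    """Return a list of boundary violations found on the main container."""
    problems = []
    for container in pod_spec.get("containers", []):
        if container.get("name") != main_name:
            continue  # only the main container is held to these rules
        sc = container.get("securityContext", {})
        caps = sc.get("capabilities", {})
        if not sc.get("runAsNonRoot"):
            problems.append("main container may run as root")
        if caps.get("add"):
            problems.append("main container adds capabilities")
        if "ALL" not in caps.get("drop", []):
            problems.append("main container does not drop all capabilities")
        mounted = [m.get("name") for m in container.get("volumeMounts", [])]
        if state_volume in mounted:
            problems.append("main container mounts Tailscale state")
    return problems
```

An empty list means the main container satisfies all four checks; each string names one boundary that was crossed.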
Why sidecar instead of in-process
| Dimension | In-process Tailscale | Sidecar design |
|---|---|---|
| Privilege scoping | Main process needs elevated caps for networking | Only sidecar needs privileges; main stays non-root |
| Restart isolation | Networking restart restarts entire agent | Sidecar restart does not affect main container |
| Resource limits | Networking and agent share limits | Independent CPU/memory constraints per container |
| Security boundary | Tailscale code runs in same address space | Process isolation between networking and agent logic |
| Operational complexity | Simpler deployment | More containers to monitor, but clearer failure modes |
Auth key storage and lifecycle
One of the trickiest parts of any VPN integration is credential handling. The sidecar implementation uses a specific pattern:
- Encrypted PVC storage: auth keys are stored encrypted in a persistent volume claim, not in plaintext environment variables.
- Runtime decryption: the sidecar decrypts the key at startup and uses it to establish the tailnet connection.
- State persistence: Tailscale node keys and state live alongside the auth key, so re-auth is not needed after every restart.
- No key exposure to main container: the OpenClaw main process never sees the raw auth key, only the network interface that Tailscale exposes.
This pattern means that even if the main container is compromised, an attacker cannot directly extract Tailscale credentials without also breaching the sidecar and the encrypted storage layer.
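The startup decision described above can be sketched as follows. This is an assumption-laden illustration, not the sidecar's real code: the file names `tailscaled.state` and `authkey.enc` are hypothetical for this deployment, and the encryption scheme is not specified in this guide, so decryption is passed in as a callable.

```python
from pathlib import Path
from typing import Callable, Optional, Tuple

def resolve_login(state_dir: Path,
                  decrypt: Callable[[bytes], str]) -> Tuple[str, Optional[str]]:
    """Decide whether to reuse persisted node state or authenticate with the key."""
    state_file = state_dir / "tailscaled.state"  # assumed state file name
    if state_file.exists() and state_file.stat().st_size > 0:
        # Node keys already persisted: no re-auth needed after a restart.
        return ("reuse-state", None)
    key_file = state_dir / "authkey.enc"  # hypothetical encrypted key file
    if not key_file.exists():
        raise FileNotFoundError("no persisted state and no encrypted auth key")
    # The decrypted key exists only inside the sidecar process.
    return ("authenticate", decrypt(key_file.read_bytes()))
```

Because the function runs only in the sidecar and reads only from the sidecar's mount, the main container never handles the plaintext key.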
Hostname stability across redeploys
A common pain point with dynamic networking is hostname churn. The sidecar implementation addresses this with a stable default:
- TS_HOSTNAME default: a stable value derived from your instance identifier, not a random pod name.
- User override optional: you can set a custom hostname if needed, but the default prevents accidental breakage.
- DNS consistency: your tailnet DNS entries do not shift with every redeploy, preserving external integrations.
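A stable default of this kind might be derived as in the sketch below. The `openclaw-` prefix and the sanitizing rules are assumptions; the guide only states that the default is based on the instance identifier and can be overridden.

```python
import re
from typing import Optional

def resolve_hostname(instance_id: str, override: Optional[str] = None) -> str:
    """Return a DNS-label-safe hostname that is stable across redeploys."""
    raw = override if override else f"openclaw-{instance_id}"  # assumed prefix
    # Keep lowercase letters, digits, and hyphens; collapse everything else.
    label = re.sub(r"[^a-z0-9-]+", "-", raw.lower()).strip("-")
    label = label[:63].rstrip("-")  # a DNS label is at most 63 characters
    if not label:
        raise ValueError("hostname is empty after sanitizing")
    return label
```

The key property is that the result depends only on the instance identifier or the override, never on the pod name, so the same instance resolves to the same hostname after every redeploy.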
UDP egress and NetworkPolicy
Tailscale relies on UDP for direct peer connections, but Kubernetes NetworkPolicies often restrict UDP egress. The sidecar implementation uses a label-scoped policy approach:
- Label-scoped UDP egress: NetworkPolicies allow UDP egress only for pods labeled as running the sidecar, without opening UDP for every pod.
- DERP relay fallback: when UDP is blocked, Tailscale falls back to DERP relay over TCP/443.
- Live status visibility: the Addons UI shows current UDP, relay, and connection status so you can diagnose networking issues.
This design allows you to maintain strict NetworkPolicy hygiene while still enabling direct peer connections when possible.
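As an illustration, a label-scoped egress policy could look like the dict below. NetworkPolicies select pods by label, so the rule applies to every pod carrying the sidecar label. The label key `openclaw.example/tailscale` and the policy name are hypothetical, and leaving the UDP port unset (which matches all UDP ports) is an assumption about how permissive the real policy is.

```python
def sidecar_egress_policy(instance_label: str) -> dict:
    """Build a NetworkPolicy that opens egress only for sidecar-labeled pods."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": f"tailscale-egress-{instance_label}"},
        "spec": {
            # Hypothetical label: only pods carrying it are selected.
            "podSelector": {"matchLabels": {"openclaw.example/tailscale": "enabled"}},
            "policyTypes": ["Egress"],
            "egress": [
                # UDP for direct peer connections; no port means all UDP ports.
                {"ports": [{"protocol": "UDP"}]},
                # TCP/443 so DERP relay fallback keeps working.
                {"ports": [{"protocol": "TCP", "port": 443}]},
            ],
        },
    }
```

Pods without the label are untouched by this policy, so any stricter default-deny rules continue to apply to them.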
Netcheck JSON preamble handling
Networking problems are hard to diagnose without structured data. The implementation includes a netcheck preamble that reports connectivity state as JSON:
- Connectivity probe results: whether DERP relays are reachable, whether UDP is working, and current connection paths.
- Structured output: JSON format allows programmatic consumption by monitoring tools and UI components.
- Addons UI integration: live status indicators show connection health, relay fallback, and UDP status directly in the dashboard.
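One plausible reading of "preamble handling" is that diagnostic output may carry log lines ahead of the JSON report, which a consumer has to skip. The sketch below works under that assumption. The field names `UDP`, `PreferredDERP`, and `RegionLatency` are assumed to match the JSON report printed by `tailscale netcheck`, and the sample input is invented for illustration.

```python
import json

def parse_netcheck(raw: str) -> dict:
    """Skip any non-JSON preamble and summarize the first JSON object found."""
    start = raw.find("{")
    if start == -1:
        raise ValueError("no JSON object found in netcheck output")
    # raw_decode parses one JSON value and ignores whatever follows it.
    report, _ = json.JSONDecoder().raw_decode(raw[start:])
    return {
        "udp": bool(report.get("UDP", False)),
        "preferred_derp": report.get("PreferredDERP", 0),
        "derp_reachable": bool(report.get("RegionLatency")),
    }

# Invented sample: one log line, then the JSON report.
sample = (
    "2024/01/01 00:00:00 probing network\n"
    '{"UDP": true, "PreferredDERP": 1, "RegionLatency": {"1": 12000000}}\n'
)
summary = parse_netcheck(sample)
```

This simple approach breaks if the preamble itself contains a `{`; a stricter parser would look for the first line that begins with one.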
Live status in the Addons UI
Operations visibility matters. The sidecar implementation surfaces key status indicators:
- Connection status: whether the sidecar is connected to your tailnet.
- Current IP address: the Tailscale-assigned IP for this instance.
- UDP status: whether direct peer connections are working or if you are in relay mode.
- Relay risk: whether DERP fallback is in use, which can indicate networking restrictions.
These indicators help you distinguish between expected relay fallback and potential connectivity problems.
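The distinction can be expressed as a small decision function. This is a hypothetical sketch of how the indicators might be combined, not the Addons UI's actual logic; the state names are invented.

```python
def classify_path(connected: bool, udp_ok: bool, relay_in_use: bool) -> str:
    """Combine the status indicators into a single connection summary."""
    if not connected:
        return "offline"  # the sidecar is not on the tailnet at all
    if relay_in_use and not udp_ok:
        # Relay with UDP failing points at blocked egress: check NetworkPolicy.
        return "relay-udp-blocked"
    if relay_in_use:
        # UDP works locally, so the relay is likely transient or peer-side.
        return "relay-fallback"
    return "direct"
```

In this scheme only `relay-udp-blocked` and `offline` call for action on your side; `relay-fallback` is the expected degraded mode.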
Failure modes and diagnostics
Understanding common failure patterns helps with troubleshooting:
- Auth key rotation: if you rotate the auth key for your tailnet, update the stored key in the sidecar configuration and restart the sidecar.
- UDP blocked: if you see permanent relay fallback, check that your NetworkPolicy allows label-scoped UDP egress.
- PVC mount failures: if state cannot persist, the sidecar cannot keep a stable identity across restarts. Check that the PVC is provisioned and bound.
- Hostname conflicts: if you override TS_HOSTNAME with a name already in use on your tailnet, the node may fail to register under that name or appear under an altered one. Use a unique hostname or revert to the default.
When to use the Tailscale sidecar
- Use it when you need private access to your OpenClaw instance without exposing it on the public internet.
- Use it when you want to integrate OpenClaw with existing tailnet services and resources.
- Use it when your security policy prohibits public IP exposure for agent endpoints.
- Skip it if you only need browser-based access via extension relay or public gateway URLs.
- Skip it if you do not have an existing tailnet or do not want the operational overhead of Tailscale management.
Comparison with other remote access methods
| Method | Privacy | Setup complexity | Use case fit |
|---|---|---|---|
| Tailscale sidecar | Private, encrypted | Requires tailnet | Team access, private resources |
| Public gateway | Publicly reachable; relies on TLS and auth | Simple | Individual use, API access |
| Browser relay | Session-scoped | Moderate | Local browser automation |