Connection + tunnel
When the server itself can't reach agentry, or the tunnel keeps dropping.
Server shows "disconnected" in the dashboard
The server isn't holding an active tunnel to bridge.agentry.run.
First check: is the runtime container running?
ssh your-server
docker ps | grep agentry-runtime
You should see one container with Up <duration>. If not, the container exited or never started.
Container exited
docker logs agentry-runtime
Look at the last few lines. Most common causes:
- "context deadline exceeded": the runtime couldn't reach bridge.agentry.run. Network issue. See Outbound network below.
- "x509: certificate signed by unknown authority": the server's CA bundle is out of date. Update with apt update && apt install ca-certificates && update-ca-certificates.
- "unauthorized: device cert revoked": this server's identity was revoked in the dashboard. Re-enroll: in the dashboard, Add this machine → copy the new command → paste it on the server.
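The three log messages above can be matched mechanically. The following is a minimal sketch, not part of agentry: classify_runtime_log is a hypothetical helper name, and it assumes the messages appear in the log text exactly as quoted above.

```shell
# Hypothetical helper (not shipped with agentry): read runtime log text on
# stdin and print one hint per known cause found in it.
classify_runtime_log() {
  _log=$(cat)
  _found=0
  case "$_log" in *"context deadline exceeded"*)
    echo "network: could not reach bridge.agentry.run, check outbound network"
    _found=1 ;;
  esac
  case "$_log" in *"x509: certificate signed by unknown authority"*)
    echo "ca-bundle: update the server's CA certificates"
    _found=1 ;;
  esac
  case "$_log" in *"unauthorized: device cert revoked"*)
    echo "revoked: re-enroll with Add this machine in the dashboard"
    _found=1 ;;
  esac
  # Fall through when none of the documented messages matched.
  [ "$_found" -eq 1 ] || echo "unknown: no documented cause matched"
}

# Usage on the server:
#   docker logs --tail 50 agentry-runtime 2>&1 | classify_runtime_log
```

If the output is "unknown", read the last few log lines by hand before escalating.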
Container never started
The runtime container failed to start. Re-run the Add this machine command from the dashboard — it's idempotent and will recreate the container with a fresh identity.
Tunnel drops every few seconds
The dashboard flickers between connected / disconnected. Logs show repeated reconnects.
Common causes:
- NAT / firewall idle timeout too short. Some corporate firewalls close idle connections after 5-10 minutes. The runtime sends keepalives every 30 seconds, which is enough for timeouts of that length, but more aggressive policies can still trip it.
- Server is at network capacity. A noisy neighbor on the same VM host. Check dmesg for "connection reset by peer" or "broken pipe".
- DNS flapping. The runtime resolves bridge.agentry.run once at startup. If your server's DNS is unreliable (e.g. switching between two upstream resolvers), restart the runtime to force a re-resolve.
Fix: restart the runtime:
docker restart agentry-runtime
If it keeps flapping, escalate: collect logs and send to support@agentry.run.
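To judge whether the tunnel is still flapping after a restart, it can help to count reconnects in the recent logs. This is a rough sketch under one assumption: that the runtime logs the line "tunnel established" on every successful reconnect, as it does after a restart. count_reconnects is a hypothetical helper name.

```shell
# Hypothetical helper: count "tunnel established" lines in log text on stdin.
count_reconnects() {
  # grep -c prints the number of matching lines; it exits non-zero when the
  # count is 0, so "|| true" keeps a zero count from looking like an error.
  grep -c 'tunnel established' || true
}

# Usage on the server (last 10 minutes of logs):
#   docker logs --since 10m agentry-runtime 2>&1 | count_reconnects
```

One reconnect right after a restart is expected. A count that keeps climbing over a few minutes suggests the tunnel is still dropping, and the same log output is worth attaching when you email support.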
Outbound network
The runtime needs outbound HTTPS (TCP 443) to:
- bridge.agentry.run: the persistent tunnel
- app.agentry.run: control-plane RPC
- ghcr.io: pulling the runtime image
- Wherever your bound services live (MongoDB Atlas, Anthropic API, etc.)
Test from the server:
curl -sSI https://bridge.agentry.run/healthz | head -1
# Expect: HTTP/2 200
If that fails:
- DNS issue: nslookup bridge.agentry.run. Should resolve.
- Firewall blocking 443: check iptables, your cloud provider's security group, corporate proxy rules.
- TLS interception: a corporate MITM proxy is rewriting certs. agentry pins certs at the CA level, so if your network rewrites the chain, you'll see "unknown authority" errors. Talk to your network admin.
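To check all the required hosts in one pass, a small loop over curl is enough. This is a sketch, not an official agentry tool: check_outbound is a hypothetical helper, and it only verifies that an HTTPS connection to each host completes (DNS, TCP 443, and TLS). Only bridge.agentry.run is documented here as having a /healthz endpoint, so the sketch requests the root path on every host and accepts any HTTP response.

```shell
# Hypothetical helper: print "ok <host>" or "FAIL <host>" for each argument
# and return non-zero if any host could not be reached over HTTPS.
check_outbound() {
  _failed=0
  for _host in "$@"; do
    # Without -f, curl exits 0 on any HTTP response, so this tests
    # DNS + TCP 443 + TLS rather than a specific status code.
    if curl -sS -o /dev/null --connect-timeout 5 "https://$_host/"; then
      echo "ok $_host"
    else
      echo "FAIL $_host"
      _failed=1
    fi
  done
  return "$_failed"
}

# Usage on the server:
#   check_outbound bridge.agentry.run app.agentry.run ghcr.io
```

curl's own error message on a FAIL line usually tells the three causes apart: "Could not resolve host" points at DNS, a connection timeout at a firewall, and a certificate error at TLS interception.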
"cluster <name> is offline"
Your harness or the dashboard says the cluster (server) is offline, but you think it's up.
cluster "homelab" is offline
Cause: there's no live tunnel from that server to the bridge. Either:
- The server's runtime container isn't running.
- The runtime is running but disconnected (the bridge dropped it).
- You're addressing the wrong server name (typo).
Fix: SSH in and check docker ps. If the container is up but the dashboard says disconnected, restart it:
docker restart agentry-runtime
Watch the logs as it reconnects:
docker logs -f agentry-runtime
You should see "tunnel established" within ~5 seconds.
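The restart-and-watch step can also be scripted so it reports success or failure instead of requiring you to watch the log. This is a sketch under stated assumptions: wait_for_tunnel is a hypothetical helper, the log line contains the text "tunnel established" as quoted above, and the usage example relies on the GNU coreutils timeout command being installed.

```shell
# Hypothetical helper: succeed once the log stream on stdin contains
# "tunnel established"; fail if the stream ends without it.
wait_for_tunnel() {
  if grep -q 'tunnel established'; then
    echo "tunnel established"
  else
    echo "no 'tunnel established' line seen" >&2
    return 1
  fi
}

# Usage on the server:
#   since=$(date +%s)                 # only look at logs written after this point
#   docker restart agentry-runtime
#   timeout 30 docker logs -f --since "$since" agentry-runtime 2>&1 | wait_for_tunnel
```

The 30-second limit is an arbitrary margin over the expected ~5 seconds. The pipeline returns once the log follower exits, which can take up to the full timeout even after the message has been found.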
Deploy URL returns 502
*.agentry.live returns a 502 with a message like "cluster X is offline" or "deployment route is inconsistent with cluster ownership".
"cluster offline": the server hosting this deployment isn't connected. Same fix as above.
"route inconsistent": the bridge refused the route because the org IDs don't match (a server cert says one org; the route says another). This shouldn't happen in normal use — it implies either a server cert was issued for the wrong org, or the route table has stale data. Email support@agentry.run with the deploy URL.
Tunnel routes synced to bridge: stale
The deployment detail page shows the tunnel-health indicator in amber: "Tunnel sync stale (N s)".
What it means: the control plane has been unable to push route updates to the bridge for the last N seconds. Existing routes keep serving; new routes don't propagate.
Cause: usually a brief incident on the bridge or the control plane. Should clear within 30-60 seconds.
If it persists, your existing deploys keep serving the last-pushed route table. New deploys may 404 until sync resumes. Check our status page; if nothing is reported, email support.
When in doubt, restart
The single most reliable recovery for any tunnel issue:
docker restart agentry-runtime
agentry's runtime is designed to be safely restartable: sandboxes pause, the deploy containers keep running, and the connection re-establishes within 5 seconds.
Next
- Sandbox issues — when the sandbox is the problem, not the tunnel.
- MCP wiring — when the harness can't see agentry's tools.
- Add a server — initial connect flow, in case you need a clean slate.