Skip to content

Connection + tunnel

When the server itself can't reach agentry, or the tunnel keeps dropping.

Server shows "disconnected" in the dashboard

The server isn't holding an active tunnel to bridge.agentry.run.

First check: is the runtime container running?

bash
ssh your-server
docker ps | grep agentry-runtime

You should see one container with Up <duration>. If not, the container exited or never started.

Container exited

bash
docker logs agentry-runtime

Look for the last few lines. Most common causes:

  • "context deadline exceeded" — couldn't reach bridge.agentry.run. Network issue. See Outbound network below.
  • "x509: certificate signed by unknown authority" — the server's CA bundle is out of date. Update with apt update && apt install ca-certificates && update-ca-certificates.
  • "unauthorized: device cert revoked" — this server's identity was revoked in the dashboard. Re-enroll: in the dashboard, Add this machine → copy the new command → paste it on the server.

Container never started

The runtime container failed to start. Re-run the Add this machine command from the dashboard — it's idempotent and will recreate the container with a fresh identity.

Tunnel drops every few seconds

The dashboard flickers between connected / disconnected. Logs show repeated reconnects.

Common causes:

  • NAT / firewall idle timeout too short. Some corporate firewalls close idle connections after 5-10 minutes. The runtime sends keepalives every 30 seconds — that should be enough for any reasonable timeout, but aggressive policies trip it.
  • Server is at network capacity. A noisy neighbor on the same VM host. Check dmesg for "connection reset by peer" or "broken pipe".
  • DNS flapping. The runtime resolves bridge.agentry.run once at startup. If your server's DNS is unreliable (e.g. switching between two upstream resolvers), restart the runtime to force a re-resolve.

Fix: restart the runtime:

bash
docker restart agentry-runtime

If it keeps flapping, escalate — collect logs and send to support@agentry.run.

Outbound network

The runtime needs outbound HTTPS (TCP 443) to:

  • bridge.agentry.run — the persistent tunnel
  • app.agentry.run — control-plane RPC
  • ghcr.io — pulling the runtime image
  • Wherever your bound services live (MongoDB Atlas, Anthropic API, etc.)

Test from the server:

bash
curl -sSI https://bridge.agentry.run/healthz | head -1
# Expect: HTTP/2 200

If that fails:

  • DNS issue: nslookup bridge.agentry.run. Should resolve.
  • Firewall blocking 443: check iptables, your cloud provider's security group, corporate proxy rules.
  • TLS interception: a corporate MITM proxy is rewriting certs. agentry pins certs at the CA level — if your network rewrites the chain, you'll see "unknown authority" errors. Talk to your network admin.

"cluster <name> is offline"

Your harness or the dashboard says the cluster (server) is offline, but you think it's up.

cluster "homelab" is offline

Cause: there's no live tunnel from that server to the bridge. Either:

  • The server's runtime container isn't running.
  • The runtime is running but disconnected (the bridge dropped it).
  • You're addressing the wrong server name (typo).

Fix: SSH in and check docker ps. If the container is up but the dashboard says disconnected, restart it:

bash
docker restart agentry-runtime

Watch the logs as it reconnects:

bash
docker logs -f agentry-runtime

You should see "tunnel established" within ~5 seconds.

Deploy URL returns 502

*.agentry.live returns a 502 with a message like "cluster X is offline" or "deployment route is inconsistent with cluster ownership".

"cluster offline": the server hosting this deployment isn't connected. Same fix as above.

"route inconsistent": the bridge refused the route because the org IDs don't match (a server cert says one org; the route says another). This shouldn't happen in normal use — it implies either a server cert was issued for the wrong org, or the route table has stale data. Email support@agentry.run with the deploy URL.

Tunnel routes synced to bridge: stale

The deployment detail page shows the tunnel-health indicator in amber: "Tunnel sync stale (N s)".

What it means: the control plane has been unable to push route updates to the bridge for the last N seconds. Existing routes keep serving; new routes don't propagate.

Cause: usually a brief incident on the bridge or the control plane. Should clear within 30-60 seconds.

If it persists, your existing deploys keep serving the last-pushed route table. New deploys may 404 until sync resumes. Check our status page; if nothing is reported, email support.

When in doubt, restart

The single most reliable recovery for any tunnel issue:

bash
docker restart agentry-runtime

agentry's runtime is designed to be safely restartable — sandboxes pause, the deploy containers keep running, and the connection re-establishes within 5 seconds.

Next

  • Sandbox issues — when the sandbox is the problem, not the tunnel.
  • MCP wiring — when the harness can't see agentry's tools.
  • Add a server — initial connect flow, in case you need a clean slate.

agentry — run AI-built apps on your own hardware.