
# Best Practices

Short, opinionated recommendations. They codify choices that work for most services; the rest of the guide explains the knobs in detail.

## Instance Design

**Start with one instance per service, not per port.** A single instance can hold many listeners. Group everything a service exposes — web, API, admin, metrics — on one instance so it shares VIPs, whitelists, and security-group membership. Split only when lifecycles actually differ (different teams, different compliance scopes, different release cadences).

**Keep IPv4 and IPv6 as parallel instances.** Each instance is single-family. If you need dual-stack, create one instance per family with the same listeners and backends, and publish both addresses in DNS.

## Listener Configuration

**Default scheduler: `mh` (consistent hash).** 5-tuple stability is usually what you want. Connections from the same client stick to the same backend without an explicit persistence timeout.
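As a rough illustration of why hash-based scheduling pins clients without a persistence table, the sketch below uses rendezvous hashing as a simple stand-in. It is not the `mh` implementation; the choice of client IP as the hash key, the addresses, and the `pick_backend` helper are all assumptions made for the example.

```python
import hashlib

def pick_backend(client_key, backends):
    """Rendezvous hashing: score each backend for this key, keep the highest."""
    def score(backend):
        return hashlib.sha256(f"{client_key}|{backend}".encode()).digest()
    return max(backends, key=score)

# Hypothetical pool and clients (documentation address ranges).
backends = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
clients = [f"198.51.100.{i}" for i in range(1, 201)]

# Same key, same pool: the choice never changes, with no stored state.
before = {c: pick_backend(c, backends) for c in clients}

# Remove one backend: only clients that were on it get a new backend.
survivors = [b for b in backends if b != "10.0.0.3"]
after = {c: pick_backend(c, survivors) for c in clients}
moved = [c for c in clients if before[c] != after[c]]
```

Every client in `moved` was previously on the removed backend; clients on the two surviving backends keep their mapping, which is the property that makes membership changes less disruptive than with a plain modulo hash.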

**Switch to `wlc` when backends aren't identical.** With mixed instance sizes, rolling upgrades to a newer VM family, or deliberately unequal pools, `wlc` lets weights do the work.

**Enable session persistence only when necessary.** Persistence adds state you have to reason about. If your service is stateless, skip it. If it holds per-session state that's expensive to move, pair persistence with `mh` or `wlc`, not `wrr`.

**Set idle timeout to match your protocol's quiet periods.** HTTP APIs: tens of seconds. Long-lived websockets or database pools: hundreds of seconds to minutes. If clients complain about unexpected disconnects during quiet periods, the idle timeout is the usual culprit.

## Backend Configuration

**Prefer `backend port = 0` (same as listener port).** It's the simplest mental model and what most services expect. Reach for a distinct backend port only when you have to — e.g., the service internally listens on `8443` while clients hit `443`.

**Drain before removing.** Set weight to 0 first, wait for active connections to drain, then remove. Removing a backend with active connections still in the table causes mid-flow application errors.

**Keep pool size below the cap.** Large pools amplify health-check traffic and increase the reshuffle surface when you change membership.

**One load balancer per region.** Backends must share the load balancer's region. For a multi-region service, create one load balancer per region and steer clients at the DNS layer.

## Health Checks

**Use `HTTP_GET` against a dedicated `/healthz` endpoint for HTTP services.** `TCP_CHECK` only verifies the port accepts connections. A proper `/healthz` verifies the service can actually serve requests — database reachable, caches warm, dependencies up.
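A dedicated health endpoint can be very small. The sketch below uses only the Python standard library; `check_dependencies`, the port, and the convention that `200` means healthy and `503` means unhealthy are assumptions for illustration, so confirm which status codes the health check accepts before relying on it.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def check_dependencies():
    """Hypothetical probes; replace with cheap, bounded checks of real dependencies."""
    return {"database": True, "cache": True}

def healthz_status(checks):
    """Healthy only if every dependency check passed."""
    return 200 if all(checks.values()) else 503

class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/healthz":
            self.send_error(404)
            return
        checks = check_dependencies()
        body = json.dumps(checks).encode()
        self.send_response(healthz_status(checks))
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep frequent probe requests out of the access log

def serve(port=8080):
    """Blocks forever; run it in the service process or a side thread."""
    HTTPServer(("0.0.0.0", port), HealthHandler).serve_forever()
```

Keeping the pass/fail decision in `healthz_status` makes it easy to unit-test the rule separately from the HTTP plumbing.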

**Don't point health checks at user-facing routes.** User routes pull data, query databases, and can have variable latency. A dedicated health endpoint should be cheap, deterministic, and only fail when the service is actually unable to serve.

**Match `ConnectionTimeout` to real response time, with headroom.** A 3-second timeout is fine for fast endpoints but leaves little margin for services that occasionally hit 2-second tails, so a slightly slower probe registers as a failure. Measure your p99 health-check response time and add a margin.
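One way to turn measurements into a setting is sketched below. The nearest-rank p99, the `1.5` margin, and rounding up to whole seconds are assumptions for the example, not values taken from the product; `suggested_timeout` is a hypothetical helper.

```python
import math

def suggested_timeout(samples_s, margin=1.5):
    """Nearest-rank p99 of measured health-check latencies, times a safety margin."""
    if not samples_s:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples_s)
    rank = (99 * len(ordered) + 99) // 100  # ceil(0.99 * n) in integer arithmetic
    p99 = ordered[rank - 1]
    return math.ceil(p99 * margin)  # assumption: the timeout is set in whole seconds

# Hypothetical measurements: mostly fast, with a slow tail.
samples = [0.1] * 98 + [1.8, 2.0]
timeout_s = suggested_timeout(samples)
```

Collect the samples from the same network path the health checker uses; latencies measured on the backend itself understate what the checker sees.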

**Raise `RetryCount` in flaky environments.** In environments with noisy networking or occasional GC pauses, `RetryCount = 1` produces too many false-flap removals. Two or three retries smooth over transient blips without materially delaying real failure detection.
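The effect is easy to see in a small simulation. It assumes `RetryCount` means "consecutive failed probes before the backend is removed" and that one successful probe readmits it; check both against the health-check reference before drawing conclusions. `removals` is a hypothetical helper.

```python
def removals(probe_results, retry_count):
    """Count how often a backend is marked down for a sequence of probe outcomes."""
    consecutive_failures = 0
    down = False
    count = 0
    for ok in probe_results:
        if ok:
            consecutive_failures = 0
            down = False  # assumption: a single success readmits the backend
        else:
            consecutive_failures += 1
            if consecutive_failures >= retry_count and not down:
                down = True
                count += 1
    return count

# Two isolated blips followed by one real outage of three probes.
probes = [True, False, True, True, False, True, False, False, False, True]
flaps_with_1 = removals(probes, retry_count=1)
flaps_with_3 = removals(probes, retry_count=3)
```

With a threshold of one, each isolated blip removes the backend; with a threshold of three, only the sustained outage does, at the cost of two extra probe intervals before it is detected.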

**Leave `HealthTreatFailure = 0` (hard fail) by default.** Only switch to "treat all as healthy" for services where going dark is worse than serving potentially broken responses — and make sure you have alerting that fires on the condition, because the symptom won't be obvious from the service's perspective.

## Operational Patterns

**Rolling backend replacement.**

1. Add the new backend to the pool. Wait for it to go healthy.
2. Set the old backend's weight to 0 (drain). Wait for active connections to drain.
3. Remove the old backend.

Repeat per backend. No dropped connections, no client retries.
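The ordering matters more than the tooling. The sketch below expresses the three steps against a hypothetical client object; `add_backend`, `is_healthy`, `set_weight`, `active_connections`, and `remove_backend` are invented names for illustration, not the Zenlayer API, and `FakeLB` is an in-memory stand-in so the example runs on its own.

```python
import time

def replace_backend(lb, old, new, poll_s=5.0, sleep=time.sleep):
    """Swap `old` for `new`: add, wait healthy, drain, then remove."""
    lb.add_backend(new)
    while not lb.is_healthy(new):
        sleep(poll_s)
    lb.set_weight(old, 0)  # drain: no new flows, existing ones continue
    while lb.active_connections(old) > 0:
        sleep(poll_s)
    lb.remove_backend(old)

class FakeLB:
    """In-memory stand-in for a real load balancer client."""
    def __init__(self, connections):
        self.calls = []
        self.connections = dict(connections)
        self.health_probes = 0

    def add_backend(self, name):
        self.calls.append(("add", name))

    def is_healthy(self, name):
        self.health_probes += 1
        return self.health_probes >= 2  # becomes healthy on the second poll

    def set_weight(self, name, weight):
        self.calls.append(("weight", name, weight))

    def active_connections(self, name):
        # Simulate flows finishing: one fewer connection per poll.
        self.connections[name] = max(0, self.connections[name] - 1)
        return self.connections[name]

    def remove_backend(self, name):
        self.calls.append(("remove", name))

lb = FakeLB({"old-1": 3})
replace_backend(lb, "old-1", "new-1", sleep=lambda _s: None)
```

A production script would also want a deadline on both wait loops, so that a backend that never turns healthy or a connection that never closes does not stall the rollout indefinitely.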

**Maintenance on a single backend.**

1. Drop weight to 0. Confirm active connections reach zero.
2. Take the VM down, apply changes, bring it back.
3. Restore weight. Health checks will readmit it after one success.

**Changing scheduler.** Switching between algorithms reshuffles flow-to-backend mapping (except for `mh` → `mh-with-port`, which is surgical). Schedule scheduler changes during off-peak if your service is sensitive to mid-flow resets.

**Session persistence + scheduler.** If you enable persistence on a `wrr` or `lc` listener, new flows from a client within the persistence window go to the same backend. When the window expires, a fresh scheduling decision may send the next flow to a different backend. Pick a timeout that spans the expected "session" lifetime from your service's perspective, not an arbitrary round number.
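A toy model makes the window behaviour concrete. It uses plain round robin as a stand-in scheduler and assumes each new flow restarts the client's window; whether the real persistence timer refreshes this way is not stated here, so treat that as an assumption. `PersistentListener` is a hypothetical class.

```python
import itertools

class PersistentListener:
    """Round-robin scheduling with a per-client persistence window."""
    def __init__(self, backends, timeout_s):
        self._scheduler = itertools.cycle(backends)
        self.timeout_s = timeout_s
        self._table = {}  # client -> (backend, expires_at)

    def route(self, client, now):
        entry = self._table.get(client)
        if entry is not None and now < entry[1]:
            backend = entry[0]  # inside the window: reuse the pinned backend
        else:
            backend = next(self._scheduler)  # expired or unknown: schedule afresh
        # Assumption: every new flow restarts the window for this client.
        self._table[client] = (backend, now + self.timeout_s)
        return backend

listener = PersistentListener(["a", "b", "c"], timeout_s=60)
first = listener.route("client-1", now=0)
within = listener.route("client-1", now=30)   # still inside the window
other = listener.route("client-2", now=40)    # different client, fresh decision
expired = listener.route("client-1", now=200)  # window long gone
```

Here `within` matches `first`, while `expired` lands wherever the scheduler happens to point next. If a session in your service can stay quiet for longer than the timeout, that last case is the one your users will hit.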

## Troubleshooting

Match a symptom to its most common causes. Color shows which stage of the path to check first.

![Symptom-to-cause map across the LB path](/files/njAShcEkh2IySrmbdBI9)