Load Balancer Listeners
A listener defines how traffic on one protocol and port set is received and dispatched. It is the rule that says "connections arriving on tcp/443 are distributed across this pool of backends using this algorithm, with these session-persistence and health-check settings."
A load balancer instance can carry many listeners — one per protocol/port combination you want to expose. Each listener is evaluated independently.
What a Listener Defines
Protocol: TCP or UDP.
Port: A single port, a comma-separated list, or a range written start-end. For an all-ports listener, see All-Ports Listener.
Scheduling algorithm: mh (consistent hash, default), wrr (Weighted Round Robin), lc (Least Connections), or wlc (Weighted Least Connections).
Session persistence: Off, or a timeout in seconds within the 60–3600 range.
Idle timeout: How long a silent connection is kept tracked before being reaped.
Health check: See Health Check.
Backend pool: One or more backend servers. See Backend Servers.
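The settings above can be pictured as one record per listener. The following is a minimal sketch, not the product's API: the `Listener` class, its field names, and the rule that the idle timeout must be positive are assumptions made for illustration. Health-check settings are left out because they are documented separately.

```python
from dataclasses import dataclass
from typing import List, Optional

PROTOCOLS = {"tcp", "udp"}
ALGORITHMS = {"mh", "wrr", "lc", "wlc"}


@dataclass
class Listener:
    """Hypothetical model of one listener's settings (not the product's API)."""

    protocol: str                        # "tcp" or "udp"
    ports: str                           # "443", "80,443,8080", or "9000-9009"
    idle_timeout_s: int                  # seconds a silent connection stays tracked
    backends: List[str]                  # one or more backend servers
    algorithm: str = "mh"                # consistent hash is the documented default
    persistence_s: Optional[int] = None  # None means persistence is off

    def __post_init__(self) -> None:
        if self.protocol not in PROTOCOLS:
            raise ValueError(f"unsupported protocol: {self.protocol!r}")
        if self.algorithm not in ALGORITHMS:
            raise ValueError(f"unknown scheduling algorithm: {self.algorithm!r}")
        if self.persistence_s is not None and not 60 <= self.persistence_s <= 3600:
            raise ValueError("persistence timeout must be within 60-3600 seconds")
        if self.idle_timeout_s <= 0:  # assumed rule; the source states no bounds
            raise ValueError("idle timeout must be positive")
        if not self.backends:
            raise ValueError("a listener needs at least one backend")
```

A record such as `Listener("tcp", "443", 60, ["10.0.0.1"])` takes the default algorithm with persistence off, while a persistence timeout of 30 seconds is rejected because it falls outside the documented range.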
Typical Use Cases
Web or API front-end on TCP/443. The most common pattern. A TCP listener on 443 accepts TLS-encrypted HTTPS traffic and forwards it to backends that terminate TLS themselves. The load balancer doesn't look inside the handshake; it just picks a backend and streams bytes.
DNS, VoIP, or real-time gaming on UDP. A UDP listener works the same way but over UDP — each "connection" is a 5-tuple flow tracked by the load balancer. Common for DNS (53), QUIC, and gaming/streaming traffic.
Multiple ports sharing one backend pool. If the same backends accept traffic on several ports (say an API on 8080 and health/metrics on 8081), a single listener with a port list covers both. You don't need one listener per port unless the scheduling or health-check settings differ.
Port-range passthrough for multi-port protocols. Applications that accept connections on a range of ports (passive-mode FTP data, some media protocols) can be served by a listener with a port range. Every port in the range lands on the same backend pool.
Protocols
TCP and UDP are supported. The load balancer is Layer 4 and passthrough — it does not parse application-layer protocols, and sessions terminate on the backend rather than on the load balancer.
TCP — Full connection tracking with SYN/FIN/RST semantics. Backends see the real client IP as the source address; no configuration on the backend is required.
UDP — Flow tracking keyed on the 5-tuple. Replies are routed back to the same client through the load balancer. Backends see the real client IP as the source.
HTTP/HTTPS, gRPC, WebSocket, and QUIC all work over TCP or UDP listeners — the load balancer passes the bytes through and the backend handles the protocol. There is no HTTP-level routing, SNI inspection, or TLS termination at the load balancer itself.
Port Configuration
Listener ports can be defined three ways:
Single port: 443 matches only port 443.
Comma-separated list: 80,443,8080 matches each listed port.
Range: 9000-9009 matches every port from 9000 to 9009 inclusive.
For workloads that need a single rule covering every port on a protocol, see All-Ports Listener — a separate forwarding mode.
Within one instance, listeners must have distinct (protocol, port) combinations — the same port on the same protocol cannot appear in two listeners.
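The three port formats and the uniqueness rule can be sketched as follows. This is an illustration under stated assumptions, not the product's validation code: the function names `parse_ports` and `find_conflicts` are hypothetical, and the sketch assumes a spec is exactly one of the three documented forms (it does not mix lists and ranges).

```python
from typing import List, Set, Tuple


def parse_ports(spec: str) -> Set[int]:
    """Expand "443", "80,443,8080", or "9000-9009" into a set of port numbers."""
    spec = spec.strip()
    if "," in spec:
        ports = {int(p) for p in spec.split(",")}
    elif "-" in spec:
        start, end = (int(p) for p in spec.split("-"))
        if start > end:
            raise ValueError(f"range start exceeds end: {spec!r}")
        ports = set(range(start, end + 1))  # the range is inclusive at both ends
    else:
        ports = {int(spec)}
    for port in ports:
        if not 1 <= port <= 65535:
            raise ValueError(f"port out of range: {port}")
    return ports


def find_conflicts(listeners: List[Tuple[str, str]]) -> List[Tuple[str, int]]:
    """Return every (protocol, port) pair claimed by more than one listener."""
    seen: Set[Tuple[str, int]] = set()
    conflicts: List[Tuple[str, int]] = []
    for protocol, spec in listeners:
        for port in sorted(parse_ports(spec)):
            key = (protocol.lower(), port)
            if key in seen:
                conflicts.append(key)
            seen.add(key)
    return conflicts
```

For example, a TCP listener on `80,443` and a TCP listener on `440-445` collide on tcp/443, while a UDP listener on `443` does not collide with either, because the protocol is part of the key.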
Scheduling Algorithms
The scheduler decides which healthy backend handles each new flow. Four algorithms are supported.
Consistent Hash (mh) — default
The connection's 5-tuple (source IP, source port, destination IP, destination port, protocol) is hashed to a backend. The same 5-tuple always lands on the same backend, as long as the backend pool hasn't changed. When a backend is added or removed, only the minimum necessary subset of flows is reshuffled.
Use when: sessions benefit from sticking to the same backend (cache warmth, in-memory state), or when you want a stable flow-to-backend mapping that survives minor pool changes.
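To make the "minimum necessary reshuffle" property concrete, here is a small sketch that uses rendezvous (highest-random-weight) hashing. This is a stand-in technique chosen for brevity; the source does not say which hash construction mh uses internally, and the `pick_backend` name and the addresses in the usage note are hypothetical.

```python
import hashlib
from typing import List, Tuple

# Source IP, source port, destination IP, destination port, protocol.
FiveTuple = Tuple[str, int, str, int, str]


def pick_backend(flow: FiveTuple, backends: List[str]) -> str:
    """Rendezvous hashing: score each backend against the flow, keep the highest."""
    if not backends:
        raise ValueError("no healthy backends")
    flow_key = "|".join(str(part) for part in flow)

    def score(backend: str) -> int:
        digest = hashlib.sha256(f"{flow_key}#{backend}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return max(backends, key=score)
```

Calling `pick_backend(("198.51.100.7", 40000, "203.0.113.10", 443, "tcp"), pool)` twice with the same pool returns the same backend. If one backend is removed from `pool`, only the flows that had been mapped to that backend move; every other flow keeps its previous backend, because its highest-scoring backend is still present.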
Weighted Round Robin (wrr)
Cycles through healthy backends in order. Each new connection goes to the next one in the list.
Use when: backends are identical in capacity and you want deterministic, even rotation. wrr ignores weights and current load.
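The rotation described above can be sketched in a few lines. This is an illustrative model of the behavior as documented here (plain rotation over the healthy list), not the load balancer's implementation; the `RoundRobin` class is hypothetical.

```python
from typing import List


class RoundRobin:
    """Hands out healthy backends in list order, wrapping around at the end."""

    def __init__(self) -> None:
        self._next = 0  # index of the next backend to use

    def pick(self, healthy: List[str]) -> str:
        if not healthy:
            raise ValueError("no healthy backends")
        backend = healthy[self._next % len(healthy)]
        self._next += 1
        return backend
```

Six consecutive picks over the healthy list `["a", "b", "c"]` yield a, b, c, a, b, c regardless of how busy each backend currently is.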
Least Connections (lc)
Picks the backend with the fewest currently active connections.
Use when: requests have variable duration — long-lived queries, streaming, WebSocket — so counting "next in line" would pile work on one backend while others sit idle.
Weighted Least Connections (wlc)
Least Connections, but adjusted by each backend's weight. A backend with weight 3 effectively has 3× the capacity of a weight-1 backend and accepts proportionally more traffic.
Use when: your pool is heterogeneous: larger VMs, newer generations, mixed instance families. wlc is usually the most forgiving algorithm for real-world pools.
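Both least-connections variants can be sketched with one function. The exact formula the load balancer uses is not given in this document; the sketch assumes the common approach of comparing active connections divided by weight, with ties going to the first backend listed. The function name is hypothetical, and weights are assumed to be positive integers.

```python
from fractions import Fraction
from typing import Dict


def pick_weighted_least_conn(active: Dict[str, int], weights: Dict[str, int]) -> str:
    """Pick the healthy backend with the lowest connections-to-weight ratio.

    `active` holds only healthy backends; unhealthy ones are simply absent.
    With every weight set to 1 this reduces to plain least connections.
    """
    if not active:
        raise ValueError("no healthy backends")
    # Fraction keeps the comparison exact; min() returns the first minimum on ties.
    return min(active, key=lambda b: Fraction(active[b], weights[b]))
```

Under these assumptions, starting from zero connections with weights `{"big": 3, "small": 1}` and no connections closing, 40 new connections split 30 to `big` and 10 to `small`, matching the 3:1 weight ratio.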
Picking an Algorithm
Default to mh unless you have a reason not to. It gives stable flow-to-backend mapping and works well when the pool changes over time.
Switch to wlc if your backends differ in capacity and you want load proportional to capacity.
Use wrr for identical pools where you want predictable rotation.
Use lc when request durations vary widely on an otherwise-uniform pool.
Algorithms operate only on healthy backends. Unhealthy backends aren't "counted" as zero connections — they aren't considered at all.
Session Persistence
Session persistence (sticky sessions) makes subsequent connections from the same client land on the same backend, for as long as the persistence timeout holds.
Off: Each new flow is scheduled independently.
Timeout (60–3600 s): After a client's first connection, new connections from that client within the window go to the same backend.
When the client is idle for longer than the timeout, the sticky association expires and the next connection is scheduled fresh.
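The expiry behavior can be modeled as a small lookup table. This is a sketch under assumptions: it identifies a client by source IP and restarts the idle timer on every new connection, neither of which is spelled out in this document. The `PersistenceTable` class is hypothetical, and it does not model what happens when the remembered backend becomes unhealthy.

```python
import time
from typing import Callable, Dict, Tuple


class PersistenceTable:
    """Maps a client IP to its sticky backend until the entry goes idle."""

    def __init__(self, timeout_s: int, clock: Callable[[], float] = time.monotonic) -> None:
        if not 60 <= timeout_s <= 3600:
            raise ValueError("persistence timeout must be within 60-3600 seconds")
        self.timeout_s = timeout_s
        self._clock = clock
        # client IP -> (backend, time of the client's most recent connection)
        self._entries: Dict[str, Tuple[str, float]] = {}

    def backend_for(self, client_ip: str, schedule: Callable[[], str]) -> str:
        """Return the sticky backend, or call `schedule` if none is remembered."""
        now = self._clock()
        entry = self._entries.get(client_ip)
        if entry is not None and now - entry[1] <= self.timeout_s:
            backend = entry[0]    # association still valid
        else:
            backend = schedule()  # first contact or expired: schedule fresh
        self._entries[client_ip] = (backend, now)  # restart the idle timer
        return backend
```

The `schedule` callback stands for whichever scheduling algorithm the listener uses; it is consulted only when no live association exists for the client.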
When to enable persistence:
The backend holds per-session state (cache, login, in-progress upload) that is expensive to move.
Your application relies on a client hitting the same backend for consecutive requests, and you can't move that state to a shared store.
When not to:
Stateless services (identical behavior regardless of backend) — persistence just reduces the benefit of load balancing.
When you already use mh: consistent hash already gives 5-tuple stability without adding persistence on top.
Idle Timeout
Each listener has an idle timeout — the amount of time a connection with no packets in either direction is kept tracked. When it expires, the entry is cleared and subsequent packets (if they arrive) no longer match; the flow has to start a new connection.
Pick a value that fits the expected quiet periods of your protocol. Short-lived request/response APIs can use a tight timeout (tens of seconds). Long-lived connections (WebSocket, database pools, SSH) need a longer timeout; with a value that is too short, a healthy connection is reaped while it is idle between messages.
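The effect of the idle timeout on tracked connections can be sketched as a table keyed on the 5-tuple. This is a simplified model, not the load balancer's connection tracker: the `FlowTable` class and its method names are hypothetical, and timestamps are passed in explicitly so the behavior is easy to follow.

```python
from typing import Dict, Optional, Tuple

# Source IP, source port, destination IP, destination port, protocol.
Flow = Tuple[str, int, str, int, str]


class FlowTable:
    """Tracks which backend owns each flow and forgets flows that go silent."""

    def __init__(self, idle_timeout_s: float) -> None:
        self.idle_timeout_s = idle_timeout_s
        self._flows: Dict[Flow, Tuple[str, float]] = {}  # flow -> (backend, last packet time)

    def track(self, flow: Flow, backend: str, now: float) -> None:
        self._flows[flow] = (backend, now)

    def lookup(self, flow: Flow, now: float) -> Optional[str]:
        """Return the flow's backend and refresh its timer, or None if unknown or expired."""
        entry = self._flows.get(flow)
        if entry is None:
            return None
        backend, last_seen = entry
        if now - last_seen > self.idle_timeout_s:
            del self._flows[flow]  # silent for too long: the entry is cleared
            return None
        self._flows[flow] = (backend, now)  # any packet resets the idle timer
        return backend

    def reap(self, now: float) -> int:
        """Remove every flow idle longer than the timeout; return how many were removed."""
        stale = [f for f, (_, seen) in self._flows.items() if now - seen > self.idle_timeout_s]
        for flow in stale:
            del self._flows[flow]
        return len(stale)
```

With a 30-second timeout, a flow that sends a packet every 20 seconds stays tracked indefinitely, while a flow that pauses for 31 seconds is dropped and its next packet no longer matches an entry.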
Request / Reply Path
Clients send packets to the VIP. The load balancer selects a healthy backend using the listener's scheduler and forwards the packet through. The backend's reply goes directly back to the client without passing through the load balancer — only the forward path is in the load balancer's critical path. From the client's perspective, the reply appears to come from the VIP; the backend's address is never visible.
Backend Pool
Each listener has its own backend pool. The same VM can appear in multiple listeners' pools, but each occurrence has its own weight and health state independent of the others — a backend can be healthy in one listener and unhealthy in another if they probe different ports.
See Backend Servers for the full configuration model.