# Load Balancer Listeners

A listener defines how traffic on one protocol and port set is received and dispatched. It is the rule that says "connections arriving on `tcp/443` are distributed across *this* pool of backends using *this* algorithm, with *these* session-persistence and health-check settings."

A load balancer instance can carry many listeners — one per protocol/port combination you want to expose. Each listener is evaluated independently.

## What a Listener Defines

| Aspect                   | Values                                                                                                                                                                                |
| ------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Protocol**             | `TCP` or `UDP`                                                                                                                                                                        |
| **Port**                 | A single port, a comma-separated list, or a range `start-end`. For an all-ports listener, see [All-Ports Listener](/welcome/elastic-compute/load-balancing/07-all-ports-listener.md). |
| **Scheduling algorithm** | `mh` (consistent hash, default), `wrr` (weighted round robin), `lc` (least connections), `wlc` (weighted least connections)                                                           |
| **Session persistence**  | Off, or a timeout in seconds within the 60–3600 range                                                                                                                                 |
| **Idle timeout**         | How long a silent connection is kept tracked before being reaped                                                                                                                      |
| **Health check**         | See [Health Check](/welcome/elastic-compute/load-balancing/05-health-check.md)                                                                                                        |
| **Backend pool**         | One or more backend servers. See [Backend Servers](/welcome/elastic-compute/load-balancing/04-backend.md)                                                                             |

## Typical Use Cases

**Web or API front-end on TCP/443.** The most common pattern. A TCP listener on 443 accepts TLS-encrypted HTTPS traffic and forwards it to backends that terminate TLS themselves. The load balancer doesn't look inside the handshake; it just picks a backend and streams bytes.

**DNS, VoIP, or real-time gaming on UDP.** A UDP listener works the same way as a TCP listener, except that each "connection" is a 5-tuple flow tracked by the load balancer. Common for DNS (53), QUIC, and gaming/streaming traffic.

**Multiple ports sharing one backend pool.** If the same backends accept traffic on several ports (say an API on 8080 and health/metrics on 8081), a single listener with a port list covers both. You don't need one listener per port unless the scheduling or health-check settings differ.

**Port-range passthrough for passive-mode protocols.** Applications that accept connections on a range of ports (passive-mode FTP data, some media protocols) can be served by a listener with a port range. Every port in the range lands on the same backend pool.

***

## Protocols

TCP and UDP are supported. The load balancer is Layer 4 and passthrough — it does not parse application-layer protocols, and sessions terminate on the backend rather than on the load balancer.

* **TCP** — Full connection tracking with SYN/FIN/RST semantics. Backends see the real client IP as the source address; no configuration on the backend is required.
* **UDP** — Flow tracking keyed on the 5-tuple. Replies reach the originating client with the VIP as their apparent source, as described under Request / Reply Path. Backends see the real client IP as the source.

HTTP/HTTPS, gRPC, WebSocket, and QUIC all work *over* TCP or UDP listeners — the load balancer passes the bytes through and the backend handles the protocol. There is no HTTP-level routing, SNI inspection, or TLS termination at the load balancer itself.

## Port Configuration

Listener ports can be defined three ways:

| Form                 | Example       | Meaning                                |
| -------------------- | ------------- | -------------------------------------- |
| Single port          | `443`         | Only port 443                          |
| Comma-separated list | `80,443,8080` | Each listed port                       |
| Range                | `9000-9009`   | Every port from 9000 to 9009 inclusive |

For workloads that need a single rule covering every port on a protocol, see [All-Ports Listener](/welcome/elastic-compute/load-balancing/07-all-ports-listener.md) — a separate forwarding mode.

Within one instance, listeners must have distinct (protocol, port) combinations — the same port on the same protocol cannot appear in two listeners.
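
The three port forms and the uniqueness rule can be sketched in a few lines of Python. This is an illustration only: `parse_ports` and `find_conflicts` are hypothetical helper names, the 1–65535 bound is the standard TCP/UDP port range rather than a documented service limit, and the sketch accepts a list that mixes single ports and ranges, which the service may or may not allow.

```python
def parse_ports(spec: str) -> set[int]:
    """Expand a port spec such as "443", "80,443,8080", or "9000-9009"."""
    ports: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"invalid range: {part}")
            ports.update(range(start, end + 1))  # ranges are inclusive
        else:
            ports.add(int(part))
    for port in ports:
        if not 1 <= port <= 65535:  # assumption: standard port bounds
            raise ValueError(f"port out of range: {port}")
    return ports


def find_conflicts(listeners: list[tuple[str, str]]) -> list[tuple[str, int]]:
    """Return (protocol, port) pairs claimed by more than one listener."""
    seen: set[tuple[str, int]] = set()
    conflicts: list[tuple[str, int]] = []
    for protocol, spec in listeners:
        for port in sorted(parse_ports(spec)):
            key = (protocol.upper(), port)
            if key in seen:
                conflicts.append(key)
            seen.add(key)
    return conflicts
```

Running `find_conflicts` over a planned set of listeners before creating them is one way to catch an overlapping range early, for example a `400-450` TCP range that collides with an existing TCP listener on 443. The same port on TCP and UDP does not conflict, because the protocol is part of the key.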

***

## Scheduling Algorithms

The scheduler decides which healthy backend handles each new flow. Four algorithms are supported.

![Scheduling algorithms](/files/mPodxXVf3hspSUg2txNS)

### Consistent Hash (`mh`) — default

The connection's 5-tuple (source IP, source port, destination IP, destination port, protocol) is hashed to a backend. The same 5-tuple always lands on the same backend, as long as the backend pool hasn't changed. When a backend is added or removed, only the minimum necessary subset of flows is reshuffled.

**Use when:** sessions benefit from sticking to the same backend (cache warmth, in-memory state), or when you want a stable flow-to-backend mapping that survives minor pool changes.
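
The stability property can be illustrated with rendezvous (highest-random-weight) hashing. This is a stand-in chosen for brevity and is not necessarily the algorithm behind `mh`; the function names, the SHA-256 hash, and the sample addresses are all assumptions made for the sketch.

```python
import hashlib

# (source IP, source port, destination IP, destination port, protocol)
FiveTuple = tuple[str, int, str, int, str]


def _score(flow: FiveTuple, backend: str) -> int:
    """Deterministic pseudo-random score for a (flow, backend) pair."""
    key = "|".join(map(str, flow)) + "|" + backend
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


def pick_backend(flow: FiveTuple, backends: list[str]) -> str:
    """Map a flow to the backend with the highest score for that flow."""
    if not backends:
        raise ValueError("no healthy backends")
    return max(backends, key=lambda backend: _score(flow, backend))
```

With this scheme, the same 5-tuple maps to the same backend every time, and removing one backend moves only the flows that were assigned to it. Flows on the remaining backends keep their assignment, which is the "minimum necessary subset" behavior described above.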

### Weighted Round Robin (`wrr`)

Cycle through healthy backends in order. Each new connection goes to the next one in the list.

**Use when:** backends are identical in capacity and you want deterministic, even rotation. `wrr` ignores weights and current load.
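
A minimal sketch of the rotation as described here, with neither weights nor current load consulted. The class name `RoundRobin` and the list-of-names pool are hypothetical.

```python
class RoundRobin:
    """Hand each new connection to the next healthy backend in list order."""

    def __init__(self) -> None:
        self._next = 0

    def pick(self, healthy: list[str]) -> str:
        if not healthy:
            raise ValueError("no healthy backends")
        backend = healthy[self._next % len(healthy)]
        self._next += 1
        return backend
```

Six consecutive picks over a three-backend pool visit each backend twice, in order, regardless of how busy any of them is.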

### Least Connections (`lc`)

Pick the backend with the fewest currently active connections.

**Use when:** requests have variable duration — long-lived queries, streaming, WebSocket — so counting "next in line" would pile work on one backend while others sit idle.
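
The effect of variable request duration can be sketched with a small connection counter. The function names are hypothetical, and breaking ties by list order is an assumption; the service may break ties differently.

```python
def pick_least_connections(active: dict[str, int]) -> str:
    """Return the backend with the fewest active connections (ties: first listed)."""
    if not active:
        raise ValueError("no healthy backends")
    return min(active, key=lambda name: active[name])


def open_connection(active: dict[str, int]) -> str:
    """Schedule one new connection and record it against the chosen backend."""
    backend = pick_least_connections(active)
    active[backend] += 1
    return backend


def close_connection(active: dict[str, int], backend: str) -> None:
    """Record that a connection on the given backend has ended."""
    active[backend] -= 1
```

If backend `a` is holding a long-lived connection while a short request on `b` has already finished, the next call to `open_connection` goes to `b` again, whereas a strict rotation would have sent it to `a`.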

### Weighted Least Connections (`wlc`)

Least Connections, but adjusted by each backend's weight. A backend with weight 3 effectively has 3× the capacity of a weight-1 backend and accepts proportionally more traffic.

**Use when:** your pool is heterogeneous (larger VMs, newer generations, mixed instance families). `wlc` is usually the most forgiving algorithm for real-world pools.
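
One common way to apply weights is to compare active connections divided by weight and pick the lowest ratio. The sketch below assumes that rule, positive integer weights, and hypothetical names; the service's exact formula is not documented here.

```python
from fractions import Fraction


def pick_weighted_least_connections(
    active: dict[str, int], weights: dict[str, int]
) -> str:
    """Pick the backend with the lowest active/weight ratio (ties: first listed)."""
    if not active:
        raise ValueError("no healthy backends")
    # Fraction keeps the comparison exact; weights are assumed to be > 0.
    return min(active, key=lambda name: Fraction(active[name], weights[name]))


def simulate(weights: dict[str, int], connections: int) -> dict[str, int]:
    """Open the given number of connections, none of which ever closes."""
    active = {name: 0 for name in weights}
    for _ in range(connections):
        active[pick_weighted_least_connections(active, weights)] += 1
    return active
```

Calling `simulate({"big": 3, "small": 1}, 400)` leaves roughly three quarters of the connections on `big`, which matches the "3× the capacity" reading of a weight of 3.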

### Picking an Algorithm

* **Default to `mh`** unless you have a reason not to. It gives stable flow-to-backend mapping and works well when the pool changes over time.
* **Switch to `wlc`** if your backends differ in capacity and you want load proportional to capacity.
* **Use `wrr`** for identical pools where you want predictable rotation.
* **Use `lc`** when request durations vary widely on an otherwise-uniform pool.

Algorithms operate only on healthy backends. Unhealthy backends aren't "counted" as zero connections — they aren't considered at all.
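
The difference between "zero connections" and "not considered" shows up clearly with least connections. In this sketch the `Backend` class and `schedule_lc` function are hypothetical; the point is only that the health filter runs before the scheduler.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    name: str
    healthy: bool
    active_connections: int = 0


def schedule_lc(pool: list[Backend]) -> Backend:
    """Apply least connections to the healthy subset of the pool only."""
    candidates = [backend for backend in pool if backend.healthy]
    if not candidates:
        raise RuntimeError("no healthy backends")
    return min(candidates, key=lambda backend: backend.active_connections)
```

An unhealthy backend with zero connections would win a naive least-connections comparison; filtering first means it never enters the comparison at all.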

***

## Session Persistence

Session persistence (sticky sessions) makes subsequent connections from the same client land on the same backend, for as long as the persistence timeout holds.

| Setting             | Effect                                                                                                        |
| ------------------- | ------------------------------------------------------------------------------------------------------------- |
| Off                 | Each new flow is scheduled independently.                                                                     |
| Timeout (60–3600 s) | After a client's first connection, new connections from that client within the window go to the same backend. |

When the client is idle for longer than the timeout, the sticky association expires and the next connection is scheduled fresh.
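
The timeout behavior can be sketched as a small lookup table. Several details here are assumptions rather than documented behavior: the client is identified by source IP, each new connection refreshes the timer, and the `StickyTable` class and its `schedule` callback are hypothetical. What happens when the remembered backend turns unhealthy is not modeled.

```python
from typing import Callable


class StickyTable:
    """Remember which backend each client used, for a limited time."""

    def __init__(self, timeout_s: int) -> None:
        if not 60 <= timeout_s <= 3600:
            raise ValueError("timeout must be within 60-3600 seconds")
        self.timeout_s = timeout_s
        # client IP -> (backend, time of the client's last new connection)
        self._entries: dict[str, tuple[str, float]] = {}

    def route(self, client_ip: str, now: float, schedule: Callable[[], str]) -> str:
        entry = self._entries.get(client_ip)
        if entry is not None and now - entry[1] <= self.timeout_s:
            backend = entry[0]    # association still valid: reuse it
        else:
            backend = schedule()  # expired or unknown: schedule fresh
        self._entries[client_ip] = (backend, now)
        return backend
```

With a 60-second timeout, a client that reconnects after 30 seconds gets the same backend, while a client that stays silent for several minutes is scheduled from scratch on its next connection.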

**When to enable persistence:**

* The backend holds per-session state (cache, login, in-progress upload) that is expensive to move.
* Your application relies on a client hitting the same backend for consecutive requests, and you can't move that state to a shared store.

**When not to:**

* Stateless services (identical behavior regardless of backend) — persistence just reduces the benefit of load balancing.
* When you already use `mh` — consistent hash already gives 5-tuple stability without adding persistence on top.

## Idle Timeout

Each listener has an idle timeout: the length of time a connection with no packets in either direction stays tracked. When the timeout expires, the entry is cleared and any later packets for that flow no longer match it; the client has to open a new connection.

Pick a value that fits the expected quiet periods of your protocol. Short-lived request/response APIs can use a tight timeout (tens of seconds). Long-lived connections (WebSocket, database pools, SSH) need a longer timeout; otherwise a healthy connection will be reaped while it sits idle between messages.
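
A minimal model of idle tracking, assuming that any packet on a tracked flow resets its timer. The `FlowTable` class and its method names are hypothetical, and a real implementation would reap entries in the background rather than on lookup.

```python
from typing import Optional


class FlowTable:
    """Track flows and forget those that stay silent past the idle timeout."""

    def __init__(self, idle_timeout_s: float) -> None:
        self.idle_timeout_s = idle_timeout_s
        # flow key -> (backend, time the last packet was seen)
        self._flows: dict[tuple, tuple[str, float]] = {}

    def track(self, flow: tuple, backend: str, now: float) -> None:
        """Record a newly scheduled flow."""
        self._flows[flow] = (backend, now)

    def on_packet(self, flow: tuple, now: float) -> Optional[str]:
        """Return the tracked backend, or None if the flow is unknown or expired."""
        entry = self._flows.get(flow)
        if entry is None:
            return None
        backend, last_seen = entry
        if now - last_seen > self.idle_timeout_s:
            del self._flows[flow]  # reaped: later packets no longer match
            return None
        self._flows[flow] = (backend, now)  # activity resets the idle timer
        return backend
```

With a 30-second timeout, a flow that sends a packet every 25 seconds stays tracked indefinitely, while one that pauses for a minute is gone when its next packet arrives. An application-level keepalive shorter than the timeout is the usual way to keep a quiet long-lived connection alive.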

***

## Request / Reply Path

Clients send packets to the VIP. The load balancer selects a healthy backend using the listener's scheduler and forwards the packet through. The backend's reply goes **directly back to the client** without passing through the load balancer — only the forward path is in the load balancer's critical path. From the client's perspective, the reply appears to come from the VIP; the backend's address is never visible.

![Listener request/reply path](/files/oyVyP8FnVhWj4ZcEh7CF)

## Backend Pool

Each listener has its own backend pool. The same VM can appear in multiple listeners' pools, but each occurrence has its own weight and health state independent of the others — a backend can be healthy in one listener and unhealthy in another if they probe different ports.
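
The independence of pool entries can be pictured as each listener holding its own record per VM. The `Listener` and `PoolMember` classes below are a hypothetical data model for illustration, not the service's API.

```python
from dataclasses import dataclass, field


@dataclass
class PoolMember:
    vm_id: str
    weight: int = 1
    healthy: bool = True


@dataclass
class Listener:
    protocol: str
    ports: str
    # Each listener owns its own PoolMember records, keyed by VM id.
    pool: dict[str, PoolMember] = field(default_factory=dict)

    def add(self, vm_id: str, weight: int = 1) -> None:
        self.pool[vm_id] = PoolMember(vm_id, weight)


web = Listener("TCP", "443")
metrics = Listener("TCP", "8081")
web.add("vm-1", weight=3)
metrics.add("vm-1", weight=1)

# A failed probe on port 8081 affects only the metrics listener's record.
metrics.pool["vm-1"].healthy = False
```

After the last line, `vm-1` is still healthy with weight 3 in the `web` listener, because the two listeners never share a record.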

See [Backend Servers](/welcome/elastic-compute/load-balancing/04-backend.md) for the full configuration model.


