# Load Balancing Overview

## What Is Load Balancer?

Load Balancer is the managed traffic entry point for services running inside your VPC. It takes one virtual address — the VIP — and spreads connections arriving on that address across a pool of backend servers, so no single server has to carry the entire load.

The service operates at **Layer 4** (TCP and UDP) in **passthrough** mode: connections terminate on the backend, not on the load balancer. The load balancer's job is to pick a healthy backend for each new flow, forward the packet through, and track the flow so that later packets of the same connection reach the same backend.

A load balancer instance bundles three things:

* **VIPs** — one or more Elastic IPs (IPv4 or IPv6) that clients connect to.
* **Listeners** — protocol + port + scheduler configurations that define how traffic on each port is dispatched.
* **Backend servers** — the VMs behind the VIP, each with a weight and a health status.
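
As a rough mental model, the three parts can be pictured as plain records. The sketch below is illustrative only: the class and field names are hypothetical and are not the product's API, and attaching the backend pool to the instance rather than to each listener is an assumption made to mirror the list above. The addresses come from reserved documentation ranges.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Listener:
    protocol: str   # "tcp" or "udp"
    port: int
    scheduler: str  # scheduler name, e.g. "mh"


@dataclass
class Backend:
    private_ip: str
    weight: int = 1
    healthy: bool = True


@dataclass
class LoadBalancer:
    # Hypothetical model: one instance holds VIPs, listeners, and backends.
    vips: List[str] = field(default_factory=list)
    listeners: List[Listener] = field(default_factory=list)
    backends: List[Backend] = field(default_factory=list)

    def find_listener(self, protocol: str, port: int) -> Optional[Listener]:
        """Return the listener matching protocol + port, or None."""
        for listener in self.listeners:
            if (listener.protocol, listener.port) == (protocol, port):
                return listener
        return None


lb = LoadBalancer(
    vips=["203.0.113.10", "2001:db8::10"],
    listeners=[Listener("tcp", 443, "mh")],
    backends=[Backend("10.0.0.11", weight=2), Backend("10.0.0.12")],
)
```

Traffic on a protocol and port with no matching listener has nowhere to go, which is why `find_listener` returns `None` in that case.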

![Load Balancer Architecture](/files/gB7vEWYXbmDNmwzfanlH)

***

## When to Use Load Balancer

A quick map of when Load Balancer fits the job and when to reach for something else.

| Need / Scenario                                                              | Load Balancer?        | Notes                                                                                                |
| ---------------------------------------------------------------------------- | --------------------- | ---------------------------------------------------------------------------------------------------- |
| One server can't carry the load — spread it across many backends             | ✅ Yes                 | Core use case. Pick a scheduler — `mh` for sticky 5-tuples, `wlc` for mixed-capacity pools.          |
| Backend crashes, reboots, or fails a deploy — clients shouldn't notice       | ✅ Yes                 | Paired health checks isolate failing backends and re-admit them on recovery.                         |
| Backend pool churns (autoscale, rolling deploys, replacements)               | ✅ Yes                 | The VIP is stable; clients and DNS stay pointed at it while the pool changes underneath.             |
| Hide private backends from the internet behind one public address            | ✅ Yes                 | The VIP (an EIP) is the only public surface; backends stay on private IPs inside the VPC.            |
| Scale a stateless service horizontally — web front-ends, APIs, microservices | ✅ Yes                 | Standard pattern. Adding replicas increases capacity without changing how clients connect.           |
| HTTP path routing, SNI dispatch, TLS termination, or WAF                     | ❌ No                  | Layer 4 only. Put an application-layer proxy *behind* the VIP for these.                             |
| Single VM on a single port, no pool, no health check                         | ❌ Overkill            | Use **NAT Gateway DNAT** for one-to-one public-port → private-endpoint mapping with less setup.      |
| Bidirectional connectivity between networks (subnet ↔ subnet, VPC ↔ on-prem) | ❌ No                  | Use **Border Gateway**. Load Balancer is a service entry point, not a router.                        |
| Internal-only / private VIP                                                  | ❌ Not supported today | All VIPs are public Elastic IPs. For private endpoints, use direct VPC connectivity to the backends. |

***

## How It Works

![Request flow](/files/oyVyP8FnVhWj4ZcEh7CF)

1. The client opens a connection to the VIP.
2. The load balancer matches the packet to a listener (protocol + port) and picks a healthy backend using the listener's scheduling algorithm.
3. The load balancer forwards the packet to the chosen backend. The backend sees the real client IP as the source — no special configuration on the backend side.
4. The backend replies **directly to the client** — the reply does not go back through the load balancer. From the client's perspective, the response appears to come from the VIP.

Only the forward path traverses the load balancer; return traffic bypasses it entirely. This keeps the load balancer off the reply hot path and lets backend bandwidth scale with the pool size.
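
The forward path described above can be sketched as a small dispatcher. This is a minimal illustration, not the service's implementation: the `Dispatcher` class and its fields are hypothetical, the hash-modulo selection is a simplified stand-in for a real scheduling algorithm, and re-selecting a backend when a pinned one turns unhealthy is an assumption.

```python
import hashlib
from typing import Dict, Optional, Set, Tuple

# (protocol, src_ip, src_port, dst_ip, dst_port)
FlowKey = Tuple[str, str, int, str, int]


class Dispatcher:
    """Hypothetical forward-path model: match a listener, pick a backend, remember the flow."""

    def __init__(self, listeners: Set[Tuple[str, int]], backends: Dict[str, bool]):
        self.listeners = listeners          # {(protocol, port)}
        self.backends = backends            # {private_ip: healthy?}
        self.flows: Dict[FlowKey, str] = {}  # flow -> chosen backend

    def forward(self, flow: FlowKey) -> Optional[str]:
        protocol, _src_ip, _src_port, _dst_ip, dst_port = flow
        if (protocol, dst_port) not in self.listeners:
            return None  # no listener for this protocol + port

        # Keep an existing flow on its backend while that backend is healthy.
        pinned = self.flows.get(flow)
        if pinned is not None and self.backends.get(pinned, False):
            return pinned

        healthy = sorted(ip for ip, ok in self.backends.items() if ok)
        if not healthy:
            self.flows.pop(flow, None)
            return None

        # Simplified selection: stable hash of the 5-tuple, modulo pool size.
        digest = hashlib.sha256(repr(flow).encode()).digest()
        chosen = healthy[int.from_bytes(digest[:8], "big") % len(healthy)]
        self.flows[flow] = chosen
        return chosen
```

Note that the model has no reply path at all: the backend answers the client directly, so the dispatcher only ever sees client-to-backend packets.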

Unhealthy backends are pulled out of scheduling automatically. When a backend recovers, it comes back into rotation on the next successful probe.
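
A minimal sketch of that state machine follows. The `HealthTracker` class and the `fail_threshold` parameter are hypothetical: this page does not say how many failed probes remove a backend, so the threshold is an assumption. Only the re-admission on a single successful probe comes from the description above.

```python
from typing import Dict


class HealthTracker:
    """Hypothetical probe bookkeeping for a pool of backends."""

    def __init__(self, fail_threshold: int = 3):
        # Assumption: consecutive failures needed before a backend is removed.
        self.fail_threshold = fail_threshold
        self.failures: Dict[str, int] = {}
        self.healthy: Dict[str, bool] = {}

    def record_probe(self, backend: str, success: bool) -> bool:
        """Record one probe result and return whether the backend is schedulable."""
        if success:
            # One successful probe puts the backend back into rotation.
            self.failures[backend] = 0
            self.healthy[backend] = True
        else:
            self.failures[backend] = self.failures.get(backend, 0) + 1
            if self.failures[backend] >= self.fail_threshold:
                self.healthy[backend] = False
        # Backends with no verdict yet are treated as healthy (assumption).
        return self.healthy.get(backend, True)
```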

***

## Frequently Asked Questions

**Does Load Balancer handle HTTPS?** Not as a TLS-terminating product. A TCP listener can carry HTTPS connections end-to-end — the load balancer passes the bytes to a backend that terminates TLS itself. The load balancer does not look inside the TLS handshake, so SNI-based routing is not available.

**Can my backends see the real client IP?** Yes. The load balancer is passthrough, so backends see the real client IP as the source address — no configuration needed on the backend side. This works for both TCP and UDP listeners.
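
In practice this means a backend can read the peer address of the accepted socket as usual. The sketch below is a generic Python TCP server, not product-specific code; `ReportPeerHandler` and `demo` are illustrative names. Run on loopback, the peer is simply the local test client. Behind the load balancer, per the answer above, the same call would return the real client address.

```python
import socket
import socketserver
import threading


class ReportPeerHandler(socketserver.BaseRequestHandler):
    def handle(self):
        # client_address is the source IP and port of the TCP connection.
        ip, port = self.client_address[:2]
        self.request.sendall(f"{ip}:{port}".encode())


def demo():
    """Connect over loopback and compare what the server saw with the client's own address."""
    with socketserver.TCPServer(("127.0.0.1", 0), ReportPeerHandler) as server:
        worker = threading.Thread(target=server.handle_request)
        worker.start()
        with socket.create_connection(server.server_address) as client:
            local_ip, local_port = client.getsockname()[:2]
            reported = client.recv(1024).decode()
        worker.join()
    return reported, f"{local_ip}:{local_port}"
```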

**Can one load balancer have multiple VIPs?** Yes. An instance can hold one or more IPv4 and/or IPv6 VIPs, and the same listeners apply to all of them. You can also configure multiple listeners on the same instance for different ports.

**Do backends need to be in the same VPC as the load balancer?** Yes. Backends are referenced by their private IP inside the same global VPC, and must currently be in the same region as the load balancer instance. Cross-region backends are not supported today — create a separate load balancer per region.

**What happens when all backends fail a health check?** By default, the listener stops sending traffic (hard fail). You can change this to *treat all as healthy* so requests still reach the backends while you investigate — useful when a false-positive health check would otherwise black-hole the service. See [Health Check](/welcome/elastic-compute/load-balancing/05-health-check.md).
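
The two behaviours can be summarised in a few lines. This is a sketch only: the function name and the policy strings `"hard_fail"` and `"treat_all_as_healthy"` are hypothetical labels for the two options described above, not actual configuration values.

```python
from typing import Dict, List


def schedulable(backends: Dict[str, bool], all_down_policy: str = "hard_fail") -> List[str]:
    """Return the backends eligible for new flows, given {private_ip: healthy?}."""
    healthy = [ip for ip, ok in backends.items() if ok]
    if healthy:
        return healthy
    if all_down_policy == "treat_all_as_healthy":
        # Every backend failed its check: keep sending traffic to all of them.
        return list(backends)
    # Hard fail: no eligible backend, so the listener stops sending traffic.
    return []
```

The policy only matters when the healthy set is empty. While at least one backend passes its check, both settings behave identically.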

**Is the load balancer highly available?** Yes. Each instance is internally redundant, so the VIP stays reachable without any pairing, active/standby configuration, or keepalive setup on your part. To you and to your clients, the load balancer is a single service endpoint.

