High Availability K3s Cluster Setup
This guide walks you through setting up a highly available K3s cluster that uses embedded etcd for a true HA configuration. We'll use Keepalived and HAProxy with a floating IP address to load balance the API server and eliminate single points of failure.
Architecture Overview
Our HA K3s setup consists of:
- 3 control plane nodes running K3s server with embedded etcd
- 3 worker nodes for running workloads
- 2 dedicated load balancer nodes running Keepalived and HAProxy
- Keepalived for floating IP management and failover
- HAProxy for load balancing K3s API server traffic
- Floating IP (`192.168.1.10`) to eliminate single points of failure
Network Architecture Diagram
```
Clients (kubectl, k9s, apps)
            │
            ▼
Floating IP (VIP)  192.168.1.10:6443  (managed by Keepalived)
            │
            ▼
Load Balancer Layer
  ├─ lb-01  192.168.1.8   Keepalived MASTER, HAProxy instance
  └─ lb-02  192.168.1.9   Keepalived BACKUP, HAProxy instance
            │
            ▼
K3s Control Plane
  ├─ k3s-server-01  192.168.1.11  control plane + etcd
  ├─ k3s-server-02  192.168.1.12  control plane + etcd
  └─ k3s-server-03  192.168.1.13  control plane + etcd
            │
            ▼
Embedded etcd Cluster (distributed across servers 01, 02, 03)
            │
            ▼
K3s Worker Nodes
  ├─ k3s-worker-01  192.168.1.21  agent node, workload execution
  ├─ k3s-worker-02  192.168.1.22  agent node, workload execution
  └─ k3s-worker-03  192.168.1.23  agent node, workload execution
```
Component Details
| Component | Purpose | IP Address | Port | High Availability |
| --- | --- | --- | --- | --- |
| Floating IP | Single entry point | 192.168.1.10 | 6443 | Managed by Keepalived |
| lb-01 | Load balancer | 192.168.1.8 | 6443 | Keepalived MASTER |
| lb-02 | Load balancer | 192.168.1.9 | 6443 | Keepalived BACKUP |
| k3s-server-01 | Control plane + etcd | 192.168.1.11 | 6443 | Active member |
| k3s-server-02 | Control plane + etcd | 192.168.1.12 | 6443 | Active member |
| k3s-server-03 | Control plane + etcd | 192.168.1.13 | 6443 | Active member |
| k3s-worker-01 | Agent node | 192.168.1.21 | 10250 | Workload distribution |
| k3s-worker-02 | Agent node | 192.168.1.22 | 10250 | Workload distribution |
| k3s-worker-03 | Agent node | 192.168.1.23 | 10250 | Workload distribution |
| Keepalived | IP failover | lb-01, lb-02 | VRRP | 2-node redundancy |
| HAProxy | Load balancing | lb-01, lb-02 | 6443 | 2-instance redundancy |
Traffic Flow
- Client Connection: Clients connect to the floating IP `192.168.1.10:6443`
- Keepalived: Routes traffic to the active load balancer (normally lb-01)
- HAProxy: Load balances incoming API requests across all three K3s control plane servers
- K3s API: Control plane processes requests and maintains cluster state via embedded etcd
- Worker Communication: Worker nodes connect to the floating IP for resilient control plane access
- Workload Distribution: Pods and services are scheduled across the worker nodes for actual workloads
- Failover: If the active load balancer fails, Keepalived automatically promotes the backup load balancer
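Once the cluster is built (Steps 3 through 5 below), you can see this flow end to end by probing the API server through the VIP rather than through any individual node. A rough sketch; the exact HTTP response depends on your cluster's authentication settings:

```bash
# An HTTP reply (even 401 Unauthorized) shows that Keepalived and HAProxy
# forwarded the request to a live control plane node
curl -k https://192.168.1.10:6443/version
```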
Prerequisites
- Proxmox VE environment with VM template (see Proxmox Ubuntu VM Template Setup Guide)
- 8 Ubuntu VMs total:
  - 2 for load balancers (1 CPU core, 512 MB RAM minimum)
  - 3 for K3s servers (2 CPU cores, 4 GB RAM minimum)
  - 3 for K3s agents (2 CPU cores, 8 GB RAM minimum)
- Network access between all nodes
- Basic understanding of Kubernetes concepts
Step 1: Prepare Load Balancer VMs
- **Clone the VM template for the load balancers**

  Create 2 VMs from your Ubuntu template for the dedicated load balancers:

  - VM IDs: 201, 202
  - Names: `lb-01`, `lb-02`
  - Clone Mode: Full clone

- **Configure load balancer hardware resources**

  Navigate to the Hardware tab for each VM and adjust:

  - Memory: 512 MB (should be sufficient for HAProxy and Keepalived)
  - Processors: 1 core
  - Disk: 10 GB

- **Configure load balancer network settings**

  In the Cloud-Init tab, set static IP addresses:

  - lb-01: `192.168.1.8/24`
  - lb-02: `192.168.1.9/24`
  - Gateway: `192.168.1.1`

- **Start and verify load balancer VMs**

  Start both VMs and verify connectivity, as shown in the check below.
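A quick connectivity check from your workstation might look like the following (assuming Cloud-Init configured root SSH access, as in the later verification steps):

```bash
# Confirm both load balancer VMs answer on their static IPs
ping -c 3 192.168.1.8
ping -c 3 192.168.1.9

# Confirm SSH access to each node
ssh root@192.168.1.8 hostname
ssh root@192.168.1.9 hostname
```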
Step 2: Prepare K3s Server VMs
- **Clone the VM template**

  Create 3 VMs from your Ubuntu template, ideally distributed across Proxmox nodes for hardware redundancy:

  - VM IDs: 301, 302, 303
  - Names: `k3s-server-01`, `k3s-server-02`, `k3s-server-03`
  - Clone Mode: Full clone

- **Configure hardware resources**

  Navigate to the Hardware tab for each VM and adjust:

  - Memory: 4096 MB (4 GB)
  - Processors: 2 cores
  - Disk: Ensure adequate space (minimum 20 GB recommended)

- **Configure network settings**

  In the Cloud-Init tab, set static IP addresses:

  - k3s-server-01: `192.168.1.11/24`
  - k3s-server-02: `192.168.1.12/24`
  - k3s-server-03: `192.168.1.13/24`
  - Gateway: `192.168.1.1`

- **Start and verify VMs**

  Start all VMs and verify:

  - Correct IP addresses are assigned
  - SSH access is working
  - Internet connectivity is available

  ```bash
  # Test connectivity to each node
  ssh root@192.168.1.11
  ssh root@192.168.1.12
  ssh root@192.168.1.13
  ```
Step 3: Install and Configure Keepalived
Keepalived manages our floating IP address and provides automatic failover between the dedicated load balancer nodes.
- **Install Keepalived**

  Run on both `lb-01` and `lb-02`:

  ```bash
  sudo apt update
  sudo apt install keepalived -y
  ```

- **Create the Keepalived configuration**

  Create the configuration file on both VMs:

  ```bash
  sudo nano /etc/keepalived/keepalived.conf
  ```

- **Configure the master node (lb-01)**

  ```
  vrrp_instance VI_1 {
      state MASTER
      interface eth0
      virtual_router_id 55
      priority 255
      advert_int 1
      authentication {
          auth_type PASS
          auth_pass SuPeRsEcReT
      }
      virtual_ipaddress {
          192.168.1.10/24
      }
  }
  ```

- **Configure the backup node (lb-02)**

  ```
  vrrp_instance VI_1 {
      state BACKUP
      interface eth0
      virtual_router_id 55
      priority 150
      advert_int 1
      authentication {
          auth_type PASS
          auth_pass SuPeRsEcReT
      }
      virtual_ipaddress {
          192.168.1.10/24
      }
  }
  ```

- **Start and enable Keepalived**

  On both load balancer nodes:

  ```bash
  sudo systemctl enable --now keepalived.service
  sudo systemctl status keepalived.service
  ```

- **Test failover functionality**

  ```bash
  # Test basic connectivity to the load balancer nodes
  ping -c 3 192.168.1.8
  ping -c 3 192.168.1.9

  # Test the floating IP
  ping -c 3 192.168.1.10
  ```

  Test failover by stopping Keepalived on the master:

  ```bash
  # On lb-01 (master)
  sudo systemctl stop keepalived.service

  # The floating IP should still respond
  ping -c 3 192.168.1.10

  # Check which node took over on lb-02
  sudo systemctl status keepalived.service

  # Restart the service on lb-01
  sudo systemctl start keepalived.service
  ```
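If you want to confirm directly which node holds the VIP during the failover test, you can check the interface addresses and follow the VRRP state changes (a minimal sketch, assuming the NIC is `eth0` as in the configuration above):

```bash
# The node that currently owns the VIP lists 192.168.1.10 on its interface
ip addr show eth0 | grep 192.168.1.10

# Follow the VRRP state transitions (MASTER/BACKUP) in the logs
journalctl -u keepalived -f
```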
Step 4: Install and Configure HAProxy
HAProxy will load balance the K3s API server traffic across all three K3s server nodes.
- **Install HAProxy**

  Run on both `lb-01` and `lb-02`:

  ```bash
  sudo apt update
  sudo apt install haproxy -y
  ```

- **Configure HAProxy**

  Edit the HAProxy configuration on both load balancer VMs:

  ```bash
  sudo nano /etc/haproxy/haproxy.cfg
  ```

  Add the following configuration at the end of the file:

  ```
  # K3s API Server Load Balancer
  frontend k3s_frontend
      bind *:6443
      mode tcp
      default_backend k3s_backend

  backend k3s_backend
      mode tcp
      option tcp-check
      balance roundrobin
      server k3s-server-01 192.168.1.11:6443 check
      server k3s-server-02 192.168.1.12:6443 check
      server k3s-server-03 192.168.1.13:6443 check
  ```

- **Restart and enable HAProxy**

  Run on both load balancer VMs:

  ```bash
  sudo systemctl restart haproxy
  sudo systemctl enable haproxy
  sudo systemctl status haproxy
  ```

- **Verify the HAProxy configuration**

  Check for configuration errors:

  ```bash
  sudo haproxy -f /etc/haproxy/haproxy.cfg -c
  ```
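As an extra sanity check, you can confirm that HAProxy is listening on the API port and that the VIP forwards TCP connections. A rough sketch (`nc` assumes netcat is installed; HAProxy listens on the port even before the K3s servers exist, its health checks simply mark the backends down):

```bash
# Confirm HAProxy is listening on port 6443 on each load balancer
sudo ss -tlnp | grep 6443

# From any machine on the network, confirm the VIP accepts connections on 6443
nc -zv 192.168.1.10 6443
```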
Step 5: Install K3s Cluster
Now we’ll install K3s with embedded etcd to create our highly available cluster.
- **Initialize the first control plane node**

  On `k3s-server-01`, run:

  ```bash
  curl -sfL https://get.k3s.io | K3S_TOKEN=SECRET sh -s - server \
      --cluster-init \
      --tls-san=192.168.1.10
  ```

- **Wait for the first node to be ready**

  Verify the first node is running:

  ```bash
  sudo systemctl status k3s
  sudo kubectl get nodes
  ```

- **Join the additional control plane nodes**

  On `k3s-server-02` and `k3s-server-03`, run:

  ```bash
  curl -sfL https://get.k3s.io | K3S_TOKEN=SECRET sh -s - server \
      --server https://192.168.1.11:6443 \
      --tls-san=192.168.1.10
  ```

- **Verify cluster status**

  Check that all nodes have joined:

  ```bash
  sudo kubectl get nodes -o wide
  sudo kubectl get pods -A
  ```
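Because every server was started with `--tls-san=192.168.1.10`, the API server certificate is also valid for the floating IP, so you can manage the cluster from a workstation through the VIP. A minimal sketch, assuming `kubectl` is installed locally and root SSH access is available:

```bash
# Copy the kubeconfig that K3s generated on the first server
scp root@192.168.1.11:/etc/rancher/k3s/k3s.yaml ~/.kube/config

# Point it at the floating IP instead of the node-local address
sed -i 's/127.0.0.1/192.168.1.10/' ~/.kube/config

# kubectl now reaches the API through Keepalived and HAProxy
kubectl get nodes
```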
Step 6: Configure Agent (Worker) Nodes
Worker nodes are responsible for running your actual workloads (pods, containers). They join the cluster as agent nodes and communicate with the control plane via the floating IP for high availability.
- **Clone the VM template for the worker nodes**

  Create 3 worker VMs from the Ubuntu template, distributed across Proxmox nodes:

  - VM IDs: 311, 312, 313
  - Names: `k3s-worker-01`, `k3s-worker-02`, `k3s-worker-03`
  - Clone Mode: Full clone
  - Distribution: Place one worker on each Proxmox node alongside the control plane nodes

- **Configure worker hardware resources**

  Navigate to the Hardware tab for each worker VM and adjust:

  - Memory: 24576 MB (24 GB; adjust based on your workload requirements)
  - Processors: 2 cores (adjust based on your workload requirements)
  - Disk: 32 GB (consider more storage if using Longhorn or another persistent storage solution)

- **Configure worker network settings**

  In the Cloud-Init tab, set static IP addresses:

  - k3s-worker-01: `192.168.1.21/24`
  - k3s-worker-02: `192.168.1.22/24`
  - k3s-worker-03: `192.168.1.23/24`
  - Gateway: `192.168.1.1`
- **Start and verify worker VMs**

  Start all worker VMs and verify connectivity:

  ```bash
  # Test connectivity to each worker node
  ssh root@192.168.1.21
  ssh root@192.168.1.22
  ssh root@192.168.1.23
  ```

- **Install the K3s agent on the worker nodes**

  On each worker node (`k3s-worker-01`, `k3s-worker-02`, `k3s-worker-03`), run:

  ```bash
  curl -sfL https://get.k3s.io | K3S_TOKEN=SECRET sh -s - agent \
      --server https://192.168.1.10:6443
  ```

- **Verify the cluster with worker nodes**

  From any control plane node, verify all nodes have joined:

  ```bash
  # Check that all nodes are Ready
  sudo kubectl get nodes -o wide

  # Should show something like:
  # NAME            STATUS   ROLES                       AGE   VERSION
  # k3s-server-01   Ready    control-plane,etcd,master   45m   v1.28.x+k3s1
  # k3s-server-02   Ready    control-plane,etcd,master   40m   v1.28.x+k3s1
  # k3s-server-03   Ready    control-plane,etcd,master   35m   v1.28.x+k3s1
  # k3s-worker-01   Ready    <none>                      5m    v1.28.x+k3s1
  # k3s-worker-02   Ready    <none>                      4m    v1.28.x+k3s1
  # k3s-worker-03   Ready    <none>                      3m    v1.28.x+k3s1

  # Check cluster info
  sudo kubectl cluster-info

  # Verify system pods are running
  sudo kubectl get pods -A
  ```

- **Test workload scheduling**

  Deploy a test application to verify the workers can run pods:

  ```bash
  # Create a test deployment
  sudo kubectl create deployment nginx-test --image=nginx --replicas=3

  # Check pod distribution across the worker nodes
  sudo kubectl get pods -o wide

  # Clean up the test deployment
  sudo kubectl delete deployment nginx-test
  ```
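Finally, it is worth confirming that the control plane stays reachable when a load balancer drops out, since that is the failure mode this whole setup guards against. A minimal sketch, assuming your workstation's kubeconfig points at the floating IP as described after Step 5:

```bash
# On lb-01: take the active load balancer out of service
sudo systemctl stop keepalived haproxy

# From your workstation: the API should still answer via lb-02 and the VIP
kubectl get nodes

# On lb-01: restore the services afterwards
sudo systemctl start haproxy keepalived
```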
Next Steps
- Configure persistent storage with Longhorn for high availability data persistence
- Set up monitoring and logging
- Implement backup strategies for your cluster data
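If you continue with Longhorn for persistent storage, the install is typically a single Helm release. A minimal sketch, assuming Helm is available on your workstation, your kubeconfig points at the cluster, and `open-iscsi` is installed on the worker nodes (a Longhorn requirement):

```bash
# Add the official Longhorn chart repository and install into its own namespace
helm repo add longhorn https://charts.longhorn.io
helm repo update
helm install longhorn longhorn/longhorn --namespace longhorn-system --create-namespace

# Watch the Longhorn components come up
kubectl -n longhorn-system get pods --watch
```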