
High Availability K3s Cluster Setup

This guide walks you through setting up a highly available K3s cluster using embedded etcd. We’ll use Keepalived and HAProxy with a floating IP address to load balance the API server and eliminate single points of failure.

Architecture Overview

Our HA K3s setup consists of:

  • 3 control plane nodes running K3s server with embedded etcd
  • 3 worker nodes for running workloads
  • 2 dedicated load balancer nodes running Keepalived and HAProxy
  • Keepalived for floating IP management and failover
  • HAProxy for load balancing K3s API server traffic
  • Floating IP (192.168.1.10) to eliminate single points of failure

Network Architecture Diagram

                   ┌─────────────────────────────────┐
                   │           👤 Clients            │
                   │       kubectl, k9s, apps        │
                   └──────────────┬──────────────────┘
                   ┌──────────────▼──────────────────┐
                   │      🌐 Floating IP (VIP)       │
                   │        192.168.1.10:6443        │
                   │      Managed by Keepalived      │
                   └──────────────┬──────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│                     🔧 Load Balancer Layer                      │
│                                                                 │
│    ┌───────────────────────┐       ┌───────────────────────┐    │
│    │       🖥️ lb-01        │       │       🖥️ lb-02        │    │
│    │      192.168.1.8      │       │      192.168.1.9      │    │
│    │   Keepalived MASTER   │       │   Keepalived BACKUP   │    │
│    │   HAProxy instance    │       │   HAProxy instance    │    │
│    └───────────────────────┘       └───────────────────────┘    │
└─────────────────────────────────┬───────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│                      ⚙️ K3s Control Plane                       │
│                                                                 │
│    ┌───────────────────────┐       ┌───────────────────────┐    │
│    │   🖥️ k3s-server-01    │       │   🖥️ k3s-server-02    │    │
│    │     192.168.1.11      │       │     192.168.1.12      │    │
│    │  Control Plane+etcd   │       │  Control Plane+etcd   │    │
│    └───────────────────────┘       └───────────────────────┘    │
│                                                                 │
│                    ┌───────────────────────┐                    │
│                    │   🖥️ k3s-server-03    │                    │
│                    │     192.168.1.13      │                    │
│                    │  Control Plane+etcd   │                    │
│                    └───────────────────────┘                    │
└─────────────────────────────────┬───────────────────────────────┘
               ┌──────────────────────────────────────┐
               │       🗄️ Embedded etcd Cluster       │
               │   Distributed across control plane   │
               │         (servers 01, 02, 03)         │
               └──────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│                       🚀 K3s Worker Nodes                       │
│                                                                 │
│    ┌───────────────────────┐       ┌───────────────────────┐    │
│    │   🖥️ k3s-worker-01    │       │   🖥️ k3s-worker-02    │    │
│    │     192.168.1.21      │       │     192.168.1.22      │    │
│    │      Agent node       │       │      Agent node       │    │
│    │  Workload execution   │       │  Workload execution   │    │
│    └───────────────────────┘       └───────────────────────┘    │
│                                                                 │
│                    ┌───────────────────────┐                    │
│                    │   🖥️ k3s-worker-03    │                    │
│                    │     192.168.1.23      │                    │
│                    │      Agent node       │                    │
│                    │  Workload execution   │                    │
│                    └───────────────────────┘                    │
└─────────────────────────────────────────────────────────────────┘

Component Details

Component        Purpose                 IP Address      Port    High Availability
──────────────────────────────────────────────────────────────────────────────────
Floating IP      Single entry point      192.168.1.10    6443    Managed by Keepalived
lb-01            Load balancer           192.168.1.8     6443    Keepalived MASTER
lb-02            Load balancer           192.168.1.9     6443    Keepalived BACKUP
k3s-server-01    Control plane + etcd    192.168.1.11    6443    Active member
k3s-server-02    Control plane + etcd    192.168.1.12    6443    Active member
k3s-server-03    Control plane + etcd    192.168.1.13    6443    Active member
k3s-worker-01    Agent node              192.168.1.21    10250   Workload distribution
k3s-worker-02    Agent node              192.168.1.22    10250   Workload distribution
k3s-worker-03    Agent node              192.168.1.23    10250   Workload distribution
Keepalived       IP failover             lb-01, lb-02    VRRP    2-node redundancy
HAProxy          Load balancing          lb-01, lb-02    6443    2-instance redundancy

Traffic Flow

  1. Client Connection: Clients connect to the floating IP 192.168.1.10:6443
  2. Keepalived: Keeps the floating IP assigned to the active load balancer (normally lb-01), so traffic arrives there
  3. HAProxy: Load balances incoming API requests across all three K3s control plane servers
  4. K3s API: Control plane processes requests and maintains cluster state via embedded etcd
  5. Worker Communication: Worker nodes connect to the floating IP for resilient control plane access
  6. Workload Distribution: Pods and services are scheduled across the worker nodes
  7. Failover: If the active load balancer fails, Keepalived automatically promotes the backup load balancer
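
To see which node currently holds the floating IP at any point, check the load balancers’ interface addresses. A quick check (this assumes the VIP lives on eth0, matching the Keepalived configuration later in this guide):

Terminal window
# Run on lb-01 and lb-02; the active node lists 192.168.1.10
ip -4 addr show eth0 | grep 192.168.1.10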

Prerequisites

  • Proxmox VE environment with VM template (see Proxmox Ubuntu VM Template Setup Guide)
  • 8 Ubuntu VMs total:
    • 2 for load balancers (1 CPU core, 512MB RAM minimum)
    • 3 for K3s servers (2 CPU cores, 4GB RAM minimum)
    • 3 for K3s agents (2 CPU cores, 8GB RAM minimum)
  • Network access between all nodes
  • Basic understanding of Kubernetes concepts

Step 1: Prepare Load Balancer VMs

  1. Clone VM template for load balancers

    Create 2 VMs from your Ubuntu template for dedicated load balancers:

    • VM IDs: 201, 202
    • Names: lb-01, lb-02
    • Clone Mode: Full clone
  2. Configure load balancer hardware resources

    Navigate to the Hardware tab for each VM and adjust:

    • Memory: 512 MB (should be sufficient for HAProxy and Keepalived)
    • Processors: 1 core
    • Disk: 10GB
  3. Configure load balancer network settings

    In the Cloud-Init tab, set static IP addresses:

    • lb-01: 192.168.1.8/24
    • lb-02: 192.168.1.9/24
    • Gateway: 192.168.1.1
  4. Start and verify load balancer VMs

    Start both VMs and verify connectivity.
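
    A quick connectivity check from your workstation:

    Terminal window
    # Test connectivity to each load balancer
    ping -c 3 192.168.1.8
    ping -c 3 192.168.1.9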

Step 2: Prepare K3s Server VMs

  1. Clone VM template

    Create 3 VMs from your Ubuntu template, ideally distributed across Proxmox nodes for hardware redundancy:

    • VM IDs: 301, 302, 303
    • Names: k3s-server-01, k3s-server-02, k3s-server-03
    • Clone Mode: Full clone
  2. Configure hardware resources

    Navigate to the Hardware tab for each VM and adjust:

    • Memory: 4096 MB (4GB)
    • Processors: 2 cores
    • Disk: Ensure adequate space (minimum 20GB recommended)
  3. Configure network settings

    In the Cloud-Init tab, set static IP addresses:

    • k3s-server-01: 192.168.1.11/24
    • k3s-server-02: 192.168.1.12/24
    • k3s-server-03: 192.168.1.13/24
    • Gateway: 192.168.1.1
  4. Start and verify VMs

    Start all VMs and verify:

    • Correct IP addresses are assigned
    • SSH access is working
    • Internet connectivity is available
    Terminal window
    # Test connectivity to each node
    ssh root@192.168.1.11
    ssh root@192.168.1.12
    ssh root@192.168.1.13

Step 3: Install and Configure Keepalived

Keepalived manages our floating IP address and provides automatic failover between the dedicated load balancer nodes.

  1. Install keepalived

    Run on both lb-01 and lb-02:

    Terminal window
    sudo apt update
    sudo apt install keepalived -y
  2. Create keepalived configuration

    Create the configuration file on both VMs:

    Terminal window
    sudo nano /etc/keepalived/keepalived.conf
  3. Configure master node (lb-01)

    Terminal window
    vrrp_instance VI_1 {
        state MASTER
        interface eth0
        virtual_router_id 55
        priority 255
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass SuPeRsEcReT
        }
        virtual_ipaddress {
            192.168.1.10/24
        }
    }
  4. Configure backup node (lb-02)

    Terminal window
    vrrp_instance VI_1 {
        state BACKUP
        interface eth0
        virtual_router_id 55
        priority 150
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass SuPeRsEcReT
        }
        virtual_ipaddress {
            192.168.1.10/24
        }
    }
  5. Start and enable keepalived

    On both load balancer nodes:

    Terminal window
    sudo systemctl enable --now keepalived.service
    sudo systemctl status keepalived.service
  6. Test failover functionality

    Terminal window
    # Test basic connectivity to load balancer nodes
    ping -c 3 192.168.1.8
    ping -c 3 192.168.1.9
    # Test floating IP
    ping -c 3 192.168.1.10

    Test failover by stopping keepalived on the master:

    Terminal window
    # On lb-01 (master)
    sudo systemctl stop keepalived.service
    # The floating IP should still respond
    ping -c 3 192.168.1.10
    # Check which node took over on lb-02
    sudo systemctl status keepalived.service
    # Restart the service on lb-01
    sudo systemctl start keepalived.service

Step 4: Install and Configure HAProxy

HAProxy will load balance the K3s API server traffic across all three K3s server nodes.

  1. Install HAProxy

    Run on both lb-01 and lb-02:

    Terminal window
    sudo apt update
    sudo apt install haproxy -y
  2. Configure HAProxy

    Edit the HAProxy configuration on both load balancer VMs:

    Terminal window
    sudo nano /etc/haproxy/haproxy.cfg

    Add the following configuration at the end of the file:

    Terminal window
    # K3s API Server Load Balancer
    frontend k3s_frontend
        bind *:6443
        mode tcp
        default_backend k3s_backend

    backend k3s_backend
        mode tcp
        option tcp-check
        balance roundrobin
        server k3s-server-01 192.168.1.11:6443 check
        server k3s-server-02 192.168.1.12:6443 check
        server k3s-server-03 192.168.1.13:6443 check
  3. Restart and enable HAProxy

    Run on both load balancer VMs:

    Terminal window
    sudo systemctl restart haproxy
    sudo systemctl enable haproxy
    sudo systemctl status haproxy
  4. Verify HAProxy configuration

    Check for configuration errors:

    Terminal window
    sudo haproxy -f /etc/haproxy/haproxy.cfg -c
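
Optionally, Keepalived can also pull the floating IP off a node whose HAProxy process has died, not just a node that is down entirely. A minimal sketch (the chk_haproxy name and weight are illustrative; add this to /etc/keepalived/keepalived.conf on both load balancers, reference it from the existing vrrp_instance block, and restart keepalived):

Terminal window
# Health check: mark this node faulty when HAProxy is not running
vrrp_script chk_haproxy {
    script "/usr/bin/pgrep haproxy"
    interval 2
    weight -120   # 255 - 120 = 135, below the BACKUP priority of 150, so the VIP moves
}

# Add inside the existing vrrp_instance VI_1 block:
track_script {
    chk_haproxy
}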

Step 5: Install K3s Cluster

Now we’ll install K3s with embedded etcd to create our highly available cluster.

  1. Initialize the first control plane node

    On k3s-server-01, run (replace SECRET with a strong shared secret of your own; every server and agent must join with the same K3S_TOKEN):

    Terminal window
    curl -sfL https://get.k3s.io | K3S_TOKEN=SECRET sh -s - server \
        --cluster-init \
        --tls-san=192.168.1.10
  2. Wait for first node to be ready

    Verify the first node is running:

    Terminal window
    sudo systemctl status k3s
    sudo kubectl get nodes
  3. Join additional control plane nodes

    On k3s-server-02 and then k3s-server-03 (one at a time, letting each node join before starting the next), run:

    Terminal window
    curl -sfL https://get.k3s.io | K3S_TOKEN=SECRET sh -s - server \
        --server https://192.168.1.11:6443 \
        --tls-san=192.168.1.10
  4. Verify cluster status

    Check that all nodes have joined:

    Terminal window
    sudo kubectl get nodes -o wide
    sudo kubectl get pods -A
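
Because every server is started with --tls-san=192.168.1.10, the API certificate is valid for the floating IP, so you can manage the cluster from a workstation through the load balancers. A minimal sketch (assumes a Linux workstation and overwrites any existing ~/.kube/config):

Terminal window
# Copy the kubeconfig from the first server
scp root@192.168.1.11:/etc/rancher/k3s/k3s.yaml ~/.kube/config
# Point it at the floating IP instead of the local loopback address
sed -i 's/127.0.0.1/192.168.1.10/' ~/.kube/config
kubectl get nodes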

Step 6: Configure Agent (Worker) Nodes

Worker nodes are responsible for running your actual workloads (pods, containers). They join the cluster as agent nodes and communicate with the control plane via the floating IP for high availability.

  1. Clone VM template for worker nodes

    Create 3 worker VMs from the Ubuntu template, distributed across Proxmox nodes:

    • VM IDs: 311, 312, 313
    • Names: k3s-worker-01, k3s-worker-02, k3s-worker-03
    • Clone Mode: Full clone
    • Distribution: Place one worker on each Proxmox node alongside the control plane nodes
  2. Configure worker hardware resources

    Navigate to the Hardware tab for each worker VM and adjust:

    • Memory: 24576 MB (24GB - adjust based on your workload requirements)
    • Processors: 2 cores (adjust based on your workload requirements)
    • Disk: 32GB (consider more storage if using Longhorn or other persistent storage solutions)
  3. Configure worker network settings

    In the Cloud-Init tab, set static IP addresses:

    • k3s-worker-01: 192.168.1.21/24
    • k3s-worker-02: 192.168.1.22/24
    • k3s-worker-03: 192.168.1.23/24
    • Gateway: 192.168.1.1
  4. Start and verify worker VMs

    Start all worker VMs and verify connectivity:

    Terminal window
    # Test connectivity to each worker node
    ssh root@192.168.1.21
    ssh root@192.168.1.22
    ssh root@192.168.1.23
  5. Install K3s agent on worker nodes

    On each worker node (k3s-worker-01, k3s-worker-02, k3s-worker-03), run:

    Terminal window
    curl -sfL https://get.k3s.io | K3S_TOKEN=SECRET sh -s - agent \
        --server https://192.168.1.10:6443
  6. Verify cluster with worker nodes

    From any control plane node, verify all nodes have joined:

    Terminal window
    # Check all nodes are ready
    sudo kubectl get nodes -o wide
    # Should show something like:
    # NAME            STATUS   ROLES                  AGE   VERSION
    # k3s-server-01   Ready    control-plane,master   45m   v1.28.x+k3s1
    # k3s-server-02   Ready    control-plane,master   40m   v1.28.x+k3s1
    # k3s-server-03   Ready    control-plane,master   35m   v1.28.x+k3s1
    # k3s-worker-01   Ready    <none>                 5m    v1.28.x+k3s1
    # k3s-worker-02   Ready    <none>                 4m    v1.28.x+k3s1
    # k3s-worker-03   Ready    <none>                 3m    v1.28.x+k3s1
    # Check cluster info
    sudo kubectl cluster-info
    # Verify system pods are running
    sudo kubectl get pods -A
  7. Test workload scheduling

    Deploy a test application to verify workers can run pods:

    Terminal window
    # Create a test deployment
    sudo kubectl create deployment nginx-test --image=nginx --replicas=3
    # Check pod distribution across worker nodes
    sudo kubectl get pods -o wide
    # Clean up test deployment
    sudo kubectl delete deployment nginx-test
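
As a final end-to-end check, repeat the failover test from Step 3 while talking to the API through the floating IP; kubectl should keep answering when lb-01 drops out:

Terminal window
# From a machine whose kubeconfig points at 192.168.1.10
kubectl get nodes
# On lb-01: simulate a load balancer failure
sudo systemctl stop keepalived.service
# The VIP moves to lb-02 and the API keeps responding
kubectl get nodes
# Restore lb-01
sudo systemctl start keepalived.service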

Next Steps

  • Configure persistent storage with Longhorn for high availability data persistence
  • Set up monitoring and logging
  • Implement backup strategies for your cluster data