RKE2 Deployment Guide
This guide provides comprehensive approaches to deploying RKE2 clusters for different scenarios: High Availability (HA) production environments and Edge computing deployments.
Deployment Patterns
High Availability Deployments
- Use Case: Production datacenters, enterprise environments
- Architecture: 3+ server nodes with external load balancers
- Automation: Terraform for infrastructure + Argo CD for GitOps
- Characteristics: Full redundancy, external dependencies
Edge Computing Deployments
- Use Case: Remote locations, resource-constrained environments
- Architecture: 1 server node with 2+ agents
- Automation: edgectl for automated bootstrap
- Characteristics: Minimal footprint, self-contained, intermittent connectivity
Edge Deployment with edgectl
For edge computing scenarios, I use my custom edgectl tool that automates the entire RKE2 lifecycle with secure token management via HashiCorp Vault.
Prerequisites
Before using edgectl, ensure you have:
- HashiCorp Vault deployed and accessible
- Vault credentials configured (see the sketch after this list)
- Root access on target nodes
- Network connectivity between nodes
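edgectl talks to Vault through the standard Vault client environment; a minimal sketch, assuming edgectl honours the standard VAULT_ADDR/VAULT_TOKEN variables (the address is a placeholder):
# Point the Vault client at your Vault instance (placeholder address)
export VAULT_ADDR="https://vault.example.com:8200"
# Any auth method that yields a token works; a static token is the simplest
export VAULT_TOKEN="<your-vault-token>"
# Sanity check: Vault is reachable and unsealed
vault status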
Install edgectl
# Install the latest version
go install github.com/michielvha/edgectl@latest
# Verify installation
edgectl version
Edge Cluster Bootstrap
Step 1: Bootstrap First Server Node
# On the first server node, run:
edgectl rke2 server
# This will:
# 1. Generate a unique cluster-id (e.g., rke2-abc12345)
# 2. Install and configure RKE2 server
# 3. Store the join token in Vault at kv/data/rke2/<cluster-id>
# 4. Save cluster-id to /etc/edgectl/cluster-id
Step 2: Retrieve Cluster ID
# Check the generated cluster-id
cat /etc/edgectl/cluster-id
# Example output: rke2-abc12345
Step 3: Join Agent Nodes
# On each agent node, run:
edgectl rke2 agent --cluster-id rke2-abc12345
# This will:
# 1. Retrieve the join token from Vault using the cluster-id
# 2. Install and configure RKE2 agent
# 3. Join the agent to the control plane
# 4. No manual token handling required!
Step 4: Add Additional Server Nodes (Optional HA)
# For high availability with multiple servers:
edgectl rke2 server --token <token-from-first-server>
# Note: Secondary servers still use a manual token for now
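The token referenced above lives in the standard RKE2 token file on the first server:
# On the first server node
sudo cat /var/lib/rancher/rke2/server/node-token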
Complete Edge Deployment Example
# Server Node (192.168.10.10)
ssh <user>@192.168.10.10
edgectl rke2 server
# Note the cluster-id from /etc/edgectl/cluster-id
# Agent Node 1 (192.168.10.11)
ssh <user>@192.168.10.11
edgectl rke2 agent --cluster-id rke2-abc12345
# Agent Node 2 (192.168.10.12)
ssh <user>@192.168.10.12
edgectl rke2 agent --cluster-id rke2-abc12345
# Verify cluster
ssh <user>@192.168.10.10
export KUBECONFIG=/etc/rancher/rke2/rke2.yaml
kubectl get nodes
edgectl Features
- ✅ Automated RKE2 Installation: Complete server and agent bootstrap
- ✅ Cluster ID Management: Unique cluster identification for multi-cluster environments
- ✅ HashiCorp Vault Integration: Secure token storage and retrieval
- ✅ Token-less Agent Join: Agents join using cluster-id, not raw tokens
- ✅ Embedded Scripts: Modular bash scripts for system-level operations
- ✅ Load Balancer Support: HAProxy + Keepalived configuration
How edgectl Works
- Server Bootstrap: Generates a unique cluster-id (e.g., rke2-abc12345)
- Token Storage: Automatically stores the join token in Vault at kv/data/rke2/<cluster-id> (see the example after this list)
- Cluster ID Persistence: Saves the cluster-id to /etc/edgectl/cluster-id
- Agent Join: Agents retrieve the token from Vault using the cluster-id
- No Manual Token Handling: All token operations are automated and secure
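For reference, the stored token can be inspected with the Vault CLI. A sketch assuming a KV v2 mount named kv (the kv/data/... form above is the API path; the CLI drops the data/ segment):
# Read the secret written during server bootstrap (cluster-id taken from /etc/edgectl/cluster-id)
vault kv get kv/rke2/rke2-abc12345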
Token Lifecycle with edgectl
| Stage | Action | Location |
|---|---|---|
| Server Bootstrap | Token generated and stored | Vault: kv/data/rke2/<cluster-id> |
| Cluster ID Created | Unique ID persisted | Node: /etc/edgectl/cluster-id |
| Agent Installation | Token retrieved automatically | Vault: kv/data/rke2/<cluster-id> |
| Additional Servers | Manual token for HA setup | Provided via --token flag |
File Layout
# Local Node Files
/etc/edgectl/cluster-id # Stores the generated cluster-id
# Vault Paths
kv/data/rke2/<cluster-id> # Join token + metadata for the cluster
High Availability Deployment
For production environments requiring maximum uptime and redundancy, use the traditional Terraform + manual approach.
Pre-Deployment Planning
Infrastructure Requirements
Before deployment, ensure your infrastructure meets these requirements:
| Component | Requirement | Notes |
|---|---|---|
| Load Balancers | 2x Layer 4 TCP LB | Active-passive configuration (see sketch below) |
| Server Nodes | 3x (odd number) | Embedded etcd requires odd count |
| Agent Nodes | 3+ | Scale based on workload requirements |
| Operating System | Ubuntu 22.04+ | SELinux/AppArmor supported |
| Network | Low latency LAN | <10ms between nodes preferred |
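The load balancer row above is worth expanding: both 6443 (Kubernetes API) and 9345 (RKE2 supervisor) must be passed through as plain TCP. A minimal HAProxy sketch for one LB node (server names and IPs are placeholders; Keepalived would float the 192.168.10.100 VIP between the two LB nodes):
cat <<EOF | sudo tee /etc/haproxy/haproxy.cfg
defaults
    mode tcp
    timeout connect 5s
    timeout client 1h
    timeout server 1h

# Kubernetes API passthrough
listen kube_api
    bind *:6443
    balance roundrobin
    server rke2-server-1 192.168.10.10:6443 check
    server rke2-server-2 192.168.10.11:6443 check
    server rke2-server-3 192.168.10.12:6443 check

# RKE2 supervisor API passthrough
listen rke2_supervisor
    bind *:9345
    balance roundrobin
    server rke2-server-1 192.168.10.10:9345 check
    server rke2-server-2 192.168.10.11:9345 check
    server rke2-server-3 192.168.10.12:9345 check
EOF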
Network Preparation
Ensure the following ports are accessible between nodes:
# Server-to-Server Communication
6443/tcp # Kubernetes API
9345/tcp # RKE2 Supervisor API
2379/tcp # etcd client
2380/tcp # etcd peer
2381/tcp # etcd metrics
# All Nodes
10250/tcp # Kubelet API
30000-32767/tcp # NodePort range
# Cilium CNI
4240/tcp # Health checks
4244/tcp # Hubble gRPC
8472/udp # VXLAN (if enabled)
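Before installing anything, reachability of the key ports can be confirmed from a peer node; a quick sketch using netcat (the IP is a placeholder for one of your server nodes):
# From an agent or secondary server, confirm the API and supervisor ports are reachable
nc -zv 192.168.10.10 6443
nc -zv 192.168.10.10 9345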
Automated Deployment Module
I've developed a deployment module that automates the entire RKE2 bootstrap process. It can be sourced directly from the repository:
# Source the RKE2 deployment module
source <(curl -fsSL https://raw.githubusercontent.com/michielvha/PDS/main/bash/module/rke2.sh)
Server Node Bootstrap
The automated server bootstrap includes:
- ✅ Operating system hardening and preparation
- ✅ Firewall configuration (UFW/iptables)
- ✅ RKE2 installation with CIS profile
- ✅ Cilium CNI configuration
- ✅ Load balancer integration
- ✅ Certificate SAN configuration
# Bootstrap first server node
configure_rke2_server_primary
# Bootstrap additional server nodes
configure_rke2_server_additional
Agent Node Bootstrap
Agent node automation includes:
- ✅ Operating system preparation
- ✅ Firewall configuration for agent role
- ✅ RKE2 agent installation and configuration
- ✅ Automatic cluster joining
# Bootstrap agent nodes
configure_rke2_agent
Manual Deployment Steps
For environments requiring manual deployment, follow these steps:
Step 1: System Preparation
Update the system and install prerequisites:
# Update system packages
sudo apt update && sudo apt upgrade -y
# Install required packages
sudo apt install -y curl wget unzip
# Configure firewall (UFW example)
sudo ufw enable
sudo ufw allow 22/tcp # SSH
sudo ufw allow 6443/tcp # Kubernetes API
sudo ufw allow 9345/tcp # RKE2 Supervisor
sudo ufw allow 10250/tcp # Kubelet
Step 2: RKE2 Installation
Download and install RKE2:
# Download RKE2 installation script
curl -sfL https://get.rke2.io | sudo sh -
# Enable the RKE2 server service (it is started later, in Step 5)
sudo systemctl enable rke2-server.service
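The same installer is used on agent nodes; a sketch assuming the installer's documented INSTALL_RKE2_TYPE variable:
# On agent nodes, install the agent variant instead
curl -sfL https://get.rke2.io | sudo INSTALL_RKE2_TYPE="agent" sh -
# Enable the agent service
sudo systemctl enable rke2-agent.service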
Step 3: Server Configuration
Primary Server Node
Create the initial server configuration:
sudo mkdir -p /etc/rancher/rke2
cat <<EOF | sudo tee /etc/rancher/rke2/config.yaml
# Basic Configuration
write-kubeconfig-mode: "0644"
profile: "cis"
# Load Balancer Configuration
tls-san:
- "lb.edge.example.com"
- "192.168.10.100" # LB VIP
- "192.168.10.10" # Local node IP
# Cilium CNI Configuration
cni: "cilium"
disable-kube-proxy: true  # Cilium replaces kube-proxy
# Security Configuration
selinux: true
secrets-encryption: true
EOF
Additional Server Nodes
For the second and third server nodes:
cat <<EOF | sudo tee /etc/rancher/rke2/config.yaml
# Cluster Join Configuration
server: https://lb.edge.example.com:9345
token: ${NODE_TOKEN} # From first server
# Basic Configuration
write-kubeconfig-mode: "0644"
profile: "cis"
# Load Balancer Configuration
tls-san:
- "lb.edge.example.com"
- "192.168.10.100"
# CNI Configuration
cni: "cilium"
disable-kube-proxy: true  # Cilium replaces kube-proxy
# Security Configuration
selinux: true
secrets-encryption: true
EOF
Agent Node Configuration
For worker nodes:
cat <<EOF | sudo tee /etc/rancher/rke2/config.yaml
# Cluster Join Configuration
server: https://lb.edge.example.com:9345
token: ${NODE_TOKEN}
# Profile Configuration
profile: "cis"
# Security Configuration
selinux: true
EOF
Step 4: CIS Compliance Preparation
For CIS-compliant deployments, create the required etcd user:
# Create etcd group and user
sudo groupadd --system etcd
sudo useradd --system --no-create-home --shell /sbin/nologin --gid etcd etcd
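Depending on the RKE2 version, the CIS profile also requires host kernel parameters; RKE2 ships a sysctl file for this (path per the RKE2 hardening docs; verify it exists on your install):
# Apply the RKE2-provided CIS sysctl settings
sudo cp -f /usr/share/rke2/rke2-cis-sysctl.conf /etc/sysctl.d/60-rke2-cis.conf
sudo systemctl restart systemd-sysctl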
Step 5: Service Management
Start RKE2 services:
# Start server nodes first
sudo systemctl start rke2-server.service
# Retrieve node token (from first server)
sudo cat /var/lib/rancher/rke2/server/node-token
# Start agent nodes after servers are running
sudo systemctl start rke2-agent.service
Post-Deployment Configuration
Configure kubectl Access
Set up kubectl for cluster administration:
# Export kubeconfig
export KUBECONFIG=/etc/rancher/rke2/rke2.yaml
# Add to shell profile
echo 'export KUBECONFIG=/etc/rancher/rke2/rke2.yaml' >> ~/.bashrc
echo 'export PATH=$PATH:/var/lib/rancher/rke2/bin' >> ~/.bashrc
# Source the profile
source ~/.bashrc
Add Useful Aliases
cat <<EOF >> ~/.bashrc
# Kubernetes aliases
alias k='kubectl'
alias kga='kubectl get all'
alias kgp='kubectl get pods'
alias kgn='kubectl get nodes'
alias kd='kubectl describe'
alias kl='kubectl logs'
alias ke='kubectl edit'
# Enable kubectl completion
source <(kubectl completion bash)
complete -F __start_kubectl k
EOF
Verify Cluster Health
# Check node status
kubectl get nodes -o wide
# Verify all system pods are running
kubectl get pods -A
# Check cluster info
kubectl cluster-info
# Remove completed pods
kubectl delete pod -n kube-system --field-selector=status.phase=Succeeded
Cilium Configuration
Custom Cilium Values
RKE2 allows customization of Cilium via HelmChartConfig. Create the following configuration:
cat <<EOF | sudo tee /var/lib/rancher/rke2/server/manifests/cilium-config.yaml
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: rke2-cilium
  namespace: kube-system
spec:
  valuesContent: |-
    # Enable eBPF mode and disable kube-proxy
    kubeProxyReplacement: true
    # Enable Hubble observability
    hubble:
      enabled: true
      relay:
        enabled: true
      ui:
        enabled: true
    # Performance optimizations
    bpf:
      masquerade: true
    hostServices:
      enabled: true
    # Security enhancements
    encryption:
      enabled: true
      type: wireguard
EOF
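Once RKE2 picks up the manifest, the rollout can be verified. A quick sketch (resource names follow the rke2-cilium packaging; adjust to your chart version):
# Confirm the Cilium agent, Hubble relay, and Hubble UI pods are running
kubectl -n kube-system get pods | grep -E 'cilium|hubble'
# Inspect the merged Helm values that were applied
kubectl -n kube-system get helmchartconfig rke2-cilium -o yaml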
Troubleshooting Common Issues
Service Won't Start
# Check service status
sudo systemctl status rke2-server.service
# View detailed logs
sudo journalctl -u rke2-server.service -f
# Check configuration
sudo rke2 server --config /etc/rancher/rke2/config.yaml --dry-run
Node Join Failures
# Verify token
sudo cat /var/lib/rancher/rke2/server/node-token
# Check network connectivity
curl -k https://lb.edge.example.com:9345/ping
# Verify DNS resolution
nslookup lb.edge.example.com
Certificate Issues
# Regenerate certificates
sudo rm -rf /var/lib/rancher/rke2/server/tls/
sudo systemctl restart rke2-server.service
# Check certificate SANs
openssl x509 -in /var/lib/rancher/rke2/server/tls/serving-kube-apiserver.crt -text -noout
GitOps Integration
Once the cluster is deployed, integrate with your GitOps workflow:
- Cluster Secret Creation: Generate Argo CD cluster secrets (see the sketch after this list)
- Policy Application: Apply security and network policies
- Workload Deployment: Deploy applications via GitOps
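For the cluster secret step, a sketch of the declarative form Argo CD expects (the label and field names follow Argo CD's cluster secret convention; namespace, cluster name, API endpoint, token, and CA values are placeholders for your environment):
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Secret
metadata:
  name: edge-cluster
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster
type: Opaque
stringData:
  name: edge-cluster
  server: https://lb.edge.example.com:6443
  config: |
    {
      "bearerToken": "<service-account-token>",
      "tlsClientConfig": {
        "caData": "<base64-encoded-ca>"
      }
    }
EOF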
This deployment approach ensures consistent, secure, and manageable RKE2 clusters that integrate seamlessly with modern platform engineering practices.
