Deploying a Highly Available Kubernetes 1.34 Cluster on Ubuntu 22.04 (3 Master Nodes)

Overview

A production Kubernetes cluster needs a highly available (HA) architecture to keep the control plane stable and reliable. This article walks through deploying an HA Kubernetes cluster with 3 master nodes on Ubuntu 22.04, using containerd as the container runtime.

High Availability Architecture

Architecture Design

This guide uses the following HA architecture (the API traffic flow is sketched below):

  • 3 master nodes: a highly available control plane, with the etcd cluster spread across all three nodes
  • Load balancer: HAProxy + Keepalived provide a floating virtual IP (VIP)
  • Worker nodes: can be scaled out as workload demands grow
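
A minimal sketch of how API traffic flows under this design, using the ports and addresses planned below:

# kubectl / kubelet / controllers
#        |
#        v
# 192.168.1.100:8443   (VIP, held by Keepalived on one master at a time)
#        |
#        v
# HAProxy frontend :8443  -->  192.168.1.10:6443 / 192.168.1.11:6443 / 192.168.1.12:6443  (kube-apiserver)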

Server Plan

Hostname       IP address      Role           Spec
--------------------------------------------------------------
k8s-master1    192.168.1.10    Master + LB    4 vCPU / 8 GB
k8s-master2    192.168.1.11    Master + LB    4 vCPU / 8 GB
k8s-master3    192.168.1.12    Master + LB    4 vCPU / 8 GB
k8s-node1      192.168.1.13    Worker         2 vCPU / 4 GB
k8s-node2      192.168.1.14    Worker         2 vCPU / 4 GB

# Virtual IP (VIP)
192.168.1.100   k8s-vip       Load balancer IP

Network Plan

Pod network CIDR:     10.244.0.0/16
Service CIDR:         10.96.0.0/12
Node network:         192.168.1.0/24

Preparation

1. Configure /etc/hosts on all nodes

# Run on all nodes
sudo tee -a /etc/hosts << EOF
192.168.1.10 k8s-master1
192.168.1.11 k8s-master2
192.168.1.12 k8s-master3
192.168.1.13 k8s-node1
192.168.1.14 k8s-node2
192.168.1.100 k8s-vip
EOF
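
As a quick, optional sanity check, confirm that every node can resolve and reach the others by name:

# Run on each node; every host should answer
for h in k8s-master1 k8s-master2 k8s-master3 k8s-node1 k8s-node2; do
  ping -c 1 -W 1 $h > /dev/null && echo "$h OK" || echo "$h FAILED"
done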

2. Set the hostname on each node

# On k8s-master1
sudo hostnamectl set-hostname k8s-master1

# On k8s-master2
sudo hostnamectl set-hostname k8s-master2

# On k8s-master3
sudo hostnamectl set-hostname k8s-master3

# On k8s-node1
sudo hostnamectl set-hostname k8s-node1

# On k8s-node2
sudo hostnamectl set-hostname k8s-node2

3. Disable the firewall (or open the required ports)

# Run on all nodes
sudo ufw disable

# Alternatively, keep ufw enabled and open the required ports
sudo ufw allow 6443/tcp         # Kubernetes API server
sudo ufw allow 8443/tcp         # HAProxy frontend for the API server VIP
sudo ufw allow 2379/tcp         # etcd client
sudo ufw allow 2380/tcp         # etcd peer
sudo ufw allow 10250/tcp        # kubelet API
sudo ufw allow 10255/tcp        # kubelet read-only port (legacy; disabled by default in kubeadm clusters)
sudo ufw allow 30000:32767/tcp  # NodePort Services
# Keepalived uses VRRP (IP protocol 112), which port rules do not cover;
# the simplest option is to allow all traffic between the nodes, e.g.:
# sudo ufw allow from 192.168.1.0/24
sudo ufw reload

4. Disable swap

# Run on all nodes
sudo swapoff -a
sudo sed -i '/ swap / s/^/#/' /etc/fstab
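
Confirm that no swap is left active (the first command should print nothing):

swapon --show
free -h | grep -i swap   # the Swap line should read 0B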

5. Load kernel modules and sysctl parameters

# Run on all nodes
sudo tee /etc/modules-load.d/k8s.conf << EOF
overlay
br_netfilter
EOF

sudo modprobe overlay
sudo modprobe br_netfilter

sudo tee /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF

sudo sysctl --system
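
Verify that the modules are loaded and the parameters took effect:

lsmod | grep -E 'overlay|br_netfilter'
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward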

Install and Configure the Container Runtime (containerd)

1. Install containerd

# Run on all nodes
sudo apt-get update
sudo apt-get install -y containerd

# Generate the default containerd configuration
sudo mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml

2. Configure the systemd cgroup driver

# Run on all nodes
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml

# Optional: configure registry mirrors. Note that containerd does not read a standalone
# registries.toml; mirrors are configured through the hosts.toml mechanism referenced from
# config.toml. The mirror endpoint below is only an example; the same pattern works for
# other registries (the kubeadm configuration later already pulls Kubernetes images from
# registry.aliyuncs.com).
sudo sed -i 's#config_path = ""#config_path = "/etc/containerd/certs.d"#' /etc/containerd/config.toml
sudo mkdir -p /etc/containerd/certs.d/docker.io
sudo tee /etc/containerd/certs.d/docker.io/hosts.toml << EOF
server = "https://registry-1.docker.io"
[host."https://mirror.ccs.tencentyun.com"]
  capabilities = ["pull", "resolve"]
EOF

# Restart and enable containerd
sudo systemctl restart containerd
sudo systemctl enable containerd
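
A quick check that containerd is running and the cgroup driver change was picked up:

systemctl is-active containerd
sudo containerd config dump | grep SystemdCgroup   # should print: SystemdCgroup = true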

Install the Kubernetes Components

1. Add the Kubernetes package repository

# Run on all nodes
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.34/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg

echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.34/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list

sudo apt-get update

2. Install kubeadm, kubelet and kubectl

# Run on all nodes
sudo apt-get install -y kubeadm=1.34.0-1.1 kubelet=1.34.0-1.1 kubectl=1.34.0-1.1

# Pin the versions so apt upgrades do not move them
sudo apt-mark hold kubeadm kubelet kubectl

# Verify
kubeadm version
kubelet --version

Configure the HA Load Balancer

1. Install HAProxy and Keepalived

# Run on all three master nodes
sudo apt-get install -y haproxy keepalived

# Verify the installation
haproxy -v
keepalived --version

2. Configure HAProxy

# Run on all three master nodes
sudo tee /etc/haproxy/haproxy.cfg << EOF
global
    log /dev/log    local0
    log /dev/log    local1 notice
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin
    stats timeout 30s
    user haproxy
    group haproxy
    daemon

defaults
    log     global
    mode    http
    option  httplog
    option  dontlognull
    timeout connect 5000
    timeout client  50000
    timeout server  50000

frontend k8s-api
    bind *:8443
    mode tcp
    option tcplog
    default_backend k8s-api-backend

backend k8s-api-backend
    mode tcp
    option tcplog
    balance roundrobin
    server k8s-master1 192.168.1.10:6443 check
    server k8s-master2 192.168.1.11:6443 check
    server k8s-master3 192.168.1.12:6443 check
EOF

# Restart and enable HAProxy
sudo systemctl restart haproxy
sudo systemctl enable haproxy
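
HAProxy should now be listening on port 8443 on every master (the backends will report DOWN until the kube-apiservers exist):

sudo ss -lntp | grep 8443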

3. Configure Keepalived

# Configuration on k8s-master1 (state MASTER; adjust "interface" in all three configs to your actual NIC name)
sudo tee /etc/keepalived/keepalived.conf << EOF
! Master1 configuration
global_defs {
    router_id LVS_DEVEL
    script_user root
    enable_script_security
}

vrrp_script chk_haproxy {
    script "/bin/bash -c '[[ \$(pgrep haproxy) ]]'"
    interval 2
    ! negative weight: a HAProxy failure lowers this node's priority enough to trigger VIP failover
    weight -30
    fall 3
    rise 2
}

vrrp_instance VI_K8S {
    state MASTER
    interface eth0
    virtual_router_id 100
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 12345678
    }
    virtual_ipaddress {
        192.168.1.100/24 dev eth0 label eth0:1
    }
    track_script {
        chk_haproxy
    }
}
EOF

# Configuration on k8s-master2 (state BACKUP, priority 90)
sudo tee /etc/keepalived/keepalived.conf << EOF
! Master2 configuration
global_defs {
    router_id LVS_DEVEL
    script_user root
    enable_script_security
}

vrrp_script chk_haproxy {
    script "/bin/bash -c '[[ \$(pgrep haproxy) ]]'"
    interval 2
    weight -30
    fall 3
    rise 2
}

vrrp_instance VI_K8S {
    state BACKUP
    interface eth0
    virtual_router_id 100
    priority 90
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 12345678
    }
    virtual_ipaddress {
        192.168.1.100/24 dev eth0 label eth0:1
    }
    track_script {
        chk_haproxy
    }
}
EOF

# Configuration on k8s-master3 (state BACKUP, priority 80)
sudo tee /etc/keepalived/keepalived.conf << EOF
! Master3 configuration
global_defs {
    router_id LVS_DEVEL
    script_user root
    enable_script_security
}

vrrp_script chk_haproxy {
    script "/bin/bash -c '[[ \$(pgrep haproxy) ]]'"
    interval 2
    weight -30
    fall 3
    rise 2
}

vrrp_instance VI_K8S {
    state BACKUP
    interface eth0
    virtual_router_id 100
    priority 80
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 12345678
    }
    virtual_ipaddress {
        192.168.1.100/24 dev eth0 label eth0:1
    }
    track_script {
        chk_haproxy
    }
}
EOF

# Restart and enable Keepalived
sudo systemctl restart keepalived
sudo systemctl enable keepalived

# Verify the VIP (it should appear on k8s-master1, the highest-priority node)
ip addr show eth0 | grep 192.168.1.100
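
From any other node, confirm that the VIP is reachable and the HAProxy frontend answers on it (nc is available on Ubuntu via the default netcat package):

ping -c 3 192.168.1.100
nc -zv 192.168.1.100 8443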

Initialize the First Master Node

1. Pre-pull images

# Run on k8s-master1
sudo kubeadm config images pull \
  --kubernetes-version v1.34.0 \
  --image-repository registry.aliyuncs.com/google_containers
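
To preview which images will be pulled before doing so (optional):

kubeadm config images list \
  --kubernetes-version v1.34.0 \
  --image-repository registry.aliyuncs.com/google_containers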

2. Create the init configuration file

# Run on k8s-master1
cat > kubeadm-config.yaml << EOF
apiVersion: kubeadm.k8s.io/v1beta4
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.1.10
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///run/containerd/containerd.sock
  imagePullPolicy: IfNotPresent
  taints: []
---
apiVersion: kubeadm.k8s.io/v1beta4
kind: ClusterConfiguration
kubernetesVersion: v1.34.0
controlPlaneEndpoint: "192.168.1.100:8443"
apiServer: {}
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns:
  imageRepository: registry.aliyuncs.com/google_containers
  # imageTag omitted so kubeadm uses the CoreDNS version bundled with v1.34
etcd:
  local:
    dataDir: /var/lib/etcd
    imageRepository: registry.aliyuncs.com/google_containers
    # imageTag omitted so kubeadm uses the etcd version bundled with v1.34
imageRepository: registry.aliyuncs.com/google_containers
networking:
  dnsDomain: cluster.local
  podSubnet: 10.244.0.0/16
  serviceSubnet: 10.96.0.0/12
scheduler: {}
EOF
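
Before the real initialization, a dry run can catch configuration problems without changing the node (optional):

sudo kubeadm init --config kubeadm-config.yaml --dry-run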

3. Initialize the cluster

# Run on k8s-master1
sudo kubeadm init --config kubeadm-config.yaml --upload-certs

# After a successful init, save the join commands that are printed:
# one (with --control-plane) for the remaining master nodes, one for the worker nodes

4. Configure kubectl and install the network plugin

# Configure kubectl
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

# Install the Calico network plugin
# (pin a Calico release that supports your Kubernetes version; v3.28.0 is used here only as an example)
kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.28.0/manifests/calico.yaml

# Wait for the Calico pods to become Ready
kubectl wait --for=condition=Ready pods -l k8s-app=calico-node -n kube-system --timeout=300s
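
At this point the first master should report Ready and the kube-system pods (including CoreDNS) should be Running:

kubectl get nodes
kubectl get pods -n kube-system -o wide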

Join the Remaining Master Nodes

# Run on k8s-master2 and k8s-master3, using the control-plane join command printed by kubeadm init
sudo kubeadm join 192.168.1.100:8443 \
  --token abcdef.1234567890abcdef \
  --discovery-token-ca-cert-hash sha256:xxxxxxxxxx \
  --control-plane \
  --certificate-key xxxxxxxxxx \
  --cri-socket=unix:///run/containerd/containerd.sock

# Configure kubectl on each new master
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Verify Cluster Status

# List the nodes
kubectl get nodes

# Check etcd cluster health (etcdctl inside the etcd pod needs the etcd TLS certificates)
kubectl -n kube-system exec etcd-k8s-master1 -- etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key \
  endpoint health --cluster

Join the Worker Nodes

# Run on each worker node, using the worker join command printed by kubeadm init
sudo kubeadm join 192.168.1.100:8443 \
  --token abcdef.1234567890abcdef \
  --discovery-token-ca-cert-hash sha256:xxxxxxxxxx \
  --cri-socket=unix:///run/containerd/containerd.sock
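
Optionally, label the workers so that `kubectl get nodes` shows a role for them (purely cosmetic):

kubectl label node k8s-node1 node-role.kubernetes.io/worker=
kubectl label node k8s-node2 node-role.kubernetes.io/worker=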

High Availability Verification

1. Test the etcd cluster

# Check etcd health and list the members (same TLS flags as in the previous section)
ETCDCTL="etcdctl --endpoints=https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key"
kubectl -n kube-system exec etcd-k8s-master1 -- $ETCDCTL endpoint health --cluster
kubectl -n kube-system exec etcd-k8s-master1 -- $ETCDCTL member list -w table

2. Test VIP failover

# Find the node currently holding the VIP
ip addr | grep 192.168.1.100

# Stop Keepalived on that node (k8s-master1 in the initial state)
sudo systemctl stop keepalived

# After a few seconds, check on the other masters that the VIP has moved
ip addr | grep 192.168.1.100

# Check that the API server is still reachable through the VIP
# (the /version endpoint is readable without credentials)
curl -k https://192.168.1.100:8443/version

# Restore Keepalived on k8s-master1
sudo systemctl start keepalived
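
It is also worth exercising the chk_haproxy track script: stopping HAProxy on the VIP holder should move the VIP just like stopping Keepalived does.

# On the node currently holding the VIP
sudo systemctl stop haproxy

# From any master: the API should still respond through the VIP once it has moved
kubectl get nodes

# Restore HAProxy
sudo systemctl start haproxy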

Common Operations Commands

Cluster status checks

# Node status
kubectl get nodes -o wide
kubectl top nodes        # requires metrics-server (see below)

# All pods in every namespace
kubectl get pods -A

# Cluster events
kubectl get events --sort-by='.metadata.creationTimestamp'
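
`kubectl top` only works once metrics-server is installed. A minimal sketch; in lab clusters whose kubelets use self-signed certificates, the extra --kubelet-insecure-tls argument is often needed:

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

# Lab / self-signed-certificate clusters only: allow insecure kubelet TLS
kubectl -n kube-system patch deployment metrics-server --type=json \
  -p='[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--kubelet-insecure-tls"}]'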

Certificate management

# Check certificate expiry
sudo kubeadm certs check-expiration

# Renew all certificates
sudo kubeadm certs renew all

# The control-plane static pods (kube-apiserver, kube-controller-manager, kube-scheduler, etcd)
# must be restarted to pick up the renewed certificates; restarting kubelet alone is not enough
sudo systemctl restart kubelet
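
One way to restart the static pods, assuming the default manifest directory /etc/kubernetes/manifests (a sketch; kubelet stops the pods when the manifests disappear and recreates them when they return):

sudo mkdir -p /tmp/k8s-manifests
sudo sh -c 'mv /etc/kubernetes/manifests/*.yaml /tmp/k8s-manifests/'
sleep 30   # give kubelet time to stop the static pods
sudo sh -c 'mv /tmp/k8s-manifests/*.yaml /etc/kubernetes/manifests/'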

Node maintenance

# Mark a node unschedulable
kubectl cordon <node-name>

# Evict the pods from the node
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data

# Allow scheduling again
kubectl uncordon <node-name>

Back up etcd

# Back up etcd (run on a master node; requires the etcdctl client on the node)
sudo mkdir -p /backup
sudo ETCDCTL_API=3 etcdctl snapshot save /backup/etcd-backup.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key
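
To confirm the snapshot is usable (optional; `etcdctl snapshot status` is deprecated in etcd 3.5 in favor of etcdutl, but still works):

sudo ETCDCTL_API=3 etcdctl snapshot status /backup/etcd-backup.db -w table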

Troubleshooting

Q1: A master node cannot join the cluster

How to resolve:

  1. Check network connectivity to the VIP and the existing masters
  2. Check that the token is still valid (kubeadm tokens expire after 24 hours by default)
  3. Check the firewall rules

# Regenerate the worker join command
sudo kubeadm token create --print-join-command
# For control-plane joins, also generate a fresh certificate key
sudo kubeadm init phase upload-certs --upload-certs

Q2: etcd cluster problems

How to resolve:

  1. Check network connectivity between the master nodes
  2. Check that ports 2379 and 2380 are open between the masters
  3. Check the etcd logs

# View the etcd logs
kubectl logs -n kube-system -l component=etcd

Q3: The VIP does not fail over

How to resolve:

  1. Check that the interface name in keepalived.conf matches the actual NIC
  2. Check that the firewall is not blocking VRRP (IP protocol 112)
  3. Check the Keepalived configuration (virtual_router_id and auth_pass must match on all three masters)

# Follow the Keepalived logs
sudo journalctl -u keepalived -f

Summary

You should now have a working 3-master highly available Kubernetes cluster on Ubuntu 22.04 with containerd as the runtime.

Key points:

  • HAProxy + Keepalived provide API server load balancing and VIP failover
  • 3 master nodes keep the control plane available through single-node failures
  • The etcd cluster spans all 3 masters, so data survives the loss of one member
  • Back up etcd regularly
  • Have documented procedures for node maintenance and failure recovery
