Kubernetes 1.34 Installation and Configuration: From Getting Started to Production

Overview

Kubernetes (K8s for short) is currently the most popular container orchestration platform, and version 1.34 brings many new features and improvements. This article walks through installing and configuring Kubernetes 1.34, trying out its new features, and performing day-to-day operations, to help you get up and running with this powerful orchestration system quickly.

Prerequisites

  • At least 3 Linux servers (1 master + 2 workers)
  • At least 2 GB RAM and 2 CPU cores per server
  • A stable network connection
  • Basic familiarity with the Linux command line

Server Layout

Hostname       IP Address      Role
-----------------------------------------
k8s-master     192.168.1.10    Master/Worker
k8s-node1      192.168.1.11    Worker
k8s-node2      192.168.1.12    Worker

System Preparation

1. Set Hostnames

# Master node
hostnamectl set-hostname k8s-master

# Node1
hostnamectl set-hostname k8s-node1

# Node2
hostnamectl set-hostname k8s-node2

2. Configure /etc/hosts

cat >> /etc/hosts << EOF
192.168.1.10 k8s-master
192.168.1.11 k8s-node1
192.168.1.12 k8s-node2
EOF

3. Configure the Firewall

# Disable the firewall (lab environments only)
systemctl stop firewalld
systemctl disable firewalld

# Or open only the required ports (recommended for production)
firewall-cmd --permanent --add-port=6443/tcp
firewall-cmd --permanent --add-port=2379-2380/tcp
firewall-cmd --permanent --add-port=10250-10255/tcp
firewall-cmd --reload

4. Disable SELinux

# Disable temporarily
setenforce 0

# Disable permanently
sed -i 's/SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config

# Reboot for the change to take effect
reboot

5. Disable Swap

# Disable swap temporarily
swapoff -a

# Disable permanently (comment out the swap line)
sed -i '/ swap / s/^/#/' /etc/fstab

6. Load Kernel Modules and sysctl Settings

cat > /etc/modules-load.d/k8s.conf << EOF
overlay
br_netfilter
EOF

# Load the modules immediately
modprobe overlay
modprobe br_netfilter

cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF

# Apply the settings
sysctl --system

7. Install a Container Runtime

# Install Docker and containerd
yum install -y yum-utils
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
yum install docker-ce docker-ce-cli containerd.io -y

# Start Docker
systemctl start docker
systemctl enable docker

# Configure Docker to use systemd as the cgroup driver
cat > /etc/docker/daemon.json << EOF
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2"
}
EOF

# Restart Docker
systemctl restart docker

# Note: since dockershim was removed (v1.24), the kubelet talks to containerd
# directly over CRI. Regenerate containerd's config with the CRI plugin enabled
# and the systemd cgroup driver, then restart it:
containerd config default > /etc/containerd/config.toml
sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
systemctl restart containerd
systemctl enable containerd

Installing the Kubernetes Components

1. Add the Kubernetes Repository

# Note: the legacy packages.cloud.google.com repository is deprecated and does
# not carry 1.34; use the community-owned pkgs.k8s.io repository instead
cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://pkgs.k8s.io/core:/stable:/v1.34/rpm/
enabled=1
gpgcheck=1
gpgkey=https://pkgs.k8s.io/core:/stable:/v1.34/rpm/repodata/repomd.xml.key
exclude=kubelet kubeadm kubectl cri-tools kubernetes-cni
EOF

2. Install the Kubernetes Components

# Install kubeadm, kubelet, and kubectl
yum install -y kubeadm-1.34.0 kubelet-1.34.0 kubectl-1.34.0 --disableexcludes=kubernetes

# Enable kubelet (it will crash-loop until kubeadm init runs; that is expected)
systemctl enable kubelet

# Verify the installation
kubeadm version
kubelet --version

3. Initialize the Master Node

# Run on the master node
kubeadm init \
  --apiserver-advertise-address=192.168.1.10 \
  --image-repository registry.aliyuncs.com/google_containers \
  --kubernetes-version v1.34.0 \
  --service-cidr=10.96.0.0/12 \
  --pod-network-cidr=10.244.0.0/16

# On success, output similar to the following is printed
# [init] Using Kubernetes version: v1.34.0
# [preflight] Running pre-flight checks
# [preflight] Pulling images required for setting up a Kubernetes cluster
# [preflight] This might take a minute or two, depending on your internet speed
# [kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a temporary certificate authority.
# [certificates] Generated front-proxy-ca certificate and key.
# [certificates] Generated front-proxy-client certificate and key.
# [certificates] Generated etcd/ca certificate and key.
# [certificates] Generated etcd/server certificate and key.
# [certificates] Generated etcd/peer certificate and key.
# [certificates] Generated etcd/healthcheck-client certificate and key.
# [certificates] Generated apiserver-etcd-client certificate and key.
# [certificates] Generated apiserver certificate and key.
# [certificates] Generated apiserver-kubelet-client certificate and key.
# [certificates] Generated root CA certificate and key.
# [upload-certs] Uploaded certificates to Secret.
# [mark-control-plane] Marking the node k8s-master as control-plane and adding taints.
# [bootstrap-token] Using token: abcdef.1234567890abcdef
# [bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
# [bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve csr from a Node Bootstrapping token
# [bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
# [kubelet-start] Starting the kubelet
# [kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

# Record the join command printed at the end, for example:
# kubeadm join 192.168.1.10:6443 --token abcdef.1234567890abcdef \
#     --discovery-token-ca-cert-hash sha256:xxxxxx
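
One thing worth checking before running init: the two CIDRs passed above must not overlap each other or the node subnet, or Service and Pod routing breaks in confusing ways. A quick sanity check with Python's ipaddress module (values copied from the command above; 192.168.1.0/24 is this guide's server subnet):

```python
import ipaddress

# Values copied from the kubeadm init command above;
# 192.168.1.0/24 is this guide's server subnet.
service_cidr = ipaddress.ip_network("10.96.0.0/12")
pod_cidr = ipaddress.ip_network("10.244.0.0/16")
node_subnet = ipaddress.ip_network("192.168.1.0/24")

# None of the three ranges may overlap.
nets = [service_cidr, pod_cidr, node_subnet]
for i, a in enumerate(nets):
    for b in nets[i + 1:]:
        assert not a.overlaps(b), f"{a} overlaps {b}"

# Rough capacity: how many Service and Pod IPs the ranges allow.
print(service_cidr.num_addresses, pod_cidr.num_addresses)  # 1048576 65536
```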

4. Configure kubectl

# Copy the admin kubeconfig
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config

# Verify
kubectl get nodes
kubectl get pods -n kube-system

5. Install a Network Plugin (Calico)

# Install the Calico manifest (docs.projectcalico.org is deprecated;
# the manifests now live in the Calico GitHub repository)
kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.28.0/manifests/calico.yaml

# Or install via Helm (operator-based; Calico pods land in calico-system)
helm repo add projectcalico https://docs.tigera.io/calico/charts
helm install calico projectcalico/tigera-operator --version v3.28.0 -n tigera-operator --create-namespace

# Verify (use -n calico-system for the operator-based install)
kubectl get pods -n kube-system | grep calico

6. Join the Worker Nodes

# Run on each worker node (use the command generated during init)
kubeadm join 192.168.1.10:6443 \
  --token abcdef.1234567890abcdef \
  --discovery-token-ca-cert-hash sha256:xxxxxx

# If you lose the token, generate a fresh join command on the master
kubeadm token create --print-join-command

# Verify node status (on the master)
kubectl get nodes -o wide
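
As an aside, bootstrap tokens always follow a fixed format: a 6-character ID, a dot, then a 16-character secret, all lowercase alphanumeric. A small sketch of a validity check (the helper name is mine, not part of any Kubernetes tooling):

```python
import re

# Bootstrap token format: 6-char ID "." 16-char secret, both [a-z0-9].
TOKEN_RE = re.compile(r"^[a-z0-9]{6}\.[a-z0-9]{16}$")

def is_valid_bootstrap_token(token: str) -> bool:
    """Check a kubeadm bootstrap token against the documented format."""
    return TOKEN_RE.fullmatch(token) is not None

print(is_valid_bootstrap_token("abcdef.1234567890abcdef"))  # True
```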

New Features in Kubernetes 1.34

1. Server-Side Field Validation

Kubernetes 1.34 continues to build on server-side field validation (GA since v1.27), which rejects unknown or duplicate fields with detailed error messages.

# Apply with validation (true is equivalent to strict in current kubectl)
kubectl apply -f pod.yaml --validate=true

# Explicit strict validation: fail on unknown fields
kubectl apply -f pod.yaml --validate=strict

2. Kubelet Resource Management Enhancements

Resource management continues to improve in 1.34, including finer-grained CPU affinity control through the kubelet's CPU Manager.

# Example: a Guaranteed-QoS Pod eligible for exclusive CPUs
# (requests must equal limits and the CPU count must be an integer;
# the kubelet's static CPU Manager policy performs the actual pinning)
apiVersion: v1
kind: Pod
metadata:
  name: resource-demo
spec:
  containers:
  - name: demo
    image: nginx
    resources:
      requests:
        memory: "256Mi"
        cpu: "2"
      limits:
        memory: "256Mi"
        cpu: "2"
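
Exclusive CPU assignment only applies to Pods in the Guaranteed QoS class. The classification rules can be sketched in a few lines (a simplified model for illustration, not the real kubelet code):

```python
def qos_class(containers):
    """Simplified model of Pod QoS classification.

    Each container is a dict like:
      {"requests": {"cpu": "2", "memory": "256Mi"},
       "limits":   {"cpu": "2", "memory": "256Mi"}}
    """
    any_set = False
    guaranteed = True
    for c in containers:
        req = c.get("requests", {})
        lim = c.get("limits", {})
        if req or lim:
            any_set = True
        for res in ("cpu", "memory"):
            # Guaranteed: cpu and memory limits set, and requests (which
            # default to the limits when omitted) equal to those limits.
            if res not in lim or req.get(res, lim[res]) != lim[res]:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"

print(qos_class([{"requests": {"cpu": "2", "memory": "256Mi"},
                  "limits":   {"cpu": "2", "memory": "256Mi"}}]))  # Guaranteed
```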

3. Ephemeral Storage Management Enhancements

Management and monitoring of ephemeral storage (ephemeral-storage) have been improved.

apiVersion: v1
kind: Pod
metadata:
  name: storage-demo
spec:
  containers:
  - name: demo
    image: nginx
    resources:
      limits:
        ephemeral-storage: "2Gi"
      requests:
        ephemeral-storage: "1Gi"
    volumeMounts:
    - name: tmp-volume
      mountPath: /tmp
  volumes:
  - name: tmp-volume
    emptyDir:
      sizeLimit: "1Gi"
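
The quantity strings above ("2Gi", "1Gi") use the same resource.Quantity syntax as cpu and memory. A rough sketch of how the common suffixes convert to plain numbers (a subset of the real grammar):

```python
# Suffix multipliers from the Kubernetes resource.Quantity format
# (binary suffixes listed first so "Gi" is tried before "G").
SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "m": 1e-3, "k": 1e3, "M": 1e6, "G": 1e9, "T": 1e12,
}

def parse_quantity(q: str) -> float:
    """Parse a common subset of Kubernetes quantities: '2Gi', '500m', '1G', '256'."""
    for suffix, mult in SUFFIXES.items():
        if q.endswith(suffix):
            return float(q[:-len(suffix)]) * mult
    return float(q)

print(parse_quantity("2Gi"))   # 2147483648.0
print(parse_quantity("500m"))  # 0.5
```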

Common Commands

1. Pod Operations

# Create a Pod
kubectl run nginx --image=nginx

# Note: kubectl run no longer accepts --requests/--limits (removed in newer
# kubectl releases); generate a manifest and add the resources section instead
kubectl run web --image=nginx --dry-run=client -o yaml > web.yaml

# List Pods
kubectl get pods
kubectl get pods -o wide
kubectl get pods -n kube-system

# Describe a Pod
kubectl describe pod nginx

# View Pod logs
kubectl logs nginx
kubectl logs nginx -c container-name
kubectl logs -f nginx

# Delete a Pod
kubectl delete pod nginx
kubectl delete pod nginx --force --grace-period=0

# Open a shell inside a Pod
kubectl exec -it nginx -- /bin/bash

# Scale the replica count (of the owning Deployment)
kubectl scale deployment nginx --replicas=3

2. Deployment Operations

# Create a Deployment
kubectl create deployment nginx --image=nginx

# Expose it as a Service
kubectl expose deployment nginx --port=80 --type=NodePort

# Inspect Deployments
kubectl get deployments
kubectl get deployment nginx -o yaml

# Update the image
kubectl set image deployment/nginx nginx=nginx:1.25

# Watch the rollout
kubectl rollout status deployment/nginx

# Roll back
kubectl rollout undo deployment/nginx
kubectl rollout undo deployment/nginx --to-revision=2

# View rollout history
kubectl rollout history deployment/nginx

# Scale up or down
kubectl scale deployment/nginx --replicas=5

# Pause/resume a rollout
kubectl rollout pause deployment/nginx
kubectl rollout resume deployment/nginx

3. Service Operations

# Expose a Pod as a Service
kubectl expose pod my-pod --port=80 --target-port=8080

# Create a ClusterIP Service
kubectl create service clusterip my-service --tcp=80:8080

# Create a NodePort Service
kubectl create service nodeport my-service --tcp=80:80

# Create a LoadBalancer Service
kubectl create service loadbalancer my-service --tcp=80:80

# List Services
kubectl get services
kubectl get svc

# Describe a Service
kubectl describe service nginx

# Delete a Service
kubectl delete service nginx

4. Namespace Operations

# List Namespaces
kubectl get namespaces
kubectl get ns

# Create a Namespace
kubectl create namespace dev
kubectl create -f namespace.yaml

# Switch the current Namespace
kubectl config set-context --current --namespace=dev

# Delete a Namespace
kubectl delete namespace dev

5. ConfigMaps and Secrets

# Create a ConfigMap
kubectl create configmap app-config --from-literal=key1=value1 --from-literal=key2=value2
kubectl create configmap app-config --from-file=config.properties

# Create a Secret
kubectl create secret generic db-secret --from-literal=username=admin --from-literal=password=123456
kubectl create secret tls tls-secret --cert=tls.crt --key=tls.key

# Consuming a ConfigMap in a Pod
apiVersion: v1
kind: Pod
metadata:
  name: configmap-pod
spec:
  containers:
  - name: demo
    image: nginx
    envFrom:
    - configMapRef:
        name: app-config
    volumeMounts:
    - name: config-volume
      mountPath: /etc/config
  volumes:
  - name: config-volume
    configMap:
      name: app-config

6. Ingress Configuration

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  ingressClassName: nginx
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: api-service
            port:
              number: 80
      - path: /web
        pathType: Prefix
        backend:
          service:
            name: web-service
            port:
              number: 80
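
With pathType: Prefix, matching is done per path element rather than as a raw string prefix, so /api matches /api/v1 but not /apiv1. A small model of that rule (my own sketch of the spec's behavior):

```python
def prefix_match(rule_path: str, request_path: str) -> bool:
    """pathType: Prefix — compare '/'-separated path elements one by one."""
    rule = [seg for seg in rule_path.split("/") if seg]
    request = [seg for seg in request_path.split("/") if seg]
    return request[:len(rule)] == rule

print(prefix_match("/api", "/api/v1"))  # True
print(prefix_match("/api", "/apiv1"))   # False
```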

7. Resource Quotas and Limits

# ResourceQuota example
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
  namespace: dev
spec:
  hard:
    cpu: "10"
    memory: "20Gi"
    pods: "20"
    services: "10"
    replicationcontrollers: "20"
    resourcequotas: "1"
    persistentvolumeclaims: "5"

# LimitRange example
apiVersion: v1
kind: LimitRange
metadata:
  name: limit-range
  namespace: dev
spec:
  limits:
  - default:
      cpu: "500m"
      memory: "256Mi"
    defaultRequest:
      cpu: "200m"
      memory: "128Mi"
    type: Container
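
The LimitRange's default and defaultRequest fill in whatever a container omits at admission time. The effect can be modeled as a simple per-resource merge (a simplified sketch; the real admission controller does more, e.g. min/max enforcement):

```python
def apply_limit_range(container, default, default_request):
    """Model of LimitRange defaulting for one container:
    'default' fills missing limits, 'defaultRequest' fills missing requests."""
    limits = {**default, **container.get("limits", {})}
    requests = {**default_request, **container.get("requests", {})}
    return {"requests": requests, "limits": limits}

default = {"cpu": "500m", "memory": "256Mi"}
default_request = {"cpu": "200m", "memory": "128Mi"}

# A container that sets nothing gets the namespace defaults.
print(apply_limit_range({}, default, default_request))
```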

Day-to-Day Operations

1. Cluster Health Checks

# Check node status
kubectl get nodes
kubectl get nodes -o wide
kubectl top nodes

# Check Pod status
kubectl get pods -A
kubectl get pods -A -o wide
kubectl top pods -A

# Check control-plane component status
# (componentstatuses is deprecated; prefer the healthz endpoint below)
kubectl get componentstatuses

# Check Events
kubectl get events --sort-by='.metadata.creationTimestamp'
kubectl get events -n kube-system --sort-by='.lastTimestamp'

# Check overall cluster health
kubectl cluster-info
kubectl get --raw='/healthz'

2. Troubleshooting

# Inspect a Pod's detailed state
kubectl describe pod <pod-name>

# View Pod logs
kubectl logs <pod-name>
kubectl logs <pod-name> -c <container-name>
kubectl logs <pod-name> --previous

# Check resource usage
kubectl top pods -n <namespace>
kubectl top nodes

# Open a shell in the container for debugging
kubectl exec -it <pod-name> -- /bin/bash

# Forward a local port to the Pod
kubectl port-forward pod/<pod-name> 8080:80

# Proxy API access (e.g. for the Dashboard)
kubectl proxy

# Inspect Endpoints
kubectl get endpoints <service-name>

3. Backup and Restore

# Back up with Velero (backup names must be unique)
velero backup create backup-default --include-namespaces default
velero backup create backup-full --exclude-namespaces kube-system

# List backups
velero backup get

# Restore from a backup
velero restore create --from-backup backup-default

# Back up etcd (for kubeadm clusters running stacked etcd)
ETCDCTL_API=3 etcdctl snapshot save /tmp/etcd-backup.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key

4. Cluster Upgrades

# Upgrade the control-plane packages (yum, matching the install above)
yum install -y kubeadm-1.35.0 kubectl-1.35.0 kubelet-1.35.0 --disableexcludes=kubernetes

# Review the upgrade plan
kubeadm upgrade plan

# Apply the upgrade
kubeadm upgrade apply v1.35.0

# Cordon and drain a node before maintenance
kubectl cordon <node-name>
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data

# Upgrade the kubelet on the node
yum install -y kubelet-1.35.0 --disableexcludes=kubernetes
systemctl restart kubelet

# Put the node back into service
kubectl uncordon <node-name>

5. Monitoring and Logging

# Deploy Prometheus (the old "stable" Helm repo is retired;
# use the prometheus-community charts)
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install prometheus prometheus-community/kube-prometheus-stack -n monitoring --create-namespace

# Deploy Grafana (bundled with kube-prometheus-stack, or standalone)
helm repo add grafana https://grafana.github.io/helm-charts
helm install grafana grafana/grafana -n monitoring

# Deploy the ELK logging stack
helm repo add elastic https://helm.elastic.co
helm install elasticsearch elastic/elasticsearch -n logging --create-namespace
helm install kibana elastic/kibana -n logging
helm install filebeat elastic/filebeat -n logging

# Query raw metrics (requires metrics-server)
kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes
kubectl get --raw /apis/metrics.k8s.io/v1beta1/pods

FAQ

Q1: Pod stuck in Pending

Symptom: a Pod stays in the Pending state after creation.

How to investigate:

  1. Check whether the nodes have enough free resources
  2. Check whether any node is marked unschedulable
  3. Check whether the PVCs are bound correctly
# Find the root cause
kubectl describe pod <pod-name>

# Check node resources
kubectl describe nodes | grep -A 10 "Allocated resources"
kubectl top nodes

# Check node status
kubectl get nodes
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="Ready")].status}{"\n"}{end}'

Q2: Pod stuck in CrashLoopBackOff

Symptom: the Pod keeps restarting.

How to investigate:

  1. Check that the container's startup command is correct
  2. Check the application itself for errors
  3. Check that the resource limits are reasonable
# View logs (including the previous, crashed instance)
kubectl logs <pod-name>
kubectl logs <pod-name> --previous

# Inspect the container configuration
kubectl describe pod <pod-name>

# Check events
kubectl get events --field-selector=involvedObject.name=<pod-name>

Q3: Service not reachable

Symptom: the Service cannot be accessed.

How to investigate:

  1. Check that the Endpoints exist
  2. Check that the backing Pods are running
  3. Check that the Service configuration is correct
# Check Endpoints
kubectl get endpoints <service-name>

# Inspect the Service configuration
kubectl describe service <service-name>

# Test the Service from inside the cluster
kubectl run test --rm -it --image=busybox --restart=Never -- wget -qO- http://<service-name>:<port>

Q4: Network plugin problems

Symptom: Pods cannot reach each other.

How to investigate:

  1. Check that the network plugin is running
  2. Check the node network configuration
  3. Check the firewall rules
# Check the network plugin Pods
kubectl get pods -n kube-system | grep -E "calico|flannel|cilium"

# View the network plugin logs
kubectl logs -n kube-system <plugin-pod-name>

# Inspect the node's network interfaces and routes
ip addr
ip route

# Test Pod-to-Pod connectivity
kubectl exec <pod-name> -- ping <target-pod-ip>

Summary

This article covered installing and configuring Kubernetes 1.34 and the day-to-day operations you will need. Kubernetes is a powerful container orchestration platform; keep exploring its more advanced features as you use it in practice.

Recommended production best practices:

  • Deploy the cluster in a highly available topology
  • Set resource requests and limits everywhere
  • Enable monitoring and log collection
  • Back up cluster data regularly
  • Plan cluster upgrades in advance
  • Enforce RBAC access control
  • Restrict traffic with NetworkPolicy
