Kubernetes 1.34 Installation and Configuration: From Getting Started to Production
Overview
Kubernetes (K8s for short) is currently the most popular container orchestration platform, and version 1.34 brings many new features and improvements. This article walks through installing and configuring Kubernetes 1.34, trying out its new features, and performing day-to-day operations, to help you get up to speed with this powerful orchestration system quickly.
Prerequisites
- At least 3 Linux servers (1 Master + 2 Workers)
- At least 2 GB of RAM and 2 CPU cores per server
- A stable network connection
- Basic familiarity with the Linux command line
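The hardware requirements above can be sanity-checked before installation with a short script (a sketch; the thresholds mirror the list above):

```shell
#!/bin/bash
# Preflight check: compare this machine's CPU and memory with the minimums above.
min_cpu=2
min_mem_kb=$((2 * 1024 * 1024))   # 2 GB expressed in kB

cpu=$(nproc)
mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)

echo "CPU cores: ${cpu} (minimum ${min_cpu})"
echo "Memory: $((mem_kb / 1024)) MB (minimum $((min_mem_kb / 1024)) MB)"

[ "${cpu}" -ge "${min_cpu}" ] || echo "WARN: fewer than ${min_cpu} CPU cores"
[ "${mem_kb}" -ge "${min_mem_kb}" ] || echo "WARN: less than 2 GB of memory"
```

Run it on every node before continuing.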
Server Plan
Hostname     IP Address     Role
-----------------------------------------
k8s-master   192.168.1.10   Master/Worker
k8s-node1    192.168.1.11   Worker
k8s-node2    192.168.1.12   Worker
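To keep the plan in one place, the table can be stored as a small map and expanded into hosts-file lines (a sketch; it prints to stdout so the output can be reviewed before appending to /etc/hosts):

```shell
#!/bin/bash
# Expand the server plan into /etc/hosts-style entries.
declare -A nodes=(
  [k8s-master]="192.168.1.10"
  [k8s-node1]="192.168.1.11"
  [k8s-node2]="192.168.1.12"
)
hosts_block=""
for name in "${!nodes[@]}"; do
  hosts_block+="${nodes[$name]} ${name}"$'\n'
done
printf '%s' "$hosts_block"
```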
System Preparation
1. Set Hostnames
# Master node
hostnamectl set-hostname k8s-master
# Node1
hostnamectl set-hostname k8s-node1
# Node2
hostnamectl set-hostname k8s-node2
2. Configure the hosts File
cat >> /etc/hosts << EOF
192.168.1.10 k8s-master
192.168.1.11 k8s-node1
192.168.1.12 k8s-node2
EOF
3. Disable or Configure the Firewall
# Disable the firewall entirely (simplest for a lab setup)
systemctl stop firewalld
systemctl disable firewalld
# Or open the required ports instead (recommended for production)
firewall-cmd --permanent --add-port=6443/tcp
firewall-cmd --permanent --add-port=2379-2380/tcp
firewall-cmd --permanent --add-port=10250-10255/tcp
firewall-cmd --reload
4. Disable SELinux
# Disable temporarily (takes effect immediately)
setenforce 0
# Disable permanently
sed -i 's/SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
# Reboot for the permanent change to take effect
reboot
5. Disable Swap
# Turn off swap temporarily
swapoff -a
# Turn it off permanently (comment out the swap line in fstab)
sed -i '/ swap / s/^/#/' /etc/fstab
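The sed edit above can be rehearsed on a throwaway copy first (a sketch with sample content; the real command targets /etc/fstab):

```shell
#!/bin/bash
# Dry-run the swap-commenting sed against a sample fstab copy.
sample=$(mktemp)
cat > "$sample" << 'EOF'
/dev/mapper/cl-root /    xfs  defaults 0 0
/dev/mapper/cl-swap swap swap defaults 0 0
EOF
# Same expression as above: comment out any line containing " swap "
sed -i '/ swap / s/^/#/' "$sample"
cat "$sample"
```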
6. Load Kernel Modules
cat > /etc/modules-load.d/k8s.conf << EOF
overlay
br_netfilter
EOF
# Load the modules immediately
modprobe overlay
modprobe br_netfilter
# Configure the required sysctl parameters
cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
# Apply the settings
sysctl --system
7. Install a Container Runtime
# Install Docker (containerd is installed alongside it)
yum install -y yum-utils
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
yum install docker-ce docker-ce-cli containerd.io -y
# Start Docker
systemctl start docker
systemctl enable docker
# Configure Docker to use systemd as the cgroup driver
cat > /etc/docker/daemon.json << EOF
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2"
}
EOF
# Restart Docker
systemctl restart docker
# Note: since Kubernetes 1.24 the kubelet no longer talks to Docker directly;
# kubeadm will use containerd as the CRI runtime. Enable containerd's CRI
# plugin and the systemd cgroup driver:
containerd config default > /etc/containerd/config.toml
sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
systemctl restart containerd
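A typo in daemon.json will keep Docker from starting, so it is worth validating the JSON before the restart (a sketch that checks a copy of the file with python3):

```shell
#!/bin/bash
# Validate daemon.json contents before dropping them into /etc/docker.
cfg=$(mktemp)
cat > "$cfg" << 'EOF'
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2"
}
EOF
if python3 -m json.tool "$cfg" > /dev/null 2>&1; then
  echo "daemon.json: valid JSON"
else
  echo "daemon.json: INVALID JSON" >&2
fi
```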
Installing the Kubernetes Components
1. Add the Kubernetes Repository
# The legacy packages.cloud.google.com repository is deprecated and does not
# carry 1.34; use the community-owned pkgs.k8s.io repository instead
cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://pkgs.k8s.io/core:/stable:/v1.34/rpm/
enabled=1
gpgcheck=1
gpgkey=https://pkgs.k8s.io/core:/stable:/v1.34/rpm/repodata/repomd.xml.key
exclude=kubelet kubeadm kubectl cri-tools kubernetes-cni
EOF
2. Install kubeadm, kubelet, and kubectl
# Install kubeadm, kubelet, and kubectl
yum install -y kubeadm-1.34.0 kubelet-1.34.0 kubectl-1.34.0 --disableexcludes=kubernetes
# Enable kubelet (it will crash-loop until kubeadm init runs; that is expected)
systemctl enable kubelet
systemctl start kubelet
# Verify the installation
kubeadm version
kubelet --version
3. Initialize the Master Node
# Run on the Master node
kubeadm init --apiserver-advertise-address=192.168.1.10 --image-repository registry.aliyuncs.com/google_containers --kubernetes-version v1.34.0 --service-cidr=10.96.0.0/12 --pod-network-cidr=10.244.0.0/16
# On success, output similar to the following is printed
# [init] Using Kubernetes version: v1.34.0
# [preflight] Running pre-flight checks
# [preflight] Pulling images required for setting up a Kubernetes cluster
# [preflight] This might take a minute or two, depending on your internet speed
# [kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a temporary certificate authority.
# [certificates] Generated front-proxy-ca certificate and key.
# [certificates] Generated front-proxy-client certificate and key.
# [certificates] Generated etcd/ca certificate and key.
# [certificates] Generated etcd/server certificate and key.
# [certificates] Generated etcd/peer certificate and key.
# [certificates] Generated etcd/healthcheck-client certificate and key.
# [certificates] Generated apiserver-etcd-client certificate and key.
# [certificates] Generated apiserver certificate and key.
# [certificates] Generated apiserver-kubelet-client certificate and key.
# [certificates] Generated root CA certificate and key.
# [upload-certs] Uploaded certificates to Secret.
# [mark-control-plane] Marking the node k8s-master as control-plane and adding taints.
# [bootstrap-token] Using token: abcdef.1234567890abcdef
# [bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
# [bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve csr from a Node Bootstrapping token
# [bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
# [kubelet-start] Starting the kubelet
# [kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
# Note the join command printed at the end, for example:
# kubeadm join 192.168.1.10:6443 --token abcdef.1234567890abcdef --discovery-token-ca-cert-hash sha256:xxxxxx
4. Configure kubectl
# Set up kubectl for the current user
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
# Verify the configuration
kubectl get nodes
kubectl get pods -n kube-system
5. Install a Network Plugin (Calico)
# Install the Calico network plugin (the old docs.projectcalico.org manifest URL is stale)
kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.28.0/manifests/calico.yaml
# Or install with Helm via the Tigera operator
helm repo add projectcalico https://docs.tigera.io/calico/charts
helm install calico projectcalico/tigera-operator --version v3.28.0 -n tigera-operator --create-namespace
# Verify the installation (operator-managed pods run in calico-system instead)
kubectl get pods -n kube-system | grep calico
6. Join the Worker Nodes
# Run on each Worker node (use the command generated during init)
kubeadm join 192.168.1.10:6443 --token abcdef.1234567890abcdef --discovery-token-ca-cert-hash sha256:xxxxxx
# If you lose the token, generate a new join command
kubeadm token create --print-join-command
# Verify node status (on the Master)
kubectl get nodes -o wide
New Features in Kubernetes 1.34
1. Server-Side Field Validation
Kubernetes 1.34 strengthens server-side field validation, producing more detailed error messages for invalid manifests.
# Validate on the server while applying
kubectl apply -f pod.yaml --validate=true
# Explicit strict mode: unknown or duplicate fields are rejected outright
kubectl apply -f pod.yaml --validate=strict
2. Kubelet Resource Management Enhancements
Version 1.34 refines resource management, including finer-grained CPU affinity control through the kubelet CPU Manager.
# Pod resource limits example
apiVersion: v1
kind: Pod
metadata:
  name: resource-demo
spec:
  containers:
  - name: demo
    image: nginx
    resources:
      requests:
        memory: "128Mi"
        cpu: "500m"
      limits:
        memory: "256Mi"
        cpu: "1000m"
# CPU pinning requires the kubelet's static CPU Manager policy and a
# Guaranteed QoS Pod: integer CPU values with requests equal to limits
    resources:
      requests:
        cpu: "2"
        memory: "256Mi"
      limits:
        cpu: "2"
        memory: "256Mi"
3. Ephemeral Storage Management Enhancements
Management and monitoring of ephemeral storage (ephemeral-storage) is improved.
apiVersion: v1
kind: Pod
metadata:
  name: storage-demo
spec:
  containers:
  - name: demo
    image: nginx
    resources:
      limits:
        ephemeral-storage: "2Gi"
      requests:
        ephemeral-storage: "1Gi"
    volumeMounts:
    - name: tmp-volume
      mountPath: /tmp
  volumes:
  - name: tmp-volume
    emptyDir:
      sizeLimit: "1Gi"
Common Commands
1. Pod Operations
# Create a Pod
kubectl run nginx --image=nginx
# Create a workload with resource limits (the old --requests/--limits flags
# were removed from kubectl run; set resources on a Deployment instead)
kubectl create deployment web --image=nginx
kubectl set resources deployment web --requests=cpu=500m,memory=256Mi --limits=cpu=1,memory=512Mi
# List Pods
kubectl get pods
kubectl get pods -o wide
kubectl get pods -n kube-system
# Describe a Pod
kubectl describe pod nginx
# View Pod logs
kubectl logs nginx
kubectl logs nginx -c container-name
kubectl logs -f nginx
# Delete a Pod
kubectl delete pod nginx
kubectl delete pod nginx --force --grace-period=0
# Open a shell inside a Pod
kubectl exec -it nginx -- /bin/bash
# Scale a Deployment's replica count
kubectl scale deployment nginx --replicas=3
2. Deployment Operations
# Create a Deployment
kubectl create deployment nginx --image=nginx
# Expose it as a Service
kubectl expose deployment nginx --port=80 --type=NodePort
# Inspect Deployments
kubectl get deployments
kubectl get deployment nginx -o yaml
# Update the image
kubectl set image deployment/nginx nginx=nginx:1.25
# Watch the rollout
kubectl rollout status deployment/nginx
# Roll back
kubectl rollout undo deployment/nginx
kubectl rollout undo deployment/nginx --to-revision=2
# View rollout history
kubectl rollout history deployment/nginx
# Scale up or down
kubectl scale deployment/nginx --replicas=5
# Pause/resume a rollout
kubectl rollout pause deployment/nginx
kubectl rollout resume deployment/nginx
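The imperative commands above have a declarative equivalent, which is the form usually kept in version control; a sketch combining the create, scale, and set-image steps into one manifest:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 5
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.25
        ports:
        - containerPort: 80
```

Apply it with kubectl apply -f deployment.yaml; later edits to the file are rolled out the same way.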
3. Service Operations
# Expose a Pod as a Service
kubectl expose pod my-pod --port=80 --target-port=8080
# Create a ClusterIP Service
kubectl create service clusterip my-service --tcp=80:8080
# Create a NodePort Service
kubectl create service nodeport my-service --tcp=80:80
# Create a LoadBalancer Service
kubectl create service loadbalancer my-service --tcp=80:80
# List Services
kubectl get services
kubectl get svc
# Describe a Service
kubectl describe service nginx
# Delete a Service
kubectl delete service nginx
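The NodePort variant created by kubectl expose above corresponds to a manifest like this (a sketch; the app: nginx selector and nodePort 30080 are illustrative, and nodePort must fall within 30000-32767):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  type: NodePort
  selector:
    app: nginx
  ports:
  - port: 80
    targetPort: 80
    nodePort: 30080
```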
4. Namespace Operations
# List Namespaces
kubectl get namespaces
kubectl get ns
# Create a Namespace
kubectl create namespace dev
kubectl create -f namespace.yaml
# Switch the current Namespace
kubectl config set-context --current --namespace=dev
# Delete a Namespace
kubectl delete namespace dev
5. ConfigMaps and Secrets
# Create a ConfigMap
kubectl create configmap app-config --from-literal=key1=value1 --from-literal=key2=value2
kubectl create configmap app-config --from-file=config.properties
# Create a Secret
kubectl create secret generic db-secret --from-literal=username=admin --from-literal=password=123456
kubectl create secret tls tls-secret --cert=tls.crt --key=tls.key
# Using a ConfigMap in a Pod
apiVersion: v1
kind: Pod
metadata:
  name: configmap-pod
spec:
  containers:
  - name: demo
    image: nginx
    envFrom:
    - configMapRef:
        name: app-config
    volumeMounts:
    - name: config-volume
      mountPath: /etc/config
  volumes:
  - name: config-volume
    configMap:
      name: app-config
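Secrets are consumed much like ConfigMaps; a sketch that injects the db-secret created above as environment variables:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: secret-pod
spec:
  containers:
  - name: demo
    image: nginx
    env:
    - name: DB_USERNAME
      valueFrom:
        secretKeyRef:
          name: db-secret
          key: username
    - name: DB_PASSWORD
      valueFrom:
        secretKeyRef:
          name: db-secret
          key: password
```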
6. Ingress Configuration
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  ingressClassName: nginx
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: api-service
            port:
              number: 80
      - path: /web
        pathType: Prefix
        backend:
          service:
            name: web-service
            port:
              number: 80
7. Resource Quotas and Limits
# Create a ResourceQuota
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
  namespace: dev
spec:
  hard:
    cpu: "10"
    memory: "20Gi"
    pods: "20"
    services: "10"
    replicationcontrollers: "20"
    resourcequotas: "1"
    persistentvolumeclaims: "5"
# Create a LimitRange
apiVersion: v1
kind: LimitRange
metadata:
  name: limit-range
  namespace: dev
spec:
  limits:
  - default:
      cpu: "500m"
      memory: "256Mi"
    defaultRequest:
      cpu: "200m"
      memory: "128Mi"
    type: Container
Day-to-Day Operations
1. Cluster Health Checks
# Check node status
kubectl get nodes
kubectl get nodes -o wide
kubectl top nodes
# Check Pod status
kubectl get pods -A
kubectl get pods -A -o wide
kubectl top pods -A
# Check control-plane component status (deprecated since v1.19; may show no data)
kubectl get componentstatuses
# Check Events
kubectl get events --sort-by='.metadata.creationTimestamp'
kubectl get events -n kube-system --sort-by='.lastTimestamp'
# Check overall cluster health
kubectl cluster-info
kubectl get --raw='/healthz'
2. Troubleshooting
# Describe a Pod in detail
kubectl describe pod <pod-name>
# View Pod logs
kubectl logs <pod-name>
kubectl logs <pod-name> -c <container-name>
kubectl logs <pod-name> --previous
# Check resource usage
kubectl top pods -n <namespace>
kubectl top nodes
# Open a shell in a container for debugging
kubectl exec -it <pod-name> -- /bin/bash
# Port forwarding
kubectl port-forward pod/<pod-name> 8080:80
# Proxy access to the Dashboard
kubectl proxy
# Check Endpoints
kubectl get endpoints
3. Backup and Restore
# Back up with Velero (backup names must be unique)
velero backup create backup-default --include-namespaces default
velero backup create backup-all --exclude-namespaces kube-system
# List backups
velero backup get
# Restore from a backup
velero restore create --from-backup backup-default
# Back up etcd (for kubeadm-managed clusters)
ETCDCTL_API=3 etcdctl snapshot save /tmp/etcd-backup.db --endpoints=https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key
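For scheduled backups, the etcdctl command above can be wrapped in a script that timestamps each snapshot. The sketch below only assembles and prints the command so it can be reviewed before wiring it into cron; the PKI paths assume a kubeadm cluster:

```shell
#!/bin/bash
# Build (and here only print) a timestamped etcd snapshot command.
backup_dir=/var/backups/etcd
stamp=$(date +%Y%m%d-%H%M%S)
cmd="ETCDCTL_API=3 etcdctl snapshot save ${backup_dir}/etcd-${stamp}.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key"
echo "$cmd"
# To actually take the snapshot, run: eval "$cmd"
```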
4. Cluster Upgrades
# On the Master node: upgrade kubeadm first
yum install -y kubeadm-1.35.0 --disableexcludes=kubernetes
# Check the upgrade plan
kubeadm upgrade plan
# Apply the upgrade
kubeadm upgrade apply v1.35.0
# Cordon and drain a node before maintenance
kubectl cordon <node-name>
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data
# Upgrade kubelet and kubectl
yum install -y kubelet-1.35.0 kubectl-1.35.0 --disableexcludes=kubernetes
systemctl daemon-reload
systemctl restart kubelet
# Put the node back into service
kubectl uncordon <node-name>
5. Monitoring and Logging
# Deploy Prometheus (the old "stable" Helm repo is deprecated)
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install prometheus prometheus-community/kube-prometheus-stack -n monitoring --create-namespace
# Deploy Grafana
helm repo add grafana https://grafana.github.io/helm-charts
helm install grafana grafana/grafana -n monitoring
# Deploy the ELK logging stack
helm repo add elastic https://helm.elastic.co
helm install elasticsearch elastic/elasticsearch -n logging --create-namespace
helm install kibana elastic/kibana -n logging
helm install filebeat elastic/filebeat -n logging
# Query metrics
kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes
kubectl get --raw /apis/metrics.k8s.io/v1beta1/pods
Frequently Asked Questions
Q1: Pod stuck in Pending
Symptom: a Pod stays in the Pending state after creation.
Resolution:
- Check whether the nodes have enough free resources
- Check whether any node is marked unschedulable
- Check whether the PVC (if any) is bound
# Find the detailed reason
kubectl describe pod <pod-name>
# Check node resources
kubectl describe nodes | grep -A 10 "Allocated resources"
kubectl top nodes
# Check node status
kubectl get nodes
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.conditions[?(@.type=="Ready")].status}{"\n"}{end}'
Q2: Pod stuck in CrashLoopBackOff
Symptom: the Pod restarts repeatedly.
Resolution:
- Check that the container start command is correct
- Check the application itself for errors
- Check that the resource limits are reasonable
# View logs
kubectl logs <pod-name>
kubectl logs <pod-name> --previous
# Inspect the container configuration
kubectl describe pod <pod-name>
# Check events
kubectl get events --field-selector involvedObject.name=<pod-name>
Q3: Service unreachable
Symptom: a Service cannot be reached.
Resolution:
- Check that Endpoints exist for the Service
- Check that the backing Pods are running
- Check that the Service configuration is correct
# Check Endpoints
kubectl get endpoints <service-name>
# Inspect the Service configuration
kubectl describe service <service-name>
# Test the Service from inside the cluster
kubectl run test --rm -it --image=busybox --restart=Never -- wget -qO- http://<service-name>:<port>
Q4: Network plugin problems
Symptom: Pods cannot communicate with each other.
Resolution:
- Check that the network plugin Pods are running
- Check the node network configuration
- Check the firewall rules
# Check network plugin Pods
kubectl get pods -n kube-system | grep -E "calico|flannel|cilium"
# View network plugin logs
kubectl logs -n kube-system <plugin-pod-name>
# Check node network interfaces
ip addr
ip route
# Test Pod-to-Pod connectivity
kubectl exec <pod-a> -- ping <pod-b-ip>
Summary
This article covered installing and configuring Kubernetes 1.34 and the day-to-day operations you will need. Kubernetes is a powerful container orchestration platform; keep exploring its more advanced features as you use it.
Recommended production best practices:
- Deploy the cluster in a highly available topology
- Set resource requests and limits on workloads
- Enable monitoring and log collection
- Back up cluster data regularly
- Maintain a cluster upgrade plan
- Enforce RBAC access control
- Restrict network access with NetworkPolicy
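As a concrete example of the last practice, a default-deny ingress policy for the dev namespace looks like this (a sketch; it takes effect only with a plugin that enforces NetworkPolicy, such as Calico):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: dev
spec:
  podSelector: {}
  policyTypes:
  - Ingress
```

Pods in dev then accept traffic only from sources explicitly allowed by additional policies.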
References
- Kubernetes official documentation: https://kubernetes.io/docs/home/
- Kubernetes GitHub: https://github.com/kubernetes/kubernetes
- kubeadm documentation: https://kubernetes.io/docs/reference/setup-tools/kubeadm/
- kubectl documentation: https://kubernetes.io/docs/reference/kubectl/