A Complete Guide to Deploying OpenStack in Production: From Planning to Go-Live
categories: - OpenStack Operations tags: - OpenStack - Production - Deployment Guide - High Availability
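1. Project Planning
1.1 Requirements Analysis
Before deploying, pin down the following requirements.
Business requirements:
- Number of users and expected concurrency
- Number of virtual machine instances
- Storage capacity
- Network bandwidth
- SLA targets (availability, response time)
Technical requirements:
- OpenStack release
- Deployment tool
- Integration needs (containers, bare metal)
- Multi-region needs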
1.2 Sizing
| Scale | Compute nodes | VMs | Storage capacity | Network bandwidth |
|---|---|---|---|---|
| Small | 3-5 | <100 | 10TB | 1Gbps |
| Medium | 5-20 | 100-1000 | 50TB | 10Gbps |
| Large | 20-100 | 1000-10000 | 500TB | 40Gbps |
| Very large | 100+ | 10000+ | 1PB+ | 100Gbps |
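The tiers in the table can be turned into a small lookup, which is handy in planning scripts. This is only a sketch: the function name `classify_scale` is illustrative, and because the table's ranges share their end points (5, 20, 100), the sketch assumes a boundary value belongs to the smaller tier.

```shell
#!/usr/bin/env bash
# Map a planned compute-node count to a tier from the sizing table.
# Assumption: boundary values (5, 20, 100) fall into the smaller tier.
classify_scale() {
  local nodes=$1
  if [ "$nodes" -le 5 ]; then
    echo "small"
  elif [ "$nodes" -le 20 ]; then
    echo "medium"
  elif [ "$nodes" -le 100 ]; then
    echo "large"
  else
    echo "very-large"
  fi
}

classify_scale 12
```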
1.3 Hardware Selection
Controller node specifications:
| Component | Minimum | Recommended |
|---|---|---|
| CPU | 8 cores | 16 cores |
| Memory | 32GB | 64GB |
| System disk | 200GB SSD | 500GB NVMe |
| Data disk | None | 1TB SSD |
| Network | 2x 1Gbps | 4x 1Gbps + 2x 10Gbps |
Compute node specifications:
| Component | Minimum | Recommended |
|---|---|---|
| CPU | 8 cores | 32 cores |
| Memory | 32GB | 128GB |
| Local storage | 500GB | 4TB NVMe |
| Network | 2x 1Gbps | 2x 10Gbps |
Storage node specifications:
| Component | Minimum | Recommended |
|---|---|---|
| CPU | 8 cores | 16 cores |
| Memory | 16GB | 32GB |
| Data disks | 10TB HDD | 40TB HDD |
| Network | 1Gbps | 10Gbps |
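To connect the compute node specification to the VM counts in section 1.2, a rough capacity estimate helps. The sketch below is a back-of-the-envelope calculation, not a Nova feature: the function name, the overcommit ratios, and the reserved host memory are all assumptions chosen for illustration, and real capacity also depends on the scheduler configuration.

```shell
#!/usr/bin/env bash
# Estimate how many VMs of one flavor fit on identical compute nodes.
# All ratios and the reserved-memory figure are illustrative assumptions.
# Usage: estimate_vm_capacity NODES HOST_CORES HOST_RAM_GB VM_VCPUS VM_RAM_GB \
#                             CPU_RATIO RAM_RATIO_PCT RESERVED_RAM_GB
estimate_vm_capacity() {
  local nodes=$1 cores=$2 ram_gb=$3 vm_vcpus=$4 vm_ram_gb=$5
  local cpu_ratio=$6 ram_ratio_pct=$7 reserved_gb=$8

  # VMs per node as limited by vCPUs (with CPU overcommit)
  local by_cpu=$(( cores * cpu_ratio / vm_vcpus ))
  # VMs per node as limited by RAM (ratio in percent, 100 = no overcommit)
  local by_ram=$(( (ram_gb - reserved_gb) * ram_ratio_pct / 100 / vm_ram_gb ))

  # The tighter of the two limits decides the per-node count
  local per_node=$(( by_cpu < by_ram ? by_cpu : by_ram ))
  echo $(( per_node * nodes ))
}

# 20 recommended-spec compute nodes (32 cores, 128 GB), 2 vCPU / 4 GB VMs,
# 4:1 CPU overcommit, no RAM overcommit, 8 GB reserved per host
estimate_vm_capacity 20 32 128 2 4 4 100 8
```

With these inputs memory is the limiting resource (30 VMs per node), which is why the recommended compute specification puts more weight on RAM than on cores.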
2. Environment Preparation
2.1 System Preparation
# 1. Install Ubuntu Server 22.04
# Download the ISO and create a bootable USB drive
# 2. Basic system configuration
# Set the hostname (use the matching name on each node)
hostnamectl set-hostname controller01
# Add the cluster nodes to /etc/hosts
cat >> /etc/hosts << 'EOF'
10.0.0.11 controller01
10.0.0.12 controller02
10.0.0.13 controller03
10.0.0.21 compute01
10.0.0.22 compute02
10.0.0.31 storage01
EOF
# 3. Configure NTP time synchronization
timedatectl set-timezone Asia/Shanghai
apt install -y chrony
# Note: this replaces the packaged chrony.conf
cat > /etc/chrony/chrony.conf << 'EOF'
server 0.pool.ntp.org iburst
server 1.pool.ntp.org iburst
server 2.pool.ntp.org iburst
allow 10.0.0.0/24
EOF
systemctl restart chrony
# 4. Configure networking (adjust the addresses on each node)
# gateway4 is deprecated in the netplan shipped with Ubuntu 22.04, so a default route is used instead
cat > /etc/netplan/01-netcfg.yaml << 'EOF'
network:
  version: 2
  renderer: networkd
  ethernets:
    ens3:
      addresses:
        - 10.0.0.11/24
      routes:
        - to: default
          via: 10.0.0.1
      nameservers:
        addresses:
          - 8.8.8.8
          - 8.8.4.4
    ens4:
      addresses:
        - 10.0.1.11/24
      optional: true
EOF
netplan apply
# 5. Update the system
apt update && apt upgrade -y
# 6. Install basic tools
apt install -y curl wget git vim htop iftop iotop sysstat
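2.2 Configure APT Sources
# Ubuntu 22.04 (jammy) already ships OpenStack Yoga in its main archive, so no extra repository is needed there.
# The cloud-archive:yoga pocket is intended for Ubuntu 20.04 (focal) hosts:
apt install -y software-properties-common
add-apt-repository -y cloud-archive:yoga
apt update
# Alternatively, use the Alibaba Cloud mirror (confirm that this path exists on the mirror before relying on it)
cat > /etc/apt/sources.list.d/openstack-yoga.list << 'EOF'
deb http://mirrors.aliyun.com/openstack/yoga/ubuntu jammy main
deb http://mirrors.aliyun.com/openstack/yoga/ubuntu jammy-updates main
EOF
apt update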
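2.3 Configure the Database
# Install MariaDB with Galera (the MariaDB packaged for Ubuntu 22.04 uses the galera-4 provider)
apt install -y mariadb-server mariadb-client galera-4
# Configure the Galera cluster (set wsrep_node_name and wsrep_node_address per node)
cat > /etc/mysql/mariadb.conf.d/99-galera.cnf << 'EOF'
[mysqld]
binlog_format = ROW
default-storage-engine = InnoDB
innodb_autoinc_lock_mode = 2
bind-address = 0.0.0.0
[galera]
wsrep_on = ON
wsrep_provider = /usr/lib/galera/libgalera_smm.so
wsrep_cluster_name = "openstack-galera"
wsrep_cluster_address = "gcomm://controller01,controller02,controller03"
wsrep_node_name = "controller01"
wsrep_node_address = "10.0.0.11"
wsrep_slave_threads = 4
EOF
# Bootstrap the cluster on the first node only (the package starts MariaDB on install, so stop it first)
systemctl stop mariadb
galera_new_cluster
# On the other nodes, restart MariaDB so that it joins the cluster
systemctl restart mariadb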
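The three-controller layout follows from how Galera keeps quorum: the cluster stays writable only while a strict majority of nodes can see each other. A minimal sketch of that arithmetic, assuming every node carries equal weight (the helper names are illustrative):

```shell
#!/usr/bin/env bash
# Majority quorum for a cluster of N equally weighted nodes.
quorum_size() {
  echo $(( $1 / 2 + 1 ))
}

# Number of nodes that can fail while the rest still form a majority.
tolerated_failures() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 2 3 5; do
  echo "$n nodes: quorum $(quorum_size "$n"), tolerates $(tolerated_failures "$n") failure(s)"
done
```

Two nodes tolerate no failure at all, which is why the guide starts at three controllers and why an odd node count is the usual choice.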
2.4 Configure the Message Queue
# Install RabbitMQ
apt install -y rabbitmq-server
# Create the OpenStack user (replace RABBITMQ_PASS with a real password)
rabbitmqctl add_user openstack RABBITMQ_PASS
rabbitmqctl set_permissions -p / openstack ".*" ".*" ".*"
# Enable the management plugin
rabbitmq-plugins enable rabbitmq_management
# Configure clustering
# All nodes must share the same Erlang cookie (/var/lib/rabbitmq/.erlang.cookie)
# Add the following lines to /etc/rabbitmq/rabbitmq.conf:
cluster_formation.peer_discovery_backend = classic_config
cluster_formation.classic_config.nodes.1 = rabbit@controller01
cluster_formation.classic_config.nodes.2 = rabbit@controller02
cluster_formation.classic_config.nodes.3 = rabbit@controller03
systemctl restart rabbitmq-server
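2.5 Configure Memcached
# Install Memcached
apt install -y memcached libmemcached-tools
# Configure Memcached
cat > /etc/memcached.conf << 'EOF'
# Memory limit in MB
-m 2048
# TCP port
-p 11211
# Run as the unprivileged memcache user rather than root
-u memcache
# Listen address (access is restricted by the firewall rule below)
-l 0.0.0.0
# Maximum simultaneous connections
-c 10240
# PID file (uppercase -P; lowercase -p sets the port)
-P /var/run/memcached/memcached.pid
EOF
systemctl restart memcached
# Configure the firewall (on all controller nodes)
ufw allow from 10.0.0.0/24 to any port 11211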
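3. Deploying with Kolla-Ansible
Kolla-Ansible runs MariaDB, RabbitMQ, and Memcached in its own containers. Host-level services from sections 2.3 to 2.5 that stay running on the same nodes occupy the same ports and cause the prechecks step to fail, so stop and disable them before deploying this way.
3.1 Install Kolla-Ansible
# Install dependencies
apt install -y python3-pip python3-dev libffi-dev libssl-dev
# Install Ansible (the Yoga series of Kolla-Ansible supports Ansible 4 and 5)
pip3 install 'ansible>=4,<6'
# Install the Kolla-Ansible series that matches Yoga (14.x; 15.x targets Zed)
pip3 install 'kolla-ansible>=14,<15'
# Copy the sample configuration files
sudo mkdir -p /etc/kolla
sudo chown $USER:$USER /etc/kolla
cp -r /usr/local/share/kolla-ansible/etc_examples/kolla/* /etc/kolla/
cp /usr/local/share/kolla-ansible/ansible/inventory/multinode /home/$USER/
# Install the Ansible Galaxy dependencies
kolla-ansible install-deps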
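3.2 Configure globals.yml
# /etc/kolla/globals.yml
---
kolla_base_distro: "ubuntu"
kolla_install_type: "source"
openstack_release: "yoga"
# An unused address in the network_interface subnet; keepalived assigns it to the active controller
kolla_internal_vip_address: "10.0.0.100"
kolla_external_vip_address: "10.0.0.100"
network_interface: "ens3"
# Must be a dedicated interface with no IP address configured on it
neutron_external_interface: "ens4"
enable_haproxy: "yes"
# Enabled services
enable_horizon: "yes"
enable_glance: "yes"
enable_nova: "yes"
enable_neutron: "yes"
enable_cinder: "yes"
enable_keystone: "yes"
enable_heat: "yes"
enable_tacker: "no"
enable_magnum: "no"
enable_octavia: "no"
# Storage backend (the LVM backend needs a volume group named cinder-volumes on each storage node)
enable_cinder_backend_lvm: "yes"
enable_cinder_backend_ceph: "no"
# Container image source (Kolla images are published on quay.io under openstack.kolla; verify for your release)
docker_registry: "quay.io"
docker_namespace: "openstack.kolla"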
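Kolla-Ansible expects `kolla_internal_vip_address` to be an unused address inside the subnet of `network_interface`. A quick sanity check before deploying can catch a mistyped VIP. The sketch below assumes the management subnet is 10.0.0.0/24 as in section 2.1; the helper names `ip_to_int` and `ip_in_cidr` are illustrative, and the check does not verify that the address is actually unused.

```shell
#!/usr/bin/env bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<EOF
$1
EOF
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Return 0 if address $1 lies inside CIDR block $2 (for example 10.0.0.0/24).
ip_in_cidr() {
  local ip net bits mask
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  # Build the netmask from the prefix length, kept to 32 bits
  mask=$(( (4294967295 << (32 - bits)) & 4294967295 ))
  [ $(( ip & mask )) -eq $(( net & mask )) ]
}

if ip_in_cidr 10.0.0.100 10.0.0.0/24; then
  echo "VIP is inside the management subnet"
fi
```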
3.3 Configure Passwords
# Generate random values for every empty entry in /etc/kolla/passwords.yml
kolla-genpwd
# To set specific values, edit these keys in /etc/kolla/passwords.yml.
# Do not overwrite the file: it contains many other keys that the deployment needs.
keystone_admin_password: "YOUR_ADMIN_PASSWORD"
database_password: "YOUR_DB_PASSWORD"
rabbitmq_password: "YOUR_RABBITMQ_PASSWORD"
memcache_secret_key: "YOUR_MEMCACHE_KEY"
haproxy_password: "YOUR_HAPROXY_PASSWORD"
docker_registry_password: ""
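3.4 Configure the Inventory
# /home/$USER/multinode
# Edit only the top-level groups below; keep the remaining groups of the sample inventory unchanged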
[control]
controller01 ansible_host=10.0.0.11 ansible_user=deploy
controller02 ansible_host=10.0.0.12 ansible_user=deploy
controller03 ansible_host=10.0.0.13 ansible_user=deploy
[network]
network01 ansible_host=10.0.0.14 ansible_user=deploy
[compute]
compute01 ansible_host=10.0.0.21 ansible_user=deploy
compute02 ansible_host=10.0.0.22 ansible_user=deploy
[storage]
storage01 ansible_host=10.0.0.31 ansible_user=deploy
[monitoring]
monitoring01 ansible_host=10.0.0.41 ansible_user=deploy
[deployment]
localhost ansible_connection=local
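Because the database and message queue clusters need a majority to stay available, it is worth checking that the `[control]` group holds an odd number of at least three hosts before deploying. The following sketch parses an INI-style inventory from standard input; the function names are illustrative and it only understands plain `[group]` headers, not `:children` or `:vars` sections.

```shell
#!/usr/bin/env bash
# Count host lines in one group of an INI-style Ansible inventory (stdin).
# Usage: count_group_hosts GROUP < inventory
count_group_hosts() {
  awk -v group="[$1]" '
    /^\[/            { in_group = ($0 == group); next }  # section header
    /^[[:space:]]*$/ { next }                             # blank line
    /^[[:space:]]*#/ { next }                             # comment
    in_group         { count++ }
    END              { print count + 0 }
  '
}

# Warn when the control group cannot keep a majority after one failure.
check_control_group() {
  local n
  n=$(count_group_hosts control)
  if [ "$n" -ge 3 ] && [ $(( n % 2 )) -eq 1 ]; then
    echo "ok: $n controllers"
  else
    echo "warning: $n controllers; use an odd number of at least 3"
    return 1
  fi
}

check_control_group <<'EOF'
[control]
controller01 ansible_host=10.0.0.11
controller02 ansible_host=10.0.0.12
controller03 ansible_host=10.0.0.13
EOF
```

Against the real file the call would be `check_control_group < multinode`.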
3.5 Run the Deployment
# 1. Verify connectivity
ansible -i multinode all -m ping
# 2. Bootstrap the nodes
kolla-ansible -i multinode bootstrap-servers
# 3. Run the pre-deployment checks
kolla-ansible -i multinode prechecks
# 4. Deploy
kolla-ansible -i multinode deploy
# 5. Generate the admin credentials file (/etc/kolla/admin-openrc.sh)
kolla-ansible -i multinode post-deploy
# 6. Verify the deployment
pip3 install python-openstackclient
source /etc/kolla/admin-openrc.sh
openstack compute service list
openstack network agent list
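The deployment phases only make sense in order, and a failed phase should stop the run. One way to script that is sketched below. `run_deploy_steps` and the `KOLLA_CMD` variable are hypothetical helpers for this example, not part of Kolla-Ansible; `KOLLA_CMD` exists only so the sequence can be dry-run with a stub.

```shell
#!/usr/bin/env bash
# Run the Kolla-Ansible phases in order and stop at the first failure.
# KOLLA_CMD is a hypothetical override hook (not a Kolla-Ansible feature)
# so the sequence can be dry-run with a stub command.
KOLLA_CMD=${KOLLA_CMD:-kolla-ansible}

run_deploy_steps() {
  local inventory=$1 step
  for step in bootstrap-servers prechecks deploy post-deploy; do
    echo "==> $step"
    if ! "$KOLLA_CMD" -i "$inventory" "$step"; then
      echo "step failed: $step" >&2
      return 1
    fi
  done
}
```

A real run would be `run_deploy_steps multinode`; setting `KOLLA_CMD=true` beforehand walks through the phases without touching any node.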
4. Verifying the Deployment
4.1 Service Verification
# Check service status
openstack compute service list
openstack network agent list
openstack volume service list
# Check endpoints
openstack endpoint list
# Check Nova hypervisors
openstack hypervisor list
# Check networks
openstack network list
openstack router list
# Check storage
openstack volume list
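Reading the service tables by eye works for a handful of nodes; for larger clusters a filter that prints only the services that are not up is more practical. A minimal sketch, assuming the input is the three columns Binary, Host, and State in value format (the function name is illustrative):

```shell
#!/usr/bin/env bash
# Print every service whose state is not "up" and exit non-zero if any
# is found. Input: lines of "<binary> <host> <state>" on stdin.
find_down_services() {
  awk '$3 != "up" { print $1 " on " $2 " is " $3; bad = 1 } END { exit bad }'
}
```

It can be fed from the CLI, for example `openstack compute service list -f value -c Binary -c Host -c State | find_down_services`.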
4.2 Create Test Resources
# Create a test network
openstack network create test-network
openstack subnet create --network test-network \
  --subnet-range 192.168.100.0/24 \
  test-subnet
# Create a test router (assumes an external network named "public" already exists)
openstack router create test-router
openstack router add subnet test-router test-subnet
openstack router set --external-gateway public test-router
# Create a test virtual machine (assumes an image named "cirros" has been uploaded to Glance)
openstack flavor create --public --id auto \
  --ram 512 --disk 5 --vcpus 1 test-flavor
openstack image list
openstack keypair create test-key > test-key.pem
openstack server create --flavor test-flavor \
  --image cirros \
  --network test-network \
  --key-name test-key \
  test-vm
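4.3 Functional Tests
# Check the boot log (it shows whether the instance obtained an address)
openstack console log show test-vm --lines 20
# Get the VNC console URL
openstack console url show test-vm
# Test a floating IP (the instance's security group must allow ICMP and TCP port 22)
FIP=$(openstack floating ip create public -f value -c floating_ip_address)
openstack server add floating ip test-vm "$FIP"
ping -c 4 "$FIP"
# Test SSH
chmod 600 test-key.pem
ssh -i test-key.pem cirros@"$FIP"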
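The functional tests only succeed once the instance has finished building, so scripted runs need to wait for it. The sketch below polls an arbitrary status command until it prints the wanted value. The function name and its return codes are choices made for this example; it treats the literal status `ERROR` as a terminal failure.

```shell
#!/usr/bin/env bash
# Poll a status command until it prints the wanted value or time runs out.
# Usage: wait_for_status WANTED TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Returns 0 on success, 1 on timeout, 2 if the status becomes ERROR.
wait_for_status() {
  local wanted=$1 timeout=$2 interval=$3
  shift 3
  local elapsed=0 status=""
  while [ "$elapsed" -le "$timeout" ]; do
    status=$("$@" 2>/dev/null)
    if [ "$status" = "$wanted" ]; then
      return 0
    fi
    if [ "$status" = "ERROR" ]; then
      echo "resource entered ERROR state" >&2
      return 2
    fi
    sleep "$interval"
    elapsed=$(( elapsed + interval ))
  done
  echo "timed out after ${timeout}s (last status: ${status:-unknown})" >&2
  return 1
}
```

For the test instance this could be called as `wait_for_status ACTIVE 300 5 openstack server show test-vm -f value -c status` before pinging the floating IP.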
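5. Production Configuration
5.1 High Availability
# HAProxy high availability
# Kolla-Ansible configures HAProxy and keepalived automatically
# Verify that Keystone answers on the VIP
curl http://10.0.0.100:5000
# Check which node currently holds the VIP
ip addr show | grep 10.0.0.100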
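5.2 Monitoring
# Enable Prometheus and Grafana
# /etc/kolla/globals.yml
enable_prometheus: "yes"
enable_grafana: "yes"
# Apply the change with: kolla-ansible -i multinode deploy
# Access Grafana
# http://10.0.0.100:3000
# Username: admin
# Password: the value of grafana_admin_password in /etc/kolla/passwords.yml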
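5.3 Logging
# Enable centralized logging (Elasticsearch, Fluentd, Kibana)
# /etc/kolla/globals.yml
enable_elasticsearch: "yes"
enable_kibana: "yes"
enable_fluentd: "yes"
# Access Kibana
# http://10.0.0.100:5601
# Log in as the kibana user; the password is kibana_password in /etc/kolla/passwords.yml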
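5.4 Alerting
# Configure an Aodh alarm (requires enable_aodh, enable_ceilometer, and enable_gnocchi in globals.yml)
# The legacy "threshold" alarm type and the cpu_util meter are not available in recent releases,
# so this alarm uses Gnocchi's cumulative cpu metric (CPU time in nanoseconds) instead.
# 80% of one vCPU over 300 s = 0.8 * 300 * 10^9 ns = 240000000000
# SERVER_ID is the UUID of the instance to watch; rate:mean must be part of the metric's archive policy
openstack alarm create \
  --name high-cpu \
  --type gnocchi_resources_threshold \
  --resource-type instance \
  --resource-id "$SERVER_ID" \
  --metric cpu \
  --aggregation-method rate:mean \
  --granularity 300 \
  --threshold 240000000000 \
  --comparison-operator gt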
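Recent Ceilometer releases no longer publish a ready-made `cpu_util` percentage; where an alarm is evaluated against a cumulative CPU-time metric in nanoseconds (such as Gnocchi's `cpu` metric with a rate aggregation), the percentage has to be converted into nanoseconds per evaluation window. A small sketch of that conversion, with an illustrative function name:

```shell
#!/usr/bin/env bash
# Convert a CPU-utilization percentage into a threshold for the per-window
# rate of a cumulative CPU-time metric measured in nanoseconds.
# Usage: cpu_ns_threshold PERCENT VCPUS GRANULARITY_SECONDS
cpu_ns_threshold() {
  local percent=$1 vcpus=$2 granularity=$3
  echo $(( percent * vcpus * granularity * 1000000000 / 100 ))
}

# 80% of one vCPU over a 300-second window
cpu_ns_threshold 80 1 300
```

The threshold scales with both the vCPU count of the flavor and the granularity, so an alarm written for a 1-vCPU instance does not carry over unchanged to larger flavors.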
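6. Go-Live Checklist
6.1 Infrastructure Checks
# [ ] All nodes reachable
ping -c 2 controller01
ping -c 2 compute01
# [ ] NTP time synchronized
chronyc tracking
# [ ] Sufficient storage capacity
df -h
# [ ] Network throughput (start "iperf3 -s" on 10.0.0.21 first)
iperf3 -c 10.0.0.21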
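The time synchronization item can be turned into a pass/fail check by reading the "System time" line that `chronyc tracking` prints. The sketch below assumes that line has the offset in seconds as its fourth field; the function name and the 50 ms limit used in the usage note are arbitrary examples, not recommendations from chrony or OpenStack.

```shell
#!/usr/bin/env bash
# Read `chronyc tracking` output on stdin and fail if the absolute clock
# offset exceeds MAX_MS milliseconds.
# Usage: chronyc tracking | check_clock_offset MAX_MS
check_clock_offset() {
  awk -v max_ms="$1" '
    /^System time/ {
      offset_ms = $4 * 1000          # field 4 is the offset in seconds
      found = 1
      printf "offset: %.3f ms\n", offset_ms
      if (offset_ms > max_ms) bad = 1
    }
    END { exit (found && !bad) ? 0 : 1 }
  '
}
```

On a node this would run as `chronyc tracking | check_clock_offset 50`; a missing "System time" line is treated as a failure.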
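6.2 Service Checks
# [ ] API service responding
curl -s http://10.0.0.100:5000 | head -20
# [ ] Database cluster healthy (expect wsrep_cluster_size = 3)
# With Kolla-Ansible, MariaDB runs in the "mariadb" container: prefix the command with "docker exec mariadb" and pass the root credentials
mysql -e "SHOW STATUS LIKE 'wsrep_cluster_size';"
# [ ] Message queue healthy (with Kolla-Ansible: docker exec rabbitmq rabbitmqctl cluster_status)
rabbitmqctl cluster_status
# [ ] Storage services healthy
openstack volume service list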
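6.3 Security Checks
# [ ] TLS certificate configured (requires kolla_enable_tls_external: "yes" in globals.yml)
curl -I https://10.0.0.100:443
# [ ] Firewall configured
ufw status numbered
# [ ] Audit logging enabled (Kolla-Ansible writes service logs under /var/log/kolla/<service>/ on the host; adjust the path accordingly)
tail -f /var/log/keystone/audit.log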
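6.4 Performance Checks
# [ ] API response time
time openstack server list
# [ ] Instance creation time (--wait blocks until the instance is ACTIVE; uses the flavor and network from section 4.2)
time openstack server create --flavor test-flavor --image cirros --network test-network --wait test-perf
# [ ] Storage IOPS
fio --name=test --ioengine=libaio --direct=1 --rw=randread --bs=4k --size=1G --numjobs=4 --iodepth=32 --runtime=60 --time_based --group_reporting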
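7. Operations Readiness
7.1 Configuration Management
# [ ] Ansible inventory
# /home/deploy/multinode
# [ ] Configuration backup
# Back up /etc/kolla/
# [ ] Documentation
# Architecture document
# Operations runbook
# Emergency response plan
7.2 Monitoring Configuration
# [ ] Configure Grafana dashboards
# Import OpenStack dashboards
# [ ] Configure alert rules
# CPU utilization
# Memory utilization
# Disk utilization
# Service status
7.3 Backup Strategy
# [ ] Database backups
# mysqldump
# [ ] Configuration file backups
# /etc/kolla/
# [ ] Image backups
# Export important images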
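The configuration backup item is easy to automate. The following is a minimal sketch of a timestamped archive with simple retention; the function name, the archive naming scheme, and the destination path in the usage note are assumptions for illustration, and it relies on GNU `head` and `xargs` as shipped with Ubuntu.

```shell
#!/usr/bin/env bash
# Archive a configuration directory and keep only the newest KEEP archives.
# Usage: backup_config SRC_DIR DEST_DIR KEEP
backup_config() {
  local src=$1 dest=$2 keep=$3
  local stamp archive
  stamp=$(date +%Y%m%d-%H%M%S)
  archive="$dest/kolla-config-$stamp.tar.gz"

  mkdir -p "$dest"
  # -C keeps paths in the archive relative to the parent of SRC_DIR
  tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1

  # Timestamped names sort chronologically, so drop all but the last KEEP
  ls -1 "$dest"/kolla-config-*.tar.gz | sort | head -n -"$keep" | xargs -r rm -f
  echo "$archive"
}
```

A nightly cron job could call `backup_config /etc/kolla /var/backups/kolla 7`. Since `/etc/kolla/passwords.yml` holds every service credential, the destination directory should be readable by root only.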
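8. Troubleshooting
8.1 Deployment Failures
# Inspect the logs (per-service logs are under /var/log/kolla/<service>/ on each node)
tail -f /var/log/kolla/ansible.log
# Re-run the deployment (deploy is idempotent; use "reconfigure" after changing configuration only)
kolla-ansible -i multinode deploy
# Check dependency versions
pip3 list | grep kolla
ansible --version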
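8.2 Service Startup Failures
# Check container status
docker ps -a | grep kolla
# View a container's logs (Kolla names containers after the service, for example keystone or nova_api)
docker logs keystone
# Restart a single service container
docker restart keystone
# Or regenerate the configuration and restart only the affected service
kolla-ansible -i multinode reconfigure --tags keystone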
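On a controller with dozens of containers, it helps to list only the ones that need attention. The sketch below filters "name status" lines such as those produced by `docker ps -a --format '{{.Names}} {{.Status}}'`; the function name is illustrative, and it assumes container names contain no spaces.

```shell
#!/usr/bin/env bash
# Read "<name> <status...>" lines on stdin and print the names of
# containers that are not running or that report an unhealthy check.
find_unhealthy_containers() {
  awk '$2 != "Up" || /\(unhealthy\)/ { print $1 }'
}
```

Typical use: `docker ps -a --format '{{.Names}} {{.Status}}' | find_unhealthy_containers`, followed by `docker logs` on each name it prints.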
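8.3 Network Problems
# List network namespaces (on the network nodes)
ip netns list
# Inspect Open vSwitch (with Kolla-Ansible: docker exec openvswitch_vswitchd ovs-vsctl show)
ovs-vsctl show
# Check the Neutron agents
openstack network agent list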
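9. Summary
This guide walked through the full process of deploying OpenStack in production.
Key points:
- Size the deployment and choose hardware before installing anything
- Prepare every node consistently: hostnames, time synchronization, networking
- Deploy with Kolla-Ansible in order: bootstrap-servers, prechecks, deploy, post-deploy
- Verify services and run functional tests, then enable monitoring, logging, and alerting
- Work through the go-live checklist and put backups in place before handing over to operations
Congratulations on completing your OpenStack production deployment!
Keep following the official documentation, and schedule regular version upgrades and security hardening.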