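Complete GitLab CI/CD Configuration Tutorial

1. Introduction

GitLab CI/CD is GitLab's built-in tool for continuous integration, continuous delivery, and continuous deployment. You define a .gitlab-ci.yml configuration file in the repository, and GitLab uses it to automate the build, test, and deployment process. This tutorial covers the core concepts, configuration methods, and best practices of GitLab CI/CD to help you build an efficient automated pipeline.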

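2. Basic concepts

2.1. GitLab Runner

GitLab Runner is the lightweight agent that executes CI/CD jobs. It can be installed on a local server, on a cloud instance, or in a Docker container. Runners fall into two broad types:

Shared runners: available to all projects on the instance; on GitLab.com they are maintained by GitLab

Specific runners: configured for a particular project or group, which gives better resource isolation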
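
As a minimal sketch of how a job is routed to a particular runner, the `tags` keyword restricts a job to runners that carry matching tags. The tag `project-runner` and the job name are hypothetical; use whatever tags were assigned when your runner was registered.

```yaml
# Hypothetical job pinned to a specific runner by tag
build_on_specific_runner:
  tags:
    - project-runner  # assumed tag; must match a tag set on the runner
  script:
    - echo "Running on a runner tagged project-runner"
```

Jobs without `tags` are picked up only by runners that are allowed to run untagged jobs.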

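2.2. Core components

Pipeline: an automated process made up of multiple jobs, triggered by Git events (such as a pushed commit or a merge request)

Stage: a logical grouping within the pipeline (such as build, test, deploy); stages run in order, and jobs within the same stage run in parallel

Job: the smallest unit of work, which performs a concrete task (such as compiling code, running tests, or deploying an application)

Artifact: downloadable files produced by a job (such as compiled binaries or test reports)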

3. Configuration file in detail

3.1. Basic structure

.gitlab-ci.yml uses YAML syntax to define the pipeline. The basic structure is as follows:

# Default configuration shared by all jobs
default:
  image: node:20-alpine  # Default Docker image
  before_script:
    - echo "Starting pipeline"

# Order of the stages
stages:
  - build
  - test
  - deploy

# Example job: build stage
build_job:
  stage: build
  script:
    - echo "Compiling the code..."
    - npm install
    - npm run build
  artifacts:  # Save the build output
    paths:
      - dist/
    expire_in: 1 week

# Example job: test stage
test_job:
  stage: test
  script:
    - echo "Running unit tests..."
    - npm test
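  coverage: '/Code coverage: \d+\.\d+/'  # Parse test coverage from the job log

# Example job: deploy stage
deploy_job:
  stage: deploy
  environment: production  # Associate the job with a GitLab environment
  script:
    - echo "Deploying to production..."
    - scp -r dist/ user@server:/var/www/
  only:
    - main  # Run only for pipelines on the main branch
  when: manual  # Deployment must be started manually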
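
The `only` keyword used in deploy_job still works, but GitLab's documentation recommends `rules` for new configuration. A rough equivalent of the deploy job written with `rules` might look like this (the job name `deploy_job_rules` is hypothetical):

```yaml
deploy_job_rules:
  stage: deploy
  environment: production
  script:
    - echo "Deploying to production..."
    - scp -r dist/ user@server:/var/www/
  rules:
    # Add the job only to branch pipelines for main, and wait for a manual start
    - if: '$CI_COMMIT_BRANCH == "main"'
      when: manual
```

One difference to keep in mind: a manual job defined through `rules` defaults to `allow_failure: false`, so the pipeline waits at this job until someone runs it.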

3.2. Key keywords explained

#### image and services

build:
  image: docker:20.10.16  # Use the Docker image
  services:
    - docker:20.10.16-dind  # Use the Docker-in-Docker service
  script:
    - docker build -t myapp .
    - docker push myapp
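
As written, `docker push myapp` targets Docker Hub without logging in, and recent Docker-in-Docker images expect TLS to be configured. One possible variant, assuming the project uses the GitLab container registry and a runner that allows privileged containers, relies on the predefined `CI_REGISTRY*` variables. The job name `build_image` is hypothetical, and the exact TLS setup depends on how the runner's executor is configured.

```yaml
build_image:  # hypothetical job name
  image: docker:20.10.16
  services:
    - docker:20.10.16-dind
  variables:
    DOCKER_TLS_CERTDIR: "/certs"  # let dind generate TLS certificates shared with the job
  script:
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
    - docker build -t "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA" .
    - docker push "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA"
```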

#### before_script and after_script

test:
  before_script:
    - echo "Setting up test environment..."
    - pip install -r requirements.txt
  after_script:
    - echo "Cleaning up..."
    - rm -rf __pycache__
  script:
    - pytest

#### Dependency control

stages:
  - build
  - test
  - deploy

build:
  stage: build
  script:
    - make build
  artifacts:
    paths:
      - binaries/

test:
  stage: test
  needs: ["build"]  # Explicitly depend on the build job
  script:
    - ./binaries/test_runner
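
`needs` also accepts an empty list, which lets a job start immediately without waiting for earlier stages. A small sketch, with a hypothetical `lint` job and an assumed `make lint` target:

```yaml
lint:
  stage: test
  needs: []  # no dependencies: start as soon as the pipeline is created
  script:
    - make lint  # assumed Makefile target

test:
  stage: test
  needs:
    - job: build
      artifacts: true  # download binaries/ from the build job (this is the default)
  script:
    - ./binaries/test_runner
```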

#### Cache configuration

build:
  script:
    - bundle install --path vendor
  cache:
    paths:
      - vendor/ruby  # Cache Ruby dependencies
    key: ${CI_COMMIT_REF_SLUG}  # Cache key derived from the branch name
    policy: pull-push  # Default behavior (pull and push)

test:
  script:
    - bundle exec rspec
  cache:
    paths:
      - vendor/ruby
    key: ${CI_COMMIT_REF_SLUG}
    policy: pull  # Only pull the cache, never update it
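
Keying the cache on the branch name means that a changed lockfile still reuses the old cache. An alternative sketch keys the cache on the lockfile itself with `cache:key:files`, assuming the project commits a `Gemfile.lock`:

```yaml
build:
  script:
    - bundle install --path vendor
  cache:
    key:
      files:
        - Gemfile.lock  # a new cache key is computed whenever this file changes
    paths:
      - vendor/ruby
```

With this key, branches that share the same lockfile also share the same cache.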

#### Environments and dynamic environments

deploy_staging:
  stage: deploy
  environment:
    name: staging
    url: https://staging.example.com
  script:
    - echo "Deploying to staging"

deploy_prod:
  stage: deploy
  environment:
    name: production
    url: https://example.com
    auto_stop_in: 1 day  # Stop the environment automatically
  script:
    - echo "Deploying to production"
  when: manual  # Must be triggered manually

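# Dynamic environment (created from the branch name)
deploy_review:
  stage: deploy
  environment:
    name: review/$CI_COMMIT_REF_NAME  # Dynamic environment name
    url: https://$CI_COMMIT_REF_SLUG.review.example.com  # The slug is safe to use in host names
    on_stop: stop_review  # Job that stops the environment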
  script:
    - echo "Deploy review for $CI_COMMIT_REF_NAME"
  only:
    - branches
  except:
    - main

stop_review:
  stage: deploy
  variables:
    GIT_STRATEGY: none  # No source code needed
  environment:
    name: review/$CI_COMMIT_REF_NAME
    action: stop  # Stop the environment
  script:
    - echo "Stopping review environment"
  when: manual
  only:
    - branches
  except:
    - main

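#### Conditional execution

# Legacy syntax: only/except
job_only_except:
  script:
    - echo "Conditional execution"
  only:
    - main  # main branch only
    - tags  # or tags
    - api  # or pipelines triggered through the API
    - schedules  # or scheduled pipelines
  except:
    variables:
      - $CI_COMMIT_MESSAGE =~ /WIP/  # Skip when the commit message contains WIP

# Preferred syntax: rules (cannot be combined with only/except in the same job)
job_rules:
  script:
    - echo "Conditional execution"
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'  # Merge request pipelines
    - if: '$CI_COMMIT_TAG'  # Tag pipelines; CI_COMMIT_BRANCH is not set in them
      when: manual
    - when: on_failure  # Otherwise, run only if an earlier job failed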
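
Rules can also be applied to the whole pipeline through `workflow:rules`. The sketch below follows a pattern from the GitLab documentation for avoiding duplicate branch and merge request pipelines; adapt the conditions to your own branching model.

```yaml
workflow:
  rules:
    # Run merge request pipelines
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
    # Skip the branch pipeline when an open merge request already exists for the branch
    - if: '$CI_COMMIT_BRANCH && $CI_OPEN_MERGE_REQUESTS'
      when: never
    # Run branch pipelines otherwise
    - if: '$CI_COMMIT_BRANCH'
```

Tag pipelines are not matched by any of these rules, so add a rule for `$CI_COMMIT_TAG` if you need them.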

#### Templates and inheritance

# Define a template job
.job_template: &job_configuration  # YAML anchor
  image: ruby:2.7
  before_script:
    - echo "Setting up..."
  after_script:
    - echo "Tearing down..."

rspec:
  <<: *job_configuration  # Inherit the template
  script:
    - rspec

pytest:
  <<: *job_configuration
  script:
    - pytest

# Extend a template
.deploy_template: &deploy_config
  stage: deploy
  environment:
    url: https://example.com
  script:
    - echo "Deploying..."

prod_deploy:
  <<: *deploy_config
  environment:
    name: production
    url: https://prod.example.com

staging_deploy:
  <<: *deploy_config
  environment:
    name: staging
    url: https://staging.example.com
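
YAML anchors work only within a single file and merge shallowly, so the `environment` block in each deploy job replaces the template's block entirely. GitLab's own `extends` keyword is an alternative that merges nested keys and also works across included files. A sketch of the deploy template rewritten with `extends`:

```yaml
.deploy_template:
  stage: deploy
  environment:
    url: https://example.com
  script:
    - echo "Deploying..."

prod_deploy:
  extends: .deploy_template
  environment:
    name: production  # merged into the template's environment block
    url: https://prod.example.com  # overrides the template's url
```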

4. Advanced configuration

4.1. Multi-pipeline configuration

# .gitlab-ci.yml
include:
  - local: '/templates/build.yml'  # Include a local file
  - remote: 'https://gitlab.com/example/templates/-/raw/main/test.yml'  # Include a remote file
  - project: 'my-group/my-project'  # Include a file from another project
    file: '/templates/deploy.yml'
    ref: main  # Branch to read the file from

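# The prepare stage is not a default stage, so it must be declared
# (also list any stages used by the included files)
stages:
  - prepare
  - test

# Use dynamically generated configuration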
generate-config:
  stage: prepare
  script:
    - generate-ci-config > generated-config.yml
  artifacts:
    paths:
      - generated-config.yml

child-pipeline:
  stage: test
  trigger:
    include:
      - artifact: generated-config.yml
        job: generate-config
    strategy: depend  # Wait for the child pipeline to finish
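
For comparison, a child pipeline can also be triggered from a static file in the repository, and a downstream pipeline in another project can be triggered by project path. Both jobs below are sketches: the file path `ci/child-pipeline.yml` and the job names are hypothetical, and `my-group/my-project` is reused from the include example above.

```yaml
static-child-pipeline:
  stage: test
  trigger:
    include:
      - local: ci/child-pipeline.yml  # assumed path inside this repository
    strategy: depend

downstream-project:
  stage: test
  trigger:
    project: my-group/my-project
    branch: main
```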

4.2. Matrix builds

build:
  stage: build
  parallel:
    matrix:
      - NODE_VERSION: [14, 16, 18]
        OS: [ubuntu, alpine]
  script:
    - echo "Building Node $NODE_VERSION on $OS"
    - node --version
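
In the job above, `node --version` reports whatever Node.js the job image provides, not `$NODE_VERSION`. One way to let the matrix select the runtime is to build the image tag from the matrix variables. This sketch assumes the official `node` images on Docker Hub and uses a hypothetical `VARIANT` variable in place of `OS`:

```yaml
build:
  stage: build
  image: node:${NODE_VERSION}-${VARIANT}
  parallel:
    matrix:
      - NODE_VERSION: ["18", "20"]
        VARIANT: [alpine, slim]
  script:
    - echo "Building with Node $NODE_VERSION ($VARIANT)"
    - node --version
```

This expands to four jobs, one for each combination of version and variant.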

4.3. Secret management

deploy:
  before_script:
    - 'command -v ssh-agent >/dev/null || ( apt-get update -y && apt-get install openssh-client -y )'
    - eval $(ssh-agent -s)
    # Load the key into the agent without writing it to disk; a pipe also works in plain sh
    - echo "$SSH_PRIVATE_KEY" | tr -d '\r' | ssh-add -
    - mkdir -p ~/.ssh
    - chmod 700 ~/.ssh
    - ssh-keyscan -H server >> ~/.ssh/known_hosts
  script:
    # The agent supplies the key, so no -i flag or key file is needed
    - ssh user@server "mkdir -p /app"
    - scp -r dist/ user@server:/app/
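
`ssh-keyscan` trusts whatever host key the server presents at job time. A stricter variant, following the approach in GitLab's SSH key documentation, stores the verified host keys in a CI/CD variable. `SSH_KNOWN_HOSTS` is an assumed variable name that you would define in the project's CI/CD settings.

```yaml
deploy:
  before_script:
    - 'command -v ssh-agent >/dev/null || ( apt-get update -y && apt-get install openssh-client -y )'
    - eval $(ssh-agent -s)
    - echo "$SSH_PRIVATE_KEY" | tr -d '\r' | ssh-add -
    - mkdir -p ~/.ssh
    - chmod 700 ~/.ssh
    # SSH_KNOWN_HOSTS (assumed variable) holds ssh-keyscan output that was verified once by hand
    - echo "$SSH_KNOWN_HOSTS" >> ~/.ssh/known_hosts
    - chmod 644 ~/.ssh/known_hosts
  script:
    - ssh user@server "mkdir -p /app"
    - scp -r dist/ user@server:/app/
```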

5. Complete example: deploying a Node.js application

# Global default configuration
default:
  image: node:20-alpine
  cache:
    key: $CI_COMMIT_REF_SLUG
    paths:
      - node_modules/
      - .npm/
  before_script:
    - npm ci --cache .npm --prefer-offline

# Pipeline stages
stages:
  - install
  - build
  - test
  - security
  - deploy

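# Install dependencies
install_dependencies:
  stage: install
  script:
    - npm ci
  cache:
    key: $CI_COMMIT_REF_SLUG  # A job-level cache replaces the default one, so repeat the key
    paths:
      - node_modules/
      - .npm/
    policy: push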

# Build the application
build_application:
  stage: build
  script:
    - npm run build
  artifacts:
    paths:
      - dist/
    expire_in: 1 week
  dependencies:
    - install_dependencies

# Run unit tests
unit_tests:
  stage: test
  script:
    - npm run test:unit
  coverage: '/Lines\s*:\s*(\d+\.\d+)%/'
  artifacts:
    reports:
      junit: test-results/junit.xml
      coverage_report:
        coverage_format: cobertura
        path: coverage/cobertura-coverage.xml

# End-to-end tests
e2e_tests:
  stage: test
  image: cypress/included:10.9.0
  script:
    - npm run test:e2e
  artifacts:
    when: always
    paths:
      - cypress/screenshots/
      - cypress/videos/
    reports:
      junit: test-results/e2e/junit.xml
  dependencies:
    - build_application

# Security scan
security_scan:
  stage: security
  image: node:20-alpine
  script:
    - npm audit --audit-level moderate
    - npm run snyk
  allow_failure: true  # A failure is reported but does not block the pipeline

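# Deploy to the development environment
# SSH_KEY is assumed to be a File-type CI/CD variable, so it expands to the path of the key file
deploy_dev:
  stage: deploy
  environment:
    name: development
    url: https://dev.example.com
    auto_stop_in: 3 days
  script:
    - echo "Deploying to development"
    - chmod 400 "$SSH_KEY"  # ssh refuses key files with loose permissions
    - ssh -i "$SSH_KEY" user@dev-server "mkdir -p /opt/app"
    - rsync -avz -e "ssh -i $SSH_KEY" dist/ user@dev-server:/opt/app/
    - ssh -i "$SSH_KEY" user@dev-server "pm2 reload all"
  only:
    - develop
  tags:
    - ssh-runner  # Run on the runner tagged ssh-runner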

# Deploy to the production environment
# SSH_KEY is assumed to be a File-type CI/CD variable, so it expands to the path of the key file
deploy_prod:
  stage: deploy
  environment:
    name: production
    url: https://example.com
  before_script:
    - 'which ssh-agent || ( apk add --no-cache openssh-client )'
    - eval $(ssh-agent -s)
    - chmod 400 "$SSH_KEY"
    - ssh-add "$SSH_KEY"
    - mkdir -p ~/.ssh
    - chmod 700 ~/.ssh
    - ssh-keyscan prod-server >> ~/.ssh/known_hosts
    - chmod 644 ~/.ssh/known_hosts
  script:
    - echo "Deploying to production"
    # The agent already holds the key, so -i is not needed
    - scp dist/* prod-user@prod-server:/var/www/html/
    - ssh prod-user@prod-server "nginx -s reload"
  when: manual
  only:
    - main

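6. Best practices

6.1. Pipeline optimization

Use caching to cut dependency installation time

Run independent jobs in parallel

Use needs so that a job starts as soon as the jobs it depends on have finished, instead of waiting for the whole previous stage (dependencies only controls which artifacts are downloaded)

Set timeouts on jobs in critical stages: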

build:
  script:
    - make build
  timeout: 10 minutes  # Fail the job after 10 minutes
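
Parallel execution can also be applied inside a single job with `parallel`, which starts several copies of the job and exposes `CI_NODE_INDEX` and `CI_NODE_TOTAL` to each copy. The test command below is a placeholder: `./run-tests.sh` and its `--shard` option are hypothetical and stand for whatever sharding mechanism your test tool offers.

```yaml
test:
  stage: test
  parallel: 4  # run four copies of this job at the same time
  timeout: 15 minutes
  script:
    # CI_NODE_INDEX runs from 1 to CI_NODE_TOTAL
    - ./run-tests.sh --shard "$CI_NODE_INDEX/$CI_NODE_TOTAL"  # hypothetical script and flag
```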

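6.2. Security configuration

Avoid printing sensitive information in job logs. Store secrets as protected (and, where possible, masked) CI/CD variables in the project settings and reference them from the configuration:

variables:
  SECRET_VAR: $PROTECTED_VAR  # Reference a protected variable
  DEBUG: 'false'

Keep debug tracing turned off, because debug logging writes the values of all variables, including secrets, to the job log:

variables:
  CI_DEBUG_TRACE: "false"  # Disable debug tracing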

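6.3. Error handling

test:
  script:
    - npm test  # Do not append "|| true": it would hide failures completely
  allow_failure: true  # A failed test job is flagged but does not block the pipeline

deploy:
  script:
    - ./deploy.sh
  after_script:
    # after_script runs in a separate shell, so $? is not the exit code of script;
    # CI_JOB_STATUS reports the job result instead
    - |
      if [ "$CI_JOB_STATUS" = "failed" ]; then
        echo "Deploy failed, rolling back..."
        ./rollback.sh
      fi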
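
Transient infrastructure failures are often better handled by retrying than by ignoring them. A sketch using the `retry` keyword, limited to runner and timeout problems so that genuine test failures still fail the job:

```yaml
test:
  script:
    - npm test
  retry:
    max: 2  # retry up to two times
    when:
      - runner_system_failure
      - stuck_or_timeout_failure
```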

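6.4. Performance monitoring

monitor:
  stage: .post  # Runs after all other stages
  script:
    # A block scalar is used because the header contains ": ", which is not valid in a plain YAML string
    - |
      curl -X POST -H 'Content-type: application/json' --data '{"text":"Pipeline completed"}' "$SLACK_WEBHOOK"
  rules:
    # Rules are evaluated in order and the first match wins, so a single rule
    # with when: always covers both successful and failed pipelines
    - if: $CI_PIPELINE_SOURCE == "push"
      when: always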
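
If success and failure need different messages, one option is two jobs in the `.post` stage that use the job-level `when` keyword. The job names and message texts are hypothetical; `SLACK_WEBHOOK` is the variable from the example above.

```yaml
notify_success:
  stage: .post
  when: on_success  # runs only if the jobs in earlier stages succeeded
  script:
    - |
      curl -X POST -H 'Content-type: application/json' --data '{"text":"Pipeline succeeded"}' "$SLACK_WEBHOOK"

notify_failure:
  stage: .post
  when: on_failure  # runs only if at least one job in an earlier stage failed
  script:
    - |
      curl -X POST -H 'Content-type: application/json' --data '{"text":"Pipeline failed"}' "$SLACK_WEBHOOK"
```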

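7. Summary

This tutorial covered the core configuration of GitLab CI/CD, from basic concepts to advanced scenarios, including:

Understanding the core components (Runner, Pipeline, Stage, Job)

The .gitlab-ci.yml syntax in detail

Key scenarios (build, test, deployment, environment management)

Advanced features (caching, secrets, multiple pipelines, matrix builds)

Production-grade best practices (performance optimization, security hardening, error handling)

With a well-designed GitLab CI/CD configuration you can build a highly automated, secure, and reliable continuous delivery pipeline that improves development efficiency and product quality. In practice, refine the configuration step by step to suit your project, and use GitLab's monitoring and audit features to keep improving pipeline performance.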
