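A Complete Technical Tutorial on GitLab CI/CD Configuration

1. Introduction

GitLab CI/CD is GitLab's built-in tool for continuous integration, continuous delivery, and continuous deployment. It lets developers automate their build, test, and deployment workflows. You define a pipeline in a .gitlab-ci.yml file at the root of the repository, and GitLab Runner executes the jobs it describes, which shortens feedback cycles and catches defects earlier. This tutorial covers the whole path from basic setup to advanced practice: installing and configuring a runner, writing pipelines, and applying best practices.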

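2. Core Concepts of GitLab CI/CD

2.1. Pipeline

A pipeline is the central concept in CI/CD and represents one complete automated run. When a developer pushes code or another trigger event occurs, GitLab creates a pipeline from the .gitlab-ci.yml file. A pipeline consists of one or more stages, and each stage contains one or more jobs. Common pipeline statuses include pending, running, success, failed, canceled, and skipped.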

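2.2. Stage

A stage is a logical group of jobs within a pipeline. Stages run in the order they are listed under stages (for example, build → test → deploy), and you can define your own stages and order. Jobs in the same stage can run in parallel, subject to runner availability, and the next stage starts only after every job in the current stage has succeeded.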
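
The ordering described above can be sketched as follows. This is a minimal illustration, and the job names (compile, unit, lint, ship) are hypothetical. Because unit and lint share the test stage, they are eligible to run at the same time once compile has succeeded.

```yaml
stages:        # stages run in the order listed here
  - build
  - test
  - deploy

compile:
  stage: build
  script:
    - echo "compile"

unit:          # same stage as lint: eligible to run in parallel
  stage: test
  script:
    - echo "unit tests"

lint:
  stage: test
  script:
    - echo "lint"

ship:          # starts only after every test-stage job has succeeded
  stage: deploy
  script:
    - echo "ship"
```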

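2.3. Job

A job is the smallest unit of execution in a pipeline and is run by a runner. A job can specify:

Commands to run (script)

Execution rules (rules, or the legacy only/except)

Dependencies (needs, dependencies)

Runner selection and execution image (tags, image)

Artifacts (artifacts)

Deployment environment (environment)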

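2.4. Runner

A runner is a separate process that executes jobs. Runners are scoped at three levels:

Instance runners, also called shared runners (available to all projects on the GitLab instance; on GitLab.com these are hosted by GitLab)

Group runners (available to the projects in a specific group)

Project runners (available only to the projects they are assigned to)

Common executors:

Shell executor (runs jobs directly on the host)

Docker executor (runs jobs in containers)

Kubernetes executor (runs jobs in pods on a Kubernetes cluster)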

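3. Installing and Configuring GitLab Runner

3.1. Installing the Runner (Linux)

# Download and install the official RPM package
curl -LJO "https://gitlab-runner-downloads.s3.amazonaws.com/latest/rpm/gitlab-runner_amd64.rpm"
sudo rpm -Uvh gitlab-runner_amd64.rpm

# Register the runner (interactive; the values below are the answers to each prompt)
sudo gitlab-runner register

# GitLab instance URL
https://gitlab.com/

# Runner token (created under the project's Settings → CI/CD → Runners)
glrt-xxxxxxxxxxxxxxxxxxxx

# Runner description
my-project-runner

# Runner tags, comma-separated (with a glrt- authentication token, tags are set in the GitLab UI when the runner is created, and this prompt may not appear)
docker,linux

# Executor
docker

# Default image
alpine:latest

# Install and start the service (the RPM package normally does this already; needed mainly after a manual binary install)
sudo gitlab-runner install
sudo gitlab-runner start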

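3.2. Verifying the Runner Status

Under the project's Settings → CI/CD → Runners, the newly registered runner should appear with a green (online) status. To check the service status from the command line: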

sudo gitlab-runner status

4. Writing the .gitlab-ci.yml File

4.1. Basic Structure

stages:
  - build
  - test
  - deploy

variables:
  IMAGE_TAG: $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA

build_job:
  stage: build
  script:
    - echo "Building application..."
    - docker build -t $IMAGE_TAG .
  tags:
    - docker

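test_job:
  stage: test
  script:
    - echo "Running tests..."
    # Assumes the image built in build_job is still present on the same Docker host
    - docker run $IMAGE_TAG npm test
  tags:
    - docker
  artifacts:
    reports:
      # The report is collected only if test-results.xml ends up in the project directory
      junit: test-results.xml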

deploy_job:
  stage: deploy
  script:
    - echo "Deploying to production..."
    - docker push $IMAGE_TAG
  only:
    - main
  tags:
    - docker
  environment: production

4.2. Configuration in Detail

#### stages (stage definitions)

Defines the execution order of the pipeline and must list at least one stage:

stages:
  - validate
  - build
  - test
  - security_scan
  - deploy

#### variables (global variables)

Defines variables available to every job:

variables:
  DOCKER_DRIVER: overlay2
  DOCKER_TLS_CERTDIR: "/certs"
  KUBE_NAMESPACE: production

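#### Job configuration

A job typically specifies:

stage: the stage the job belongs to (defaults to test when omitted)

script: the sequence of commands to run (required for ordinary jobs)

tags: the runner tags the job requires (optional; used to select a runner)

Example job: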

lint_code:
  stage: validate
  script:
    - echo "Running linter..."
    - npm run lint
  tags:
    - node
  artifacts:
    paths:
      - lint-report.html
    expire_in: 1 week
  rules:
    - if: $CI_COMMIT_BRANCH == "main"

#### Conditional execution (rules)

Use rules to control when a job runs. Rules are evaluated in order, and the first match applies:

deploy_staging:
  stage: deploy
  script:
    - echo "Deploy to staging"
  rules:
    - if: '$CI_COMMIT_BRANCH == "dev"'
      when: manual
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
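
To make the first-match behaviour concrete, here is a small sketch. The job name deploy_docs and the docs/ path are hypothetical. It combines if with changes, uses when: never to exclude a case, and ends with a catch-all rule.

```yaml
deploy_docs:                 # hypothetical job name
  stage: deploy
  script:
    - echo "Publishing docs"
  rules:
    # Rules are evaluated top to bottom; the first match decides.
    - if: '$CI_COMMIT_TAG'
      when: never            # never run in tag pipelines
    - if: '$CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH'
      changes:
        - docs/**/*          # run only if files under docs/ changed
    - when: manual           # otherwise offer the job as a manual action
      allow_failure: true    # so the pipeline is not blocked waiting for it
```

Putting the exclusion first matters: if the when: never rule came last, an earlier rule could match and the job would still run.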

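#### Dependencies (needs)

needs lets a job start as soon as the jobs it lists have finished, without waiting for the rest of the earlier stage: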

test_unit:
  stage: test
  script:
    - echo "Running unit tests"
  needs: ["build_job"]

test_integration:
  stage: test
  script:
    - echo "Running integration tests"
  needs: ["build_job"]
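
The two jobs above share a stage, so the effect of needs is easier to see across stages. The sketch below uses hypothetical job names (lint_early, deploy_docs_early) and reuses test_unit from the example above.

```yaml
lint_early:                  # hypothetical job name
  stage: test
  needs: []                  # no dependencies: starts right away, without waiting for the build stage
  script:
    - echo "lint"

deploy_docs_early:           # hypothetical job name
  stage: deploy
  needs: ["test_unit"]       # starts when test_unit finishes, even if test_integration is still running
  script:
    - echo "deploy"
```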

#### Environments (environment)

Defines a deployment environment:

deploy_prod:
  stage: deploy
  script:
    - ./deploy.sh production
  environment:
    name: production
    url: https://prod.example.com
    on_stop: stop_production

stop_production:
  stage: deploy
  script:
    - ./stop.sh production
  environment:
    name: production
    action: stop
  when: manual

5. Advanced Configuration

5.1. Cache

Cache dependencies to speed up builds:

cache:
  paths:
    - node_modules/
    - .npm/

install_deps:
  stage: build
  script:
    - npm ci --cache .npm --prefer-offline
  cache:
    key: $CI_COMMIT_REF_SLUG
    paths:
      - node_modules/
      - .npm/
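
One common refinement, sketched here with hypothetical job names, is to key the cache on the lock file so it is rebuilt only when dependencies change, and to let downstream jobs read the cache without uploading it again. Only .npm/ is cached in this sketch, since npm ci removes any existing node_modules before installing.

```yaml
install_deps_locked:         # hypothetical job name
  stage: build
  script:
    - npm ci --cache .npm --prefer-offline
  cache:
    key:
      files:
        - package-lock.json  # the key changes only when the lock file changes
    paths:
      - .npm/
    policy: pull-push        # download at job start, upload at job end (the default)

test_with_cache:             # hypothetical job name
  stage: test
  script:
    - npm ci --cache .npm --prefer-offline
    - npm test
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - .npm/
    policy: pull             # read-only: skip the upload step
```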

5.2. Artifacts

Save build output for use by later jobs:

build:
  stage: build
  script:
    - npm run build
  artifacts:
    name: "$CI_JOB_NAME-$CI_COMMIT_REF_NAME"
    paths:
      - dist/
    expire_in: 1 week
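    reports:
      junit:
        - test-results.xml
      # coverage_report replaces the removed "cobertura" report type
      coverage_report:
        coverage_format: cobertura
        path: coverage/cobertura-coverage.xml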

5.3. Multi-Project Pipelines

Trigger a pipeline in a downstream project:

trigger_downstream:
  stage: deploy
  trigger:
    project: my-group/downstream-project
    branch: main
    strategy: depend
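
A trigger job can also hand values to the downstream pipeline. In the sketch below, the job name and the variable UPSTREAM_SHA are hypothetical. Variables defined on the trigger job are passed to the pipeline it starts.

```yaml
trigger_downstream_with_vars:        # hypothetical job name
  stage: deploy
  variables:
    UPSTREAM_SHA: $CI_COMMIT_SHA     # hypothetical variable, visible in the downstream pipeline
  trigger:
    project: my-group/downstream-project
    branch: main
    strategy: depend                 # this job mirrors the downstream pipeline's result
```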

5.4. Dynamic Environments

Create a deployment environment per branch on the fly:

deploy_review:
  stage: deploy
  script:
    - echo "Deploying to $CI_ENVIRONMENT_SLUG"
  environment:
    name: review/$CI_COMMIT_REF_NAME
    url: https://$CI_ENVIRONMENT_SLUG.example.com
    on_stop: stop_review

stop_review:
  stage: deploy
  variables:
    GIT_STRATEGY: none
  script:
    - echo "Removing review environment"
  environment:
    name: review/$CI_COMMIT_REF_NAME
    action: stop
  when: manual

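5.5. Security Scanning

Integrate SAST (static application security testing). The job below is a placeholder that shows where a scanner would run and how its report is uploaded: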

sast:
  stage: security_scan
  script:
    - echo "Running SAST scan..."
  artifacts:
    reports:
      sast: gl-sast-report.json
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
  allow_failure: true
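
Instead of a hand-written job, a project can include the SAST template that ships with GitLab. The fragment below is a minimal sketch. It assumes the template's default behaviour of running its analyzer jobs in the test stage, and the excluded paths are only an example value.

```yaml
include:
  - template: Jobs/SAST.gitlab-ci.yml

stages:
  - test            # the template's jobs run in the test stage by default

variables:
  SAST_EXCLUDED_PATHS: "spec, test, tests, tmp"   # paths the analyzers should skip
```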

6. Complete Example: CI/CD for a Node.js Application

stages:
  - install
  - test
  - build
  - security
  - deploy_staging
  - deploy_prod

variables:
  NODE_VERSION: "16.14"
  DOCKER_IMAGE: $CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA
  KUBE_NAMESPACE_DEV: development
  KUBE_NAMESPACE_PROD: production

cache:
  key: $CI_COMMIT_REF_SLUG
  paths:
    - node_modules/
    - .npm/

install_dependencies:
  stage: install
  image: node:$NODE_VERSION
  script:
    - npm ci --cache .npm --prefer-offline
  tags:
    - docker

unit_tests:
  stage: test
  image: node:$NODE_VERSION
  script:
    - npm run test:unit
    - npm run test:coverage
  coverage: '/Lines\s*:\s*(\d+\.\d+)%/'
  artifacts:
    reports:
      coverage_report:
        coverage_format: cobertura
        path: coverage/cobertura-coverage.xml
      junit:
        - test-results.xml
  tags:
    - docker
  needs: ["install_dependencies"]

build_application:
  stage: build
  image: docker:20.10.16
  services:
    - docker:20.10.16-dind
  script:
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
    - docker build -t $DOCKER_IMAGE .
    - docker push $DOCKER_IMAGE
  tags:
    - docker
  needs: ["unit_tests"]

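container_scanning:
  stage: security
  image:
    name: aquasec/trivy:latest
    entrypoint: [""]
  variables:
    # Credentials Trivy uses to pull the image from the project registry
    TRIVY_USERNAME: $CI_REGISTRY_USER
    TRIVY_PASSWORD: $CI_REGISTRY_PASSWORD
  script:
    # gitlab.tpl (assumed to ship at /contrib in the Trivy image) emits GitLab's container scanning report format
    - trivy image --format template --template "@/contrib/gitlab.tpl" -o gl-container-scanning-report.json $DOCKER_IMAGE
  artifacts:
    reports:
      container_scanning: gl-container-scanning-report.json
  tags:
    - docker
  needs: ["build_application"]
  allow_failure: true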

deploy_staging:
  stage: deploy_staging
  image: registry.gitlab.com/gitlab-org/cluster-integration/helm-install-image:latest
  script:
    # KUBE_CONTEXT is an assumed CI/CD variable holding the kubeconfig context name
    - kubectl config use-context "$KUBE_CONTEXT"
    # "my-app" is a placeholder Helm release name
    - >
      helm upgrade --install my-app ./helm-chart
      --namespace $KUBE_NAMESPACE_DEV
      --set image.tag=$CI_COMMIT_SHORT_SHA
  environment:
    name: staging
    url: https://staging.example.com
  tags:
    - k8s-runner
  needs: ["container_scanning"]
  rules:
    - if: '$CI_COMMIT_BRANCH == "dev"'
deploy_production:
  stage: deploy_prod
  image: registry.gitlab.com/gitlab-org/cluster-integration/helm-install-image:latest
  script:
    # KUBE_CONTEXT_PROD is an assumed CI/CD variable holding the production context name
    - kubectl config use-context "$KUBE_CONTEXT_PROD"
    # "my-app" is a placeholder Helm release name
    - >
      helm upgrade --install my-app ./helm-chart
      --namespace $KUBE_NAMESPACE_PROD
      --set image.tag=$CI_COMMIT_SHORT_SHA
  environment:
    name: production
    url: https://example.com
  tags:
    - k8s-runner-prod
  needs: ["container_scanning"]
  rules:
    # Production deploys run only on main and require a manual click
    - if: '$CI_COMMIT_BRANCH == "main"'
      when: manual

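7. Best Practices

7.1. Pipeline Optimization

Run jobs in parallel: parallel: 5

Trigger only when needed: limit jobs to merge requests and main, for example with only: [merge_requests, main] or the equivalent rules

Optimize dependencies: use needs to reduce waiting time

Conditional execution: prefer rules over only/except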
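
The following sketch combines two of these points: a job split into five parallel instances and restricted with rules to merge requests and main. The job name and the run-tests.sh script with its flags are hypothetical; CI_NODE_INDEX and CI_NODE_TOTAL are the variables GitLab sets for parallel jobs.

```yaml
test_sharded:                # hypothetical job name
  stage: test
  parallel: 5                # creates five instances: test_sharded 1/5 to 5/5
  script:
    # CI_NODE_INDEX starts at 1; CI_NODE_TOTAL is 5 here
    - echo "Running shard $CI_NODE_INDEX of $CI_NODE_TOTAL"
    - ./run-tests.sh --shard "$CI_NODE_INDEX" --total "$CI_NODE_TOTAL"   # hypothetical script
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
    - if: '$CI_COMMIT_BRANCH == "main"'
```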

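7.2. Security Hardening

# Avoid the Shell executor; prefer the Docker or Kubernetes executor
# Store secrets under Settings → CI/CD → Variables and mark them "Masked" and "Protected"
# Do not hard-code real secret values in .gitlab-ci.yml; the value below is only a placeholder
variables:
  SECURE_VAR: 'protected_value'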

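7.3. Performance Optimization

# Reuse the Docker layer cache from the registry ($DOCKER_IMAGE as defined in section 6)
build:
  image: docker:20.10.16
  services:
    - docker:20.10.16-dind
  variables:
    DOCKER_BUILDKIT: 1
  before_script:
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
    - docker pull $CI_REGISTRY_IMAGE:latest || true
  script:
    # With BuildKit, --cache-from needs an image built with inline cache metadata
    - >
      docker build
      --build-arg BUILDKIT_INLINE_CACHE=1
      --cache-from $CI_REGISTRY_IMAGE:latest
      -t $DOCKER_IMAGE
      -t $CI_REGISTRY_IMAGE:latest .
    # Push latest so the next pipeline has a cache source
    - docker push $CI_REGISTRY_IMAGE:latest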

7.4. Managing Multiple Environments

# Split the configuration into included files
include:
  - local: '/ci/build.yml'
  - local: '/ci/test.yml'
  - project: 'my-group/common-ci'
    file: '/security-scan.yml'

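8. Troubleshooting

8.1. Common Errors and Fixes

Runner fails to register

Make sure a Runner version compatible with your GitLab instance is installed

Verify that the registration token is valid and has not expired

Check the firewall and network connectivity to the GitLab instance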

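Job stuck in pending

# Run the runner in the foreground with debug logging (stop the service first)
sudo gitlab-runner --debug run
# List the configured runners, then compare the job's tags with the runner's tags in Settings → CI/CD → Runners
sudo gitlab-runner list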

Docker permission errors

# Add the gitlab-runner user to the docker group (relevant to the Shell executor), then restart the runner
sudo usermod -aG docker gitlab-runner

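Cache not taking effect

Make sure the cached paths are inside the project directory

Check that cache:key resolves to the expected value

Verify that the runner's configuration allows caching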

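8.2. Debugging Tips

Set the variable CI_DEBUG_TRACE: "true" for verbose logs (the trace can expose secret values, so restrict access to the job log)

Test a job locally: gitlab-runner exec docker <job_name> (exec is deprecated since GitLab Runner 15.7 and is not available in newer releases)

Inspect a job's environment: temporarily add before_script: ["sleep 3600"], then attach to the job container and debug by hand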
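
As a sketch of the first tip, debug tracing can be limited to a single job rather than enabled for the whole pipeline. The job name below is hypothetical.

```yaml
debug_job:                   # hypothetical job name
  stage: test
  variables:
    CI_DEBUG_TRACE: "true"   # verbose trace for this job only; may reveal secret values
  script:
    - echo "Inspect the job log for the expanded shell trace"
```

Remove the variable once the problem is found, since the trace prints every command along with the variable values it uses.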

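9. Summary

GitLab CI/CD provides solid continuous integration and deployment capabilities. With a well-structured .gitlab-ci.yml file, you can automate the whole path from commit to production deployment. This tutorial covered the core concepts, runner installation, pipeline authoring, advanced features, and troubleshooting. Applied well, these techniques speed up development, reduce human error, and help keep code quality and deployments under control. As a project grows more complex, continually tuning the CI/CD pipeline becomes a regular part of engineering practice, so adapt the practices here to the needs of your own project.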
