DevOps standards for Continuous Integration and Continuous Delivery (CI/CD)¶
Introduction¶
This document provides an overview of the University of Cambridge Information Services’ (UIS’) Development and Operations (DevOps) Division approach to implementing Continuous Integration and Continuous Delivery (CI/CD) as per ISO standard ISO/IEC/IEEE 32675:2022(E) “Information technology — DevOps — Building reliable and secure systems including application build, package and deployment”.
CI/CD is the automation of most or all of the manual processes traditionally required to get new code deployed to production. It encompasses the build, review, test and deployment processes. Since we provision infrastructure as code, this extends to infrastructure provisioning. With a CI/CD pipeline, development teams can make changes to code that are then automatically tested and pushed out for delivery and deployment. Automation makes processes predictable, reviewable, and repeatable so that there is less opportunity for error from human intervention.
In this document we consider CI/CD broadly, encompassing processes which a) are automated, b) have outputs which relate to the DevOps development lifecycle and c) are triggered by engineers' actions or on a schedule. Notably absent from this are scheduled processes which run as part of services themselves, such as scheduled data imports, syncs or other batch processes.
Where CI/CD processes are programming language specific, the provision of standard CI/CD templates may not be the same between languages. It is desirable to close feature gaps between languages where they exist, for example, by generating templates that are compatible with multiple languages.
This document describes CI/CD standards at a high-level. Specifics and implementation details may be found elsewhere. For example, our guidebook has a page on how to add CI/CD to a Python project and a page on configuring release automation via CI/CD. Our existing web application and terraform deployment boilerplates already include the standards described below.
CI/CD automation is performed via GitLab pipelines. These standards are either implemented in the shared DevOps CI template configurations or use out of the box solutions.
Applicability¶
This standard should be implemented for all services that the university develops, whether internally or through third parties. It is particularly relevant to services that handle sensitive data or support important processes, where a security breach could have significant negative impacts.
Exceptions can be submitted following the exemption process within the Systems Management Policy, informing and seeking approval from the Tech Leads Forum first.
Terminology¶
A job is an automated process which usually takes as its input a specific commit of the code and some configuration variables. Jobs may generate artefacts which are either files shared subsequently with other jobs or files which are published to some external registry. Jobs may have intended side-effects such as releasing a new version of some software or provisioning some infrastructure.
Jobs are collected into pipelines. Within a pipeline, jobs may depend on one another for artefacts or side-effects. The configuration of jobs and pipelines lives within the git repository of a project, usually in a .gitlab-ci.yml file.
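As a minimal, purely illustrative sketch (the job names, image and commands below are hypothetical rather than taken from our templates), a .gitlab-ci.yml defining two jobs, one of which consumes the other's artefacts, might look like this:

```yaml
# Hypothetical minimal pipeline: a build job producing an artefact and a test
# job which depends on it.
stages:
  - build
  - test

build-package:
  stage: build
  image: python:3.12
  script:
    - pip install build
    - python -m build            # builds sdist and wheel into dist/
  artifacts:
    paths:
      - dist/                    # artefact shared with later jobs

test-package:
  stage: test
  image: python:3.12
  needs: ["build-package"]       # fetch the build job's artefacts
  script:
    - pip install dist/*.whl
    - pip show example-package   # hypothetical package name
```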
Pipelines are triggered from multiple sources. Most commonly we use push-triggered pipelines which trigger when new commits are pushed to a repository, tag-triggered pipelines which trigger when commits are tagged and scheduled pipelines which run according to a defined schedule.
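For example, individual jobs can opt in or out of these pipeline types using GitLab's `rules:` keyword and predefined variables (a generic sketch, not one of our template jobs):

```yaml
# Runs only in tag-triggered pipelines.
publish:
  script:
    - echo "publishing release $CI_COMMIT_TAG"
  rules:
    - if: $CI_COMMIT_TAG

# Runs only in scheduled pipelines.
nightly-analysis:
  script:
    - echo "running scheduled analysis"
  rules:
    - if: $CI_PIPELINE_SOURCE == "schedule"

# Runs in ordinary push-triggered pipelines for branches.
tests:
  script:
    - echo "running tests"
  rules:
    - if: $CI_PIPELINE_SOURCE == "push" && $CI_COMMIT_TAG == null
```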
Jobs within pipelines may indicate that they deploy to environments. The most recent jobs which have deployed to an environment can be seen in the “Environments” tab in GitLab.
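A job declares the environment it deploys to with the `environment:` keyword, for example (the environment name and URL here are illustrative):

```yaml
deploy-staging:
  stage: deploy
  script:
    - echo "deploy to staging"                # placeholder for the real deployment step
  environment:
    name: staging
    url: https://staging.example.cam.ac.uk    # hypothetical URL shown in the Environments tab
```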
Figure taken from our Guidebook tutorial on creating a Python package.
Cloud Platform-as-a-Service¶
The following sections detail our standard CI pipeline workflows. While the concepts are generic, the implementation of our custom CI templates is specifically designed for use with our standard Platform-as-a-Service (PaaS) based on Google Cloud and follows our published standards on Development and Deployment. This is so that we can ensure certain tasks which may require additional thought, such as authentication to external systems, are implemented in a consistent, approved and secure way. As such, there are some core cloud platform prerequisites which must be in place before using many of the workflows below:
- A folder/product needs to be created first on Google Cloud using the gcp-product-factory. This will create the space in our PaaS to host your service and will ensure that all the required pre-configuration is securely deployed following our standards.
- The GitLab projects which will store source code and run the CI/CD pipelines must be deployed via the gitlab-project-factory. Existing projects can be imported.
- A GitLab Runner must have been deployed for the GitLab projects in question via the gitlab-runner-infrastructure Terraform configuration. This ensures that the GitLab runner required to execute CI and CD jobs has the correct configuration and is secure.
For a more detailed explanation on our Google Cloud standards see the Cloud Platform section in the Guidebook.
GitLab Auto-DevOps¶
Pipelines should use GitLab’s Auto-DevOps feature. This provides a number of standard jobs for tasks such as container image builds, dependency scanning, secret detection, and code quality.
Enabling Auto-DevOps can be as simple as including the Auto-DevOps template in a project’s .gitlab-ci.yml file. However, in practice we have embedded the Auto-DevOps template into our own CI templates to allow us to extend the Auto-DevOps pipeline, adding additional functionality where required. For example, our standard web application boilerplate uses our common-pipeline.yml, which should be the default template used in any of our Python projects.
Our CI templates can be found in the dedicated ci-templates GitLab project.
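For illustration, the simplest way to enable Auto-DevOps in a .gitlab-ci.yml is to include GitLab's built-in template; the commented-out include shows the general shape of pulling in one of our own templates instead, with placeholder project path and ref:

```yaml
# Plain GitLab Auto-DevOps.
include:
  - template: Auto-DevOps.gitlab-ci.yml

# In practice we include our common pipeline, which embeds and extends
# Auto-DevOps. The project path and ref below are placeholders; see the
# ci-templates project for the real values.
#
# include:
#   - project: "<group>/ci-templates"
#     file: "common-pipeline.yml"
#     ref: "<release tag>"
```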
Stages¶
CI/CD pipelines have multiple stages corresponding to different classes of work. The following are jobs you should employ in CI/CD pipelines, grouped by stage. As mentioned above, some of the jobs described below ship with GitLab’s Auto DevOps feature and some have been created by UIS DevOps.
Build and package¶
For web application or library code, we build appropriate build artefacts. For web applications, this will be a container image built using the Auto-DevOps Build job. For libraries this is generally some language-specific package intended for upload to package registries such as maven, PyPI or npm. In this case we have to implement our own build job. For PyPI, this is implemented using our standard python-publish template.
Analyse¶
We employ a number of static analysis jobs which generate report artefacts. As a baseline, we must include the Auto-DevOps Dependency Scanning job as well as the other Auto-DevOps analysis jobs such as Container Scanning, Code Quality, Secret Detection, and Static Application Security Testing (SAST).
The output from these analysis jobs is provided back to GitLab which allows us to rapidly scan for projects using out-of-date dependencies or ones whose container images contain libraries with known vulnerabilities.
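For reference, when a project is not using the full Auto-DevOps template, these analysis jobs can be pulled in individually from GitLab's built-in CI templates, along the lines of:

```yaml
# Template names as used in recent GitLab versions; check the GitLab
# documentation for your instance's version.
include:
  - template: Security/Dependency-Scanning.gitlab-ci.yml
  - template: Security/Container-Scanning.gitlab-ci.yml
  - template: Security/Secret-Detection.gitlab-ci.yml
  - template: Security/SAST.gitlab-ci.yml
  - template: Jobs/Code-Quality.gitlab-ci.yml
```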
GitLab security policies run this analysis step for the main branch regularly so that we may scan for new vulnerabilities as they arise.
Test and lint¶
“Linting” is a term of art used in Software Development to refer to automated code formatting checks. These checks do not check functional correctness per se but rather look for common patterns which, although still allowing the code to function correctly, are considered detrimental to readability or maintainability. Recently it has become common to enforce an opinionated code formatting style for reasons of consistency and for that code formatting style to be amenable for automatic application. Below we’ll use “code format checking” to mean the use of both linting and more opinionated code style tooling.
In the test and lint stage there will be some form of automated testing and code style review. Test standards are outside this paper's scope and are covered in a separate paper. For our Python projects, use our python-tox template to run tests using tox, and for our Terraform repositories use the jobs in the test stage of the terraform-pipeline template.
For projects with unit tests, the test job will generate a code-coverage artefact which indicates which code the unit tests actually exercised. This is used later in the review section.
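A sketch of what such a test job can look like (hypothetical; the python-tox template provides the real implementation, and the report paths assume tox is configured to write JUnit and Cobertura output):

```yaml
test:
  stage: test
  image: python:3.12
  script:
    - pip install tox
    - tox
  artifacts:
    reports:
      junit: junit.xml                    # test report attached to merge requests
      coverage_report:
        coverage_format: cobertura
        path: coverage.xml                # coverage artefact used at review time
```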
Testing jobs usually comprise the majority of pipelines triggered by commits to the code.
For pipelines which are triggered by tags, we will also perform some basic correctness checks such as ensuring that the version of the software in the packaging metadata matches the name of the tag. This is to enforce that a tag named 1.2.3 does actually correspond with version 1.2.3 of the software.
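Such a check can be as simple as comparing the packaging metadata with the `$CI_COMMIT_TAG` variable; a hypothetical sketch for a project using pyproject.toml:

```yaml
check-tag-matches-version:
  stage: test
  image: python:3.12
  rules:
    - if: $CI_COMMIT_TAG
  script:
    # Fail the pipeline if the tag does not match the declared package version.
    # Assumes tags may carry a leading "v", e.g. v1.2.3.
    - |
      python -c '
      import os, tomllib
      version = tomllib.load(open("pyproject.toml", "rb"))["project"]["version"]
      tag = os.environ["CI_COMMIT_TAG"].lstrip("v")
      raise SystemExit(0 if tag == version else f"tag {tag} does not match version {version}")
      '
```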
We also use pre-commit hooks to specify which code formatting tools, and which configuration, to use. Use our pre-commit template to enforce pre-commit checks in CI/CD. The use of pre-commit also allows developers to run the same tools locally when committing changes. This helps lower the latency of the feedback loop for what may be viewed as “trivial” code formatting changes, and helps ensure that reviewers stay focused on content rather than style.
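A minimal job enforcing the same checks in CI might look like this (hypothetical; our pre-commit CI template provides an equivalent job):

```yaml
pre-commit:
  stage: test
  image: python:3.12
  script:
    - pip install pre-commit
    # Runs every hook defined in .pre-commit-config.yaml against the whole tree.
    - pre-commit run --all-files
```

Developers get the same checks locally by running `pre-commit install` once in their clone, which installs the hooks into the repository's git configuration.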
Review¶
GitLab Merge Requests (MRs) aggregate a number of reports generated by CI/CD jobs. Exactly which jobs are run depends on the nature of the repository. For web application and library code, the code coverage artefact will be used to highlight the changed portions of code depending on whether the tests exercised them or not. Similarly, a test report will be attached to the MR indicating which tests ran and which passed. MRs cannot ordinarily be merged unless all tests pass. When a container image is built, tests are run within the container to provide some confidence that the packaging is successful.
Code format checking is treated as a test in its own right and so formatting issues will need to be addressed or manually skipped for MRs to be merged.
While we currently perform dependency and security vulnerability analysis for MRs, we do not prevent merging if new vulnerabilities are detected. GitLab provides mechanisms to define security policies surrounding vulnerability triaging and reviewing which we are in the process of adopting; these will be covered in another paper.
Bill-of-materials analysis for a project in GitLab. The analysis for this page is performed as a CI/CD job.
Release and publish¶
Jobs in this stage are performed only on commits which have been merged into the default branch. Even so, we run the test and lint jobs once more and do not proceed until they pass.
Use our standard release automation workflow for creating and tagging new releases. Details of this workflow are available in the guidebook but in brief, we have two forms of release automation:
Variant A involves jobs triggered when new commits land in the default branch. These jobs curate a CHANGELOG.md file and increment the version number specified in the packaging configuration. Version numbers are incremented in line with semantic versioning, informed by conventional commit messages. In this case, we use the commitlint job, which enforces conventional commit naming schemes. Once the CHANGELOG.md file and versioning metadata are updated, a new tag and GitLab release are created, named after the version.
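As a rough sketch of such a check (hypothetical; the real commitlint job is defined in our CI templates), commit messages in a merge request can be validated against the conventional commit rules:

```yaml
commitlint:
  stage: test
  image: node:20
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
  script:
    # Assumes the repository contains a commitlint configuration file
    # (e.g. commitlint.config.js extending @commitlint/config-conventional).
    - npm install --no-save @commitlint/cli @commitlint/config-conventional
    # Check every commit introduced by the merge request.
    - npx commitlint --from "$CI_MERGE_REQUEST_DIFF_BASE_SHA" --to HEAD
```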
Variant B similarly involves jobs triggered when new commits land in the default branch. This time changes to the version number and CHANGELOG.md file are kept in a separate MR. Merging this MR effectively causes a new tag and GitLab release to be created. This variant is suitable for products where new releases should not be created for each change which lands in the default branch.
Example MR managed by variant B of the automated release process.
The GitLab release and automated changelog generated when merged.
No matter how the git tag corresponding to a release is made, once created a pipeline is triggered to publish the build artefact for that release. For libraries this will involve publishing the artefact to a package registry appropriate for the language. This will be a public registry for Open Source libraries or the private GitLab registry for private or internal libraries.
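For example, a Python library can be uploaded to the project's own GitLab package registry using the job token (a simplified sketch; our python-publish template covers this, including the public PyPI case):

```yaml
publish-package:
  stage: deploy                   # hypothetical stage name
  image: python:3.12
  rules:
    - if: $CI_COMMIT_TAG          # publish only from tag pipelines
  script:
    - pip install build twine
    - python -m build
    # Upload to this project's GitLab PyPI-compatible registry using the job token.
    - >
      TWINE_USERNAME=gitlab-ci-token TWINE_PASSWORD=$CI_JOB_TOKEN
      python -m twine upload
      --repository-url "${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/packages/pypi"
      dist/*
```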
For web applications, publication involves pushing the container image to the Google Cloud Artifact Registry tagged with the version, ready for deployment. Use our artifact-registry template for this task. This template handles authentication to both the GitLab Container Registry and the Google Cloud Artifact Registry in a secure and approved manner.
CI/CD pipeline triggered by a new git tag which builds, tests and publishes a container image to the Google Artifact Registry.
The container images pushed to the Google Artifact Registry ready for deployment.
Deploy¶
Deployment of services and provisioning of infrastructure is done via infrastructure-as-code using Terraform. Terraform code is code like any other, and so many of the CI/CD jobs described above are equally applicable.
Use our terraform-pipeline CI template for Terraform deployments. This template is designed explicitly for Terraform and runs test and code-format checking jobs as well as “plan” and “apply” jobs for our standard development, staging and production environments.
These jobs are triggered when new commits are pushed to the repository. The “plan” jobs allow manual review of infrastructure changes in advance of “apply”. Any commit may be deployed to the development environment but, ordinarily, only commits which land in the default branch should trigger deployments to staging or production environments.
For commits which are merged into the default branch, the staging environment is, unless configured otherwise, deployed automatically. This lets us treat the staging environment as the most “up-to-date” infrastructure and ensure that the default branch is always deployable. Jobs are created for deployment to the development and production environments but they require manual approval to proceed. Manual approval for deployment to the production and development environments can be performed in the Pipelines view by pressing the “play button” next to the appropriate environment to trigger the deployment.
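A heavily simplified sketch of this shape of pipeline (the real terraform-pipeline template differs in detail, and backend configuration, workspaces and per-environment variables are omitted here):

```yaml
stages:
  - validate
  - plan
  - apply

.terraform:
  image:
    name: hashicorp/terraform:1.7
    entrypoint: [""]              # allow GitLab to run shell scripts in this image
  before_script:
    - terraform init

validate:
  extends: .terraform
  stage: validate
  script:
    - terraform fmt -check -recursive
    - terraform validate

plan-production:
  extends: .terraform
  stage: plan
  rules:
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
  script:
    - terraform plan -out=production.tfplan
  artifacts:
    paths:
      - production.tfplan         # reviewed before the apply job is approved

apply-production:
  extends: .terraform
  stage: apply
  needs: ["plan-production"]
  rules:
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
      when: manual                # approved via the "play" button in the Pipelines view
  environment:
    name: production
  script:
    - terraform apply production.tfplan
```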
GitLab also has an Environments view, which can be very useful to see at a glance which version is currently deployed to each environment. However, the “play” and “stop” buttons in this view should not be used: our Terraform pipeline jobs have not yet been configured with the required logic, so jobs triggered from the Environments view can perform unintended or confusing actions. Until that support has been added, triggering manual pipeline jobs must be done via the Pipelines view, as described above.
The “Environments” page for a product in GitLab showing which version is currently deployed to the staging environment.