Guidance on testing Terraform modules
This page outlines our approach to testing Terraform modules. With the introduction of the `terraform test` command in Terraform version 1.6, the Cloud Team has begun writing tests for our shared, reusable modules in the `uis/devops/infra/terraform` GitLab group.
To support this, we have discussed and agreed on a set of initial processes for when and how to test these modules. This document serves to record and communicate those decisions.
Which Terraform code should we test
With the introduction of the `terraform test` command, it may be tempting to assume that all Terraform code should now include tests. However, we believe there is a clear distinction between shared, reusable modules and standard infrastructure repositories, often referred to as root modules.
Shared modules, such as those in the `uis/devops/infra/terraform` group, are designed to be reused across many infrastructure projects. Changes to these modules can have unintended consequences for multiple downstream configurations. They also tend to contain more complex logic to support varied use cases. For these reasons, we believe shared modules are ideal candidates for Terraform testing.
Using `terraform test`, we can validate the logic of these modules and deploy test resources to ensure all supported configurations are functional. This helps catch issues early, before releasing a new version for others to use. In the past, we often relied on manual testing or only discovered problems when a downstream project failed after updating the module.
In contrast, infrastructure repositories are generally self-contained. They deploy infrastructure for a specific product and typically make use of our upstream shared modules. These repos should already use the standard `terraform-pipeline.yml` template, which automatically deploys changes on the default branch to a staging environment. This already provides a form of integration testing: if the deployment to staging fails, the pipeline will not proceed to production.
For these reasons, we are not currently recommending that teams write Terraform tests for standard infrastructure repositories. As our use of this feature matures, we may revisit this position for edge cases involving particularly complex logic, but for now, our focus remains on testing reusable modules.
Unit tests vs integration tests
By default, a `terraform test` run will deploy real infrastructure, evaluate your assertions, and then tear down the resources. We treat this as a form of integration testing, since it validates actual deployments in a real environment.
Alternatively, Terraform tests can be configured to run in plan mode, where only a plan is generated and assertions are made against the planned changes. No resources are deployed in this mode, making it faster and more lightweight. We refer to this as unit testing for Terraform code.
Because deploying infrastructure can be time-consuming and resource-intensive, we aim to limit the number of full integration tests per module. Ideally, each module will include one or two integration tests to confirm that key usage patterns function as expected.
For testing logic—such as conditional behaviour or the number and types of resources to be created—we prefer plan-based unit tests. These are significantly faster and more efficient to run.
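A plan-based unit test sets `command = plan` in its run block so that assertions are evaluated against the planned changes without creating anything. A minimal sketch (the variable name and resource address are hypothetical):

```hcl
# tests/unit.tftest.hcl (hypothetical example)

run "creates_expected_number_of_instances" {
  # Only a plan is generated; no real infrastructure is deployed.
  command = plan

  variables {
    instance_count = 3
  }

  assert {
    condition     = length(google_compute_instance.main) == 3
    error_message = "Expected exactly 3 instances to be planned."
  }
}
```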
From version 1.7 onwards, Terraform also supports provider mocking, which allows us to simulate provider responses during testing. This enhances unit tests further by enabling validation of more scenarios without requiring real infrastructure.
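With provider mocking, a `mock_provider` block replaces the real provider, and Terraform generates synthetic values for computed attributes. A hedged sketch, assuming a module that uses the `google` provider (the resource and attribute values are illustrative):

```hcl
# tests/mocked.tftest.hcl (hypothetical example)

mock_provider "google" {
  # Optionally pin computed attributes to predictable values.
  mock_resource "google_storage_bucket" {
    defaults = {
      self_link = "https://example.com/mocked-bucket"
    }
  }
}

run "uses_mocked_provider" {
  # With a mocked provider, even "apply" runs without real infrastructure.
  command = apply

  assert {
    condition     = google_storage_bucket.main.self_link != ""
    error_message = "Expected a self_link value from the mocked provider."
  }
}
```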
Reporting and code coverage
As of Terraform version 1.7, the `terraform test` command supports outputting test results in JUnit format. This integrates well with GitLab's unit test reports feature and is enabled by default in the `terraform-test` job within our standard `terraform-module.yml` pipeline template.
Currently, code coverage metrics are not available for Terraform tests. This means we cannot measure how much of a module's logic is exercised by the test suite. However, this capability may be introduced in future Terraform releases as the testing functionality continues to evolve.
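The wiring between the two looks roughly like this; the report filename is an assumption, and the actual `terraform-module.yml` template may differ in detail:

```yaml
# Illustrative job configuration; report path is an assumption.
terraform-test:
  script:
    # The -junit-xml flag is available from Terraform 1.7 onwards.
    - terraform test -junit-xml=report.xml
  artifacts:
    reports:
      junit: report.xml
```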
Where should we deploy resources for testing
As mentioned earlier, we are currently only writing tests for shared modules in the `uis/devops/infra/terraform` group. To support this, we have provisioned a Google Cloud product named Infra Testing using the `gcp-product-factory`. This product includes the standard meta-project along with a single integration environment project, which is shared across all module tests. All shared modules in the group deploy their test resources to this common integration project.
A product-specific GKE GitLab runner is also deployed to allow impersonation of the `terraform-deploy` service account, which is provisioned by the `gcp-product-factory`. This service account should be used when deploying resources as part of these tests.
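In practice, this usually means configuring the Google provider to impersonate that service account. A sketch, with a hypothetical project ID in the email address:

```hcl
provider "google" {
  # The email below is illustrative; the real value is the terraform-deploy
  # service account provisioned by the gcp-product-factory.
  impersonate_service_account = "terraform-deploy@meta-project-id.iam.gserviceaccount.com"
}
```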
Label everything!
Because all test resources are deployed to a single shared Google project, it is essential that we can identify which resources belong to which module.
To support this, all resources that support labels must include the `ucam-devops-terraform-test` label. For resources that do not support labels, this value should be added to the resource description instead.
The label value should be unique to each module. We've agreed to use the module's project name for this purpose. For example, the `gcp-cloud-run-app` module should apply the following label to its resources:
```hcl
labels = {
  ucam-devops-terraform-test = "gcp-cloud-run-app"
}
```
How do we ensure that we don't leave resources behind
The `terraform test` command attempts to destroy all resources it creates at the end of each test run. In most cases, this works reliably. However, if a test run fails unexpectedly, Terraform may not clean up the resources properly.
To handle this, we've developed a tool called the `terraform-cleanup-tool`. This Python utility is designed specifically to clean up resources created during testing of our reusable modules in the `uis/devops/infra/terraform` group.
You can run the tool locally to clean up after a failed manual test. It is also integrated into our standard `terraform-module.yml` pipeline template.
Cleanup tool supported resources
The cleanup tool currently supports a defined set of resource types, as listed in the project's README. If your tests create resources that are not yet supported, you are expected to update the tool to handle them appropriately.
Dealing with deletion protection
Many Google Cloud resources include a `deletion_protection` attribute to prevent accidental deletion. By default, neither the built-in `terraform test` teardown nor our `terraform-test-cleanup` tool modifies this setting.
As a result, if a resource has deletion protection enabled, your Terraform tests must explicitly disable it. This ensures resources can be properly destroyed after the test run.
To support this, shared modules often expose a `deletion_protection` variable that can be overridden during testing. This allows tests to disable the setting while keeping it enabled by default in production.
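A hedged sketch of this pattern; the variable and resource names are illustrative:

```hcl
# variables.tf (illustrative)
variable "deletion_protection" {
  description = "Protects key resources from accidental deletion."
  type        = bool
  default     = true # enabled by default for production use
}

# main.tf (illustrative)
resource "google_sql_database_instance" "main" {
  name                = "example-instance"
  deletion_protection = var.deletion_protection
}
```

Tests then set `deletion_protection = false`, for example in `tests/tests.auto.tfvars`, so that teardown can destroy the resources.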
The standard `terraform-module.yml` pipeline
To ensure consistent configuration of our reusable modules, we've created a standard CI pipeline template named `terraform-module.yml`, located in the `ci-templates` repository. This template provides the following functionality:

- Linting via the `terraform-lint.yml` template, including:
    - `terraform fmt`
    - `terraform validate`
    - `tflint`
    - `trivy`
- Terraform testing through a generic `terraform-test` job
- Packaging and publishing the module to the GitLab internal Terraform module registry, via the Auto-DevOps `Terraform-Module.gitlab-ci.yml` template
All shared modules should include this pipeline by referencing it in their `.gitlab-ci.yml` files.
The following sections provide more detail about how the `terraform-test` job works and why certain elements, such as resource groups and cleanup steps, are included.
Providing values for input variables
The `terraform-test` job will automatically load variables from a `./tests/tests.auto.tfvars` file, if present. For most simple cases, this is the recommended place to define test variables.
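For example, a module whose tests need a project ID and region might define the following; the variable names and values are hypothetical:

```hcl
# tests/tests.auto.tfvars (hypothetical variable names and values)
project = "infra-testing-integration-project"
region  = "europe-west2"
```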
For more control, you can use the `TERRAFORM_TEST_ARGS` variable to specify additional arguments to pass to the `terraform test` command. For example, to specify a different `.tfvars` file:
```yaml
terraform-test:
  variables:
    TERRAFORM_TEST_ARGS: -var-file=$CI_PROJECT_DIR/something-else.tfvars
```
Alternatively, you can define individual variables using Terraform's `TF_VAR_` prefix:
```yaml
terraform-test:
  variables:
    TF_VAR_project: my-different-project-id
```
Before and after script cleanup
The `terraform-test` job is configured with both a `before_script` and an `after_script`, each of which runs the `terraform-test-cleanup` tool. Running this tool before and after the test ensures that no leftover resources from a previous run interfere with the current test.
Without this step, stale resources could lead to failures due to name conflicts or unexpected state. The cleanup scripts run quickly when there's nothing to remove, so we consider this a low-cost but effective safeguard.
As of GitLab 17, the `after_script` now executes even if a job is cancelled. This long-awaited feature ensures cleanup runs even in failure or cancellation scenarios.
However, users can still force-cancel a pipeline, which skips the `after_script` entirely. To protect against this edge case, we also run the cleanup tool in the `before_script`, ensuring that any leftover resources are removed before the next test run begins. This helps avoid issues like name or subnet clashes.
Overriding the `terraform-test` `before_script` and `after_script`
If you need to customise the `before_script` or `after_script` for the `terraform-test` job, you must still include the `terraform-test-cleanup` tool to ensure proper cleanup.
You can use GitLab's `!reference` syntax to include the default logic alongside your additions. For example, to extend the `before_script`:
```yaml
terraform-test:
  before_script:
    - !reference [.terraform-test-cleanup, before_script]
    - echo "Some other command here..."
```
And for the `after_script` you can do the same thing:
```yaml
terraform-test:
  after_script:
    - !reference [.terraform-test-cleanup, after_script]
    - echo "Some other command here..."
```
Resource groups avoid duplicate test runs
To avoid conflicts such as resource name or IP address clashes, the `terraform-test` job uses GitLab's `resource_group` setting:
```yaml
terraform-test:
  # ...
  resource_group: $CI_PROJECT_NAME
```
This configuration ensures that only one instance of the `terraform-test` job can run at a time per project. If another pipeline is triggered, either by a new commit or by another developer working in parallel, any additional `terraform-test` jobs will be queued.
They will display a message indicating they are waiting for the resource group to become available. Once the running job completes, the next one in the queue will start automatically.
Running jobs in parallel using a job matrix
As of Terraform version 1.12, the `terraform test` command supports parallel execution natively. However, this feature has some limitations, and care is needed when writing tests to avoid unintended conflicts.
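Our understanding of the native support is that run blocks opt in with the `parallel` attribute and must use distinct `state_key` values so their state does not conflict. Treat the following as a sketch to verify against the Terraform documentation; the run names and variables are illustrative:

```hcl
# tests/parallel.tftest.hcl (illustrative; verify against the Terraform docs)

run "test_one" {
  parallel  = true
  state_key = "one" # separate state so parallel runs do not clash

  variables {
    name = "instance-one"
  }
}

run "test_two" {
  parallel  = true
  state_key = "two"

  variables {
    name = "instance-two"
  }
}
```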
An alternative, especially if you're using a version earlier than 1.12, is to run tests in parallel using a job matrix. This involves extending the `terraform-test` job and passing a different test file to each matrix entry.
For example, if your project includes two test files, `tests/one.tftest.hcl` and `tests/two.tftest.hcl`, you could configure the following in your `.gitlab-ci.yml`:
```yaml
terraform-test:
  parallel:
    matrix:
      - TERRAFORM_TEST_FILTER: tests/one.tftest.hcl
      - TERRAFORM_TEST_FILTER: tests/two.tftest.hcl
  resource_group: "$CI_PROJECT_NAME/$TERRAFORM_TEST_FILTER"
```
The `terraform-test` job supports the `TERRAFORM_TEST_FILTER` variable to specify a single test file to run. By setting this variable in each matrix entry, you can run multiple tests in parallel.
resource_group must remain unique
You must remember to also customise the `resource_group` to be unique per test file. Without this, all jobs would share the same group and run sequentially instead of in parallel.