How to tune GKE-hosted runner memory and CPU settings for GitLab CI jobs

This guide explains how to configure resource requests and limits for GitLab CI jobs that run on GKE-hosted GitLab runners. These settings help optimize performance and control resource usage for CI pipelines.

Note

This guide is particularly useful when CI jobs are terminated due to Out Of Memory (OOMKilled) errors (typically seen as exit code 137) in jobs such as pre-commit or python:tox.

Supported variables

The following variables are available for fine-tuning resource allocation for jobs:

  Variable                   Description                                                      Example
  KUBERNETES_MEMORY_REQUEST  Defines the amount of memory requested for the job’s container.  512Mi
  KUBERNETES_MEMORY_LIMIT    Defines the maximum amount of memory the container can use.      1Gi
  KUBERNETES_CPU_REQUEST     Defines the amount of CPU requested for the job’s container.     500m

For more details, see the official GitLab Runner documentation: Overwrite container resources

Note

The maximum values for resource overwrites depend on the size of the GKE node hosting the runner. Currently, the limit is set to half of the resources available on a single node: 2000m CPU and 8Gi memory.

Warning

Use these variables only when necessary, and increase resource values incrementally and responsibly. Setting excessively high limits (for example, 8Gi memory for every job) can cause inefficient cluster utilization, prevent autoscaling from working correctly, and increase costs. Always profile and measure before raising limits.

When to use

Only use these variables if:

  • A job repeatedly fails because it is OOMKilled (exit code 137).
  • A job runs significantly slower than expected due to CPU throttling.
  • Profiling shows that the job requires more resources than the default configuration provides.
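Exit code 137 is not arbitrary: shells report a process killed by a signal as 128 plus the signal number, and the kernel's OOM killer terminates processes with SIGKILL (signal 9). A quick sanity check:

```shell
# 128 + signal number; SIGKILL is signal 9, so an OOM-killed job exits with 137.
sigkill=9
echo $((128 + sigkill))
```

Knowing this arithmetic helps distinguish OOM kills from ordinary script failures, which return the script's own exit code.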

Estimating appropriate values

Before changing resource settings, profile the job locally or in a controlled environment to understand its actual requirements. An example profiling workflow is shown below:

  1. Run the same job locally in a Docker container, using the same image and script that the CI job uses.

  2. Monitor resource usage while the job runs:

    docker container top <container_id>
    docker stats <container_id>
    
  3. Note the peak CPU and memory usage.
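The profiling steps above can be sketched as a small script. The image name and job command below are placeholders for your own job, the docker commands are shown commented out because they must run on a machine with Docker available, and the 25% headroom factor is a rule of thumb, not an official recommendation:

```shell
#!/bin/sh
# Sketch: profile a CI job's container locally, then derive a memory request.
# IMAGE and CMD are placeholders -- substitute your job's image and script.
IMAGE="python:3.12"
CMD="tox"

# On a machine with Docker, start the job and sample its usage while it runs:
#   cid=$(docker run -d "$IMAGE" sh -c "$CMD")
#   docker container top "$cid"
#   docker stats --no-stream --format '{{.MemUsage}}' "$cid"

# Turn an observed peak (in MiB) into a KUBERNETES_MEMORY_REQUEST value,
# adding 25% headroom and rounding to a whole Mi.
suggest_request() {
  peak_mib=$1
  awk -v p="$peak_mib" 'BEGIN { printf "%dMi\n", (p * 1.25) + 0.5 }'
}

# Example: a 400MiB observed peak suggests a 500Mi request.
suggest_request 400
```

The request should cover typical usage with some headroom; the limit (KUBERNETES_MEMORY_LIMIT) can then be set somewhat above the request to absorb occasional spikes without reserving the extra memory permanently.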

Steps

  1. Define the resource tuning variables in a CI job configuration file. For example:

    resource-tuned-job:
      stage: test
      script:
        - echo "Running with tuned Kubernetes resource settings"
      variables:
        KUBERNETES_MEMORY_REQUEST: "512Mi"
        KUBERNETES_MEMORY_LIMIT: "1Gi"
        KUBERNETES_CPU_REQUEST: "500m"
      tags:
        - $GKE_RUNNER_TAG
    
  2. Commit and push the configuration changes. Subsequent CI jobs will use the specified memory and CPU settings on the Kubernetes runner.

Summary

This guide described how to control memory and CPU resources for CI jobs running on a GKE-hosted GitLab runner using the KUBERNETES_MEMORY_REQUEST, KUBERNETES_MEMORY_LIMIT, and KUBERNETES_CPU_REQUEST variables. These settings should be used sparingly, only when default configurations are insufficient, and should always be based on profiling data to maintain efficient use of cluster resources.