BAU Process and Responsibilities

Purpose

This document defines the Business As Usual (BAU) process for the Cloud Team, outlining its purpose, responsibilities, and operational expectations.

The BAU role ensures the continuous, reliable operation of the Cloud Team’s services, Terraform modules, and supporting infrastructure.

This rotating role designates one team member each week as the first line of response for operational events, notifications, and routine maintenance activities.

The goal is to provide a consistent point of contact, maintain situational awareness of current issues, and handle day-to-day operational work without disrupting ongoing project development.


Scope

The BAU Engineer acts as the initial point of contact for the Cloud Team for the duration of the assigned week. Their responsibilities include both proactive monitoring and reactive response across several areas.

Monitoring and Incident Response

  • Actively monitor alerts, alarms, and notifications from GCP, on-prem systems, Teams, and Halo.
  • Triage incoming alerts to determine severity and required action.
  • Where possible, investigate and remediate incidents directly.
  • Reassign issues internally within the Cloud Team if deeper investigation is needed.
  • Escalate to the appropriate Development Team if the issue relates to a product under their ownership.
  • Ensure all incidents and actions are properly documented in GitLab or Halo issues where applicable.
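
The routing decision in the bullets above can be sketched as a small function. This is only an illustration: the Alert fields and the order of the checks (ownership first, then depth of investigation) are assumptions made for the example, not a rule stated in this process.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    REMEDIATE = "investigate and remediate directly"
    REASSIGN = "reassign within the Cloud Team"
    ESCALATE = "escalate to the owning Development Team"


@dataclass
class Alert:
    # All fields are hypothetical; they stand in for what the BAU Engineer
    # learns while triaging an alert.
    source: str                     # e.g. "GCP", "on-prem", "Teams", "Halo"
    owned_by_dev_team: bool         # issue relates to a product a Development Team owns
    needs_deep_investigation: bool  # cannot be resolved quickly by the BAU Engineer


def triage(alert: Alert) -> Action:
    """Pick the next step for an alert (check order is an assumption)."""
    if alert.owned_by_dev_team:
        return Action.ESCALATE
    if alert.needs_deep_investigation:
        return Action.REASSIGN
    return Action.REMEDIATE
```

Whatever the outcome, the action taken should still be recorded in the relevant GitLab or Halo issue.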

Renovate Dependency Updates

  • Review the Dependency Dashboard for Renovate-generated merge requests in Cloud Team repositories (modules, boilerplates, or other infrastructure code).
  • Validate that updates are appropriate and do not introduce breaking changes.
  • Run or verify automated tests associated with the repository.
  • Merge safe and verified updates.
  • Flag any complex or high-risk changes for further review by creating a new issue tagged with workflowNeeds Refinement and teamCloud.
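
As a rough illustration of the merge-or-flag decision above, the sketch below classifies an update by how the version number changed. It assumes plain MAJOR.MINOR.PATCH version strings and treats a major bump or a failing test run as "high-risk"; both are assumptions made for the example, and the BAU Engineer's own judgement of complexity still applies.

```python
def parse_version(version: str) -> tuple:
    """Parse 'v1.2.3' or '1.2.3' into (1, 2, 3). Assumes three numeric parts."""
    major, minor, patch = (int(part) for part in version.lstrip("v").split("."))
    return (major, minor, patch)


def update_type(current: str, new: str) -> str:
    """Return 'major', 'minor', or 'patch' for a version change."""
    cur, nxt = parse_version(current), parse_version(new)
    if nxt[0] != cur[0]:
        return "major"
    if nxt[1] != cur[1]:
        return "minor"
    return "patch"


def review_decision(current: str, new: str, tests_passed: bool) -> str:
    """Return 'merge' or 'flag' (hypothetical rule, not a team policy)."""
    if not tests_passed:
        return "flag"  # never merge an update whose tests failed
    if update_type(current, new) == "major":
        return "flag"  # assumed high-risk: may contain breaking changes
    return "merge"
```

For example, review_decision("1.2.3", "1.2.4", tests_passed=True) returns "merge", while the same call with "2.0.0" as the new version returns "flag".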

Team Communications and Queries

  • Monitor the Cloud Team channel for messages, support requests, or operational questions.
  • Provide initial responses to developer or stakeholder queries where possible.
  • Reassign or escalate queries internally if specialised knowledge is required.
  • Maintain professional, helpful, and responsive communication throughout the assigned week.

Responsibilities Summary

Category               | Responsibilities                                          | Typical Actions
-----------------------|-----------------------------------------------------------|---------------------------------------------------
Monitoring & Alerts    | Triage and respond to alerts from GCP and on-prem systems | Acknowledge alerts, investigate, fix or escalate
Incident Documentation | Ensure all actions are recorded and traceable             | Comment on issues, update tickets, document in logs
Renovate Updates       | Process dependency update MRs                             | Review, test, merge, or escalate updates
Team Communication     | Respond to chat messages and support requests             | Answer, reassign, or follow up on questions

Duration and Rotation

  • The BAU role is rotated weekly among Cloud Team members.
  • The rota is managed centrally in the DevOps Cloud Shared calendar.
  • Engineers may swap BAU weeks between themselves where necessary, provided the change is communicated to the team during the daily stand-up.
  • Holiday or planned leave cover will be arranged by the Team Lead, ensuring the rota remains balanced and all weeks are covered.
  • The BAU Engineer should ensure that:
    • They are available and reachable during their assigned week.
    • A short handover is completed before the end of the week, briefing the next BAU Engineer on:
      • Any unresolved issues or incidents.
      • Pending Renovate merge requests (MRs).
      • Outstanding Halo tickets or chat items requiring follow-up.
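
The weekly rotation with agreed swaps can be sketched as follows. The rota itself lives in the DevOps Cloud Shared calendar; the engineer names, the start date, and the swaps mapping here are hypothetical and only illustrate the round-robin idea.

```python
from datetime import date, timedelta
from typing import Dict, List, Optional


def bau_engineer_for(
    day: date,
    engineers: List[str],
    rota_start: date,
    swaps: Optional[Dict[date, str]] = None,
) -> str:
    """Return who holds the BAU role on a given day.

    rota_start must be a Monday; engineers[0] covers that week.
    swaps maps the Monday of a week to the engineer covering it instead.
    """
    week_index = (day - rota_start).days // 7
    week_monday = rota_start + timedelta(weeks=week_index)
    if swaps and week_monday in swaps:
        return swaps[week_monday]
    return engineers[week_index % len(engineers)]


# Hypothetical team and start date (1 January 2024 was a Monday).
team = ["alice", "bob", "carol"]
start = date(2024, 1, 1)
```

With these values, bau_engineer_for(date(2024, 1, 8), team, start) returns "bob", and passing swaps={date(2024, 1, 8): "carol"} returns "carol" for that week instead.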

Expectations and Conduct

  • The BAU Engineer will use the Cloud BAU issue board. This board is automatically updated with issues tagged with cloud-bau.
  • The BAU Engineer is not expected to fix every issue personally, but they do own triage and must ensure that every issue is acknowledged and, where needed, escalated.
  • During their assigned week, BAU duties take priority over project work. Project planning and workload expectations for that week will reflect this.
  • Issues that cannot be resolved immediately should be:
    • Documented clearly and added to the teamCloud board.
    • Escalated appropriately (either internally or to the product team).