How to restore a GCS bucket from an AWS S3 backup
This guide describes how to restore a Google Cloud Storage (GCS) bucket from its backup stored in an AWS S3 bucket.
Overview
The Data Backup Service provides an automated way to enroll a GCS bucket for regular backups to an AWS S3 bucket. This is useful for disaster recovery purposes, ensuring that critical data stored in GCS is also available in a different cloud provider.
Once a GCS bucket has been enabled in the backup service, you can restore its contents from the AWS S3 backup in the event of data loss or corruption.
Prerequisites
- Your GCS bucket has been previously enrolled in the Data Backup Service and has existing backups in the AWS S3 bucket.
- You have the necessary permissions to copy data to the GCS buckets you wish to restore. The role required for this is roles/storage.objectCreator.
- Use your gcloudadmin account to perform these operations, which will require deploy or admin permissions.
Steps to restore a GCS bucket from AWS S3 backup
For this procedure, we'll assume the following:
- GCP Project ID: my-gcp-project.
- GCS Bucket to restore: my-gcs-bucket.
- gcloudadmin group with admin permissions for the project is myteam-admin@gcloudadmin.g.apps.cam.ac.uk.
Perform the following steps:
- Find the GCP project ID.
- Find the GCS bucket name you wish to restore.
- Find the gcloudadmin group with admin permissions for the project.
- Check out the latest main branch of the data-backup/infrastructure repo.
- Create a new branch for your changes:

    git checkout -b restore-my-gcs-bucket

- Open restore.tf.
- Add a new object to the local variable restore_configuration for production as shown below:

    locals {
      restore_configuration = {
        production = [
          {
            project_id                  = "my-gcp-project",
            bucket_name                 = "my-gcs-bucket",
            iam_gcloudadmin_group_names = ["myteam-admin"]
          }
        ]
      }
    }

- Commit and push the changes on your branch and create a merge request to main.
- Check the pipeline for the merge request to ensure it passes all checks.
- Once the merge request has been approved and merged, apply the changes to the production environment using the CI pipeline for the merged main branch.
- This will start the restore process. Depending on the size of the GCS bucket being restored, this may take some time. You can check the progress of the restore on the data-backup Storage Transfer Jobs page.
- The restore process will copy the contents of the AWS S3 backup bucket to a restore bucket in the data-backup GCP project, into a folder named after the project ID and bucket name, as shown below:

    data-backup-prod-s3-restore-65434812/my-gcp-project/my-gcs-bucket

- The required IAM permissions for the specified gcloudadmin group are added to the restore bucket to allow copying the restored data to the original GCS bucket, either partially or fully, depending on the nature of the data loss or corruption.
- It is envisaged that the copying of data from the restore bucket to the original GCS bucket will be done manually by the service team responsible for the GCS bucket, using tools such as:
- Once the restore is complete, it is recommended to remove the restore configuration added in step 7 by deleting the relevant object from the local variable restore_configuration, creating a new merge request to main, and applying the changes using the CI pipeline.
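
The manual copy from the restore bucket back to the original GCS bucket could be prepared along the lines of the sketch below. It is an illustration under stated assumptions, not the service team's prescribed procedure. It assumes the gcloud CLI and its `gcloud storage rsync` command are the chosen copy tool, which this guide does not state, and that the restore bucket name matches the example path above. The helper names `restore_source_url` and `print_copy_command` are hypothetical. The script only prints the command so that it can be reviewed before anyone runs it.

```shell
#!/usr/bin/env bash
# Sketch only: builds the gs:// URLs for a restore and prints the copy
# command for review instead of executing it.
set -euo pipefail

# Assumption: restore bucket name taken from the example path in this guide.
RESTORE_BUCKET="data-backup-prod-s3-restore-65434812"

# Hypothetical helper: location of the restored data, laid out as
# <restore bucket>/<project id>/<bucket name>.
restore_source_url() {
  local project_id="$1" bucket_name="$2"
  printf 'gs://%s/%s/%s\n' "$RESTORE_BUCKET" "$project_id" "$bucket_name"
}

# Hypothetical helper: print (do not run) a recursive sync from the restore
# folder to the original bucket. An optional third argument narrows the copy
# to a single prefix for a partial restore.
print_copy_command() {
  local project_id="$1" bucket_name="$2" prefix="${3:-}"
  local src dst
  src="$(restore_source_url "$project_id" "$bucket_name")"
  dst="gs://${bucket_name}"
  if [ -n "$prefix" ]; then
    src="${src}/${prefix}"
    dst="${dst}/${prefix}"
  fi
  printf 'gcloud storage rsync --recursive %s %s\n' "$src" "$dst"
}

# Full restore of the example bucket used in this guide.
print_copy_command "my-gcp-project" "my-gcs-bucket"
```

Printing rather than executing keeps the decision between a partial and a full copy with the operator. Passing a prefix as the third argument limits the printed command to that part of the bucket.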