Service Catalogue Production Database Restore to Test environment
The Service Catalogue schema is stored in a Postgres database. Every day at 23:00 the test database is refreshed from a production database export dump.
The Service Catalogue GitHub repository contains the refresh script `helm_deploy/hmpps-service-catalogue/db_backup_restore.sh`, and the job schedule config is in `helm_deploy/hmpps-service-catalogue/templates/db-backup-restore-cronjob.yaml`.
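To see the rendered config as it is actually deployed, you can pull the CronJob straight from the cluster (this assumes it is deployed to the same namespace used by the commands in the rest of this page):

```
kubectl -n hmpps-portfolio-management-prod get cronjob sc-database-backup-restore -o yaml
```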
The status of the job can be checked with:

```
kubectl describe jobs -n hmpps-portfolio-management-prod
```

Jobs whose names start with `sc-database-backup-restore` are the service catalogue schema refresh. After successful completion the pod status shows `Completed`, and the describe output shows 0 failed for the relevant `sc-database-backup-restore` job:

```
Pods Statuses: 0 Active (0 Ready) / 1 Succeeded / 0 Failed
```
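If a run did not succeed, a quick first check is the job's pod logs. A minimal example, using the failed job name from the debug steps below as an illustration:

```
kubectl -n hmpps-portfolio-management-prod logs job/sc-database-backup-restore-28847460
```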
If the job fails, the likely reason is a mismatch between the test and production schemas. To investigate and debug, perform the following steps (a sketch for comparing the two schemas directly follows these steps).
- Create a job based on the failed one:

```
kubectl -n hmpps-portfolio-management-prod get jobs sc-database-backup-restore-28847460 -o yaml > sc-database-backup-restore-debug.yaml
```
- Update the output file `sc-database-backup-restore-debug.yaml`:
  - Change `metadata.name` to a new job name.
  - Remove the `metadata.uid`, `metadata.resourceVersion` and `metadata.creationTimestamp` fields.
  - Remove the `status` section.
  - Remove the `spec.template.metadata.labels` and `spec.selector` fields.
  - Edit the command/args parameters as below, so the container sleeps instead of running the refresh:

```
command:
  - /usr/bin/sleep
args:
  - "10000"
```
- Apply the config changes. This runs the job and keeps the pod running:

```
kubectl -n hmpps-portfolio-management-prod apply -f sc-database-backup-restore-debug.yaml
```
- Connect to the pod with an interactive terminal:

```
kubectl -n hmpps-portfolio-management-prod exec -it sc-database-backup-restore-debug-lsz58 -- /bin/bash
```
- On the pod, the script is `/tmp/entrypoint.sh`; it contains both the backup and restore commands. Debugging it is the same as for any other bash script and is not covered in this document.
- Once the issue is sorted, delete the debug job:

```
kubectl delete job [job name] -n hmpps-portfolio-management-prod
```
- Alerts will not clear even after a successful job completion, because the previous failed job will still exist. Manually delete any failed jobs to clear the alert.
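Since failures usually come down to schema drift, it can help to diff the two schemas directly. The sketch below is hypothetical: it assumes `pg_dump` is available (for example on the debug pod) and uses placeholder connection strings `$PROD_DB_URL` and `$TEST_DB_URL`, which are not names used by the real script.

```
# Hypothetical sketch: dump schema-only DDL from each database and diff.
# $PROD_DB_URL and $TEST_DB_URL are placeholders for the real connection strings.
pg_dump --schema-only --no-owner "$PROD_DB_URL" > /tmp/prod-schema.sql
pg_dump --schema-only --no-owner "$TEST_DB_URL" > /tmp/test-schema.sql
diff /tmp/prod-schema.sql /tmp/test-schema.sql | head -50
```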
If you need to refresh the development database from production:
- Get the cron job details:

```
$ kubectl get cronjob -n hmpps-portfolio-management-prod
NAME                         SCHEDULE        SUSPEND   ACTIVE   LAST SCHEDULE   AGE
sc-database-backup-restore   0 23 * 1-12 *   False     0        12h             190d
update-dependency-info       0 */6 * * *     False     0        5h45m           209d
```
- Create a new restore job from the cron job:

```
$ kubectl create job --from=cronjob/sc-database-backup-restore manual-sc-database-backup-restore -n hmpps-portfolio-management-prod
job.batch/manual-sc-database-backup-restore created

$ kubectl get pods -n hmpps-portfolio-management-prod | grep restore
manual-sc-database-backup-restore-t7bwd     0/1   ContainerCreating   0   16s
sc-database-backup-restore-28856100-42kfz   0/1   Completed           0   36h

$ kubectl get events -n hmpps-portfolio-management-prod
LAST SEEN   TYPE      REASON             OBJECT                                        MESSAGE
46s         Normal    Scheduled          pod/manual-sc-database-backup-restore-t7bwd   Successfully assigned hmpps-portfolio-management-prod/manual-sc-database-backup-restore-t7bwd to ip-172-20-149-165.eu-west-2.compute.internal
43s         Normal    Pulling            pod/manual-sc-database-backup-restore-t7bwd   Pulling image "ghcr.io/ministryofjustice/hmpps-devops-tools"
46s         Normal    SuccessfulCreate   job/manual-sc-database-backup-restore         Created pod: manual-sc-database-backup-restore-t7bwd
46s         Warning   UnexpectedJob      cronjob/sc-database-backup-restore            Saw a job that the controller did not create or forgot: manual-sc-database-backup-restore
```
The pod status will remain at `ContainerCreating` for a few minutes, after which the job status will show as completed:

```
$ kubectl get jobs -n hmpps-portfolio-management-prod
NAME                                COMPLETIONS   DURATION   AGE
manual-sc-database-backup-restore   1/1           6m1s       7m32s
```
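To follow the restore while it runs, you can tail the logs of the manually created job:

```
kubectl -n hmpps-portfolio-management-prod logs -f job/manual-sc-database-backup-restore
```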
APPENDIX
Sample shell script to create a job based on the last failed job:

```
#!/bin/bash

db_job=/tmp/dbjob.yaml

# Grab the name of the most recent sc-database-backup-restore job and export its manifest.
last_job=$(kubectl -n hmpps-portfolio-management-prod get jobs | grep sc-database-backup-restore | tail -1 | awk '{print $1}')
kubectl -n hmpps-portfolio-management-prod get jobs $last_job -o yaml > ${db_job}

# Strip the server-generated fields so the manifest can be re-applied as a new job.
items_to_delete=(".metadata.uid" ".metadata.resourceVersion" ".metadata.creationTimestamp" ".status" ".spec.template.metadata.labels" ".spec.selector")
for each_del in "${items_to_delete[@]}"; do
  yq eval "del(${each_del})" -i ${db_job}
done

# Give the debug job a unique name and make the container sleep instead of running the restore.
unique_name="sc-test-$(date +%s)"
yq eval ".metadata.name = \"${unique_name}\"" -i ${db_job}
yq eval '.spec.template.spec.containers[0].command = ["/usr/bin/sleep", "15000"]' -i ${db_job}

# Apply the debug job and confirm its pod is running.
kubectl -n hmpps-portfolio-management-prod apply -f ${db_job}
kubectl -n hmpps-portfolio-management-prod get pods | grep sc-test | grep Running
```

This page was last reviewed on 11-Nov-2024, next review will be on 11-Feb-2025.