Service Catalogue Production Database Restore to Test environment
The service catalogue schema is stored in a Postgres database. Every day at 23:00 the Test database is refreshed from a production database export dump.
The refresh script lives in the service catalogue GitHub repository at helm_deploy/hmpps-service-catalogue/db_backup_restore.sh, and the job schedule config is in helm_deploy/hmpps-service-catalogue/templates/db-backup-restore-cronjob.yaml.
The status of the job can be checked with kubectl describe jobs -n hmpps-portfolio-management-prod. Jobs whose names start with sc-database-backup-restore are the service catalogue schema refresh. After successful completion the pod status shows Completed, and the describe output reports 0 failed for the relevant sc-database-backup-restore job:
Pods Statuses: 0 Active (0 Ready) / 1 Succeeded / 0 Failed
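As a quick check, the COMPLETIONS column from `kubectl get jobs` can be filtered for refresh jobs that did not finish. This is a sketch: the `sample` variable below is captured example output with made-up job names, and in practice you would pipe the live `kubectl get jobs -n hmpps-portfolio-management-prod` output into the same `awk` filter.

```shell
# Captured example output standing in for:
#   kubectl get jobs -n hmpps-portfolio-management-prod
sample='NAME                                  COMPLETIONS   DURATION   AGE
sc-database-backup-restore-28847460   0/1           10m        2d
sc-database-backup-restore-28848900   1/1           5m2s       1d'

# Print refresh jobs whose COMPLETIONS column is not 1/1
# (i.e. failed or still running), skipping the header row.
echo "$sample" | awk 'NR > 1 && $1 ~ /^sc-database-backup-restore/ && $2 != "1/1"'
```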
If the job fails, the likely cause is a mismatch between the test and production schemas. To investigate and debug, perform the following steps.
- Create a job based on the failed one.
kubectl -n hmpps-portfolio-management-prod get jobs sc-database-backup-restore-28847460 -o yaml > sc-database-backup-restore-debug.yaml
- Update the output file: remove all status metadata, rename the job, and edit the command/args parameters as below.
command:
- /usr/bin/sleep
args:
- "10000"
- Apply the config changes. This step runs the job and keeps the pod running.
kubectl -n hmpps-portfolio-management-prod apply -f sc-database-backup-restore-debug.yaml
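For orientation, after stripping the status and metadata fields and swapping in the sleep command, the debug manifest might look roughly like this. This is a sketch: the container name is illustrative, and everything other than metadata.name, command and args (image, env vars, volumes, service account) should be kept exactly as exported from the real job.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: sc-database-backup-restore-debug      # renamed from the failed job
  namespace: hmpps-portfolio-management-prod
spec:
  template:
    spec:
      containers:
      - name: db-backup-restore               # illustrative; keep the exported name
        image: ghcr.io/ministryofjustice/hmpps-devops-tools
        command:                               # replaced so the pod stays up for debugging
        - /usr/bin/sleep
        args:
        - "10000"
      restartPolicy: Never
```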
- Connect to the pod with an interactive terminal
kubectl -n hmpps-portfolio-management-prod exec -it sc-database-backup-restore-debug-lsz58 -- /bin/bash
- On the connected pod, the script is /tmp/entrypoint.sh; it contains both the backup and restore commands. Debugging it is the same as for any other bash script and is not covered in this document.
- Once the issue is resolved, delete the debug job.
kubectl delete job [job name] -n hmpps-portfolio-management-prod
- Alerts will not clear even after a successful run, because the previous failed job still exists. Manually delete any failed jobs to clear the alert.
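The failed-job cleanup can be scripted. The sketch below builds the delete commands from example `kubectl get jobs` output (the `jobs` variable stands in for the live command, with made-up job names) and only echoes them for review; drop the `echo` to actually delete.

```shell
# Example output standing in for:
#   kubectl get jobs -n hmpps-portfolio-management-prod
jobs='NAME                                  COMPLETIONS   DURATION   AGE
sc-database-backup-restore-28847460   0/1           10m        2d
sc-database-backup-restore-28848900   1/1           5m2s       1d'

# For every refresh job that did not complete, print the delete command.
echo "$jobs" \
  | awk 'NR > 1 && $1 ~ /^sc-database-backup-restore/ && $2 != "1/1" {print $1}' \
  | while read -r name; do
      echo kubectl delete job "$name" -n hmpps-portfolio-management-prod
    done
```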
If you need to refresh the development database from production:
- Get the cron job details
$ kubectl get cronjob -n hmpps-portfolio-management-prod
NAME                         SCHEDULE        SUSPEND   ACTIVE   LAST SCHEDULE   AGE
sc-database-backup-restore   0 23 * 1-12 *   False     0        12h             190d
update-dependency-info       0 */6 * * *     False     0        5h45m           209d
- Create a new restore job
$ kubectl create job --from=cronjob/sc-database-backup-restore manual-sc-database-backup-restore -n hmpps-portfolio-management-prod
job.batch/manual-sc-database-backup-restore created

$ kubectl get pods -n hmpps-portfolio-management-prod | grep restore
manual-sc-database-backup-restore-t7bwd     0/1   ContainerCreating   0   16s
sc-database-backup-restore-28856100-42kfz   0/1   Completed           0   36h

$ kubectl get events -n hmpps-portfolio-management-prod
LAST SEEN   TYPE      REASON             OBJECT                                        MESSAGE
46s         Normal    Scheduled          pod/manual-sc-database-backup-restore-t7bwd   Successfully assigned hmpps-portfolio-management-prod/manual-sc-database-backup-restore-t7bwd to ip-172-20-149-165.eu-west-2.compute.internal
43s         Normal    Pulling            pod/manual-sc-database-backup-restore-t7bwd   Pulling image "ghcr.io/ministryofjustice/hmpps-devops-tools"
46s         Normal    SuccessfulCreate   job/manual-sc-database-backup-restore         Created pod: manual-sc-database-backup-restore-t7bwd
46s         Warning   UnexpectedJob      cronjob/sc-database-backup-restore            Saw a job that the controller did not create or forgot: manual-sc-database-backup-restore
Pod status will remain at ContainerCreating for a few minutes; the job status will then show completed.
$ kubectl get jobs -n hmpps-portfolio-management-prod
NAME                                COMPLETIONS   DURATION   AGE
manual-sc-database-backup-restore   1/1           6m1s       7m32s
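To script a wait for the manual job, `kubectl wait --for=condition=complete job/manual-sc-database-backup-restore -n hmpps-portfolio-management-prod --timeout=15m` blocks until it finishes. Alternatively, the COMPLETIONS value can be interpreted directly; the helper below is a hypothetical convenience for such scripts, not part of the repository.

```shell
# job_complete: succeed when a COMPLETIONS value such as "1/1" shows the
# job reached its desired completion count.
job_complete() {
  completions=$1                 # e.g. "0/1" or "1/1"
  done_count=${completions%%/*}  # succeeded pods
  want_count=${completions##*/}  # desired completions
  [ "$done_count" = "$want_count" ]
}

job_complete "1/1" && echo "job finished"
job_complete "0/1" || echo "job not finished yet"
```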