Tuesday, 2 January 2024

PostgreSQL Backup Health Check Best Practices

There are several backup strategies available for PostgreSQL, and the right one for your environment will depend on your backup and recovery requirements. Some popular strategies include:
 
1. Physical Backup:
·        A physical backup is a binary copy of the database cluster's files at a specific point in time.
·         It includes all database objects, system catalogs, configuration files, and transaction logs.
·         The backup is essentially a snapshot of the entire database storage.
·         Physical backups are usually faster to create and restore compared to logical backups.
 
2. Logical Backup:
·         A logical backup contains SQL statements that can recreate the database's structure and data.
·         This type of backup is created using tools like pg_dump, which generates a script with SQL                    commands to recreate the database schema and insert the data.
·         Logical backups are more human-readable and can be more flexible in terms of selective data                  restoration.
·         However, they are generally slower to create and restore compared to physical backups.

  


How to verify health of backup image in PostgreSQL
Verifying the health of a backup image in PostgreSQL is crucial to ensure that you can rely on it for recovery in case of data loss or system failure. Here are steps you can take to verify the health of a backup image:
 
1.      Use pg_verifybackup:
Starting from PostgreSQL 13, there is a tool called pg_verifybackup that you can use to verify the integrity of a backup. This tool can check whether the backup is consistent and whether it contains all the required data.
Example:
                pg_verifybackup    /path/to/backup
                The tool will analyze the backup and report any issues it finds.
 
2.      Check Log Files:
Review the PostgreSQL log files generated during the backup process. Look for any error messages or warnings that might indicate a problem with the backup. The log files are usually located in the PostgreSQL data directory, and the main log file is typically named postgresql.log.
 
3.      Restore and Test:
The most reliable way to verify the health of a backup is to restore it to a test environment and perform some basic tests. This will ensure that the backup is not only consistent but also functional for recovery.
·         Restore the backup to a different location or a separate test server.
·         Start the PostgreSQL server using the restored data.
·         Perform some basic queries and checks to ensure the data integrity.
 
4.      Checksums:
Some backup tools and methods support checksums. For example, if you are using pg_basebackup with the --write-recovery-conf option, it generates a backup_label file that contains a checksum. You can compare this checksum with the checksum of the actual backup files to verify integrity.
Example:
                pg_verifybackup --checksums /path/to/backup
 
5.      Use pg_restore:
If you have taken a custom-format dump using pg_dump, you can use pg_restore with the --list option to verify the integrity of the dump file.
Example:
                pg_restore --list /path/to/dumpfile
 
Unlike physical backups (created with tools like pg_basebackup), logical backups created with pg_dump do not have a dedicated health check tool similar to pg_verifybackup. The nature of logical backups makes it more challenging to perform a standalone health check without restoration because the backup is essentially a script containing SQL statements.
  
Creating Physical Backup using pg_basebackup 
&
Verifying Health of the Physical Backup Image using pg_verifybackup
 
Step 1: We can take a physical backup using pg_basebackup command as follows: 
pg_basebackup.exe  -U postgres  -D  D:\verify
 
A backup manifest file with contents will be generated backup_manifest file is nothing but a JSON object, with the following:
      The list of files included in the backup and, for each of them, the relative path from PGDATA and important metadata such as size, time of last modification and checksum        
 
Step 2:  From PostgreSQL 13, there is a tool called pg_verifybackup that can be used to verify the integrity of a backup. This tool can check whether the backup is consistent and whether it contains all the required data.
pg_verifybackup.exe  D:\verify\
 
It will compare the backup_manifest file with the contents generated by pg_basebackup. If all the contents matched properly including checksum then it will generate a message backup successfully verified otherwise invalid checksum for file and its details will be displayed.
 
·         Backup verification proceeds in four stages. First, pg_verifybackup reads the backup_manifest file. If that file does not exist, cannot be read, is malformed, or fails verification against its own internal checksum, pg_verifybackup will terminate with a fatal error.
·         pg_verifybackup ensures the accuracy of data files in a PostgreSQL backup, with exceptions for certain files (postgresql.auto.confstandby.signal, and recovery.signal, because it is expected that these files may have been created or modified as part of the process of taking the backup )and an indirect check for missing directories based on the presence of expected files.
·         Next, pg_verifybackup will checksum all the files, compare the checksums against the values in the manifest, and emit errors for any files for which the computed checksum does not match the checksum stored in the manifest.
·         Finally, pg_verifybackup will use the manifest to verify that the write-ahead log records which will be needed to recover the backup are present and that they can be read and parsed.
 
Creating Logical Backup using pg_dump 
&
Verifying Health of the Logical Backup Image
 
Step 1: We can take a logical backup using pg_dump command as follows: 
pg_dump -U postgres -d employ -F c -f d:\verify\backup_file.dump
 
-U postgres: Specifies the PostgreSQL user
-d employ: Specifies the name of the database
-F c: Specifies custom format is a compressed binary format output file
-f d:\verify\backup_file.dump: Specifies the output file
 
Step 2:  To verify the integrity of a backup file, typically use tools like pg_restore for custom format backups or other methods appropriate for the backup format (-F option with pg_dump). To verify a custom format backup using pg_restore:
 pg_restore --dbname=your_database --list d:\verify\backup_file.dump
 
This command lists the contents of the backup without actually performing the restore. If there are no errors reported, it indicates that the backup file is in good health.
 
Note: pg_restore -l d:\verify\backup_file.dump   (or)
           pg_restore --dbname=employ --list d:\verify\backup_file.dump  “--can be used”.
 
***Very Very Important Note***
For a more thorough health check of a logical backup, you may want to consider additional steps:
 
Step 1: Perform a Dry Run Restore:  This involves actually restoring the backup to a temporary or test database. This process helps ensure that the backup can be successfully restored and that there are no unforeseen issues.
 pg_restore --dbname=your_test_database backup_file.dump

Step 2: Schema and Data Checks: Generate separate SQL scripts for the schema and data portions of the backup using pg_restore. 
pg_restore --dbname=your_database --schema-only backup_file.dump > schema.sql
pg_restore --dbname=your_database --data-only backup_file.dump > data.sql

Review these scripts to ensure that the schema and data are correctly represented.
 
Step 3: Checksum Verification (if applicable): If checksums are enabled in your PostgreSQL cluster, use pg_checksums to verify checksums in the data directory.
 pg_checksums -D /path/to/your/data_directory

Step 4: Review Logs and Output: Check for any warnings or errors in the output of the pg_restore command and review the logs.
 
Remember that these checks are not exhaustive, and the effectiveness of your health check process may depend on the specific requirements and characteristics of your database. Always make sure to test your backup and restore procedures in a controlled environment to ensure that you can reliably recover your data when needed.

No comments:

Post a Comment