Backup and Restoration

Overview

For database security purposes, openGauss provides two backup types (logical and physical), multiple backup and restoration solutions, and data reliability assurance mechanisms.

Backup and restoration can be logically or physically performed.

  • Logical backup and restoration: backs up data by logically exporting it. This method dumps the data as of the backup time and can restore data only to that point; changes made between the last backup and a failure cannot be recovered. It suits scenarios where data rarely changes, and data damaged by misoperation can be quickly restored from a logical backup. To restore all data in a database from a logical backup, rebuild the database and import the backup data. Logical backup is not recommended for databases requiring high availability because restoration takes a long time. Because it can be performed on any platform, logical backup is a major approach to migrating and transferring data.

  • Physical backup and restoration: copies physical files, in units of disk blocks, to back up a database. The database can be restored from backup files such as data files and archived log files. Physical backup is usually used for full backup; it backs up and restores data quickly and, if properly planned, at low cost.

    The two data backup and restoration solutions supported by openGauss are compared in Table 1. The method for restoring data in case of an exception differs between them.

    Table 1 Comparison between logical and physical backup and restoration

    | Backup Type | Application Scenario | Media | Advantages and Disadvantages |
    | --- | --- | --- | --- |
    | Logical backup and restoration | Small volume of data needs to be processed. Currently, it is used for the backup and restoration of one or more tables. | Disk, SSD | Backing up a small amount of data (for specified objects) is efficient and flexible. For a large volume of data, backup takes a long time. |
    | Physical backup and restoration | Huge volume of data needs to be processed. It is mainly used for full backup and restoration as well as the backup and restoration of all WAL archive and run logs in the database. | Disk, SSD | Backing up a huge amount of data is efficient. |

    While backing up and restoring data, take the following aspects into consideration:

    • Whether the impact of data backup on services is acceptable

    • Database restoration efficiency

      To minimize the impact of database faults, keep the restoration duration as short as possible to achieve the highest restoration efficiency.

    • Data restorability

      Minimize data loss when the database becomes unavailable.

    • Database restoration cost

    Many factors need to be considered when selecting a backup policy in a production environment, such as the objects to back up, the data volume, and the network configuration. Table 2 lists available backup policies and the applicable scenario for each.

    Table 2 Backup policies and scenarios

    | Backup Policy | Key Performance Factors | Typical Data Volume | Performance Specifications |
    | --- | --- | --- | --- |
    | Cluster backup | Data amount; network configuration | Data volume: PB level. Object quantity: about 1 million | Backup: data transfer rate on each host: 80 Mbit/s (NBU/EISOO+Disk); disk I/O rate (SSD/HDD): about 90% |
    | Table backup | Schema where the table to be backed up resides; network configuration (NBU) | Data volume: 10 TB level | Backup: depends on query performance and I/O rate |

    NOTE:

    For multi-table backup, the backup time is calculated as follows:

    Total time = Number of tables x Starting time + Total data volume/Data backup speed

    In the preceding information:

    • The starting time of a disk is about 5s. The starting time of an NBU is longer than that of a disk (depending on the NBU deployment).
    • The data backup speed is about 50 MB/s on a single node. (The speed is evaluated based on the backup of a 1 GB table from a physical host to a local disk.)

    The smaller the table is, the lower the backup performance will be.
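
    For example (an illustrative calculation using the figures above): backing up 100 tables totaling 100 GB to a local disk takes approximately 100 x 5s + 100 GB/(50 MB/s) ≈ 500s + 2048s, that is, roughly 42 minutes.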

Physical Backup and Restoration

gs_basebackup

Background

After openGauss is deployed, problems and exceptions may occur during database running. gs_basebackup, provided by openGauss, is used to perform basic physical backup. gs_basebackup copies the binary files of the database on the server using a replication protocol. To remotely execute gs_basebackup, you need to use the system administrator account. gs_basebackup supports hot backup and compressed backup.

NOTE:

  • gs_basebackup supports only full backup.

  • gs_basebackup supports hot backup and compressed backup.

  • gs_basebackup cannot back up a tablespace that uses an absolute path to the same server, because an absolute path is unique on a machine and restoring the tablespace there would cause a conflict. However, such tablespaces can be backed up to a different machine.

  • If the functions of incremental checkpoint and dual-write are enabled, gs_basebackup also backs up dual-write files.

  • If the pg_xlog directory is a soft link, no soft link is created during backup. Data is directly backed up to the pg_xlog directory in the destination path.

  • If the backup permission is revoked during the backup, the backup may fail or the backup data may be unavailable.

  • openGauss does not support version upgrade.

Prerequisites

  • The openGauss database can be connected.

  • User permissions are not revoked during the backup.

  • In the pg_hba.conf file, the replication connection is allowed and the connection is established by a system administrator.

  • If the Xlog transmission mode is stream, max_wal_senders must be set to at least 1.

  • If the Xlog transmission mode is fetch, the wal_keep_segments parameter must be set to a large value so that logs are not removed before the backup ends.

  • During the restoration, backup files must exist in the backup directory on all nodes. If backup files are lost on any node, copy them to that node from another node.

Syntax

  • Display help information.

    gs_basebackup -? | --help
    
  • Display version information.

    gs_basebackup -V | --version
    

Parameter Description

The gs_basebackup tool can use the following types of parameters:

  • -D directory

    Directory for storing backup files. This parameter is mandatory.

  • Common parameters

    • -c, --checkpoint=fast|spread

      Sets the checkpoint mode to fast or spread (default).

    • -l, --label=LABEL

      Adds tags for the backup.

    • -P, --progress

      Enables the progress report.

    • -v, --verbose

      Enables the verbose mode.

    • -V, --version

      Prints the version and exits.

    • -?, --help

      Displays gs_basebackup command parameters.

    • -T, --tablespace-mapping=olddir=newdir

      During the backup, the tablespace in the olddir directory is relocated to the newdir directory. For this to take effect, olddir must exactly match the path where the tablespace is located (but it is not an error if the backup does not contain the tablespaces in olddir). olddir and newdir must be absolute paths. If a path happens to contain an equal sign (=), you can escape it with a backslash (\). This option can be used multiple times for multiple tablespaces.

    • -F, --format=plain|tar

      Sets the output format to plain (default) or tar. If this parameter is not set, the default value --format=plain is used. The plain format writes the output as flat files, using the same layout as the current data directory and tablespaces. When the cluster has no extra tablespace, the entire database is placed in the target directory. If the cluster contains additional tablespaces, the primary data directory is placed in the target directory, but all other tablespaces are placed in the same absolute paths as on the server. The tar mode writes the output as tar files in the target directory. The primary data directory is written to a file named base.tar, and each other tablespace is written to a file named after its OID. The generated .tar package must be unpacked using the gs_tar command.

    • -X, --xlog-method=fetch|stream

      Sets the Xlog transmission mode. If this parameter is not set, the default value --xlog-method=stream is used. The required write-ahead log files (WALs) are included in the backup, including all WALs generated during the backup. In fetch mode, WAL files are collected at the end of the backup, so the wal_keep_segments parameter must be set to a value large enough that the logs are not removed before the backup ends; if a required log segment has been recycled by the time it is to be transmitted, the backup fails and is unusable. In stream mode, WALs are streamed while the backup is being created. This opens a second connection to the server and streams WALs in parallel with the backup, so it uses up to two of the connections configured by the max_wal_senders parameter. As long as the client keeps up with the WAL stream, no additional WALs need to be stored on the host.

    • -x, --xlog

      Equivalent to using -X with the fetch method.

    • -Z, --compress=level

      Enables gzip compression for the output of the tar file and sets the compression level (0 to 9, where 0 indicates no compression and 9 indicates the best compression). The compression is available only when the tar format is used. The suffix .gz is automatically added to the end of all .tar file names.

    • -z

      Enables gzip compression for tar file output and uses the default compression level. The compression is available only when the tar format is used. The suffix .gz is automatically added to the end of all .tar file names.

    • -t, --rw-timeout

      Sets the checkpoint time limit during the backup, in seconds. The default value is 120. If the full checkpoint of the database is time-consuming, increase the value of rw-timeout.

  • Connection parameters

    • -h, --host=HOSTNAME

      Specifies the host name of the machine on which the server is running or the directory for the Unix-domain socket.

    • -p, --port=PORT

      Specifies the port number of the database server.

      You can modify the default port number using this parameter.

    • -U, --username=USERNAME

      Specifies the user that connects to the database.

    • -s, --status-interval=INTERVAL

      Specifies the time for sending status packets to the server, in seconds.

    • -w, --no-password

      Never issues a password prompt.

    • -W, --password

      Issues a password prompt when the -U parameter is used to connect to a local or remote database.

Example

gs_basebackup -D /home/test/trunk/install/data/backup -h 127.0.0.1 -p 21233
INFO:  The starting position of the xlog copy of the full build is: 0/1B800000. The slot minimum LSN is: 0/1B800000.
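
A hedged variant (the backup directory, host, and port below are placeholders) that combines the documented options to produce a gzip-compressed tar-format backup while streaming WALs over a second connection:

gs_basebackup -D /home/test/backup_tar -F tar -z -X stream -h 127.0.0.1 -p 21233 -P -v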

Restoring Data from Backup Files

If a database is faulty, restore it from the backup files. Because gs_basebackup backs up the database in binary mode, you can directly copy the backup files to replace the original files, or start the database instance from the backup copy.

NOTE:

  • If the current database instance is running, a port conflict may occur when you start the database from the backup file. In this case, you need to modify the port parameter in the configuration file or specify a port when starting the database.
  • If the backup comes from a primary/standby deployment, you may need to modify the replication connections between the primary and standby databases, that is, replconninfo1 and replconninfo2 in the postgresql.conf file.

To restore the original database, perform the following steps:

  1. Stop the database server. For details, see Administrator Guide.
  2. Copy the original database and all tablespaces to another location for future use.
  3. Delete all or part of the files from the original database.
  4. Use the database system user rights to restore the required database files from the backup.
  5. If a link file exists in the database, modify the link file so that it can be linked to the correct file.
  6. Restart the database server and check the database content to ensure that the database is restored to the required status.

NOTE:

  • Incremental restoration from backup files is not supported.
  • After the restoration, check that the link file in the database is linked to the correct file.
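
The following sketch maps steps 1 to 6 to concrete commands for a single-node instance. It is a minimal illustration only: the data directory /gaussdb/data, the backup path, and the use of gs_ctl are assumptions to adapt to your environment.

# 1. Stop the database server.
gs_ctl stop -D /gaussdb/data
# 2. Keep the original data directory and tablespaces for future use.
mv /gaussdb/data /gaussdb/data.old
# 3-4. Restore the required database files as the database system user.
cp -r /home/test/trunk/install/data/backup /gaussdb/data
# 5. If link files (for example, pg_xlog) exist, point them at the correct targets.
# 6. Restart the server and verify the database content.
gs_ctl start -D /gaussdb/data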

gs_probackup

Background

gs_probackup is a tool used to manage openGauss database backup and restoration. It periodically backs up the openGauss instances so that the server can be restored when the database is faulty.

  • It supports the physical backup of a standalone database or a primary node in a database.
  • It supports the backup of contents in external directories, such as script files, configuration files, log files, and dump files.
  • It supports incremental backup, periodic backup, and remote backup.
  • It supports settings on the backup retention policy.

Prerequisites

  • The openGauss database can be connected.
  • To use PTRACK incremental backup, manually add enable_cbm_tracking = on to postgresql.conf.

Important Notes

  • The backup must be performed by the user who runs the database server.
  • The major version number of the database server to be backed up must be the same as that of the database server to be restored.
  • To back up a database in remote mode using SSH, install the database of the same major version on the local and remote hosts, and run the ssh-copy-id remote_user@remote_host command to set an SSH connection without a password between the local host backup user and the remote host database user.
  • In remote mode, only the subcommands add-instance, backup, and restore can be executed.
  • Before running the restore subcommand, stop the gaussdb process.

Command Description

  • Print the gs_probackup version.

    gs_probackup -V|--version
    gs_probackup version
    
  • Display brief information about the gs_probackup command. Alternatively, display details about parameters of a specified subcommand of gs_probackup.

    gs_probackup -?|--help
    gs_probackup help [command]
    
  • Initialize the backup directory in backup-path. The backup directory stores the contents that have been backed up. If the backup-path backup path exists, it must be empty.

    gs_probackup init -B backup-path [--help]
    
  • Initialize a new backup instance in the backup directory of backup-path and generate the pg_probackup.conf configuration file, which saves the gs_probackup settings for the specified data directory pgdata-path.

    gs_probackup add-instance -B backup-path -D pgdata-path --instance=instance_name
    [-E external-directories-paths]
    [remote_options]
    [--help]
    
  • Delete the backup content related to the specified instance from the backup-path directory.

    gs_probackup del-instance -B backup-path --instance=instance_name
    [--help]
    
  • Add the specified connection, compression, and log-related settings to the pg_probackup.conf configuration file or modify the existing settings. You are not advised to manually edit the pg_probackup.conf configuration file.

    gs_probackup set-config -B backup-path --instance=instance_name
    [-D pgdata-path] [-E external-directories-paths] [--archive-timeout=timeout]
    [--retention-redundancy=retention-redundancy] [--retention-window=retention-window] [--wal-depth=wal-depth]
    [--compress-algorithm=compress-algorithm] [--compress-level=compress-level]
    [-d dbname] [-h hostname] [-p port] [-U username]
    [logging_options] [remote_options]
    [--help]
    
  • Add the backup-related settings to the backup.control configuration file or modify the settings.

    gs_probackup set-backup -B backup-path --instance=instance_name -i backup-id
    [--note=text] [pinning_options]
    [--help]
    
  • Display the content of the pg_probackup.conf configuration file in the backup directory. You can specify --format=json to display the information in JSON format. By default, the plain text format is used.

    gs_probackup show-config -B backup-path --instance=instance_name
    [--format=plain|json]
    [--help]
    
  • Display the contents of the backup directory. If instance_name and backup_id are specified, detailed information about the backup is displayed. You can specify --format=json to display the information in JSON format. By default, the plain text format is used.

    gs_probackup show -B backup-path
    [--instance=instance_name [-i backup-id]] [--archive] [--format=plain|json]
    [--help]
    
  • Create a backup for a specified database instance.

    gs_probackup backup -B backup-path --instance=instance_name -b backup-mode
    [-D pgdata-path] [-C] [-S slot-name] [--temp-slot] [--backup-pg-log] [-j threads_num] [--progress]
    [--no-validate] [--skip-block-validation] [-E external-directories-paths] [--no-sync] [--note=text]
    [--archive-timeout=timeout]
    [logging_options] [retention_options] [compression_options]
    [connection_options] [remote_options] [pinning_options]
    [--help]
    
  • Restore a specified instance from a backup copy in the backup-path directory. If a backup ID is specified, gs_probackup restores that backup to the specified recovery target; otherwise, the latest backup of the instance is used.

    gs_probackup restore -B backup-path --instance=instance_name
    [-D pgdata-path] [-i backup_id] [-j threads_num] [--progress] [--force] [--no-sync] [--no-validate] [--skip-block-validation]
    [--external-mapping=OLDDIR=NEWDIR] [-T OLDDIR=NEWDIR] [--skip-external-dirs] [-I incremental_mode]
    [recovery_options] [remote_options] [logging_options]
    [--help]
    
  • Merge all incremental backups between the specified incremental backup and its parent full backup into the parent full backup. The parent full backup receives all the merged data, and the merged incremental backups are deleted as redundant.

    gs_probackup merge -B backup-path --instance=instance_name -i backup_id
    [-j threads_num] [--progress] [logging_options]
    [--help]
    
  • Delete a specified backup or delete backups that do not meet the current retention policy.

    gs_probackup delete -B backup-path --instance=instance_name
    [-i backup-id | --delete-expired | --merge-expired | --status=backup_status]
    [--delete-wal] [-j threads_num] [--progress]
    [--retention-redundancy=retention-redundancy] [--retention-window=retention-window]
    [--wal-depth=wal-depth] [--dry-run]
    [logging_options]
    [--help]
    
  • Verify that all files required for restoring the database exist and are not damaged. If instance_name is not specified, gs_probackup verifies all available backups in the backup directory. If instance_name is specified and no additional options are given, gs_probackup verifies all available backups for this backup instance. If both instance_name and backup-id or recovery-target options are specified, gs_probackup checks whether these options can be used to restore the database.

    gs_probackup validate -B backup-path
    [--instance=instance_name] [-i backup-id]
    [-j threads_num] [--progress] [--skip-block-validation]
    [--recovery-target-time=time | --recovery-target-xid=xid | --recovery-target-lsn=lsn | --recovery-target-name=target-name]
    [--recovery-target-inclusive=boolean]
    [logging_options]
    [--help] 
    

Parameter Description

Common parameters

  • command

    Specifies subcommands except version and help: init, add-instance, del-instance, set-config, set-backup, show-config, show, backup, restore, merge, delete, and validate.

  • -?, --help

    Displays help information about the command line parameters of gs_probackup and exits.

    Only --help can be used in subcommands; -? is forbidden.

  • -V, --version

    Prints the gs_probackup version and exits.

  • -B backup-path, --backup-path=backup-path

    Backup path.

    System environment variable: $BACKUP_PATH

  • -D pgdata-path, --pgdata=pgdata-path

    Path of the data directory.

    System environment variable: $PGDATA

  • --instance=instance_name

    Instance name.

  • -i backup-id, --backup-id=backup-id

    Unique identifier of a backup.

  • --format=format

    Specifies the format of the backup information to be displayed. The plain and JSON formats are supported.

    Default value: plain

  • --status=backup_status

    Deletes all backups in a specified state. The states are as follows:

    • OK: Backup is complete and valid.
    • DONE: Backup has been completed but not verified.
    • RUNNING: Backup is in progress.
    • MERGING: Backups are being merged.
    • DELETING: Backup is being deleted.
    • CORRUPT: Some backup files are damaged.
    • ERROR: Backup fails due to an unexpected error.
    • ORPHAN: Backup is invalid because one of its parent backups is corrupted or lost.
  • -j threads_num, --threads=threads_num

    Sets the number of concurrent threads for the backup, restoration, and combination processes.

  • --archive

    Displays WAL archiving information.

  • --progress

    Displays progress.

  • --note=text

    Adds a note to the backup.

Backup-related parameters

  • -b backup-mode, --backup-mode=backup-mode

    Specifies the backup mode. The value can be FULL or PTRACK.

    FULL: creates a full backup. The full backup contains all data files.

    PTRACK: creates a PTRACK incremental backup.

  • -C, --smooth-checkpoint

    Spreads out checkpoints over a period of time. By default, gs_probackup attempts to complete checkpoints as soon as possible.

  • -S slot-name, --slot=slot-name

    Specifies the replication slot for WAL stream processing.

  • --temp-slot

    Creates a temporary physical replication slot for WAL stream processing in the backup instance to ensure that all required WAL segments are still available during the backup.

    The default slot name is pg_probackup_slot, which can be changed using the --slot/-S option.

  • --backup-pg-log

    Includes the log directory in the backup. This directory typically contains log messages. By default, the log directory is not included.

  • -E external-directories-paths, --external-dirs=external-directories-paths

    Includes the specified directory in the backup. This option is useful for backing up scripts, SQL dumps, and configuration files kept in external data directories. To back up multiple external directories, separate their paths with colons (:) on Unix.

    Example: -E /tmp/dir1:/tmp/dir2

  • --skip-block-validation

    Disables block-level verification to speed up backup.

  • --no-validate

    Skips the automatic verification when the backup is complete.

  • --no-sync

    Disables backup file synchronization to the disk.

  • --archive-timeout=timeout

    Specifies the timeout interval for streaming processing, in seconds.

    Default value: 300

  • -t rwtimeout

    Specifies the timeout period for connecting to the database, in seconds.

    Default value: 120

Restoration-related parameters

  • -I, --incremental-mode=none|checksum|lsn

    Reuses the valid pages available in PGDATA if they are not modified.

    Default value: none

  • --external-mapping=OLDDIR=NEWDIR

    During restoration, the external directory contained in the backup is moved from OLDDIR to NEWDIR. OLDDIR and NEWDIR must be absolute paths. If a path contains an equal sign (=), escape it with a backslash (\). This option can be specified multiple times for multiple directories.

  • -T OLDDIR=NEWDIR, --tablespace-mapping=OLDDIR=NEWDIR

    Relocates the tablespace from the OLDDIR directory to the NEWDIR directory during the restoration. OLDDIR and NEWDIR must be absolute paths. If the path contains an equal sign (=), use a backslash (\) to escape. This parameter can be specified multiple times for multiple tablespaces. This parameter must be used together with --external-mapping.

  • --skip-external-dirs

    Skips the external directories in the backup that are specified using the --external-dirs option. The contents of these directories will not be restored.

  • --skip-block-validation

    Skips block-level verification to speed up verification. During the automatic verification before the restoration, only file-level verification is performed.

  • --no-validate

    Skips the backup verification.

  • --force

    Allows ignoring an invalid backup status. Use this flag if data needs to be restored from a damaged or invalid backup. Exercise caution when using it.

Recovery objective-related parameters (recovery_options)

NOTE: Currently, PITR based on continuous WAL archiving cannot be configured, so the use of the following parameters is restricted. To use continuously archived WAL logs for PITR, perform the following steps instead:

  1. Replace the target database directory with the physical backup files.
  2. Delete all files in the database directory pg_xlog/.
  3. Copy the archived WAL log files to the pg_xlog/ directory. (Alternatively, configure restore_command in the recovery.conf file to skip this step.)
  4. Create the recovery command file recovery.conf in the database directory and specify the database recovery degree.
  5. Start the database.
  6. Connect to the database and check whether the database is recovered to the expected status. If the expected status is reached, run the pg_xlog_replay_resume() command so that the primary node can provide services externally.
  • --recovery-target-lsn=lsn

    Specifies the LSN to restore to. Currently, only the backup stop LSN can be specified.

  • --recovery-target-name=target-name

    Specifies the named savepoint to restore to. You can obtain the savepoint name from the recovery-name column of the backup.

  • --recovery-target-time=time

    Specifies the time to restore to. Currently, only recovery-time can be specified.

  • --recovery-target-xid=xid

    Specifies the transaction ID to restore to. Currently, only recovery-xid can be specified.

  • --recovery-target-inclusive=boolean

    When this parameter is set to true, the recovery target will include the specified content.

    When this parameter is set to false, the recovery target will not include the specified content.

    This parameter must be used together with --recovery-target-name, --recovery-target-time, --recovery-target-lsn, or --recovery-target-xid.
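
For example, a restoration to a named savepoint using the documented options (the backup directory, instance name, backup ID, and savepoint name are placeholders):

    gs_probackup restore -B backup_dir --instance=instance_name -i backup_id --recovery-target-name=savepoint1 --recovery-target-inclusive=true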

Retention-related parameters (retention_options)

NOTE: The following parameters can be used together with the backup and delete commands.

  • --retention-redundancy=retention-redundancy

    Specifies the number of full backups to retain in the data directory. The value must be a positive integer. The value 0 indicates that the setting is disabled.

    Default value: 0

  • --retention-window=retention-window

    Specifies the retention period. The value must be a positive integer. The value 0 indicates that the setting is disabled.

    Default value: 0

  • --wal-depth=wal-depth

    Specifies the number of latest valid backups that must be retained on each timeline to support PITR. The value must be a positive integer. The value 0 indicates that the setting is disabled.

    Default value: 0

  • --delete-wal

    Deletes unnecessary WAL files from any existing backup.

  • --delete-expired

    Deletes the backups that do not comply with the retention policy defined in the pg_probackup.conf configuration file.

  • --merge-expired

    Merges the oldest incremental backup that meets the retention policy requirements with its expired parent backup.

  • --dry-run

    Displays the status of all available backups. Expired backups will not be deleted or merged.

Fixed backup-related parameters (pinning_options)

NOTE: To exclude certain backups from the established retention policy, you can use the following parameters with the backup and set-backup commands.

  • --ttl=interval

    Pins the backup for the specified amount of time, counted from the recovery time of the backup. The value must be a positive integer. The value 0 unpins the backup.

    Supported unit: ms, s, min, h, d (default value: s)

    For example, --ttl=30d.

  • --expire-time=time

    Specifies the timestamp up to which the backup is pinned. The timestamp must comply with the ISO 8601 standard.

    For example, --expire-time='2020-01-01 00:00:00+03'.
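
For example, to pin a backup for 30 days, or until a fixed date, using the documented options (the backup directory, instance name, and backup ID are placeholders):

    gs_probackup set-backup -B backup_dir --instance=instance_name -i backup_id --ttl=30d
    gs_probackup set-backup -B backup_dir --instance=instance_name -i backup_id --expire-time='2020-01-01 00:00:00+03'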

Log-related parameters (logging_options)

Log levels: verbose, log, info, warning, error, and off.

  • --log-level-console=log-level-console

    Sets the level of logs to be sent to the console. Each level contains all the levels following it. A higher level indicates fewer messages sent. If this parameter is set to off, the log recording function of the console is disabled.

    Default value: info

  • --log-level-file=log-level-file

    Sets the level of logs to be sent to the log file. Each level contains all the levels following it. A higher level indicates fewer messages sent. If this parameter is set to off, the log file recording function is disabled.

    Default value: off

  • --log-filename=log-filename

    Specifies the name of the log file to be created. The file name can use the strftime mode. Therefore, %-escapes can be used to specify the file name that changes with time.

    For example, if the pg_probackup-%u.log pattern is specified, pg_probackup generates one log file for each day of the week, with %u replaced by the corresponding decimal number: pg_probackup-1.log for Monday, pg_probackup-2.log for Tuesday, and so on.

    This parameter is valid if the --log-level-file parameter is specified to enable log file recording.

    Default value: "pg_probackup.log"

  • --error-log-filename=error-log-filename

    Specifies the name of the log file that is used only for error logs. The specifying method is the same as that of the --log-filename parameter.

    It is used for troubleshooting and monitoring.

  • --log-directory=log-directory

    Specifies the directory where log files are created. The value must be an absolute path. This directory is created when the first log is written.

    Default value: $BACKUP_PATH/log

  • --log-rotation-size=log-rotation-size

    Specifies the maximum size of a log file. If the maximum size is reached, the log file is rotated after the gs_probackup command is executed. The help and version commands do not trigger log rotation. The value 0 disables size-based rotation.

    The unit can be KB, MB, GB, or TB. The default unit is KB.

    Default value: 0

  • --log-rotation-age=log-rotation-age

    Specifies the maximum life cycle of a log file. If the maximum age is reached, the log file is rotated after the gs_probackup command is executed. The help and version commands do not trigger log rotation. The $BACKUP_PATH/log/log_rotation directory stores the creation time of the last log file. The value 0 disables time-based rotation.

    Supported unit: ms, s, min, h, d (default value: min)

    Default value: 0
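
As an illustration, the following backup command writes info-level messages to a log file rotated weekly (all values are examples only):

    gs_probackup backup -B backup_dir --instance=instance_name -b FULL --log-level-file=info --log-filename=pg_probackup-%u.log --log-rotation-age=7d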

Connection-related parameters (connection_options)

NOTE: The following parameters can be used together with the backup command.

  • -d dbname, --pgdatabase=dbname

    Specifies the name of the database to connect to. This connection is only used to manage the backup process. Therefore, you can connect to any existing database. If this parameter is not specified in the command line, the PGDATABASE environment variable, or the pg_probackup.conf configuration file, gs_probackup attempts to obtain the value from the PGUSER environment variable. If the PGUSER variable is not set, the value is obtained from the current user name.

    System environment variable: $PGDATABASE

  • -h hostname, --pghost=hostname

    Specifies the host name of the system on which the server is running. If the value begins with a slash (/), it is used as the directory for the UNIX domain socket.

    System environment variable: $PGHOST

    Default value: local socket

  • -p port, --pgport=port

    Specifies the TCP port or local Unix domain socket file name extension on which the server is listening for connections.

    System environment variable: $PGPORT

    Default value: 5432

  • -U username, --pguser=username

    Specifies the user name to connect as.

    System environment variable: $PGUSER

  • -w, --no-password

    Never issues a password prompt. The connection attempt fails if the host requires password verification and the password is not provided in other ways. This parameter is useful in batch jobs and the scripts that require no user password.

  • -W password, --password=password

    User password for database connection. If the host uses the trust authentication policy, the administrator does not need to enter the -W parameter. If the -W parameter is not provided and you are not a system administrator, the system will ask you to enter a password.
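
A hedged example of a backup that uses explicit connection options (the host, port, and user name are placeholders):

    gs_probackup backup -B backup_dir --instance=instance_name -b FULL -d postgres -h 127.0.0.1 -p 5432 -U backup_user -w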

Compression-related parameters (compression_options)

NOTE: The following parameters can be used together with the backup command.

  • --compress-algorithm=compress-algorithm

    Specifies the algorithm used to compress data files.

    The value can be zlib, pglz, or none. If zlib or pglz is set, compression is enabled. By default, the compression function is disabled.

    Default value: none

  • --compress-level=compress-level

    Specifies the compression level. Value range: 0-9

    • 0 indicates no compression.
    • 1 indicates that the compression ratio is the lowest and processing speed the fastest.
    • 9 indicates that the compression ratio is the highest and processing speed the slowest.
    • This parameter can be used together with --compress-algorithm.

    Default value: 1

  • --compress

    Compresses with --compress-algorithm=zlib and --compress-level=1.
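
For example, a zlib-compressed full backup (compression level 6 is chosen purely for illustration), followed by the --compress shorthand, which implies --compress-algorithm=zlib and --compress-level=1:

    gs_probackup backup -B backup_dir --instance=instance_name -b FULL --compress-algorithm=zlib --compress-level=6
    gs_probackup backup -B backup_dir --instance=instance_name -b FULL --compress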

Remote mode-related parameters (remote_options)

NOTE: The following are parameters that remotely run gs_probackup through SSH, and can be used together with the add-instance, set-config, backup, and restore commands.

  • --remote-proto=protocol

    Specifies the protocol used for remote operations. Currently, only the SSH protocol is supported. Valid value:

    ssh: enables the remote backup mode through SSH. This is the default value.

    none: explicitly disables the remote mode.

    If --remote-host is specified, this parameter can be omitted.

  • --remote-host=destination

    Specifies the IP address or host name of the remote host to be connected.

  • --remote-port=port

    Specifies the port number of the remote host to be connected.

    Default value: 22

  • --remote-user=username

    Specifies the remote host user for SSH connection. If this parameter is not specified, the user who initiates the SSH connection is used.

    Default value: the current user.

  • --remote-path=path

    Specifies the installation directory of gs_probackup in the remote system.

    Default value: current path

  • --ssh-options=ssh_options

    Specifies the character string of the SSH command line parameter.

    Example: --ssh-options='-c cipher_spec -F configfile'

    NOTE:

    • If the server does not respond due to a temporary network fault, gs_probackup exits after waiting for archive-timeout seconds (300 seconds by default).
    • If the LSN of the standby server is different from that of the primary server, the database continuously updates the following log information. In this case, you need to rebuild the standby server.
    LOG: walsender thread shut down
    LOG: walsender thread started
    LOG: received wal replication command: IDENTIFY_VERSION
    LOG: received wal replication command: IDENTIFY_MODE
    LOG: received wal replication command: IDENTIFY_SYSTEM
    LOG: received wal replication command: IDENTIFY_CONSISTENCE 0/D0002D8
    LOG: remote request lsn/crc: [xxxxx] local max lsn/crc: [xxxxx]
    

Backup Process

  1. Initialize the backup directory. Create the backups/ and wal/ subdirectories in the specified directory to store backup files and WAL files respectively.

    gs_probackup init -B backup_dir
    
  2. Add a new backup instance. gs_probackup can store backups of multiple database instances in the same backup directory.

    gs_probackup add-instance -B backup_dir -D data_dir --instance instance_name
    
  3. Create a backup for a specified database instance. Before performing an incremental backup, you must create at least one full backup.

    gs_probackup backup -B backup_dir --instance instance_name -b backup_mode
    
  4. Restore data from the backup of a specified DB instance.

    gs_probackup restore -B backup_dir --instance instance_name -D pgdata-path -i backup_id
    

Troubleshooting

| Problem Description | Cause and Solution |
| --- | --- |
| ERROR: query failed: ERROR: canceling statement due to conflict with recovery | Cause: An operation on the standby node is accessing a storage row that has been modified or deleted on the primary node, and the Xlog is being replayed on the standby node, so the operation on the standby node is canceled. Solution: 1. Increase the values of max_standby_archive_delay and max_standby_streaming_delay. 2. Add the configuration item hot_standby_feedback = on. |

PITR Recovery

Background

When a database breaks down or needs to be rolled back to a previous state, the point-in-time recovery (PITR) function of openGauss can be used to restore the database to any point in time after the backup and archive data is generated.

NOTE:

  • PITR can only be restored to a point in time after the physical backup data is generated.
  • Only the primary node can be restored using PITR. The standby node needs to be fully built to synchronize data with the primary node.

Prerequisites

  • Full data files have been physically backed up.
  • WAL log files have been archived.

PITR Recovery Process

  1. Replace the target database directory with the physical backup files.
  2. Delete all files in the database directory pg_xlog/.
  3. Copy the archived WAL log files to the pg_xlog/ directory. (Alternatively, configure restore_command in the recovery.conf file to skip this step.)
  4. Create the recovery command file recovery.conf in the database directory and specify the database recovery degree.
  5. Start the database.
  6. Connect to the database and check whether the database is recovered to the expected status.
  7. If the expected status is reached, run the pg_xlog_replay_resume() command so that the primary node can provide services externally.
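
A hedged shell sketch of this process for a single node; all paths are illustrative, and the physical backup is assumed to be available at /backup/base:

    # 1-2. Replace the data directory with the physical backup and clear pg_xlog/.
    cp -r /backup/base /gaussdb/data
    rm -rf /gaussdb/data/pg_xlog/*
    # 3. Copy archived WALs back (or configure restore_command instead; see below).
    cp /mnt/server/archivedir/* /gaussdb/data/pg_xlog/
    # 4. Create recovery.conf in the data directory (see Configuring the recovery.conf File).
    # 5-7. Start the database, verify its state, then resume replay on the primary node.
    gs_ctl start -D /gaussdb/data
    gsql -d postgres -p port_number -c "select pg_xlog_replay_resume();"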

Configuring the recovery.conf File

Archive Recovery Configuration

  • restore_command = string

    This shell command is used to obtain the archived WAL files in the WAL file series. Any %f in the string is replaced by the name of the file to retrieve from the archive, and any %p is replaced by the path name to copy it to on the server. Any %r is replaced by the name of the file containing the last valid restart point.

    For example:

    restore_command = 'cp /mnt/server/archivedir/%f %p'

  • archive_cleanup_command = string

    This optional parameter declares a shell command that is executed each time the system is restarted. archive_cleanup_command provides a mechanism for deleting unnecessary archived WAL files from the standby database. Any %r is replaced by the name of the file containing the last valid restart point; that is the earliest file that must be kept to allow recovery to be restartable, so all files older than %r can be safely removed.

    For example:

    archive_cleanup_command = 'pg_archivecleanup /mnt/server/archivedir %r'

    If multiple standby servers need to be recovered from the same archive path, ensure that WAL files are not deleted from any standby server before the recovery.

  • recovery_end_command = string

    This optional parameter declares a shell command that is executed only when the recovery is complete. recovery_end_command provides a cleanup mechanism for future replication and recovery.

Recovery Object Configuration

  • recovery_target_name = string

    This parameter declares recovery to the named restore point created using pg_create_restore_point().

    For example:

    recovery_target_name = 'restore_point_1'

  • recovery_target_time = timestamp

    This parameter declares recovery to the specified timestamp.

    For example:

    recovery_target_time = '2020-01-01 12:00:00'

  • recovery_target_xid = string

    This parameter declares recovery to the specified transaction ID.

    For example:

    recovery_target_xid = '3000'

  • recovery_target_lsn = string

    This parameter declares recovery to the log location specified by the LSN.

    For example:

    recovery_target_lsn = '0/0FFFFFF'

  • recovery_target_inclusive = boolean

    This parameter declares whether the recovery stops just after the specified recovery target (true) or just before it (false). It applies only to the recovery targets recovery_target_time, recovery_target_xid, and recovery_target_lsn.

    For example:

    recovery_target_inclusive = true

NOTE:

  • Only one of the four configuration items recovery_target_name, recovery_target_time, recovery_target_xid, and recovery_target_lsn can be used at a time.
  • If no recovery targets are configured or the configured target does not exist, data is recovered to the latest WAL log point by default.
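
For example, a minimal recovery.conf that replays archived WALs and stops just after a given timestamp (the path and timestamp are illustrative):

    restore_command = 'cp /mnt/server/archivedir/%f %p'
    recovery_target_time = '2020-01-01 12:00:00'
    recovery_target_inclusive = true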

Logical Backup and Restoration

gs_dump

Background

gs_dump, provided by openGauss, is used to export database information. You can export a database or its objects (such as schemas, tables, and views). The database can be the default postgres database or a user-specified database.

gs_dump is executed by OS user omm.

When gs_dump is used to export data, other users can still access (read and write) openGauss databases.

gs_dump can export complete, consistent data. For example, if gs_dump is started to export database A at T1, data of the database at that time point will be exported, and modifications on the database after that time point will not be exported.

gs_dump can export database information to a plain-text SQL script file or archive file.

  • Plain-text SQL script: It contains the SQL statements required to restore the database. You can use gsql to execute the SQL script. With only a little modification, the SQL script can rebuild a database on other hosts or database products.
  • Archive file: It contains data required to restore the database. It can be a tar-, directory-, or custom-format archive. For details, see Table 1. The export result must be used with gs_restore to restore the database. The system allows users to select or even reorder the content to be imported.

Functions

gs_dump can create export files in four formats, which are specified by -F or --format=, as listed in Table 1.

Table 1 Formats of exported files

| Format | Value of -F | Description | Suggestion | Corresponding Import Tool |
| --- | --- | --- | --- | --- |
| Plain-text | p | A plain-text script file containing SQL statements and commands. The commands can be executed on gsql, a command line terminal, to recreate database objects and load table data. | You are advised to use plain-text export files for small databases. | Before using gsql to restore database objects, you can use a text editor to edit the plain-text export file as required. |
| Custom | c | A binary file that allows the restoration of all or selected database objects from an exported file. | You are advised to use custom-format archive files for medium or large databases. | You can use gs_restore to import database objects from a custom-format archive. |
| Directory | d | A directory containing directory files and the data files of tables and BLOB objects. | - | - |
| .tar | t | A tar-format archive that allows the restoration of all or selected database objects from an exported file. It cannot be further compressed and has an 8 GB limit on the size of a single table. | - | - |

NOTE:
To reduce the size of an exported file, you can use gs_dump to compress it to a plain-text file or custom-format file. By default, a plain-text file is not compressed when generated. When a custom-format archive is generated, a medium level of compression is applied by default. Archived exported files cannot be compressed using gs_dump. When a plain-text file is exported in compressed mode, gsql fails to import data objects.
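
For reference, the following hedged examples (the port and output names are placeholders) produce each of the formats in Table 1:

gs_dump -p port_number postgres -F p -f backup/db.sql
gs_dump -p port_number postgres -F c -f backup/db.dmp
gs_dump -p port_number postgres -F d -f backup/db_dir
gs_dump -p port_number postgres -F t -f backup/db.tar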

Precautions

Do not modify an exported file or its content. Otherwise, restoration may fail.

To ensure the data consistency and integrity, gs_dump acquires a share lock on a table to be dumped. If another transaction has acquired a share lock on the table, gs_dump waits until this lock is released and then locks the table for dumping. If the table cannot be locked within the specified time, the dump fails. You can customize the timeout duration to wait for lock release by specifying the --lock-wait-timeout parameter.

Syntax

gs_dump [OPTION]... [DBNAME]

NOTE:
DBNAME does not follow a short or long option. It specifies the database to be connected.
For example:
Specify DBNAME without a -d option preceding it.

gs_dump -p port_number postgres -f dump1.sql

or

export PGDATABASE=postgres
gs_dump -p port_number -f dump1.sql

Environment variable: PGDATABASE

Parameter Description

Common parameters

  • -f, --file=FILENAME

    Sends the output to the specified file or directory. If this parameter is omitted, the standard output is used. If the output format is custom, directory, or tar (-F c, -F d, or -F t), the -f parameter must be specified. If the value of -f contains a directory, the current user must have read and write permissions on the directory, and the directory must not already exist.

  • -F, --format=c|d|t|p

    Selects the exported file format. The format can be:

    • p|plain: Generates a text SQL script file. This is the default value.

    • c|custom: Outputs a custom-format binary archive to be used as the input of gs_restore. This is the most flexible output format: it allows manual selection and reordering of archived items during restoration. An archive in this format is compressed by default.

    • d|directory: Creates a directory containing directory files and the data files of tables and BLOBs.

    • t|tar: Outputs a .tar archive as the input of gs_restore. The .tar format is compatible with the directory format. Extracting a .tar archive generates a valid directory-format archive. However, the .tar archive cannot be further compressed and has an 8-GB limitation on the size of a single table. The order of table data items cannot be changed during restoration.

      A .tar archive can be used as input of gsql.

  • -v, --verbose

    Specifies the verbose mode. If it is specified, gs_dump writes detailed object comments and start/stop times to the dump file, and progress messages to standard error.

  • -V, --version

    Prints the gs_dump version and exits.

  • -Z, --compress=0-9

    Specifies the used compression level.

    Value range: 0-9

    • 0 indicates no compression.
    • 1 indicates that the compression ratio is the lowest and processing speed the fastest.
    • 9 indicates that the compression ratio is the highest and processing speed the slowest.

    For the custom-format archive, this option specifies the compression level of a single table data segment. By default, data is compressed at a medium level. The plain-text and .tar archive formats do not support compression currently.

  • --lock-wait-timeout=TIMEOUT

    Specifies the maximum time to wait for shared table locks at the beginning of the dump. If a table cannot be locked within the specified time, the dump fails. The timeout can be specified in any of the formats accepted by SET statement_timeout.

  • -?, --help

    Displays help about gs_dump parameters and exits.

Dump parameters:

  • -a, --data-only

    Dumps only the data, not the schema (data definitions). Table data, large objects, and sequence values are dumped.

  • -b, --blobs

    Specifies a reserved interface for function expansion. This parameter is not recommended.

  • -c, --clean

    Writes commands to clean (drop) database objects into the backup file before the commands that create them. (If no such objects exist in the target database, gs_restore may report some errors.)

    This parameter is used only for the plain-text format. For the archive format, you can specify the option when using gs_restore.

  • -C, --create

    Starts the backup file with the commands to create the database and connect to it. (When a script in this mode is executed, you can connect to any database to run the commands; the data is restored to the newly created database rather than the database you connected to.)

    This parameter is used only for the plain-text format. For the archive format, you can specify the option when using gs_restore.

  • -E, --encoding=ENCODING

    Creates a dump file in the specified character set encoding. By default, the dump file is created in the database encoding. (Alternatively, you can set the environment variable PGCLIENTENCODING to the required dump encoding.)

  • -n, --schema=SCHEMA

    Dumps only schemas matching the schema names. This option contains the schema and all its contained objects. If this option is not specified, all non-system schemas in the target database will be dumped. Multiple schemas can be selected by specifying multiple -n options. The schema parameter is interpreted as a pattern according to the same rules used by the \d command of gsql. Therefore, multiple schemas can also be selected by writing wildcard characters in the pattern. When you use wildcard characters, quote the pattern to prevent the shell from expanding the wildcard characters.

    NOTE:

    • If -n is specified, gs_dump does not dump any other database objects which the selected schemas might depend upon. Therefore, there is no guarantee that the results of a specific-schema dump can be automatically restored to an empty database.
    • If -n is specified, the non-schema objects are not dumped.

    Multiple schemas can be dumped. Entering -n schemaname multiple times dumps multiple schemas.

    For example:

    gs_dump -h host_name -p port_number postgres -f backup/bkp_shl2.sql -n sch1 -n sch2
    

    In the preceding example, sch1 and sch2 are dumped.

  • -N, --exclude-schema=SCHEMA

    Does not dump any schemas matching the schema pattern. The pattern is interpreted according to the same rules as for -n. -N can be specified multiple times to exclude schemas matching any of the specified patterns.

    When both -n and -N are specified, the schemas that match at least one -n option but no -N option are dumped. If -N is specified and -n is not, the schemas matching -N are excluded from an otherwise normal dump.

    Entering -N schemaname multiple times excludes multiple schemas during the dump.

    For example:

    gs_dump -h host_name -p port_number postgres -f backup/bkp_shl2.sql -N sch1 -N sch2
    

    In the preceding example, sch1 and sch2 will be excluded during the dumping.

  • -o, --oids

    Dumps object identifiers (OIDs) as parts of the data in each table. Use this option if your application references the OID columns in some way. If the preceding situation does not occur, do not use this parameter.

  • -O, --no-owner

    Does not output commands to set the ownership of objects to match the original database. By default, gs_dump issues ALTER OWNER or SET SESSION AUTHORIZATION statements to set the ownership of created database objects. These statements fail when the script is run unless it is started by a system administrator (or the same user that owns all of the objects in the script). To make a script that can be restored by any user and that gives that user ownership of all objects, specify -O.

    This parameter is used only for the plain-text format. For the archive format, you can specify the option when using gs_restore.

  • -s, --schema-only

    Dumps only the object definition (schema) but not data.

  • -S, --sysadmin=NAME

    Specifies a reserved interface for function expansion. This parameter is not recommended.

  • -t, --table=TABLE

    Specifies a list of tables, views, sequences, or foreign tables to be dumped. You can use multiple -t parameters or wildcard characters to specify tables.

    When you use wildcard characters, quote patterns to prevent the shell from expanding the wildcard characters.

    The -n and -N options have no effect when -t is used, because tables selected by using -t will be dumped regardless of those options.

    NOTE:

    • The number of -t parameters must be less than or equal to 100.
    • If the number of -t parameters is greater than 100, you are advised to use the --include-table-file parameter to replace some -t parameters.
    • If -t is specified, gs_dump does not dump any other database objects which the selected tables might depend upon. Therefore, there is no guarantee that the results of a specific-table dump can be automatically restored to an empty database.
    • -t tablename only dumps visible tables in the default search path. -t '*.tablename' dumps tablename tables in all the schemas of the dumped database. -t schema.table dumps tables in a specific schema.
    • -t tablename does not export trigger information from a table.

    For example:

    gs_dump -h host_name -p port_number postgres -f backup/bkp_shl2.sql -t schema1.table1 -t schema2.table2
    

    In the preceding example, schema1.table1 and schema2.table2 are dumped.

  • --include-table-file=FILENAME

    Specifies the table file to be dumped.

  • -T, --exclude-table=TABLE

    Specifies a list of tables, views, sequences, or foreign tables not to be dumped. You can use multiple -T parameters or wildcard characters to specify tables.

    When both -t and -T are specified, -t takes precedence: an object selected by -t is dumped even if it also matches a -T pattern.

    For example:

    gs_dump -h host_name -p port_number postgres -f backup/bkp_shl2.sql -T table1 -T table2
    

    In the preceding example, table1 and table2 are excluded from the dumping.

  • --exclude-table-file=FILENAME

    Specifies the table files that do not need to be dumped.

    NOTE:
    Same as --include-table-file, the content of the specified file uses the following format:
    schema1.table1
    schema2.table2
    ……

  • -x, --no-privileges|--no-acl

    Prevents the dumping of access permissions (grant/revoke commands).

  • --binary-upgrade

    Specifies a reserved interface for function expansion. This parameter is not recommended.

  • --binary-upgrade-usermap="USER1=USER2"

    Specifies a reserved interface for function expansion. This parameter is not recommended.

  • --column-inserts|--attribute-inserts

    Exports data by running the INSERT command with explicit column names (INSERT INTO table (column, ...) VALUES ...). This results in a slow restoration. However, since this option generates an independent command for each row, an error in reloading a row causes only that row, rather than the entire table content, to be lost.

  • --disable-dollar-quoting

    Disables the use of dollar sign ($) for function bodies, and forces them to be quoted using the SQL standard string syntax.

  • --disable-triggers

    Specifies a reserved interface for function expansion. This parameter is not recommended.

  • --exclude-table-data=TABLE

    Does not dump data that matches any of table patterns. The pattern is interpreted according to the same rules as for -t.

    --exclude-table-data can be entered more than once to exclude tables matching any of several patterns. When you need the specified table definition rather than data in the table, this option is helpful.

    To exclude data of all tables in the database, see --schema-only.

  • --inserts

    Dumps data by the INSERT statement (rather than COPY). This will cause a slow restoration.

    However, since this option generates an independent command for each row, an error in reloading a row causes only that row, rather than the entire table content, to be lost. Note that the restoration may fail if the column order has been rearranged. The --column-inserts option is safe against column order changes, though even slower.

  • --no-security-labels

    Specifies a reserved interface for function expansion. This parameter is not recommended.

  • --no-tablespaces

    Does not output commands to select tablespaces. With this option, all objects are created in the default tablespace during the restoration, no matter which tablespace they originally resided in.

    This parameter is used only for the plain-text format. For the archive format, you can specify the option when using gs_restore.

  • --no-unlogged-table-data

    This parameter is a reserved interface for function extension. It is not recommended.

  • --non-lock-table

    This parameter is a reserved interface for function extension. It is not recommended.

  • --include-alter-table

    Dumps information about columns that have been deleted from tables; that is, the dump records which columns were dropped.

  • --quote-all-identifiers

    Forcibly quotes all identifiers. This parameter is useful when you dump a database for migration to a later version, in which additional keywords may be introduced.

  • --section=SECTION

    Dumps only the named section. The section name can be pre-data, data, or post-data.
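
    For example, a minimal sketch that splits one plain-text dump into three files, which can later be replayed in order:

    gs_dump -p port_number postgres -f backup/pre.sql --section=pre-data
    gs_dump -p port_number postgres -f backup/data.sql --section=data
    gs_dump -p port_number postgres -f backup/post.sql --section=post-data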

  • --serializable-deferrable

    Uses a serializable transaction for the dump to ensure that the snapshot used is consistent with later database states. The dump waits for a point in the transaction stream at which no anomalies can be present, so that the dump neither fails nor causes other transactions to roll back with serialization failures.

    This option has no benefit for a dump intended only for disaster recovery. It is helpful for a dump used to load a copy of the database for reporting or other read-only load sharing while the original database continues to be updated. Without this option, the dump may reflect a state that is not consistent with any serial execution of the transactions eventually committed.

    This option makes no difference if there are no active read-write transactions when gs_dump is started. If read-write transactions are active, the start of the dump may be delayed for an indeterminate length of time.

  • --use-set-session-authorization

    Specifies that the standard SQL SET SESSION AUTHORIZATION command, rather than ALTER OWNER, is output to set object ownership. This makes dumping more standards-compatible. However, if a dump file contains objects that have historical problems, restoration may fail. A dump using SET SESSION AUTHORIZATION requires system administrator permissions, whereas ALTER OWNER requires lower permissions.

  • --with-encryption=AES128

    Specifies that the dumped data is encrypted using AES128.

  • --with-key=KEY

    Specifies the AES128 key. The key length must be 16 bytes.

    NOTE:

    Encrypted export with gs_dump supports only the plain format. Data exported with -F plain must be imported using the gsql tool, and if the export was encrypted, the same --with-key value must be specified when importing through gsql.
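
    A minimal sketch of an encrypted export and the matching import (the key value and the target database targetdb are placeholders; the key must be 16 bytes):

    gs_dump -p port_number postgres -f backup/bkp_enc.sql -F p --with-encryption=AES128 --with-key=1234567890123456
    gsql -d targetdb -p port_number --with-key=1234567890123456 -f backup/bkp_enc.sql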

  • --include-depend-objs

    Includes information about the objects that depend on the specified object in the backup result. This parameter takes effect only if the -t or --include-table-file parameter is specified.

  • --exclude-self

    Excludes information about the specified object from the backup result. This parameter takes effect only if the -t or --include-table-file parameter is specified.

  • --dont-overwrite-file

    By default, existing files in the plain-text, .tar, and custom formats are overwritten. This option prevents the overwrite and reports an error instead. It is not used for the directory format.

    For example:

    Assume that the backup.sql file already exists in the current directory. If you specify -f backup.sql in the command, the existing file in the current directory will be overwritten.

    If the backup file already exists and --dont-overwrite-file is specified, an error will be reported with the message that the dump file exists.

    gs_dump -p port_number postgres -f backup.sql -F plain --dont-overwrite-file
    

NOTE:

  • The -s/--schema-only and -a/--data-only parameters do not coexist.
  • The -c/--clean and -a/--data-only parameters do not coexist.
  • --inserts/--column-inserts and -o/--oids do not coexist, because OIDS cannot be set using the INSERT statement.
  • --role must be used in conjunction with --rolepassword.
  • --binary-upgrade-usermap must be used in conjunction with --binary-upgrade.
  • --include-depend-objs or --exclude-self takes effect only when -t or --include-table-file is specified.
  • --exclude-self must be used in conjunction with --include-depend-objs.

Connection parameters:

  • -h, --host=HOSTNAME

    Specifies the host name. If the value begins with a slash (/), it is used as the directory for the UNIX domain socket. The default value is taken from the PGHOST environment variable (if available). Otherwise, a Unix domain socket connection is attempted.

    This parameter is used only for defining names of the hosts outside openGauss. The names of the hosts inside openGauss must be 127.0.0.1.

    Value range: host name

    Environment variable: PGHOST

  • -p, --port=PORT

    Specifies the host port number. If the thread pool function is enabled, you are advised to use pooler port, that is, the host port number plus 1.

    Environment variable: PGPORT

  • -U, --username=NAME

    Specifies the username of the host to be connected.

    If the username of the host to be connected is not specified, the system administrator is used by default.

    Environment variable: PGUSER

  • -w, --no-password

    Never issues a password prompt. The connection attempt fails if the host requires password verification and the password is not provided in other ways. This parameter is useful in batch jobs and scripts in which no user password is required.

  • -W, --password=PASSWORD

    Specifies the user password for connection. If the host uses the trust authentication policy, the administrator does not need to enter the -W option. If the -W option is not provided and you are not a system administrator, the Dump Restore tool will ask you to enter a password.

  • --role=ROLENAME

    Specifies a role name to be used for creating the dump. If this option is selected, the SET ROLE command will be issued after gs_dump connects to the database. It is useful when the authenticated user (specified by -U) lacks the permissions required by gs_dump, because it allows the user to switch to a role with the required permissions. Some installations have a policy against logging in directly as a system administrator. This option allows dumping data without violating the policy.

  • --rolepassword=ROLEPASSWORD

    Specifies the password for a role.
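
    For example, a minimal sketch that dumps through a privileged role (role_name and the passwords are placeholders; --role must be used together with --rolepassword):

    gs_dump -U user_name -W user_password -p port_number postgres -f backup/bkp_role.sql --role=role_name --rolepassword=role_password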

Notice

If any local additions need to be added to the template1 database in openGauss, restore the output of gs_dump into an empty database with caution. Otherwise, you are likely to obtain errors due to duplicate definitions of the added objects. To create an empty database without any local additions, copy data from template0 rather than template1. Example:

CREATE DATABASE foo WITH TEMPLATE template0;

A member file in a .tar archive must be smaller than 8 GB. (This is an inherent limitation of the .tar file format.) The total size of a .tar archive, or of output in any other format, is not limited, except possibly by the OS.

The dump file generated by gs_dump does not contain the statistics used by the optimizer to make execution plans. Therefore, you are advised to run ANALYZE after restoring from a dump file to ensure optimal performance. The dump file does not contain any ALTER DATABASE … SET commands. These settings are dumped by gs_dumpall, along with database users and other installation settings.
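
For example, a minimal sketch that refreshes optimizer statistics after a restore (the database name and port are placeholders):

gsql -d postgres -p port_number -c "ANALYZE;"

Running ANALYZE with no arguments collects statistics for all tables in the current database.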

Examples

Use gs_dump to dump a database as a SQL text file or a file in other formats.

In the following examples, Bigdata@123 indicates the password for the database user. backup/MPPDB_backup.sql indicates an exported file where backup indicates the relative path of the current directory. 37300 indicates the port number of the database server. postgres indicates the name of the database to be accessed.

NOTE: Before exporting files, ensure that the directory exists and you have the read and write permissions on the directory.

Example 1: Use gs_dump to export the full information of the postgres database. The exported MPPDB_backup.sql file is in plain-text format.

gs_dump -U omm -W Bigdata@123 -f backup/MPPDB_backup.sql -p 37300 postgres -F p
gs_dump[port='37300'][postgres][2018-06-27 09:49:17]: The total objects number is 356.
gs_dump[port='37300'][postgres][2018-06-27 09:49:17]: [100.00%] 356 objects have been dumped.
gs_dump[port='37300'][postgres][2018-06-27 09:49:17]: dump database postgres successfully
gs_dump[port='37300'][postgres][2018-06-27 09:49:17]: total time: 1274  ms

Use gsql to import data from the exported plain-text file.

Example 2: Use gs_dump to export the full information of the postgres database. The exported MPPDB_backup.tar file is in .tar format.

gs_dump -U omm -W Bigdata@123 -f backup/MPPDB_backup.tar -p 37300 postgres -F t
gs_dump[port='37300'][postgres][2018-06-27 10:02:24]: The total objects number is 1369.
gs_dump[port='37300'][postgres][2018-06-27 10:02:53]: [100.00%] 1369 objects have been dumped.
gs_dump[port='37300'][postgres][2018-06-27 10:02:53]: dump database postgres successfully
gs_dump[port='37300'][postgres][2018-06-27 10:02:53]: total time: 50086  ms

Example 3: Use gs_dump to export the full information of the postgres database. The exported MPPDB_backup.dmp file is in custom format.

gs_dump -U omm -W Bigdata@123 -f backup/MPPDB_backup.dmp -p 37300 postgres -F c
gs_dump[port='37300'][postgres][2018-06-27 10:05:40]: The total objects number is 1369.
gs_dump[port='37300'][postgres][2018-06-27 10:06:03]: [100.00%] 1369 objects have been dumped.
gs_dump[port='37300'][postgres][2018-06-27 10:06:03]: dump database postgres successfully
gs_dump[port='37300'][postgres][2018-06-27 10:06:03]: total time: 36620  ms

Example 4: Use gs_dump to export the full information of the postgres database. The exported MPPDB_backup file is in directory format.

gs_dump -U omm -W Bigdata@123 -f backup/MPPDB_backup -p 37300  postgres -F d
gs_dump[port='37300'][postgres][2018-06-27 10:16:04]: The total objects number is 1369.
gs_dump[port='37300'][postgres][2018-06-27 10:16:23]: [100.00%] 1369 objects have been dumped.
gs_dump[port='37300'][postgres][2018-06-27 10:16:23]: dump database postgres successfully
gs_dump[port='37300'][postgres][2018-06-27 10:16:23]: total time: 33977  ms 

Example 5: Use gs_dump to export the information of the postgres database, excluding the information of the table specified in the /home/MPPDB_temp.sql file. The exported MPPDB_backup.sql file is in plain-text format.

gs_dump -U omm -W Bigdata@123 -p 37300 postgres --exclude-table-file=/home/MPPDB_temp.sql -f backup/MPPDB_backup.sql
gs_dump[port='37300'][postgres][2018-06-27 10:37:01]: The total objects number is 1367.
gs_dump[port='37300'][postgres][2018-06-27 10:37:22]: [100.00%] 1367 objects have been dumped.
gs_dump[port='37300'][postgres][2018-06-27 10:37:22]: dump database postgres successfully
gs_dump[port='37300'][postgres][2018-06-27 10:37:22]: total time: 37017  ms

Example 6: Use gs_dump to export only the information about the views that depend on the testtable table. Create another testtable table, and then restore the views that depend on it.

  • Back up only the views that depend on the testtable table.

    gs_dump -s -p 37300 postgres -t PUBLIC.testtable --include-depend-objs --exclude-self -f backup/MPPDB_backup.sql -F p
    gs_dump[port='37300'][postgres][2018-06-15 14:12:54]: The total objects number is 331.
    gs_dump[port='37300'][postgres][2018-06-15 14:12:54]: [100.00%] 331 objects have been dumped.
    gs_dump[port='37300'][postgres][2018-06-15 14:12:54]: dump database postgres successfully
    gs_dump[port='37300'][postgres][2018-06-15 14:12:54]: total time: 327  ms
    
  • Change the name of the testtable table.

    gsql -p 37300 postgres -r -c "ALTER TABLE PUBLIC.testtable RENAME TO testtable_bak;"
    
  • Create another testtable table.

    CREATE TABLE PUBLIC.testtable(a int, b int, c int);
    
  • Restore the views for the new testtable table.

    gsql -p 37300 postgres -r -f backup/MPPDB_backup.sql
    

gs_dumpall

Background

gs_dumpall, provided by openGauss, is used to export all openGauss database information, including data of the default database postgres, user-defined databases, and common global objects of all openGauss databases.

gs_dumpall is executed by OS user omm.

When gs_dumpall is used to export data, other users can still access (read and write) openGauss databases.

gs_dumpall can export complete, consistent data. For example, if gs_dumpall is started to export the openGauss database at time point T1, data of the database at that time point will be exported, and modifications made to the database after that time point will not be exported.

gs_dumpall exports all openGauss databases in two parts:

  • gs_dumpall exports all global objects, including information about database users and groups, tablespaces, and attributes (for example, global access permissions).
  • gs_dumpall invokes gs_dump to export SQL scripts from each openGauss database, which contain all the SQL statements required to restore databases.

The exported files are both plain-text SQL scripts. Use gsql to execute them to restore openGauss databases.

Precautions

  • Do not modify an exported file or its content. Otherwise, restoration may fail.
  • To ensure the data consistency and integrity, gs_dumpall acquires a share lock on a table to be dumped. If another transaction has acquired a share lock on the table, gs_dumpall waits until this lock is released and then locks the table for dumping. If the table cannot be locked within the specified time, the dump fails. You can customize the timeout duration to wait for lock release by specifying the --lock-wait-timeout parameter.
  • During an export, gs_dumpall reads all tables in a database. Therefore, you need to connect to the database as an openGauss administrator to export a complete file. When you use gsql to execute SQL scripts, cluster administrator permissions are also required to add users and user groups, and create databases.

Syntax

gs_dumpall [OPTION]...

Parameter Description

Common parameters:

  • -f, --filename=FILENAME

    Sends the output to the specified file. If this parameter is omitted, the standard output is used.

  • -v, --verbose

    Specifies the verbose mode. If it is specified, gs_dumpall writes detailed object comments and start/stop times to the dump file, and progress messages to standard error.

  • -V, --version

    Prints the gs_dumpall version and exits.

  • --lock-wait-timeout=TIMEOUT

    Does not wait indefinitely to acquire shared table locks at the beginning of the dump. The dump fails if a table cannot be locked within the specified time. The timeout period can be specified in any of the formats accepted by SET statement_timeout.

  • -?, --help

    Displays help about gs_dumpall parameters and exits.

Dump parameters:

  • -a, --data-only

    Dumps only the data, not the schema (data definition).

  • -c, --clean

    Includes SQL statements to delete (drop) databases before recreating them. DROP statements for roles and tablespaces are added as well.

  • -g, --globals-only

    Dumps only global objects (roles and tablespaces) but no databases.
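
    For example, a minimal sketch that saves only roles and tablespaces to a separate script (the file name is a placeholder):

    gs_dumpall -p port_number -g -f backup/globals.sql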

  • -o, --oids

    Dumps object identifiers (OIDs) as part of the data in each table. Use this parameter if your application references the OID columns in some way; otherwise, do not use it.

  • -O, --no-owner

    Do not output commands to set ownership of objects to match the original database. By default, gs_dumpall issues the ALTER OWNER or SET SESSION AUTHORIZATION command to set ownership of created schema objects. These statements will fail when the script is run unless it is started by a system administrator (or the same user that owns all of the objects in the script). To make a script that can be restored by any user and gives that user ownership of all objects, specify -O.

  • -r, --roles-only

    Dumps only roles but not databases or tablespaces.

  • -s, --schema-only

    Dumps only the object definition (schema) but not data.

  • -S, --sysadmin=NAME

    Specifies the name of the system administrator during the dump.

  • -t, --tablespaces-only

    Dumps only tablespaces but not databases or roles.

  • -x, --no-privileges

    Prevents the dumping of access permissions (grant/revoke commands).

  • --column-inserts|--attribute-inserts

    Exports data by running the INSERT command with explicit column names {INSERT INTO table (column, …) VALUES …}. This will cause a slow restoration. However, since this option generates an independent command for each row, an error in reloading a row causes only the loss of the row rather than the entire table content.

  • --disable-dollar-quoting

    Disables the use of dollar sign ($) for function bodies, and forces them to be quoted using the SQL standard string syntax.

  • --disable-triggers

    This parameter is a reserved interface for function extension. It is not recommended.

  • --inserts

    Dumps data as INSERT statements (rather than COPY). This will cause a slow restoration. The restoration may fail if you rearrange the column order. The --column-inserts option is safe against column order changes, though even slower.

  • --no-security-labels

    This parameter is a reserved interface for function extension. It is not recommended.

  • --no-tablespaces

    Does not generate output statements to create tablespaces or select tablespaces for objects. With this option, all objects will be created in the default tablespace during the restoration.

  • --no-unlogged-table-data

    This parameter is a reserved interface for function extension. It is not recommended.

  • --quote-all-identifiers

    Forcibly quotes all identifiers. This parameter is useful when you dump a database for migration to a later version, in which additional keywords may be introduced.

  • --dont-overwrite-file

    Does not overwrite an existing output file.

  • --use-set-session-authorization

    Specifies that the standard SQL SET SESSION AUTHORIZATION command, rather than ALTER OWNER, is output to set object ownership. This makes dumping more standards-compatible. However, if a dump file contains objects that have historical problems, restoration may fail. A dump using SET SESSION AUTHORIZATION requires system administrator permissions, whereas ALTER OWNER requires lower permissions.

  • --with-encryption=AES128

    Specifies that dumping data needs to be encrypted using AES128.

  • --with-key=KEY

    Specifies that the key length of AES128 must be 16 bytes.

  • --include-templatedb

    Includes template databases during the dump.

  • --binary-upgrade

    This parameter is a reserved interface for function extension. It is not recommended.

  • --binary-upgrade-usermap="USER1=USER2"

    This parameter is a reserved interface for function extension. It is not recommended.

  • --tablespaces-postfix

    This parameter is a reserved interface for function extension. It is not recommended.

  • --parallel-jobs

    Specifies the number of concurrent backup processes. The value range is 1-1000.
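
    For example, a minimal sketch using four concurrent backup processes (the value 4 is illustrative; --parallel-jobs must be used together with -f):

    gs_dumpall -p port_number -f backup/all_databases.sql --parallel-jobs=4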

NOTE:

  • The -g/--globals-only and -r/--roles-only parameters do not coexist.
  • The -g/--globals-only and -t/--tablespaces-only parameters do not coexist.
  • The -r/--roles-only and -t/--tablespaces-only parameters do not coexist.
  • The -s/--schema-only and -a/--data-only parameters do not coexist.
  • The -r/--roles-only and -a/--data-only parameters do not coexist.
  • The -t/--tablespaces-only and -a/--data-only parameters do not coexist.
  • The -g/--globals-only and -a/--data-only parameters do not coexist.
  • --tablespaces-postfix must be used in conjunction with --binary-upgrade.
  • --binary-upgrade-usermap must be used in conjunction with --binary-upgrade.
  • --parallel-jobs must be used in conjunction with -f/--filename.

Connection parameters:

  • -h, --host

    Specifies the host name. If the value begins with a slash (/), it is used as the directory for the UNIX domain socket. The default value is taken from the PGHOST environment variable (if set). Otherwise, a Unix domain socket connection is attempted.

    This parameter is used only for defining names of the hosts outside openGauss. The names of the hosts inside openGauss must be 127.0.0.1.

    Environment variable: PGHOST

  • -l, --database

    Specifies the name of the database to connect to for dumping global objects and discovering the other databases to be dumped. If this parameter is not specified, the postgres database will be used. If the postgres database does not exist, template1 will be used.

  • -p, --port

    TCP port or the local Unix-domain socket file extension on which the server is listening for connections. The default value is the PGPORT environment variable.

    If the thread pool function is enabled, you are advised to use pooler port, that is, the listening port number plus 1.

    Environment variable: PGPORT

  • -U, --username

    Specifies the user name to connect as.

    Environment variable: PGUSER

  • -w, --no-password

    Never issues a password prompt. The connection attempt fails if the host requires password verification and the password is not provided in other ways. This parameter is useful in batch jobs and scripts in which no user password is required.

  • -W, --password

    Specifies the user password for connection. If the host uses the trust authentication policy, the administrator does not need to enter the -W option. If the -W option is not provided and you are not a system administrator, the Dump Restore tool will ask you to enter a password.

  • --role

    Specifies a role name to be used for creating the dump. This option causes gs_dumpall to issue the SET ROLE statement after connecting to the database. It is useful when the authenticated user (specified by -U) lacks the permissions required by gs_dumpall, because it allows the user to switch to a role with the required permissions. Some installations have a policy against logging in directly as a system administrator. This option allows dumping data without violating the policy.

  • --rolepassword

    Specifies the password of the specific role.

Notice

gs_dumpall internally invokes gs_dump. For details about the diagnosis information, see gs_dump.

After a dump created by gs_dumpall is restored, run ANALYZE on each database so that the optimizer can provide useful statistics.

gs_dumpall requires all needed tablespace directories to exist before the restoration. Otherwise, database creation will fail for databases in non-default locations.

Examples

Use gs_dumpall to export all openGauss databases at a time.

NOTE: gs_dumpall supports only plain-text format export. Therefore, only gsql can be used to restore a file exported using gs_dumpall.

gs_dumpall -f backup/bkp2.sql -p 37300
gs_dump[port='37300'][dbname='postgres'][2018-06-27 09:55:09]: The total objects number is 2371.
gs_dump[port='37300'][dbname='postgres'][2018-06-27 09:55:35]: [100.00%] 2371 objects have been dumped.
gs_dump[port='37300'][dbname='postgres'][2018-06-27 09:55:46]: dump database dbname='postgres' successfully
gs_dump[port='37300'][dbname='postgres'][2018-06-27 09:55:46]: total time: 55567  ms
gs_dumpall[port='37300'][2018-06-27 09:55:46]: dumpall operation successful
gs_dumpall[port='37300'][2018-06-27 09:55:46]: total time: 56088  ms
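
The exported script can then be replayed with gsql. A minimal sketch (run as a user with the required administrator permissions; bkp2.sql is the file produced above):

gsql -d postgres -p 37300 -f backup/bkp2.sql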

gs_restore

Background

gs_restore, provided by openGauss, is used to import data from the files exported by gs_dump.

gs_restore is executed by OS user omm.

It has the following functions:

  • Importing data to the database

    If a database is specified, data is imported to the database. For parallel import, the password for connecting to the database is required.

  • Importing data to the script file

    If no target database is specified, a script containing the SQL statements needed to recreate the database is created and written to a file or to standard output. The script output is equivalent to the plain-text output format of gs_dump.

Command Format

gs_restore [OPTION]... FILE

NOTE:

  • FILE has no short or long option name. It specifies the location of the archive file.
  • Either dbname or -l is required, but the two parameters cannot be specified at the same time.
  • gs_restore incrementally imports data by default. To prevent data exceptions caused by multiple import operations, you are advised to use the -c parameter during the import. Before recreating database objects, delete the database objects that already exist in the database to be restored.
  • There is no option to control log printing. To hide logs, redirect them to a log file. If a large amount of table data needs to be restored, the table data will be restored in batches, so the log indicating that the table data has been imported is generated multiple times.

Parameter Description

Common parameters

  • -d, --dbname=NAME

    Connects to the dbname database and imports data to the database.

  • -f, --file=FILENAME

    Specifies the output file for the generated script, or for the listing when used with -l.

    The default is the standard output.

    NOTE:
    -f cannot be used in conjunction with -d.

  • -F, --format=c|d|t

    Specifies the format of the archive. The format does not need to be specified, because gs_restore determines the format automatically.

    Value range:

    • c/custom: The archive is in the custom format of gs_dump.
    • d/directory: The archive is in the directory format.
    • t/tar: The archive is in the .tar format.
  • -l, --list

    Lists the contents of the archive. The output of this operation can be used as the input of the -L parameter. If filtering parameters, such as -n or -t, are used together with -l, they will restrict the listed items.

  • -v, --verbose

    Specifies the verbose mode.

  • -V, --version

    Prints the gs_restore version and exits.

  • -?, --help

    Displays help information about the parameters of gs_restore and exits.

Parameters for importing data

  • -a, --data-only

    Imports only the data, not the schema (data definition). gs_restore incrementally imports data.

  • -c, --clean

    Cleans (deletes) existing database objects in the database to be restored before recreating them.

  • -C, --create

    Creates the database before importing data to it. (When this parameter is used, the database specified by -d is used to issue the initial CREATE DATABASE command. All data is imported to the created database.)

  • -e, --exit-on-error

    Exits if an error occurs while SQL statements are being sent to the database. By default, the remaining commands are still sent, and error information is displayed when the import ends.

  • -I, --index=NAME

    Imports only the definition of the specified index. Multiple indexes can be imported. Enter -I index multiple times to import multiple indexes.

    For example:

    gs_restore -h host_name -p port_number -d postgres -I Index1 -I Index2 backup/MPPDB_backup.tar
    

    In this example, Index1 and Index2 will be imported.

  • -j, --jobs=NUM

    Specifies the number of concurrent jobs for the most time-consuming operations of gs_restore (such as loading data, creating indexes, or creating constraints). This parameter can greatly reduce the time needed to import a large database to a server running on a multiprocessor machine.

    Each job is one process or one thread, depending on the OS, and uses a separate connection to the server.

    The optimal value for this option depends on the server hardware, the client, the network, the number of CPU cores, and disk settings. It is recommended that the parameter be set to the number of CPU cores on the server. A larger value can also lead to faster imports in many cases, but an overly large value will decrease performance because of thrashing.

    This parameter supports custom-format archives only. The input file must be a regular file (not the pipe file). This parameter can be ignored when you select the script method rather than connect to a database server. In addition, multiple jobs cannot be used in conjunction with the --single-transaction parameter.
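
    For example, a minimal sketch that restores a custom-format archive with four concurrent jobs (the value 4 is illustrative):

    gs_restore backup/MPPDB_backup.dmp -p port_number -d postgres -j 4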

  • -L, --use-list=FILENAME

    Imports only archive elements that are listed in list-file and imports them in the order that they appear in the file. If filtering parameters, such as -n or -t, are used in conjunction with -L, they will further limit the items to be imported.

    list-file is normally created by editing the output of a previous -l parameter. File lines can be moved or removed, and can also be commented out by placing a semicolon (;) at the beginning of the row.
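
    For example, a minimal sketch of the list-edit-restore workflow (the file name restore.list is a placeholder):

    gs_restore -l backup/MPPDB_backup.tar -f restore.list
    # Edit restore.list: reorder lines or comment out entries with a leading semicolon.
    gs_restore -L restore.list backup/MPPDB_backup.tar -p port_number -d postgres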

  • -n, --schema=NAME

    Restores only objects in the listed schemas.

    This parameter can be used in conjunction with the -t parameter to import a specific table.

    Entering -n schemaname multiple times can import multiple schemas.

    For example:

    gs_restore -h host_name -p port_number -d postgres -n sch1 -n sch2 backup/MPPDB_backup.tar
    

    In this example, sch1 and sch2 will be imported.

  • -O, --no-owner

    Do not output commands to set ownership of objects to match the original database. By default, gs_restore issues the ALTER OWNER or SET SESSION AUTHORIZATION statement to set ownership of created schema elements. These statements will fail unless the database is initially accessed by a system administrator (or the user who owns all of the objects in the script). With -O, any user name can be used for the initial connection, and this user will own all the created objects.

  • -P, --function=NAME(args)

    Imports only listed functions. You need to correctly spell the function name and the parameter based on the contents of the dump file in which the function exists.

    Entering -P alone means importing all function-name(args) functions in a file. Entering -P with -n means importing the function-name(args) functions in a specified schema. Entering -P multiple times and using -n once means that all imported functions are in the -n schema by default.

    You can enter -n schema-name -P 'function-name(args)' multiple times to import functions in specified schemas.

    For example:

    gs_restore -h host_name -p port_number -d postgres -n test1 -P 'Func1(integer)' -n test2 -P 'Func2(integer)' backup/MPPDB_backup.tar
    

    In this example, both Func1(integer) in the test1 schema and Func2(integer) in the test2 schema will be imported.

  • -s, --schema-only

    Imports only schemas (data definitions), instead of data (table content). The current sequence value will not be imported.

  • -S, --sysadmin=NAME

    This parameter is a reserved interface for function extension. It is not recommended.

  • -t, --table=NAME

    Imports only listed table definitions or data, or both. This parameter can be used in conjunction with the -n parameter to specify a table object in a schema. When -n is not entered, the default schema is PUBLIC. Entering -n schemaname -t tablename multiple times can import multiple tables in a specified schema.

    For example:

    Import table1 in the PUBLIC schema.

    gs_restore -h host_name -p port_number -d postgres -t table1 backup/MPPDB_backup.tar
    

    Import test1 in the test1 schema and test2 in the test2 schema.

    gs_restore -h host_name -p port_number -d postgres -n test1 -t test1 -n test2 -t test2 backup/MPPDB_backup.tar
    

    Import table1 in the PUBLIC schema and table1 in the test1 schema.

    gs_restore -h host_name -p port_number -d postgres -n PUBLIC -t table1 -n test1 -t table1 backup/MPPDB_backup.tar
    
  • -T, --trigger=NAME

    This parameter is reserved for extension.

  • -x, --no-privileges/--no-acl

    Prevents the import of access permissions (GRANT/REVOKE commands).

  • -1, --single-transaction

    Executes import as a single transaction (that is, commands are wrapped in BEGIN/COMMIT).

    This parameter ensures that either all the commands complete successfully or no changes are applied. It implies --exit-on-error.
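
    For example, a minimal sketch that makes the whole import atomic:

    gs_restore backup/MPPDB_backup.dmp -p port_number -d postgres -1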

  • --disable-triggers

    This parameter is a reserved interface for function extension. It is not recommended.

  • --no-data-for-failed-tables

    By default, table data is imported even if the statement to create the table fails (for example, because the table already exists). With this parameter, data for such tables is skipped. This behavior is useful if the target database already contains the desired table contents.

    This parameter takes effect only when you import data directly into a database, not when you output SQL scripts.

  • --no-security-labels

    This parameter is a reserved interface for function extension. It is not recommended.

  • --no-tablespaces

    Does not issue commands to select tablespaces. With this option, all objects will be created in the default tablespace during the import.

  • --section=SECTION

    Imports the listed sections (such as pre-data, data, or post-data).

  • --use-set-session-authorization

    Used for plain-text backups.

    Outputs the SET SESSION AUTHORIZATION statement instead of the ALTER OWNER statement to determine object ownership. This makes the dump more standards-compatible. If the history of the objects in the exported file is problematic, the import may fail. Only administrators can dump data using the SET SESSION AUTHORIZATION statement, and they must manually change and verify the passwords referenced by the SET SESSION AUTHORIZATION statements in the exported file before the import. The ALTER OWNER statement requires lower permissions.

NOTICE:

  • If the template1 database in your installation has any local additions, restore the output of gs_restore into an empty database with caution. Otherwise, you are likely to obtain errors due to duplicate definitions of the added objects. To create an empty database without any local additions, copy data from template0 rather than template1. Example:
CREATE DATABASE foo WITH TEMPLATE template0;
  • gs_restore cannot import large objects selectively. For example, it can only import the objects of a specified table. If an archive contains large objects, all large objects will be imported, or none of them will be restored if they are excluded by using -L, -t, or other parameters.

NOTE:

  1. The -d/--dbname and -f/--file parameters do not coexist.
  2. The -s/--schema-only and -a/--data-only parameters do not coexist.
  3. The -c/--clean and -a/--data-only parameters do not coexist.
  4. When --single-transaction is used, -j/--jobs must be a single job.
  5. --role must be used in conjunction with --rolepassword.

Connection parameters:

  • -h, --host=HOSTNAME

    Specifies the host name. If the value begins with a slash (/), it is used as the directory for the UNIX domain socket. The default value is taken from the PGHOST environment variable. If it is not set, a UNIX domain socket connection is attempted.

    This parameter is used only for defining names of the hosts outside openGauss. The names of the hosts inside openGauss must be 127.0.0.1.

  • -p, --port=PORT

    TCP port or the local Unix-domain socket file extension on which the server is listening for connections. The default value is the PGPORT environment variable.

    If the thread pool function is enabled, you are advised to use pooler port, that is, the listening port number plus 1.

  • -U, --username=NAME

    Specifies the user name to connect as.

  • -w, --no-password

    Never issues a password prompt. The connection attempt fails if the host requires password verification and the password is not provided in other ways. This parameter is useful in batch jobs and scripts in which no user password is required.

  • -W, --password=PASSWORD

    User password for database connection. If the host uses the trust authentication policy, the administrator does not need to enter the -W parameter. If the -W parameter is not provided and you are not a system administrator, gs_restore will ask you to enter a password.

  • --role=ROLENAME

    Specifies a role name for the import operation. If this parameter is selected, the SET ROLE statement will be issued after gs_restore connects to the database. It is useful when the authenticated user (specified by -U) lacks the permissions required by gs_restore. This parameter allows the user to switch to a role with the required permissions. Some installations have a policy against logging in directly as the initial user. This parameter allows data to be imported without violating the policy.

  • --rolepassword=ROLEPASSWORD

    Specifies the password of the role.

Example

Special case: Execute the gsql tool. Run the following commands to import the MPPDB_backup.sql file in the export folder (in plain-text format) generated by gs_dump/gs_dumpall to the postgres database:

gsql -d postgres -p 5432 -W Bigdata@123 -f /home/omm/test/MPPDB_backup.sql
SET
SET
SET
SET
SET
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
CREATE INDEX
CREATE INDEX
CREATE INDEX
SET
CREATE INDEX
REVOKE
REVOKE
GRANT
GRANT
total time: 30476  ms

gs_restore is used to import the files exported by gs_dump.

Example 1: Execute the gs_restore tool to import the exported MPPDB_backup.dmp file (custom format) to the postgres database.

gs_restore -W Bigdata@123 backup/MPPDB_backup.dmp -p 5432 -d postgres
gs_restore: restore operation successful
gs_restore: total time: 13053  ms

Example 2: Execute the gs_restore tool to import the exported MPPDB_backup.tar file (.tar format) to the postgres database.

gs_restore backup/MPPDB_backup.tar -p 5432 -d postgres 
gs_restore[2017-07-21 19:16:26]: restore operation successful
gs_restore[2017-07-21 19:16:26]: total time: 21203  ms

Example 3: Execute the gs_restore tool to import the exported MPPDB_backup file (directory format) to the postgres database.

gs_restore backup/MPPDB_backup -p 5432 -d postgres
gs_restore[2017-07-21 19:16:26]: restore operation successful
gs_restore[2017-07-21 19:16:26]: total time: 21003  ms

Example 4: Execute the gs_restore tool and run the following commands to import the MPPDB_backup.dmp file (in custom format). Specifically, import all the object definitions and data in the PUBLIC schema. Existing objects are deleted from the target database before the import. If an existing object references an object in another schema, you need to manually delete the referenced object first.

gs_restore backup/MPPDB_backup.dmp -p 5432 -d postgres -e -c -n PUBLIC
gs_restore: [archiver (db)] Error while PROCESSING TOC:
gs_restore: [archiver (db)] Error from TOC entry 313; 1259 337399 TABLE table1 gaussdba
gs_restore: [archiver (db)] could not execute query: ERROR:  cannot drop table table1 because other objects depend on it
DETAIL:  view t1.v1 depends on table table1
HINT:  Use DROP ... CASCADE to drop the dependent objects too.
    Command was: DROP TABLE public.table1;

Manually delete the referenced object and create it again after the import is complete.

gs_restore backup/MPPDB_backup.dmp -p 5432 -d postgres -e -c -n PUBLIC
gs_restore[2017-07-21 19:16:26]: restore operation successful
gs_restore[2017-07-21 19:16:26]: total time: 2203  ms

Example 5: Execute the gs_restore tool and run the following commands to import the MPPDB_backup.dmp file (in custom format). Specifically, import only the definition of table1 in the PUBLIC schema.

gs_restore backup/MPPDB_backup.dmp -p 5432 -d postgres -e -c -s -n PUBLIC -t table1
gs_restore[2017-07-21 19:16:26]: restore operation successful
gs_restore[2017-07-21 19:16:26]: total time: 21000  ms

Example 6: Execute the gs_restore tool and run the following commands to import the MPPDB_backup.dmp file (in custom format). Specifically, import only the data of table1 in the PUBLIC schema.

gs_restore backup/MPPDB_backup.dmp -p 5432 -d postgres -e -a -n PUBLIC -t table1
gs_restore[2017-07-21 19:16:26]: restore operation successful
gs_restore[2017-07-21 19:16:26]: total time: 20203  ms