Backup and Restoration

Overview

For database security purposes, openGauss provides two backup types (logical and physical), multiple backup and restoration solutions, and data reliability assurance mechanisms.

Backup and restoration can be logically or physically performed.

  • Logical backup and restoration: backs up data by logically exporting it. This method dumps the data as of the backup time and can restore data only to that point; changes made between the last backup and a failure cannot be recovered. It suits scenarios where data rarely changes, and data damaged by misoperation can be quickly restored from a logical backup. To restore all data in a database from a logical backup, rebuild the database and import the backup data. Logical backup is not recommended for databases requiring high availability because restoration takes a long time. Because it can be performed on any platform, logical backup is a major approach to migrating and transferring data.

  • Physical backup and restoration: copies physical files, in units of disk blocks, to back up a database. The database can be restored from backup files such as data files and archived log files. Physical backup is usually used for full backup; it backs up and restores data quickly and, if properly planned, at low cost.

    The two data backup and restoration solutions supported by openGauss are compared in Table 1. The method for restoring data in case of an exception differs between them.

    Table 1 Comparison between logical and physical backup and restoration

    | Backup Type | Application Scenario | Media | Advantages and Disadvantages |
    | --- | --- | --- | --- |
    | Logical backup and restoration | Small volume of data needs to be processed. Currently, it is used for the backup and restoration of one or more tables. | Disk, SSD | Backing up a small amount of data (for specified objects) is efficient and flexible. For a large volume of data, backup takes a long time. |
    | Physical backup and restoration | Huge volume of data needs to be processed. It is mainly used for full backup and restoration as well as the backup and restoration of all WAL archive and run logs in the database. | Disk, SSD | Backing up a huge amount of data is efficient. |

    While backing up and restoring data, take the following aspects into consideration:

    • Whether the impact of data backup on services is acceptable

    • Database restoration efficiency

      To minimize the impact of database faults, keep the restoration duration as short as possible to achieve the highest restoration efficiency.

    • Data restorability

      Minimize data loss when the database becomes unavailable.

    • Database restoration cost

    Many factors need to be considered when selecting a backup policy in a production environment, such as the objects to back up, the data volume, and the network configuration. Table 2 lists available backup policies and the applicable scenario for each.

    Table 2 Backup policies and scenarios

    | Backup Policy | Key Performance Factors | Typical Data Volume | Performance Specifications |
    | --- | --- | --- | --- |
    | Cluster backup | Data amount; network configuration | Data volume: PB level. Object quantity: about 1 million | Backup: data transfer rate on each host: 80 Mbit/s (NBU/EISOO+Disk); disk I/O rate (SSD/HDD): about 90% |
    | Table backup | Schema where the table to be backed up resides; network configuration (NBU) | Data volume: 10 TB level | Backup: depends on query performance and I/O rate |

    NOTE:

    For multi-table backup, the backup time is calculated as follows:

    Total time = Number of tables x Starting time + Total data volume/Data backup speed

    In the preceding information:

    • The starting time of a disk is about 5s. The starting time of an NBU is longer than that of a disk (depending on the NBU deployment).
    • The data backup speed is about 50 MB/s on a single node. (The speed is evaluated based on the backup of a 1 GB table from a physical host to a local disk.)

    The smaller the table is, the lower the backup performance will be.
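
    For example (an illustrative calculation using the figures above): backing up 100 tables totaling 100 GB to a local disk takes approximately 100 x 5s + 100 GB/(50 MB/s) ≈ 500s + 2048s, that is, roughly 42 minutes.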

Physical Backup and Restoration

gs_basebackup

Background

After openGauss is deployed, problems and exceptions may occur during database running. gs_basebackup, provided by openGauss, is used to perform basic physical backup. gs_basebackup copies the binary files of the database on the server using a replication protocol. To remotely execute gs_basebackup, you need to use the system administrator account. gs_basebackup supports hot backup and compressed backup.

NOTE:

  • gs_basebackup supports only full backup.

  • gs_basebackup supports hot backup and compressed backup.

  • gs_basebackup cannot back up a tablespace that uses an absolute path to the same server, because an absolute path is unique on a machine and restoring the tablespace there would cause a conflict. However, such tablespaces can be backed up to a different machine.

  • If the functions of incremental checkpoint and dual-write are enabled, gs_basebackup also backs up dual-write files.

  • If the pg_xlog directory is a soft link, no soft link is created during backup. Data is directly backed up to the pg_xlog directory in the destination path.

  • If the backup permission is revoked during the backup, the backup may fail or the backup data may be unavailable.

  • openGauss does not support version upgrade.

Prerequisites

  • The openGauss database can be connected.

  • User permissions are not revoked during the backup.

  • In the pg_hba.conf file, the replication connection is allowed and the connection is established by a system administrator.

  • If the Xlog transmission mode is stream, max_wal_senders must be set to at least 1.

  • If the Xlog transmission mode is fetch, the wal_keep_segments parameter must be set to a large value so that logs are not removed before the backup ends.

  • During the restoration, backup files must exist in the backup directory on all nodes. If backup files are lost on any node, copy them to that node from another node.

Syntax

  • Display help information.

    gs_basebackup -? | --help
    
  • Display version information.

    gs_basebackup -V | --version
    

Parameter Description

The gs_basebackup tool can use the following types of parameters:

  • -D directory

    Directory for storing backup files. This parameter is mandatory.

  • Common parameters

    • -c, --checkpoint=fast|spread

      Sets the checkpoint mode to fast or spread (default).

    • -l, --label=LABEL

      Adds tags for the backup.

    • -P, --progress

      Enables the progress report.

    • -v, --verbose

      Enables the verbose mode.

    • -V, --version

      Prints the version and exits.

    • -?, --help

      Displays gs_basebackup command parameters.

    • -T, --tablespace-mapping=olddir=newdir

      During the backup, the tablespace in the olddir directory is relocated to the newdir directory. For this to take effect, olddir must exactly match the path where the tablespace is located (but it is not an error if the backup does not contain the tablespaces in olddir). olddir and newdir must be absolute paths. If a path happens to contain an equal sign (=), you can escape it with a backslash (\). This option can be used multiple times for multiple tablespaces.

    • -F, --format=plain|tar

      Sets the output format to plain (default) or tar. If this parameter is not set, the default value --format=plain is used. The plain format writes the output as flat files, using the same layout as the current data directory and tablespaces. When the cluster has no extra tablespace, the entire database is placed in the target directory. If the cluster contains additional tablespaces, the primary data directory is placed in the target directory, but all other tablespaces are placed in the same absolute paths as on the server. The tar mode writes the output as tar files in the target directory. The primary data directory is written to a file named base.tar, and each other tablespace is written to a file named after its OID. The generated .tar package must be unpacked using the gs_tar command.

    • -X, --xlog-method=fetch|stream

      Sets the Xlog transmission mode. If this parameter is not set, the default value --xlog-method=stream is used. The required write-ahead log files (WALs) are included in the backup, including all WALs generated during the backup. In fetch mode, WAL files are collected at the end of the backup, so the wal_keep_segments parameter must be set to a value large enough that the logs are not removed before the backup ends; if a required log segment has been recycled by the time it is to be transmitted, the backup fails and is unusable. In stream mode, WALs are streamed while the backup is being created. This opens a second connection to the server and streams WALs in parallel with the backup, so it uses up to two of the connections configured by the max_wal_senders parameter. As long as the client keeps up with the WAL stream, no additional WALs need to be stored on the host.

    • -x, --xlog

      Equivalent to using -X with the fetch method.

    • -Z, --compress=level

      Enables gzip compression for the output of the tar file and sets the compression level (0 to 9, where 0 indicates no compression and 9 indicates the best compression). The compression is available only when the tar format is used. The suffix .gz is automatically added to the end of all .tar file names.

    • -z

      Enables gzip compression for tar file output and uses the default compression level. The compression is available only when the tar format is used. The suffix .gz is automatically added to the end of all .tar file names.

    • -t, --rw-timeout

      Sets the checkpoint time limit during the backup, in seconds. The default value is 120. If the full checkpoint of the database is time-consuming, increase the value of rw-timeout.

  • Connection parameters

    • -h, --host=HOSTNAME

      Specifies the host name of the machine on which the server is running or the directory for the Unix-domain socket.

    • -p, --port=PORT

      Specifies the port number of the database server.

      You can modify the default port number using this parameter.

    • -U, --username=USERNAME

      Specifies the user that connects to the database.

    • -s, --status-interval=INTERVAL

      Specifies the time for sending status packets to the server, in seconds.

    • -w, --no-password

      Never issues a password prompt.

    • -W, --password

      Issues a password prompt when the -U parameter is used to connect to a local or remote database.

Example

gs_basebackup -D /home/test/trunk/install/data/backup -h 127.0.0.1 -p 21233
INFO:  The starting position of the xlog copy of the full build is: 0/1B800000. The slot minimum LSN is: 0/1B800000.
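
A hedged variant (the backup directory, host, and port below are placeholders) that combines the documented options to produce a gzip-compressed tar-format backup while streaming WALs over a second connection:

gs_basebackup -D /home/test/backup_tar -F tar -z -X stream -h 127.0.0.1 -p 21233 -P -v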

Restoring Data from Backup Files

If a database is faulty, restore it from the backup files. Because gs_basebackup backs up the database in binary mode, you can directly copy the backup files to replace the original files, or start the database instance from the backup copy.

NOTE:

  • If the current database instance is running, a port conflict may occur when you start the database from the backup file. In this case, you need to modify the port parameter in the configuration file or specify a port when starting the database.
  • If the backup comes from a primary/standby deployment, you may need to modify the replication connections between the primary and standby databases, that is, replconninfo1 and replconninfo2 in the postgresql.conf file.

To restore the original database, perform the following steps:

  1. Stop the database server. For details, see Administrator Guide.
  2. Copy the original database and all tablespaces to another location for future use.
  3. Delete all or part of the files from the original database.
  4. Use the database system user rights to restore the required database files from the backup.
  5. If a link file exists in the database, modify the link file so that it can be linked to the correct file.
  6. Restart the database server and check the database content to ensure that the database is restored to the required status.

NOTE:

  • Incremental restoration from backup files is not supported.
  • After the restoration, check that the link file in the database is linked to the correct file.
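
The following sketch maps steps 1 to 6 to concrete commands for a single-node instance. It is a minimal illustration only: the data directory /gaussdb/data, the backup path, and the use of gs_ctl are assumptions to adapt to your environment.

# 1. Stop the database server.
gs_ctl stop -D /gaussdb/data
# 2. Keep the original data directory and tablespaces for future use.
mv /gaussdb/data /gaussdb/data.old
# 3-4. Restore the required database files as the database system user.
cp -r /home/test/trunk/install/data/backup /gaussdb/data
# 5. If link files (for example, pg_xlog) exist, point them at the correct targets.
# 6. Restart the server and verify the database content.
gs_ctl start -D /gaussdb/data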

gs_probackup

Background

gs_probackup is a tool used to manage openGauss database backup and restoration. It periodically backs up the openGauss instances so that the server can be restored when the database is faulty.

  • It supports the physical backup of a standalone database or a primary node in a database.
  • It supports the backup of contents in external directories, such as script files, configuration files, log files, and dump files.
  • It supports incremental backup, periodic backup, and remote backup.
  • It supports settings on the backup retention policy.

Prerequisites

  • The openGauss database can be connected.
  • To use PTRACK incremental backup, manually add enable_cbm_tracking = on to postgresql.conf.

Important Notes

  • The backup must be performed by the user who runs the database server.
  • The major version number of the database server to be backed up must be the same as that of the database server to be restored.
  • To back up a database in remote mode using SSH, install the database of the same major version on the local and remote hosts, and run the ssh-copy-id remote_user@remote_host command to set an SSH connection without a password between the local host backup user and the remote host database user.
  • In remote mode, only the subcommands add-instance, backup, and restore can be executed.
  • Before running the restore subcommand, stop the gaussdb process.

Command Description

  • Print the gs_probackup version.

    gs_probackup -V|--version
    gs_probackup version
    
  • Display brief information about the gs_probackup command. Alternatively, display details about parameters of a specified subcommand of gs_probackup.

    gs_probackup -?|--help
    gs_probackup help [command]
    
  • Initialize the backup directory in backup-path. The backup directory stores the contents that have been backed up. If the backup-path backup path exists, it must be empty.

    gs_probackup init -B backup-path [--help]
    
  • Initialize a new backup instance in the backup directory of backup-path and generate the pg_probackup.conf configuration file, which saves the gs_probackup settings for the specified data directory pgdata-path.

    gs_probackup add-instance -B backup-path -D pgdata-path --instance=instance_name
    [-E external-directories-paths]
    [remote_options]
    [--help]
    
  • Delete the backup content related to the specified instance from the backup-path directory.

    gs_probackup del-instance -B backup-path --instance=instance_name
    [--help]
    
  • Add the specified connection, compression, and log-related settings to the pg_probackup.conf configuration file or modify the existing settings. You are not advised to manually edit the pg_probackup.conf configuration file.

    gs_probackup set-config -B backup-path --instance=instance_name
    [-D pgdata-path] [-E external-directories-paths] [--archive-timeout=timeout]
    [--retention-redundancy=retention-redundancy] [--retention-window=retention-window] [--wal-depth=wal-depth]
    [--compress-algorithm=compress-algorithm] [--compress-level=compress-level]
    [-d dbname] [-h hostname] [-p port] [-U username]
    [logging_options] [remote_options]
    [--help]
    
  • Add the backup-related settings to the backup.control configuration file or modify the settings.

    gs_probackup set-backup -B backup-path --instance=instance_name -i backup-id
    [--note=text] [pinning_options]
    [--help]
    
  • Display the content of the pg_probackup.conf configuration file in the backup directory. You can specify --format=json to display the information in JSON format. By default, the plain text format is used.

    gs_probackup show-config -B backup-path --instance=instance_name
    [--format=plain|json]
    [--help]
    
  • Display the contents of the backup directory. If instance_name and backup_id are specified, detailed information about the backup is displayed. You can specify --format=json to display the information in JSON format. By default, the plain text format is used.

    gs_probackup show -B backup-path
    [--instance=instance_name [-i backup-id]] [--archive] [--format=plain|json]
    [--help]
    
  • Create a backup for a specified database instance.

    gs_probackup backup -B backup-path --instance=instance_name -b backup-mode
    [-D pgdata-path] [-C] [-S slot-name] [--temp-slot] [--backup-pg-log] [-j threads_num] [--progress]
    [--no-validate] [--skip-block-validation] [-E external-directories-paths] [--no-sync] [--note=text]
    [--archive-timeout=timeout]
    [logging_options] [retention_options] [compression_options]
    [connection_options] [remote_options] [pinning_options]
    [--help]
    
  • Restore a specified instance from a backup copy in the backup-path directory. If a backup ID is specified, gs_probackup restores that backup to the specified recovery target; otherwise, the latest backup of the instance is used.

    gs_probackup restore -B backup-path --instance=instance_name
    [-D pgdata-path] [-i backup_id] [-j threads_num] [--progress] [--force] [--no-sync] [--no-validate] [--skip-block-validation]
    [--external-mapping=OLDDIR=NEWDIR] [-T OLDDIR=NEWDIR] [--skip-external-dirs] [-I incremental_mode]
    [recovery_options] [remote_options] [logging_options]
    [--help]
    
  • Merge all incremental backups between the specified incremental backup and its parent full backup into the parent full backup. The parent full backup receives all the merged data, and the merged incremental backups are deleted as redundant.

    gs_probackup merge -B backup-path --instance=instance_name -i backup_id
    [-j threads_num] [--progress] [logging_options]
    [--help]
    
  • Delete a specified backup or delete backups that do not meet the current retention policy.

    gs_probackup delete -B backup-path --instance=instance_name
    [-i backup-id | --delete-expired | --merge-expired | --status=backup_status]
    [--delete-wal] [-j threads_num] [--progress]
    [--retention-redundancy=retention-redundancy] [--retention-window=retention-window]
    [--wal-depth=wal-depth] [--dry-run]
    [logging_options]
    [--help]
    
  • Verify that all files required for restoring the database exist and are not damaged. If instance_name is not specified, gs_probackup verifies all available backups in the backup directory. If instance_name is specified and no additional options are given, gs_probackup verifies all available backups for this backup instance. If both instance_name and backup-id or recovery-target options are specified, gs_probackup checks whether these options can be used to restore the database.

    gs_probackup validate -B backup-path
    [--instance=instance_name] [-i backup-id]
    [-j threads_num] [--progress] [--skip-block-validation]
    [--recovery-target-time=time | --recovery-target-xid=xid | --recovery-target-lsn=lsn | --recovery-target-name=target-name]
    [--recovery-target-inclusive=boolean]
    [logging_options]
    [--help] 
    

Parameter Description

Common parameters

  • command

    Specifies subcommands except version and help: init, add-instance, del-instance, set-config, set-backup, show-config, show, backup, restore, merge, delete, and validate.

  • -?, --help

    Displays help information about the command line parameters of gs_probackup and exits.

    Only --help can be used in subcommands; -? is forbidden.

  • -V, --version

    Prints the gs_probackup version and exits.

  • -B backup-path, --backup-path=backup-path

    Backup path.

    System environment variable: $BACKUP_PATH

  • -D pgdata-path, --pgdata=pgdata-path

    Path of the data directory.

    System environment variable: $PGDATA

  • --instance=instance_name

    Instance name.

  • -i backup-id, --backup-id=backup-id

    Unique identifier of a backup.

  • --format=format

    Specifies the format of the backup information to be displayed. The plain and JSON formats are supported.

    Default value: plain

  • --status=backup_status

    Deletes all backups in a specified state. The states are as follows:

    • OK: Backup is complete and valid.
    • DONE: Backup has been completed but not verified.
    • RUNNING: Backup is in progress.
    • MERGING: Backups are being merged.
    • DELETING: Backup is being deleted.
    • CORRUPT: Some backup files are damaged.
    • ERROR: Backup fails due to an unexpected error.
    • ORPHAN: Backup is invalid because one of its parent backups is corrupted or lost.
  • -j threads_num, --threads=threads_num

    Sets the number of concurrent threads for the backup, restoration, and combination processes.

  • --archive

    Displays WAL archiving information.

  • --progress

    Displays progress.

  • --note=text

    Adds a note to the backup.

Backup-related parameters

  • -b backup-mode, --backup-mode=backup-mode

    Specifies the backup mode. The value can be FULL or PTRACK.

    FULL: creates a full backup. The full backup contains all data files.

    PTRACK: creates a PTRACK incremental backup.

  • -C, --smooth-checkpoint

    Spreads out checkpoints over a period of time. By default, gs_probackup attempts to complete checkpoints as soon as possible.

  • -S slot-name, --slot=slot-name

    Specifies the replication slot for WAL stream processing.

  • --temp-slot

    Creates a temporary physical replication slot for WAL stream processing in the backup instance to ensure that all required WAL segments are still available during the backup.

    The default slot name is pg_probackup_slot, which can be changed using the --slot/-S option.

  • --backup-pg-log

    Includes the log directory in the backup. This directory typically contains log messages. By default, the log directory is not included.

  • -E external-directories-paths, --external-dirs=external-directories-paths

    Includes the specified directory in the backup. This option is useful for backing up scripts, SQL dumps, and configuration files kept in external data directories. To back up multiple external directories, separate their paths with colons (:) on Unix.

    Example: -E /tmp/dir1:/tmp/dir2

  • --skip-block-validation

    Disables block-level verification to speed up backup.

  • --no-validate

    Skips the automatic verification when the backup is complete.

  • --no-sync

    Disables backup file synchronization to the disk.

  • --archive-timeout=timeout

    Specifies the timeout interval for streaming processing, in seconds.

    Default value: 300

  • -t rwtimeout

    Specifies the timeout period for connecting to the database, in seconds.

    Default value: 120

Restoration-related parameters

  • -I, --incremental-mode=none|checksum|lsn

    Reuses the valid pages available in PGDATA if they are not modified.

    Default value: none

  • --external-mapping=OLDDIR=NEWDIR

    During restoration, the external directory contained in the backup is moved from OLDDIR to NEWDIR. OLDDIR and NEWDIR must be absolute paths. If a path contains an equal sign (=), escape it with a backslash (\). This option can be specified multiple times for multiple directories.

  • -T OLDDIR=NEWDIR, --tablespace-mapping=OLDDIR=NEWDIR

    Relocates the tablespace from the OLDDIR directory to the NEWDIR directory during the restoration. OLDDIR and NEWDIR must be absolute paths. If the path contains an equal sign (=), use a backslash (\) to escape. This parameter can be specified multiple times for multiple tablespaces. This parameter must be used together with --external-mapping.

  • --skip-external-dirs

    Skips the external directories in the backup that are specified using the --external-dirs option. The contents of these directories will not be restored.

  • --skip-block-validation

    Skips block-level verification to speed up verification. During the automatic verification before the restoration, only file-level verification is performed.

  • --no-validate

    Skips the backup verification.

  • --force

    Allows ignoring an invalid backup status. Use this flag if data needs to be restored from a damaged or invalid backup. Exercise caution when using it.

Recovery objective-related parameters (recovery_options)

NOTE: Currently, PITR based on continuous WAL archiving cannot be configured, so the use of the following parameters is restricted. To use continuously archived WAL logs for PITR, perform the following steps instead:

  1. Replace the target database directory with the physical backup files.
  2. Delete all files in the database directory pg_xlog/.
  3. Copy the archived WAL log files to the pg_xlog/ directory. (Alternatively, configure restore_command in the recovery.conf file to skip this step.)
  4. Create the recovery command file recovery.conf in the database directory and specify the database recovery degree.
  5. Start the database.
  6. Connect to the database and check whether the database is recovered to the expected status. If the expected status is reached, run the pg_xlog_replay_resume() command so that the primary node can provide services externally.
  • --recovery-target-lsn=lsn

    Specifies the LSN to restore to. Currently, only the backup stop LSN can be specified.

  • --recovery-target-name=target-name

    Specifies the named savepoint to restore to. You can obtain the savepoint name from the recovery-name column of the backup.

  • --recovery-target-time=time

    Specifies the time to restore to. Currently, only recovery-time can be specified.

  • --recovery-target-xid=xid

    Specifies the transaction ID to restore to. Currently, only recovery-xid can be specified.

  • --recovery-target-inclusive=boolean

    When this parameter is set to true, the recovery target will include the specified content.

    When this parameter is set to false, the recovery target will not include the specified content.

    This parameter must be used together with --recovery-target-name, --recovery-target-time, --recovery-target-lsn, or --recovery-target-xid.
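
For example, a restoration to a named savepoint using the documented options (the backup directory, instance name, backup ID, and savepoint name are placeholders):

    gs_probackup restore -B backup_dir --instance=instance_name -i backup_id --recovery-target-name=savepoint1 --recovery-target-inclusive=true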

Retention-related parameters (retention_options)

NOTE: The following parameters can be used together with the backup and delete commands.

  • --retention-redundancy=retention-redundancy

    Specifies the number of full backups to retain in the data directory. The value must be a positive integer. The value 0 indicates that the setting is disabled.

    Default value: 0

  • --retention-window=retention-window

    Specifies the retention period. The value must be a positive integer. The value 0 indicates that the setting is disabled.

    Default value: 0

  • --wal-depth=wal-depth

    Specifies the number of latest valid backups that must be retained on each timeline to support PITR. The value must be a positive integer. The value 0 indicates that the setting is disabled.

    Default value: 0

  • --delete-wal

    Deletes unnecessary WAL files from any existing backup.

  • --delete-expired

    Deletes the backups that do not comply with the retention policy defined in the pg_probackup.conf configuration file.

  • --merge-expired

    Merges the oldest incremental backup that meets the retention policy requirements with its expired parent backup.

  • --dry-run

    Displays the status of all available backups. Expired backups will not be deleted or merged.

Fixed backup-related parameters (pinning_options)

NOTE: To exclude certain backups from the established retention policy, you can use the following parameters with the backup and set-backup commands.

  • --ttl=interval

    Pins the backup for the specified amount of time, counted from the recovery time of the backup. The value must be a positive integer. The value 0 unpins the backup.

    Supported unit: ms, s, min, h, d (default value: s)

    For example, --ttl=30d.

  • --expire-time=time

    Specifies the timestamp up to which the backup is pinned. The timestamp must comply with the ISO 8601 standard.

    For example, --expire-time='2020-01-01 00:00:00+03'.
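
For example, to pin a backup for 30 days, or until a fixed date, using the documented options (the backup directory, instance name, and backup ID are placeholders):

    gs_probackup set-backup -B backup_dir --instance=instance_name -i backup_id --ttl=30d
    gs_probackup set-backup -B backup_dir --instance=instance_name -i backup_id --expire-time='2020-01-01 00:00:00+03'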

Log-related parameters (logging_options)

Log levels: verbose, log, info, warning, error, and off.

  • --log-level-console=log-level-console

    Sets the level of logs to be sent to the console. Each level contains all the levels following it. A higher level indicates fewer messages sent. If this parameter is set to off, the log recording function of the console is disabled.

    Default value: info

  • --log-level-file=log-level-file

    Sets the level of logs to be sent to the log file. Each level contains all the levels following it. A higher level indicates fewer messages sent. If this parameter is set to off, the log file recording function is disabled.

    Default value: off

  • --log-filename=log-filename

    Specifies the name of the log file to be created. The file name can use the strftime mode. Therefore, %-escapes can be used to specify the file name that changes with time.

    For example, if the pg_probackup-%u.log pattern is specified, pg_probackup generates one log file for each day of the week, with %u replaced by the corresponding decimal number: pg_probackup-1.log for Monday, pg_probackup-2.log for Tuesday, and so on.

    This parameter is valid if the --log-level-file parameter is specified to enable log file recording.

    Default value: "pg_probackup.log"

  • --error-log-filename=error-log-filename

    Specifies the name of the log file that is used only for error logs. The specifying method is the same as that of the --log-filename parameter.

    It is used for troubleshooting and monitoring.

  • --log-directory=log-directory

    Specifies the directory where log files are created. The value must be an absolute path. This directory is created when the first log is written.

    Default value: $BACKUP_PATH/log

  • --log-rotation-size=log-rotation-size

    Specifies the maximum size of a log file. If the maximum size is reached, the log file is rotated after the gs_probackup command is executed. The help and version commands do not trigger log rotation. The value 0 disables size-based rotation.

    The unit can be KB, MB, GB, or TB. The default unit is KB.

    Default value: 0

  • --log-rotation-age=log-rotation-age

    Specifies the maximum life cycle of a log file. If the maximum age is reached, the log file is rotated after the gs_probackup command is executed. The help and version commands do not trigger log rotation. The $BACKUP_PATH/log/log_rotation directory stores the creation time of the last log file. The value 0 disables time-based rotation.

    Supported unit: ms, s, min, h, d (default value: min)

    Default value: 0
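
As an illustration, the following backup command writes info-level messages to a log file rotated weekly (all values are examples only):

    gs_probackup backup -B backup_dir --instance=instance_name -b FULL --log-level-file=info --log-filename=pg_probackup-%u.log --log-rotation-age=7d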

Connection-related parameters (connection_options)

NOTE: The following parameters can be used together with the backup command.

  • -d dbname, --pgdatabase=dbname

    Specifies the name of the database to connect to. This connection is only used to manage the backup process. Therefore, you can connect to any existing database. If this parameter is not specified in the command line, the PGDATABASE environment variable, or the pg_probackup.conf configuration file, gs_probackup attempts to obtain the value from the PGUSER environment variable. If the PGUSER variable is not set, the value is obtained from the current user name.

    System environment variable: $PGDATABASE

  • -h hostname, --pghost=hostname

    Specifies the host name of the system on which the server is running. If the value begins with a slash (/), it is used as the directory for the UNIX domain socket.

    System environment variable: $PGHOST

    Default value: local socket

  • -p port, --pgport=port

    Specifies the TCP port or local Unix domain socket file name extension on which the server is listening for connections.

    System environment variable: $PGPORT

    Default value: 5432

  • -U username, --pguser=username

    Specifies the user name to connect as.

    System environment variable: $PGUSER

  • -w, --no-password

    Never issues a password prompt. The connection attempt fails if the host requires password verification and the password is not provided in other ways. This parameter is useful in batch jobs and the scripts that require no user password.

  • -W password, --password=password

    User password for database connection. If the host uses the trust authentication policy, the administrator does not need to enter the -W parameter. If the -W parameter is not provided and you are not a system administrator, the system will ask you to enter a password.
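
A hedged example of a backup that uses explicit connection options (the host, port, and user name are placeholders):

    gs_probackup backup -B backup_dir --instance=instance_name -b FULL -d postgres -h 127.0.0.1 -p 5432 -U backup_user -w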

Compression-related parameters (compression_options)

NOTE: The following parameters can be used together with the backup command.

  • --compress-algorithm=compress-algorithm

    Specifies the algorithm used to compress data files.

    The value can be zlib, pglz, or none. If zlib or pglz is set, compression is enabled. By default, the compression function is disabled.

    Default value: none

  • --compress-level=compress-level

    Specifies the compression level. Value range: 0-9

    • 0 indicates no compression.
    • 1 indicates that the compression ratio is the lowest and processing speed the fastest.
    • 9 indicates that the compression ratio is the highest and processing speed the slowest.
    • This parameter can be used together with --compress-algorithm.

    Default value: 1

  • --compress

    Compresses with --compress-algorithm=zlib and --compress-level=1.
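
For example, a zlib-compressed full backup (compression level 6 is chosen purely for illustration), followed by the --compress shorthand, which implies --compress-algorithm=zlib and --compress-level=1:

    gs_probackup backup -B backup_dir --instance=instance_name -b FULL --compress-algorithm=zlib --compress-level=6
    gs_probackup backup -B backup_dir --instance=instance_name -b FULL --compress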

Remote mode-related parameters (remote_options)

NOTE: The following are parameters that remotely run gs_probackup through SSH, and can be used together with the add-instance, set-config, backup, and restore commands.

  • --remote-proto=protocol

    Specifies the protocol used for remote operations. Currently, only the SSH protocol is supported. Valid value:

    ssh: enables the remote backup mode through SSH. This is the default value.

    none: explicitly disables the remote mode.

    If --remote-host is specified, this parameter can be omitted.

  • --remote-host=destination

    Specifies the IP address or host name of the remote host to be connected.

  • --remote-port=port

    Specifies the port number of the remote host to be connected.

    Default value: 22

  • --remote-user=username

    Specifies the remote host user for SSH connection. If this parameter is not specified, the user who initiates the SSH connection is used.

    Default value: the current user.

  • --remote-path=path

    Specifies the installation directory of gs_probackup in the remote system.

    Default value: current path

  • --ssh-options=ssh_options

    Specifies the character string of the SSH command line parameter.

    Example: --ssh-options='-c cipher_spec -F configfile'

    NOTE:

    • If the server does not respond due to a temporary network fault, gs_probackup exits after waiting for archive-timeout seconds (300 seconds by default).
    • If the LSN of the standby server is different from that of the primary server, the database continuously updates the following log information. In this case, you need to rebuild the standby server.
    LOG: walsender thread shut down
    LOG: walsender thread started
    LOG: received wal replication command: IDENTIFY_VERSION
    LOG: received wal replication command: IDENTIFY_MODE
    LOG: received wal replication command: IDENTIFY_SYSTEM
    LOG: received wal replication command: IDENTIFY_CONSISTENCE 0/D0002D8
    LOG: remote request lsn/crc: [xxxxx] local max lsn/crc: [xxxxx]
    

Backup Process

  1. Initialize the backup directory. Create the backups/ and wal/ subdirectories in the specified directory to store backup files and WAL files respectively.

    gs_probackup init -B backup_dir
    
  2. Add a new backup instance. gs_probackup can store backups of multiple database instances in the same backup directory.

    gs_probackup add-instance -B backup_dir -D data_dir --instance instance_name
    
  3. Create a backup for a specified database instance. Before performing an incremental backup, you must create at least one full backup.

    gs_probackup backup -B backup_dir --instance instance_name -b backup_mode
    
  4. Restore data from the backup of a specified DB instance.

    gs_probackup restore -B backup_dir --instance instance_name -D pgdata-path -i backup_id
    

Troubleshooting

| Problem Description | Cause and Solution |
| --- | --- |
| ERROR: query failed: ERROR: canceling statement due to conflict with recovery | Cause: An operation on the standby node is accessing a storage row that has been modified or deleted on the primary node, and the Xlog is being replayed on the standby node, so the operation on the standby node is canceled. Solution: 1. Increase the values of max_standby_archive_delay and max_standby_streaming_delay. 2. Add the configuration item hot_standby_feedback = on. |

PITR Recovery

Background

When a database breaks down or needs to be rolled back to a previous state, the point-in-time recovery (PITR) function of openGauss can be used to restore the database to any point in time after the backup and archive data is generated.

NOTE:

  • PITR can only be restored to a point in time after the physical backup data is generated.
  • Only the primary node can be restored using PITR. The standby node needs to be fully built to synchronize data with the primary node.

Prerequisites

  • Full data files have been physically backed up.
  • WAL log files have been archived.

PITR Recovery Process

  1. Replace the target database directory with the physical backup files.
  2. Delete all files in the database directory pg_xlog/.
  3. Copy the archived WAL log files to the pg_xlog/ directory. (Alternatively, configure restore_command in the recovery.conf file to skip this step.)
  4. Create the recovery command file recovery.conf in the database directory and specify the database recovery degree.
  5. Start the database.
  6. Connect to the database and check whether the database is recovered to the expected status.
  7. If the expected status is reached, run the pg_xlog_replay_resume() command so that the primary node can provide services externally.
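
A hedged shell sketch of this process for a single node; all paths are illustrative, and the physical backup is assumed to be available at /backup/base:

    # 1-2. Replace the data directory with the physical backup and clear pg_xlog/.
    cp -r /backup/base /gaussdb/data
    rm -rf /gaussdb/data/pg_xlog/*
    # 3. Copy archived WALs back (or configure restore_command instead; see below).
    cp /mnt/server/archivedir/* /gaussdb/data/pg_xlog/
    # 4. Create recovery.conf in the data directory (see Configuring the recovery.conf File).
    # 5-7. Start the database, verify its state, then resume replay on the primary node.
    gs_ctl start -D /gaussdb/data
    gsql -d postgres -p port_number -c "select pg_xlog_replay_resume();"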

Configuring the recovery.conf File

Archive Recovery Configuration

  • restore_command = string

    This shell command is used to obtain the archived WAL files in the WAL file series. Any %f in the string is replaced by the name of the file to retrieve from the archive, and any %p is replaced by the path name to copy it to on the server. Any %r is replaced by the name of the file containing the last valid restart point.

    For example:

    restore_command = 'cp /mnt/server/archivedir/%f %p'

  • archive_cleanup_command = string

    This optional parameter declares a shell command that is executed each time the system is restarted. archive_cleanup_command provides a mechanism for deleting unnecessary archived WAL files from the standby database. Any %r is replaced by the name of the file containing the last valid restart point; that is the earliest file that must be kept to allow recovery to be restartable, so all files older than %r can be safely removed.

    For example:

    archive_cleanup_command = 'pg_archivecleanup /mnt/server/archivedir %r'

    If multiple standby servers need to be recovered from the same archive path, ensure that WAL files are not deleted from any standby server before the recovery.

  • recovery_end_command = string

    This optional parameter declares a shell command that is executed only when the recovery is complete. recovery_end_command provides a cleanup mechanism for future replication and recovery.

Recovery Object Configuration

  • recovery_target_name = string

    This parameter declares recovery to the named restore point created using pg_create_restore_point().

    For example:

    recovery_target_name = 'restore_point_1'

  • recovery_target_time = timestamp

    This parameter declares recovery to the specified timestamp.

    For example:

    recovery_target_time = '2020-01-01 12:00:00'

  • recovery_target_xid = string

    This parameter declares recovery to the specified transaction ID.

    For example:

    recovery_target_xid = '3000'

  • recovery_target_lsn = string

    This parameter declares recovery to the log location specified by the LSN.

    For example:

    recovery_target_lsn = '0/0FFFFFF'

  • recovery_target_inclusive = boolean

    This parameter declares whether the recovery stops just after the specified recovery target (true) or just before it (false). It applies only to the recovery targets recovery_target_time, recovery_target_xid, and recovery_target_lsn.

    For example:

    recovery_target_inclusive = true

NOTE:

  • Only one of the four configuration items recovery_target_name, recovery_target_time, recovery_target_xid, and recovery_target_lsn can be used at a time.
  • If no recovery targets are configured or the configured target does not exist, data is recovered to the latest WAL log point by default.
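
For example, a minimal recovery.conf that replays archived WALs and stops just after a given timestamp (the path and timestamp are illustrative):

    restore_command = 'cp /mnt/server/archivedir/%f %p'
    recovery_target_time = '2020-01-01 12:00:00'
    recovery_target_inclusive = true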

Logical Backup and Restoration

gs_dump

Background

gs_dump, provided by openGauss, is used to export database information. You can export a database or its objects (such as schemas, tables, and views). The database can be the default postgres database or a user-specified database.

gs_dump is executed by OS user omm.

When gs_dump is used to export data, other users can still access (read and write) openGauss databases.

gs_dump can export complete, consistent data. For example, if gs_dump is started to export database A at T1, data of the database at that time point will be exported, and modifications on the database after that time point will not be exported.

gs_dump can export database information to a plain-text SQL script file or archive file.

  • Plain-text SQL script: It contains the SQL statements required to restore the database. You can use gsql to execute the SQL script. With only a little modification, the SQL script can rebuild a database on other hosts or database products.
  • Archive file: It contains data required to restore the database. It can be a tar-, directory-, or custom-format archive. For details, see Table 1. The export result must be used with gs_restore to restore the database. The system allows users to select or even reorder the content to be imported.

Functions

gs_dump can create export files in four formats, which are specified by -F or --format=, as listed in Table 1.

Table 1 Formats of exported files

| Format | Value of -F | Description | Suggestion | Corresponding Import Tool |
| --- | --- | --- | --- | --- |
| Plain-text | p | A plain-text script file containing SQL statements and commands. The commands can be executed on gsql, a command line terminal, to recreate database objects and load table data. | You are advised to use plain-text export files for small databases. | Before using gsql to restore database objects, you can use a text editor to edit the plain-text export file as required. |
| Custom | c | A binary file that allows the restoration of all or selected database objects from an exported file. | You are advised to use custom-format archive files for medium or large databases. | You can use gs_restore to import database objects from a custom-format archive. |
| Directory | d | A directory containing directory files and the data files of tables and BLOB objects. | - | - |
| .tar | t | A tar-format archive that allows the restoration of all or selected database objects from an exported file. It cannot be further compressed and has an 8 GB limit on the size of a single table. | - | - |

NOTE:
To reduce the size of an exported file, you can use gs_dump to compress it to a plain-text file or custom-format file. By default, a plain-text file is not compressed when generated. When a custom-format archive is generated, a medium level of compression is applied by default. Archived exported files cannot be compressed using gs_dump. When a plain-text file is exported in compressed mode, gsql fails to import data objects.
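
For reference, the following hedged examples (the port and output names are placeholders) produce each of the formats in Table 1:

gs_dump -p port_number postgres -F p -f backup/db.sql
gs_dump -p port_number postgres -F c -f backup/db.dmp
gs_dump -p port_number postgres -F d -f backup/db_dir
gs_dump -p port_number postgres -F t -f backup/db.tar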

Precautions

Do not modify an exported file or its content. Otherwise, restoration may fail.

To ensure the data consistency and integrity, gs_dump acquires a share lock on a table to be dumped. If another transaction has acquired a share lock on the table, gs_dump waits until this lock is released and then locks the table for dumping. If the table cannot be locked within the specified time, the dump fails. You can customize the timeout duration to wait for lock release by specifying the --lock-wait-timeout parameter.

Syntax

gs_dump [OPTION]... [DBNAME]

NOTE:
DBNAME does not follow a short or long option. It specifies the database to be connected.
For example:
Specify DBNAME without a -d option preceding it.

gs_dump -p port_number postgres -f dump1.sql

or

export PGDATABASE=postgres
gs_dump -p port_number -f dump1.sql

Environment variable: PGDATABASE

Parameter Description

Common parameters

  • -f, --file=FILENAME

    Sends the output to the specified file or directory. If this parameter is omitted, the standard output is used. If the output format is custom, directory, or tar (-F c, -F d, or -F t), the -f parameter must be specified. If the value of -f contains a directory, the current user must have read and write permissions on the directory, and the directory must not already exist.

  • -F, --format=c|d|t|p

    Selects the exported file format. The format can be:

    • p|plain: Generates a text SQL script file. This is the default value.

    • c|custom: Outputs a custom-format binary archive to be used as the input of gs_restore. This is the most flexible output format: it allows manual selection and reordering of archived items during restoration. An archive in this format is compressed by default.

    • d|directory: Creates a directory containing directory files and the data files of tables and BLOBs.

    • t|tar: Outputs a .tar archive as the input of gs_restore. The .tar format is compatible with the directory format. Extracting a .tar archive generates a valid directory-format archive. However, the .tar archive cannot be further compressed and has an 8-GB limitation on the size of a single table. The order of table data items cannot be changed during restoration.

      A .tar archive can be used as input of gsql.

  • -v, --verbose

    Specifies the verbose mode. If it is specified, gs_dump writes detailed object comments and start/stop times to the dump file, and progress messages to standard error.

  • -V, --version

    Prints the gs_dump version and exits.

  • -Z, --compress=0-9

    Specifies the used compression level.

    Value range: 0-9

    • 0 indicates no compression.
    • 1 indicates that the compression ratio is the lowest and processing speed the fastest.
    • 9 indicates that the compression ratio is the highest and processing speed the slowest.

    For the custom-format archive, this option specifies the compression level of a single table data segment. By default, data is compressed at a medium level. The plain-text and .tar archive formats do not support compression currently.

  • --lock-wait-timeout=TIMEOUT

    Specifies the maximum time to wait for shared table locks at the beginning of the dump. If a table cannot be locked within the specified time, the dump fails. The timeout can be specified in any of the formats accepted by SET statement_timeout.

  • -?, --help

    Displays help about gs_dump parameters and exits.

Dump parameters:

  • -a, --data-only

    Dumps only the data, not the schema (data definitions). Table data, large objects, and sequence values are dumped.

  • -b, --blobs

    Specifies a reserved interface for function expansion. This parameter is not recommended.

  • -c, --clean

    Writes commands to clean (drop) database objects into the backup file before the commands that create them. (If no such objects exist in the target database, gs_restore may report some errors.)

    This parameter is used only for the plain-text format. For the archive format, you can specify the option when using gs_restore.

  • -C, --create

    Starts the backup file with the commands to create the database and connect to it. (When a script in this mode is executed, you can connect to any database to run the commands; the data is restored to the newly created database rather than the database you connected to.)

    This parameter is used only for the plain-text format. For the archive format, you can specify the option when using gs_restore.

  • -E, --encoding=ENCODING

    Creates a dump file in the specified character set encoding. By default, the dump file is created in the database encoding. (Alternatively, you can set the environment variable PGCLIENTENCODING to the required dump encoding.)

  • -n, --schema=SCHEMA

    Dumps only schemas matching the schema names. This option contains the schema and all its contained objects. If this option is not specified, all non-system schemas in the target database will be dumped. Multiple schemas can be selected by specifying multiple -n options. The schema parameter is interpreted as a pattern according to the same rules used by the \d command of gsql. Therefore, multiple schemas can also be selected by writing wildcard characters in the pattern. When you use wildcard characters, quote the pattern to prevent the shell from expanding the wildcard characters.

    NOTE:

    • If -n is specified, gs_dump does not dump any other database objects which the selected schemas might depend upon. Therefore, there is no guarantee that the results of a specific-schema dump can be automatically restored to an empty database.
    • If -n is specified, the non-schema objects are not dumped.

    Multiple schemas can be dumped. Entering -n schemaname multiple times dumps multiple schemas.

    For example:

    gs_dump -h host_name -p port_number postgres -f backup/bkp_shl2.sql -n sch1 -n sch2
    

    In the preceding example, sch1 and sch2 are dumped.

  • -N, --exclude-schema=SCHEMA

    Does not dump any schemas matching the schema pattern. The pattern is interpreted according to the same rules as for -n. -N can be specified multiple times to exclude schemas matching any of the specified patterns.

    When both -n and -N are specified, the schemas that match at least one -n option but no -N option are dumped. If -N is specified and -n is not, the schemas matching -N are excluded from an otherwise normal dump.

    Entering -N schemaname multiple times excludes multiple schemas during the dump.

    For example:

    gs_dump -h host_name -p port_number postgres -f backup/bkp_shl2.sql -N sch1 -N sch2
    

    In the preceding example, sch1 and sch2 will be excluded during the dumping.

  • -o, --oids

    Dumps object identifiers (OIDs) as parts of the data in each table. Use this option if your application references the OID columns in some way. If the preceding situation does not occur, do not use this parameter.

  • -O, --no-owner

    Does not output commands to set the ownership of objects to match the original database. By default, gs_dump issues ALTER OWNER or SET SESSION AUTHORIZATION statements to set the ownership of created database objects. These statements fail when the script is run unless it is started by a system administrator (or the same user that owns all of the objects in the script). To make a script that can be restored by any user and that gives that user ownership of all objects, specify -O.

    This parameter is used only for the plain-text format. For the archive format, you can specify the option when using gs_restore.

  • -s, --schema-only

    Dumps only the object definition (schema) but not data.

  • -S, --sysadmin=NAME

    Specifies a reserved interface for function expansion. This parameter is not recommended.

  • -t, --table=TABLE

    Specifies a list of tables, views, sequences, or foreign tables to be dumped. You can use multiple -t parameters or wildcard characters to specify tables.

    When you use wildcard characters, quote patterns to prevent the shell from expanding the wildcard characters.

    The -n and -N options have no effect when -t is used, because tables selected by using -t will be dumped regardless of those options.

    NOTE:

    • The number of -t parameters must be less than or equal to 100.
    • If the number of -t parameters is greater than 100, you are advised to use the --include-table-file parameter to replace some -t parameters.
    • If -t is specified, gs_dump does not dump any other database objects which the selected tables might depend upon. Therefore, there is no guarantee that the results of a specific-table dump can be automatically restored to an empty database.
    • -t tablename only dumps visible tables in the default search path. -t '*.tablename' dumps tablename tables in all the schemas of the dumped database. -t schema.table dumps tables in a specific schema.
    • -t tablename does not export trigger information from a table.

    For example:

    gs_dump -h host_name -p port_number postgres -f backup/bkp_shl2.sql -t schema1.table1 -t schema2.table2
    

    In the preceding example, schema1.table1 and schema2.table2 are dumped.

  • --include-table-file=FILENAME

    Specifies the table file to be dumped.

  • -T, --exclude-table=TABLE

    Specifies a list of tables, views, sequences, or foreign tables not to be dumped. You can use multiple -T parameters or wildcard characters to specify tables.

    When both -t and -T are specified, -t takes precedence: an object selected by -t is dumped even if it also matches a -T pattern.

    For example:

    gs_dump -h host_name -p port_number postgres -f backup/bkp_shl2.sql -T table1 -T table2
    

    In the preceding example, table1 and table2 are excluded from the dumping.

  • --exclude-table-file=FILENAME

    Specifies the table files that do not need to be dumped.

    NOTE:
    Same as --include-table-file, the content of the specified file uses the following format:
    schema1.table1
    schema2.table2
    ……

  • -x, --no-privileges|--no-acl

    Prevents the dumping of access permissions (grant/revoke commands).

  • --binary-upgrade

    Specifies a reserved interface for function expansion. This parameter is not recommended.

  • --binary-upgrade-usermap="USER1=USER2"

    Specifies a reserved interface for function expansion. This parameter is not recommended.

  • --column-inserts|--attribute-inserts

    Exports data by running the INSERT command with explicit column names (INSERT INTO table (column, ...) VALUES ...). This results in a slow restoration. However, since this option generates an independent command for each row, an error in reloading a row causes only that row, rather than the entire table content, to be lost.

  • --disable-dollar-quoting

    Disables the use of dollar sign ($) for function bodies, and forces them to be quoted using the SQL standard string syntax.

  • --disable-triggers

    Specifies a reserved interface for function expansion. This parameter is not recommended.

  • --exclude-table-data=TABLE

    Does not dump data that matches any of table patterns. The pattern is interpreted according to the same rules as for -t.

    --exclude-table-data can be entered more than once to exclude tables matching any of several patterns. When you need the specified table definition rather than data in the table, this option is helpful.

    To exclude data of all tables in the database, see --schema-only.

  • --inserts

    Dumps data by the INSERT statement (rather than COPY). This will cause a slow restoration.

    However, since this option generates an independent command for each row, an error in reloading a row causes only that row, rather than the entire table content, to be lost. Note that the restoration may fail if the column order has been rearranged. The --column-inserts option is safe against column order changes, though even slower.

  • --no-security-labels

    Specifies a reserved interface for function expansion. This parameter is not recommended.

  • --no-tablespaces

    Does not output commands to select tablespaces. With this option, all objects are created in the default tablespace during the restoration, no matter which tablespace they originally resided in.

    This parameter is used only for the plain-text format. For the archive format, you can specify the option when using gs_restore.

  • --no-unlogged-table-data

    This parameter is a reserved interface for function extension. It is not recommended.

  • --non-lock-table

    This parameter is a reserved interface for function extension. It is not recommended.

  • --include-alter-table

    Dumps information about columns that have been deleted from tables; that is, the dump records which columns were dropped.

  • --quote-all-identifiers

    Forcibly quotes all identifiers. This parameter is useful when you dump a database for migration to a later version, in which additional keywords may be introduced.

  • --section=SECTION

    Dumps only the named section. The section name can be pre-data, data, or post-data.
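
    For example, a minimal sketch that splits one plain-text dump into three files, which can later be replayed in order:

    gs_dump -p port_number postgres -f backup/pre.sql --section=pre-data
    gs_dump -p port_number postgres -f backup/data.sql --section=data
    gs_dump -p port_number postgres -f backup/post.sql --section=post-data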

  • --serializable-deferrable

    Uses a serializable transaction for the dump to ensure that the snapshot used is consistent with later database states. The dump waits for a point in the transaction stream at which no anomalies can be present, so that the dump neither fails nor causes other transactions to roll back with serialization failures.

    This option has no benefit for a dump intended only for disaster recovery. It is helpful for a dump used to load a copy of the database for reporting or other read-only load sharing while the original database continues to be updated. Without this option, the dump may reflect a state that is not consistent with any serial execution of the transactions eventually committed.

    This option makes no difference if there are no active read-write transactions when gs_dump is started. If read-write transactions are active, the start of the dump may be delayed for an indeterminate length of time.

  • --use-set-session-authorization

    Specifies that the standard SQL SET SESSION AUTHORIZATION command, rather than ALTER OWNER, is output to set object ownership. This makes dumping more standards-compatible. However, if a dump file contains objects that have historical problems, restoration may fail. A dump using SET SESSION AUTHORIZATION requires system administrator permissions, whereas ALTER OWNER requires lower permissions.

  • --with-encryption=AES128

    Specifies that the dumped data is encrypted using AES128.

  • --with-key=KEY

    Specifies the AES128 key. The key length must be 16 bytes.

    NOTE:

    Encrypted export with gs_dump supports only the plain format. Data exported with -F plain must be imported using the gsql tool, and if the export was encrypted, the same --with-key value must be specified when importing through gsql.
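
    A minimal sketch of an encrypted export and the matching import (the key value and the target database targetdb are placeholders; the key must be 16 bytes):

    gs_dump -p port_number postgres -f backup/bkp_enc.sql -F p --with-encryption=AES128 --with-key=1234567890123456
    gsql -d targetdb -p port_number --with-key=1234567890123456 -f backup/bkp_enc.sql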

  • --include-depend-objs

    Includes information about the objects that depend on the specified object in the backup result. This parameter takes effect only if the -t or --include-table-file parameter is specified.

  • --exclude-self

    Excludes information about the specified object from the backup result. This parameter takes effect only if the -t or --include-table-file parameter is specified.

  • --dont-overwrite-file

    By default, existing files in the plain-text, .tar, and custom formats are overwritten. This option prevents the overwrite and reports an error instead. It is not used for the directory format.

    For example:

    Assume that the backup.sql file already exists in the current directory. If you specify -f backup.sql in the command, the existing file in the current directory will be overwritten.

    If the backup file already exists and --dont-overwrite-file is specified, an error will be reported with the message that the dump file exists.

    gs_dump -p port_number postgres -f backup.sql -F plain --dont-overwrite-file
    

NOTE:

  • The -s/--schema-only and -a/--data-only parameters do not coexist.
  • The -c/--clean and -a/--data-only parameters do not coexist.
  • --inserts/--column-inserts and -o/--oids do not coexist, because OIDS cannot be set using the INSERT statement.
  • --role must be used in conjunction with --rolepassword.
  • --binary-upgrade-usermap must be used in conjunction with --binary-upgrade.
  • --include-depend-objs or --exclude-self takes effect only when -t or --include-table-file is specified.
  • --exclude-self must be used in conjunction with --include-depend-objs.

Connection parameters:

  • -h, --host=HOSTNAME

    Specifies the host name. If the value begins with a slash (/), it is used as the directory for the UNIX domain socket. The default value is taken from the PGHOST environment variable (if available). Otherwise, a Unix domain socket connection is attempted.

    This parameter is used only for defining names of the hosts outside openGauss. The names of the hosts inside openGauss must be 127.0.0.1.

    Value range: host name

    Environment variable: PGHOST

  • -p, --port=PORT

    Specifies the host port number. If the thread pool function is enabled, you are advised to use pooler port, that is, the host port number plus 1.

    Environment variable: PGPORT

  • -U, --username=NAME

    Specifies the username of the host to be connected.

    If the username of the host to be connected is not specified, the system administrator is used by default.

    Environment variable: PGUSER

  • -w, --no-password

    Never issues a password prompt. The connection attempt fails if the host requires password verification and the password is not provided in other ways. This parameter is useful in batch jobs and scripts in which no user password is required.

  • -W, --password=PASSWORD

    Specifies the user password for connection. If the host uses the trust authentication policy, the administrator does not need to enter the -W option. If the -W option is not provided and you are not a system administrator, the Dump Restore tool will ask you to enter a password.

  • --role=ROLENAME

    Specifies a role name to be used for creating the dump. If this option is selected, the SET ROLE command will be issued after gs_dump connects to the database. It is useful when the authenticated user (specified by -U) lacks the permissions required by gs_dump, because it allows the user to switch to a role with the required permissions. Some installations have a policy against logging in directly as a system administrator. This option allows dumping data without violating the policy.

  • --rolepassword=ROLEPASSWORD

    Specifies the password for a role.
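
    For example, a minimal sketch that dumps through a privileged role (role_name and the passwords are placeholders; --role must be used together with --rolepassword):

    gs_dump -U user_name -W user_password -p port_number postgres -f backup/bkp_role.sql --role=role_name --rolepassword=role_password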

Notice

If any local additions need to be added to the template1 database in openGauss, restore the output of gs_dump into an empty database with caution. Otherwise, you are likely to obtain errors due to duplicate definitions of the added objects. To create an empty database without any local additions, copy data from template0 rather than template1. Example:

CREATE DATABASE foo WITH TEMPLATE template0;

A member file in a .tar archive must be smaller than 8 GB. (This is an inherent limitation of the .tar file format.) The total size of a .tar archive, or of output in any other format, is not limited, except possibly by the OS.

The dump file generated by gs_dump does not contain the statistics used by the optimizer to make execution plans. Therefore, you are advised to run ANALYZE after restoring from a dump file to ensure optimal performance. The dump file does not contain any ALTER DATABASE … SET commands. These settings are dumped by gs_dumpall, along with database users and other installation settings.
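
For example, a minimal sketch that refreshes optimizer statistics after a restore (the database name and port are placeholders):

gsql -d postgres -p port_number -c "ANALYZE;"

Running ANALYZE with no arguments collects statistics for all tables in the current database.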

Examples

Use gs_dump to dump a database as a SQL text file or a file in other formats.

In the following examples, Bigdata@123 indicates the password for the database user. backup/MPPDB_backup.sql indicates an exported file where backup indicates the relative path of the current directory. 37300 indicates the port number of the database server. postgres indicates the name of the database to be accessed.

NOTE: Before exporting files, ensure that the directory exists and you have the read and write permissions on the directory.

Example 1: Use gs_dump to export the full information of the postgres database. The exported MPPDB_backup.sql file is in plain-text format.

gs_dump -U omm -W Bigdata@123 -f backup/MPPDB_backup.sql -p 37300 postgres -F p
gs_dump[port='37300'][postgres][2018-06-27 09:49:17]: The total objects number is 356.
gs_dump[port='37300'][postgres][2018-06-27 09:49:17]: [100.00%] 356 objects have been dumped.
gs_dump[port='37300'][postgres][2018-06-27 09:49:17]: dump database postgres successfully
gs_dump[port='37300'][postgres][2018-06-27 09:49:17]: total time: 1274  ms

Use gsql to import data from the exported plain-text file.

Example 2: Use gs_dump to export the full information of the postgres database. The exported MPPDB_backup.tar file is in .tar format.

gs_dump -U omm -W Bigdata@123 -f backup/MPPDB_backup.tar -p 37300 postgres -F t
gs_dump[port='37300'][postgres][2018-06-27 10:02:24]: The total objects number is 1369.
gs_dump[port='37300'][postgres][2018-06-27 10:02:53]: [100.00%] 1369 objects have been dumped.
gs_dump[port='37300'][postgres][2018-06-27 10:02:53]: dump database postgres successfully
gs_dump[port='37300'][postgres][2018-06-27 10:02:53]: total time: 50086  ms

Example 3: Use gs_dump to export the full information of the postgres database. The exported MPPDB_backup.dmp file is in custom format.

gs_dump -U omm -W Bigdata@123 -f backup/MPPDB_backup.dmp -p 37300 postgres -F c
gs_dump[port='37300'][postgres][2018-06-27 10:05:40]: The total objects number is 1369.
gs_dump[port='37300'][postgres][2018-06-27 10:06:03]: [100.00%] 1369 objects have been dumped.
gs_dump[port='37300'][postgres][2018-06-27 10:06:03]: dump database postgres successfully
gs_dump[port='37300'][postgres][2018-06-27 10:06:03]: total time: 36620  ms

Example 4: Use gs_dump to export the full information of the postgres database. The exported MPPDB_backup file is in directory format.

gs_dump -U omm -W Bigdata@123 -f backup/MPPDB_backup -p 37300  postgres -F d
gs_dump[port='37300'][postgres][2018-06-27 10:16:04]: The total objects number is 1369.
gs_dump[port='37300'][postgres][2018-06-27 10:16:23]: [100.00%] 1369 objects have been dumped.
gs_dump[port='37300'][postgres][2018-06-27 10:16:23]: dump database postgres successfully
gs_dump[port='37300'][postgres][2018-06-27 10:16:23]: total time: 33977  ms 

Example 5: Use gs_dump to export the information of the postgres database, excluding the information of the table specified in the /home/MPPDB_temp.sql file. The exported MPPDB_backup.sql file is in plain-text format.

gs_dump -U omm -W Bigdata@123 -p 37300 postgres --exclude-table-file=/home/MPPDB_temp.sql -f backup/MPPDB_backup.sql
gs_dump[port='37300'][postgres][2018-06-27 10:37:01]: The total objects number is 1367.
gs_dump[port='37300'][postgres][2018-06-27 10:37:22]: [100.00%] 1367 objects have been dumped.
gs_dump[port='37300'][postgres][2018-06-27 10:37:22]: dump database postgres successfully
gs_dump[port='37300'][postgres][2018-06-27 10:37:22]: total time: 37017  ms

Example 6: Use gs_dump to export only the information about the views that depend on the testtable table. Create another testtable table, and then restore the views that depend on it.

  • Back up only the views that depend on the testtable table.

    gs_dump -s -p 37300 postgres -t PUBLIC.testtable --include-depend-objs --exclude-self -f backup/MPPDB_backup.sql -F p
    gs_dump[port='37300'][postgres][2018-06-15 14:12:54]: The total objects number is 331.
    gs_dump[port='37300'][postgres][2018-06-15 14:12:54]: [100.00%] 331 objects have been dumped.
    gs_dump[port='37300'][postgres][2018-06-15 14:12:54]: dump database postgres successfully
    gs_dump[port='37300'][postgres][2018-06-15 14:12:54]: total time: 327  ms
    
  • Change the name of the testtable table.

    gsql -p 37300 postgres -r -c "ALTER TABLE PUBLIC.testtable RENAME TO testtable_bak;"
    
  • Create another testtable table.

    CREATE TABLE PUBLIC.testtable(a int, b int, c int);
    
  • Restore the views for the new testtable table.

    gsql -p 37300 postgres -r -f backup/MPPDB_backup.sql
    

gs_dumpall

Background

gs_dumpall, provided by openGauss, is used to export all openGauss database information, including data of the default database postgres, user-defined databases, and common global objects of all openGauss databases.

gs_dumpall is executed by OS user omm.

When gs_dumpall is used to export data, other users can still access (read and write) openGauss databases.

gs_dumpall can export complete, consistent data. For example, if gs_dumpall is started to export the openGauss database at time point T1, data of the database at that time point will be exported, and modifications made to the database after that time point will not be exported.

gs_dumpall exports all openGauss databases in two parts:

  • gs_dumpall exports all global objects, including information about database users and groups, tablespaces, and attributes (for example, global access permissions).
  • gs_dumpall invokes gs_dump to export SQL scripts from each openGauss database, which contain all the SQL statements required to restore databases.

The exported files are both plain-text SQL scripts. Use gsql to execute them to restore openGauss databases.

Precautions

  • Do not modify an exported file or its content. Otherwise, restoration may fail.
  • To ensure the data consistency and integrity, gs_dumpall acquires a share lock on a table to be dumped. If another transaction has acquired a share lock on the table, gs_dumpall waits until this lock is released and then locks the table for dumping. If the table cannot be locked within the specified time, the dump fails. You can customize the timeout duration to wait for lock release by specifying the --lock-wait-timeout parameter.
  • During an export, gs_dumpall reads all tables in a database. Therefore, you need to connect to the database as an openGauss administrator to export a complete file. When you use gsql to execute SQL scripts, cluster administrator permissions are also required to add users and user groups, and create databases.

Syntax

gs_dumpall [OPTION]...

Parameter Description

Common parameters:

  • -f, --filename=FILENAME

    Sends the output to the specified file. If this parameter is omitted, the standard output is used.

  • -v, --verbose

    Specifies the verbose mode. If it is specified, gs_dumpall writes detailed object comments and start/stop times to the dump file, and progress messages to standard error.

  • -V, --version

    Prints the gs_dumpall version and exits.

  • --lock-wait-timeout=TIMEOUT

    Does not wait indefinitely to acquire shared table locks at the beginning of the dump. The dump fails if a table cannot be locked within the specified time. The timeout period can be specified in any of the formats accepted by SET statement_timeout.

  • -?, --help

    Displays help about gs_dumpall parameters and exits.

Dump parameters:

  • -a, --data-only

    Dumps only the data, not the schema (data definition).

  • -c, --clean

    Includes SQL statements to delete (drop) databases before recreating them. DROP statements for roles and tablespaces are added as well.

  • -g, --globals-only

    Dumps only global objects (roles and tablespaces) but no databases.
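
    For example, a minimal sketch that saves only roles and tablespaces to a separate script (the file name is a placeholder):

    gs_dumpall -p port_number -g -f backup/globals.sql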

  • -o, --oids

    Dumps object identifiers (OIDs) as part of the data in each table. Use this parameter if your application references the OID columns in some way; otherwise, do not use it.

  • -O, --no-owner

    Do not output commands to set ownership of objects to match the original database. By default, gs_dumpall issues the ALTER OWNER or SET SESSION AUTHORIZATION command to set ownership of created schema objects. These statements will fail when the script is run unless it is started by a system administrator (or the same user that owns all of the objects in the script). To make a script that can be restored by any user and gives that user ownership of all objects, specify -O.

  • -r, --roles-only

    Dumps only roles but not databases or tablespaces.

  • -s, --schema-only

    Dumps only the object definition (schema) but not data.

  • -S, --sysadmin=NAME

    Specifies the name of the system administrator during the dump.

  • -t, --tablespaces-only

    Dumps only tablespaces but not databases or roles.

  • -x, --no-privileges

    Prevents the dumping of access permissions (grant/revoke commands).

  • --column-inserts|--attribute-inserts

    Exports data by running the INSERT command with explicit column names {INSERT INTO table (column, …) VALUES …}. This will cause a slow restoration. However, since this option generates an independent command for each row, an error in reloading a row causes only the loss of the row rather than the entire table content.

  • --disable-dollar-quoting

    Disables the use of dollar sign ($) for function bodies, and forces them to be quoted using the SQL standard string syntax.

  • --disable-triggers

    This parameter is a reserved interface for function extension. It is not recommended.

  • --inserts

    Dumps data as INSERT statements (rather than COPY). This will cause a slow restoration. The restoration may fail if you rearrange the column order. The --column-inserts option is safe against column order changes, though even slower.

  • --no-security-labels

    This parameter is a reserved interface for function extension. It is not recommended.

  • --no-tablespaces

    Does not generate output statements to create tablespaces or select tablespaces for objects. With this option, all objects will be created in the default tablespace during the restoration.

  • --no-unlogged-table-data

    This parameter is a reserved interface for function extension. It is not recommended.

  • --quote-all-identifiers

    Forcibly quotes all identifiers. This parameter is useful when you dump a database for migration to a later version, in which additional keywords may be introduced.

  • --dont-overwrite-file

    Does not overwrite an existing output file.

  • --use-set-session-authorization

    Specifies that the standard SQL SET SESSION AUTHORIZATION command, rather than ALTER OWNER, is output to set object ownership. This makes dumping more standards-compatible. However, if a dump file contains objects that have historical problems, restoration may fail. A dump using SET SESSION AUTHORIZATION requires system administrator permissions, whereas ALTER OWNER requires lower permissions.

  • --with-encryption=AES128

    Specifies that dumping data needs to be encrypted using AES128.

  • --with-key=KEY

    Specifies that the key length of AES128 must be 16 bytes.

  • --include-templatedb

    Includes template databases during the dump.

  • --binary-upgrade

    This parameter is a reserved interface for function extension. It is not recommended.

  • --binary-upgrade-usermap="USER1=USER2"

    This parameter is a reserved interface for function extension. It is not recommended.

  • --tablespaces-postfix

    This parameter is a reserved interface for function extension. It is not recommended.

  • --parallel-jobs

    Specifies the number of concurrent backup processes. The value range is 1-1000.
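
    For example, a minimal sketch using four concurrent backup processes (the value 4 is illustrative; --parallel-jobs must be used together with -f):

    gs_dumpall -p port_number -f backup/all_databases.sql --parallel-jobs=4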

NOTE:

  • The -g/--globals-only and -r/--roles-only parameters do not coexist.
  • The -g/--globals-only and -t/--tablespaces-only parameters do not coexist.
  • The -r/--roles-only and -t/--tablespaces-only parameters do not coexist.
  • The -s/--schema-only and -a/--data-only parameters do not coexist.
  • The -r/--roles-only and -a/--data-only parameters do not coexist.
  • The -t/--tablespaces-only and -a/--data-only parameters do not coexist.
  • The -g/--globals-only and -a/--data-only parameters do not coexist.
  • --tablespaces-postfix must be used in conjunction with --binary-upgrade.
  • --binary-upgrade-usermap must be used in conjunction with --binary-upgrade.
  • --parallel-jobs must be used in conjunction with -f/--filename.

Connection parameters:

  • -h, --host

    Specifies the host name. If the value begins with a slash (/), it is used as the directory for the UNIX domain socket. The default value is taken from the PGHOST environment variable (if set). Otherwise, a Unix domain socket connection is attempted.

    This parameter is used only for defining names of the hosts outside openGauss. The names of the hosts inside openGauss must be 127.0.0.1.

    Environment variable: PGHOST

  • -l, --database

    Specifies the name of the database to connect to for dumping global objects and discovering the other databases to be dumped. If this parameter is not specified, the postgres database will be used. If the postgres database does not exist, template1 will be used.

  • -p, --port

    TCP port or the local Unix-domain socket file extension on which the server is listening for connections. The default value is the PGPORT environment variable.

    If the thread pool function is enabled, you are advised to use pooler port, that is, the listening port number plus 1.

    Environment variable: PGPORT

  • -U, --username

    Specifies the user name to connect as.

    Environment variable: PGUSER

  • -w, --no-password

    Never issues a password prompt. The connection attempt fails if the host requires password verification and the password is not provided in other ways. This parameter is useful in batch jobs and scripts in which no user password is required.

  • -W, --password

    Specifies the user password for connection. If the host uses the trust authentication policy, the administrator does not need to enter the -W option. If the -W option is not provided and you are not a system administrator, the Dump Restore tool will ask you to enter a password.

  • --role

    Specifies a role name to be used for creating the dump. This option causes gs_dumpall to issue the SET ROLE statement after connecting to the database. It is useful when the authenticated user (specified by -U) lacks the permissions required by gs_dumpall, because it allows the user to switch to a role with the required permissions. Some installations have a policy against logging in directly as a system administrator. This option allows dumping data without violating the policy.

  • --rolepassword

    Specifies the password of the specific role.

Notice

gs_dumpall internally invokes gs_dump. For details about the diagnosis information, see gs_dump.

After a dump created by gs_dumpall is restored, run ANALYZE on each database so that the optimizer can provide useful statistics.

gs_dumpall requires all needed tablespace directories to exist before the restoration. Otherwise, database creation will fail for databases in non-default locations.

Examples

Use gs_dumpall to export all openGauss databases at a time.

NOTE: gs_dumpall supports only plain-text format export. Therefore, only gsql can be used to restore a file exported using gs_dumpall.

gs_dumpall -f backup/bkp2.sql -p 37300
gs_dump[port='37300'][dbname='postgres'][2018-06-27 09:55:09]: The total objects number is 2371.
gs_dump[port='37300'][dbname='postgres'][2018-06-27 09:55:35]: [100.00%] 2371 objects have been dumped.
gs_dump[port='37300'][dbname='postgres'][2018-06-27 09:55:46]: dump database dbname='postgres' successfully
gs_dump[port='37300'][dbname='postgres'][2018-06-27 09:55:46]: total time: 55567  ms
gs_dumpall[port='37300'][2018-06-27 09:55:46]: dumpall operation successful
gs_dumpall[port='37300'][2018-06-27 09:55:46]: total time: 56088  ms
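
The exported script can then be replayed with gsql. A minimal sketch (run as a user with the required administrator permissions; bkp2.sql is the file produced above):

gsql -d postgres -p 37300 -f backup/bkp2.sql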

gs_restore

Background

gs_restore, provided by openGauss, is used to import data from the files exported by gs_dump.

gs_restore is executed by OS user omm.

It has the following functions:

  • Importing data to the database

    If a database is specified, data is imported to the database. For parallel import, the password for connecting to the database is required.

  • Importing data to the script file

    If no target database is specified, a script containing the SQL statements needed to recreate the database is created and written to a file or to standard output. The script output is equivalent to the plain-text output format of gs_dump.

Command Format

gs_restore [OPTION]... FILE

NOTE:

  • FILE has no short or long option name. It specifies the location of the archive file.
  • Either dbname or -l is required, but the two parameters cannot be specified at the same time.
  • gs_restore incrementally imports data by default. To prevent data exceptions caused by multiple import operations, you are advised to use the -c parameter during the import. Before recreating database objects, delete the database objects that already exist in the database to be restored.
  • There is no option to control log printing. To hide logs, redirect them to a log file. If a large amount of table data needs to be restored, the table data will be restored in batches, so the log indicating that the table data has been imported is generated multiple times.

Parameter Description

Common parameters

  • -d, --dbname=NAME

    Connects to the dbname database and imports data to the database.

  • -f, --file=FILENAME

    Specifies the output file for the generated script, or for the listing when used with -l.

    The default is the standard output.

    NOTE:
    -f cannot be used in conjunction with -d.

  • -F, --format=c|d|t

    Specifies the format of the archive. The format does not need to be specified, because gs_restore determines the format automatically.

    Value range:

    • c/custom: The archive is in the custom format of gs_dump.
    • d/directory: The archive is in the directory format.
    • t/tar: The archive is in the .tar format.
  • -l, --list

    Lists the contents of the archive. The output of this operation can be used as the input of the -L parameter. If filtering parameters, such as -n or -t, are used together with -l, they will restrict the listed items.

  • -v, --verbose

    Specifies the verbose mode.

  • -V, --version

    Prints the gs_restore version and exits.

  • -?, --help

    Displays help information about the parameters of gs_restore and exits.

Parameters for importing data

  • -a, --data-only

    Imports only the data, not the schema (data definition). gs_restore incrementally imports data.

  • -c, --clean

    Cleans (deletes) existing database objects in the database to be restored before recreating them.

  • -C, --create

    Creates the database before importing data to it. (When this parameter is used, the database specified by -d is used to issue the initial CREATE DATABASE command. All data is imported to the created database.)

  • -e, --exit-on-error

    Exits if an error occurs while SQL statements are being sent to the database. By default, the remaining commands are still sent, and error information is displayed when the import ends.

  • -I, --index=NAME

    Imports only the definition of the specified index. Multiple indexes can be imported. Enter -I index multiple times to import multiple indexes.

    For example:

    gs_restore -h host_name -p port_number -d postgres -I Index1 -I Index2 backup/MPPDB_backup.tar
    

    In this example, Index1 and Index2 will be imported.

  • -j, --jobs=NUM

    Specifies the number of concurrent jobs for the most time-consuming operations of gs_restore (such as loading data, creating indexes, or creating constraints). This parameter can greatly reduce the time needed to import a large database to a server running on a multiprocessor machine.

    Each job is one process or one thread, depending on the OS, and uses a separate connection to the server.

    The optimal value for this option depends on the server hardware, the client, the network, the number of CPU cores, and disk settings. It is recommended that the parameter be set to the number of CPU cores on the server. A larger value can also lead to faster imports in many cases, but an overly large value will decrease performance because of thrashing.

    This parameter supports custom-format archives only. The input file must be a regular file (not the pipe file). This parameter can be ignored when you select the script method rather than connect to a database server. In addition, multiple jobs cannot be used in conjunction with the --single-transaction parameter.
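
    For example, a minimal sketch that restores a custom-format archive with four concurrent jobs (the value 4 is illustrative):

    gs_restore backup/MPPDB_backup.dmp -p port_number -d postgres -j 4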

  • -L, --use-list=FILENAME

    Imports only archive elements that are listed in list-file and imports them in the order that they appear in the file. If filtering parameters, such as -n or -t, are used in conjunction with -L, they will further limit the items to be imported.

    list-file is normally created by editing the output of a previous -l parameter. File lines can be moved or removed, and can also be commented out by placing a semicolon (;) at the beginning of the row.
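
    For example, a minimal sketch of the list-edit-restore workflow (the file name restore.list is a placeholder):

    gs_restore -l backup/MPPDB_backup.tar -f restore.list
    # Edit restore.list: reorder lines or comment out entries with a leading semicolon.
    gs_restore -L restore.list backup/MPPDB_backup.tar -p port_number -d postgres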

  • -n, --schema=NAME

    Restores only objects in the listed schemas.

    This parameter can be used in conjunction with the -t parameter to import a specific table.

    Entering -n schemaname multiple times can import multiple schemas.

    For example:

    gs_restore -h host_name -p port_number -d postgres -n sch1 -n sch2 backup/MPPDB_backup.tar
    

    In this example, sch1 and sch2 will be imported.

  • -O, --no-owner

    Do not output commands to set ownership of objects to match the original database. By default, gs_restore issues the ALTER OWNER or SET SESSION AUTHORIZATION statement to set ownership of created schema elements. These statements will fail unless the database is initially accessed by a system administrator (or the user who owns all of the objects in the script). With -O, any user name can be used for the initial connection, and this user will own all the created objects.

  • -P, --function=NAME(args)

    Imports only listed functions. You need to correctly spell the function name and the parameter based on the contents of the dump file in which the function exists.

    Entering -P alone means importing all function-name(args) functions in a file. Entering -P with -n means importing the function-name(args) functions in a specified schema. Entering -P multiple times and using -n once means that all imported functions are in the -n schema by default.

    You can enter -n schema-name -P 'function-name(args)' multiple times to import functions in specified schemas.

    For example:

    gs_restore -h host_name -p port_number -d postgres -n test1 -P 'Func1(integer)' -n test2 -P 'Func2(integer)' backup/MPPDB_backup.tar
    

    In this example, both Func1(integer) in the test1 schema and Func2(integer) in the test2 schema will be imported.

  • -s, --schema-only

    Imports only schemas (data definitions), instead of data (table content). The current sequence value will not be imported.

  • -S, --sysadmin=NAME

    This parameter is a reserved interface for function extension. It is not recommended.

  • -t, --table=NAME

    Imports only listed table definitions or data, or both. This parameter can be used in conjunction with the -n parameter to specify a table object in a schema. When -n is not entered, the default schema is PUBLIC. Entering -n schemaname -t tablename multiple times can import multiple tables in a specified schema.

    For example:

    Import table1 in the PUBLIC schema.

    gs_restore -h host_name -p port_number -d postgres -t table1 backup/MPPDB_backup.tar
    

    Import test1 in the test1 schema and test2 in the test2 schema.

    gs_restore -h host_name -p port_number -d postgres -n test1 -t test1 -n test2 -t test2 backup/MPPDB_backup.tar
    

    Import table1 in the PUBLIC schema and table1 in the test1 schema.

    gs_restore -h host_name -p port_number -d postgres -n PUBLIC -t table1 -n test1 -t table1 backup/MPPDB_backup.tar
    
  • -T, --trigger=NAME

    This parameter is reserved for extension.

  • -x, --no-privileges/--no-acl

    Prevents the import of access permissions (GRANT/REVOKE commands).

  • -1, --single-transaction

    Executes import as a single transaction (that is, commands are wrapped in BEGIN/COMMIT).

    This parameter ensures that either all the commands complete successfully or no changes are applied. It implies --exit-on-error.
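
    For example, a minimal sketch that makes the whole import atomic:

    gs_restore backup/MPPDB_backup.dmp -p port_number -d postgres -1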

  • --disable-triggers

    This parameter is a reserved interface for function extension. It is not recommended.

  • --no-data-for-failed-tables

    By default, table data is imported even if the statement to create the table fails (for example, because the table already exists). With this parameter, data for such tables is skipped. This behavior is useful if the target database already contains the desired table contents.

    This parameter takes effect only when you import data directly into a database, not when you output SQL scripts.

  • --no-security-labels

    This parameter is a reserved interface for function extension. It is not recommended.

  • --no-tablespaces

    Does not issue commands to select tablespaces. With this option, all objects will be created in the default tablespace during the import.

  • --section=SECTION

    Imports the listed sections (such as pre-data, data, or post-data).

  • --use-set-session-authorization

    Used for plain-text backups.

    Outputs the SET SESSION AUTHORIZATION statement instead of the ALTER OWNER statement to determine object ownership. This makes the dump more standards-compatible. If the history of the objects in the exported file is problematic, the import may fail. Only administrators can dump data using the SET SESSION AUTHORIZATION statement, and they must manually change and verify the passwords referenced by the SET SESSION AUTHORIZATION statements in the exported file before the import. The ALTER OWNER statement requires lower permissions.

NOTICE:

  • If the template1 database in your installation has any local additions, restore the output of gs_restore into an empty database with caution. Otherwise, you are likely to obtain errors due to duplicate definitions of the added objects. To create an empty database without any local additions, copy data from template0 rather than template1. Example:
CREATE DATABASE foo WITH TEMPLATE template0;
  • gs_restore cannot import large objects selectively. For example, it can only import the objects of a specified table. If an archive contains large objects, all large objects will be imported, or none of them will be restored if they are excluded by using -L, -t, or other parameters.

NOTE:

  1. The -d/--dbname and -f/--file parameters do not coexist.
  2. The -s/--schema-only and -a/--data-only parameters do not coexist.
  3. The -c/--clean and -a/--data-only parameters do not coexist.
  4. When --single-transaction is used, -j/--jobs must be a single job.
  5. --role must be used in conjunction with --rolepassword.

Connection parameters:

  • -h, --host=HOSTNAME

    Specifies the host name. If the value begins with a slash (/), it is used as the directory for the UNIX domain socket. The default value is taken from the PGHOST environment variable. If it is not set, a UNIX domain socket connection is attempted.

    This parameter is used only for defining names of the hosts outside openGauss. The names of the hosts inside openGauss must be 127.0.0.1.

  • -p, --port=PORT

    TCP port or the local Unix-domain socket file extension on which the server is listening for connections. The default value is the PGPORT environment variable.

    If the thread pool function is enabled, you are advised to use pooler port, that is, the listening port number plus 1.

  • -U, --username=NAME

    Specifies the user name to connect as.

  • -w, --no-password

    Never issues a password prompt. The connection attempt fails if the host requires password verification and the password is not provided in other ways. This parameter is useful in batch jobs and scripts in which no user password is required.

  • -W, --password=PASSWORD

    User password for database connection. If the host uses the trust authentication policy, the administrator does not need to enter the -W parameter. If the -W parameter is not provided and you are not a system administrator, gs_restore will ask you to enter a password.

  • --role=ROLENAME

    Specifies a role name for the import operation. If this parameter is selected, the SET ROLE statement will be issued after gs_restore connects to the database. It is useful when the authenticated user (specified by -U) lacks the permissions required by gs_restore. This parameter allows the user to switch to a role with the required permissions. Some installations have a policy against logging in directly as the initial user. This parameter allows data to be imported without violating the policy.

  • --rolepassword=ROLEPASSWORD

    Specifies the password of the role.

Example

Special case: Execute the gsql tool. Run the following commands to import the MPPDB_backup.sql file in the export folder (in plain-text format) generated by gs_dump/gs_dumpall to the postgres database:

gsql -d postgres -p 5432 -W Bigdata@123 -f /home/omm/test/MPPDB_backup.sql
SET
SET
SET
SET
SET
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
CREATE INDEX
CREATE INDEX
CREATE INDEX
SET
CREATE INDEX
REVOKE
REVOKE
GRANT
GRANT
total time: 30476  ms

gs_restore is used to import the files exported by gs_dump.

Example 1: Execute the gs_restore tool to import the exported MPPDB_backup.dmp file (custom format) to the postgres database.

gs_restore -W Bigdata@123 backup/MPPDB_backup.dmp -p 5432 -d postgres
gs_restore: restore operation successful
gs_restore: total time: 13053  ms

Example 2: Execute the gs_restore tool to import the exported MPPDB_backup.tar file (.tar format) to the postgres database.

gs_restore backup/MPPDB_backup.tar -p 5432 -d postgres 
gs_restore[2017-07-21 19:16:26]: restore operation successful
gs_restore[2017-07-21 19:16:26]: total time: 21203  ms

Example 3: Execute the gs_restore tool to import the exported MPPDB_backup file (directory format) to the postgres database.

gs_restore backup/MPPDB_backup -p 5432 -d postgres
gs_restore[2017-07-21 19:16:26]: restore operation successful
gs_restore[2017-07-21 19:16:26]: total time: 21003  ms

Example 4: Execute the gs_restore tool and run the following commands to import the MPPDB_backup.dmp file (in custom format). Specifically, import all the object definitions and data in the PUBLIC schema. Existing objects are deleted from the target database before the import. If an existing object references an object in another schema, you need to manually delete the referenced object first.

gs_restore backup/MPPDB_backup.dmp -p 5432 -d postgres -e -c -n PUBLIC
gs_restore: [archiver (db)] Error while PROCESSING TOC:
gs_restore: [archiver (db)] Error from TOC entry 313; 1259 337399 TABLE table1 gaussdba
gs_restore: [archiver (db)] could not execute query: ERROR:  cannot drop table table1 because other objects depend on it
DETAIL:  view t1.v1 depends on table table1
HINT:  Use DROP ... CASCADE to drop the dependent objects too.
    Command was: DROP TABLE public.table1;

Manually delete the referenced object and create it again after the import is complete.

gs_restore backup/MPPDB_backup.dmp -p 5432 -d postgres -e -c -n PUBLIC
gs_restore[2017-07-21 19:16:26]: restore operation successful
gs_restore[2017-07-21 19:16:26]: total time: 2203  ms

Example 5: Execute the gs_restore tool and run the following commands to import the MPPDB_backup.dmp file (in custom format). Specifically, import only the definition of table1 in the PUBLIC schema.

gs_restore backup/MPPDB_backup.dmp -p 5432 -d postgres -e -c -s -n PUBLIC -t table1
gs_restore[2017-07-21 19:16:26]: restore operation successful
gs_restore[2017-07-21 19:16:26]: total time: 21000  ms

Example 6: Execute the gs_restore tool and run the following commands to import the MPPDB_backup.dmp file (in custom format). Specifically, import only the data of table1 in the PUBLIC schema.

gs_restore backup/MPPDB_backup.dmp -p 5432 -d postgres -e -a -n PUBLIC -t table1
gs_restore[2017-07-21 19:16:26]: restore operation successful
gs_restore[2017-07-21 19:16:26]: total time: 20203  ms