High-Risk Operations

Strictly follow the operation guide and avoid the forbidden and risky operations described in the tables below.

Table 1 describes forbidden operations during routine O&M.

Table 1 Forbidden operations

| Forbidden Operation | Risk |
| --- | --- |
| Modify the name, permissions, or content of any file in the data directory, or delete anything in the data directory. | Serious errors occur on database nodes and cannot be fixed. |
| Delete database system catalogs or their data. | Service operations cannot be performed properly. |

Table 2 describes risky operations during routine O&M.

Table 2 Risky operations

| Category | Risky Operation | Risk | Risk Level | Preventive Measure | Check Item |
| --- | --- | --- | --- | --- | --- |
| Operations and maintenance | Upgrade a database kernel. | Services are intermittently interrupted during the upgrade. | High | Perform the upgrade during off-peak hours. Before the upgrade, perform a comprehensive inspection of the database, eliminate key metric risks in advance, and agree on the impact and the upgrade time window with users. | Observe key metrics such as the SQL response delay, number of active sessions, number of threads, and dynamic memory usage. |
| Operations and maintenance | Kill processes. | If a DN process is killed, services may be intermittently interrupted and a primary/standby switchover may even be triggered. The larger the difference between the primary and standby nodes, the higher the RTO risk. | Medium | Kill processes during O&M only when it is clearly necessary. If processes need to be killed, notify users in advance. | Observe key metrics such as the SQL response delay, number of active sessions, number of threads, dynamic memory usage, and primary/standby log difference. |
| Operations and maintenance | Kill sessions. | If a session is killed, the client may be disconnected and services may be interrupted. | Medium | Kill sessions during O&M only when it is clearly necessary. If sessions need to be killed, notify users in advance. | Observe key metrics such as the SQL response delay, number of active sessions, number of threads, and dynamic memory usage. |
| Configuration modification | Modify the postgresql.conf file. | If important settings in the file, such as the port number, are modified, the database may fail to start or to accept connections. | Medium | Do not edit the file manually. If a change is required, make it with the corresponding database operation commands, and only after you understand the risks. | None |
| Configuration modification | Modify the pg_hba.conf file. | If the mutual trust rules in the file are modified, the database may be exposed to attacks or clients may fail to connect to the database. | High | Do not edit the file manually. If a change is required, make it with the gs_guc commands, and only after you understand the risks. | None |
| Configuration modification | Modify some database configuration parameters. | If parameters are modified improperly, unexpected database behaviors may occur, including but not limited to increased statement delay, increased memory usage, and service connection errors. | Medium | Before modifying parameters, carefully read the product documentation and accurately evaluate the impact. If the impact cannot be evaluated, contact Huawei technical support. | Observe key metrics such as the SQL response delay, CPU usage, and memory usage. |
| DDL change | Users perform DDL operations. | Most DDL statements acquire high-level locks that block queries and DML statements. As a result, services may be blocked for a long time. | High | Exercise caution when performing DDL operations. Perform DDL operations offline if possible. If they cannot be performed offline, set parameters such as the lock waiting duration to reduce the waiting time and prevent DDL operations from blocking services. | Observe key metrics such as the SQL response delay, number of active sessions, number of threads, and dynamic memory usage. |
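The check items in Table 2 all amount to comparing key metrics before and after a risky operation. The following is a minimal Python sketch of such a comparison, not an openGauss tool. The metric names, the sample values, and the 20% tolerance are hypothetical and chosen only for illustration; the actual values would come from whatever monitoring system is in use.

```python
from dataclasses import dataclass

# Hypothetical metric names mirroring the "Check Item" column of Table 2.
KEY_METRICS = (
    "sql_response_delay_ms",
    "active_sessions",
    "threads",
    "dynamic_memory_usage_pct",
)


@dataclass
class Regression:
    metric: str
    before: float
    after: float
    increase_pct: float


def find_regressions(before, after, tolerance_pct=20.0):
    """Return the key metrics that grew by more than tolerance_pct percent."""
    regressions = []
    for name in KEY_METRICS:
        old, new = before[name], after[name]
        if old == 0:
            # Avoid division by zero: any growth from zero is reported.
            increase = float("inf") if new > 0 else 0.0
        else:
            increase = (new - old) / old * 100.0
        if increase > tolerance_pct:
            regressions.append(Regression(name, old, new, increase))
    return regressions


if __name__ == "__main__":
    # Illustrative snapshots taken before and after an operation.
    baseline = {"sql_response_delay_ms": 10.0, "active_sessions": 50,
                "threads": 200, "dynamic_memory_usage_pct": 40.0}
    current = {"sql_response_delay_ms": 25.0, "active_sessions": 55,
               "threads": 200, "dynamic_memory_usage_pct": 41.0}
    for r in find_regressions(baseline, current):
        print(f"{r.metric}: {r.before} -> {r.after} (+{r.increase_pct:.1f}%)")
```

Taking the baseline snapshot immediately before the operation, during the same off-peak window, keeps the comparison meaningful. A suitable tolerance depends on the workload and would need to be tuned per deployment.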

    openGauss 2024-05-07 00:46:52