gs_cgroup
Background
When jobs are batch processed in a cluster, server loads vary significantly because of the complexity of batch processing. To make full use of cluster resources, you need to manage these loads. gs_cgroup is the load management tool provided by openGauss. It can create and delete default and user-defined Cgroups, update resource quotas and allocations, display the Cgroup configuration file and the Cgroup tree, and delete all Cgroups.
gs_cgroup creates a Cgroup configuration file for the OS user of a database and generates the Cgroups configured by that user in the OS. It also lets users add or delete Cgroups, update Cgroup resource quotas, allocate CPU cores or I/O resources, set exception thresholds, and specify how exceptions are handled. gs_cgroup operates only on the Cgroups of the node where it is run. To keep a cluster consistently configured, run the same command on every node.
For details, see “Resource Load Management” in the Developer Guide.
Examples
- Commands executed by a common user or the database administrator:
Prerequisites: The GAUSSHOME environment variable is set to the database installation directory, and user root has created the default Cgroups for common users.
Create Cgroups and set their resource quotas so that database jobs can be assigned to a Cgroup and use its resources. The database administrator creates a Class Cgroup for each database user.
Create class and workload Cgroups.
gs_cgroup -c -S class1 -s 40
Create the class1 Cgroup and allocate 40% of Class resources to it.
gs_cgroup -c -S class1 -G grp1 -g 20
Create the grp1 Workload Cgroup under the class1 Cgroup and allocate 20% of class1 Cgroup resources to the Workload Cgroup.
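Because a Workload Cgroup quota is a percentage of its Class Cgroup, and the Class Cgroup quota is a percentage of Class resources, the nominal share of a nested Cgroup is the product of the percentages along its path. The following Python sketch illustrates that arithmetic only; the helper name effective_share is hypothetical and is not part of gs_cgroup.

```python
def effective_share(*percents):
    """Return the nominal share (in percent) of a nested Cgroup.

    Each argument is the quota of one level, expressed as a percentage
    of its parent (hypothetical helper for illustration only).
    """
    share = 100.0
    for pct in percents:
        # The document gives 1 to 99 as the value range for -s and -g.
        if not 1 <= pct <= 99:
            raise ValueError(f"quota {pct} is outside 1-99")
        share *= pct / 100.0
    return share


# class1 takes 40% of Class resources; grp1 takes 20% of class1.
grp1_share = effective_share(40, 20)
```

With the two commands above, grp1 nominally receives about 8% of the resources of the Class Cgroup.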
Delete the created grp1 Cgroup and class1 Cgroup.
gs_cgroup -d -S class1 -G grp1
Delete the created grp1 Cgroup.
gs_cgroup -d -S class1
Delete the created class1 Cgroup.
NOTICE: If a Class Cgroup is deleted, its Workload Cgroups will be deleted as well.
Update the resource quota for created Cgroups.
Update dynamic resource quota.
gs_cgroup -u -S class1 -G grp1 -g 30
Update the resources allocated to the grp1 Workload Cgroup under the class1 Cgroup for the current user to 30% of class1 resources.
Update the resource limitation quota.
gs_cgroup --fixed -u -S class1 -G grp1 -g 30
Set the number of CPU cores allocated to the grp1 Cgroup to 30% of cores allocated to its parent Cgroup class1.
Update the range of the CPU cores in the Gaussdb Cgroup.
gs_cgroup -u -T Gaussdb -f 0-20
Set the range of CPU cores used by the GaussDB process to 0-20.
NOTE: The -f parameter can only be used to set the range of the CPU cores in the Gaussdb Cgroup. For other Cgroups, if you need to set the number of cores, use the --fixed parameter.
Set exception handling information. (class:wg group must exist.)
Terminate a job under the class:wg Cgroup when the job has been blocked for 1200s or has been executing for 2400s.
gs_cgroup -S class -G wg -E "blocktime=1200,elapsedtime=2400" -a
Terminate a job in the class:wg Cgroup when the size of its spilled data reaches 256 MB or the size of its broadcast data reaches 100 MB.
gs_cgroup -S class -G wg -E "spillsize=256,broadcastsize=100" -a
Demote a job under the Class Cgroup when the total CPU time taken to execute the job on all nodes reaches 100s.
gs_cgroup -S class -E "allcputime=100" --penalty
Demote a job under the Class Cgroup when the total time taken to execute the job on all nodes reaches 2400s and the skew of the CPU time reaches 90 percent.
gs_cgroup -S class -E "qualificationtime=2400,cpuskewpercent=90"
NOTICE:
To set exception handling information for a Cgroup, ensure that the Cgroup has been created. Multiple specified thresholds are separated by commas (,). If no operation is specified, --penalty is used by default.
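The exception-handling commands above share one shape: a Cgroup name, a comma-separated threshold list, and an action. The following Python sketch assembles such an argument list. The helper name build_exception_args is hypothetical, the threshold names are the ones listed for the -E parameter in this document, and the sketch only builds the arguments without running gs_cgroup.

```python
# Threshold names listed for the -E parameter in this document.
KNOWN_THRESHOLDS = {
    "blocktime", "elapsedtime", "allcputime", "spillsize",
    "broadcastsize", "qualificationtime", "cpuskewpercent",
}


def build_exception_args(class_name, thresholds, workload=None, abort=False):
    """Return an argument list for setting exception thresholds.

    Hypothetical helper: it formats the arguments but does not invoke
    gs_cgroup or check that the Cgroup exists.
    """
    unknown = set(thresholds) - KNOWN_THRESHOLDS
    if unknown:
        raise ValueError(f"unknown thresholds: {sorted(unknown)}")
    args = ["gs_cgroup", "-S", class_name]
    if workload is not None:
        args += ["-G", workload]
    # Thresholds are separated by commas, e.g. "blocktime=1200,elapsedtime=2400".
    args += ["-E", ",".join(f"{key}={value}" for key, value in thresholds.items())]
    # -a terminates the job; --penalty (the default action) demotes it.
    args.append("-a" if abort else "--penalty")
    return args


abort_args = build_exception_args(
    "class", {"blocktime": 1200, "elapsedtime": 2400}, workload="wg", abort=True
)
```

The resulting list can be passed to a process launcher such as subprocess.run on a node where gs_cgroup is installed.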
Set the range of CPU cores for a Cgroup.
Set the range of cores for the class:wg Cgroup to 20% of Class cores.
gs_cgroup -S class -G wg -g 20 --fixed -u
NOTICE: The range of cores for the Class or Workload Cgroup must be specified by the --fixed parameter.
Roll back the previous step.
gs_cgroup --recover
NOTE: The --recover parameter can only roll back the latest addition, deletion, or modification made to the Class and Workload Cgroups.
View information about Cgroups that have been created.
View Cgroup information in configuration files.
gs_cgroup -p
Cgroup configuration
gs_cgroup -p
Top Group information is listed:
GID: 0 Type: Top Percent(%): 1000( 50) Name: Root Cores: 0-47
GID: 1 Type: Top Percent(%): 833( 83) Name: Gaussdb:omm Cores: 0-20
GID: 2 Type: Top Percent(%): 333( 40) Name: Backend Cores: 0-20
GID: 3 Type: Top Percent(%): 499( 60) Name: Class Cores: 0-20

Backend Group information is listed:
GID: 4 Type: BAKWD Name: DefaultBackend TopGID: 2 Percent(%): 266(80) Cores: 0-20
GID: 5 Type: BAKWD Name: Vacuum TopGID: 2 Percent(%): 66(20) Cores: 0-20

Class Group information is listed:
GID: 20 Type: CLASS Name: DefaultClass TopGID: 3 Percent(%): 166(20) MaxLevel: 1 RemPCT: 100 Cores: 0-20
GID: 21 Type: CLASS Name: class1 TopGID: 3 Percent(%): 332(40) MaxLevel: 2 RemPCT: 70 Cores: 0-20

Workload Group information is listed:
GID: 86 Type: DEFWD Name: grp1:2 ClsGID: 21 Percent(%): 99(30) WDLevel: 2 Quota(%): 30 Cores: 0-5

Timeshare Group information is listed:
GID: 724 Type: TSWD Name: Low Rate: 1
GID: 725 Type: TSWD Name: Medium Rate: 2
GID: 726 Type: TSWD Name: High Rate: 4
GID: 727 Type: TSWD Name: Rush Rate: 8

Group Exception information is listed:
GID: 20 Type: EXCEPTION Class: DefaultClass
PENALTY: QualificationTime=1800 CPUSkewPercent=30

GID: 21 Type: EXCEPTION Class: class1
PENALTY: AllCpuTime=100 QualificationTime=2400 CPUSkewPercent=90

GID: 86 Type: EXCEPTION Group: class1:grp1:2
ABORT: BlockTime=1200 ElapsedTime=2400
Table 1 lists the Cgroup configuration shown in the above example.
Table 1 Cgroup configuration
View the Cgroup tree in the OS.
gs_cgroup -P displays a Cgroup tree. In the tree, shares indicates the value of cpu.shares, which specifies the dynamic quota of CPU resources in the OS, and cpus indicates the value of cpuset.cpus, which specifies the dynamic quota of CPUSET resources in the OS (number of cores that a Cgroup can use).
gs_cgroup -P
Mount Information:
cpu:/dev/cgroup/cpu
blkio:/dev/cgroup/blkio
cpuset:/dev/cgroup/cpuset
cpuacct:/dev/cgroup/cpuacct

Group Tree Information:
- Gaussdb:wangrui (shares: 5120, cpus: 0-20, weight: 1000)
    - Backend (shares: 4096, cpus: 0-20, weight: 400)
        - Vacuum (shares: 2048, cpus: 0-20, weight: 200)
        - DefaultBackend (shares: 8192, cpus: 0-20, weight: 800)
    - Class (shares: 6144, cpus: 0-20, weight: 600)
        - class1 (shares: 4096, cpus: 0-20, weight: 400)
            - RemainWD:1 (shares: 1000, cpus: 0-20, weight: 100)
                - RemainWD:2 (shares: 7000, cpus: 0-20, weight: 700)
                    - Timeshare (shares: 1024, cpus: 0-20, weight: 500)
                        - Rush (shares: 8192, cpus: 0-20, weight: 800)
                        - High (shares: 4096, cpus: 0-20, weight: 400)
                        - Medium (shares: 2048, cpus: 0-20, weight: 200)
                        - Low (shares: 1024, cpus: 0-20, weight: 100)
                - grp1:2 (shares: 3000, cpus: 0-5, weight: 300)
            - TopWD:1 (shares: 9000, cpus: 0-20, weight: 900)
        - DefaultClass (shares: 2048, cpus: 0-20, weight: 200)
            - RemainWD:1 (shares: 1000, cpus: 0-20, weight: 100)
                - Timeshare (shares: 1024, cpus: 0-20, weight: 500)
                    - Rush (shares: 8192, cpus: 0-20, weight: 800)
                    - High (shares: 4096, cpus: 0-20, weight: 400)
                    - Medium (shares: 2048, cpus: 0-20, weight: 200)
                    - Low (shares: 1024, cpus: 0-20, weight: 100)
            - TopWD:1 (shares: 9000, cpus: 0-20, weight: 900)
Parameter Description
-a [--abort]
Terminates a job when it exceeds an exception threshold.
-b pct
Specifies the percentage of resources of the Top Backend Cgroup taken by a Backend Cgroup. The -B backendname parameter must be specified as well.
Value range: 1 to 99. If this parameter is not set, the default CPU quota is 20% for the Vacuum Cgroup and 80% for the DefaultBackend Cgroup. The quota sum for the Vacuum and DefaultBackend Cgroups must be less than 100%.
-B name
Specifies the name of a Backend Cgroup. Only the -u parameter can be used to change the resource quota for this Cgroup.
The -b percent and -B backendname parameters must both be specified to set the resource proportion of database backend threads.
Value range: a string with a maximum of 64 bytes.
-c
Creates a Cgroup and specifies its name.
A common user can specify -c and -S classname to create a Class Cgroup. If -G groupname is specified as well, a Workload Cgroup is created under the Class Cgroup. The new Workload Cgroup is placed at the bottom layer of the Class Cgroup. (Layer 4 is the bottom layer.)
-d
Deletes Cgroups.
A common user can specify the -d and -S classname parameters to delete a created Class Cgroup. If the -G groupname parameter is specified as well, the Workload Cgroup under the Class Cgroup is deleted, and its threads are moved to the DefaultClass:DefaultWD:1 Cgroup. If the Workload Cgroup to be deleted is at a higher level (level 1 is the top level), the hierarchy of the lower-level Cgroups is adjusted, the new Cgroups are created, and the related threads are loaded into them.
-E data
Specifies the exception thresholds, including blocktime, elapsedtime, allcputime, spillsize, broadcastsize, qualificationtime, and cpuskewpercent. The thresholds are separated by commas (,). The value 0 cancels a threshold setting. If the parameter is set to an invalid value, an error is reported.
Table 2 Exception threshold types
-h [--help]
Displays the command help information.
-H
Collects $GAUSSHOME information among the current users.
Value range: a string with a maximum of 1023 characters.
-f
Specifies the range of CPU cores used by the Gaussdb Cgroup. The range is written as a-b or a. For other Cgroups, use the --fixed parameter to set the core range.
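The two accepted forms of the -f value can be expanded into explicit core indices. The following Python sketch does so under the assumption that a-b is inclusive at both ends, as in the cpuset.cpus notation; the helper name parse_core_range is hypothetical and is not part of gs_cgroup.

```python
def parse_core_range(spec):
    """Expand 'a-b' or 'a' into a list of core indices.

    Hypothetical helper; the range is treated as inclusive, so '0-20'
    covers 21 cores.
    """
    first, sep, last = spec.partition("-")
    start = int(first)
    # A single value 'a' selects exactly one core.
    end = int(last) if sep else start
    if start < 0 or end < start:
        raise ValueError(f"invalid core range: {spec!r}")
    return list(range(start, end + 1))


cores = parse_core_range("0-20")
```

For the value 0-20 used in the earlier example, the sketch yields cores 0 through 20, that is, 21 cores.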
--fixed
Specifies the percentage of the cores allocated to a Cgroup's parent group that the Cgroup can use, or specifies the I/O resources.
To set the core ratio, use --fixed together with -s, -g, -t, or -b.
The ratio ranges from 0 to 100. The sum of the ratios of the Cgroups at the same level must be less than or equal to 100. The value 0 indicates that a Cgroup uses the same cores as its upper-level Cgroup. The CPU quota for all the Cgroups is set to 0 by default. -f and --fixed cannot be configured at the same time. After --fixed is set, the range set by -f automatically becomes invalid. The ratio is displayed in the -p output as the quota value.
To set the I/O resource quota, use --fixed together with -R, -r, -W, or -w.
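The constraints on --fixed ratios (each value between 0 and 100, and a same-level sum of at most 100) can be checked before any command is issued. The following Python sketch shows one way to do that; the helper name check_fixed_ratios and the sample Cgroup names are hypothetical.

```python
def check_fixed_ratios(ratios):
    """Validate the --fixed ratios of Cgroups at the same level.

    ratios maps a Cgroup name to its core ratio in percent
    (hypothetical helper; it does not call gs_cgroup).
    Returns the sum of the ratios when they are valid.
    """
    for name, pct in ratios.items():
        if not 0 <= pct <= 100:
            raise ValueError(f"{name}: ratio {pct} is outside 0-100")
    total = sum(ratios.values())
    if total > 100:
        raise ValueError(f"ratios at this level sum to {total}, above 100")
    return total


# Two sibling Workload Cgroups under the same Class Cgroup (sample names).
total = check_fixed_ratios({"grp1": 30, "grp2": 50})
```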
-g pct
Specifies the percentage of resources in a Class Cgroup taken by a Workload Cgroup. The -G groupname parameter needs to be specified as well. The -g pct parameter can be used with the -c parameter to create a Cgroup or with the -u parameter to update a Workload Cgroup.
Value range: 1 to 99. By default, the CPU quota of a Workload Cgroup is 20%. The sum of CPU quotas for all Workload Cgroups must be less than 99%.
-G name
Specifies the name of a Workload Cgroup. The -S classname parameter must be set to specify the Class Cgroup to which the Workload Cgroup belongs. The -G name parameter can be used with -c to create a Cgroup, with -d to delete a Cgroup, and with -u to update the resource quota for a Cgroup. Note that name cannot be the name of a default Timeshare Cgroup, that is, Low, Medium, High, or Rush.
If a user creates a Workload Cgroup, the name must not contain colons (:). Cgroup names must be unique.
Value range: a string with a maximum of 28 bytes.
-N [--group] name
Specifies the Cgroup name, written in the short form class:wg.
-p
Shows information about Cgroup configuration files.
-P
Shows the structure of the Cgroup tree.
--penalty
Demotes a job when the job exceeds an exception threshold. If no operation is specified, --penalty is used by default.
-r data
Updates only the upper limit on the read rate for I/O resources, that is, sets the value of blkio.throttle.read_bps_device. The parameter is a string in the format major:minor value, where major is the major device number of the disk to be accessed, minor is the minor device number, and value is the upper limit on the number of bytes read per second. The upper limit ranges from 0 to ULONG_MAX, and 0 indicates that the read rate is not restricted. This parameter must be used with the -u parameter and a Cgroup name. If both a Class Cgroup name and a Workload Cgroup name are specified, the parameter applies to the Workload Cgroup.
Value range: a string with a maximum of 32 characters.
-R data
Updates only the upper limit on the number of read operations per second for I/O resources, that is, sets the value of blkio.throttle.read_iops_device. The value has the same format as that of the -r parameter. This parameter must be used with the -u parameter and a Cgroup name. If both a Class Cgroup name and a Workload Cgroup name are specified, the parameter applies to the Workload Cgroup.
Value range: a string with a maximum of 32 characters.
--recover
Rolls back only the latest addition, deletion, or modification made to the Class and Workload Cgroups.
--revert
Restores the Cgroups to their default status.
-D mpoint
Specifies a mount point. The default mount point is /dev/cgroup/subsystem.
-m
Mounts the Cgroup.
-M
Unmounts the Cgroup.
-U
Specifies the database username.
--refresh
Updates the status of the Cgroup.
-s pct
Specifies the percentage of resources in the top Class Cgroup taken by a Class Cgroup. The **-S **classname parameter needs to be specified as well. The -s pct parameter can be used with the -c parameter to create a Cgroup or with the -u parameter to update a Class Cgroup.
Value range: 1 to 99. By default, the CPU quota of the Class Cgroup is set to 20%. In R6C10, the CPU quota of the Class Cgroup is set to 40%. During the upgrade, the quota is not updated. The sum of the CPU quota of the newly created Class Cgroup and the default DefaultClass quota must be less than 100%.
-S name
Specifies the name of a Class Cgroup. This parameter can be used with -c to create a Cgroup, with -d to delete a Cgroup, or with -u to update the resource quota for a Cgroup. The name of a sub-Class Cgroup cannot contain colons (:).
Value range: a string with a maximum of 31 bytes.
-t percent
Specifies the percentage of resources for a top Cgroup (Root, Gaussdb:omm, Backend, or Class). The -T name parameter must be specified as well. The meaning of percent depends on the Cgroup:
- Root: The name shown in the Cgroup configuration file is Root. percent is a percentage of the blkio.weight value, and its minimum value is 10%. The CPU resource quota, such as the value of cpu.shares, cannot be changed.
- Gaussdb:omm: percent is the percentage of CPU resources taken by the Gaussdb:omm Cgroup. (The cpu.shares value for the Gaussdb:omm Cgroup is derived from the quota of 1024 for the Root Cgroup, on the condition that only one database is available on the current system.) The I/O resource quota is 1000 and does not change.
- Class or Backend: percent is the percentage of resources in the Gaussdb Cgroup taken by the Class or Backend Cgroup.
Value range: 1 to 99. By default, the quota of the Class Cgroup is 60%, and the quota of the Backend Cgroup is 40%. When the quota of the Class Cgroup is modified, the quota of the Backend Cgroup is updated automatically so that the two quotas sum to 100%.
-T name
Specifies the names of top Cgroups.
Value range: a string with a maximum of 64 bytes.
-u
Updates Cgroups.
-V [--version]
Displays version information about the gs_cgroup tool.
-w data
Updates only the upper limit on the write rate (bytes written per second) for I/O resources, that is, sets the value of blkio.throttle.write_bps_device. The value has the same format as that of the -r parameter. The -u parameter and a Cgroup name must be specified as well. If both a Class Cgroup name and a Workload Cgroup name are specified, the parameter applies to the Workload Cgroup.
Value range: a string with a maximum of 32 characters.
-W data
Updates only the upper limit on the number of write operations per second for I/O resources, that is, sets the value of blkio.throttle.write_iops_device. The value has the same format as that of the -r parameter. The -u parameter and a Cgroup name must be specified as well. If both a Class Cgroup name and a Workload Cgroup name are specified, the parameter applies to the Workload Cgroup.
Value range: a string with a maximum of 32 characters.
NOTE:
Use the following method to obtain the major:minor value for a disk. For example, obtain the device number of the disk that holds the /mpp directory.
> df
Filesystem      1K-blocks      Used  Available Use% Mounted on
/dev/sda1       524173248  41012784  456534008   9% /
devtmpfs         66059264       236   66059028   1% /dev
tmpfs            66059264        88   66059176   1% /dev/shm
/dev/sdb1      2920486864 135987592 2784499272   5% /data
/dev/sdc1      2920486864  24747868 2895738996   1% /data1
/dev/sdd1      2920486864  24736704 2895750160   1% /mpp
/dev/sde1      2920486864  24750068 2895736796   1% /mpp1
> ls -l /dev/sdd
brw-rw---- 1 root disk 8, 48 Feb 26 11:20 /dev/sdd
NOTICE:
Check the device number of the disk sdd rather than the partition sdd1. Otherwise, an error will be reported. If the length of the I/O quota string after an update exceeds the maximum allowed string length, the update is not saved in the configuration file. For example, if the maximum string length is 96 and the I/O resources of more than eight disks are updated, the limit may be exceeded. In that case the update succeeds but is not saved in the configuration file.
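The major:minor pair can also be read programmatically. The following Python sketch shows two approaches: parsing an ls -l line like the one in the example, and querying the device node directly with the standard os and stat modules on a Unix-like system. Both helper names are hypothetical, and neither calls gs_cgroup.

```python
import os
import re
import stat


def major_minor_from_ls(line):
    """Extract 'major:minor' from an `ls -l` line for a block device."""
    # In ls -l output for device nodes, "major, minor" replaces the file size.
    match = re.search(r"(\d+),\s*(\d+)", line)
    if not line.startswith("b") or match is None:
        raise ValueError("not an ls -l line for a block device")
    return f"{match.group(1)}:{match.group(2)}"


def major_minor_from_path(path):
    """Read 'major:minor' from a block device node such as /dev/sdd."""
    info = os.stat(path)
    if not stat.S_ISBLK(info.st_mode):
        raise ValueError(f"{path} is not a block device")
    return f"{os.major(info.st_rdev)}:{os.minor(info.st_rdev)}"


device_id = major_minor_from_ls(
    "brw-rw---- 1 root disk 8, 48 Feb 26 11:20 /dev/sdd"
)
```

For the ls -l line in the example, the parser returns 8:48, which is the major:minor prefix expected by the -r, -R, -w, and -W parameters.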