Usage Guide

The inspection functions are invoked through the inspection APIs.

  • Intelligent inspection API

    Example:

    curl -X 'POST' "http://127.0.0.1:8080/v1/api/app/real-time-inspection?inspection_type=real_time_check&start_time=1689210000000&end_time=1689296400000&instance=127.0.0.1:5432" -H 'accept: application/json' -H 'Content-Type: application/json' -d '{"system_resource": ["os_mem_usage"], "instance_status": [], "database_resource": [], "database_performance": [], "diagnosis_optimization": []}' -H "Authorization: bearer xxx"
    

    If HTTPS is used, the query example is:

    curl -X 'POST' "https://127.0.0.1:8080/v1/api/app/real-time-inspection?inspection_type=real_time_check&start_time=1689210000000&end_time=1689296400000&instance=127.0.0.1:5432" -H 'accept: application/json' -H 'Content-Type: application/json' -d '{"system_resource": ["os_mem_usage"], "instance_status": [], "database_resource": [], "database_performance": [], "diagnosis_optimization": []}' -H "Authorization: bearer xxx" --cacert xx.crt --key xx.key --cert xx.crt
    

    If custom thresholds are used, the query example is:

    curl -X 'POST' "https://127.0.0.1:8080/v1/api/app/real-time-inspection?inspection_type=real_time_check&start_time=1689210000000&end_time=1689296400000&instance=127.0.0.1:5432" -H 'accept: application/json' -H 'Content-Type: application/json' -d '{"system_resource": [{"os_mem_usage": {"increase": false, "threshold": [], "forecast": [1440, 0.0, 0.8]}}], "instance_status": [], "database_resource": [], "database_performance": [], "diagnosis_optimization": []}' -H "Authorization: bearer xxx" --cacert xx.crt --key xx.key --cert xx.crt
    

    An example of the response structure:

    {
     "data": {
      "conclusion": {
       "full_score": 0.06,
       "health_score": 0.06,
       "health_status": "bad",
       "top3": []
      },
      "database_performance": {},
      "database_resource": {},
      "diagnosis_optimization": {},
      "instance_status": {},
      "system_resource": {
       "os_mem_usage": {
        "127.0.0.1": {
         "data": [0.31643905373281667],
         "statistic": {
          "avg": 0.311,
          "max": 0.3166,
          "min": 0.3057,
          "the_95th": 0.3153
         },
         "timestamps": [1694674713000],
         "warnings": {
          "increase_warning": true
         }
        }
       }
      }
     },
     "success": true
    }
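
    The response above can be scanned for active warnings on the client side. The following is a minimal Python sketch, not part of DBMind: the function name collect_warnings is hypothetical, and it assumes the single-level, node-keyed shape shown in the example (category, then inspection item, then node). Items with any other shape are skipped rather than interpreted.

```python
def collect_warnings(response):
    """Return (category, item, node, warning) tuples for warnings set to true.

    Assumption: each inspection item maps node -> result, and each result may
    carry a "warnings" object, as in the example response. Other shapes are
    skipped.
    """
    found = []
    for category, items in response.get("data", {}).items():
        # "conclusion" holds the overall score, not per-item results.
        if category == "conclusion" or not isinstance(items, dict):
            continue
        for item, per_node in items.items():
            if not isinstance(per_node, dict):
                continue
            for node, result in per_node.items():
                if not isinstance(result, dict):
                    continue
                warnings = result.get("warnings", {})
                if not isinstance(warnings, dict):
                    continue
                for name, active in warnings.items():
                    if active is True:
                        found.append((category, item, node, name))
    return found


# Trimmed copy of the example response shown above.
sample = {
    "data": {
        "conclusion": {"health_score": 0.06, "health_status": "bad", "top3": []},
        "instance_status": {},
        "system_resource": {
            "os_mem_usage": {
                "127.0.0.1": {
                    "data": [0.31643905373281667],
                    "timestamps": [1694674713000],
                    "warnings": {"increase_warning": True},
                }
            }
        },
    },
    "success": True,
}

for entry in collect_warnings(sample):
    print(entry)
```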
    
  • Example of the API that lists inspection tasks:

    curl -X 'GET' "http://127.0.0.1:8080/v1/api/app/real-time-inspection/list?instance=127.0.0.1:5432" -H 'accept: application/json' -H "Authorization: bearer xxx"
    

    If HTTPS is used, the query example is:

    curl -X 'GET' "https://127.0.0.1:8080/v1/api/app/real-time-inspection/list?instance=127.0.0.1:5432" -H 'accept: application/json' -H "Authorization: bearer xxx" --cacert xx.crt --key xx.key --cert xx.crt
    

    The response structure is as follows:

    {"data":{"header":["instance","start","end","id","state","cost_time","inspection_type"],"rows":[["127.0.0.1:5432",1689210000000,1689296400000,5,"success",0.033701,"real_time_check"]]},"success":true}
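
    The task list is returned as a header array plus row arrays. As an illustration only, the rows can be turned into one dictionary per task in Python. The helper name rows_to_records is hypothetical, and the sketch assumes every row has the same length and order as the header, as in the example.

```python
import json

# The example response shown above, copied verbatim.
raw = (
    '{"data":{"header":["instance","start","end","id","state","cost_time",'
    '"inspection_type"],"rows":[["127.0.0.1:5432",1689210000000,1689296400000,'
    '5,"success",0.033701,"real_time_check"]]},"success":true}'
)


def rows_to_records(response):
    """Pair each row with the header names to get one dict per task."""
    data = response["data"]
    return [dict(zip(data["header"], row)) for row in data["rows"]]


tasks = rows_to_records(json.loads(raw))
for task in tasks:
    print(task["id"], task["instance"], task["state"])
```

    The id field is the value passed as spec_id to the result and delete APIs in the examples in this guide.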
    
  • Example of the API that retrieves the result of a specified inspection task:

    curl -X 'GET' "http://127.0.0.1:8080/v1/api/summary/real-time-inspection?spec_id=5&instance=127.0.0.1:5432" -H 'accept: application/json' -H "Authorization: bearer xxx"
    

    If HTTPS is used, the query example is:

    curl -X 'GET' "https://127.0.0.1:8080/v1/api/summary/real-time-inspection?spec_id=5&instance=127.0.0.1:5432" -H 'accept: application/json' -H "Authorization: bearer xxx" --cacert xx.crt --key xx.key --cert xx.crt
    

    The response structure is the same as that of the intelligent inspection API.

  • Example of the API that deletes a specified inspection task:

    curl -X 'DELETE' "http://127.0.0.1:8080/v1/api/app/real-time-inspection?spec_id=5&instance=127.0.0.1:5432" -H 'accept: application/json' -H "Authorization: bearer xxx"
    

    If HTTPS is used, the delete example is:

    curl -X 'DELETE' "https://127.0.0.1:8080/v1/api/app/real-time-inspection?spec_id=5&instance=127.0.0.1:5432" -H 'accept: application/json' -H "Authorization: bearer xxx" --cacert xx.crt --key xx.key --cert xx.crt
    

    The response structure is as follows:

    {"data":{"success":true},"success":true}
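
The query strings in the curl examples above can also be assembled programmatically. The Python sketch below rebuilds the intelligent inspection URL. It rests on one assumption that the examples suggest but do not state: start_time and end_time are Unix timestamps in milliseconds (the 13-digit sample values correspond to a 24-hour window in UTC). The helper names to_epoch_ms and build_inspection_url are hypothetical.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def to_epoch_ms(moment):
    """Convert a timezone-aware datetime to integer milliseconds since the epoch."""
    return int(moment.timestamp() * 1000)


def build_inspection_url(base, instance, start, end,
                         inspection_type="real_time_check"):
    """Build the real-time-inspection URL used in the curl examples.

    Assumption: start_time and end_time are epoch milliseconds.
    """
    query = urlencode(
        {
            "inspection_type": inspection_type,
            "start_time": to_epoch_ms(start),
            "end_time": to_epoch_ms(end),
            "instance": instance,
        },
        safe=":",  # keep "127.0.0.1:5432" readable, as in the curl examples
    )
    return f"{base}/v1/api/app/real-time-inspection?{query}"


url = build_inspection_url(
    "http://127.0.0.1:8080",
    "127.0.0.1:5432",
    datetime(2023, 7, 13, 1, 0, tzinfo=timezone.utc),
    datetime(2023, 7, 14, 1, 0, tzinfo=timezone.utc),
)
print(url)
```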
    

Custom Threshold Parameters

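  • Custom thresholds are passed in the request body. Table 1 lists the key for each alarm type:

    Table 1 Alarm types and their keys

    Alarm type                      Key
    Continuous increase             increase
    Threshold exceeded              threshold
    Forecast to exceed threshold    forecast
    Incorrect file type             ftype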

    • Enable the default alarm configuration (versions earlier than 505.1.0)

      {
        "system_resource": [
          "os_mem_usage"
        ],
        "instance_status": [],
        "database_resource": [],
        "database_performance": [],
        "diagnosis_optimization": []
      }
      
    • Enable the default alarm configuration (version 505.1.0 and later)

      {
        "system_resource": [
          {
            "os_mem_usage": true
          }
        ],
        "instance_status": [],
        "database_resource": [],
        "database_performance": [],
        "diagnosis_optimization": []
      }
      
    • Disable alarms

      {
        "system_resource": [
          {
            "os_mem_usage": false
          }
        ],
        "instance_status": [],
        "database_resource": [],
        "database_performance": [],
        "diagnosis_optimization": []
      }
      
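    • Enable custom alarms. In the following example, os_mem_usage enables the continuous-increase alarm, a custom threshold alarm, and the forecast-threshold alarm. os_disk_usage enables only the forecast-threshold alarm; its continuous-increase and threshold alarms are not enabled.

      {
        "system_resource": [
          {
            "os_mem_usage": {
              "increase": true,
              "threshold": [0.0, 0.8],
              "forecast": [1440, 0.0, 0.8]
            },
            "os_disk_usage": {
              "forecast": [1440, 0.0, 0.8]
            }
          }
        ],
        "instance_status": [],
        "database_resource": [],
        "database_performance": [],
        "diagnosis_optimization": []
      }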
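
      A request body like the one above can be generated rather than written by hand. The Python sketch below is an illustration only: build_body and CATEGORIES are hypothetical names, the five category keys are taken from the examples in this guide, and the sketch follows the 505.1.0-and-later layout in which each non-empty category is a list holding one object keyed by inspection item.

```python
import json

# The five top-level categories that appear in every example body.
CATEGORIES = (
    "system_resource",
    "instance_status",
    "database_resource",
    "database_performance",
    "diagnosis_optimization",
)


def build_body(**items_by_category):
    """Build a request body; categories that are not given become empty lists."""
    unknown = set(items_by_category) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    body = {}
    for category in CATEGORIES:
        items = items_by_category.get(category)
        # Each non-empty category is a list with one object keyed by item name.
        body[category] = [items] if items else []
    return body


body = build_body(
    system_resource={
        "os_mem_usage": {
            "increase": True,
            "threshold": [0.0, 0.8],
            "forecast": [1440, 0.0, 0.8],
        },
        "os_disk_usage": {"forecast": [1440, 0.0, 0.8]},
    }
)
print(json.dumps(body, indent=2))
```

      The printed JSON can be passed to curl with -d, as in the custom-threshold query example.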
      
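  • Some inspection items do not support custom thresholds: component errors, log error check, database Top Query, long transactions, oldestXmin not advancing for a long time, core dump, and GUC parameters.

  • For the threshold types that each inspection item supports, see the custom threshold table in the overview. Passing an unsupported alarm type from the front end results in an error.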

  • Some inspection items have sub-items, as shown in Table 2:

    Table 2 Inspection items and sub-items

    Inspection item      Sub-items
    os_cpu_usage         cpu_user, cpu_iowait
    user_login_out       login, logout
    db_latency           p95, p80
    db_transaction       commit, rollback
    db_exec_statement    select, update, insert, delete
    db_tps               tps, qps
    dynamic_memory       dynamic_used_memory, dynamic_used_shrctx

    When all sub-items share the same alarm configuration, the repeated values can be omitted, as follows:

    • Custom thresholds for os_cpu_usage, repeated for each sub-item

      {
        "system_resource": [
          {
            "os_cpu_usage": {
              "cpu_user": {
                "increase": false,
                "threshold": [],
                "forecast": [1440, 0.0, 0.8]
              },
              "cpu_iowait": {
                "increase": false,
                "threshold": [],
                "forecast": [1440, 0.0, 0.8]
              }
            }
          }
        ],
        "instance_status": [],
        "database_resource": [],
        "database_performance": [],
        "diagnosis_optimization": []
      }
      
    • Custom thresholds for os_cpu_usage, simplified form

      {
        "system_resource": [
          {
            "os_cpu_usage": {
              "increase": false,
              "threshold": [],
              "forecast": [1440, 0.0, 0.8]
            }
          }
        ],
        "instance_status": [],
        "database_resource": [],
        "database_performance": [],
        "diagnosis_optimization": []
      }
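
      The equivalence between the two forms above can be made concrete with a small client-side sketch in Python. It does not describe how DBMind processes the request internally. SUB_ITEMS mirrors Table 2, ALARM_KEYS mirrors Table 1, and the function name expand_shorthand is hypothetical.

```python
# Sub-items per inspection item, copied from Table 2.
SUB_ITEMS = {
    "os_cpu_usage": ["cpu_user", "cpu_iowait"],
    "user_login_out": ["login", "logout"],
    "db_latency": ["p95", "p80"],
    "db_transaction": ["commit", "rollback"],
    "db_exec_statement": ["select", "update", "insert", "delete"],
    "db_tps": ["tps", "qps"],
    "dynamic_memory": ["dynamic_used_memory", "dynamic_used_shrctx"],
}

# Alarm-type keys, copied from Table 1.
ALARM_KEYS = {"increase", "threshold", "forecast", "ftype"}


def expand_shorthand(item, config):
    """Expand a simplified config into the per-sub-item form.

    A config is treated as simplified when the item has sub-items and every
    key in the config is an alarm-type key. Anything else is returned as is.
    """
    subs = SUB_ITEMS.get(item)
    if subs is None or not isinstance(config, dict):
        return config
    if set(config) <= ALARM_KEYS:
        # Give each sub-item its own copy of the shared settings.
        return {sub: dict(config) for sub in subs}
    return config


simplified = {"increase": False, "threshold": [], "forecast": [1440, 0.0, 0.8]}
expanded = expand_shorthand("os_cpu_usage", simplified)
print(expanded)
```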
      

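Examples of Inspection Item Return Values

  • Database instance

    Take a deployment with one primary DN and two standby DNs (three nodes in total) as an example:

    Primary DN: 127.0.0.1:19996

    Standby DNs: 127.0.0.2:19996, 127.0.0.3:19996

  • Return structure types

    • ①: keyed by node: {"127.0.0.1": xxx, "127.0.0.2": xxx, "127.0.0.3": xxx}
    • ②: keyed by DN: {"127.0.0.1:19996": xxx, "127.0.0.2:19996": xxx, "127.0.0.3:19996": xxx}
    • ③: keyed by DB: {"db1": xxx, "db2": xxx, "db3": xxx}
    • ④: a list: [{xxx}]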
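
When handling results generically, a client may want to tell the four shapes above apart. The Python sketch below is a heuristic under stated assumptions, not DBMind behavior: it assumes node keys are bare IP addresses, DN keys are IP:port pairs, and any other dictionary is keyed by database name. The function name classify_structure is hypothetical.

```python
import ipaddress


def _is_ip(text):
    """True if text parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(text)
        return True
    except ValueError:
        return False


def _is_dn(text):
    """True if text looks like "<ip>:<port>"."""
    host, sep, port = text.rpartition(":")
    return bool(sep) and port.isdigit() and _is_ip(host)


def classify_structure(result):
    """Guess which of the four return structure types a result uses."""
    if isinstance(result, list):
        return "list"
    if not isinstance(result, dict) or not result:
        return "unknown"
    keys = list(result)
    if all(_is_ip(key) for key in keys):
        return "node"
    if all(_is_dn(key) for key in keys):
        return "dn"
    # Assumption: any remaining dictionary is keyed by database name.
    return "db"


print(classify_structure({"127.0.0.1": 1, "127.0.0.2": 2}))
print(classify_structure({"127.0.0.1:19996": 1}))
print(classify_structure({"db1": 1, "db2": 2}))
print(classify_structure([{"x": 1}]))
```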

Table 3 lists the return structure of each inspection item.

Table 3 Return structure of inspection items

Inspection item        Return structure    Remarks
os_cpu_usage                               Two-level structure; sub-items: cpu_user, cpu_iowait.
os_disk_ioutils                            -
os_disk_usage                              -
os_mem_usage                               -
network_packet_loss                        Two-level structure; shows the node-to-node network status.
component_error                            -
data_directory                             -
log_directory                              -
db_size                                    -
buffer_hit_rate                            -
user_login_out                             Two-level structure; sub-items: login, logout.
active_session_rate                        -
log_error_check                            -
thread_pool                                -
db_latency                                 Two-level structure; sub-items: p80, p95.
db_transaction                             Two-level structure; sub-items: commit, rollback.
db_tmp_file                                -
db_exec_statement                          Two-level structure; sub-items: select, update, insert, delete.
db_deadlock                                -
db_tps                                     Two-level structure; sub-items: qps, tps.
db_top_query                               -
long_transaction                           -
xmin_stuck                                 -
xlog_accumulate                            -
core_dump                                  -
dynamic_memory                             Two-level structure; sub-items: dynamic_used_memory, dynamic_used_shrctx.
process_memory                             -
other_memory                               -
guc_params                                 -

Note: Inspection results are stored in the DBMind metadatabase. DBMind periodically purges old data to prevent disk bloat.

    openGauss 2025-06-07 22:42:34