Percona Monitoring and Management / PMM-5739

New MongoDB exporter metrics renamer

Details

    • Type: New Feature
    • Status: Done
    • Priority: Medium
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: 2.10.0
    • Component/s: MongoDB_Exporter
    • Labels: None
    • Story Points: 2
    • Sprint: Platform Sprint 17, Platform Sprint 18, Platform Sprint 19
    • Yes
    • No
    • No
    • No

    Description

      Metrics renamer implementation according to PMM-5556.

      In detail, it should follow the renaming rules (regexes) in the code example below:

      var renamingRules = [
        { exp: /^serverStatus/, repl: "ss" },
        { exp: /^ss\.wiredTiger/, repl: "ss_wt" },
        { exp: /^ss_wt\.transaction/, repl: "ss_wt_txn" },
        { exp: /^replSetGetStatus/, repl: "rs" },
        { exp: /^systemMetrics/, repl: "sys" },
        { exp: /^local\.oplog\.rs\.stats/, repl: "oplog_stats" },
        { exp: /^oplog_stats\.wiredTiger/, repl: "oplog_stats_wt" },
        { exp: /^collStats\.latencyStats/, repl: "collstats_latency" },
        { exp: /^collStats\.storageStats/, repl: "collstats_storage" },
        { exp: /^collstats_storage\.wiredTiger/, repl: "collstats_storage_wt" },
        { exp: /^collstats_storage\.indexDetails/, repl: "collstats_storage_idx" },
        { exp: /[^a-zA-Z0-9_]+/g, repl: "_" },
        { exp: /_$/g, repl: "" }
      ];

      For example, all metrics starting with serverStatus.<something> should be renamed to ss.<something>.
      The way to test it is to manually run the metrics-gathering command (collStats, or whichever command we are testing), get the list of metrics, and search for the equivalent in the metrics returned by the agent. There are also unit tests, so we can catch metric names that escape the rules.
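
      For illustration only, here is a minimal Go sketch of how a renamer could apply the rules above; the rule type, the renamingRules slice, and the renameMetric helper are hypothetical names for this example, not the exporter's actual code:

      package main

      import (
          "fmt"
          "regexp"
      )

      // rule pairs a pattern with its replacement, mirroring the
      // { exp: ..., repl: ... } entries listed above.
      type rule struct {
          exp  *regexp.Regexp
          repl string
      }

      // renamingRules is applied in order, so a prefix shortened by an earlier
      // rule (serverStatus -> ss) can be matched by a later one (ss.wiredTiger -> ss_wt).
      var renamingRules = []rule{
          {regexp.MustCompile(`^serverStatus`), "ss"},
          {regexp.MustCompile(`^ss\.wiredTiger`), "ss_wt"},
          {regexp.MustCompile(`^ss_wt\.transaction`), "ss_wt_txn"},
          {regexp.MustCompile(`^replSetGetStatus`), "rs"},
          {regexp.MustCompile(`^systemMetrics`), "sys"},
          {regexp.MustCompile(`^local\.oplog\.rs\.stats`), "oplog_stats"},
          {regexp.MustCompile(`^oplog_stats\.wiredTiger`), "oplog_stats_wt"},
          {regexp.MustCompile(`^collStats\.latencyStats`), "collstats_latency"},
          {regexp.MustCompile(`^collStats\.storageStats`), "collstats_storage"},
          {regexp.MustCompile(`^collstats_storage\.wiredTiger`), "collstats_storage_wt"},
          {regexp.MustCompile(`^collstats_storage\.indexDetails`), "collstats_storage_idx"},
          {regexp.MustCompile(`[^a-zA-Z0-9_]+`), "_"},
          {regexp.MustCompile(`_$`), ""},
      }

      // renameMetric (hypothetical helper) turns a dotted document path into a
      // Prometheus-friendly metric name by applying every rule in order.
      func renameMetric(name string) string {
          for _, r := range renamingRules {
              name = r.exp.ReplaceAllString(name, r.repl)
          }
          return name
      }

      func main() {
          fmt.Println(renameMetric("serverStatus.wiredTiger.transaction.begins"))
          // prints: ss_wt_txn_begins
      }

      Note that the rule order matters: serverStatus must already have been shortened to ss before the ss.wiredTiger and ss_wt.transaction rules can match.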

      Since there is no way to know in advance whether a metric is a gauge or a counter, we can only group metrics (converting their names into labels) if they are in this list:

      var jsNodeToPDMetrics = {
        "replSetGetStatus.members":  {"promDimension": "member_idx"},
        "systemMetrics.disks":       {"promDimension": "device_name"},
        "collStats.storageStats.indexDetails": {"promDimension": "index_name"},
       
        "serverStatus.asserts": {"promDimension": "assert_type"},
        "serverStatus.connections": {"promDimension": "conn_type"},
        "serverStatus.connections": {"promDimension": "conn_type"},
        "serverStatus.globalLock.currentQueue": {"promDimension": "count_type"},
        "globalLock.activeQueue":  {"promDimension": "count_type"},
        "globalLock.locks": {"promDimension": "lock_type"},
        /*"globalLock.locks.<LOCK_TYPE>.acquireCount":     {"promDimension": "lock_mode"},*/
        /*"globalLock.locks.<LOCK_TYPE>.acquireWaitCount": {"promDimension": "lock_mode"},*/
        /*"globalLock.locks.<LOCK_TYPE>.deadlockCount":    {"promDimension": "lock_mode"},*/
        /*"globalLock.locks.<LOCK_TYPE>.timeAcquiringMicros": {"promDimension": "lock_mode"},*/
        "serverStatus.opLatencies":          {"promDimension": "op_type"},
        "serverStatus.opReadConcernCounters":  {"promDimension": "concern_type"},
        /*Following needs to be tested once reportOpWriteConcernCountersInServerStatus*/
        /*  parameter is set*/
        /*"serverStatus.opWriteConcernCounters": {"promDimension": "cmd_type"},*/
        "serverStatus.opcounters":          {"promDimension": "legacy_op_type"},
        "serverStatus.opcountersRepl":          {"promDimension": "legacy_op_type"},
        "serverStatus.transactions.commitTypes": {"promDimension": "commit_type"},
        "serverStatus.wiredTiger.perf": {"promDimension": "perf_bucket"},
        "serverStatus.wiredTiger.concurrentTransactions": {"promDimension": "txn_rw_type"},
        "serverStatus.metrics.cursor.open": {"promDimension": "csr_type"},
        "serverStatus.metrics.document": {"promDimension": "doc_op_type"},
        "serverStatus.metrics.commands":  {"promDimension": "cmd_name"}
      } 

      For example, take these metrics:

              "serverStatus" : {
                  "start" : ISODate("2020-04-21T00:29:18Z"),
                  "host" : "karl-OMEN:17001",
                  "version" : "4.0.17-10",
                  "process" : "mongod",
                  "pid" : NumberLong(10),
                  "uptime" : 34,
                  "uptimeMillis" : NumberLong(34036),
                  "uptimeEstimate" : NumberLong(34),
                  "localTime" : ISODate("2020-04-21T00:29:18Z"),
                  "asserts" : {
                      "regular" : 0,
                      "warning" : 0,
                      "msg" : 0,
                      "user" : 40,
                      "rollovers" : 0
                  }, 

      Following the previous set of rules, the metrics under asserts will become:
      ss_asserts value: 0 <tag> assert_type=regular
      ss_asserts value: 40 <tag> assert_type=user
      All other metrics shouldn't be grouped using labels.
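
      As a rough sketch of the grouping step (again only an illustration; nodeToLabel, labeledMetric, and groupAsLabels are hypothetical names, not the exporter's actual API), the lookup table above could be used like this to turn the asserts sub-document into labeled samples:

      package main

      import "fmt"

      // nodeToLabel maps a document path to the Prometheus label its child keys
      // should become (a subset of jsNodeToPDMetrics above, for illustration).
      var nodeToLabel = map[string]string{
          "serverStatus.asserts":     "assert_type",
          "serverStatus.connections": "conn_type",
          "serverStatus.opcounters":  "legacy_op_type",
      }

      // labeledMetric is a hypothetical representation of one exported sample.
      type labeledMetric struct {
          name       string
          labelName  string
          labelValue string
          value      float64
      }

      // groupAsLabels converts the children of a sub-document into one metric with
      // a label per child, but only if the path is in the lookup table; otherwise
      // it reports that no grouping applies and each child stays a separate metric.
      func groupAsLabels(path, metricName string, children map[string]float64) ([]labeledMetric, bool) {
          label, ok := nodeToLabel[path]
          if !ok {
              return nil, false
          }
          var out []labeledMetric
          for key, val := range children {
              out = append(out, labeledMetric{metricName, label, key, val})
          }
          return out, true
      }

      func main() {
          asserts := map[string]float64{"regular": 0, "warning": 0, "msg": 0, "user": 40, "rollovers": 0}
          metrics, _ := groupAsLabels("serverStatus.asserts", "ss_asserts", asserts)
          for _, m := range metrics {
              fmt.Printf("%s{%s=%q} %v\n", m.name, m.labelName, m.labelValue, m.value)
          }
          // e.g. ss_asserts{assert_type="user"} 40
      }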

    People

      Assignee: Unassigned
      Reporter: carlos.salguero (Carlos Salguero)
      Votes: 0
      Watchers: 2


    Time Tracking

      Original Estimate: Not Specified
      Remaining Estimate: Not Specified
      Time Spent: 1 day, 5 hours (1d 5h)
