Percona Monitoring and Management / PMM-2623

High memory usage by mysqld_exporter


      Description

      While investigating the cause of missing events in a test environment, the mysqld_exporter that runs via supervisor inside the PMM server was found to be using approximately half of the available RAM, and its usage appeared to still be increasing.

      When first checked, the RSS was 1.410g; shortly afterwards it had grown to 1.516g:

      12087 root      20   0 2694388 1.410g   2248 D   0.0 38.1 482:29.80 mysqld_exporter    
      12087 root      20   0 2694388 1.516g   2160 D  17.5 41.0 482:58.22 mysqld_exporter
      
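      For reference, a roughly equivalent spot check of the exporter's memory can be made against its PID with standard tooling (12087 is the PID observed above and will differ per environment):

      # resident/virtual size of the exporter process
      ps -o pid,user,vsz,rss,pmem,cmd -p 12087
      # or a single batch-mode sample from top
      top -b -n 1 -p 12087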

      Consequently, the server had started to swap heavily:

      # vmstat 2 10
      procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
       r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
       2 10 2000032 153336    280 628896    8    5    50   156    5    6 30  8 61  0  0
      10  9 2001948 153552    280 632032  846 1392 10660  1540 3301 4234  5  7  0 87  0
       6  5 2003164 151420    280 634412 2732 1288 18520  1897 5776 8595 18 16  0 65  0
       7  5 2009972 150428    280 641348 6294 4774 22694  5053 8326 10030 32 23  0 45  0
       0  6 2009644 151908    280 639464 11210 2922 21734  3360 6171 7696 35 14  0 50  0
       0 47 2015832 133240    280 638376 4872 3766 11942  4444 4899 5823 46 20  0 34  0
      48  5 2028900 178924    280 642844 5956 8530 11668  8637 6038 5085 51 36  0 12  0
       6  5 2029032 217096    280 637196 18846 4498 27198  5646 9311 9408 60 22  0 18  0
       9  4 2030948 212364    280 637888 6110 2510  7766  2922 5445 7483 39 12  0 49  0
       0 11 2027592 193980    280 629208 25810 3496 29546  3865 8181 8808 37 12  0 51  0

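      To attribute swap usage to the exporter itself rather than the host as a whole, the per-process figure can be read from /proc (a sketch, again assuming PID 12087):

      # swap currently used by the exporter process
      grep VmSwap /proc/12087/status
      # or summed across its mappings
      awk '/^Swap:/ {sum += $2} END {print sum " kB"}' /proc/12087/smaps
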
      The only notable event during the period matching the gap in events was that the RDS node had become inaccessible:

      First notification: Thu Jun 14 05:21:34 UTC 2018
      Back online:        Thu Jun 14 08:51:00 UTC 2018
      

      During the time that the RDS node was down, the mysqld_exporter log had started filling with i/o timeout entries, e.g.:

      time="2018-06-14T08:50:49Z" level=error msg="Error pinging mysqld: dial tcp: lookup xxx.rds.amazonaws.com on x.x.x.x:53: read udp y.y.y.y:43630->x.x.x.x:53: i/o timeout" source="exporter.go:120"
      
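      Assuming supervisor is capturing the exporter's stderr, the extent of the flood can be gauged directly through supervisorctl, e.g.:

      # show the captured stderr for the exporter program
      supervisorctl tail pmm-mysqld_exporter-10000 stderr
      # count the i/o timeout entries in that output
      supervisorctl tail pmm-mysqld_exporter-10000 stderr | grep -c 'i/o timeout'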

      The exporter service was stopped manually via supervisorctl stop pmm-mysqld_exporter-10000 to clear the issue.

      Info ahead of restart:

      CMD
      /usr/local/percona/pmm-client/mysqld_exporter -collect.auto_increment.columns -collect.binlog_size -collect.global_status -collect.global_variables -collect.info_schema.innodb_metrics -collect.info_schema.processlist -collect.info_schema.query_response_time -collect.info_schema.tables -collect.info_schema.tablestats -collect.info_schema.userstats -collect.perf_schema.eventswaits -collect.perf_schema.file_events -collect.perf_schema.indexiowaits -collect.perf_schema.tableiowaits -collect.perf_schema.tablelocks -collect.slave_status -web.listen-address=127.0.0.1:10000
      
      # supervisorctl status pmm-mysqld_exporter-10000 
      pmm-mysqld_exporter-10000        RUNNING   pid 209, uptime 9 days, 17:12:50
      
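      For completeness, clearing the condition amounts to bouncing the exporter through supervisor, roughly:

      # stop the leaking instance, start it again, then confirm the new PID and reset uptime
      supervisorctl stop pmm-mysqld_exporter-10000
      supervisorctl start pmm-mysqld_exporter-10000
      supervisorctl status pmm-mysqld_exporter-10000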

      Growing to 1.5 GB of resident memory over 9 days of uptime is excessive.

              People

              Assignee: lalit.choudhary (Lalit Choudhary)
              Reporter: ceri.williams (Ceri Williams)