Uploaded image for project: 'Percona Monitoring and Management'
  1. Percona Monitoring and Management
  2. PMM-10968

High CPU usage for MongoDB nodes with PMM-Server 2.31.0

Details

    Description

      During the release testing for 2.32.0, while upgrading our performance testing instance from 2.30.0 to 2.31.0 we observed many client nodes which were being used for MONGODB monitoring started showing High CPU usage and mongoDB exporter metrics got stopped from being collected. 

      STR:

      1) Setup pmm2-client 2.30.0 with mongoDB instance and connect to PMM-Server 2.30.0

      2) We used mongoDB docker container from percona, with username/password based authentiation enabled. 
      3) PMM-Server was upgraded to 2.31.0 first and we waited to observe any increase in CPU/Memory consumption on any of the client nodes

      4) most of the MONGODB instances started reporting high CPU usage. 

      Remember client was still on 2.30.0 so pmm-agent and pmm-admin was 2.30.0 while pmm-server was 2.31.0

      See error attachements: 

       

      Error message on pmm-agent log: 

      WARN[2022-10-27T10:39:24.620+00:00] time="2022-10-27T10:39:24Z" level=warning msg="cannot load topology labels: cannot check if the instance is an arbiter: timed out while checking out a connection from connection pool: context canceled; maxPoolSize: 100, connections in use by cursors: 0, connections in use by transactions: 0, connections in use by other operations: 0"  agentID=/agent_id/95a66dd5-5dce-426e-a51e-ad332cca538e component=agent-process type=mongodb_exporter
      ERRO[2022-10-27T10:39:24.620+00:00] time="2022-10-27T10:39:24Z" level=error msg="Registry - Cannot get node type to check if this is a mongos : timed out while checking out a connection from connection pool: context canceled; maxPoolSize: 100, connections in use by cursors: 0, connections in use by transactions: 0, connections in use by other operations: 0"  agentID=/agent_id/95a66dd5-5dce-426e-a51e-ad332cca538e component=agent-process type=mongodb_exporter
      INFO[2022-10-27T10:39:29.621+00:00] 2022-10-27T10:39:29.620Z    error    VictoriaMetrics/lib/promscrape/scrapework.go:355    error when scraping "http://127.0.0.1:42002/metrics" from job "mongodb_exporter_agent_id_95a66dd5-5dce-426e-a51e-ad332cca538e_hr" with labels {agent_id="/agent_id/95a66dd5-5dce-426e-a51e-ad332cca538e",agent_type="mongodb_exporter",environment="load_testing_10",instance="/agent_id/95a66dd5-5dce-426e-a51e-ad332cca538e",job="mongodb_exporter_agent_id_95a66dd5-5dce-426e-a51e-ad332cca538e_hr",machine_id="/machine_id/4c43fc40ab0043c4a55a9ba1e8882580",node_id="/node_id/422fc999-4340-4010-94e7-e025cf868a31",node_name="client_container_6802",node_type="container",service_id="/service_id/b76995fd-2ea5-4433-a9e4-d8b2757a010d",service_name="mongodb_39823809__client_container_50.116.57.109_10",service_type="mongodb"}: error when scraping "http://127.0.0.1:42002/metrics" with timeout 4.5s: timeout  agentID=/agent_id/2f2706f5-bed0-4bc9-9333-6589e8b45ad9 component=agent-process type=vm_agent
      WARN[2022-10-27T10:39:29.621+00:00] time="2022-10-27T10:39:29Z" level=warning msg="cannot load topology labels: cannot check if the instance is an arbiter: timed out while checking out a connection from connection pool: context canceled; maxPoolSize: 100, connections in use by cursors: 0, connections in use by transactions: 0, connections in use by other operations: 0"  agentID=/agent_id/95a66dd5-5dce-426e-a51e-ad332cca538e component=agent-process type=mongodb_exporter
      ERRO[2022-10-27T10:39:29.621+00:00] time="2022-10-27T10:39:29Z" level=error msg="Registry - Cannot get node type to check if this is a mongos : timed out while checking out a connection from connection pool: context canceled; maxPoolSize: 100, connections in use by cursors: 0, connections in use by transactions: 0, connections in use by other operations: 0"  agentID=/agent_id/95a66dd5-5dce-426e-a51e-ad332cca538e component=agent-process type=mongodb_exporter
      INFO[2022-10-27T10:39:34.619+00:00] 2022-10-27T10:39:34.618Z    error    VictoriaMetrics/lib/promscrape/scrapework.go:355    error when scraping "http://127.0.0.1:42002/metrics" from job "mongodb_exporter_agent_id_95a66dd5-5dce-426e-a51e-ad332cca538e_hr" with labels {agent_id="/agent_id/95a66dd5-5dce-426e-a51e-ad332cca538e",agent_type="mongodb_exporter",environment="load_testing_10",instance="/agent_id/95a66dd5-5dce-426e-a51e-ad332cca538e",job="mongodb_exporter_agent_id_95a66dd5-5dce-426e-a51e-ad332cca538e_hr",machine_id="/machine_id/4c43fc40ab0043c4a55a9ba1e8882580",node_id="/node_id/422fc999-4340-4010-94e7-e025cf868a31",node_name="client_container_6802",node_type="container",service_id="/service_id/b76995fd-2ea5-4433-a9e4-d8b2757a010d",service_name="mongodb_39823809__client_container_50.116.57.109_10",service_type="mongodb"}: error when scraping "http://127.0.0.1:42002/metrics" with timeout 4.5s: timeout  agentID=/agent_id/2f2706f5-bed0-4bc9-9333-6589e8b45ad9 component=agent-process type=vm_agent
      WARN[2022-10-27T10:39:34.619+00:00] time="2022-10-27T10:39:34Z" level=warning msg="cannot load topology labels: cannot check if the instance is an arbiter: timed out while checking out a connection from connection pool: context canceled; maxPoolSize: 100, connections in use by cursors: 0, connections in use by transactions: 0, connections in use by other operations: 0"  agentID=/agent_id/95a66dd5-5dce-426e-a51e-ad332cca538e component=agent-process type=mongodb_exporter
      ERRO[2022-10-27T10:39:34.619+00:00] time="2022-10-27T10:39:34Z" level=error msg="Registry - Cannot get node type to check if this is a mongos : timed out while checking out a connection from connection pool: context canceled; maxPoolSize: 100, connections in use by cursors: 0, connections in use by transactions: 0, connections in use by other operations: 0"  agentID=/agent_id/95a66dd5-5dce-426e-a51e-ad332cca538e component=agent-process type=mongodb_exporter
      INFO[2022-10-27T10:39:39.620+00:00] 2022-10-27T10:39:39.620Z    error    VictoriaMetrics/lib/promscrape/scrapework.go:355    error when scraping "http://127.0.0.1:42002/metrics" from job "mongodb_exporter_agent_id_95a66dd5-5dce-426e-a51e-ad332cca538e_hr" with labels {agent_id="/agent_id/95a66dd5-5dce-426e-a51e-ad332cca538e",agent_type="mongodb_exporter",environment="load_testing_10",instance="/agent_id/95a66dd5-5dce-426e-a51e-ad332cca538e",job="mongodb_exporter_agent_id_95a66dd5-5dce-426e-a51e-ad332cca538e_hr",machine_id="/machine_id/4c43fc40ab0043c4a55a9ba1e8882580",node_id="/node_id/422fc999-4340-4010-94e7-e025cf868a31",node_name="client_container_6802",node_type="container",service_id="/service_id/b76995fd-2ea5-4433-a9e4-d8b2757a010d",service_name="mongodb_39823809__client_container_50.116.57.109_10",service_type="mongodb"}: error when scraping "http://127.0.0.1:42002/metrics" with timeout 4.5s: timeout  agentID=/agent_id/2f2706f5-bed0-4bc9-9333-6589e8b45ad9 component=agent-process type=vm_agent
      WARN[2022-10-27T10:39:39.621+00:00] time="2022-10-27T10:39:39Z" level=warning msg="cannot load topology labels: cannot check if the instance is an arbiter: timed out while checking out a connection from connection pool: context canceled; maxPoolSize: 100, connections in use by cursors: 0, connections in use by transactions: 0, connections in use by other operations: 0"  agentID=/agent_id/95a66dd5-5dce-426e-a51e-ad332cca538e component=agent-process type=mongodb_exporter
      ERRO[2022-10-27T10:39:39.622+00:00] time="2022-10-27T10:39:39Z" level=error msg="Registry - Cannot get node type to check if this is a mongos : timed out while checking out a connection from connection pool: context canceled; maxPoolSize: 100, connections in use by cursors: 0, connections in use by transactions: 0, connections in use by other operations: 0"  agentID=/agent_id/95a66dd5-5dce-426e-a51e-ad332cca538e component=agent-process type=mongodb_exporter
      INFO[2022-10-27T10:39:44.620+00:00] 2022-10-27T10:39:44.620Z    error    VictoriaMetrics/lib/promscrape/scrapework.go:355    error when scraping "http://127.0.0.1:42002/metrics" from job "mongodb_exporter_agent_id_95a66dd5-5dce-426e-a51e-ad332cca538e_hr" with labels {agent_id="/agent_id/95a66dd5-5dce-426e-a51e-ad332cca538e",agent_type="mongodb_exporter",environment="load_testing_10",instance="/agent_id/95a66dd5-5dce-426e-a51e-ad332cca538e",job="mongodb_exporter_agent_id_95a66dd5-5dce-426e-a51e-ad332cca538e_hr",machine_id="/machine_id/4c43fc40ab0043c4a55a9ba1e8882580",node_id="/node_id/422fc999-4340-4010-94e7-e025cf868a31",node_name="client_container_6802",node_type="container",service_id="/service_id/b76995fd-2ea5-4433-a9e4-d8b2757a010d",service_name="mongodb_39823809__client_container_50.116.57.109_10",service_type="mongodb"}: error when scraping "http://127.0.0.1:42002/metrics" with timeout 4.5s: timeout  agentID=/agent_id/2f2706f5-bed0-4bc9-9333-6589e8b45ad9 component=agent-process type=vm_agent
      WARN[2022-10-27T10:39:44.621+00:00] time="2022-10-27T10:39:44Z" level=warning msg="cannot load topology labels: cannot check if the instance is an arbiter: timed out while checking out a connection from connection pool: context canceled; maxPoolSize: 100, connections in use by cursors: 0, connections in use by transactions: 0, connections in use by other operations: 0"  agentID=/agent_id/95a66dd5-5dce-426e-a51e-ad332cca538e component=agent-process type=mongodb_exporter
      ERRO[2022-10-27T10:39:44.621+00:00] time="2022-10-27T10:39:44Z" level=error msg="Registry - Cannot get node type to check if this is a mongos : timed out while checking out a connection from connection pool: context canceled; maxPoolSize: 100, connections in use by cursors: 0, connections in use by transactions: 0, connections in use by other operations: 0"  agentID=/agent_id/95a66dd5-5dce-426e-a51e-ad332cca538e component=agent-process type=mongodb_exporter
      INFO[2022-10-27T10:39:49.619+00:00] 2022-10-27T10:39:49.619Z    error    VictoriaMetrics/lib/promscrape/scrapework.go:355    error when scraping "http://127.0.0.1:42002/metrics" from job "mongodb_exporter_agent_id_95a66dd5-5dce-426e-a51e-ad332cca538e_hr" with labels {agent_id="/agent_id/95a66dd5-5dce-426e-a51e-ad332cca538e",agent_type="mongodb_exporter",environment="load_testing_10",instance="/agent_id/95a66dd5-5dce-426e-a51e-ad332cca538e",job="mongodb_exporter_agent_id_95a66dd5-5dce-426e-a51e-ad332cca538e_hr",machine_id="/machine_id/4c43fc40ab0043c4a55a9ba1e8882580",node_id="/node_id/422fc999-4340-4010-94e7-e025cf868a31",node_name="client_container_6802",node_type="container",service_id="/service_id/b76995fd-2ea5-4433-a9e4-d8b2757a010d",service_name="mongodb_39823809__client_container_50.116.57.109_10",service_type="mongodb"}: error when scraping "http://127.0.0.1:42002/metrics" with timeout 4.5s: timeout  agentID=/agent_id/2f2706f5-bed0-4bc9-9333-6589e8b45ad9 component=agent-process type=vm_agent
      WARN[2022-10-27T10:39:49.622+00:00] time="2022-10-27T10:39:49Z" level=warning msg="cannot load topology labels: cannot check if the instance is an arbiter: timed out while checking out a connection from connection pool: context canceled; maxPoolSize: 100, connections in use by cursors: 0, connections in use by transactions: 0, connections in use by other operations: 0"  agentID=/agent_id/95a66dd5-5dce-426e-a51e-ad332cca538e component=agent-process type=mongodb_exporter
      ERRO[2022-10-27T10:39:49.622+00:00] time="2022-10-27T10:39:49Z" level=error msg="Registry - Cannot get node type to check if this is a mongos : timed out while checking out a connection from connection pool: context canceled; maxPoolSize: 100, connections in use by cursors: 0, connections in use by transactions: 0, connections in use by other operations: 0"  agentID=/agent_id/95a66dd5-5dce-426e-a51e-ad332cca538e component=agent-process type=mongodb_exporter
      INFO[2022-10-27T10:39:54.620+00:00] 2022-10-27T10:39:54.620Z    error    VictoriaMetrics/lib/promscrape/scrapework.go:355    error when scraping "http://127.0.0.1:42002/metrics" from job "mongodb_exporter_agent_id_95a66dd5-5dce-426e-a51e-ad332cca538e_hr" with labels {agent_id="/agent_id/95a66dd5-5dce-426e-a51e-ad332cca538e",agent_type="mongodb_exporter",environment="load_testing_10",instance="/agent_id/95a66dd5-5dce-426e-a51e-ad332cca538e",job="mongodb_exporter_agent_id_95a66dd5-5dce-426e-a51e-ad332cca538e_hr",machine_id="/machine_id/4c43fc40ab0043c4a55a9ba1e8882580",node_id="/node_id/422fc999-4340-4010-94e7-e025cf868a31",node_name="client_container_6802",node_type="container",service_id="/service_id/b76995fd-2ea5-4433-a9e4-d8b2757a010d",service_name="mongodb_39823809__client_container_50.116.57.109_10",service_type="mongodb"}: error when scraping "http://127.0.0.1:42002/metrics": the server closes all the connection attempts: the server closed connection before returning the first response byte. Make sure the server returns 'Connection: close' response header before closing the connection  agentID=/agent_id/2f2706f5-bed0-4bc9-9333-6589e8b45ad9 component=agent-process type=vm_agent
      WARN[2022-10-27T10:39:54.621+00:00] time="2022-10-27T10:39:54Z" level=warning msg="cannot load topology labels: cannot check if the instance is an arbiter: timed out while checking out a connection from connection pool: context canceled; maxPoolSize: 100, connections in use by cursors: 0, connections in use by transactions: 0, connections in use by other operations: 0"  agentID=/agent_id/95a66dd5-5dce-426e-a51e-ad332cca538e component=agent-process type=mongodb_exporter
      ERRO[2022-10-27T10:39:54.622+00:00] time="2022-10-27T10:39:54Z" level=error msg="Registry - Cannot get node type to check if this is a mongos : timed out while checking out a connection from connection pool: context canceled; maxPoolSize: 100, connections in use by cursors: 0, connections in use by transactions: 0, connections in use by other operations: 0"  agentID=/agent_id/95a66dd5-5dce-426e-a51e-ad332cca538e component=agent-process type=mongodb_exporter
      
       

      Attachments

        1. image (31) (1).png
          image (31) (1).png
          251 kB
        2. image (31).png
          image (31).png
          251 kB
        3. image (32).png
          image (32).png
          174 kB
        4. image (33).png
          image (33).png
          158 kB
        5. image (34).png
          image (34).png
          175 kB
        6. image (35).png
          image (35).png
          401 kB

        Issue Links

          Activity

            People

              Unassigned Unassigned
              puneet.kala Puneet Kala
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

                Created:
                Updated:

                Smart Checklist