  1. Percona Monitoring and Management
  2. PMM-7503

[BE] See all and available resources - CPU & Memory

Details

    • 5
    • 04 - Server Integrations
    • Yes
    • Yes
    • Server Integrations

    Description

      User story:

      • As a DBaaS user, I need to see the available resources and quotas on my Kubernetes cluster so that I can understand its size.

      Acceptance criteria

      • The user sees the full capacity of the cluster (CPU, memory) and of each of its separate components.
      • The user is able to see free resources in the cluster.
      • Resource data is updated every 5 s.

      Out of scope:

      • This data will be based on requests, not limits, because requests are always present in our cluster deployments and limits are not. The ability to choose between limits and requests as the data source is out of scope.
      • Disk size is out of scope. I will include it in the API so we can avoid additional API PRs in the future, but the disk size fields will return 0 for now.

      Suggested implementation:

      • To get all resources, go through every node that we can schedule on and sum their status->allocatable values.
      • To get available resources, go through all pods in all namespaces, sum their requests, and subtract that sum from all resources.
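
      The two steps above can be sketched as follows. This is only an illustration under stated assumptions, not the pmm-managed implementation: the input shapes (a list of nodes with an "allocatable" map, a list of pods with per-container "requests" maps) and the function names are hypothetical, the nodes passed in are assumed to be schedulable already, and the quantity parser handles only a subset of Kubernetes quantity suffixes.

```python
from decimal import Decimal

# Subset of Kubernetes quantity suffixes; two-letter binary suffixes first.
_SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def cpu_to_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as "2", "0.5" or "500m" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(Decimal(quantity) * 1000)


def memory_to_bytes(quantity: str) -> int:
    """Convert a memory quantity such as "128Mi" or "1G" to bytes."""
    for suffix, factor in _SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(Decimal(quantity[: -len(suffix)]) * factor)
    return int(Decimal(quantity))


def cluster_resources(nodes: list, pods: list) -> dict:
    """Sum allocatable resources over nodes, then subtract pod requests.

    nodes: [{"allocatable": {"cpu": "4", "memory": "8Gi"}}, ...]   (hypothetical shape)
    pods:  [{"requests": [{"cpu": "250m", "memory": "64Mi"}, ...]}, ...]
           with one requests map per container; missing keys count as zero.
    """
    all_cpu = sum(cpu_to_millicores(n["allocatable"]["cpu"]) for n in nodes)
    all_mem = sum(memory_to_bytes(n["allocatable"]["memory"]) for n in nodes)
    req_cpu = sum(
        cpu_to_millicores(r.get("cpu", "0")) for p in pods for r in p["requests"]
    )
    req_mem = sum(
        memory_to_bytes(r.get("memory", "0")) for p in pods for r in p["requests"]
    )
    return {
        "all": {"cpu_m": all_cpu, "memory_bytes": all_mem},
        "available": {
            "cpu_m": all_cpu - req_cpu,
            "memory_bytes": all_mem - req_mem,
        },
    }
```

      Working in millicores and bytes keeps the arithmetic in integers, which matches the cpu_m and memory_bytes fields of the proposed API.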

      Details:

      • Add an API endpoint to PMM, /v1/management/DBaaS/Kubernetes/Resources/Get, that returns:
        • {
          "resources": {"cpu_m":12000, "memory_bytes":4154899878, "disk_size":46498798789984}, 
          "available_resources": {"cpu_m":11000, "memory_bytes":41548998, "disk_size":46498798789}
          }
          
      • Add Resources API to dbaas-api
        • message GetResourcesRequest {
            KubeAuth kube_auth = 1;
          }
          
          message Resources {
             int64 memory_bytes = 1;
             int64 cpu_m = 2;
             int64 disk_size = 3;
          }
          
          message GetResourcesResponse {
             Resources all_resources = 1;  
             Resources available_resources = 2;
          }
          
          // KubernetesClusterAPI provides APIs for managing Kubernetes clusters.
          service KubernetesClusterAPI {
            // CheckKubernetesClusterConnection checks connection to kubernetes clusters.
            rpc CheckKubernetesClusterConnection(CheckKubernetesClusterConnectionRequest) returns (CheckKubernetesClusterConnectionResponse);
            rpc GetResources(GetResourcesRequest) returns (GetResourcesResponse);
          }
          
          
      • Connect these APIs in pmm-managed
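
      A client consuming one Resources object from this API might normalize it as sketched below. The Resources dataclass and parse_resources helper are hypothetical names that mirror the proto message above. The sketch assumes that int64 values may arrive either as JSON numbers or as decimal strings (the proto3 JSON mapping encodes int64 as strings), and that a field left at zero, such as disk_size, may be absent from the payload.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """Client-side mirror of the Resources proto message (hypothetical)."""

    memory_bytes: int = 0
    cpu_m: int = 0
    disk_size: int = 0


def parse_resources(obj: dict) -> Resources:
    """Build Resources from a decoded JSON object.

    Values may be ints or decimal strings; missing fields default to 0.
    """
    return Resources(
        memory_bytes=int(obj.get("memory_bytes", 0)),
        cpu_m=int(obj.get("cpu_m", 0)),
        disk_size=int(obj.get("disk_size", 0)),
    )
```

      Defaulting missing fields to 0 means callers do not need a special case while disk sizes are not yet reported.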

      How to test

      • Latest FB: https://github.com/Percona-Lab/pmm-submodules/pull/1564#issuecomment-789941887
      • The failing UI tests appear to be a false positive; they are not related to my changes.
      • If you are not able to test with the UI counterpart, PMM-7486, please run this command:
         curl 'http://localhost/v1/management/DBaaS/Kubernetes/Resources/Get' -H 'User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:86.0) Gecko/20100101 Firefox/86.0' -H 'Accept: application/json, text/plain, */*' -H 'Accept-Language: en-US,en;q=0.5' --compressed -H 'Referer: https://localhost/graph/dbaas' -H 'content-type: application/json' -H 'Origin: https://localhost' -H 'Connection: keep-alive' -H 'Cookie: grafana_session=e0d28b194e55612cb138c3d2b78dd8f1' -H 'TE: Trailers' --data-raw '{"kubernetes_cluster_name":"minikube"}'
        
      • It should return a response similar to this one:
        {
          "all": {
            "memory_bytes": "16811933696",
            "cpu_m": "12000"
          },
          "available": {
            "memory_bytes": "14949604761",
            "cpu_m": "9430"
          }
        }
        
      • We don't include disk sizes because they are out of scope for this ticket.
      • The all field is the sum of allocatable resources of the Kubernetes cluster nodes that are schedulable. For example, we omit the master node if it is tainted to forbid scheduling (taint node-role.kubernetes.io/master:NoSchedule).
      • The available field is the all value minus the sum of requests of every pod in the cluster. Terminated init containers and pending pods are not taken into account, and completed pods (failed or succeeded) are excluded.
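
      The node and pod filtering rules described in the last two points could look roughly like this. It is a sketch, not the actual implementation: the helper names are hypothetical, the dicts follow the JSON shape of Kubernetes Node and Pod objects (spec.taints, status.phase), and treating any NoSchedule taint as disqualifying, rather than only the master taint, is an assumption.

```python
# Pod phases whose requests are not counted, per the rules above.
EXCLUDED_PHASES = {"Pending", "Succeeded", "Failed"}


def is_schedulable(node: dict) -> bool:
    """Return False when the node carries a taint with effect NoSchedule.

    Assumption: any NoSchedule taint excludes the node, not just
    node-role.kubernetes.io/master.
    """
    taints = node.get("spec", {}).get("taints") or []
    return not any(t.get("effect") == "NoSchedule" for t in taints)


def counts_toward_requests(pod: dict) -> bool:
    """Return True when the pod's requests should be subtracted from the total."""
    return pod.get("status", {}).get("phase") not in EXCLUDED_PHASES


def filter_cluster(nodes: list, pods: list) -> tuple:
    """Keep only schedulable nodes and pods whose requests are counted."""
    return (
        [n for n in nodes if is_schedulable(n)],
        [p for p in pods if counts_toward_requests(p)],
    )
```

      Init containers are not handled here; a fuller version would also need to skip init containers that have already terminated.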

            People

              jan.prukner Jan Prukner (Inactive)