PMM-6189 (Percona Monitoring and Management)

Disk Details Dashboard: Disk IO Size chart larger by a factor of 512

Details

    • Type: Bug
    • Status: Done
    • Priority: Medium
    • Resolution: Fixed
    • Affects Version/s: 2.8.0
    • Fix Version/s: 2.9.1
    • Component/s: Grafana Dashboards
    • Labels: None
    • 0
    • Yes
    • Yes

    Description

      Panel "Disk IO Size" on "OS/Disk Details" dashboard shows wrong data, multiplying average IO size by 512.

      Disk IO Size panel source
      sum(
      (rate(node_disk_read_bytes_total{node_name="$node_name", device=~"$device"}[$__interval]) * 512 / 
      rate(node_disk_reads_completed_total{node_name="$node_name", device=~"$device"}[$__interval])) > 0 or 
      (irate(node_disk_read_bytes_total{node_name="$node_name", device=~"$device"}[5m]) * 512 / 
      irate(node_disk_reads_completed_total{node_name="$node_name", device=~"$device"}[5m])) > 0
      )
      sum(
      (rate(node_disk_written_bytes_total{node_name="$node_name", device=~"$device"}[$__interval]) * 512 / 
      rate(node_disk_writes_completed_total{node_name="$node_name", device=~"$device"}[$__interval])) > 0 or 
      (irate(node_disk_written_bytes_total{node_name="$node_name", device=~"$device"}[5m]) * 512 / 
      irate(node_disk_writes_completed_total{node_name="$node_name", device=~"$device"}[5m])) > 0
      )
      

      In early 2018, Prometheus node_exporter switched these metrics from sectors to actual bytes, so the extra multiplication by 512 is no longer needed. See https://github.com/prometheus/node_exporter/blob/master/collector/diskstats_linux.go#L105 and https://github.com/prometheus/node_exporter/pull/787
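
      For illustration, a minimal sketch of the read query with the multiplier removed, assuming the byte counters are already in bytes as described above. This is not necessarily the exact change shipped in the fix; the write query would change in the same way.

      ```promql
      # Average read size in bytes: bytes read per second divided by reads completed per second.
      # No "* 512" because node_disk_read_bytes_total is already expressed in bytes.
      sum(
      (rate(node_disk_read_bytes_total{node_name="$node_name", device=~"$device"}[$__interval]) /
      rate(node_disk_reads_completed_total{node_name="$node_name", device=~"$device"}[$__interval])) > 0 or
      (irate(node_disk_read_bytes_total{node_name="$node_name", device=~"$device"}[5m]) /
      irate(node_disk_reads_completed_total{node_name="$node_name", device=~"$device"}[5m])) > 0
      )
      ```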

      This can be easily confirmed by running a fio test on an otherwise idle server.

      fio command
      # fio --name=randrw --rw=randrw --direct=1 --ioengine=libaio --bs=16k --numjobs=4 --rwmixread=30 --size=1G --runtime=1200 --group_reporting --time_based
      

      This should (and does) generate a steady stream of 16 KiB IO requests, but the Disk IO Size panel shows a steady stream of 8 MiB IO requests, exactly 512 times larger (16 KiB × 512 = 8 MiB).

          People

            Assignee: Unassigned
            Reporter: Sergey Kuzmichev (Inactive)

              Time Tracking

                Estimated: Not Specified
                Remaining: Not Specified
                Logged: 1h 45m