• [6.9.2] node-exporter can't parse mdstat to get disk information for prometheus


    Galileo
    • Solved Minor

    Linux EVERSTORE 5.10.28-Unraid #1 SMP Wed Apr 7 08:23:18 PDT 2021 x86_64 Intel(R) Core(TM) i5-8400 CPU @ 2.80GHz GenuineIntel GNU/Linux

     

    I just upgraded from 6.8.3 to 6.9.2, and it seems the new kernel organizes the disk information in /proc/mdstat differently. I use node-exporter with Prometheus and Grafana for monitoring, and node-exporter now complains that it can't parse /proc/mdstat, so I get no disk information in Prometheus.

     

    Error from node-exporter: level=error ts=2021-11-10T14:48:47.062Z caller=collector.go:169 msg="collector failed" name=mdadm duration_seconds=0.001066931 err="error parsing mdstatus: error parsing mdstat \"/proc/mdstat\": not enough fields in mdline (expected at least 3): sbName=/boot/config/super.dat

     

    Contents of /proc/mdstat:

     

    root@EVERSTORE:~# cat /proc/mdstat
    sbName=/boot/config/super.dat
    sbVersion=2.9.4
    sbCreated=1544236123
    sbUpdated=1636292320
    sbEvents=89
    sbState=1
    sbNumDisks=7
    sbLabel=0951-1666-756D-B2B0D7B406C6
    sbSynced=1630814401
    sbSynced2=1630873023
    sbSyncErrs=0
    sbSyncExit=0
    mdVersion=2.9.17
    mdState=STARTED
    mdNumDisks=6
    mdNumDisabled=1
    mdNumReplaced=0
    mdNumInvalid=1
    mdNumMissing=0
    mdNumWrong=0
    mdNumNew=0
    mdSwapP=0
    mdSwapQ=0
    mdResyncAction=check P
    mdResyncSize=7814026532
    mdResyncCorr=0
    mdResync=0
    mdResyncPos=0
    mdResyncDt=0
    mdResyncDb=0
    diskNumber.0=0
    diskName.0=
    diskSize.0=7814026532
    diskState.0=7
    diskId.0=ST8000DM004-2CX188_ZCT0F32K
    rdevNumber.0=0
    rdevStatus.0=DISK_OK
    rdevName.0=sdf
    rdevOffset.0=64
    rdevSize.0=7814026532
    rdevId.0=ST8000DM004-2CX188_ZCT0F32K
    rdevNumErrors.0=0
    diskNumber.1=1
    diskName.1=md1
    diskSize.1=7814026532
    diskState.1=7
    diskId.1=ST8000DM004-2CX188_ZCT0EZTS
    rdevNumber.1=1
    rdevStatus.1=DISK_OK
    rdevName.1=sde
    rdevOffset.1=64
    rdevSize.1=7814026532
    rdevId.1=ST8000DM004-2CX188_ZCT0EZTS
    rdevNumErrors.1=0
    diskNumber.2=2
    diskName.2=md2
    diskSize.2=7814026532
    diskState.2=7
    diskId.2=ST8000DM004-2CX188_ZCT0F3CX
    rdevNumber.2=2
    rdevStatus.2=DISK_OK
    rdevName.2=sdj
    rdevOffset.2=64
    rdevSize.2=7814026532
    rdevId.2=ST8000DM004-2CX188_ZCT0F3CX
    rdevNumErrors.2=0
    diskNumber.3=3
    diskName.3=md3
    diskSize.3=5860522532
    diskState.3=7
    diskId.3=ST6000DM004-2EH11C_ZA18BST1
    rdevNumber.3=3
    rdevStatus.3=DISK_OK
    rdevName.3=sdi
    rdevOffset.3=64
    rdevSize.3=5860522532
    rdevId.3=ST6000DM004-2EH11C_ZA18BST1
    rdevNumErrors.3=0
    diskNumber.4=4
    diskName.4=md4
    diskSize.4=5860522532
    diskState.4=7
    diskId.4=HGST_HDN726060ALE614_K1G7ZEVB
    rdevNumber.4=4
    rdevStatus.4=DISK_OK
    rdevName.4=sdg
    rdevOffset.4=64
    rdevSize.4=5860522532
    rdevId.4=HGST_HDN726060ALE614_K1G7ZEVB
    rdevNumErrors.4=0
    diskNumber.5=5
    diskName.5=md5
    diskSize.5=5860522532
    diskState.5=7
    diskId.5=HGST_HDN726060ALE614_K1G75U3B
    rdevNumber.5=5
    rdevStatus.5=DISK_OK
    rdevName.5=sdh
    rdevOffset.5=64
    rdevSize.5=5860522532
    rdevId.5=HGST_HDN726060ALE614_K1G75U3B
    rdevNumErrors.5=0
    diskNumber.6=6
    diskName.6=
    diskSize.6=0
    diskState.6=0
    diskId.6=
    rdevNumber.6=6
    rdevStatus.6=DISK_NP
    rdevName.6=
    rdevOffset.6=0
    rdevSize.6=0
    rdevId.6=
    rdevNumErrors.6=0
    diskNumber.7=7
    diskName.7=
    diskSize.7=0
    diskState.7=0
    diskId.7=
    rdevNumber.7=7
    rdevStatus.7=DISK_NP
    rdevName.7=
    rdevOffset.7=0
    rdevSize.7=0
    rdevId.7=
    rdevNumErrors.7=0
    diskNumber.8=8
    diskName.8=
    diskSize.8=0
    diskState.8=0
    diskId.8=
    rdevNumber.8=8
    rdevStatus.8=DISK_NP
    rdevName.8=
    rdevOffset.8=0
    rdevSize.8=0
    rdevId.8=
    rdevNumErrors.8=0
    diskNumber.9=9
    diskName.9=
    diskSize.9=0
    diskState.9=0
    diskId.9=
    rdevNumber.9=9
    rdevStatus.9=DISK_NP
    rdevName.9=
    rdevOffset.9=0
    rdevSize.9=0
    rdevId.9=
    rdevNumErrors.9=0
    diskNumber.10=10
    diskName.10=
    diskSize.10=0
    diskState.10=0
    diskId.10=
    rdevNumber.10=10
    rdevStatus.10=DISK_NP
    rdevName.10=
    rdevOffset.10=0
    rdevSize.10=0
    rdevId.10=
    rdevNumErrors.10=0
    diskNumber.11=11
    diskName.11=
    diskSize.11=0
    diskState.11=0
    diskId.11=
    rdevNumber.11=11
    rdevStatus.11=DISK_NP
    rdevName.11=
    rdevOffset.11=0
    rdevSize.11=0
    rdevId.11=
    rdevNumErrors.11=0
    diskNumber.12=12
    diskName.12=
    diskSize.12=0
    diskState.12=0
    diskId.12=
    rdevNumber.12=12
    rdevStatus.12=DISK_NP
    rdevName.12=
    rdevOffset.12=0
    rdevSize.12=0
    rdevId.12=
    rdevNumErrors.12=0
    diskNumber.13=13
    diskName.13=
    diskSize.13=0
    diskState.13=0
    diskId.13=
    rdevNumber.13=13
    rdevStatus.13=DISK_NP
    rdevName.13=
    rdevOffset.13=0
    rdevSize.13=0
    rdevId.13=
    rdevNumErrors.13=0
    diskNumber.14=14
    diskName.14=
    diskSize.14=0
    diskState.14=0
    diskId.14=
    rdevNumber.14=14
    rdevStatus.14=DISK_NP
    rdevName.14=
    rdevOffset.14=0
    rdevSize.14=0
    rdevId.14=
    rdevNumErrors.14=0
    diskNumber.15=15
    diskName.15=
    diskSize.15=0
    diskState.15=0
    diskId.15=
    rdevNumber.15=15
    rdevStatus.15=DISK_NP
    rdevName.15=
    rdevOffset.15=0
    rdevSize.15=0
    rdevId.15=
    rdevNumErrors.15=0
    diskNumber.16=16
    diskName.16=
    diskSize.16=0
    diskState.16=0
    diskId.16=
    rdevNumber.16=16
    rdevStatus.16=DISK_NP
    rdevName.16=
    rdevOffset.16=0
    rdevSize.16=0
    rdevId.16=
    rdevNumErrors.16=0
    diskNumber.17=17
    diskName.17=
    diskSize.17=0
    diskState.17=0
    diskId.17=
    rdevNumber.17=17
    rdevStatus.17=DISK_NP
    rdevName.17=
    rdevOffset.17=0
    rdevSize.17=0
    rdevId.17=
    rdevNumErrors.17=0
    diskNumber.18=18
    diskName.18=
    diskSize.18=0
    diskState.18=0
    diskId.18=
    rdevNumber.18=18
    rdevStatus.18=DISK_NP
    rdevName.18=
    rdevOffset.18=0
    rdevSize.18=0
    rdevId.18=
    rdevNumErrors.18=0
    diskNumber.19=19
    diskName.19=
    diskSize.19=0
    diskState.19=0
    diskId.19=
    rdevNumber.19=19
    rdevStatus.19=DISK_NP
    rdevName.19=
    rdevOffset.19=0
    rdevSize.19=0
    rdevId.19=
    rdevNumErrors.19=0
    diskNumber.20=20
    diskName.20=
    diskSize.20=0
    diskState.20=0
    diskId.20=
    rdevNumber.20=20
    rdevStatus.20=DISK_NP
    rdevName.20=
    rdevOffset.20=0
    rdevSize.20=0
    rdevId.20=
    rdevNumErrors.20=0
    diskNumber.21=21
    diskName.21=
    diskSize.21=0
    diskState.21=0
    diskId.21=
    rdevNumber.21=21
    rdevStatus.21=DISK_NP
    rdevName.21=
    rdevOffset.21=0
    rdevSize.21=0
    rdevId.21=
    rdevNumErrors.21=0
    diskNumber.22=22
    diskName.22=
    diskSize.22=0
    diskState.22=0
    diskId.22=
    rdevNumber.22=22
    rdevStatus.22=DISK_NP
    rdevName.22=
    rdevOffset.22=0
    rdevSize.22=0
    rdevId.22=
    rdevNumErrors.22=0
    diskNumber.23=23
    diskName.23=
    diskSize.23=0
    diskState.23=0
    diskId.23=
    rdevNumber.23=23
    rdevStatus.23=DISK_NP
    rdevName.23=
    rdevOffset.23=0
    rdevSize.23=0
    rdevId.23=
    rdevNumErrors.23=0
    diskNumber.24=24
    diskName.24=
    diskSize.24=0
    diskState.24=0
    diskId.24=
    rdevNumber.24=24
    rdevStatus.24=DISK_NP
    rdevName.24=
    rdevOffset.24=0
    rdevSize.24=0
    rdevId.24=
    rdevNumErrors.24=0
    diskNumber.25=25
    diskName.25=
    diskSize.25=0
    diskState.25=0
    diskId.25=
    rdevNumber.25=25
    rdevStatus.25=DISK_NP
    rdevName.25=
    rdevOffset.25=0
    rdevSize.25=0
    rdevId.25=
    rdevNumErrors.25=0
    diskNumber.26=26
    diskName.26=
    diskSize.26=0
    diskState.26=0
    diskId.26=
    rdevNumber.26=26
    rdevStatus.26=DISK_NP
    rdevName.26=
    rdevOffset.26=0
    rdevSize.26=0
    rdevId.26=
    rdevNumErrors.26=0
    diskNumber.27=27
    diskName.27=
    diskSize.27=0
    diskState.27=0
    diskId.27=
    rdevNumber.27=27
    rdevStatus.27=DISK_NP
    rdevName.27=
    rdevOffset.27=0
    rdevSize.27=0
    rdevId.27=
    rdevNumErrors.27=0
    diskNumber.28=28
    diskName.28=
    diskSize.28=0
    diskState.28=0
    diskId.28=
    rdevNumber.28=28
    rdevStatus.28=DISK_NP
    rdevName.28=
    rdevOffset.28=0
    rdevSize.28=0
    rdevId.28=
    rdevNumErrors.28=0
    diskNumber.29=29
    diskName.29=
    diskSize.29=0
    diskState.29=4
    diskId.29=
    rdevNumber.29=29
    rdevStatus.29=DISK_NP_DSBL
    rdevName.29=
    rdevOffset.29=0
    rdevSize.29=0
    rdevId.29=
    rdevNumErrors.29=0

     

    Format that node-exporter expects the /proc/mdstat to be in:

     

    https://github.com/prometheus/node_exporter/blob/master/collector/fixtures/proc/mdstat

     

    Was this changed in the new version of Unraid? How can I make this work? A Google search found nobody else reporting it, so maybe it's just an issue on my end.

     

    Thanks.

     

     




    User Feedback

    Recommended Comments

    I fixed this issue. The node-exporter Docker container needed access to the host's / filesystem plus one extra flag. Map the host's / into the container at /host as read-only, then add --path.rootfs=/host as a Docker post argument.

     

    That will allow it to see all the filesystems correctly. 


    I already have the /host mapping and the post argument in my container configuration, but I'm getting the same error. Are there any other things to try or ways to troubleshoot?


    Okay, so I've taught myself Go and looked at the source code. As Galileo mentioned, node exporter expects /proc/mdstat to be in the format shown in the link he posted, i.e. with fields separated by spaces.

    My /proc/mdstat is similar to Galileo's and doesn't have spaces, so node exporter isn't able to parse it.
    I don't understand how what he did could have solved the problem...
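
To illustrate the mismatch, here is a minimal Python sketch of the kind of check the error message implies: a line is split on whitespace and at least three fields are expected. This is not node exporter's actual Go code; the function name `has_enough_fields` and the sample standard-md line are assumptions made for illustration.

```python
def has_enough_fields(line: str, minimum: int = 3) -> bool:
    """Return True if the whitespace-separated line has at least `minimum` fields."""
    return len(line.split()) >= minimum


# A device line in the style of the standard Linux md driver (illustrative sample).
standard_line = "md0 : active raid1 sdb1[1] sda1[0]"
# The first line of the Unraid /proc/mdstat from the original report.
unraid_line = "sbName=/boot/config/super.dat"

print(has_enough_fields(standard_line))  # True
print(has_enough_fields(unraid_line))    # False
```

The Unraid line is a single key=value token, so it can never reach three fields, which matches the "not enough fields in mdline" error.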

    15 minutes ago, gamerkonks said:

    Okay, so I've taught myself Go and looked at the source code. As Galileo mentioned, node exporter expects /proc/mdstat to be in the format shown in the link he posted, i.e. with fields separated by spaces.

    My /proc/mdstat is similar to Galileo's and doesn't have spaces, so node exporter isn't able to parse it.
    I don't understand how what he did could have solved the problem...

    As far as I know the format has always been like that in Unraid.    
     

    Do not forget that Unraid does not use the standard Linux md driver. It uses an Unraid-specific one, so it is quite possible that more traditional Linux systems use a different format.
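
For anyone who still wants the array data, the key=value layout shown in the original report is easy to read directly. The following is a minimal Python sketch, assuming the format is exactly as pasted above (plain `key=value` lines, with per-slot keys ending in `.N`). The function name `parse_unraid_mdstat` is hypothetical, and nothing here is part of node exporter.

```python
from collections import defaultdict


def parse_unraid_mdstat(text: str):
    """Split Unraid-style mdstat text into global settings and per-slot fields."""
    settings = {}
    slots = defaultdict(dict)
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank or unexpected lines
        key, value = line.split("=", 1)
        # Keys such as "rdevName.1" carry a slot index after the last dot.
        name, dot, index = key.rpartition(".")
        if dot and index.isdigit():
            slots[int(index)][name] = value
        else:
            settings[key] = value
    return settings, dict(slots)


# A few lines copied from the /proc/mdstat output in the original report.
sample = """\
mdState=STARTED
mdNumDisks=6
diskName.1=md1
rdevStatus.1=DISK_OK
rdevName.1=sde
rdevStatus.6=DISK_NP
"""

settings, slots = parse_unraid_mdstat(sample)
print(settings["mdState"])     # STARTED
print(slots[1]["rdevName"])    # sde
print(slots[6]["rdevStatus"])  # DISK_NP
```

On a real server the text would come from reading /proc/mdstat instead of the `sample` string.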


    Yeah, I figured as much.

    I don't think I even need the data from that collector, so now I'm just trying to pass a flag to disable it (--no-collector.mdadm).
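
One way to check whether the flag took effect is to look at the per-collector success metric that node exporter publishes on its /metrics page. The sketch below is a minimal Python example that parses a pasted sample of that output. It assumes the metric is named `node_scrape_collector_success` with a `collector` label; the helper name `collector_status` and the sample text are made up for illustration.

```python
import re

# Matches lines like: node_scrape_collector_success{collector="mdadm"} 0
_PATTERN = re.compile(
    r'^node_scrape_collector_success\{collector="([^"]+)"\}\s+(\S+)$'
)


def collector_status(metrics_text: str) -> dict:
    """Map each collector name to True (succeeded) or False (failed)."""
    status = {}
    for line in metrics_text.splitlines():
        match = _PATTERN.match(line.strip())
        if match:
            status[match.group(1)] = float(match.group(2)) == 1.0
    return status


# Illustrative sample, as it might look while the mdadm collector is still failing.
sample = """\
# TYPE node_scrape_collector_success gauge
node_scrape_collector_success{collector="cpu"} 1
node_scrape_collector_success{collector="mdadm"} 0
"""

status = collector_status(sample)
print(status)  # {'cpu': True, 'mdadm': False}
```

If the collector has been disabled successfully, the expectation is that `mdadm` no longer appears in the result at all, and the "collector failed" error stops showing up in the log.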




