• [6.8.3] HDDs are spinning, but WebGUI status shows spindown


    mgutt
    • Minor

    This is the WebGUI of my backup NAS:

    121329979_2020-10-2715_06_28.png.aa051a61c7a53834e56599e939504612.png

     

    As you can see it claims all disks are sleeping. But that's not true:

    smartctl -n standby -i /dev/sdb
    smartctl -n standby -i /dev/sdc
    smartctl -n standby -i /dev/sdd
    smartctl -n standby -i /dev/sdf
    smartctl -n standby -i /dev/sdg
    smartctl -n standby -i /dev/sdh
    

    They return all:

    Power mode is:    ACTIVE or IDLE

    "/dev/sde" (the SSD) is the only sleeping disk:

    smartctl -n standby -i /dev/sde
    smartctl 7.1 2019-12-30 r5022 [x86_64-linux-4.19.107-Unraid] (local build)
    Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org
    
    Device is in STANDBY mode, exit(2)

    Only for testing I executed this command:

    mdcmd spindown 1

    Now disk 1 sleeps:

    smartctl -n standby -i /dev/sdh
    smartctl 7.1 2019-12-30 r5022 [x86_64-linux-4.19.107-Unraid] (local build)
    Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org
    
    Device is in STANDBY mode, exit(2)

    Spin down delay is set to 30 minutes:

    2116594560_2020-10-2715_12_19.png.4df8a5a68fbbd52cb71985a6a58c7b61.png

     

    Fun fact:

    It worked for 2 weeks without problems until I:

    - opened this page: http://tower/Tools/SysDevs

    - then this page: http://tower/Tools/HardwareProfile

    - clicked on "Show Details", which spun up the disks

    - this took a little bit long, so I opened http://tower/Main

    - again http://tower/Tools/HardwareProfile

    - again clicked on "Show Details" and now I was able to see hardware informations

    - as it did not contain the brand of my RAM I executed "dmidecode -t memory", which did not display the brand name, but I found now the RAM part number

    - I closed the WebGUI and that's it

     

    It seems that clicking on "Show Details" killed the Cronjob to check the status of the disks?!

     

    Update, 2020-01-25

    Now I'm having the same problem on a complete different Unraid server, too.

     

    A manual execution of the smartctl command (that is used by Unraid to check the standby status), returns the correct value:

     

    "sdg" is in STANDBY mode

    root@Thoth:~# smartctl --nocheck standby -A /dev/sdg
    smartctl 7.1 2019-12-30 r5022 [x86_64-linux-4.19.107-Unraid] (local build)
    Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org
    
    Device is in STANDBY mode, exit(2)

     

    "sdl" is spinning:

    root@Thoth:~# smartctl --nocheck standby -A /dev/sdl
    smartctl 7.1 2019-12-30 r5022 [x86_64-linux-4.19.107-Unraid] (local build)
    Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org
    
    === START OF READ SMART DATA SECTION ===
    SMART Attributes Data Structure revision number: 16
    Vendor Specific SMART Attributes with Thresholds:
    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
      1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail  Always       -       0
      2 Throughput_Performance  0x0005   132   132   054    Pre-fail  Offline      -       96
      3 Spin_Up_Time            0x0007   159   159   024    Pre-fail  Always       -       412 (Average 414)
      4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       3195
      5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
      7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
      8 Seek_Time_Performance   0x0005   128   128   020    Pre-fail  Offline      -       18
      9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       6711
     10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
     12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       86
     22 Helium_Level            0x0023   100   100   025    Pre-fail  Always       -       100
    192 Power-Off_Retract_Count 0x0032   095   095   000    Old_age   Always       -       6414
    193 Load_Cycle_Count        0x0012   095   095   000    Old_age   Always       -       6414
    194 Temperature_Celsius     0x0002   157   157   000    Old_age   Always       -       38 (Min/Max 24/55)
    196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
    197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
    198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
    199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0

     

    But the dashboard doesn't show it AND sdl does not spindown anymore altough inactive for more than 30 minutes (my spindown setting):

    904049059_2021-01-2513_04_50.png.ef6d1931fe1c0a847794d881db414261.png

     

    So  I started monitoring smartctl executions by this:

    while true; do pid=$(pgrep 'smartctl' | head -1); if [[ -n "$pid" ]];  then ps -p "$pid" -o args && strace -v -t -p "$pid"; fi; done

     

    I tested it by executing smartctl with a different terminal and it catches it directly.

     

    Then I waited a long time and several errors were returned:

     

    Broken Pipe errors (happen often)

    strace: Process 8668 attached
    13:53:01 ioctl(3, SG_IO, {'S', SG_DXFER_NONE, cmd[16]=[85, 06, 2c, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, e5, 00], mx_sb_len=32, iovec_count=0, dxfer_len=0, timeout=60000, flags=0, status=02, masked_status=01, sb[22]=[72, 01, 00, 1d, 00, 00, 00, 0e, 09, 0c, 00, 00, 00, ff, 00, 00, 00, 00, 00, 00, 00, 50], host_status=0, driver_status=0x8, resid=0, duration=0, info=0x1}) = 0
    13:53:01 ioctl(3, SG_IO, {'S', SG_DXFER_FROM_DEV, cmd[16]=[85, 08, 0e, 00, 00, 00, 01, 00, 00, 00, 00, 00, 00, 00, ec, 00], mx_sb_len=32, iovec_count=0, dxfer_len=512, timeout=60000, flags=0, data[512]=[40, 00, ff, 3f, 37, c8, 10, 00, 00, 00, 00, 00, 3f, 00, 00, 00, 00, 00, 00, 00, 39, 31, 35, 30, 42, 41, 30, 38, 31, 32, 39, 31, ...], status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=0, duration=1, info=0}) = 0
    13:53:01 brk(0x584000)                  = 0x584000
    13:53:01 brk(0x5a7000)                  = 0x5a7000
    13:53:01 brk(0x5c8000)                  = 0x5c8000
    13:53:01 ioctl(3, SG_IO, {'S', SG_DXFER_FROM_DEV, cmd[16]=[85, 08, 0e, 00, d0, 00, 01, 00, 00, 00, 4f, 00, c2, 00, b0, 00], mx_sb_len=32, iovec_count=0, dxfer_len=512, timeout=60000, flags=0, data[512]=[04, 00, 05, 32, 00, 64, 64, 00, 00, 00, 00, 00, 00, 00, 09, 32, 00, 64, 64, 1a, 22, 00, 00, 00, 00, 00, 0c, 32, 00, 64, 64, 7f, ...], status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=0, duration=8, info=0}) = 0
    13:53:01 ioctl(3, SG_IO, {'S', SG_DXFER_FROM_DEV, cmd[16]=[85, 08, 0e, 00, d1, 00, 01, 00, 01, 00, 4f, 00, c2, 00, b0, 00], mx_sb_len=32, iovec_count=0, dxfer_len=512, timeout=60000, flags=0, data[512]=[04, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, ...], status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=0, duration=0, info=0}) = 0
    13:53:01 write(1, "=== START OF READ SMART DATA SEC"..., 41) = 41
    13:53:01 ioctl(3, SG_IO, {'S', SG_DXFER_NONE, cmd[16]=[85, 06, 2c, 00, da, 00, 00, 00, 00, 00, 4f, 00, c2, 00, b0, 00], mx_sb_len=32, iovec_count=0, dxfer_len=0, timeout=60000, flags=0, status=02, masked_status=01, sb[22]=[72, 01, 00, 1d, 00, 00, 00, 0e, 09, 0c, 00, 00, 00, 00, 00, 00, 00, 4f, 00, c2, 00, 50], host_status=0, driver_status=0x8, resid=0, duration=8, info=0x1}) = 0
    13:53:01 write(1, "SMART overall-health self-assess"..., 57) = 57
    13:53:01 write(1, "\n", 1)              = -1 EPIPE (Broken pipe)
    13:53:01 --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=8668, si_uid=0} ---
    13:53:01 close(3)                       = 0
    13:53:01 exit_group(0)                  = ?
    13:53:01 +++ exited with 0 +++
    
    
    strace: Process 20585 attached
    14:09:01 brk(0x5c7000)                  = 0x5c7000
    14:09:01 ioctl(3, SG_IO, {'S', SG_DXFER_FROM_DEV, cmd[16]=[85, 08, 0e, 00, d0, 00, 01, 00, 00, 00, 4f, 00, c2, 00, b0, 00], mx_sb_len=32, iovec_count=0, dxfer_len=512, timeout=60000, flags=0, data[512]=[10, 00, 01, 0b, 00, 64, 64, 00, 00, 00, 00, 00, 00, 00, 02, 05, 00, 84, 84, 60, 00, 00, 00, 00, 00, 00, 03, 07, 00, a0, a0, 9b, ...], status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=0, duration=5, info=0}) = 0
    14:09:01 ioctl(3, SG_IO, {'S', SG_DXFER_FROM_DEV, cmd[16]=[85, 08, 0e, 00, d1, 00, 01, 00, 01, 00, 4f, 00, c2, 00, b0, 00], mx_sb_len=32, iovec_count=0, dxfer_len=512, timeout=60000, flags=0, data[512]=[10, 00, 01, 10, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 02, 36, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 03, 18, 00, 00, 00, 00, ...], status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=0, duration=3, info=0}) = 0
    14:09:01 write(1, "=== START OF READ SMART DATA SEC"..., 41) = 41
    14:09:01 ioctl(3, SG_IO, {'S', SG_DXFER_NONE, cmd[16]=[85, 06, 2c, 00, da, 00, 00, 00, 00, 00, 4f, 00, c2, 00, b0, 00], mx_sb_len=32, iovec_count=0, dxfer_len=0, timeout=60000, flags=0, status=02, masked_status=01, sb[22]=[72, 01, 00, 1d, 00, 00, 00, 0e, 09, 0c, 00, 00, 00, 00, 00, 00, 00, 4f, 00, c2, 00, 50], host_status=0, driver_status=0x8, resid=0, duration=1, info=0x1}) = 0
    14:09:01 write(1, "SMART overall-health self-assess"..., 57) = 57
    14:09:01 write(1, "\n", 1)              = -1 EPIPE (Broken pipe)
    14:09:01 --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=20585, si_uid=0} ---
    14:09:01 close(3)                       = 0
    14:09:01 exit_group(0)                  = ?
    14:09:01 +++ exited with 0 +++

     

    Unable to detect device t (happens every now and then)

    strace: Process 3649 attached
    strace: [ Process PID=3649 runs in x32 mode. ]
    strace: [ Process PID=3649 runs in 64 bit mode. ]
    13:57:01 read(3, "8:0x61b6)\n    \"ST(250|320|500|64"..., 4096) = 4096
    13:57:01 read(3, "   \"-v 12,raw48,Device_Power_Cyc"..., 4096) = 4096
    13:57:01 read(3, " 9,minutes\"\n  },\n  { \"Maxtor Dia"..., 4096) = 4096
    13:57:01 read(3, "                // it might need"..., 4096) = 4096
    13:57:01 read(3, "3[015]\",\n    \"\", \"\", \"\"\n  },\n  {"..., 4096) = 4096
    13:57:01 read(3, "   \"(Hitachi )?HDT7210((16|25)SL"..., 4096) = 4096
    13:57:01 read(3, " },\n  { \"Toshiba 2.5\\\" HDD MK..4"..., 4096) = 4096
    13:57:01 read(3, "04ACA500/FP1A\n    \"TOSHIBA MD04A"..., 4096) = 4096
    13:57:01 read(3, "    \"\", \"\", \"\"\n  },\n  { \"Seagate"..., 4096) = 4096
    13:57:01 read(3, "\"\", \"\"\n  },\n  { \"Seagate Barracu"..., 4096) = 4096
    13:57:01 read(3, "T3000DM001\",\n    \"\", \"\",\n    \"-v"..., 4096) = 4096
    13:57:01 read(3, "Constellation ES (SATA 6Gb/s)\", "..., 4096) = 4096
    13:57:01 read(3, "  \"\", \"\", \"\"\n  },\n  { \"Seagate M"..., 4096) = 4096
    13:57:01 read(3, "\"-v 187,raw48,Uncorrectable_ECC_"..., 4096) = 4096
    13:57:01 read(3, "bly explained by the WD firmware"..., 4096) = 4096
    13:57:01 read(3, "viar Green\", // tested with WDC "..., 4096) = 4096
    13:57:01 read(3, "/82.00A82,\n      // WDC WD80EFAX"..., 4096) = 4096
    13:57:01 read(3, "SB ID entries\n  ////////////////"..., 4096) = 4096
    13:57:01 read(3, " \"USB: Samsung; \",\n    \"0x04e8:0"..., 4096) = 4096
    13:57:01 read(3, "at\"\n  },\n  // Micron\n  { \"USB: M"..., 4096) = 4096
    13:57:01 read(3, "  { \"USB: Maxtor OneTouch 4; \",\n"..., 4096) = 4096
    13:57:01 read(3, "    \"\", // 0x0114\n    \"\", // 0x0"..., 4096) = 4096
    13:57:01 read(3, " usbjmicron\"\n  },\n  { \"USB: Verb"..., 4096) = 3817
    13:57:01 read(3, "", 4096)              = 0
    13:57:01 close(3)                       = 0
    13:57:01 lstat("/dev/", {st_mode=S_IFDIR|0755, st_size=3920, ...}) = 0
    13:57:01 write(1, "/dev/: Unable to detect device t"..., 36) = 36
    13:57:01 write(1, "Please specify device type with "..., 47) = 47
    13:57:01 write(1, "\nUse smartctl -h to get a usage "..., 41) = 41
    13:57:01 exit_group(1)                  = ?
    13:57:01 +++ exited with 1 +++

     

    And those spindown / standby checks appeared (shortened)

    14:40:01 openat(AT_FDCWD, "/dev/sdb", O_RDONLY|O_NONBLOCK) = 3
    14:40:01 write(1, "Device is in STANDBY mode, exit("..., 35) = 35
    14:53:44 openat(AT_FDCWD, "/dev/sdg", O_RDONLY|O_NONBLOCK) = 4
    14:53:44 write(1, "Device is in STANDBY mode, exit("..., 35) = 35
    14:55:01 openat(AT_FDCWD, "/dev/sdj", O_RDONLY|O_NONBLOCK) = 3
    14:55:01 write(1, "Device is in STANDBY mode, exit("..., 35) = 35
    14:59:01 openat(AT_FDCWD, "/dev/sdd", O_RDONLY|O_NONBLOCK) = 3
    14:59:01 write(1, "Device is in STANDBY mode, exit("..., 35) = 35
    15:02:01 openat(AT_FDCWD, "/dev/sdk", O_RDONLY|O_NONBLOCK) = 3
    15:02:01 write(1, "Device is in STANDBY mode, exit("..., 35) = 35
    15:02:01 openat(AT_FDCWD, "/dev/sdh", O_RDONLY|O_NONBLOCK) = 3
    15:02:01 write(1, "Device is in STANDBY mode, exit("..., 35) = 35
    15:11:01 openat(AT_FDCWD, "/dev/sdg", O_RDONLY|O_NONBLOCK) = 3
    15:11:01 write(1, "Device is in STANDBY mode, exit("..., 35) = 35
    15:11:01 openat(AT_FDCWD, "/dev/sdi", O_RDONLY|O_NONBLOCK) = 3
    15:11:01 write(1, "Device is in STANDBY mode, exit("..., 35) = 35
    15:11:01 openat(AT_FDCWD, "/dev/sdb", O_RDONLY|O_NONBLOCK) = 3
    15:11:01 write(1, "Device is in STANDBY mode, exit("..., 35) = 35
    15:11:01 openat(AT_FDCWD, "/dev/sdf", O_RDONLY|O_NONBLOCK) = 3
    15:11:01 write(1, "Device is in STANDBY mode, exit("..., 35) = 35
    15:30:01 openat(AT_FDCWD, "/dev/sdd", O_RDONLY|O_NONBLOCK) = 3
    15:30:01 write(1, "Device is in STANDBY mode, exit("..., 35) = 35
    15:33:01 openat(AT_FDCWD, "/dev/sdk", O_RDONLY|O_NONBLOCK) = 3
    15:33:01 write(1, "Device is in STANDBY mode, exit("..., 35) = 35
    

     

    In the last hour it did not check sdc, sdl and sde. Why?

     

    And why are these checks done in a random order and with a different amount of disks? 




    User Feedback

    Recommended Comments

    Ok, monitored even longer and now a check for sde appeared:

    16:34:02 openat(AT_FDCWD, "/dev/sde", O_RDONLY|O_NONBLOCK) = 3
    16:34:02 write(1, "Device is in STANDBY mode, exit("..., 35) = 35

     

    sdc is still missing, but sdl appeared multiple times with a pipe error:

    strace: Process 27020 attached
    strace: [ Process PID=27020 runs in x32 mode. ]
    strace: [ Process PID=27020 runs in 64 bit mode. ]
    17:15:01 read(3, "\"\", \"\"\n  },\n  { \"Seagate Barracu"..., 4096) = 4096
    17:15:01 read(3, "T3000DM001\",\n    \"\", \"\",\n    \"-v"..., 4096) = 4096
    17:15:01 read(3, "Constellation ES (SATA 6Gb/s)\", "..., 4096) = 4096
    17:15:01 read(3, "  \"\", \"\", \"\"\n  },\n  { \"Seagate M"..., 4096) = 4096
    17:15:01 read(3, "\"-v 187,raw48,Uncorrectable_ECC_"..., 4096) = 4096
    17:15:01 read(3, "bly explained by the WD firmware"..., 4096) = 4096
    17:15:01 read(3, "viar Green\", // tested with WDC "..., 4096) = 4096
    17:15:01 read(3, "/82.00A82,\n      // WDC WD80EFAX"..., 4096) = 4096
    17:15:01 read(3, "SB ID entries\n  ////////////////"..., 4096) = 4096
    17:15:01 read(3, " \"USB: Samsung; \",\n    \"0x04e8:0"..., 4096) = 4096
    17:15:01 read(3, "at\"\n  },\n  // Micron\n  { \"USB: M"..., 4096) = 4096
    17:15:01 read(3, "  { \"USB: Maxtor OneTouch 4; \",\n"..., 4096) = 4096
    17:15:01 read(3, "    \"\", // 0x0114\n    \"\", // 0x0"..., 4096) = 4096
    17:15:01 read(3, " usbjmicron\"\n  },\n  { \"USB: Verb"..., 4096) = 3817
    17:15:01 read(3, "", 4096)              = 0
    17:15:01 close(3)                       = 0
    17:15:01 lstat("/dev/sdl", {st_mode=S_IFBLK|0660, st_rdev=makedev(8, 176), ...}) = 0
    17:15:01 stat("/sys/devices", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
    17:15:01 stat("/sys/block/sdl/device/..", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
    17:15:01 access("/sys/block/sdl/device/../idVendor", F_OK) = -1 ENOENT (No such file or directory)
    17:15:01 stat("/sys/block/sdl/device/../..", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
    17:15:01 access("/sys/block/sdl/device/../../idVendor", F_OK) = -1 ENOENT (No such file or directory)
    17:15:01 stat("/sys/block/sdl/device/../../..", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
    17:15:01 access("/sys/block/sdl/device/../../../idVendor", F_OK) = -1 ENOENT (No such file or directory)
    17:15:01 stat("/sys/block/sdl/device/../../../..", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
    17:15:01 access("/sys/block/sdl/device/../../../../idVendor", F_OK) = -1 ENOENT (No such file or directory)
    17:15:01 stat("/sys/block/sdl/device/../../../../..", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
    17:15:01 access("/sys/block/sdl/device/../../../../../idVendor", F_OK) = -1 ENOENT (No such file or directory)
    17:15:01 stat("/sys/block/sdl/device/../../../../../..", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
    17:15:01 access("/sys/block/sdl/device/../../../../../../idVendor", F_OK) = -1 ENOENT (No such file or directory)
    17:15:01 stat("/sys/block/sdl/device/../../../../../../..", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
    17:15:01 lstat("/sys", {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
    17:15:01 lstat("/sys/block", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
    17:15:01 lstat("/sys/block/sdl", {st_mode=S_IFLNK|0777, st_size=0, ...}) = 0
    17:15:01 readlink("/sys/block/sdl", "../devices/pci0000:00/0000:00:1d"..., 4095) = 92
    17:15:01 lstat("/sys/devices", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
    17:15:01 lstat("/sys/devices/pci0000:00", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
    17:15:01 lstat("/sys/devices/pci0000:00/0000:00:1d.0", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
    17:15:01 lstat("/sys/devices/pci0000:00/0000:00:1d.0/0000:05:00.0", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
    17:15:01 lstat("/sys/devices/pci0000:00/0000:00:1d.0/0000:05:00.0/ata13", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
    17:15:01 lstat("/sys/devices/pci0000:00/0000:00:1d.0/0000:05:00.0/ata13/host13", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
    17:15:01 lstat("/sys/devices/pci0000:00/0000:00:1d.0/0000:05:00.0/ata13/host13/target13:0:0", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
    17:15:01 lstat("/sys/devices/pci0000:00/0000:00:1d.0/0000:05:00.0/ata13/host13/target13:0:0/13:0:0:0", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
    17:15:01 lstat("/sys/devices/pci0000:00/0000:00:1d.0/0000:05:00.0/ata13/host13/target13:0:0/13:0:0:0/block", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
    17:15:01 lstat("/sys/devices/pci0000:00/0000:00:1d.0/0000:05:00.0/ata13/host13/target13:0:0/13:0:0:0/block/sdl", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
    17:15:01 lstat("/sys/devices/pci0000:00/0000:00:1d.0/0000:05:00.0/ata13/host13/target13:0:0/13:0:0:0/block/sdl/device", {st_mode=S_IFLNK|0777, st_size=0, ...}) = 0
    17:15:01 readlink("/sys/devices/pci0000:00/0000:00:1d.0/0000:05:00.0/ata13/host13/target13:0:0/13:0:0:0/block/sdl/device", "../../../13:0:0:0", 4095) = 17
    17:15:01 lstat("/sys/devices/pci0000:00/0000:00:1d.0/0000:05:00.0/ata13/host13/target13:0:0/13:0:0:0", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
    17:15:01 openat(AT_FDCWD, "/sys/class/scsi_host/host13/proc_name", O_RDONLY) = 3
    17:15:01 read(3, "ahci\n", 31)          = 5
    17:15:01 close(3)                       = 0
    17:15:01 openat(AT_FDCWD, "/dev/sdl", O_RDONLY|O_NONBLOCK) = 3
    17:15:01 fcntl(3, F_SETFD, FD_CLOEXEC)  = 0
    17:15:01 ioctl(3, SG_IO, {'S', SG_DXFER_FROM_DEV, cmd[6]=[12, 00, 00, 00, 24, 00], mx_sb_len=32, iovec_count=0, dxfer_len=36, timeout=60000, flags=0, data[36]=[00, 00, 05, 02, 5b, 00, 00, 02, 41, 54, 41, 20, 20, 20, 20, 20, 48, 47, 53, 54, 20, 48, 55, 48, 37, 32, 31, 32, 31, 32, 41, 4c, ...], status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=0, duration=0, info=0}) = 0
    17:15:01 ioctl(3, SG_IO, {'S', SG_DXFER_FROM_DEV, cmd[16]=[85, 08, 0e, 00, 00, 00, 01, 00, 00, 00, 00, 00, 00, 00, ec, 00], mx_sb_len=32, iovec_count=0, dxfer_len=512, timeout=60000, flags=0, data[512]=[5a, 04, ff, 3f, 37, c8, 10, 00, 00, 00, 00, 00, 3f, 00, 00, 00, 00, 00, 00, 00, 43, 38, 55, 48, 48, 44, 45, 45, 20, 20, 20, 20, ...], status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=0, duration=0, info=0}) = 0
    17:15:01 ioctl(3, SG_IO, {'S', SG_DXFER_NONE, cmd[16]=[85, 06, 2c, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, e5, 00], mx_sb_len=32, iovec_count=0, dxfer_len=0, timeout=60000, flags=0, status=02, masked_status=01, sb[22]=[72, 01, 00, 1d, 00, 00, 00, 0e, 09, 0c, 00, 00, 00, 81, 00, 00, 00, 00, 00, 00, 00, 50], host_status=0, driver_status=0x8, resid=0, duration=0, info=0x1}) = 0
    17:15:01 ioctl(3, SG_IO, {'S', SG_DXFER_FROM_DEV, cmd[16]=[85, 08, 0e, 00, 00, 00, 01, 00, 00, 00, 00, 00, 00, 00, ec, 00], mx_sb_len=32, iovec_count=0, dxfer_len=512, timeout=60000, flags=0, data[512]=[5a, 04, ff, 3f, 37, c8, 10, 00, 00, 00, 00, 00, 3f, 00, 00, 00, 00, 00, 00, 00, 43, 38, 55, 48, 48, 44, 45, 45, 20, 20, 20, 20, ...], status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=0, duration=0, info=0}) = 0
    17:15:01 brk(0x584000)                  = 0x584000
    17:15:01 brk(0x5a6000)                  = 0x5a6000
    17:15:01 brk(0x5c7000)                  = 0x5c7000
    17:15:01 ioctl(3, SG_IO, {'S', SG_DXFER_FROM_DEV, cmd[16]=[85, 08, 0e, 00, d0, 00, 01, 00, 00, 00, 4f, 00, c2, 00, b0, 00], mx_sb_len=32, iovec_count=0, dxfer_len=512, timeout=60000, flags=0, data[512]=[10, 00, 01, 0b, 00, 64, 64, 00, 00, 00, 00, 00, 00, 00, 02, 05, 00, 84, 84, 60, 00, 00, 00, 00, 00, 00, 03, 07, 00, 9f, 9f, 9c, ...], status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=0, duration=5, info=0}) = 0
    17:15:01 ioctl(3, SG_IO, {'S', SG_DXFER_FROM_DEV, cmd[16]=[85, 08, 0e, 00, d1, 00, 01, 00, 01, 00, 4f, 00, c2, 00, b0, 00], mx_sb_len=32, iovec_count=0, dxfer_len=512, timeout=60000, flags=0, data[512]=[10, 00, 01, 10, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 02, 36, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 03, 18, 00, 00, 00, 00, ...], status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=0, duration=3, info=0}) = 0
    17:15:01 write(1, "=== START OF READ SMART DATA SEC"..., 41) = 41
    17:15:01 ioctl(3, SG_IO, {'S', SG_DXFER_NONE, cmd[16]=[85, 06, 2c, 00, da, 00, 00, 00, 00, 00, 4f, 00, c2, 00, b0, 00], mx_sb_len=32, iovec_count=0, dxfer_len=0, timeout=60000, flags=0, status=02, masked_status=01, sb[22]=[72, 01, 00, 1d, 00, 00, 00, 0e, 09, 0c, 00, 00, 00, 00, 00, 00, 00, 4f, 00, c2, 00, 50], host_status=0, driver_status=0x8, resid=0, duration=1, info=0x1}) = 0
    17:15:01 write(1, "SMART overall-health self-assess"..., 57) = 57
    17:15:01 write(1, "\n", 1)              = -1 EPIPE (Broken pipe)
    17:15:01 --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=27020, si_uid=0} ---
    17:15:01 close(3)                       = 0
    17:15:01 exit_group(0)                  = ?
    17:15:01 +++ exited with 0 +++

     

    Now I started a movie which is on sdl and the icon became directly active in the WebGUI:

     

    1114859516_2021-01-2517_51_23.png.a4b7181e99b927181c28e0347f5202c2.png

     

    So its only something related the standby status which fails.

     

    Link to comment

    One bug location found:

    13:57:01 lstat("/dev/", {st_mode=S_IFDIR|0755, st_size=3920, ...}) = 0
    13:57:01 write(1, "/dev/: Unable to detect device t"..., 36) = 36
    13:57:01 write(1, "Please specify device type with "..., 47) = 47
    13:57:01 write(1, "\nUse smartctl -h to get a usage "..., 41) = 41

     

    This is caused in /usr/local/emhttp/plugins/dynamix/scripts/monitor in this line:

    if (!file_exists($file) || (time()-filemtime($file)>$var['poll_attributes'])) exec("smartctl -n standby -H $type ".escapeshellarg("/dev/$port")."|grep -Pom1 '^SMART.*: \K[A-Z]+'|tr -d '\n' >".escapeshellarg($file));

     

    It happens because $port is sometimes empty:

    /usr/local/emhttp/plugins/dynamix/scripts/monitor170smartctl -n standby -H  '/dev/sdh'|grep -Pom1 '^SMART.*: \K[A-Z]+'|tr -d '
    ' >'/var/local/emhttp/smart/disk8.ssa'
    /usr/local/emhttp/plugins/dynamix/scripts/monitor170smartctl -n standby -H  '/dev/sdd'|grep -Pom1 '^SMART.*: \K[A-Z]+'|tr -d '
    ' >'/var/local/emhttp/smart/disk9.ssa'
    /usr/local/emhttp/plugins/dynamix/scripts/monitor170smartctl -n standby -H  '/dev/'|grep -Pom1 '^SMART.*: \K[A-Z]+'|tr -d '
    ' >'/var/local/emhttp/smart/parity2.ssa'
    /usr/local/emhttp/plugins/dynamix/scripts/monitor170smartctl -n standby -H  '/dev/sdj'|grep -Pom1 '^SMART.*: \K[A-Z]+'|tr -d '
    ' >'/var/local/emhttp/smart/cache.ssa'
    /usr/local/emhttp/plugins/dynamix/scripts/monitor170smartctl -n standby -H  '/dev/sdg'|grep -Pom1 '^SMART.*: \K[A-Z]+'|tr -d '
    ' >'/var/local/emhttp/smart/parity.ssa'

     

    I think its because of "parity2". I don't have a second parity. So this check should not happen.

    Link to comment
    On 1/26/2021 at 11:45 AM, mgutt said:

    It happens because $port is sometimes empty:

    
    
     

     

    Now found the reason why $port is sometimes empty:

    $disks = parse_ini_file("/var/local/emhttp/disks.ini",true);
    ...
    foreach ($disks as $disk) {
    ...
      check_smart($name,port_name($disk['smDevice'] ?? $disk['device']),$text,$info);

     

    Its because "/var/local/emhttp/disks.ini" contains data for parity2 which is not used:

    ["parity2"]
    idx="29"
    name="parity2"
    device=""
    id=""
    size="0"
    status="DISK_NP_DSBL"
    rotational=""
    format="-"
    temp="*"
    numReads="0"
    numWrites="0"
    numErrors="0"
    type="Parity"
    color="grey-off"
    fsStatus="-"
    spindownDelay="-1"
    spinupGroup=""
    deviceSb=""
    idSb=""
    sizeSb="0"


    As we can see there is no "smDevice" key and "device" is empty, which causes this error.

     

    Bug fix

     

    search in "/usr/local/emhttp/plugins/dynamix/scripts/monitor" for:

    if ($name=='flash' || substr($disk['status'],-3)=='_NP') continue;

     

    replace against:

    if ($name=='flash' || strpos($disk['status'],'_NP')) continue;

     

    This is because parity2 has the status "DISK_NP_DSBL" and all other empty disk slots have "DISK_NP".

     

    Now I try to find the other bugs ;)

    Link to comment
    Quote

    Broken Pipe errors (happen often)

     

    Ok, I found the reason in "/usr/local/emhttp/plugins/dynamix/scripts/monitor" for this, too. It's not really a bug. It happens because the output of smartctl is piped to grep and grep stops the pipe because "SMART" was found:

    if (!file_exists($file) || (time()-filemtime($file)>$var['poll_attributes'])) exec("smartctl -n standby -H $type ".escapeshellarg("/dev/$port")."|grep -Pom1 '^SMART.*: \K[A-Z]+'|tr -d '\n' >".escapeshellarg($file));

     

    Two examples, the second causes the broken pipe error:

    root@Thoth:~# smartctl -n standby -H  '/dev/sdc'|grep -Pom1 '^SMART.*: \K[A-Z]+'
    
    root@Thoth:~# smartctl -n standby -H  '/dev/sdc'
    smartctl 7.1 2019-12-30 r5022 [x86_64-linux-4.19.107-Unraid] (local build)
    Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org
    
    Device is in STANDBY mode, exit(2)

     

    root@Thoth:~# smartctl -n standby -H  '/dev/sdb'|grep -Pom1 '^SMART.*: \K[A-Z]+'
    PASSED
    root@Thoth:~# smartctl -n standby -H  '/dev/sdb'
    smartctl 7.1 2019-12-30 r5022 [x86_64-linux-4.19.107-Unraid] (local build)
    Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org
    
    === START OF READ SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED

     

    It could be fixed by passing the result to PHP first and let PHP check for "SMART", but it's ok to leave it as it is.

     

    Now I need to find the major bug: Why are some disks not monitored at all.

    Link to comment

    Interesting, I have a similar problem, automatic spindown worked for about two weeks then just stopped. New Install, new Hardware, all sorts of trying things out in the beginning (first build). I was running 6.9.rc2 at the time and thought it might get fixed with the stable release, which it didn't. Also tried all the file activity plugins, running without dockers and vms, nothing. Pretty sure I also tried the 'HardwareProfile' page in Tools at some point.

     

    Unneccesarry spin ups were sort of fixed with 'smartcl -n standby /dev/sdX' but still auto spindown doesn't work at all.

    I also run only one parity drive and got the irelevant second entry in '/var/local/emhttp/disks.ini'. I also get the pipe errors etc. same as you.

     

    I added the fix in '/usr/local/emhttp/plugins/dynamix/scripts/monitor' but I suppose that is primarily to fix your original issue, which I didn't notice.

     

    I'm glad someone else with more knowledge got the problem as well, I got the feeling there is more people (some bug reports indicting similiar problems) and believe spin down not working should be treated as a little more than a minor bug. Looking forward to your diagnosis!

     

     

    Link to comment

    Nevermind, I just uninstalled 'Dynamix Auto Fan Control' and my problems went away, everything is spinning down as set up, now all that remains is to test wether the plugin works with spinning drives excluded.

     

    Link to comment

    I have a similar problem too, in version 6.9.0 RC2 and above.

    I worte things about that in another threat (I think it was in wrong section).

     

    Quote

    I got the same error. I noticed this error with 6.9.0 RC2. Before that Version everything was working like expected.

    In the Array Monitor the disks were shown as spin down, but I can hear the disks running.

     

    Edit:

    My problem is mayby a bit different as the described one above.

    I'm using a small server with an external 4-bay USB 3.0 storage device from Fantec. It is connected by USB 3.0 and I have 4 WD Red in that case.

    Additionally 2 WD Red + 1 SSD in the server case connected via S-ATA.

     

    Since UNRaid 6.9.0 RC2 the disks in the external case were not detected corretly. If the drives spin-up because of an access, UNRaid doesn't notice that. I got no SMART values. And because of that they won't spin-down correctly, because UNRaid doesn't know that they're up. But if I spin them up manually everything is working fine. They spin down after 30 min and I get the values.

     

    Every drive on the screenshot should be green, because they're running.

     

    I hope, that someony has an idea what I can do.

     

    Quote

    I'm back on 6.9.0 RC1, the last working build for me.

     

    I found a small difference in the log between 6.9.0 RC1 and 6.9.0. If I access one of these external drives than appears the following message in the log file:

    "Mar 3 08:15:29 Media-Server emhttpd: read SMART /dev/sdc"

    That doesn't happen in the release version and that's the reason why UNRaid isn't able to spin-down the drives, because UNRaid doesn't notice that the access happens and the drive spun-up.

     

    I attached a new diagnostic file from RC1.

     

    I've attached the diagnostics from 6.9.0 and 6.9.0 RC1.

    UNRaidArray.PNG

    6.9.0RC1_media-server-diagnostics-20210303-0816.zip 6.9.0_media-server-diagnostics-20210302-1301.zip

    Link to comment
    2 minutes ago, mgutt said:

    Your problem is different and can't be solved as its a hardware limitation:

    https://forums.unraid.net/bug-reports/stable-releases/683-usb-hdds-randomly-spin-up-but-status-stays-unchanged-r1091/

     

     

    Hmm okay, but I think its not the same as in your threat, because I had no problems with the beta releases of 6.9.0 and RC1, the problem just appears at RC2. And the disks don't spin up randomly, they spin up, if an access happens, but UNRaid doesn't notice that.

     

    Should I open a new threat?

    Link to comment


    Join the conversation

    You can post now and register later. If you have an account, sign in now to post with your account.
    Note: Your post will require moderator approval before it will be visible.

    Guest
    Add a comment...

    ×   Pasted as rich text.   Restore formatting

      Only 75 emoji are allowed.

    ×   Your link has been automatically embedded.   Display as a link instead

    ×   Your previous content has been restored.   Clear editor

    ×   You cannot paste images directly. Upload or insert images from URL.


  • Status Definitions

     

    Open = Under consideration.

     

    Solved = The issue has been resolved.

     

    Solved version = The issue has been resolved in the indicated release version.

     

    Closed = Feedback or opinion better posted on our forum for discussion. Also for reports we cannot reproduce or need more information. In this case just add a comment and we will review it again.

     

    Retest = Please retest in latest release.


    Priority Definitions

     

    Minor = Something not working correctly.

     

    Urgent = Server crash, data loss, or other showstopper.

     

    Annoyance = Doesn't affect functionality but should be fixed.

     

    Other = Announcement or other non-issue.