Cache disk disappeared overnight and returned as an unassigned device

ati · August 18, 2020

I recently set up a new cache RAID-1 pool for running a few dockers. Nothing fancy, so I used some old mechanical drives. I basically slapped them both into unRAID and assigned them as cache drives, and it did the rest of the work of creating the RAID-1 array. I then used the unBalance plugin to move the default 4 folders to those drives.

Overnight one of the cache drives went missing (screenshot 1).

What is strange is that the cache section of the unRAID main page doesn't show the drive as missing (screenshot 2), yet the same drive does show up under the Unassigned Devices section.

What is even more strange to me is that when I go to the main unRAID dashboard it doesn't even show the same Unassigned Devices as the main page does (screenshot 3).

I am super lost and a little confused. I haven't stopped the array or restarted, but I am sure that'd fix the issue this time around. I am more interested in why it happened and how I can prevent it in the future. What most worries me is the drive doesn't show as missing in the webUI and the main and dashboard pages don't agree on the Unassigned Devices.

Edited August 18, 2020 by ati
Move pics

JorgeB · August 19, 2020

Please post the diagnostics: Tools -> Diagnostics

ati · August 19, 2020

I went to bed last night and realized I never posted the diagnostics. 🤦‍♂️

Since my first post I have removed 1 unassigned device (not the one in question) and added 3 more that are currently in a pre-clear process, otherwise nothing has changed.

unraid-diagnostics-20200819-0621.zip

JorgeB · August 19, 2020

Disk dropped offline and then reconnected with a different ID:

Aug 17 03:41:35 NAS kernel: sd 5:0:4:0: attempting task abort! scmd(00000000389ba74e)
Aug 17 03:41:35 NAS kernel: sd 5:0:4:0: [sdf] tag#118 CDB: opcode=0x35 35 00 00 00 00 00 00 00 00 00
Aug 17 03:41:35 NAS kernel: scsi target5:0:4: handle(0x000e), sas_address(0x50030480013831f5), phy(21)
Aug 17 03:41:35 NAS kernel: scsi target5:0:4: enclosure logical id(0x50030480013831ff), slot(9)
Aug 17 03:41:35 NAS kernel: sd 5:0:4:0: device_block, handle(0x000e)
Aug 17 03:41:37 NAS kernel: sd 5:0:4:0: device_unblock and setting to running, handle(0x000e)
Aug 17 03:41:37 NAS kernel: sd 5:0:4:0: [sdf] Synchronizing SCSI cache
Aug 17 03:41:37 NAS kernel: sd 5:0:4:0: [sdf] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=0x00
Aug 17 03:41:37 NAS rc.diskinfo[8980]: SIGHUP received, forcing refresh of disks info.
Aug 17 03:41:38 NAS kernel: scsi 5:0:4:0: task abort: SUCCESS scmd(00000000389ba74e)
Aug 17 03:41:38 NAS kernel: scsi 5:0:4:0: rejecting I/O to dead device

It could be a connection issue, but SMART is also showing some issues, so you should run an extended SMART test. Also see here for better pool monitoring.
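If you want to check the syslog for this kind of drop yourself, something along these lines might help. It is only a rough sketch: the function name `find_drop_events` and the `DROP_PATTERNS` list are made up for illustration, and the patterns are simply the messages that appear in the excerpt above, not a complete list of what the kernel can log when a disk goes offline.

```python
import re

# Messages taken from the log excerpt above. This list is an assumption
# for illustration, not an exhaustive set of kernel drop messages.
DROP_PATTERNS = (
    "attempting task abort",
    "Synchronize Cache(10) failed",
    "rejecting I/O to dead device",
)

# Captures the SCSI address (e.g. 5:0:4:0), the optional device name in
# brackets (e.g. sdf), and the rest of the message.
LINE_RE = re.compile(r"(?:sd|scsi) (\d+:\d+:\d+:\d+): (?:\[(\w+)\] )?(.*)")


def find_drop_events(lines):
    """Return (scsi_address, device_or_None, message) for suspicious lines."""
    events = []
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # not a per-device sd/scsi kernel line
        address, device, message = match.groups()
        if any(pattern in message for pattern in DROP_PATTERNS):
            events.append((address, device, message))
    return events


if __name__ == "__main__":
    sample = [
        "Aug 17 03:41:35 NAS kernel: sd 5:0:4:0: attempting task abort! scmd(00000000389ba74e)",
        "Aug 17 03:41:37 NAS kernel: sd 5:0:4:0: [sdf] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=0x00",
        "Aug 17 03:41:38 NAS kernel: scsi 5:0:4:0: rejecting I/O to dead device",
    ]
    for address, device, message in find_drop_events(sample):
        print(address, device or "-", message)
```

You could feed it the lines of the syslog file from the diagnostics zip instead of the sample list. Any hit is only a hint that a device dropped, so it is still worth checking SMART and the cabling.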

ati · August 19, 2020

Yeah, the disk has some pending sectors. I can understand it dropped because it's failing - that's fair.

That doesn't explain why the unRAID webUI still wasn't reading correctly on the dashboard page. Plus, if the drive reconnected, why didn't unRAID recognize that it's back and add it back into the cache array? I'm a little lost as to why there is no notification of a missing disk whatsoever on the main page like there would be for a data drive.

Just seems like I'm missing something more...

Edited August 19, 2020 by ati
