I was performing a routine upgrade, replacing an old 1.5 TB drive (Disk 3) with my previous 8 TB parity drive (I upgraded parity to 14 TB about a week ago). Upon boot, I saw that I needed to assign Disk 3, as expected. However, once I started the array and the data rebuild began, Disk 1 immediately entered an error state. Unraid alerts post to a Slack channel I set up:
Files Davis 4:46 PM
Warning [TOWER] - Disk 3, drive not ready, content being reconstructed
WDC_WD80EMAZ-00WJTA0_7JJYY38C (sdd)
4:47
Alert [TOWER] - Disk 1 in error state (disk dsbl)
WDC_WD30EFRX-68AX9N0_WD-WMC1T3212961 (sdb)
I then lost remote access to the system completely (web/SSH/ping), although the server was still powered on. I run it headless and didn't have a monitor handy, so I power-cycled first to see if anything would change. The server booted, but I lost ping again shortly afterward and never got to the GUI on that boot. I pulled the server, swapped the SATA cable on Disk 1 for a spare, put the original 1.5 TB drive back in as Disk 3, and booted again. I'm still seeing errors on Disk 1, and Disk 3 now shows as not installed. I'm guessing that is because I had already committed the Disk 3 change before the Disk 1 problem appeared. I started the array in maintenance mode to run a file system check, as recommended in the wiki. Results of reiserfsck on Disk 1:
reiserfsck 3.6.27
Will read-only check consistency of the filesystem on /dev/md1
Will put log info to 'stdout'
The problem has occurred looks like a hardware problem. If you have
bad blocks, we advise you to get a new hard drive, because once you
get one bad block that the disk drive internals cannot hide from
your sight,the chances of getting more are generally said to become
much higher (precise statistics are unknown to us), and this disk
drive is probably not expensive enough for you to you to risk your
time and data on it. If you don't want to follow that follow that
advice then if you have just a few bad blocks, try writing to the
bad blocks and see if the drive remaps the bad blocks (that means
it takes a block it has in reserve and allocates it for use for
of that block number). If it cannot remap the block, use badblock
option (-B) with reiserfs utils to handle this block correctly.
bread: Cannot read the block (2): (Input/output error).
I'm at a point where I'm not sure what my next step should be to minimize the potential for data loss. Since I've only been running single parity and two disks are now affected (Disk 1 in an error state, Disk 3 not rebuilt), the array is currently unrecoverable from parity alone. I do still have the 1.5 TB drive with whatever data it contained, and I believe it to be in a working state. Diagnostics are attached, albeit from my most recent boot only. I have not shut down or made any further array changes.
tower-diagnostics-20220327-1856.zip