January 1, 2021 I recently moved to a new house, and the server has not been on for a couple of months. Today it booted and I noticed 1 drive disabled. I figured it was a loose SATA cable from transit. I installed 6.9.0-rc2, updated all my plugins and restarted. I swapped some SATA cables around, and this time 3 drives were detected with read errors and I was unable to write to them. I ended up using New Config and trusting all drive assignments to recover. Currently the parity rebuild is halfway done and will take another day. My 15GB log is 100% full due to drive errors from 2 drives, which are currently in standby. What is the path to resolution for those drives? I read somewhere that standby mode means that UNRAID was able to rewrite the offending sectors and thus did not disable the drives, so will it re-enable them after the parity rebuild is complete so they can be overwritten? medialan-diagnostics-20210101-0918.zip
January 1, 2021 I'm on mobile now so I can't look at the diagnostics. There is no point in continuing with the parity rebuild. How could it be good when those drives aren't even being read?
January 1, 2021 Author I see your point, but it's only got a couple of hours left. I'm going to let it finish because without those 2 drives parity will still be valid, hopefully preventing further data loss. I was sort of expecting the parity rebuild to fail by now, giving me more info on what might have caused this...
January 2, 2021
5 hours ago, GreenPenguin said: "im going to let it finish because without those 2 drives parity will still be valid"
No, parity won't be valid, and I don't know why you would think it would be. Multiple disk errors may be a power or controller issue. If you were mucking about in the case, it might just be bad connections on multiple disks.
January 2, 2021 Author I'm going to try a direct SATA connection, bypassing the HBA. I strongly suspect a bad connection on multiple drives due to the nature of my ....installation. I do not have a clear understanding of the criteria UNRAID uses to decide when to disable a disk or force standby (disks will not spin up). Is there documentation for this feature somewhere?
January 2, 2021 Author Disk 10 connected with a warning showing unmountable. I started the array in maintenance mode to run an XFS filesystem repair, basically following this thread. I'm waiting for it to find a secondary superblock now.
January 2, 2021 Unraid disables a disk whenever a write to it fails, because it is out of sync with the array at that point and needs to be rebuilt. You can't have more disabled disks than parity disks. Your screenshot is showing both parity disks as invalid because they are being rebuilt, and it is showing disks 5 and 10 as not spun up because it can't communicate with them.
January 4, 2021 Author I ended up having to format the drives for them to be recognized. I did purchase a replacement SFF-8644 breakout cable, but it is working again now without the replacement. Parity was completed again with all drives. Thanks for the quick replies. It is frustrating that this happens due to a loose connection and cannot be resolved with a simple reconnection and system reboot.
January 4, 2021
44 minutes ago, GreenPenguin said: "It is frustrating that this happens due to a loose connection and cannot be resolved with a simple reconnection and system reboot."
On 1/1/2021 at 9:31 PM, trurl said: "Unraid disables a disk whenever a write to it fails, because it is out-of-sync with the array at that point and needs to be rebuilt."

In order to get the array back in sync, either the data disk has to be rebuilt or parity has to be rebuilt. Since it is the data disk that is out of sync, it usually makes more sense to rebuild the data disk.

The whole point of parity is to allow the system to continue to function when a disk isn't working. If a write to a disk fails, the disk is disabled and emulated by parity. It won't be accessed again until it is rebuilt, but its data can still be read from the parity calculation. That initial failed write and all subsequent writes to the emulated disk update parity, so data can even continue to be written, and the rebuild will recover all those writes from parity. So, as you can see, it isn't as simple as just fixing the connection and rebooting.

Some people don't even realize a disk is disabled, because it all appears to be working due to the emulation, and they don't look at the webUI to see there is a problem until much later. It is very important to set up Notifications to alert you immediately by email or other agent as soon as a problem is detected. Don't let one problem become multiple problems that parity can't recover from, resulting in data loss.
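To make the emulation point concrete, here is a minimal Python sketch of why the physical disk goes stale once it is disabled. It assumes simple single-disk XOR parity and uses made-up 4-byte "disks". The names `xor_blocks`, `emulated_read` and `emulated_write` are hypothetical and only for illustration; this is not Unraid's actual code.

```python
from functools import reduce
from operator import xor

def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks (illustrative single parity)."""
    return bytes(reduce(xor, column) for column in zip(*blocks))

# Three hypothetical data disks, each reduced to one 4-byte block.
disk1, disk2, disk3 = b"AAAA", b"BBBB", b"CCCC"
parity = xor_blocks([disk1, disk2, disk3])

def emulated_read(healthy, parity):
    """Contents of the disabled disk, calculated from the other disks plus parity."""
    return xor_blocks(healthy + [parity])

def emulated_write(healthy, new_data):
    """New parity value that makes the emulated disk read back as new_data."""
    return xor_blocks(healthy + [new_data])

# disk2 is disabled after a failed write; only disk1 and disk3 are still used.
healthy = [disk1, disk3]
assert emulated_read(healthy, parity) == b"BBBB"

# A later write to the emulated disk2 only changes parity.
parity = emulated_write(healthy, b"ZZZZ")

stale_physical_disk2 = disk2                 # still b"BBBB", now out of sync
emulated = emulated_read(healthy, parity)    # b"ZZZZ"

# A rebuild copies the emulated contents back onto the physical disk.
rebuilt_disk2 = emulated
```

After the write, the physical disk still holds the old data while the array as a whole says something different, which is why reconnecting and rebooting alone cannot bring it back in sync.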
January 4, 2021 Parity doesn't contain any of your data. Parity is a common concept in computing and communications. It is basically the same idea wherever it is used. Parity is just an extra bit that allows a missing bit to be calculated from all the other bits. In the case of Unraid, the parity disk allows the data for a missing disk to be calculated from all the other disks. Parity isn't very complicated, and understanding it can help make sense of many things about how Unraid works with the disks, and how you work with Unraid. Many of us who offer advice on the forum can do so because we understand parity. Here is the Unraid wiki on parity: https://wiki.unraid.net/UnRAID_6/Overview#Parity-Protected_Array
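As a small illustration of "an extra bit that allows a missing bit to be calculated", here is a Python sketch using plain XOR parity over three made-up data blocks. The helper name `xor_blocks` and the sample bytes are assumptions for the example, and it only covers the single-parity case, not how a second parity disk is calculated.

```python
from functools import reduce
from operator import xor

def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(xor, column) for column in zip(*blocks))

# Three hypothetical data disks, each holding one 4-byte block.
data_disks = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]

# The parity block is calculated from all the data disks.
parity = xor_blocks(data_disks)

# Pretend disk index 1 is missing and recalculate it from everything else.
missing_index = 1
survivors = [d for i, d in enumerate(data_disks) if i != missing_index]
rebuilt = xor_blocks(survivors + [parity])

print(rebuilt == data_disks[missing_index])  # True
```

Note that the calculation needs every other disk to be read correctly. If one of the surviving blocks is wrong or unreadable, the rebuilt block comes out wrong too.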
January 10, 2021 Author I have read the article about parity, thank you trurl. For 3 days this system was operating flawlessly: SAB, Radarr, Plex... Now I have errors on both parity drives and disk 1. I have not messed with the physical connections during this time. I suspect my HBA, given such random read/write errors.
This topic is now archived and is closed to further replies.