July 21, 2020
Hello,

This morning my Unraid Main page is showing disk5 in an error state but also showing the same disk as unassigned. I've never seen this before, but I assume that the disk is bad and needs to be replaced. I had intended to retire this disk anyway (it is an old 2 TB disk), so this is a good time.

The question is what to do. In the past, when I had a bad disk, I simply replaced it with a new disk of the same size and let Unraid rebuild the array. I have a new disk that I can put into the array, but it is 8 TB, not 2 TB. Should I remove the old 2 TB disk and replace it with the new 8 TB one? Will Unraid rebuild the array with that config? If not, can I just copy the files (e.g. using rsync) from disk5 to another disk that has 2 TB of free space and then remove disk5?

Attached are a screenshot of the Main page and the diagnostics logs. Thanks.

tower-diagnostics-20200721-0622.zip
July 21, 2020 Community Expert
The disk looks fine. This is most likely an issue with the SASLP, since those controllers are known to drop drives for no reason. It could also be a cable/connection issue.
July 21, 2020 Author
OK, thanks. I re-seated the connectors and rebooted, but the disk still shows as disabled. Should I run a parity check to rebuild it, or something else?
July 23, 2020 Author
OK. I did the re-enable rebuild procedure and it appears to be working fine now. Thanks for the help.
July 23, 2020 Author
Looks like I spoke too soon. (I removed the SOLVED tag. I hope that is OK.)

This morning, the array is again reporting that disk5 is in an error state. I decided to simply reboot and see if the HBA would work long enough to do a SMART test on the drive. If I was sure the drive was OK, I would simply copy the (reconstructed) data to another disk and remove disk5.

So I rebooted, and now I get the weirdness where the Main page shows disk5 with the red X and also shows it in the unassigned disks area. BUT disk5 is NOT showing in the drop-down, so I cannot simply reassign it to the disk5 slot. To further complicate things, there is a green dialog box showing "Notice{} - array turned good"! This is clearly not true.

Anyway, what to do now?
1) Start the array and hope that it has not forgotten that disk5 existed. I could still copy the data (reconstructed from the other disks and parity) to free space on another disk.
2) Start the array in Maintenance Mode and run a SMART test on the disk. If it's good, I could always mount it and copy from there.
3) Other?

Attached are two diagnostics, one taken after disk5 was put into the error state (write errors to the disk) and the other taken after I rebooted to reset the HBA:
tower-diagnostics-20200723-1104.zip - Unraid put disk5 into error.
tower-diagnostics-20200723-1620.zip - after reboot but without starting the array. This is the current state of the server.

Also attached is a screenshot of the Main web page.

Thanks again for taking the time to help.
July 24, 2020
2 hours ago, CaptainTivo said: "Anyway, what to do now?"
Replace the HBA with an LSI-based card.
On 7/21/2020 at 11:07 AM, johnnie.black said: "most likely an issue with the SASLP, since they are known to drop drives without a reason"
July 24, 2020 Community Expert
Before rebuilding, you can swap that disk with one that is connected to the onboard controller and see whether the problem recurs. Either way, you should replace that controller: these cards have not been recommended for some time because of various known issues.
July 24, 2020 Author
4 hours ago, johnnie.black said: "You can swap that disk with one using the onboard controller before rebuilding to see if it doesn't happen again, but either way you should replace that controller, they are not recommended for some time due to various known issues."
I think you are right. I have been using the SASLP since I built the machine 9 years ago (version 4!). I had been running 6.7.2 since it came out with no problems, but earlier this week I updated to 6.8.3 and this problem started. It could be a coincidence, but it is suggestive.

As it happens, I bought an LSI SAS 9207-8i / LSI00301 a few months ago but did not install it. Question: can I install it now, with the disk in an error state? I think I can simply swap out the card without changing anything in the config, right? Alternatively, I can restore a backup of the 6.7.2 OS, see if I can get the server back to a stable state, and then install the new HBA. What do you think?
July 24, 2020 Community Expert
10 minutes ago, CaptainTivo said: "Question: can I install it now, with the disk in error state?"
Yep, you can rebuild it after.
July 25, 2020 Author
OK. I replaced the AOC-SASLP with an LSI SAS 9207-8i card and rebuilt the drive. All seems well. Now to proceed with the array shrink.
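For the copy step raised in the first post (moving disk5's files to another disk with enough free space before removing disk5), here is a minimal Python sketch of a pre-flight check. It is not a procedure anyone in this thread ran. The helper names, the 5% safety margin, and the target path `/mnt/disk3` are assumptions made for illustration. The script only builds an rsync command line and does not execute it, and it sums file sizes, which only approximates real on-disk usage.

```python
import os
import shutil


def used_bytes(root):
    """Sum the sizes of all regular files under root (symlinks are skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def fits_on_target(source, target, margin=0.05):
    """True if the files under source fit in target's free space.

    margin is a hypothetical safety reserve: 0.05 keeps 5% of the
    target's free space unused.
    """
    needed = used_bytes(source)
    free = shutil.disk_usage(target).free
    return needed <= free * (1 - margin)


def rsync_command(source, target, dry_run=True):
    """Build (but do not run) an rsync command copying source's contents into target."""
    cmd = ["rsync", "-av"]
    if dry_run:
        # --dry-run makes rsync list what it would copy without writing anything.
        cmd.append("--dry-run")
    # A trailing slash on the source means "copy the contents of this directory".
    cmd += [source.rstrip("/") + "/", target.rstrip("/") + "/"]
    return cmd


if __name__ == "__main__":
    # Illustrative paths only: disk3 is an assumed target, not one named in the thread.
    src, dst = "/mnt/disk5", "/mnt/disk3"
    if os.path.isdir(src) and os.path.isdir(dst):
        print("fits:", fits_on_target(src, dst))
    print(" ".join(rsync_command(src, dst)))
```

Keeping `dry_run=True` as the default means the generated command can be reviewed before any data is moved; pass `dry_run=False` only once the listing looks right.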