jcamer Posted June 3, 2021 (edited)

I said "drive failures" because I think my drives are actually fine; I just didn't know if anyone else has run across this. Yesterday my server emailed me saying it had array errors: 3 disks with read errors (Parity disk, Parity disk 2, and Disk 3). It disabled the Parity drive as well as Disk 3, but things kept running. The chances of three drives throwing errors all at once out of nowhere seemed low, so I doubted the drives were actually bad.

Turns out the following is not the problem. The problem just came back and now shows 11 drives with errors. I'm at a loss, and I have diagnostics if someone smarter than me can make sense of them.

To get to the point, I'm wondering if this is somehow the cause: I use the ShinySDR docker container with an SDR dongle. I unplugged the dongle from my server a day earlier while I was getting a longer antenna cable for it. The docker template had the USB device set to /dev/bus/usb/003/002 (which was correct before I unplugged it), and the container was set to start automatically. Somewhere in there I rebooted the server, and I think that is where the issues started.

I rebooted numerous times, shut down all docker containers, shut down the one VM I run, and removed all the plugins I felt I didn't need, trying to find what the issue might be. I also had to force the server off a couple of times when it became unresponsive. Yesterday afternoon the server emailed me saying I had 9 disks with read errors.

I opened the terminal and ran lsusb to see what was connected, and /dev/bus/usb/003/002 was now "Bus 003 Device 002: ID 058f:6387 Alcor Micro Corp. Flash Drive". That is my Unraid USB boot drive. I'm wondering if this could have been the cause: could the docker container, trying to access the boot drive through that path, spew out all of these read errors and disable my drives?
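For anyone hitting the same thing: the bus/device numbers under /dev/bus/usb are assigned at enumeration time and can change after a replug or reboot, so a container pinned to a fixed path like /dev/bus/usb/003/002 can silently end up pointing at whatever device landed in that slot next. A minimal sketch of resolving the current path from the vendor:product ID that lsusb reports, instead of hard-coding it (the function just parses lsusb-formatted lines; the 058f:6387 demo ID is the flash drive from the post, and the live `lsusb |` usage is shown in a comment):

```shell
#!/bin/sh
# Map a USB vendor:product ID to its /dev/bus/usb/BBB/DDD path by
# parsing lsusb-style lines ("Bus 003 Device 002: ID 058f:6387 ...").
# In real use, feed it live data:   lsusb | usb_path_for_id 0bda:2838
# (0bda:2838 is a common RTL-SDR ID; check lsusb for your dongle's.)
usb_path_for_id() {
  awk -v id="$1" '$6 == id {
    # $2 = bus number, $4 = device number with a trailing colon
    printf "/dev/bus/usb/%s/%s\n", $2, substr($4, 1, 3)
  }'
}

# demo with a captured line (the flash drive from the post):
printf 'Bus 003 Device 002: ID 058f:6387 Alcor Micro Corp. Flash Drive\n' \
  | usb_path_for_id 058f:6387
# -> /dev/bus/usb/003/002
```

A more robust fix on a real system is a udev rule that creates a stable symlink for the dongle, so the container binding survives re-enumeration.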
I did run Tools -> Diagnostics several times, but I now know that rebooting before capturing them can lose something important. Those files, along with the syslog, did show errors, but I'm hesitant to believe them, as the server has since rebuilt the parity drive and is currently 66% through rebuilding Disk 3. The syslog now shows only the errors for the disabled drives from before I removed them and added them back.

Thoughts?

Thanks,
John

Edited June 3, 2021 by jcamer
jcamer Posted June 3, 2021 (Author)

I have been searching, and this seems to be exactly what I am experiencing. In the logs I also see the exact same error across multiple drives on the exact same sector. Like that poster, I have a Supermicro chassis with 12 drive bays. I am going to try what they tried (disabling spin down) and will see how that works. I wouldn't think it's a power issue, as the chassis has dual 1000 W power supplies. I'll also try to update the firmware on my card (a Supermicro LSI3008-IT, currently running firmware 6.00.00.00-IT); the card and cables haven't been moved, so nothing should have come unseated.

My question now is: how do I get my two disabled drives back without having to stop the array, remove them, start the array, and add them back? Is there an easy way to tell Unraid to trust that they're good and re-enable them?

edit: I stopped the array, removed both drives, restarted the array, added them back, etc. Rebuilding now.

Thanks again,
John

Edited June 3, 2021 by jcamer
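Identical read errors at the identical sector across many drives point at a shared component (HBA, backplane, cabling, power) rather than the disks themselves, which is what made the linked thread so similar. A quick way to see that pattern is to tally errors by sector and device. The log line format below is an assumption, a rough approximation of kernel md read-error messages, so adjust the sed pattern to whatever your syslog actually prints:

```shell
#!/bin/sh
# Group "read error" syslog lines by sector and device, to spot the
# same sector failing on several drives at once (a controller or
# backplane smell, not a disk one). Assumed line format:
#   "... kernel: md: disk3 read error, sector=1000248"
summarize_read_errors() {
  grep 'read error' "$1" |
    sed -n 's/.*md: \(disk[0-9]*\).*sector=\([0-9]*\).*/\2 \1/p' |
    sort | uniq -c
}

# demo with a fabricated excerpt in the assumed format:
cat > /tmp/syslog.sample <<'EOF'
Jun  3 14:59:01 tower kernel: md: disk1 read error, sector=1000248
Jun  3 14:59:01 tower kernel: md: disk3 read error, sector=1000248
Jun  3 14:59:02 tower kernel: md: disk7 read error, sector=1000248
EOF
summarize_read_errors /tmp/syslog.sample
# one line per sector/device pair with its count; here the same
# sector shows up on three different disks
```

If the per-disk counts cluster on one sector or one moment in time across many disks, suspect the shared path before replacing any drive.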
JorgeB Posted June 4, 2021

If it happens again, post the diagnostics before rebooting.
jcamer Posted June 4, 2021 (Author)

16 minutes ago, JorgeB said: "If it happens again, post the diagnostics before rebooting."

Here they are, taken shortly after it happened: understonekeep-diagnostics-20210603-1507-anon.zip
JorgeB Posted June 4, 2021

First thing you should do is update the LSI firmware; if you still have issues after that, see if disabling spin down helps.
jcamer Posted June 5, 2021 (Author)

18 hours ago, JorgeB said: "First thing you should do is update the LSI firmware; if you still have issues after that, see if disabling spin down helps."

I updated the firmware to the latest version on the Supermicro site. I'm nervous about letting the drives spin down again just yet, so I might give it a day or so; there have been no issues since telling them not to spin down. I'll see how the newer firmware does, then re-enable spin down.

Thanks again, appreciate it.

(attached screenshots: updated and original firmware versions)

Edited June 5, 2021 by jcamer
jcamer Posted June 8, 2021 (Author)

Well, it's been running for a couple of days with no issues after updating the firmware and keeping spin down turned off. Now I'll enable disk spin down again and see how it goes.
jcamer Posted June 11, 2021 (Author)

With the updated firmware, it's been fine for a few days now with spin down enabled. Thanks!