Artie17 Posted May 1, 2020 (edited) This morning after opening Unraid I saw a warning that 2 disks have read errors and 1 disk is in an error state and disabled. They're both new WD RED 6TB drives that I've been using for the past few months. One of them is a Parity disk and the other is used for storage.
Under Array Devices, the Parity device has a green indicator next to it like nothing is wrong, but Disk 1 (the other WD drive) is disabled. Both drives now appear under 'Unassigned Devices'. I'm confused why both of them got unmounted automatically (never happened before). For some reason the data drive that's offline has the orange 'Mount' option, while the Parity drive, which appears to be active (green), is grayed out.
I'm unable to get the SMART report for either drive. I tried spinning up the disks to get it, but it won't let me. I was able to get the Disk Log for both drives.
Disk1
Parity Disk
I also noticed on my Dashboard that the system logs are at 100%. I've never experienced this with Unraid before. What are the chances both drives error out at the same time and unmount automatically? Any ideas why this could be happening?
Edit: Logs attached
tower-smart-20200501-1316.zip
Edited May 4, 2020 by Artie17: Issue Resolved
trurl Posted May 1, 2020 On mobile now so can't look at Diagnostics. Your data is almost certainly OK unless you do something wrong. Don't do anything without further advice.
trurl Posted May 2, 2020 Sorry, I thought you had attached the complete diagnostics instead of only the SMART for a single drive. Go to Tools - Diagnostics and attach the complete diagnostics zip file to your NEXT post.
trurl Posted May 2, 2020
3 hours ago, Artie17 said: "I also noticed on my Dashboard that the system logs are at 100%."
And that SMART report you attached has nothing in it, which probably means the disk has disconnected. Since you need to check connections anyway, and you may not even be able to get diagnostics in your current state, just go ahead and shut down, check all connections, power and SATA, both ends including any power splitters. Then boot up and get those diagnostics for us.
Artie17 (Author) Posted May 2, 2020 Oops. I attached the wrong logs. tower-diagnostics-20200501-1308.zip
trurl Posted May 2, 2020 OK, so you were able to get diagnostics without rebooting. That is good and does provide more information than we would have gotten after reboot. You should still do this:
12 minutes ago, trurl said: "shutdown, check all connections, power and SATA, both ends including any power splitters. Then boot up and get those diagnostics for us."
Artie17 (Author) Posted May 2, 2020 Doing that now. I can't remember if I grabbed those logs before or after rebooting.
trurl Posted May 2, 2020
9 minutes ago, Artie17 said: "Doing that now. I can't remember if I grabbed those logs before or after rebooting."
Definitely before, since syslog went back about a week. But then a parity check started at 3am this morning (scheduled, I assume) and disk1 and parity both started giving read errors. Was your previous parity check completely clean?
Artie17 (Author) Posted May 2, 2020 Ok good. This is weird. It won't shut down the server. I shut it down and noticed that my motherboard power light is still on and I could feel the drives spinning. I refreshed and noticed it's still running. I even tried stopping the array and shutting it down, but it won't do it. I'm thinking about just holding down the power button on my PC to turn it off. What do you think?
trurl Posted May 2, 2020 If it still hasn't shut down I guess the power button is the only choice.
Artie17 (Author) Posted May 2, 2020 Quick update. Just restarted after checking everything.
First thing I see after signing in
Both drives are mounted
Parity drive looks fine now
Disk 1 is still disabled
On the Dashboard, the log file is down to 1%
Checked 'Fix Common Problems' and the only thing that shows up is Disk1 being disabled.
Did a quick test and it passed without errors. About to run an extended SMART self test.
I was able to grab the logs for Disk1
tower-smart-20200501-1846.zip
trurl Posted May 2, 2020 Post new diagnostics. Diagnostics includes syslog since reboot, SMART for all attached disks, and a lot of other information that gives a more complete understanding of the total situation. We always prefer the complete diagnostics zip file instead of anything else unless we ask for it.
Artie17 (Author) Posted May 2, 2020 Got it. Attached. tower-diagnostics-20200501-1904.zip
trurl Posted May 2, 2020
8 minutes ago, Artie17 said: "First thing I see after signing in"
This is expected since the error count resets on reboot.
9 minutes ago, Artie17 said: "Both drives are mounted"
I assume you mean they are shown in the array and not in Unassigned now. Mounted is a different concept: it means the filesystem on the disk was able to mount and the files are accessible. Parity cannot actually mount since it has no filesystem. If disk1 is actually mounted that is a good sign, since it means there is no corruption on the emulated disk. The physical disk1 isn't actually used while it is disabled; instead the disk is emulated by calculating its data from parity plus all remaining disks.
15 minutes ago, Artie17 said: "Disk 1 is still disabled"
And it will be disabled until it is rebuilt. When a write to a disk fails, Unraid updates parity anyway so that the failed write and any subsequent writes can be recovered, but now that disk is out-of-sync and has to be rebuilt from parity.
17 minutes ago, Artie17 said: "On the Dashboard, the log file is down to 1%"
Syslog is in RAM, like the rest of the OS, so no log from before the reboot is there anymore.
18 minutes ago, Artie17 said: "Checked 'Fix Common Problems' and the only that shows up is Disk1 being disabled."
Good.
19 minutes ago, Artie17 said: "Did a quick test and it passed without errors. About to run an extended SMART self test."
It probably makes more sense to just do the rebuild instead of the extended test. The rebuild is needed anyway and will be a good test, while the extended test will take a long time and the rebuild still has to be done afterwards. Also, until the rebuild is done, you have no protection since you already have a disabled disk.
Let me take a look at those diagnostics and we can discuss how to rebuild.
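Editor's sketch: the emulation described in the post above can be illustrated with plain XOR parity. This is only a toy model under stated assumptions: a single parity disk, three hypothetical data disks of four bytes each, and a helper name (`xor_blocks`) invented for the example. It is not Unraid's actual code.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR several equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Hypothetical 4-byte "disks" (made-up data, for illustration only).
disk1 = bytes([0x12, 0x34, 0x56, 0x78])
disk2 = bytes([0xAB, 0xCD, 0xEF, 0x01])
disk3 = bytes([0x00, 0xFF, 0x10, 0x20])

# Parity is the XOR of all data disks.
parity = xor_blocks([disk1, disk2, disk3])

# With disk1 disabled, its contents are computed from parity plus the others.
emulated_disk1 = xor_blocks([parity, disk2, disk3])

# A write to the disabled disk only updates parity; the physical disk1 keeps
# its old contents, which is why it is out-of-sync and must be rebuilt.
new_disk1 = bytes([0xDE, 0xAD, 0xBE, 0xEF])
parity_after_write = xor_blocks([new_disk1, disk2, disk3])
emulated_after_write = xor_blocks([parity_after_write, disk2, disk3])

print(emulated_disk1 == disk1)            # emulation matches the original data
print(emulated_after_write == new_disk1)  # emulation reflects the later write
print(emulated_after_write == disk1)      # the stale physical disk no longer matches
```

A rebuild is the same calculation run over every block of the array, with the result written back to the disk being rebuilt.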
Artie17 (Author) Posted May 2, 2020 Thanks for all the info. How do I initiate a rebuild, and do you recommend stopping any apps (Plex, Sonarr) from running during the rebuild?
trurl Posted May 2, 2020 Those diagnostics look OK. I noticed you have 40G allocated for docker image. 20G should be more than enough. Have you had problems filling it? We can discuss that later.
6 minutes ago, Artie17 said: "do you recommend disabling any apps (Plex, Sonarr,) from running during the rebuild?"
Rebuilding requires reading all the disks simultaneously to calculate the data for the rebuild, then writing that data to the rebuilding disk. It is OK to keep using the disks for other things and that won't cause any data loss, but if other things are competing for access to the disks, then the rebuild will be slower, and those other things will be slower too. Rebuilding 6TB will take many hours though, similar to a parity check. I sometimes do a little with my system during parity checks but avoid large reads and writes.
The safest approach is to rebuild to a new disk. This allows you to keep the original disk as it was in case there are problems with the rebuild. But many people rebuild to the same disk (I have), and since there don't seem to be any problems with any of the disks or the filesystems, it is probably OK to just rebuild to the same disk if you don't have a spare.
Do you want to rebuild to the same disk or a new disk?
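Editor's sketch: "many hours" can be given a rough number with simple arithmetic. The 150 MB/s average throughput below is purely an assumption for illustration (real speed depends on the drives, the controller, and any competing disk activity), and `estimate_rebuild_hours` is a hypothetical helper, not an Unraid tool.

```python
def estimate_rebuild_hours(capacity_tb, avg_mb_per_s):
    """Rough rebuild time in hours, using decimal units (1 TB = 1e12 bytes)."""
    if capacity_tb <= 0 or avg_mb_per_s <= 0:
        raise ValueError("capacity and throughput must be positive")
    total_bytes = capacity_tb * 1e12
    bytes_per_second = avg_mb_per_s * 1e6
    return total_bytes / bytes_per_second / 3600


# 6 TB disk at an ASSUMED average of 150 MB/s.
hours = estimate_rebuild_hours(6, 150)
print(f"{hours:.1f} hours")
```

Halving the assumed throughput doubles the estimate, which is why heavy reads and writes from other applications during the rebuild stretch it out.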
trurl Posted May 2, 2020
Stop the array.
Unassign the disabled disk.
Start the array with the disabled disk unassigned.
Stop the array.
Reassign the disk.
Start the array to begin rebuild.
Artie17 (Author) Posted May 2, 2020 Thanks. To answer your question: I don't believe I need that much room for the docker image. I don't remember why I set it to use 40GB. Where can I adjust this down to 20GB? I'm assuming it's best to wait until the rebuild is over to make these changes?
trurl Posted May 2, 2020
2 minutes ago, Artie17 said: "To answer your question. I don't believe I need that much room for the docker image. I don't remember why I set it to use 40GBs. Where I can adjust this down to 20GB? I'm assuming it's best I wait until the rebuild is over to make these changes?"
To change that, you have to go to Settings - Docker and disable dockers; then it will let you delete the docker image. Then you can change the size, and enabling dockers again will recreate it. After that you can reinstall all your dockers exactly as they were using the Previous Apps feature on the Apps page.
Or you can leave it if you don't mind wasting that space. As long as usage isn't growing (as shown on the Dashboard) it isn't really a problem.
The reason I even check on that in the diagnostics is because people often have an application misconfigured so it is writing to a path that isn't mapped, and so it writes data into the docker image and fills it up, and then their dockers stop working. They see their docker image is getting full so they think they can fix the problem by increasing its size, but all that does is make it take longer to fill. They need to fix the paths set up in the application instead.
Wait until the rebuild is over to make any changes.
Artie17 (Author) Posted May 2, 2020 Got it. Thanks again for all the help man. You're like an encyclopedia. I'll report back once the rebuild is over. Enjoy your Friday night!
Artie17 (Author) Posted May 2, 2020 Good news. The rebuild just finished with zero errors and everything looks good. I'm trying to think what caused all of this to happen in the first place.
I'm going to reduce my docker image size. You mentioned that if I reinstall my apps from the Previous Apps feature, it would install them exactly as they were. I just wanted to confirm that it will retain the configurations and won't require me to remap things.
JorgeB Posted May 2, 2020 There was a problem with both disks at the same time:
May 1 04:28:34 Tower kernel: sd 1:0:0:0: device_block, handle(0x0009)
May 1 04:28:34 Tower kernel: sd 1:0:2:0: device_block, handle(0x000b)
May 1 04:28:36 Tower kernel: sd 1:0:0:0: device_unblock and setting to running, handle(0x0009)
May 1 04:28:36 Tower kernel: sd 1:0:2:0: device_unblock and setting to running, handle(0x000b)
1:0:0:0 is disk1 and 1:0:2:0 is parity. This suggests a cable or power issue: they likely share a mini SAS cable, or it could be a power connector, such as a splitter, if it is shared by both.
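Editor's sketch: the pattern JorgeB points out (two devices blocked in the same second) can be picked out of a syslog mechanically. The regular expression below is written only against the four lines quoted in the post above; other kernel message formats are not handled, and `simultaneous_blocks` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   May 1 04:28:34 Tower kernel: sd 1:0:0:0: device_block, handle(0x0009)
# The trailing comma keeps "device_unblock" lines from matching.
BLOCK_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+"
    r"sd (?P<addr>\d+:\d+:\d+:\d+): device_block,"
)


def simultaneous_blocks(lines):
    """Return {timestamp: [SCSI addresses]} where 2+ devices blocked together."""
    by_stamp = defaultdict(list)
    for line in lines:
        match = BLOCK_RE.match(line)
        if match:
            by_stamp[match.group("stamp")].append(match.group("addr"))
    return {stamp: addrs for stamp, addrs in by_stamp.items() if len(addrs) > 1}


# The four syslog lines quoted in the post above.
sample = [
    "May 1 04:28:34 Tower kernel: sd 1:0:0:0: device_block, handle(0x0009)",
    "May 1 04:28:34 Tower kernel: sd 1:0:2:0: device_block, handle(0x000b)",
    "May 1 04:28:36 Tower kernel: sd 1:0:0:0: device_unblock and setting to running, handle(0x0009)",
    "May 1 04:28:36 Tower kernel: sd 1:0:2:0: device_unblock and setting to running, handle(0x000b)",
]

for stamp, addrs in simultaneous_blocks(sample).items():
    print(stamp, addrs)
```

Several devices dropping out in the same second points at something they have in common (a cable, a controller port, a power splitter) rather than at the individual drives.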
John_M Posted May 3, 2020 Which explains why disks that started out as sde and sdc became reallocated as sdj and sdk under Unassigned Devices.