Stubbs Posted July 20, 2021 Posted July 20, 2021 (edited) I ran into a problem. I was encoding videos in the Handbrake container when I ran into that highly obnoxious issue where the file permissions for certain folders reset so that I couldn't move them, so I ran Safe New Permissions. This caused my Unraid server to freeze with all my CPU cores locked at 100%, and it wasn't ending, so I had to shut down my server manually. When I restarted, I started getting these nasty errors. My parity drive wasn't working. I tried removing it from the array and re-adding it. It started a read check, but that just returned a mountain of errors, and now my Disk 4 isn't working. I can't even copy-paste the log because it keeps filling up endlessly with errors. I'm scared all my data is starting to disappear now. Is there nothing that can be done? Edited July 20, 2021 by Stubbs Quote
JorgeB Posted July 20, 2021 Posted July 20, 2021 3 minutes ago, Stubbs said: I can't even copy-paste the log That's not what we need, please post the diagnostics: Tools -> Diagnostics Quote
Stubbs Posted July 20, 2021 Author Posted July 20, 2021 1 minute ago, JorgeB said: That's not what we need, please post the diagnostics: Tools -> Diagnostics https://files.catbox.moe/nsotzb.zip Alright, here it is. I also can't even boot my array anymore. My Parity HDD and Disk 4 are now showing up as unassigned devices that need to be formatted, and trying to format them results in "fail". I'm feeling sick... Quote
JorgeB Posted July 20, 2021 Posted July 20, 2021 2 minutes ago, Stubbs said: and trying to format them results in "fail". Don't try to format any disks. Quote
JorgeB Posted July 20, 2021 Posted July 20, 2021 3 minutes ago, Stubbs said: https://files.catbox.moe/nsotzb.zip Next time please attach any files to the forum, you don't need to use external sites. Quote
Stubbs Posted July 20, 2021 Author Posted July 20, 2021 Just now, JorgeB said: Next time please attach any files to the forum, you don't need to use external sites. My mistake. I was just looking at the top bar and neglected to look at the bottom of the posting area. Being stressed doesn't help... tower-diagnostics-20210721-0041.zip Quote
JorgeB Posted July 20, 2021 Posted July 20, 2021 Replace/reconnect cables (both power and SATA) on these disks: WDC_WD30EFZX-68AWUN0_WD-WX22DB0KE762 WDC_WD30EFRX-68EUZN0_WD-WCC4N4LHZESE And post new diags. Quote
Stubbs Posted July 20, 2021 Author Posted July 20, 2021 12 minutes ago, JorgeB said: Replace/reconnect cables (both power and SATA) on these disks: WDC_WD30EFZX-68AWUN0_WD-WX22DB0KE762 WDC_WD30EFRX-68EUZN0_WD-WCC4N4LHZESE And post new diags. Ok, my Disk 4 is back online and working, but my Parity Drive is still broken. Both of these drives are in my 3x HDD hot-swap bay, which is powered by two power cables. They're extremely difficult to replace or re-insert in my setup, so I couldn't do that. The middle HDD (disk 3) has not failed yet, but the Parity Drive in there still has problems. I replaced both SATA cables for the top and bottom bay (parity and disk 4). tower-diagnostics-20210721-0105.zip Quote
JorgeB Posted July 20, 2021 Posted July 20, 2021 No ATA errors so far, try re-syncing parity: https://wiki.unraid.net/Manual/Storage_Management#Rebuilding_a_drive_onto_itself Quote
Stubbs Posted July 20, 2021 Author Posted July 20, 2021 (edited) 3 minutes ago, JorgeB said: No ATA errors so far, try re-syncing parity: https://wiki.unraid.net/Manual/Storage_Management#Rebuilding_a_drive_onto_itself This is what I did last time, which resulted in a 90% used up error log, and basically this: Trying again right now though. If it happens again, I'll upload the diagnostic. Edited July 20, 2021 by Stubbs Quote
Stubbs Posted July 20, 2021 Author Posted July 20, 2021 10 minutes ago, JorgeB said: No ATA errors so far, try re-syncing parity: https://wiki.unraid.net/Manual/Storage_Management#Rebuilding_a_drive_onto_itself Well, it appears to have started rebuilding without any issues so far. The problem is, I think I replaced the previous SATA cables with slower 3 Gb/s ones. I'm thinking that maybe I should do a quick swap back to the old ones, assuming they're not the problem. I don't want to spend a full 24 hours stressing my array. Quote
JorgeB Posted July 20, 2021 Posted July 20, 2021 11 minutes ago, Stubbs said: I think I replaced the previous SATA cables with slower 3 Gb/s ones. That can't be it, SATA1 can still do close to 150MB/s, but you can post new diags to see if there's something visible. Quote
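The 150 MB/s figure above follows from the SATA line rate and its 8b/10b encoding (10 bits on the wire per 8 bits of data). A minimal sketch of that arithmetic; the function name is hypothetical and the result is a theoretical ceiling, not a measured speed:

```python
def sata_payload_mb_per_s(line_rate_gbps: float) -> float:
    """Theoretical payload rate in MB/s for a SATA link with 8b/10b encoding."""
    bits_per_second = line_rate_gbps * 1e9
    payload_bits_per_second = bits_per_second * 8 / 10  # 8 data bits per 10 line bits
    return payload_bits_per_second / 8 / 1e6            # bits -> bytes -> MB


# The three SATA generations by line rate.
for name, rate in [("SATA 1.5 Gb/s", 1.5), ("SATA 3 Gb/s", 3.0), ("SATA 6 Gb/s", 6.0)]:
    print(f"{name}: about {sata_payload_mb_per_s(rate):.0f} MB/s")
```

Even the slowest link is far above the 6 to 50 MB/s rebuild speeds reported in this thread, so the link generation alone would not explain them.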
Stubbs Posted July 20, 2021 Author Posted July 20, 2021 (edited) 8 minutes ago, JorgeB said: That can't be it, SATA1 can still do close to 150MB/s, but you can post new diags to see if there's something visible. I just reconnected the previous (fast) cables, booted up, parity rebuild started, and sure enough, those old, nasty errors started appearing again, along with a lousy 6 MB/s rebuild speed (in contrast to the 50 MB/s speed I had with the "slower" cables). Here is the log. My array won't even turn off now, it's stuck on "Array Stopping•Retry unmounting disk share(s)...". I'm worried I will have to restart it again and pray no data will be lost. tower-diagnostics-20210721-0142.zip Edited July 20, 2021 by Stubbs Quote
Stubbs Posted July 20, 2021 Author Posted July 20, 2021 Help tower-diagnostics-20210721-0215.zip Quote
Stubbs Posted July 20, 2021 Author Posted July 20, 2021 I replaced the cables again and started rebuilding the parity drive. The speed went down to 2 MB/s at the start before going back up to 25 MB/s. I'm worried I'll be an old man before it fully rebuilds. I don't feel like touching it again because this has been an absolute nightmare for me all day.
Jul 21 02:40:19 Tower kernel: IPv6: ADDRCONF(NETDEV_CHANGE): veth7db6519: link becomes ready
Jul 21 02:40:19 Tower kernel: br-236e25b7e44c: port 3(veth7db6519) entered blocking state
Jul 21 02:40:19 Tower kernel: br-236e25b7e44c: port 3(veth7db6519) entered forwarding state
Jul 21 02:40:21 Tower avahi-daemon[4067]: Joining mDNS multicast group on interface veth7db6519.IPv6 with address fe80::ec19:f4ff:fe80:239c.
Jul 21 02:40:21 Tower avahi-daemon[4067]: New relevant interface veth7db6519.IPv6 for mDNS.
Jul 21 02:40:21 Tower avahi-daemon[4067]: Registering new address record for fe80::ec19:f4ff:fe80:239c on veth7db6519.*.
Jul 21 02:40:39 Tower kernel: ata6.00: exception Emask 0x0 SAct 0x703000 SErr 0x0 action 0x0
Jul 21 02:40:39 Tower kernel: ata6.00: irq_stat 0x40000008
Jul 21 02:40:39 Tower kernel: ata6.00: failed command: WRITE FPDMA QUEUED
Jul 21 02:40:39 Tower kernel: ata6.00: cmd 61/40:a0:a8:a4:7f/05:00:00:00:00/40 tag 20 ncq dma 688128 out
Jul 21 02:40:39 Tower kernel: res 41/10:00:a8:a4:7f/00:00:00:00:00/40 Emask 0x481 (invalid argument) <F>
Jul 21 02:40:39 Tower kernel: ata6.00: status: { DRDY ERR }
Jul 21 02:40:39 Tower kernel: ata6.00: error: { IDNF }
Jul 21 02:40:39 Tower kernel: ata6.00: configured for UDMA/133
Jul 21 02:40:39 Tower kernel: ata6: EH complete
Jul 21 02:40:46 Tower kernel: ata6.00: exception Emask 0x0 SAct 0x7000000 SErr 0x0 action 0x0
Jul 21 02:40:46 Tower kernel: ata6.00: irq_stat 0x40000008
Jul 21 02:40:46 Tower kernel: ata6.00: failed command: WRITE FPDMA QUEUED
Jul 21 02:40:46 Tower kernel: ata6.00: cmd 61/40:c0:a8:a4:7f/05:00:00:00:00/40 tag 24 ncq dma 688128 out
Jul 21 02:40:46 Tower kernel: res 41/10:00:a8:a4:7f/00:00:00:00:00/40 Emask 0x481 (invalid argument) <F>
Jul 21 02:40:46 Tower kernel: ata6.00: status: { DRDY ERR }
Jul 21 02:40:46 Tower kernel: ata6.00: error: { IDNF }
tower-diagnostics-20210721-0240.zip Quote
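When the log fills up faster than it can be read, it can help to count the ATA failures per port instead of scrolling. The following is a rough sketch, assuming syslog lines shaped like the excerpt in the post above; the function name and regular expression are illustrative and not part of any Unraid tool:

```python
import re
from collections import Counter

# Matches kernel lines such as
#   "... kernel: ata6.00: failed command: WRITE FPDMA QUEUED"
#   "... kernel: ata6.00: error: { IDNF }"
ATA_LINE = re.compile(r"kernel: (ata\d+(?:\.\d+)?): (failed command|error): (.+)$")


def summarize_ata_errors(lines):
    """Count (port, kind, detail) occurrences among ATA failure lines."""
    counts = Counter()
    for line in lines:
        match = ATA_LINE.search(line.strip())
        if match:
            port, kind, detail = match.groups()
            counts[(port, kind, detail.strip())] += 1
    return counts


# Sample lines copied from the log excerpt posted in this thread.
sample = [
    "Jul 21 02:40:39 Tower kernel: ata6.00: failed command: WRITE FPDMA QUEUED",
    "Jul 21 02:40:39 Tower kernel: ata6.00: error: { IDNF }",
    "Jul 21 02:40:39 Tower kernel: ata6: EH complete",
    "Jul 21 02:40:46 Tower kernel: ata6.00: failed command: WRITE FPDMA QUEUED",
    "Jul 21 02:40:46 Tower kernel: ata6.00: error: { IDNF }",
]

for (port, kind, detail), n in sorted(summarize_ata_errors(sample).items()):
    print(f"{port} {kind}: {detail} x{n}")
```

If every failure lands on the same port (here `ata6.00`), that points at one drive's cable, bay slot, or power feed rather than at the array as a whole.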
JorgeB Posted July 20, 2021 Posted July 20, 2021 If you're still getting those ATA errors there's still a problem. As mentioned, it's likely cable/power related; if different cables don't help, try another PSU. Quote
Stubbs Posted July 20, 2021 Author Posted July 20, 2021 16 minutes ago, JorgeB said: If you're still getting those ATA errors there's still a problem. As mentioned, it's likely cable/power related; if different cables don't help, try another PSU. They are not showing up in the log anymore, but the rebuild speed varies from 29 MB/s to 3 MB/s. I've been changing the cables around multiple times, and re-inserted the drives into the hotswap bay. I think it's working fine now, and I'm starting to suspect there's a big problem with one of my previous SATA cables. I actually had a really difficult time unplugging one from the motherboard. Quote
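For a sense of what the speeds reported in this thread mean for the rebuild, here is a back-of-the-envelope estimate. It assumes a 3 TB parity drive (inferred from the WD30 model numbers, not confirmed in the thread) and a constant speed, which a real rebuild will not hold; the function name is hypothetical:

```python
def rebuild_hours(capacity_tb: float, speed_mb_s: float) -> float:
    """Hours to write the whole drive at a constant speed (decimal TB and MB)."""
    total_bytes = capacity_tb * 1e12
    seconds = total_bytes / (speed_mb_s * 1e6)
    return seconds / 3600


# Speeds mentioned in the thread; the 3 TB capacity is an assumption.
for speed in (3, 25, 50):
    print(f"{speed} MB/s -> about {rebuild_hours(3, speed):.0f} hours")
```

At a steady 25 MB/s that works out to well over a day, and at 3 MB/s to more than a week, which is why sustained single-digit speeds are worth investigating rather than waiting out.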