Cii1 Posted June 20, 2023 (edited)
The parity drive was disabled right after I copied 100GB of video footage to the array. I ran a SMART short test and it finished with no errors. I have not been able to resolve the issue no matter what I tried. I updated my server from Unraid 6.11.5 to 6.12.0, but restored it back to 6.11.5 after I encountered this situation. I also tried several different SATA cables, with no luck. I have attached the diagnostics of my Unraid server below.
takserver-diagnostics-20230620-1954.zip
Edited June 30, 2023 by Cii1
itimpi Posted June 20, 2023
Without diagnostics taken when the problem occurred (and before rebooting), we have no idea why the drive got disabled. Did you ever try to rebuild parity? It is quite likely the problem had nothing to do with the upgrade.
trurl Posted June 20, 2023
43 minutes ago, Cii1 said:
"I tried running SMART short test and it finished with no errors. I am not able to resolve the issue no matter what I tried. I updated my server from Unraid 6.11.5 to 6.12.0. But I have restored back to 6.11.5 after I encountered this situation. I also tried to replace with several different SATA cables but no luck"
None of these things will enable a disabled drive.
3 minutes ago, itimpi said:
"Did you ever try to rebuild parity?"
A drive gets disabled when a write to it fails. The failed write leaves it out of sync with the array, so a disabled drive has to be rebuilt.
Cii1 (Author) Posted June 21, 2023
I am rebuilding parity. It seems one of my drives is going to fail soon, as it has read errors during the parity rebuild. I think I will let the rebuild finish and then replace the failing drive. I have 2 IronWolf drives and both got read errors within 3 years. I have 7 WD Red drives in my server and I have only replaced 2 in 5 years. I am really disappointed by Seagate.
trurl Posted June 21, 2023
56 minutes ago, Cii1 said:
"rebuilding the parity. And it seems one of my drive is going to fail soon, as it has read error during the parity rebuild."
Post new diagnostics. Since you have single parity, problems with another drive could affect rebuilding. Then if your parity rebuild isn't good, you can't reliably rebuild another drive. Better to figure out the problem and maybe correct it before building parity.
trurl Posted June 21, 2023
@Cii1 Please post new diagnostics
Cii1 (Author) Posted June 22, 2023
21 hours ago, trurl said:
"@Cii1 Please post new diagnostics"
takserver-diagnostics-20230622-2148.zip
trurl Posted June 23, 2023
Connection problems with parity and disk9
Cii1 (Author) Posted June 23, 2023
3 hours ago, trurl said:
"Connection problems with parity and disk9"
Thanks. I noticed the free space on disk9 was ~5GB. After a reboot, the free space went back to ~60GB. There is a warning from Fix Common Problems. I think disk9 was too full to copy the files, so the parity sync failed. And now Docker doesn't work, as the path of the docker folder doesn't exist anymore.
trurl Posted June 23, 2023
7 hours ago, Cii1 said:
"disk9 is too full to copy the files. So the parity sync failed"
Parity sync is totally unrelated to files on disks. It is all just bits to parity.
8 hours ago, Cii1 said:
"After a reboot"
Did you do anything about the connection problems before rebooting? If not, you need to check connections on all disks, at both ends, SATA and power. Be careful you don't disturb other connections when working inside the case. The connectors should sit squarely and firmly on the connection with no tension in the cable. After fixing connections, post new diagnostics with the array started.
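To illustrate why parity "is all just bits", here is a minimal Python sketch. It assumes single parity is a bytewise XOR across the data disks, which is the usual single-parity scheme; the disk contents and the function names `xor_parity` and `rebuild` are made up for illustration and are not Unraid's actual code. It also shows why a drive that missed a write can no longer be trusted and has to be rebuilt.

```python
from functools import reduce
from typing import Sequence


def xor_parity(disks: Sequence[bytes]) -> bytes:
    """Bytewise XOR across equally sized disks (no notion of files at all)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*disks))


def rebuild(surviving: Sequence[bytes], parity: bytes) -> bytes:
    """Recover one missing disk: XOR of the parity with every surviving disk."""
    return xor_parity(list(surviving) + [parity])


# Hypothetical 4-byte "disks"; real disks are just much longer byte strings.
disk1 = b"\x10\x20\x30\x40"
disk2 = b"\x00\xff\x00\xff"
disk3 = b"\xaa\xbb\xcc\xdd"

parity = xor_parity([disk1, disk2, disk3])

# With parity in sync, a missing disk2 can be reconstructed exactly.
rebuilt = rebuild([disk1, disk3], parity)
assert rebuilt == disk2

# If a write reaches disk1 but the matching parity write fails, parity is
# stale (out of sync) and a rebuild based on it returns wrong data.
disk1_new = b"\x11\x20\x30\x40"
stale = rebuild([disk1_new, disk3], parity)
assert stale != disk2
```

How full a disk is never enters the calculation: every byte position is combined the same way whether or not a filesystem is using it.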
Cii1 (Author) Posted June 26, 2023
On 6/23/2023 at 9:25 PM, trurl said:
"Parity sync is totally unrelated to files on disks. It is all just bits to parity. Did you do anything about the connection problems before rebooting? If not, you need to check connections, all disks, both ends, SATA and power. Be careful you don't disturb connections when working inside. The connectors should sit squarely and firmly on the connection with no tension in the cable. After fixing connections, post new diagnostics with the array started."
All drives worked after I secured all the cables. I ran an extended self-test on the parity drive and disk9, which both had warnings before, and no errors were found. I thought it had all gone well. But after I rebuilt the parity drive and the array started normally, I ran a parity check and now there are errors again on the parity disk.
takserver-diagnostics-20230626-1236.zip
JorgeB Posted June 26, 2023 (Solution)
Doesn't look like a disk problem. I suggest swapping that disk to the onboard SATA controller and re-testing, in case it's some compatibility issue with the HBA.
Cii1 (Author) Posted June 26, 2023
1 hour ago, JorgeB said:
"Doesn't look like a disk problem, suggest swapping that disk to the onboard SATA controller and re-test, in case it's some compatibility issue with the HBA."
Thanks for the reply. I've swapped the parity drive to an onboard SATA connection. However, the parity drive is still disabled. Is there any way I can enable the parity drive without rebuilding the whole parity again?
JorgeB Posted June 26, 2023
22 minutes ago, Cii1 said:
"Is there any method which I can enable the parity drive without rebuilding the whole parity again?"
Nope, you should re-sync. You could do a new config and check "parity is already valid", but you would then need to do a correcting check, and that won't be faster.
Cii1 (Author) Posted June 26, 2023
1 hour ago, JorgeB said:
"Nope, you should re-sync, you could do a new config and check 'parity is already valid' but would then need to do a correcting check, and that won't be faster."
Thank you for the quick response. I am currently working on a project involving the server. I hope no drives fail before I can rebuild the parity in a few days.
On 6/23/2023 at 9:52 AM, trurl said:
"Connection problems with parity and disk9"
Thank you for the help! I will post an update in a few days.
Cii1 (Author) Posted June 28, 2023
On 6/26/2023 at 5:49 PM, JorgeB said:
"Doesn't look like a disk problem, suggest swapping that disk to the onboard SATA controller and re-test, in case it's some compatibility issue with the HBA."
After switching to the onboard SATA controller, the disk errors are gone. I suspect that either the HBA was loosened by vibration or the mini-SAS to SATA cables are faulty. I have a spare cable at home and I will try to replace it later when I have time to do maintenance. Thank you for all the help!
Cii1 (Author) Posted June 30, 2023
Unfortunately, the issue has come back. I noticed there is a problem with the connection. Is it also caused by a bad connection between the drive and the motherboard?
Jun 30 22:15:58 TakServer emhttpd: error: hotplug_devices, 1706: No such file or directory (2): tagged device ST12000NE0008-2PK103_ZS805F5F was (sdg) is now (sdn)
Jun 30 22:15:58 TakServer emhttpd: error: hotplug_devices, 1706: No such file or directory (2): tagged device WDC_WD120EFAX-68UNTN0_8DGG1WTY was (sdf) is now (sdo)
takserver-diagnostics-20230630-2246.zip
JorgeB Posted June 30, 2023
You're having issues with multiple disks:
Jun 30 22:15:04 TakServer kernel: sd 7:0:4:0: device_unblock and setting to running, handle(0x000d)
Jun 30 22:15:04 TakServer kernel: sd 7:0:4:0: [sdf] tag#99 UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=DRIVER_OK cmd_age=0s
Jun 30 22:15:04 TakServer kernel: sd 7:0:4:0: [sdf] tag#99 CDB: opcode=0x35 35 00 00 00 00 00 00 00 00 00
Jun 30 22:15:04 TakServer kernel: I/O error, dev sdf, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 2
and
Jun 30 22:15:09 TakServer kernel: sd 7:0:5:0: device_unblock and setting to running, handle(0x000e)
Jun 30 22:15:09 TakServer kernel: sd 7:0:5:0: [sdg] Synchronizing SCSI cache
Jun 30 22:15:09 TakServer kernel: sd 7:0:5:0: [sdg] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=DRIVER_OK
and
Jun 30 22:15:22 TakServer kernel: sd 7:0:9:0: Power-on or device reset occurred
Jun 30 22:15:22 TakServer kernel: sd 7:0:9:0: [sdo] 23437770752 512-byte logical blocks: (12.0 TB/10.9 TiB)
Jun 30 22:15:22 TakServer kernel: sd 7:0:9:0: [sdo] 4096-byte physical blocks
Jun 30 22:15:22 TakServer kernel: sd 7:0:9:0: [sdo] Write Protect is off
Jun 30 22:15:22 TakServer kernel: sd 7:0:9:0: [sdo] Mode Sense: 7f 00 10 08
Jun 30 22:15:22 TakServer kernel: sd 7:0:9:0: [sdo] Write cache: enabled, read cache: enabled, supports DPO and FUA
Suggesting a power/connection problem.
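For anyone who wants to do this kind of grouping on their own syslog, here is a rough Python sketch. It is only an illustration: the function name `suspicious_lines_by_device`, the list of hint strings, and the regular expressions are assumptions chosen to fit the excerpts quoted in this thread, not an official Unraid or kernel tool, and other log formats would need different patterns. As far as I know, `hostbyte=0x01` corresponds to `DID_NO_CONNECT` in the Linux SCSI layer, which is why it is treated as a hint here.

```python
import re
from collections import defaultdict

# A few of the kernel lines quoted in this thread, used as sample input.
SAMPLE_LOG = """\
Jun 30 22:15:04 TakServer kernel: sd 7:0:4:0: [sdf] tag#99 UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=DRIVER_OK cmd_age=0s
Jun 30 22:15:04 TakServer kernel: I/O error, dev sdf, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 2
Jun 30 22:15:09 TakServer kernel: sd 7:0:5:0: [sdg] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=DRIVER_OK
Jun 30 22:15:22 TakServer kernel: sd 7:0:9:0: Power-on or device reset occurred
Jun 30 22:15:22 TakServer kernel: sd 7:0:9:0: [sdo] Write Protect is off
"""

ADDR_RE = re.compile(r"sd (\d+:\d+:\d+:\d+):")              # SCSI address, e.g. 7:0:4:0
NAME_RE = re.compile(r"\[(sd[a-z]+)\]|\bdev (sd[a-z]+)")    # [sdf] or "dev sdf"
HINTS = ("I/O error", "hostbyte=0x01", "reset occurred")    # assumed trouble markers


def device_name(line):
    """Return the sdX name mentioned in a line, or None if there is none."""
    match = NAME_RE.search(line)
    return (match.group(1) or match.group(2)) if match else None


def suspicious_lines_by_device(log_text):
    """Group lines containing a hint string by the device they refer to."""
    lines = log_text.splitlines()

    # First pass: learn which SCSI address belongs to which sdX name.
    addr_to_name = {}
    for line in lines:
        addr, name = ADDR_RE.search(line), device_name(line)
        if addr and name:
            addr_to_name[addr.group(1)] = name

    # Second pass: collect the flagged lines, resolving the name via the
    # SCSI address when the line itself does not mention sdX.
    grouped = defaultdict(list)
    for line in lines:
        if not any(hint in line for hint in HINTS):
            continue
        name = device_name(line)
        if name is None:
            addr = ADDR_RE.search(line)
            name = addr_to_name.get(addr.group(1), "unknown") if addr else "unknown"
        grouped[name].append(line)
    return dict(grouped)


for device, hits in sorted(suspicious_lines_by_device(SAMPLE_LOG).items()):
    print(device, len(hits))
```

When several devices show up at the same timestamps, as they do here, a shared cause such as power, cabling, or the controller becomes more plausible than several disks failing independently.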
Cii1 (Author) Posted June 30, 2023
2 hours ago, JorgeB said:
"You're having issues with multiple disks
Jun 30 22:15:04 TakServer kernel: sd 7:0:4:0: device_unblock and setting to running, handle(0x000d)
Jun 30 22:15:04 TakServer kernel: sd 7:0:4:0: [sdf] tag#99 UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=DRIVER_OK cmd_age=0s
Jun 30 22:15:04 TakServer kernel: sd 7:0:4:0: [sdf] tag#99 CDB: opcode=0x35 35 00 00 00 00 00 00 00 00 00
Jun 30 22:15:04 TakServer kernel: I/O error, dev sdf, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 2
and
Jun 30 22:15:09 TakServer kernel: sd 7:0:5:0: device_unblock and setting to running, handle(0x000e)
Jun 30 22:15:09 TakServer kernel: sd 7:0:5:0: [sdg] Synchronizing SCSI cache
Jun 30 22:15:09 TakServer kernel: sd 7:0:5:0: [sdg] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=DRIVER_OK
and
Jun 30 22:15:22 TakServer kernel: sd 7:0:9:0: Power-on or device reset occurred
Jun 30 22:15:22 TakServer kernel: sd 7:0:9:0: [sdo] 23437770752 512-byte logical blocks: (12.0 TB/10.9 TiB)
Jun 30 22:15:22 TakServer kernel: sd 7:0:9:0: [sdo] 4096-byte physical blocks
Jun 30 22:15:22 TakServer kernel: sd 7:0:9:0: [sdo] Write Protect is off
Jun 30 22:15:22 TakServer kernel: sd 7:0:9:0: [sdo] Mode Sense: 7f 00 10 08
Jun 30 22:15:22 TakServer kernel: sd 7:0:9:0: [sdo] Write cache: enabled, read cache: enabled, supports DPO and FUA
Suggesting a power/connection problem."
Thanks. I totally forgot about the power supply issue. I've rearranged the power cables for the drives and it works fine again for now. I think my 550W PSU doesn't provide enough power for 1x SSD + 5x 7200RPM + 5x 5400RPM hard drives when it's under full load.
trurl Posted July 1, 2023
Preferably no more than 4 drives per power cable.
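As a quick worked example of that rule of thumb, the Python sketch below spreads drives as evenly as possible across the fewest power cables that keep each cable at or under the limit. The function name `plan_power_cables` is hypothetical, the default limit of 4 comes from the post above, and the count of 11 drives is taken from the 1 SSD + 5 + 5 hard drives mentioned earlier in the thread. It does not model wattage or spin-up current, so treat it as arithmetic only and not as a PSU sizing tool.

```python
import math


def plan_power_cables(drive_count, max_per_cable=4):
    """Return the number of drives to put on each cable, spread evenly."""
    if drive_count < 0 or max_per_cable < 1:
        raise ValueError("drive_count must be >= 0 and max_per_cable >= 1")

    # Fewest cables that keep every cable at or below the limit.
    cables = math.ceil(drive_count / max_per_cable)
    if cables == 0:
        return []

    # Spread the drives so no cable carries more than one drive extra.
    base, extra = divmod(drive_count, cables)
    return [base + 1] * extra + [base] * (cables - extra)


print(plan_power_cables(11))  # [4, 4, 3]
```

So a build with 11 drives would need at least three separate power cables from the PSU under this guideline.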