
(SOLVED) Parity drive disable when copying files to the array


Go to solution Solved by JorgeB,


The parity drive was disabled right after I copied 100GB of video footage to the array. I ran a SMART short test and it finished with no errors. I have not been able to resolve the issue no matter what I tried.

 

I updated my server from Unraid 6.11.5 to 6.12.0, but I rolled back to 6.11.5 after I ran into this problem.
I also tried several different SATA cables, with no luck.

I have attached the diagnostics of my Unraid server below.

takserver-diagnostics-20230620-1954.zip

Edited by Cii1
Link to comment
43 minutes ago, Cii1 said:

I ran a SMART short test and it finished with no errors. I have not been able to resolve the issue no matter what I tried.

 

I updated my server from Unraid 6.11.5 to 6.12.0, but I rolled back to 6.11.5 after I ran into this problem.
I also tried several different SATA cables, with no luck.

None of these things will enable a disabled drive.

 

3 minutes ago, itimpi said:

Did you ever try to rebuild parity?

A drive gets disabled when a write to it fails. The failed write makes it out-of-sync with the array.

 

A disabled drive has to be rebuilt since it is out-of-sync with the array.
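To illustrate why a single failed write leaves the drive out of sync, here is a minimal sketch of XOR parity in Python. It is a toy model under stated assumptions (tiny two-byte "disks", hypothetical names such as `xor_parity`), not how Unraid is actually implemented:

```python
from functools import reduce

def xor_parity(blocks):
    """Byte-wise XOR of equally sized blocks (toy stand-in for single parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data disks, two bytes each.
data = [b"\x01\x02", b"\x10\x20", b"\xaa\xbb"]
parity = xor_parity(data)
in_sync_before = parity == xor_parity(data)

# A data disk is updated, but the matching write to parity fails,
# so the stored parity still describes the old contents.
data[1] = b"\x11\x22"
in_sync_after = parity == xor_parity(data)

# Rebuilding recomputes parity from the current contents of every data disk.
parity = xor_parity(data)
in_sync_rebuilt = parity == xor_parity(data)

print(in_sync_before, in_sync_after, in_sync_rebuilt)
```

In this model the only way back to a consistent state is to recompute parity from all data disks, which is what the rebuild does.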

Link to comment

I am rebuilding the parity, and it seems one of my drives is going to fail soon, as it had read errors during the parity rebuild.

I think I will let the rebuild finish and then replace the failing drive. I have 2 IronWolf drives and both developed read errors within 3 years. I have 7 WD Red drives in my server and have only replaced 2 in 5 years. I am really disappointed by Seagate.

Link to comment
56 minutes ago, Cii1 said:

rebuilding the parity, and it seems one of my drives is going to fail soon, as it had read errors during the parity rebuild.

Post new diagnostics.

 

Since you have single parity, problems with another drive could affect rebuilding.

 

Then if your parity rebuild isn't good, you can't reliably rebuild another drive.

 

Better to figure out the problem and maybe correct it before building parity.
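The point about single parity can be sketched the same way. This is a toy model, assuming XOR parity over three hypothetical two-byte disks, with the bad read modelled as one wrong byte; the names `xor_blocks` and `rebuild` are made up for illustration:

```python
def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

def rebuild(missing, disks, parity):
    """Reconstruct one missing disk from all the other disks plus parity."""
    others = [blk for name, blk in disks.items() if name != missing]
    return xor_blocks(others + [parity])

disks = {"disk1": b"\x0f\xf0", "disk2": b"\x33\xcc", "disk3": b"\x55\xaa"}

# Parity built while every disk reads correctly.
good_parity = xor_blocks(list(disks.values()))

# Parity built while disk3 returns one wrong byte (simulated bad read).
misread = dict(disks, disk3=b"\x55\xab")
bad_parity = xor_blocks(list(misread.values()))

rebuilt_ok = rebuild("disk2", disks, good_parity)
rebuilt_bad = rebuild("disk2", disks, bad_parity)

print(rebuilt_ok == disks["disk2"], rebuilt_bad == disks["disk2"])
```

With good parity the missing disk comes back exactly; with parity built from a bad read, the reconstructed disk silently differs from the original.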

Link to comment
3 hours ago, trurl said:

Connection problems with parity and disk9

Thanks. I noticed the free space on disk9 was ~5GB. After a reboot, the free space went back to ~60GB.
There is a warning from Fix Common Problems.
image.thumb.png.42459376900b08e3b49390318211f496.png

I think disk9 was too full to copy the files, so the parity sync failed. Now Docker doesn't work because the path to the docker folder no longer exists.
image.png.affd3afd6abe2ed9ea234a25902577fc.png

Link to comment
7 hours ago, Cii1 said:

disk9 is too full to copy the files. So the parity sync failed

Parity sync is totally unrelated to files on disks. It is all just bits to parity.

 

8 hours ago, Cii1 said:

After a reboot

Did you do anything about the connection problems before rebooting? If not, you need to check connections, all disks, both ends, SATA and power. Be careful you don't disturb connections when working inside. The connectors should sit squarely and firmly on the connection with no tension in the cable.

 

After fixing connections, post new diagnostics with the array started.

 

 

Link to comment
On 6/23/2023 at 9:25 PM, trurl said:

Parity sync is totally unrelated to files on disks. It is all just bits to parity.

 

Did you do anything about the connection problems before rebooting? If not, you need to check connections, all disks, both ends, SATA and power. Be careful you don't disturb connections when working inside. The connectors should sit squarely and firmly on the connection with no tension in the cable.

 

After fixing connections, post new diagnostics with the array started.

 

 

All drives worked after I secured all the cables. I ran an extended self-test on the parity drive and disk9, which both had warnings before, and no errors were found.

I thought everything had gone well. But after I rebuilt the parity drive and the array started normally, I ran a parity check and now there are errors on the parity disk again.

takserver-diagnostics-20230626-1236.zip

Link to comment
1 hour ago, JorgeB said:

Doesn't look like a disk problem, suggest swapping that disk to the onboard SATA controller and re-test, in case it's some compatibility issue with the HBA.

Thanks for the reply. I've moved the parity drive to an onboard SATA port. However, the parity drive is still disabled. Is there any way to re-enable the parity drive without rebuilding the whole parity again?

Link to comment
1 hour ago, JorgeB said:

Nope, you should re-sync, you could do a new config and check "parity is already valid" but would then need to do a correcting check, and that won't be faster.

Thank you for the quick response. I am currently working on a project involving the server. I hope no drives fail before I can rebuild the parity in a few days. 

 

On 6/23/2023 at 9:52 AM, trurl said:

Connection problems with parity and disk9

Thank you for the help!


I will post an update in a few days. 

Link to comment
On 6/26/2023 at 5:49 PM, JorgeB said:

Doesn't look like a disk problem, suggest swapping that disk to the onboard SATA controller and re-test, in case it's some compatibility issue with the HBA.


After switching to the onboard SATA controller, the disk errors are gone. I suspect that either the HBA was loosened by vibration or the mini-SAS to SATA cables are faulty. I have a spare cable at home and will replace it when I have time to do maintenance.

Thank you for all the help!

Link to comment
  • Cii1 changed the title to Parity drive disable when copying files to the array

Unfortunately, the issue has come back. I noticed there is a problem with the connection. Is this also caused by a bad connection between the drive and the motherboard?

Jun 30 22:15:58 TakServer emhttpd: error: hotplug_devices, 1706: No such file or directory (2): tagged device ST12000NE0008-2PK103_ZS805F5F was (sdg) is now (sdn)
Jun 30 22:15:58 TakServer emhttpd: error: hotplug_devices, 1706: No such file or directory (2): tagged device WDC_WD120EFAX-68UNTN0_8DGG1WTY was (sdf) is now (sdo)
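For anyone scanning a long syslog for the same symptom, a small sketch like the following pulls out which devices were re-detected under a new letter. The regular expression is only inferred from the two lines above, so treat the pattern and the names `HOTPLUG` and `renames` as assumptions rather than a documented log format:

```python
import re

# Pattern inferred from the two emhttpd lines quoted above (an assumption,
# not a documented format).
HOTPLUG = re.compile(
    r"tagged device (?P<device>\S+) was \((?P<old>\w+)\) is now \((?P<new>\w+)\)"
)

lines = [
    "Jun 30 22:15:58 TakServer emhttpd: error: hotplug_devices, 1706: No such file or directory (2): tagged device ST12000NE0008-2PK103_ZS805F5F was (sdg) is now (sdn)",
    "Jun 30 22:15:58 TakServer emhttpd: error: hotplug_devices, 1706: No such file or directory (2): tagged device WDC_WD120EFAX-68UNTN0_8DGG1WTY was (sdf) is now (sdo)",
]

# Collect one dict per line that matches; non-matching lines are skipped.
renames = [m.groupdict() for line in lines if (m := HOTPLUG.search(line))]

for r in renames:
    print(f"{r['device']}: {r['old']} -> {r['new']}")
```

A drive that comes back under a different letter has dropped off the bus and reconnected while the array was running.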

takserver-diagnostics-20230630-2246.zip

Link to comment

You're having issues with multiple disks 

 

 

Jun 30 22:15:04 TakServer kernel: sd 7:0:4:0: device_unblock and setting to running, handle(0x000d)
Jun 30 22:15:04 TakServer kernel: sd 7:0:4:0: [sdf] tag#99 UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=DRIVER_OK cmd_age=0s
Jun 30 22:15:04 TakServer kernel: sd 7:0:4:0: [sdf] tag#99 CDB: opcode=0x35 35 00 00 00 00 00 00 00 00 00
Jun 30 22:15:04 TakServer kernel: I/O error, dev sdf, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 2

and

Jun 30 22:15:09 TakServer kernel: sd 7:0:5:0: device_unblock and setting to running, handle(0x000e)
Jun 30 22:15:09 TakServer kernel: sd 7:0:5:0: [sdg] Synchronizing SCSI cache
Jun 30 22:15:09 TakServer kernel: sd 7:0:5:0: [sdg] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=DRIVER_OK

and 

Jun 30 22:15:22 TakServer kernel: sd 7:0:9:0: Power-on or device reset occurred
Jun 30 22:15:22 TakServer kernel: sd 7:0:9:0: [sdo] 23437770752 512-byte logical blocks: (12.0 TB/10.9 TiB)
Jun 30 22:15:22 TakServer kernel: sd 7:0:9:0: [sdo] 4096-byte physical blocks
Jun 30 22:15:22 TakServer kernel: sd 7:0:9:0: [sdo] Write Protect is off
Jun 30 22:15:22 TakServer kernel: sd 7:0:9:0: [sdo] Mode Sense: 7f 00 10 08
Jun 30 22:15:22 TakServer kernel: sd 7:0:9:0: [sdo] Write cache: enabled, read cache: enabled, supports DPO and FUA


Suggesting a power/connection problem.
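As a rough way to see the same pattern in a full syslog, the sketch below groups suspicious kernel lines by SCSI address. The sample lines are copied from the excerpts above; the regular expression and the choice of "suspicious" markers (`hostbyte=0x01`, which is `DID_NO_CONNECT` in the Linux SCSI layer, and `reset occurred`) are assumptions made for illustration:

```python
import re
from collections import defaultdict

# Matches kernel lines of the form "kernel: sd H:C:T:L: message".
SCSI_LINE = re.compile(r"kernel: sd (?P<addr>\d+:\d+:\d+:\d+): (?P<msg>.*)")
# Markers treated as trouble in this sketch (an assumption, not a complete list).
MARKERS = ("hostbyte=0x01", "reset occurred")

lines = [
    "Jun 30 22:15:04 TakServer kernel: sd 7:0:4:0: [sdf] tag#99 UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=DRIVER_OK cmd_age=0s",
    "Jun 30 22:15:09 TakServer kernel: sd 7:0:5:0: [sdg] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=DRIVER_OK",
    "Jun 30 22:15:22 TakServer kernel: sd 7:0:9:0: Power-on or device reset occurred",
    "Jun 30 22:15:22 TakServer kernel: sd 7:0:9:0: [sdo] Write Protect is off",
]

# Map SCSI host number -> set of affected device addresses on that host.
affected = defaultdict(set)
for line in lines:
    m = SCSI_LINE.search(line)
    if m and any(marker in m["msg"] for marker in MARKERS):
        host = m["addr"].split(":")[0]
        affected[host].add(m["addr"])

by_host = {host: sorted(addrs) for host, addrs in affected.items()}
print(by_host)
```

In the excerpts above all three affected addresses share host number 7, which fits something common to those disks (controller, cabling or power) rather than three independent disk failures.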

 

Link to comment
  • Cii1 changed the title to (SOLVED) Parity drive disable when copying files to the array
2 hours ago, JorgeB said:

You're having issues with multiple disks

Suggesting a power/connection problem.

 


Thanks. I totally forgot about the power supply. I've rearranged the power cables for the drives and it is working fine again for now.

I think my 550W PSU doesn't provide enough power for 1x SSD + 5x 7200RPM + 5x 5400RPM hard drives under full load.
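A rough way to sanity-check this is to estimate the peak draw on the 12 V rail when all hard drives spin up at once. The sketch below is only arithmetic under assumed numbers: the 2.0 A per-drive spin-up current and the five-drives-per-cable split are placeholders, not measurements from this server, so the real figures should come from the drive datasheets and the PSU label:

```python
RAIL_VOLTS = 12.0

def peak_12v_watts(n_drives, spinup_amps):
    """Peak 12 V power if n_drives spin up simultaneously."""
    return n_drives * spinup_amps * RAIL_VOLTS

# Drive count from the post: 5x 7200RPM + 5x 5400RPM (the SSD is ignored here).
N_HDD = 10
# ASSUMPTION: placeholder spin-up current per drive; check the datasheets.
SPINUP_AMPS = 2.0

total_watts = peak_12v_watts(N_HDD, SPINUP_AMPS)

# ASSUMPTION: hypothetical layout with five drives sharing one power cable.
DRIVES_PER_CABLE = 5
amps_per_cable = DRIVES_PER_CABLE * SPINUP_AMPS

print(f"Peak 12 V draw: {total_watts:.0f} W, per cable: {amps_per_cable:.0f} A")
```

Comparing both numbers against the PSU's 12 V rating and the per-cable limits shows whether total capacity or the way drives are spread across cables is the tighter constraint.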

Link to comment
