April 10, 2022

So I've been having issues with UDMA CRC errors. Here's my previous post: UDMA CRC ERRORS FROM ONLY 1 DISK FROM LSI BROADCOM SAS 9300-8I. I have not yet been able to build my new case, which does away with the backplane that I think is the underlying issue.

In the meantime I decided to follow SpaceInvaderOne's video on shrinking my array while preserving parity so that I can RMA Disk4, which also had UDMA CRC errors. Disk6, from my other post, was unfortunately out of warranty. Thanks @SpaceInvaderOne, your videos are amazing. I was able to move all of my data from Disk4 onto other drives, then clear the drive. I had zero issues following his video. However, when I started the array after unassigning Disk4, Disk6 threw errors and became disabled.

I just ordered a new hard drive because I don't have a spare. Yes I know, I should always have at least one spare. I'm an idiot.

My question is this: I don't see any 'bad' errors, like Unallocated or Pending sector. So is the disk actually bad, or is there another issue, like the bad backplane from my previous post or maybe a problem with my LSI card?

I am 100% going to leave my server off until I get the new drive, then I will rebuild onto it after doing a Preclear. But should I rebuild everything into the new case I got, which doesn't have a backplane, or should I stay in my current case with the backplane and rebuild my array with the new disk? I just don't want to keep getting all these errors, especially while I'm trying to rebuild a disk.

I'm not sure if the Diagnostics file will show all the errors or not, but I'm including it along with the Disk6 SMART report. I also started a SMART extended self-test on the disabled disk.

threadripper19-diagnostics-20220410-1708.zip
threadripper19-smart-20220410-1709.zip

Edited September 29, 2022 by FQs19. Topic solved.
April 11, 2022 Community Expert
11 hours ago, FQs19 said: So is the disk actually bad or is there another issue like in my previous post with the bad backplane or maybe a problem with my LSI card?
Disk looks healthy.
April 11, 2022 Author
4 hours ago, JorgeB said: Disk looks healthy.
Thank you for looking at it. Would you rebuild that disk onto a new one in the current case, or move the system over to the new case and then rebuild the disk?
April 11, 2022 Community Expert
Assuming nothing was written to disk6 since it got disabled, I would do a new config to re-enable it, then run a correcting parity check.
April 11, 2022 Author
5 hours ago, JorgeB said: Assuming nothing was written to disk6 since it got disable I would do a new config to re-enable it, then run a correcting parity check.
I just want to verify with you: is this the procedure you would like me to follow to re-enable the disk: Rebuilding a drive onto itself
I haven't written anything to the disk that I know of. I've disabled mover (set it to run monthly, so not until the 1st of the month), disabled my dockers, and haven't written to the data disks from other computers in my house. Thanks again for the help.
April 12, 2022 Community Expert Solution
10 hours ago, FQs19 said: I just want to verify with you, but is this the procedure you would like me to follow to re-enable the disk:
No, I mentioned doing a new config: Tools -> New Config, keep all assignments, check "parity is already valid" before starting the array, then run a correcting check.
April 12, 2022 Community Expert
11 hours ago, FQs19 said: Rebuilding a drive onto itself
Note that this is also a valid option, but only if the emulated disk6 is mounting and its contents look correct.
April 12, 2022 Author
7 hours ago, JorgeB said: No, I mentioned doing a new config: Tools -> New Config, keep all assignments, check parity is already valid before array start, then run a correcting check.
Glad I checked with you first. I did exactly what you said: Tools>New Config>Keep All Assignments, then checked the box for 'Parity is Valid', started the array, then started a Correcting Parity Check by checking the box 'Write Corrections to Parity'. It'll take at least 19 hrs to finish. About 2 minutes after starting the array, Disk6 received 4 more UDMA CRC errors. I need this correcting parity check to finish so I can shut down the server and move everything over to my new case ASAP. Thanks so much for the help.
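For anyone following along who wants to watch that counter outside the GUI: the UDMA CRC count is SMART attribute 199, and it is cumulative, so the useful signal is whether the raw value keeps climbing after a cable, backplane, or case change. Below is a minimal Python sketch that pulls the raw value out of text in the `smartctl -A` table layout. The function name and the sample rows are hypothetical (the values are made up for illustration, not taken from the diagnostics in this thread), and it assumes the usual ten-column attribute table.

```python
def udma_crc_count(smartctl_output: str):
    """Return the raw value of SMART attribute 199, or None if it is absent."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with the numeric ID; RAW_VALUE is the 10th column.
        if len(fields) >= 10 and fields[0] == "199" and fields[9].isdigit():
            return int(fields[9])
    return None


# Hypothetical sample rows in the `smartctl -A` layout (values are made up).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       4
"""

previous_reading = 0                     # value noted before starting the array
current_reading = udma_crc_count(SAMPLE)
new_errors = current_reading - previous_reading
print(f"New UDMA CRC errors since last reading: {new_errors}")
```

In practice the text would come from running `smartctl -A` against the disk in question and saving the output; comparing two readings taken some time apart shows whether the link is still producing errors.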
April 13, 2022 Author
19 hours ago, JorgeB said: No, I mentioned doing a new config: Tools -> New Config, keep all assignments, check parity is already valid before array start, then run a correcting check.
Just wanted to give an update: I'm in the middle of the correcting parity check and I'm at 517 sync errors. I'm not seeing any errors on the disks, though. Should I do another correcting parity check after this finishes, or just a parity check? I understand that a parity check should always come back with 0 errors. So I assume I should do another correcting parity check, rather than waste time on a non-correcting check and then have to run a correcting one anyway if it finds errors.
threadripper19-diagnostics-20220412-2151.zip
April 13, 2022 Community Expert
If you are running a correcting check then it should be fixing the errors reported. The next check should be non-correcting and, if everything is good, it will come back with 0 errors.
April 13, 2022 Community Expert
Some sync errors are expected because of what happened, so just let it finish. As itimpi mentioned, you can then run a non-correcting check if you want to confirm all is fine, and it should be.
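To illustrate why the follow-up check is expected to report 0 errors, here is a toy Python sketch of a correcting versus non-correcting check. It assumes a simplified single-parity model in which the parity block is the XOR of the data blocks at the same position; the real array code is more involved (a second parity disk uses different math), and every name below is hypothetical.

```python
from functools import reduce


def xor_parity(blocks):
    """XOR all blocks at one position together (simplified single parity)."""
    return reduce(lambda a, b: a ^ b, blocks, 0)


def parity_check(data_disks, parity, correct):
    """Count positions where stored parity disagrees with the data.

    When `correct` is True, the parity value is rewritten to match the data,
    which is the toy equivalent of 'Write Corrections to Parity'.
    """
    sync_errors = 0
    for i in range(len(parity)):
        expected = xor_parity(disk[i] for disk in data_disks)
        if parity[i] != expected:
            sync_errors += 1
            if correct:
                parity[i] = expected
    return sync_errors


# Three toy data disks with three blocks each, plus freshly computed parity.
disks = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
parity = [xor_parity(column) for column in zip(*disks)]

parity[1] ^= 0xFF  # simulate one stale parity block

first = parity_check(disks, parity, correct=True)    # correcting check
second = parity_check(disks, parity, correct=False)  # non-correcting check
print(first, second)  # prints "1 0"
```

The first pass finds the mismatch and rewrites parity, so the second pass has nothing left to report. If a later non-correcting check did find new errors, that would point to something still going wrong after the correction.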
April 13, 2022 Author
9 hours ago, itimpi said: If you are running a correcting check then it should be fixing the errors reported. The next check should be non-correcting and if everything is good will come back with 0 errors.
5 hours ago, JorgeB said: Some sync errors are expected because of what happened, just let it finish, like mentioned by itimpi you can then run a non correcting check if you want to confirm all is fine, and it should be.
Thank you both for the help. I'm at 91% on the parity check and the errors are still at 517. I'll let it finish, then run a non-correcting parity check to confirm 0 errors. I can then finally switch this server over to my new case that doesn't have a backplane.
May 6, 2022 Author

I finally had time to move my server into a new case. I moved to a Rosewill RSV-L4500U, which doesn't have a SATA backplane or hot-swap bays. It has three 5-disk bays.

I ordered new SATA power cables because my power supply only has four 1x3 ones. I wish they made 1x5 cables, but I got 1x4 ones instead: EVGA 4x SATA Cable (Single). I also got extensions to run from my fourth SATA power cable to the 5th disk in each bay. Cable management is difficult without purchasing custom cables (which I'm not going to do). I wish I could use the software for my MSI MPG Series CORELIQUID K360 AIO, but I'm just using the motherboard's fan controllers.

I have 4 disks (including the 2 Parity disks) connected to my motherboard's TRX40 chipset controller. I have the remaining 8 disks connected to my LSI 9300-8i HBA card. I will have 2 disks connected to my motherboard's ASMedia controller as spares.

I'm currently running a Correcting Parity Check. I will do a Non-Correcting Parity Check after it finishes. I'm sure there will be errors since I stopped the Correcting Parity Check before it finished. I haven't seen any errors reported since switching to the new case. Fingers crossed that it was the old case's backplane causing all my errors.

If you guys have any suggestions on what to do after I complete my parity checks, please let me know. Thanks again for the help.