clowncracker Posted November 1, 2023 I recently had a drive fail during a parity check (it was not writing corrections). I then replaced the drive and did a data rebuild. Afterwards I ran Check Filesystem Status and corrected the newly rebuilt drive. I started another Parity Check (non-correcting), but I've run into 3168 errors in the first 5 minutes. Do I have an issue with my parity or my newly rebuilt drive? Is there a way I can check and validate that my data isn't corrupted? EDIT: I also just noticed that my rebuilt drive isn't showing the correct capacity in Unraid. I replaced a 5TB drive with a 14TB drive, but the drive is still showing as only 5TB in Unraid. I still have access to my old drive and a backup of my USB drive. Should I go back and try to fix my old drive and then replace it? What is best practice here? Would that fix my parity problem as well? Edited November 1, 2023 by clowncracker
JorgeB Posted November 1, 2023 Please post the diagnostics.
clowncracker Posted November 1, 2023 Author 4 hours ago, JorgeB said: Please post the diagnostics. Here is the diagnostic file. I turned the system back on and it doesn't seem to accept the newly rebuilt drive: clowncracker-diagnostics-20231101-0721.zip
trurl Posted November 1, 2023 Can you start the array and post new diagnostics?
clowncracker Posted November 1, 2023 Author 3 minutes ago, trurl said: Can you start the array and post new diagnostics? What should I do with disk 3? Leave it with no device? For reference, TOSHIBA_HDWE150 was the disk that failed and was replaced with WDC_WD140EDGZ. WDC_WD140EDGZ was rebuilt and then I did a Check Filesystem to correct the issues. Edited November 1, 2023 by clowncracker
trurl Posted November 1, 2023 11 minutes ago, clowncracker said: TOSHIBA_HDWE150 was the disk that failed and was replaced with WDC_WD140EDGZ. WDC_WD140EDGZ was rebuilt There are 2 screenshots in your post that show both of these. I assume one was taken before the rebuild? Can you see the missing drive in BIOS?
clowncracker Posted November 1, 2023 Author 9 minutes ago, trurl said: There are 2 screenshots in your post that show both of these. I assume one was taken before the rebuild? Can you see the missing drive in BIOS? Sorry, both of those screenshots are from after the rebuild. WDC_WD140EDGZ was already rebuilt and is selectable from the dropdown as a replacement for disk 3. I was just showing that it recognizes TOSHIBA_HDWE150 as disk 3, even though it was already rebuilt with WDC_WD140EDGZ.
trurl Posted November 1, 2023 5 minutes ago, clowncracker said: it recognizes TOSHIBA_HDWE150 as disk 3, even though it was already rebuilt with WDC_WD140EDGZ. That suggests it thinks the Toshiba is still disk3. Did you make a flash backup after the replacement? Your array disk assignments are in config/super.dat (not a plain-text file).
clowncracker Posted November 1, 2023 Author 3 minutes ago, trurl said: That suggests it thinks the Toshiba is still disk3. Did you make a flash backup after the replacement? Your array disk assignments are in config/super.dat (not a plain-text file). I did not make a backup after the replacement. I didn't think I would need to, since in theory once it was replaced the active config should recognize the WD drive as disk 3.
trurl Posted November 1, 2023 You might check your flash drive in your PC to make sure it isn't corrupt.
clowncracker Posted November 1, 2023 Author 6 minutes ago, trurl said: You might check your flash drive in your PC to make sure it isn't corrupt. Flash drive looks fine. Just plugged it in and it looks like there are no issues. I guess I could rebuild disk 3 again, but I am concerned about the number of parity errors it was throwing yesterday when I ran the check after the rebuild. It also might have been throwing the parity errors because for some reason disk 3 wasn't recognized correctly after I did the Check Filesystem. Maybe I should have done the Check Filesystem before I rebuilt disk 3? It did show disk 3 only having 5TB after the rebuild, which is weird. The screenshot for that is in the original post. Edited November 1, 2023 by clowncracker
JorgeB Posted November 1, 2023 Unraid is still looking for the old disk; something went wrong during the replacement. Unassign disk3 and post new diags with the array started.
clowncracker Posted November 1, 2023 Author 2 hours ago, JorgeB said: Unraid is still looking for the old disk; something went wrong during the replacement. Unassign disk3 and post new diags with the array started. Just to confirm: clowncracker-diagnostics-20231101-1125.zip Edited November 1, 2023 by clowncracker
trurl Posted November 1, 2023 Emulated disk3 is mounted and 82% full. Should be OK to rebuild to the new disk. No idea what happened before. Post new diagnostics without rebooting if there are problems or when rebuild completes.
clowncracker Posted November 2, 2023 Author On 11/1/2023 at 2:39 PM, trurl said: Emulated disk3 is mounted and 82% full. Should be OK to rebuild to the new disk. No idea what happened before. Post new diagnostics without rebooting if there are problems or when rebuild completes. Rebuild just finished. Attached is the new diagnostics file without rebooting (pulled it immediately after the rebuild finished). clowncracker-diagnostics-20231102-1555.zip
trurl Posted November 2, 2023 Looks OK. Make sure you make a backup of flash.
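For readers who want a scripted, dated copy of the flash configuration (including config/super.dat), a minimal sketch follows. It is not taken from the thread: the function name `backup_config` is hypothetical, and the source and destination paths are assumptions you must supply yourself. It does not replace Unraid's own flash backup.

```python
from __future__ import annotations

import shutil
import time
from pathlib import Path


def backup_config(config_dir: Path, dest_dir: Path) -> Path:
    """Zip the contents of config_dir into dest_dir under a timestamped name.

    Both paths are supplied by the caller; nothing here is Unraid-specific.
    Returns the path of the archive that was written.
    """
    if not config_dir.is_dir():
        raise FileNotFoundError(f"config directory not found: {config_dir}")
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    base = dest_dir / f"flash-config-{stamp}"
    # make_archive appends ".zip" to the base name and returns the archive path.
    archive = shutil.make_archive(str(base), "zip", root_dir=str(config_dir))
    return Path(archive)
```

A call might look like `backup_config(Path("/boot/config"), Path("/mnt/user/backups/flash"))`, where both paths are assumptions about a particular system. Keep the resulting archive somewhere other than the array it describes.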
clowncracker Posted November 7, 2023 Author On 11/2/2023 at 4:05 PM, trurl said: Looks OK. Make sure you make a backup of flash. Sorry for the delay, but I've been out of town for a few days. I just started another parity check. I've already found 7685 errors in the past 10 minutes, so I'm concerned there is still a problem. I decided to cancel the parity check and restart my server. The config still looks correct. I've attached another diagnostics log. clowncracker-diagnostics-20231107-1044.zip Edited November 7, 2023 by clowncracker
JorgeB Posted November 7, 2023 There are no sync errors in the diags; are they from after rebooting?
clowncracker Posted November 7, 2023 Author 39 minutes ago, JorgeB said: There are no sync errors in the diags; are they from after rebooting? I've attached another diagnostic file, taken after restarting the parity check following a fresh reboot. I'm at around 390 errors at the moment. clowncracker-diagnostics-20231107-1201.zip
JorgeB Posted November 8, 2023 No indication of any disk issues; start by running memtest.
clowncracker Posted November 9, 2023 Author 12 hours ago, JorgeB said: No indication of any disk issues; start by running memtest. Memtest came back with no issues.
JorgeB Posted November 9, 2023 Could also be the controller or a disk. If it's a disk, it can be a pain to find out which one: you basically need to test without one disk at a time. Try the controller first if possible, and note that memtest not finding errors does not completely rule out RAM; if you have more than one RAM stick, try with one at a time. Also remember that after any change you need to run two checks, since the first one can still find errors.
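Why a sync error does not say which disk is at fault can be seen in a toy model. The sketch below is a simplification and not Unraid's code: it assumes a single XOR parity over small integer "blocks", does not model a second parity disk, and all names are made up for illustration.

```python
from __future__ import annotations

from functools import reduce
from operator import xor


def compute_parity(blocks: list[int]) -> int:
    """Single parity for one stripe position: XOR of all data blocks."""
    return reduce(xor, blocks, 0)


def count_sync_errors(data_disks: list[list[int]], parity_disk: list[int]) -> int:
    """Count positions where stored parity disagrees with recomputed parity."""
    errors = 0
    for position, stored in enumerate(parity_disk):
        stripe = [disk[position] for disk in data_disks]
        if compute_parity(stripe) != stored:
            errors += 1
    return errors


# Three toy data "disks" with four positions each, plus their parity.
disks = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
parity = [compute_parity(list(stripe)) for stripe in zip(*disks)]
healthy_errors = count_sync_errors(disks, parity)

# One disk returns a wrong value at one position (bad disk, cable, RAM, ...).
disks[1][2] ^= 0xFF
errors_after = count_sync_errors(disks, parity)

print(healthy_errors, errors_after)
```

The check only learns that the stripe no longer XORs to the stored parity. Flipping the same bits on any other data disk, or on the parity disk itself, would produce the same mismatch at that position, which is why the suggested approach is to remove one suspect at a time.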
clowncracker Posted November 9, 2023 Author 7 hours ago, JorgeB said: Could also be the controller or a disk. If it's a disk, it can be a pain to find out which one: you basically need to test without one disk at a time. Try the controller first if possible, and note that memtest not finding errors does not completely rule out RAM; if you have more than one RAM stick, try with one at a time. Also remember that after any change you need to run two checks, since the first one can still find errors. I honestly don't think it's a hardware issue (memory, controller, cables, etc.). It might be disk related, but how would I go about testing that? Stopping the array, disabling a disk, starting the server and just running a parity check? If that's the case, how would I actually go about fixing the issue? Edited November 9, 2023 by clowncracker
JorgeB Posted November 9, 2023 32 minutes ago, clowncracker said: but how would I go about testing that? 8 hours ago, JorgeB said: you basically need to test without one disk at a time. Like this:
clowncracker Posted November 9, 2023 Author 19 minutes ago, JorgeB said: Like this: I'm going to assume the parity disks aren't the issue, since they've both been replaced in the past 4 months with brand new drives. I'm going to test disk 3 that was just replaced (which caused all of these issues to begin with), disk 4 that has sector count issues and disk 10 (which is the newest drive in the array). So to confirm:
1) Save a backup of super.dat (I'll just use the file from the diagnostics).
2) Stop the array.
3) Tools > New config, selecting all in the Preserve current assignments section.
4) DESELECT DISK 3, making sure Parity is Valid is UNCHECKED.
5) Start the array.
6a) Run a parity check. If there are issues, I should restart the process with another drive.
6b) If there are no issues, run another parity check to make sure there are no issues.
7) Once I've identified the drive with problems, stop the array and rebuild it with a new disk. Run two parity checks to make sure there are no issues.
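The elimination loop discussed in this thread can be sketched as a small simulation. Everything here is hypothetical: `run_check_without` stands in for the manual steps (new config without one disk, parity sync, parity check) and simply returns a sync-error count, and `make_fake_runner` fakes those results. The sketch judges each candidate by the second of two checks, following the earlier advice that the first check after a change can still find errors.

```python
from __future__ import annotations

from typing import Callable, Optional


def find_suspect_disk(
    candidates: list[str],
    run_check_without: Callable[[str], int],
) -> Optional[str]:
    """Return the first disk whose exclusion gives a clean second check."""
    for disk in candidates:
        run_check_without(disk)           # first check may report leftover errors
        if run_check_without(disk) == 0:  # the second check is the one that counts
            return disk
    return None


def make_fake_runner(bad_disk: str) -> Callable[[str], int]:
    """Simulated checks: errors persist unless bad_disk is the excluded one."""
    calls: dict[str, int] = {}

    def run(excluded: str) -> int:
        calls[excluded] = calls.get(excluded, 0) + 1
        if excluded != bad_disk:
            return 100  # the faulty disk is still in the array
        # With the faulty disk removed, only the first check shows leftovers.
        return 5 if calls[excluded] == 1 else 0

    return run


suspect = find_suspect_disk(["disk3", "disk4", "disk10"], make_fake_runner("disk4"))
print(suspect)
```

In practice each call to the callback represents hours of manual work, so ordering the candidates by suspicion matters. If no candidate produces a clean second check, the sketch returns `None`, which would point back toward the controller or RAM.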