disk error during parity check - UnRAID stuck


weirdcrap
Solved by JorgeB


v6.11-RC5

 

This is a follow-up to this post. I had repaired the errors and was running a final non-correcting check.

 

I was running a non-correcting parity check when one of my disks decided to sh*t the bed, or maybe the SAS controller lost contact with it? Now UnRAID is stuck on a paused "Read Check".

 

This was the last thing before a flood of read and write failures:

Sep 21 16:47:26 VOID kernel: mpt2sas_cm1: log_info(0x31110d01): originator(PL), code(0x11), sub_code(0x0d01)
Sep 21 16:47:26 VOID kernel: sd 12:0:2:0: [sdo] tag#1652 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=DRIVER_OK cmd_age=3s
Sep 21 16:47:26 VOID kernel: sd 12:0:2:0: [sdo] tag#1652 Sense Key : 0x2 [current] 
Sep 21 16:47:26 VOID kernel: sd 12:0:2:0: [sdo] tag#1652 ASC=0x4 ASCQ=0x0 
Sep 21 16:47:26 VOID kernel: sd 12:0:2:0: [sdo] tag#1652 CDB: opcode=0x88 88 00 00 00 00 00 04 29 0f 48 00 00 04 00 00 00
Sep 21 16:47:26 VOID kernel: I/O error, dev sdo, sector 69799752 op 0x0:(READ) flags 0x0 phys_seg 128 prio class 0
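
For anyone reading these lines later, here is a rough sketch of how the sense data and CDB can be decoded. The function names (`decode_read16`, `summarize`) are made up for illustration, and the lookup tables only cover the codes that appear in this log; the meanings are the standard SCSI ones (sense key 0x2 is NOT READY, ASC/ASCQ 0x04/0x00 is "logical unit not ready, cause not reportable", opcode 0x88 is READ(16)).

```python
import re

# Lookup tables deliberately limited to the codes seen in the log above.
SENSE_KEYS = {0x2: "NOT READY"}
ASC_ASCQ = {(0x04, 0x00): "LOGICAL UNIT NOT READY, CAUSE NOT REPORTABLE"}

LOG = """\
Sep 21 16:47:26 VOID kernel: sd 12:0:2:0: [sdo] tag#1652 Sense Key : 0x2 [current]
Sep 21 16:47:26 VOID kernel: sd 12:0:2:0: [sdo] tag#1652 ASC=0x4 ASCQ=0x0
Sep 21 16:47:26 VOID kernel: sd 12:0:2:0: [sdo] tag#1652 CDB: opcode=0x88 88 00 00 00 00 00 04 29 0f 48 00 00 04 00 00 00
Sep 21 16:47:26 VOID kernel: I/O error, dev sdo, sector 69799752 op 0x0:(READ) flags 0x0 phys_seg 128 prio class 0
"""


def decode_read16(cdb_hex):
    """Decode a READ(16) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 16 or cdb[0] != 0x88:
        raise ValueError("not a READ(16) CDB")
    return {
        "lba": int.from_bytes(cdb[2:10], "big"),      # bytes 2-9: starting block
        "blocks": int.from_bytes(cdb[10:14], "big"),  # bytes 10-13: transfer length
    }


def summarize(log):
    """Pull the sense key, ASC/ASCQ and CDB out of kernel log text."""
    key = int(re.search(r"Sense Key : (0x[0-9a-fA-F]+)", log).group(1), 16)
    asc_m = re.search(r"ASC=(0x[0-9a-fA-F]+) ASCQ=(0x[0-9a-fA-F]+)", log)
    asc, ascq = int(asc_m.group(1), 16), int(asc_m.group(2), 16)
    cdb_m = re.search(r"CDB: opcode=0x[0-9a-fA-F]+ ((?:[0-9a-fA-F]{2} ?)+)", log)
    result = decode_read16(cdb_m.group(1).strip())
    result["sense_key"] = SENSE_KEYS.get(key, hex(key))
    result["asc_ascq"] = ASC_ASCQ.get((asc, ascq), f"{asc:#x}/{ascq:#x}")
    return result


if __name__ == "__main__":
    for name, value in summarize(LOG).items():
        print(f"{name}: {value}")
```

Run against the lines above, this gives a starting block of 69799752, which matches the sector in the kernel's I/O error line, and a transfer length of 1024 blocks. NOT READY is a different sense key from a medium error (0x3), so the drive was reporting that it was not accessible rather than that a particular sector was unreadable. On its own that does not say whether the disk, the cabling, power, or the controller was at fault.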

 

Now it's like the webui is half broken? Neither the resume nor the cancel button for the check works. Firefox acts like it's loading (the refresh symbol changes to stop) but nothing ever happens.

 

I tried to change the spin down delay on the failed disk so I could run an extended SMART test, and the apply button doesn't work.

 

I tried to run a short SMART test and the button for it doesn't work either.

 

I guess at this point I just restart the server to try to restore some semblance of functionality?

void-diagnostics-20220921-1650.zip

Edited by weirdcrap
18 minutes ago, trurl said:

No SMART report for disk13, but disabled/emulated disk13 is mounted and 91% full.

 

I guess you will have to shutdown and check connections

 

 

I'm actually remote and probably won't be able to get to it for about a week. I was hoping I could give it a reboot in the meantime to at least fix the webui so I can run a SMART test and see if the disk has failed or something happened with the connections. I was also hoping someone like johnny might have some insight into what the controller errors point to.

 

I guess to play it safe I should just try to shut it down until I can get home.

 

How do I initiate a clean power down from the terminal since the webui seems to be borked?

Edited by weirdcrap

So this continues to get stranger.

 

I got home and ran a short SMART test on the disk and it passed. I transferred it to another PC to do a long SMART test because I had issues with it not spinning down in UD. It passed that as well. Despite it passing I opted to replace the disk anyway.

 

I precleared the new disk and started the parity rebuild, in a different slot than the previous disk and everything. Not even an hour later it has also failed, throwing a ton of errors. What gives?

 

If it was a cabling issue it shouldn't have followed me to another slot...

void-diagnostics-20220925-2144.zip

 

EDIT: CA AppData backup apparently got real pissed that I didn't have the array started for a few days (it may have tried to run a backup) and wouldn't let me stop the array. I ended up having to force the system down so I could move the drive to another slot and try again.

 

EDIT2: So there's something really strange going on here. Now every time I try to stop the array it hangs on unmounting the disk shares. Last time it was the CA user share, this time it's the cache drive.

syslog.txt

 

EDIT3: I'm giving a rebuild one more go in another slot, on an entirely different RAID controller this time. If it fails again, hopefully someone here has a better idea of what's going on than I do.

Edited by weirdcrap
