Disk Issues - Since 6.9



10 minutes ago, JorgeB said:

Same thing, because:

 

Because of how parity1 works, having an odd number of devices can create what looks like a valid (or invalid) filesystem on the parity disk, and with btrfs this can cause more issues because it will be detected in the scan:

 

 

 

Okay, can you expand on this? And once again, THANKS so much for your help.
 

I change to Parity 2, and things start working again.

Then I add another disk to make it even in the array, re-sync.. will parity 2 now be messed up?

 

Honestly, it makes no sense to me. Isn't the whole point of Unraid that you can mix and match disks? I can't see anything in the documentation about an odd number of disks.

You also said it can create a valid "file system"; I'm just trying to make sense of it.


So it's possible I could clear that disk, put it back in, rebuild, and everything could be fine?
 


 

9 minutes ago, G Speed said:

Then I add another disk to make it even in the array, re-sync.. will parity 2 now be messed up?

No, parity2 is calculated in a different way, so that should never happen.
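
For anyone wondering what "a different way" could look like: the thread doesn't spell out how parity2 is computed, so the sketch below assumes a RAID-6 style Q parity, where each disk's bytes are multiplied in GF(2^8) by a slot-dependent coefficient before being XORed together. The generator 2, the polynomial 0x11D and the helper names `gf_mul` and `q_parity` are assumptions for illustration, not Unraid's actual code. Under that assumption, disks that all carry the same btrfs magic bytes at the same offset don't reproduce that magic on the parity disk.

```python
# 8-byte magic that btrfs stores in its superblock (ASCII "_BHRfS_M").
BTRFS_MAGIC = b"_BHRfS_M"


def gf_mul(a, b, poly=0x11D):
    """Multiply two bytes in GF(2^8) using the reduction polynomial `poly`."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return result


def q_parity(blocks):
    """Assumed RAID-6 style Q syndrome: XOR of g^i * block_i with g = 2."""
    q = bytearray(len(blocks[0]))
    coeff = 1  # g^0 for the first slot
    for block in blocks:
        for pos, byte in enumerate(block):
            q[pos] ^= gf_mul(coeff, byte)
        coeff = gf_mul(coeff, 2)  # next slot gets the next power of g
    return bytes(q)


# Hypothetical worst case: every data disk holds the same magic bytes.
for n in (2, 3, 4, 5):
    q = q_parity([BTRFS_MAGIC] * n)
    print(n, "data disks ->", q == BTRFS_MAGIC)
```

With 2 to 5 identical blocks this prints False every time, because the weighted sum never collapses back to the original bytes.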

 

10 minutes ago, G Speed said:

I can't see anything in the documentation about an odd number of disks.

This is a very rare situation, it can only happen with an odd number of array devices, but it won't happen every time you have an odd number of devices.
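
A rough way to picture why the number of devices matters, assuming parity1 is a plain byte-wise XOR across the data disks (the usual single-parity scheme, not spelled out in this thread) and assuming the worst case where every data disk holds the same btrfs magic bytes at the same offset. The helper names are made up for the example.

```python
from functools import reduce

# 8-byte magic that btrfs stores in its superblock (ASCII "_BHRfS_M").
BTRFS_MAGIC = b"_BHRfS_M"


def xor_parity(blocks):
    """Return the byte-wise XOR of equally sized blocks (single-parity style)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_looks_like_btrfs(num_data_disks):
    """True if the parity bytes at the magic's offset equal the btrfs magic.

    Hypothetical worst case: every data disk holds the same magic bytes at
    the same offset, which is an assumption made for illustration.
    """
    blocks = [BTRFS_MAGIC] * num_data_disks
    return xor_parity(blocks) == BTRFS_MAGIC


for n in (2, 3, 4, 5):
    print(n, "data disks ->", parity_looks_like_btrfs(n))
```

This prints True for 3 and 5 disks and False for 2 and 4: an odd count of identical blocks XORs back to the block itself, while an even count cancels to zeros. Real arrays rarely line up this neatly, which fits with it not happening every time.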

 

10 minutes ago, G Speed said:

So it's possible I could clear that disk, put it back in, rebuild, and everything could be fine?

If after that you assign it as parity1 it will be the same, since it's the parity information calculated from your current devices that is creating the issue.

 

 

 

36 minutes ago, JorgeB said:

No, parity2 is calculated in a different way, so that should never happen.

 

This is a very rare situation, it can only happen with an odd number of array devices, but it won't happen every time you have an odd number of devices.

 

If after that you assign it as parity1 it will be the same, since it's the parity information calculated from your current devices that is creating the issue.

 

 

 

 

So it's just a one-off thing?


So I could swap out one of my 3TB drives for a 4TB and it could potentially be fine?


Could this have been caused by parity and the other disks sitting on my H310 while it wasn't flashed properly?
It's now properly in IT mode. Is that possible?

Parity and Disks 1-7: on the H310
Disks 8-9 and cache 1-2: onboard

 

Edited by G Speed
3 minutes ago, G Speed said:

So I could swap out one of my 3TB drives for a 4TB and it could potentially be fine?

Swapping just a disk likely won't change anything, but if for example you added two new disks it might not be an issue anymore.

 

It is strange that the error was there before without causing any issues, and that it started causing them with the new controller, maybe because the IDs changed order.

40 minutes ago, JorgeB said:

Swapping just a disk likely won't change anything, but if for example you added two new disks it might not be an issue anymore.

 

It is strange that the error was there before without causing any issues, and that it started causing them with the new controller, maybe because the IDs changed order.

 

Exactly, I had no issues with my setup. I even recently swapped out a 1TB for a 2TB and used parity to rebuild.
That was the exact same hardware, running 6.8.3 IIRC.

It worked without a hitch.

That's why I'm not sure parity 2 is the ONLY way to go.

Edited by G Speed

So this is interesting: I put the parity disk back into slot 1, then assigned it to Parity 1.

It gave me a blue square and said it needs to do a rebuild, obviously.


I pressed start and got that error again. ALSO, I noticed the same original problem on Disk 4:

 

Power-on or device reset occurred

 

I shut down the array, then reseated the drive and replugged all the cables etc. for Disk 4 ("no power drive").
I started up the system and removed parity, so it went to unassigned devices, and formatted it as NTFS, which took 5 seconds.

I threw it back into the array as Parity 1. It asked for a rebuild: ZERO ERRORS!

 

Mar 22 15:02:42 Servo emhttpd: shcmd (112): udevadm settle
Mar 22 15:02:42 Servo emhttpd: writing GPT on disk (sdm), with partition 1 byte offset 32KiB, erased: 0
Mar 22 15:02:42 Servo emhttpd: shcmd (113): sgdisk -Z /dev/sdm
Mar 22 15:02:42 Servo kernel: sdm: sdm1
Mar 22 15:02:43 Servo root: GPT data structures destroyed! You may now partition the disk using fdisk or
Mar 22 15:02:43 Servo root: other utilities.
Mar 22 15:02:43 Servo emhttpd: shcmd (114): sgdisk -o -a 8 -n 1:32K:0 /dev/sdm
Mar 22 15:02:43 Servo kernel: sdm: sdm1
Mar 22 15:02:44 Servo root: Creating new GPT entries in memory.
Mar 22 15:02:44 Servo root: The operation has completed successfully.
Mar 22 15:02:44 Servo emhttpd: shcmd (115): udevadm settle
Mar 22 15:02:44 Servo kernel: sdm: sdm1

 

Now just waiting to see whether I get any errors once the parity rebuild is done and I reboot,

and whether Disk 4 throws any more errors :(

Mar 22 16:49:06 Servo kernel: mpt2sas_cm0: log_info(0x31120311): originator(PL), code(0x12), sub_code(0x0311)
Mar 22 16:49:06 Servo kernel: sd 1:0:5:0: Power-on or device reset occurred
Mar 22 16:49:07 Servo kernel: sd 1:0:5:0: Power-on or device reset occurred
Mar 22 17:32:01 Servo kernel: mpt2sas_cm0: log_info(0x31120311): originator(PL), code(0x12), sub_code(0x0311)
Mar 22 17:32:02 Servo kernel: sd 1:0:5:0: Power-on or device reset occurred
Mar 22 17:32:02 Servo kernel: sd 1:0:5:0: Power-on or device reset occurred
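
In case it helps when comparing entries like these: the driver already prints the decoded parts of each `log_info` word (originator, code, sub_code), and the sketch below just reproduces that split from the raw hex value. The bit layout (4-bit bus type, 4-bit originator, 8-bit code, 16-bit sub_code) and the originator names other than PL are assumptions taken from how the Linux driver appears to decode it, and `decode_log_info` is a made-up helper name.

```python
def decode_log_info(value):
    """Split a 32-bit mpt2sas log_info word into its assumed bit fields."""
    originators = {0: "IOP", 1: "PL", 2: "IR"}  # assumed mapping
    originator = (value >> 24) & 0xF
    return {
        "bus_type": (value >> 28) & 0xF,
        "originator": originators.get(originator, str(originator)),
        "code": (value >> 16) & 0xFF,
        "sub_code": value & 0xFFFF,
    }


info = decode_log_info(0x31120311)
print(info["originator"], hex(info["code"]), hex(info["sub_code"]))
```

For `0x31120311` this gives originator PL, code 0x12 and sub_code 0x311, matching what the kernel line itself reports.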


It has begun... :(
Same two drives, Disk 4 and Disk 7.

They share the same cable, same hot-swap backplane, same power.

Yet Disk 5 and 6 are fine?

11 hours ago, JorgeB said:

Swap cables/slots between those disks and see where the problem follows.


Parity finished, rebooted, started the array. The checksum error is back, so you're right: in my case I need to stick with parity 2.

 

As for the cables/slots: I moved the two troubled drives over to the motherboard controller and into a different backplane.

 


 

 

Mar 27 00:29:53 Servo emhttpd: spinning down /dev/sdd
Mar 27 01:04:00 Servo kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
Mar 27 01:04:00 Servo kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
Mar 27 01:04:00 Servo kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
Mar 27 01:04:00 Servo kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
Mar 27 01:04:00 Servo kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
Mar 27 01:04:00 Servo kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
Mar 27 01:04:00 Servo kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
Mar 27 01:04:00 Servo kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
Mar 27 01:04:00 Servo kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
Mar 27 01:04:00 Servo kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
Mar 27 01:04:00 Servo kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
Mar 27 01:04:00 Servo kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
Mar 27 01:04:00 Servo kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
Mar 27 01:04:00 Servo kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
Mar 27 01:04:00 Servo kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
Mar 27 01:04:00 Servo kernel: sd 7:0:0:0: Power-on or device reset occurred
Mar 27 01:04:01 Servo kernel: sd 7:0:0:0: Power-on or device reset occurred
Mar 27 01:44:09 Servo kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
Mar 27 01:44:09 Servo kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
Mar 27 01:44:09 Servo kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
Mar 27 01:44:09 Servo kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
Mar 27 01:44:09 Servo kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
Mar 27 01:44:09 Servo kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
Mar 27 01:44:09 Servo kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
Mar 27 01:44:09 Servo kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
Mar 27 01:44:09 Servo kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
Mar 27 01:44:09 Servo kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
Mar 27 01:44:09 Servo kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
Mar 27 01:44:09 Servo kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
Mar 27 01:44:09 Servo kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
Mar 27 01:44:09 Servo kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
Mar 27 01:44:09 Servo kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
Mar 27 01:44:09 Servo kernel: sd 7:0:0:0: Power-on or device reset occurred
Mar 27 01:44:10 Servo kernel: sd 7:0:0:0: Power-on or device reset occurred
Mar 27 01:44:19 Servo kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
Mar 27 01:44:19 Servo kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
Mar 27 01:44:19 Servo kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
Mar 27 01:44:19 Servo kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
Mar 27 01:44:19 Servo kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
Mar 27 01:44:19 Servo kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
Mar 27 01:44:19 Servo kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
Mar 27 01:44:19 Servo kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
Mar 27 01:44:19 Servo kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
Mar 27 01:44:19 Servo kernel: mpt2sas_cm0: log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)
Mar 27 01:44:20 Servo kernel: sd 7:0:0:0: Power-on or device reset occurred
Mar 27 01:44:20 Servo kernel: sd 7:0:0:0: Power-on or device reset occurred
Mar 27 02:03:24 Servo emhttpd: read SMART /dev/sdd

 

Still happening, and the two drives have been moved to a different backplane, on different cables, and to a different port (the motherboard).

Also, a different H310 is installed now.

 

SMART looks fine for the two drives that were moved to the other backplane.

 

Disk 4 and Disk 7: the original problem drives, now moved to the motherboard
Disk 8 and Disk 9: now in the "old problem spot"
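
To check whether the resets follow the drives or stay with the slot after a swap like this, one option is to tally the "Power-on or device reset occurred" lines per SCSI address. This is only a sketch: `count_resets` is a made-up helper, the regular expression assumes the exact line format seen in the excerpts above, and the sample is trimmed to three of those lines.

```python
import re
from collections import Counter

# Matches kernel lines such as:
#   "... kernel: sd 7:0:0:0: Power-on or device reset occurred"
RESET_RE = re.compile(
    r"kernel: sd (\d+:\d+:\d+:\d+): Power-on or device reset occurred"
)


def count_resets(log_text):
    """Count 'Power-on or device reset occurred' lines per SCSI address."""
    return Counter(m.group(1) for m in RESET_RE.finditer(log_text))


sample = """\
Mar 22 16:49:06 Servo kernel: sd 1:0:5:0: Power-on or device reset occurred
Mar 22 16:49:07 Servo kernel: sd 1:0:5:0: Power-on or device reset occurred
Mar 27 01:04:00 Servo kernel: sd 7:0:0:0: Power-on or device reset occurred
"""

for address, hits in count_resets(sample).items():
    print(address, hits)
```

Running it over the full syslog from the diagnostics would show which addresses collect the resets; mapping an address back to a disk slot still has to be done by hand.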

servo-diagnostics-20210327-0824.zip
