Moving parity drives from external USB to (shucked) internal SATA


-C-
Go to solution Solved by Kilrah,

22 minutes ago, -C- said:

Is a parity sync achieved by keeping the "Write corrections to parity" checked before clicking the Check button?

A parity sync is what runs when you add/replace the parity drive or after a new config. When you click on the button next to array stop it's always a parity check; correct or non correct are the only options, you need to run a correcting check.

Link to comment
2 minutes ago, JorgeB said:

correct or non correct are the only options, you need to run a correcting check.

OK, and that's achieved by leaving the "Write corrections to parity" box checked before I click the Check button?

 

As I said, that's what I thought I'd done on the one that finished on the 21st. I need to be clear that's the way to do it so I'm not wasting another 2 days+ on the wrong type of parity check... Thanks

Link to comment
24 minutes ago, -C- said:

OK, and that's achieved by leaving the "Write corrections to parity" box checked before I click the Check button?

 

As I said, that's what I thought I'd done on the one that finished on the 21st. I need to be clear that's the way to do it so I'm not wasting another 2 days+ on the wrong type of parity check... Thanks

If you have the parity check tuning plugin installed then it should be including in the history information whether the check was correcting or not.

Link to comment

Happy New Year all

 

It's just finished, found 2 errors again as expected, but listed the same in history as before, as a check without correction.

 

On my system (6.11.1) the box is checked by default. I saw @trurl's last post after starting the sync, so stopped it and refreshed the Main page to double-check and this is what mine looks like by default:

image.png.df623c488ef5647185fcdd6eeba25aa7.png

 

So I unchecked it and re-checked it, just to be sure, and started it again. But that hasn't made any difference- it has run a check again and not corrected the errors. This confirms what I was fairly sure of previously, that I'd run with this box checked on the check that finished on the 21st.

 

image.thumb.png.7100468f4bbb9ec4818cbc26d519da85.png

 

I'm getting a lot of these in my syslog while the server's sitting idle:

Jan  1 18:05:39 Tower  rc.diskinfo[6358]: PHP Warning: strpos(): Empty needle in /usr/local/emhttp/plugins/unassigned.devices.preclear/scripts/rc.diskinfo on line 413
Jan  1 18:05:47 Tower  rc.diskinfo[6742]: PHP Warning: strpos(): Empty needle in /usr/local/emhttp/plugins/unassigned.devices.preclear/scripts/rc.diskinfo on line 413
Jan  1 18:05:54 Tower  rc.diskinfo[7030]: PHP Warning: strpos(): Empty needle in /usr/local/emhttp/plugins/unassigned.devices.preclear/scripts/rc.diskinfo on line 413
Jan  1 18:06:02 Tower  rc.diskinfo[7807]: PHP Warning: strpos(): Empty needle in /usr/local/emhttp/plugins/unassigned.devices.preclear/scripts/rc.diskinfo on line 413
Jan  1 18:06:09 Tower  rc.diskinfo[8114]: PHP Warning: strpos(): Empty needle in /usr/local/emhttp/plugins/unassigned.devices.preclear/scripts/rc.diskinfo on line 413
Jan  1 18:06:17 Tower  rc.diskinfo[8598]: PHP Warning: strpos(): Empty needle in /usr/local/emhttp/plugins/unassigned.devices.preclear/scripts/rc.diskinfo on line 413
Jan  1 18:06:24 Tower  rc.diskinfo[9008]: PHP Warning: strpos(): Empty needle in /usr/local/emhttp/plugins/unassigned.devices.preclear/scripts/rc.diskinfo on line 413
Jan  1 18:06:32 Tower  rc.diskinfo[9357]: PHP Warning: strpos(): Empty needle in /usr/local/emhttp/plugins/unassigned.devices.preclear/scripts/rc.diskinfo on line 413
Jan  1 18:06:39 Tower  rc.diskinfo[9738]: PHP Warning: strpos(): Empty needle in /usr/local/emhttp/plugins/unassigned.devices.preclear/scripts/rc.diskinfo on line 413
Jan  1 18:06:46 Tower  rc.diskinfo[10176]: PHP Warning: strpos(): Empty needle in /usr/local/emhttp/plugins/unassigned.devices.preclear/scripts/rc.diskinfo on line 413
Jan  1 18:06:48 Tower  rc.diskinfo[10318]: PHP Warning: strpos(): Empty needle in /usr/local/emhttp/plugins/unassigned.devices.preclear/scripts/rc.diskinfo on line 413
Jan  1 18:07:03 Tower  rc.diskinfo[10980]: PHP Warning: strpos(): Empty needle in /usr/local/emhttp/plugins/unassigned.devices.preclear/scripts/rc.diskinfo on line 413
Jan  1 18:07:03 Tower  rc.diskinfo[11040]: PHP Warning: strpos(): Empty needle in /usr/local/emhttp/plugins/unassigned.devices.preclear/scripts/rc.diskinfo on line 413

 

I've looked into what could be causing this, and not found anything yet. I don't currently have any unassigned devices connected.

 

I've also been getting a GUI crash, seemingly due to unassigned devices, when running rsync manually, or via the unbalance plugin to move files around. Please see my post here for more info.

 

Could these be causing issues with me not being able to run the parity check as correcting?

Link to comment
24 minutes ago, trurl said:

Those are from the preclear addon for UD. If you aren't using it you could see if removing it helps.

 

Thanks. I've removed it, rebooted and started the parity check again. It looks like this:

image.png.06bf390062388b0dca7e93ca12f581b3.png

 

Is there any way to confirm that it's running as a correcting check without having to wait for it to complete?

Link to comment
37 minutes ago, -C- said:

Is there any way to confirm that it's running as a correcting check without having to wait for it to complete?

If by any chance you have the Parity Check Tuning plugin installed then from a console command line you could use the command:

   parity.check status

and that will tell you.

Link to comment
2 hours ago, itimpi said:

If by any chance you have the Parity Check Tuning plugin installed

I do indeed- how could I not be making use of such a fine piece of software? ; P

 

Thanks so much, that's confirmed:

root@Tower:~# parity.check status
DEBUG:   Manual Correcting Parity Check running
Status: Manual Manual Correcting Parity Check (9.6% completed)
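
For anyone who wants to script this check, here is a minimal Python sketch that classifies a status line like the one above. It relies only on the wording that appears in this thread ("Correcting" vs "Non-Correcting"). The function name `classify_status` is made up for illustration, and the plugin's real output may differ between versions.

```python
import re


def classify_status(text: str) -> str:
    """Return 'non-correcting', 'correcting' or 'unknown' for a status text.

    Assumption: the text uses the wording seen in this thread.
    This is not an official Parity Check Tuning API.
    """
    # Test the longer phrase first, because "Non-Correcting" contains "Correcting".
    if re.search(r"non-correcting", text, re.IGNORECASE):
        return "non-correcting"
    if re.search(r"correcting", text, re.IGNORECASE):
        return "correcting"
    return "unknown"


# Status line copied from the console output above.
sample = "Status: Manual Manual Correcting Parity Check (9.6% completed)"
print(classify_status(sample))
```

The text passed in would be whatever `parity.check status` prints on the console; how you capture it (copy and paste, a redirect to a file) is up to you.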

 

P.S. Be super great if this info could be displayed on the dashboard within the Parity block.

 

Link to comment
4 hours ago, -C- said:

P.S. Be super great if this info could be displayed on the dashboard within the Parity block.

I would love to do this but I have not found an easy way to do this.  
 

I might be able to submit a PR that would display this information in the status line while the check is running, as that would then work even without the plugin installed.

Link to comment
On 1/2/2023 at 4:00 AM, itimpi said:

I would love to do this but I have not found an easy way to do this.  
 

I might be able to submit a PR that would display this information in the status line while the check is running, as that would then work even without the plugin installed.

That would be great 🤞

Link to comment

Parity check's now complete, but I don't see any difference- The status:

image.png.4995d7a040bafef847ff6bc255e88a68.png

...and the history:

image.thumb.png.b74baf6947e5bc73d2e4de2333afd6db.png

 

both look the same as the previous non-correcting checks, with no mention of the errors having been corrected, which is of concern considering what @itimpi said previously:

On 12/30/2022 at 6:14 PM, itimpi said:

If you have the parity check tuning plugin installed then it should be including in the history information whether the check was correcting or not.

 

However, in the syslog I see this:

Jan  3 12:48:01 Tower Parity Check Tuning: DEBUG:   Manual Correcting Parity Check running
Jan  3 12:49:22 Tower kernel: md: recovery thread: PQ corrected, sector=39063584664
Jan  3 12:49:22 Tower kernel: md: recovery thread: PQ corrected, sector=39063584696
Jan  3 12:49:22 Tower kernel: md: sync done. time=148056sec
Jan  3 12:49:22 Tower kernel: md: recovery thread: exit status: 0
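
As a rough aid for reading excerpts like this one, the Python sketch below pulls the affected sectors and the sync duration out of the kernel lines. The regular expressions are inferred purely from the lines quoted in this thread (only `P` and `PQ` actually appear here; accepting a lone `Q` is an assumption), so treat it as a starting point rather than a parser for every md message.

```python
import re

# Kernel lines copied from the syslog excerpt above.
LOG = """\
Jan  3 12:49:22 Tower kernel: md: recovery thread: PQ corrected, sector=39063584664
Jan  3 12:49:22 Tower kernel: md: recovery thread: PQ corrected, sector=39063584696
Jan  3 12:49:22 Tower kernel: md: sync done. time=148056sec
Jan  3 12:49:22 Tower kernel: md: recovery thread: exit status: 0
"""

# Matches e.g. "md: recovery thread: PQ corrected, sector=39063584664".
# "Q" on its own is an assumption; the thread only shows "P" and "PQ".
EVENT = re.compile(r"md: recovery thread: (PQ|P|Q) (corrected|incorrect), sector=(\d+)")
DONE = re.compile(r"md: sync done\. time=(\d+)sec")


def summarize(log: str):
    """Return ([(parity, action, sector), ...], duration_in_seconds_or_None)."""
    events = [(m.group(1), m.group(2), int(m.group(3))) for m in EVENT.finditer(log)]
    done = DONE.search(log)
    seconds = int(done.group(1)) if done else None
    return events, seconds


events, seconds = summarize(LOG)
for parity, action, sector in events:
    print(f"{parity} {action} at sector {sector}")
if seconds is not None:
    hours, rem = divmod(seconds, 3600)
    print(f"sync took {hours}h {rem // 60}m")  # 148056 s is 41h 7m
```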

 

Can I now consider the Parity valid?

 

Should I now run a non-correcting check as @trurl recommended previously?

 

 

Link to comment

Dang it- I'm caught in a loop!

 

image.png.c68ec2bdcaca21406185549aa67e1364.png

image.thumb.png.9f6685ffe4998e228b35e206a8c8bf34.png


Jan  6 04:51:01 Tower Parity Check Tuning: DEBUG:   Manual Non-Correcting Parity Check running
Jan  6 05:00:01 Tower Parity Check Tuning: DEBUG:   Manual Non-Correcting Parity Check running
Jan  6 05:02:08 Tower kernel: md: recovery thread: PQ incorrect, sector=39063584664
Jan  6 05:02:08 Tower kernel: md: recovery thread: PQ incorrect, sector=39063584696
Jan  6 05:02:08 Tower kernel: md: sync done. time=25740sec
Jan  6 05:02:08 Tower kernel: md: recovery thread: exit status: 0

 

I'm not sure why this one took so much longer than previous checks to complete. At times it sounded like it was random seeking for hours, with very slow read speeds. Have been avoiding using the array during these checks as much as possible- barely any writes and only a few reads.

tower-diagnostics-20230106-1229.zip

Link to comment
1 hour ago, JorgeB said:

This is usually controller or disk, unlikely to be the onboard controller, did you just move parity or other disks also? Were previous checks error free?

I first set this system up on an older PC with the 2 parity drives connected via USB. With that setup everything worked OK. I didn't do a check, but the sync completed successfully after I added the 2nd parity.

 

I then moved the parity drives from USB to SATA with the new system- I checked parity as Unraid was seeing them as different drives due to being connected directly rather than via their USB controllers.

That check and all that I've done since they've been connected directly have come back with 2 errors, but only the check that finished on the 3rd was definitely run as a correcting check, confirmed by the syslog saying that the 2 problematic sectors (39063584664 & 39063584696) had been corrected. These are the same sectors listed with errors in today's result.
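
To make the "same sectors" comparison less error-prone than reading the numbers by eye, here is a small Python sketch that intersects the sectors reported as corrected on the 3rd with those reported as incorrect on the 6th. The helper name `sectors` and the regular expression are illustrative assumptions based only on the log lines quoted in this thread.

```python
import re

# Pattern inferred from the kernel lines quoted in this thread.
SECTOR = re.compile(r"recovery thread: \w+ (corrected|incorrect), sector=(\d+)")


def sectors(log: str, action: str) -> set:
    """Collect the sector numbers reported with the given action."""
    return {int(m.group(2)) for m in SECTOR.finditer(log) if m.group(1) == action}


jan3 = """\
Jan  3 12:49:22 Tower kernel: md: recovery thread: PQ corrected, sector=39063584664
Jan  3 12:49:22 Tower kernel: md: recovery thread: PQ corrected, sector=39063584696
"""
jan6 = """\
Jan  6 05:02:08 Tower kernel: md: recovery thread: PQ incorrect, sector=39063584664
Jan  6 05:02:08 Tower kernel: md: recovery thread: PQ incorrect, sector=39063584696
"""

# Sectors that were reported as corrected but showed up as incorrect again later.
recurring = sorted(sectors(jan3, "corrected") & sectors(jan6, "incorrect"))
print(recurring)
```

A non-empty result means the correction reported by the earlier check did not stick, which is the situation described here.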

 

Not sure if it's relevant, but while the check's in progress I see it run through the drives and finish with each one in order of size as expected. My biggest array drive is 18TB, the 2 parities 20TB. It seems like these errors are being found after it's finished checking the 18TB, so the error is somewhere in that last 2TB.

Link to comment
11 minutes ago, -C- said:

It seems like these errors are being found after it's finished checking the 18TB, so the error is somewhere in that last 2TB.

Yes, they are really close to the end, at the 19 531 792 332K mark. That's just before the end of the parity disks, whose total size is 19 531 825 100K. Good news is that this means the problem cannot be any of the data disks; on the other hand both parity and parity2 being incorrect is strange. You could maybe try with just parity1 and, if the result is the same, with just parity2.
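
The figures above can be reproduced with a few lines of Python. This sketch assumes the `sector=` values in the syslog count 512-byte sectors; that assumption is not stated in the thread, but it yields exactly the 19 531 792 332K mark quoted above.

```python
SECTOR_BYTES = 512        # assumption: syslog sector numbers are 512-byte sectors
TOTAL_K = 19_531_825_100  # parity disk size in K, as quoted above

for sector in (39063584664, 39063584696):
    pos_k = sector * SECTOR_BYTES // 1024  # position on the disk in K
    remaining_k = TOTAL_K - pos_k          # distance from the end of the disk in K
    print(f"sector {sector}: {pos_k}K, {remaining_k}K before the end")
```

Under that assumption the two sectors sit 32768K and 32752K before the end of the parity disks, far beyond the end of the largest (18TB) data disk.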

Link to comment
7 hours ago, JorgeB said:

good news is that this means the problem cannot be any of the data disks

That is somewhat comforting : )

 

7 hours ago, JorgeB said:

on the other hand both parity and parity2 being incorrect is strange. You could maybe try with just parity1 and, if the result is the same, with just parity2.

 

OK- willing to try anything to get this back to working.

 

Is the process to stop the array, remove the parity 2 disk, restart the array and run a correcting parity check?

 

Edit: have now done as above, will report back

 

Edited by -C-
Update
Link to comment

Correcting check with parity 2 disconnected finished earlier. It says it corrected errors on the same 2 sectors as previously:

 

Jan  8 16:18:01 Tower Parity Check Tuning: DEBUG:   Automatic Correcting Parity Check running
Jan  8 16:24:01 Tower Parity Check Tuning: DEBUG:   Automatic Correcting Parity Check running
Jan  8 16:30:01 Tower Parity Check Tuning: DEBUG:   Automatic Correcting Parity Check running
Jan  8 16:31:20 Tower kernel: md: recovery thread: P corrected, sector=39063584664
Jan  8 16:31:20 Tower kernel: md: recovery thread: P corrected, sector=39063584696
Jan  8 16:31:21 Tower kernel: md: sync done. time=144317sec
Jan  8 16:31:21 Tower kernel: md: recovery thread: exit status: 0

 

Yet the status and history are both saying there's still 2 errors:

 

image.png.d79c402cfd4d8c2c4a836b600c9d0ede.png

 

image.thumb.png.376103e4ce5c9ebe3f0fe192c3e53352.png

Link to comment

After the check that finished on the 8th, while attempting to remove the 2nd parity drive, the array wouldn't start back up. The message in the GUI footer was "Array Stopped... stale configuration".

From that I found this Reddit post, which pointed to the issue being due to using Firefox, which has been my daily driver since the Firebird days. I have seen a fair few "resend the last request" dialogues since I started using Unraid and always chosen the 'Cancel' option and it seemed there were no consequences. In the Reddit post they reference v6.11.5, yet I've been on v6.11.1 since I first tried Unraid. They also mention that this issue is related to making changes to the array, yet I'm fairly certain I've seen it at a few different places throughout the GUI. Someone there even found that their /mnt/user folder was missing until they downgraded to v6.11.4

 

To be clear- I have not been getting the "resend the last request" dialogue while attempting to get this parity check to complete successfully. If I had that would have rung alarm bells and I would have investigated why that dialogue was appearing.

 

I rebooted the server and the array started back up.

 

I keep a portable version of Chrome around for testing, so reran a correcting check (with only the parity 1 drive) using that, and had success:

image.thumb.png.bc6fe6b6cfeeccfc7cfe467e55ab7687.png

 

 

I've now reconnected the Parity 2 disk to the array and a parity sync is running.

 

Looks like I'm going to have to run 2 browsers until I hear that this Unraid incompatibility with Firefox has been rectified.

 

 

 

Edited by -C-
typo
Link to comment
