[Plugin] Parity Check Tuning



11 hours ago, Malachi said:

I've run into some behavior I'm not sure is expected. I was running a parity sync when appdata backup started. The sync operation was paused, as expected, but then not resumed when it finished. It was outside the increment window, but I would have only expected that to apply to parity checks, not syncs, especially since I have increments disabled for Parity-Sync/Data Rebuild.

 

297413068_2023-08-2901_05_53-Window.png.af96e0187a40da4829ae41d0288e9ede.png

 

2082369482_2023-08-2901_06_08-root@NAS__bash--login(NAS)Waterfox.thumb.png.9759692f59946ddd1557eeee0e4def97.png

It actually applies to all array operations, as the backup interacts badly with whatever operation is running.

 

The current release has a bug that stops the resume working as expected.  I have spent the last few days testing a release that fixes this, and if nothing comes up at the last moment the fix should be out tomorrow.  That release will also incorporate the notification behaviour I mentioned in my previous post.

Link to comment

Just pushed the release 2023-09-02 that includes:

  • Fix for correctly resuming if paused due to mover running
  • Reworked notifications as discussed earlier

Testing this took a little longer than anticipated, but hopefully that means I caught most edge cases.  There was quite a lot of code reorganization, so there is always the chance that a regression was introduced.

 

Please let me know if you spot/encounter any anomalous behaviour. 

Link to comment

I just updated to the latest version of this plugin and now am being spammed with notifications every six minutes - "Array operation not resumed - outside increment window".

 

I have tried turning off "Send notifications for Pause or Resume of increments" but I am still receiving notifications every 6 minutes.

 

My scheduled parity check is in progress but currently outside the increment window, so the notification is accurate. However, I would like a way to turn this off if possible, and only receive notifications when the check is resumed or paused.

 

Attached are screenshots displaying the above.

 

 

image2.png

image.png

Link to comment

OK - I will look at the not resumed message - it is meant to only occur once.

 

I also need to work out why you are getting the notifications at all if you have them turned off in the plugin settings.    I will look at getting a fix out for this ASAP as I suspect it will be obvious when I look into it.

 

EDIT:   I think I see the cause for both issues, and am currently testing it so expect a fix soon.

 

Link to comment

I ran an upgrade to Unraid 6.12.3 while a parity check was running. It paused, and resumed correctly after the upgrade when the resume time came around, and completed successfully after a couple of days.

 

But now, every day at 07:00 (the pause time) I get a Pushover notification saying "Paused. No array operation in progress (0.0% completed)".

 

Is there a way to get rid of this? Would editing the progress.save file be enough to fix it? (Turning "send notifications for pause or resume of increments" off did stop it, but I want those to be on when a check is actually running…)

 

Here's the parity.check.tuning.progress.save file:

 

type|date|time|sbSynced|sbSynced2|sbSyncErrs|sbSyncExit|mdState|mdResync|mdResyncPos|mdResyncSize|mdResyncCorr|mdResyncAction|Description
MANUAL|2023 Jul 26 06:34:22|1690349662|1690348691|0|0|0|STARTED|11718885324|129515364|11718885324|1|check P|Manual Correcting Parity-Check|
PAUSE|2023 Jul 26 07:00:06|1690351206|1690348691|0|0|0|STARTED|11718885324|334602096|11718885324|1|check P|Manual Correcting Parity-Check|
RESUME (MANUAL)|2023 Jul 26 10:24:39|1690363479|1690363219|0|0|0|STARTED|11718885324|369333984|11718885324|1|check P|Manual Correcting Parity-Check|
PAUSE|2023 Jul 27 01:00:54|1690416054|1690363219|0|0|0|STARTED|11718885324|6269949784|11718885324|1|check P|Manual Correcting Parity-Check|
PAUSE (MANUAL)|2023 Jul 27 01:30:18|1690417818|1690363219|1690416054|0|-4|STARTED|0|6270038284|11718885324|1|check P|Manual Correcting Parity-Check|
RESUME (MANUAL)|2023 Jul 27 06:24:22|1690435462|1690435247|0|0|0|STARTED|11718885324|6294077764|11718885324|1|check P|Manual Correcting Parity-Check|
PAUSE|2023 Jul 27 07:00:08|1690437608|1690435247|0|0|0|STARTED|11718885324|6526254048|11718885324|1|check P|Manual Correcting Parity-Check|
RESUME (MANUAL)|2023 Jul 27 12:30:39|1690457439|1690457240|0|0|0|STARTED|11718885324|6545790780|11718885324|1|check P|Manual Correcting Parity-Check|
PAUSE|2023 Jul 28 01:00:48|1690502448|1690457240|0|275|0|STARTED|11718885324|11664698612|11718885324|1|check P|Manual Correcting Parity-Check|
PAUSE (MANUAL)|2023 Jul 28 01:12:46|1690503166|1690457240|1690502448|275|-4|STARTED|0|11664770296|11718885324|1|check P|Manual Correcting Parity-Check|
RESUME (MANUAL)|2023 Jul 28 06:18:21|1690521501|1690521397|0|275|0|STARTED|11718885324|11674986128|11718885324|1|check P|Manual Correcting Parity-Check|
COMPLETED|2023 Jul 28 06:30:16|1690522216|1690521397|1690521948|275|0|STARTED|0|0|11718885324|1|check P|No array operation in progress|

 

And here's the parity.check.tuning.progress:

 

type|date|time|sbSynced|sbSynced2|sbSyncErrs|sbSyncExit|mdState|mdResync|mdResyncPos|mdResyncSize|mdResyncCorr|mdResyncAction|Description
SCHEDULED|2023 Aug 21 22:00:07|1692651607|1690521397|1690521948|275|0|STARTED|0|0|11718885324|1|check P|No array operation in progress|
PAUSE|2023 Aug 22 07:00:06|1692684006|1692651607|0|0|0|STARTED|11718885324|1975074812|11718885324|0|check P|Scheduled Non-Correcting Parity-Check|
RESUME|2023 Aug 22 22:30:08|1692739808|1692651607|1692684006|0|-4|STARTED|0|1975112588|11718885324|0|check P|Scheduled Non-Correcting Parity-Check|
PAUSE|2023 Aug 23 01:00:27|1692748827|1692739809|0|0|0|STARTED|11718885324|2939569160|11718885324|0|check P|Scheduled Non-Correcting Parity-Check|
PAUSE (MANUAL)|2023 Aug 23 01:12:19|1692749539|1692739809|1692748827|0|-4|STARTED|0|2939681292|11718885324|0|check P|Scheduled Non-Correcting Parity-Check|
RESUME|2023 Aug 23 22:30:08|1692826208|1692739809|1692748827|0|-4|STARTED|0|2939681292|11718885324|0|check P|Scheduled Non-Correcting Parity-Check|
PAUSE|2023 Aug 24 01:00:35|1692835235|1692826208|0|0|0|STARTED|11718885324|3717129116|11718885324|0|check P|Scheduled Non-Correcting Parity-Check|
PAUSE (MANUAL)|2023 Aug 24 01:30:19|1692837019|1692826208|1692835236|0|-4|STARTED|0|3717246472|11718885324|0|check P|Scheduled Non-Correcting Parity-Check|
RESUME|2023 Aug 24 22:30:08|1692912608|1692826208|1692835236|0|-4|STARTED|0|3717246472|11718885324|0|check P|Scheduled Non-Correcting Parity-Check|
PAUSE|2023 Aug 25 01:00:27|1692921627|1692912609|0|0|0|STARTED|11718885324|4750530368|11718885324|0|check P|Scheduled Non-Correcting Parity-Check|
PAUSE (MANUAL)|2023 Aug 25 01:18:23|1692922703|1692912609|1692921628|0|-4|STARTED|0|4750665580|11718885324|0|check P|Scheduled Non-Correcting Parity-Check|
RESUME|2023 Aug 25 22:30:07|1692999007|1692912609|1692921628|0|-4|STARTED|0|4750665580|11718885324|0|check P|Scheduled Non-Correcting Parity-Check|
PAUSE|2023 Aug 26 01:00:25|1693008025|1692999008|0|0|0|STARTED|11718885324|5863042816|11718885324|0|check P|Scheduled Non-Correcting Parity-Check|
PAUSE (MANUAL)|2023 Aug 26 01:12:17|1693008737|1692999008|1693008026|0|-4|STARTED|0|5863140880|11718885324|0|check P|Scheduled Non-Correcting Parity-Check|
STOPPING|2023 Aug 26 08:46:08|1693035968|1692999008|1693008026|0|-4|STARTED|0|5863140880|11718885324|0|check P|Scheduled Non-Correcting Parity-Check|
PAUSE (RESTART)|2023 Aug 26 08:46:09|1693035969|1692999008|1693008026|0|-4|STARTED|0|5863140880|11718885324|0|check P|Scheduled Non-Correcting Parity-Check|
RESUME (RESTART)|2023 Aug 26 08:57:18|1693036638|1693036628|0|0|0|STARTED|11718885324|5863827540|11718885324|0|check P|No array operation in progress|
RESUME|2023 Aug 26 22:30:07|1693085407|1693036628|1693036640|0|-4|STARTED|0|5864121344|11718885324|0|check P|Scheduled Non-Correcting Parity-Check|
PAUSE|2023 Aug 27 07:00:06|1693116006|1693085407|0|0|0|STARTED|11718885324|9213726048|11718885324|0|check P|Scheduled Non-Correcting Parity-Check|
RESUME|2023 Aug 27 22:30:08|1693171808|1693085407|1693116007|0|-4|STARTED|0|9213817272|11718885324|0|check P|Scheduled Non-Correcting Parity-Check|
PAUSE|2023 Aug 28 07:00:07|1693202407|1693171809|1693194257|0|0|STARTED|0|0|11718885324|0|check P|No array operation in progress|
PAUSE|2023 Aug 29 07:00:08|1693288808|1693171809|1693194257|0|0|STARTED|0|0|11718885324|0|check P|No array operation in progress|
PAUSE|2023 Aug 30 07:00:06|1693375206|1693171809|1693194257|0|0|STARTED|0|0|11718885324|0|check P|No array operation in progress|
PAUSE|2023 Aug 31 07:00:07|1693461607|1693171809|1693194257|0|0|STARTED|0|0|11718885324|0|check P|No array operation in progress|
PAUSE|2023 Sep 01 07:00:06|1693548006|1693171809|1693194257|0|0|STARTED|0|0|11718885324|0|check P|No array operation in progress|
PAUSE|2023 Sep 02 07:00:07|1693634407|1693171809|1693194257|0|0|STARTED|0|0|11718885324|0|check P|No array operation in progress|
PAUSE|2023 Sep 03 07:00:07|1693720807|1693171809|1693194257|0|0|STARTED|0|0|11718885324|0|check P|No array operation in progress|
PAUSE|2023 Sep 04 07:00:07|1693807207|1693171809|1693194257|0|0|STARTED|0|0|11718885324|0|check P|No array operation in progress|
PAUSE|2023 Sep 04 13:50:11|1693831811|1693171809|1693194257|0|0|STARTED|0|0|11718885324|0|check P|No array operation in progress|
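
For anyone who wants to inspect these files offline, the records are pipe-delimited with a header row, so they are easy to load into a script. The sketch below is not part of the plugin. It assumes the layout shown in the two dumps above (including the trailing "|" on the data rows), and the helper names parse_progress, percent_done and idle_pauses are made up for illustration. Treating mdResyncPos / mdResyncSize as the completion percentage is also an assumption, although it is consistent with the "0.0% completed" notification when mdResyncPos is 0.

```python
from typing import Dict, List

# Header plus two records copied from the progress file posted above.
SAMPLE = """\
type|date|time|sbSynced|sbSynced2|sbSyncErrs|sbSyncExit|mdState|mdResync|mdResyncPos|mdResyncSize|mdResyncCorr|mdResyncAction|Description
PAUSE|2023 Aug 27 07:00:06|1693116006|1693085407|0|0|0|STARTED|11718885324|9213726048|11718885324|0|check P|Scheduled Non-Correcting Parity-Check|
PAUSE|2023 Aug 28 07:00:07|1693202407|1693171809|1693194257|0|0|STARTED|0|0|11718885324|0|check P|No array operation in progress|
"""

IDLE = "No array operation in progress"


def parse_progress(text: str) -> List[Dict[str, str]]:
    """Parse pipe-delimited progress records into dicts keyed by header name."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    header = lines[0].split("|")
    records = []
    for ln in lines[1:]:
        fields = ln.split("|")
        # Data rows end with a trailing "|", which yields one extra empty field.
        if len(fields) == len(header) + 1 and fields[-1] == "":
            fields.pop()
        # Skip rows that still do not match the header rather than guess.
        if len(fields) == len(header):
            records.append(dict(zip(header, fields)))
    return records


def percent_done(rec: Dict[str, str]) -> float:
    """Assumed progress measure: position as a percentage of total size."""
    size = int(rec["mdResyncSize"])
    return 100.0 * int(rec["mdResyncPos"]) / size if size else 0.0


def idle_pauses(records: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return PAUSE records that were logged while nothing was running."""
    return [
        r for r in records
        if r["type"].startswith("PAUSE") and r["Description"] == IDLE
    ]


if __name__ == "__main__":
    for rec in parse_progress(SAMPLE):
        print(rec["type"], rec["date"], f"{percent_done(rec):.1f}%")
```

Running idle_pauses over the full second file would list the daily 07:00 entries from Aug 28 onwards, which are the ones that line up with the unwanted notifications.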

 

This is with the 2023.09.03 version of the plugin. I've attached diagnostics, and a syslog with logging set to Testing.

 

Thanks!

 

parity.txt
celestia-diagnostics-20230904-1400.zip

Link to comment

Deleting the progress file will not help.    What seems to be confusing things is a parity.tuning.restart file in the plugins folder on the flash drive - that should only be present immediately after booting if a restart of an array operation is pending.  Deleting this file should solve your immediate problem but it is not clear why it was there in the first place.  You do have the restart option enabled but that file should only get created if an array operation was actually in progress when the system was last shutdown/rebooted - I assume this was not the case?    
 

There is also a parity.tuning.scheduled file present which I would also expect to be removed after the parity check completes, but I do not see an entry in the parity.tuning.progress file that the completion was ever detected which probably also explains the parity.tuning.scheduled file still being there.   I assume it DID finish?

 

I will see if I can recreate the issue, but if the parity.tuning.restart file reappears after you delete it, I would love to know what led up to that.

 

EDIT:   I can confirm that I am getting some unexpected behaviour if a restart happens during the check, so that gives me something to look at.   The restart happens fine, but the plugin then seems to get a bit confused about the state of things and does not tidy up as it should.
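
A quick way to check for the leftover marker files described in this post is sketched below. The two file names come from the post itself. The folder is deliberately left as a parameter because the exact path of the plugin's folder on the flash drive is not given here, and leftover_markers is a hypothetical helper, not something the plugin provides.

```python
from pathlib import Path
from typing import List, Union

# Marker files named in the post above. Per that post, neither should
# linger once no array operation is pending or running.
MARKERS = ("parity.tuning.restart", "parity.tuning.scheduled")


def leftover_markers(plugin_dir: Union[str, Path]) -> List[str]:
    """Return the names of any marker files present in plugin_dir."""
    folder = Path(plugin_dir)
    return [name for name in MARKERS if (folder / name).is_file()]
```

Calling leftover_markers with the plugin's folder while no array operation is running should return an empty list. Anything it does return is a candidate for the manual deletion suggested above.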

 

 

Link to comment

I, too, am running into the multi-notification issue but just updated to the latest so I'd expect that to resolve.  I do however have another behavior that is plaguing me.

 

I've got my parity check scheduled from 12:30am - 7:30am with it pausing for updates, backups and mover which occur at 3am, 4am and 6am respectively.  What I can't figure out is the parity check pauses at 4am and does not resume until the next day at 12:30am.  This feels like a config issue and maybe not a bug.  Anyone have any ideas?  I can provide the diagnostics if it would help.

Screen Shot 2023-09-04 at 9.23.24 AM.png

Link to comment
1 hour ago, Mat W said:

What I can't figure out is the parity check pauses at 4am and does not resume until the next day at 12:30am.  This feels like a config issue and maybe not a bug.

This sounds like the recently fixed behaviour where an operation paused because mover or a backup was active did not resume when they completed.   If it is still occurring with the latest release then let me know.     You can also disable this type of pause/resume in the plugin settings, but you should not now need to.

Link to comment
33 minutes ago, itimpi said:

This sounds like the recently fixed behaviour where an operation paused because mover or a backup was active did not resume when they completed.   If it is still occurring with the latest release then let me know.     You can also disable this type of pause/resume in the plugin settings, but you should not now need to.

 

Will do, appreciate the quick reply.

Link to comment
11 hours ago, itimpi said:

Deleting the progress file will not help.    What seems to be confusing things is a parity.tuning.restart file in the plugins folder on the flash drive - that should only be present immediately after booting if a restart of an array operation is pending.  Deleting this file should solve your immediate problem but it is not clear why it was there in the first place.  You do have the restart option enabled but that file should only get created if an array operation was actually in progress when the system was last shutdown/rebooted - I assume this was not the case?    
 

There is also a parity.tuning.scheduled file present which I would also expect to be removed after the parity check completes, but I do not see an entry in the parity.tuning.progress file that the completion was ever detected which probably also explains the parity.tuning.scheduled file still being there.   I assume it DID finish?

 

I will see if I can recreate the issue, but if the parity.tuning.restart file reappears after you delete it, I would love to know what led up to that.

 

EDIT:   I can confirm that I am getting some unexpected behaviour if a restart happens during the check, so that gives me something to look at.   The restart happens fine, but the plugin then seems to get a bit confused about the state of things and does not tidy up as it should.

 

 

 

I also get "extra" notifications. That is, I had an unclean shutdown (an issue with a VM). I resolved the situation but now I get the following notification every time I start the array after a reboot even though the shutdowns have been clean.

"Event: Parity Check Tuning
Subject: [XXX] Automatic unRaid Parity-Check will be started
Description: Unclean shutdown detected
Importance: warning"

 

Additionally, it does not start a parity check, I just get the notification. 

 

Link to comment
6 hours ago, Ruato said:

 

I also get "extra" notifications. That is, I had an unclean shutdown (an issue with a VM). I resolved the situation but now I get the following notification every time I start the array after a reboot even though the shutdowns have been clean.

"Event: Parity Check Tuning
Subject: [XXX] Automatic unRaid Parity-Check will be started
Description: Unclean shutdown detected
Importance: warning"

 

Additionally, it does not start a parity check, I just get the notification. 

 

I thought I mentioned earlier that the plugin can currently think there has been an unclean shutdown when this is not the case, so it generates that notification spuriously and it can be ignored.   This is being worked on.

Link to comment
17 minutes ago, itimpi said:

I thought I mentioned earlier that the plugin can currently think there has been an unclean shutdown when this is not the case, so it generates that notification spuriously and it can be ignored.   This is being worked on.

 

Sorry for the unnecessary question/post. I had missed your earlier post.

Link to comment
On 9/4/2023 at 2:32 PM, itimpi said:

Deleting the progress file will not help.    What seems to be confusing things is a parity.tuning.restart file in the plugins folder on the flash drive - that should only be present immediately after booting if a restart of an array operation is pending.  Deleting this file should solve your immediate problem but it is not clear why it was there in the first place.  You do have the restart option enabled but that file should only get created if an array operation was actually in progress when the system was last shutdown/rebooted - I assume this was not the case?    
 

There is also a parity.tuning.scheduled file present which I would also expect to be removed after the parity check completes, but I do not see an entry in the parity.tuning.progress file that the completion was ever detected which probably also explains the parity.tuning.scheduled file still being there.   I assume it DID finish?

 

Thanks — yes, it did finish the parity check successfully; it was about 50% through (and paused during the daytime) when I ran the upgrade — so in progress but paused automatically.

 

UPDATE: While I'm sure I had a notification saying it had finished a couple of days after the reboot, after removing the parity.tuning.restart file I got a notification about half an hour later saying the parity check had finished after 9 days, so I guess that gave it a bit of a kick 😁

Edited by ElectricBadger
Link to comment
21 hours ago, itimpi said:

This sounds like the recently fixed behaviour where an operation paused because mover or a backup was active did not resume when they completed.   If it is still occurring with the latest release then let me know.     You can also disable this type of pause/resume in the plugin settings, but you should not now need to.

It failed to restart after the 4am backups again this morning.  I'm seeing the "Array operation not resumed - outside increment window" messaging in the logs.  I'm currently using the 2023.09.03 version of the plugin.

Link to comment
1 hour ago, Mat W said:

It failed to restart after the 4am backups again this morning.  I'm seeing the "Array operation not resumed - outside increment window" messaging in the logs.  I'm currently using the 2023.09.03 version of the plugin.

What time did the backups finish?

 

To look at why, I would need Testing mode logging to be activated in the plugin settings for that period, and the resulting diagnostics posted.

Link to comment
6 minutes ago, itimpi said:

What time did the backups finish?

 

To look at why, I would need Testing mode logging to be activated in the plugin settings for that period, and the resulting diagnostics posted.

The backups finished around 4:14am and the plugin registered they finished and it was outside of increment window at 4:18am.

Sep  5 04:14:05 Unplucky kernel: br-fa8d007e8d97: port 2(vethb97540b) entered blocking state
Sep  5 04:14:05 Unplucky kernel: br-fa8d007e8d97: port 2(vethb97540b) entered forwarding state
Sep  5 04:18:21 Unplucky Parity Check Tuning: Send notification: backup no longer running:   
Sep  5 04:18:21 Unplucky Parity Check Tuning: Send notification: Array operation not resumed - outside increment window: Scheduled Non-Correcting Parity-Check (68.2% completed) 

I have enabled testing mode logging and will provide the diags tomorrow morning.
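
For reference, an "outside increment window" decision can be pictured as a simple time-of-day comparison. This is only a sketch of the idea, not the plugin's code: in_window is a hypothetical helper, and it assumes the 12:30am to 7:30am schedule mentioned earlier in the thread is also the increment window. Under those assumptions 4:18am falls inside the window, which is why the log message above looks unexpected.

```python
from datetime import time


def in_window(now: time, start: time, end: time) -> bool:
    """Return True if now lies in [start, end), allowing a midnight wrap."""
    if start <= end:
        # Window within a single day, e.g. 00:30 to 07:30.
        return start <= now < end
    # Window that crosses midnight, e.g. 22:30 to 07:00.
    return now >= start or now < end


if __name__ == "__main__":
    # Time at which the log reports the operation was not resumed.
    print(in_window(time(4, 18), time(0, 30), time(7, 30)))
    # A window that crosses midnight, checked at 23:00 and at midday.
    print(in_window(time(23, 0), time(22, 30), time(7, 0)))
    print(in_window(time(12, 0), time(22, 30), time(7, 0)))
```

The second branch matters for setups where the resume time is in the evening and the pause time is the following morning, since a plain start <= now < end test would never be true for such a window.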

Link to comment
5 hours ago, itimpi said:

Thought it was worth pointing out that I have made the 2023-09-05 release available which should stop incorrect detection of unclean shutdowns.    

 

Not sure, though, if there is not some other lingering problem that still needs resolving.

 

Oh man, I just had my first power outage that was long enough to require my server to shut down. Boy was I annoyed when I still got an unclean shutdown message after power came back!

Very glad to know it is just a bug, I just finished my quarterly parity check this morning!

 

EDIT: I completely forgot to mention the part where the power outage was long enough for my UPS to issue the shutdown. I thought my UPS had gone ahead and failed or something resulting in the unclean shutdown.

Edited by Swarles
Link to comment
11 minutes ago, Swarles said:

 Oh man I just had my first power outage that was long enough to require my server to shutdown. Boy was I annoyed when I still got an unclean shutdown message after power came back!

Very glad to know it is just a bug, I just finished my quarterly parity check this morning!

 

You will still get an Unclean shutdown message if it WAS an unclean shutdown with the 2023-09-05 release but at least it should now be genuine.   It is also now more prominent as it happens earlier in the start up sequence and is flagged as an alert (red) notification.   Again this is only an informative message but I think it adds value as users will at least know if they have a shutdown issue if they get it when not expecting it.   I might make the notification point to the part of the online documentation about troubleshooting unclean shutdowns if you click on it to help people find that. Any thoughts on that?

 

I have currently disabled the message that indicates Unraid (not the plugin) is going to start an automatic parity check.  I think that it would now be valid if you get it but I want to do more testing first and since it was only ever meant to be just an informative message disabling it does no harm.

Link to comment
2 hours ago, itimpi said:

You will still get an Unclean shutdown message if it WAS an unclean shutdown with the 2023-09-05 release but at least it should now be genuine.   It is also now more prominent as it happens earlier in the start up sequence and is flagged as an alert (red) notification.   Again this is only an informative message but I think it adds value as users will at least know if they have a shutdown issue if they get it when not expecting it.   I might make the notification point to the part of the online documentation about troubleshooting unclean shutdowns if you click on it to help people find that. Any thoughts on that?

Oops, I edited my message to make it make more sense haha.

I think adding the link to the unclean shutdown troubleshooting is a great idea; it could also result in fewer people asking here what to do.

2 hours ago, itimpi said:

I have currently disabled the message that indicates Unraid (not the plugin) is going to start an automatic parity check.  I think that it would now be valid if you get it but I want to do more testing first and since it was only ever meant to be just an informative message disabling it does no harm.

I'm not sure if this is helpful, but I collected system logs during my shutdown procedure and start up procedure (which resulted in the spurious "unclean shutdown" message). I was preparing to make a post asking about it in the forums because I hadn't initially realised it was incorrect and couldn't figure out why it was unclean. If you think those system logs might be helpful at all I'd be happy to send them through.

Link to comment
