[Plugin] Parity Check Tuning



On 4/11/2022 at 2:31 PM, Quejo said:

As soon as you release it I will test and report.

Just made release for this available, so let me know how it goes.

 

I am not sure whether the value of Settings->Disk Settings->Tunable (poll_attributes) affects how often temperatures are updated. I was testing with this set to 120.

Link to comment
13 hours ago, Quejo said:

first completed parity check with latest release.

tower-diagnostics-20220415-1520.zip 323.63 kB · 2 downloads

I will look through the log information to see if I spot anything.

 

Any feedback as to whether it is now behaving as expected or was there still unexpected behavior?

 

Also, I would appreciate it if you could provide copies of the .progress or .progress.save type files from the plugin's folder on the flash drive, as this will help with identifying key points in the syslog.

 

Link to comment
On 4/16/2022 at 3:53 AM, itimpi said:

I will look through the log information to see if I spot anything.

 

Any feedback as to whether it is now behaving as expected or was there still unexpected behavior?

 

Also, I would appreciate it if you could provide copies of the .progress or .progress.save type files from the plugin's folder on the flash drive, as this will help with identifying key points in the syslog.

 

From what I could see, it's working as expected. I have set the spin-down delay to 45 minutes, and now disks that overheat spin down after that time and cool down normally. Also, disks that are not being checked during the parity check can sleep and dissipate less heat.

parity.check.tuning.progress.save

Link to comment
3 minutes ago, Quejo said:

From what I could see, it's working as expected. I have set the spin-down delay to 45 minutes, and now disks that overheat spin down after that time and cool down normally. Also, disks that are not being checked during the parity check can sleep and dissipate less heat.

parity.check.tuning.progress.save 9.06 kB · 0 downloads

Thanks for the feedback and the file for me to check.   It confirms what I thought I saw in the diagnostics.

 

I might be able to make some small tweaks to improve logging but unless I hear otherwise I will assume that I have now got the logic correct.   Next time you get around to running a check you might want to try with one of the less verbose logging options to see if the amount of detail they give feels about right to you.

Link to comment
On 4/18/2022 at 3:00 PM, itimpi said:

Thanks for the feedback and the file for me to check.   It confirms what I thought I saw in the diagnostics.

 

I might be able to make some small tweaks to improve logging but unless I hear otherwise I will assume that I have now got the logic correct.   Next time you get around to running a check you might want to try with one of the less verbose logging options to see if the amount of detail they give feels about right to you.

here it is

thanks

tower-diagnostics-20220421-1739.zip

Link to comment

I am noticing unexpected behavior with this plugin recently. I am on version 2022.04.12. I use it to pause parity check when disks overheat (trust me, I've tried various cooling solutions...)

 

Recently, I enabled "Use increments for scheduled parity check," with increments running between 2am and 6am daily and the scheduled check starting at 2am every Saturday.

 

The behavior I get is that parity check starts at 2am on Saturday, runs until the disks overheat (15-20 min) and then permanently pauses. (I can manually resume, but it won't resume automatically)

 

Attached are the current portion of the log file, the current .progress file, the current .progress.save file and a screen grab of my settings. I'll appreciate any help/advice on how I might get this to work as expected. Thanks!

parity.check.tuning.settings.png

parity.check.tuning.log parity.check.tuning.progress parity.check.tuning.progress.save

Link to comment

To give me more information on exactly what is happening under the covers can you enable the Testing mode of logging in the plugin and then give me the syslog (or diagnostics which includes the syslog).  You may also get a clue if you simply enable the Debug level of logging as that will show you the details of the plugin monitoring the temperatures.
 

It sounds as if something may be going wrong with determining the correct resume conditions, but it is not obvious what that might be. Having said that, I notice you have a large value for the temperature drop at which to resume. Are you sure your disks cool down enough to reach that temperature? If they do not, that could explain why the check is never resumed. Regardless, the Testing mode of logging will let me see if that is the problem.

 

Link to comment

Just closing the loop on this for everyone... Sadly (?) this was user error. Some of my disks in fact have not been cooling to the restart temperature. I have changed the thresholds in the plugin settings and will report back if things are still not working. Many thanks to itimpi for reviewing my logs and letting me know this.
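For anyone else who hits the same symptom, here is a minimal sketch of why a resume temperature that the disks never reach leaves the check paused indefinitely. It is only an illustration of pause/resume thresholds in general, not the plugin's actual code: the names `Thresholds`, `next_state` and `simulate` and all the temperature values are made up for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Thresholds:
    pause_at: int   # hypothetical: pause when any disk reaches this temperature (C)
    resume_at: int  # hypothetical: resume only once every disk is at or below this (C)


def next_state(paused: bool, temps: List[int], t: Thresholds) -> bool:
    """Return True if the check should be paused after this temperature poll."""
    hottest = max(temps)
    if not paused:
        # A running check is paused as soon as the hottest disk hits the limit.
        return hottest >= t.pause_at
    # A paused check stays paused until the hottest disk has cooled far enough.
    return hottest > t.resume_at


def simulate(readings: List[List[int]], t: Thresholds) -> List[bool]:
    """Apply next_state to a series of polls and record the paused flag each time."""
    paused = False
    history = []
    for temps in readings:
        paused = next_state(paused, temps, t)
        history.append(paused)
    return history


# Invented readings: the disks heat up, then settle at around 40-41 C.
readings = [[38, 39], [45, 46], [42, 43], [40, 41], [40, 40]]

too_strict = simulate(readings, Thresholds(pause_at=45, resume_at=35))
reachable = simulate(readings, Thresholds(pause_at=45, resume_at=41))
```

With the resume temperature at 35 the check never resumes because the disks only cool to about 40, whereas a resume temperature of 41 lets it restart on the fourth poll.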

Link to comment

Had an unclean (system forced) shutdown when rebooting after updating to 6.10 RC5 and was surprised that my system didn't start a parity check after rebooting. Then I received this notification which makes me think that the Parity Check Tuning plugin is preventing this.

 

Quote

Event: Parity Check Tuning
Subject: [BRUNNHILDE] Automatic unRaid No array operation in progress will be started
Description: Unclean shutdown detected
Importance: warning

 

I would prefer that a parity check was started after an unclean shutdown but not sure which setting needs to be adjusted to accomplish this?

 

brunnhilde-diagnostics-20220426-2128.zip

Link to comment

Are you sure that you had an unclean shutdown? The plugin should never prevent Unraid from starting a parity check, so something else is going on. Simply rebooting after an upgrade should not cause an unclean shutdown unless Unraid did not succeed in stopping the array cleanly.

 

It looks as if the plugin was expecting an automatic check because of an unclean shutdown and is trying to inform you of this. However, the message itself does not actually make sense, as an automatic check does not appear to have started, so I need to look into that code to see why the plugin thought an unclean shutdown occurred in case it got that wrong :( It has never been tested against rc5, so that may have caused some unexpected behaviour and a spurious message to be displayed. Now that I can get rc5 for myself I will look into this.

Link to comment
8 hours ago, itimpi said:

Are you sure that you had an unclean shutdown? The plugin should never prevent Unraid from starting a parity check, so something else is going on. Simply rebooting after an upgrade should not cause an unclean shutdown unless Unraid did not succeed in stopping the array cleanly.

 

It looks as if the plugin was expecting an automatic check because of an unclean shutdown and is trying to inform you of this. However, the message itself does not actually make sense, as an automatic check does not appear to have started, so I need to look into that code to see why the plugin thought an unclean shutdown occurred in case it got that wrong :( It has never been tested against rc5, so that may have caused some unexpected behaviour and a spurious message to be displayed. Now that I can get rc5 for myself I will look into this.

Definitely had a forced shutdown (420 second timeout exceeded). Saw the message on the console. Was surprised that an automatic parity check didn’t start after the startup was complete. Got the notification I included in my previous post so it appeared that the tuning plugin had somehow prevented the parity check.

 

I should be able to try and reproduce this later today if you want.

Link to comment

Not everything that is "forced" is an unclean shutdown.

 

Certain versions of the OS were too aggressive in what they called an unclean shutdown and the resulting parity check, and this may be what you were used to seeing.

 

An unclean shutdown is where the system cannot unmount the drives and has to force them to unmount.

 

So long as the drives can be unmounted correctly, the shutdown is "clean" regardless of whether one or more processes had to be forcibly terminated.
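Put as a rule of thumb, the distinction above can be sketched like this. It is purely illustrative: `shutdown_is_clean` and its parameters are hypothetical names, and it is not how Unraid itself records the shutdown state.

```python
def shutdown_is_clean(drives_unmounted_ok: bool, processes_force_killed: int = 0) -> bool:
    """Classify a shutdown using the rule described above.

    Only the unmount result matters. The number of processes that had to be
    forcibly terminated is accepted to make the point that it is ignored.
    """
    return drives_unmounted_ok


# A VM that had to be killed after the timeout, but the drives unmounted fine.
forced_but_clean = shutdown_is_clean(drives_unmounted_ok=True, processes_force_killed=1)

# The drives could not be unmounted normally and had to be forced.
unclean = shutdown_is_clean(drives_unmounted_ok=False, processes_force_killed=0)
```

So a "forced" shutdown caused by a slow VM would still count as clean under this rule, and no automatic parity check would be expected afterwards.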

Link to comment
23 minutes ago, Squid said:

So long as the drives can be unmounted correctly, the shutdown is "clean" regardless of whether one or more processes had to be forcibly terminated.

Good to know. As far as I can tell my forced shutdowns are being caused by the Home Assistant VM not shutting down within the timeout period (420 seconds). If I shut down the VM manually before rebooting I never get a forced shutdown. Usually I won't even get a wait of more than a few seconds.

 

The notification email still seems to indicate that an unclean shutdown was detected though.

Quote

Event: Parity Check Tuning
Subject: [BRUNNHILDE] Automatic unRaid No array operation in progress will be started
Description: Unclean shutdown detected
Importance: warning

 

Link to comment
1 minute ago, wgstarks said:

Home Assistant VM not shutting down within the timeout period (420 seconds)

If it's Windows, you should hibernate the VM instead of doing a shutdown (VM Settings: change shutdown to be "Hibernate", and install the QEMU Guest Tools on the VM).

Link to comment
1 hour ago, wgstarks said:

I should be able to try and reproduce this later today if you want.

If you ARE going to try and reproduce this issue can you first enable the Testing logging mode in the plugin so I have more information on what is happening under the covers.

 

18 minutes ago, wgstarks said:

The notification email still seems to indicate that an unclean shutdown was detected though.

The message is from the plugin (not Unraid) and indicates that the plugin THOUGHT an unclean shutdown occurred and that Unraid is about to start an automatic check.  However if Unraid does not agree it was an unclean shutdown then no check is started.  Looking at the code that can explain why the message text was a little strange.  I am working on handling such a scenario in a tidy manner.

Link to comment
23 hours ago, itimpi said:

If you ARE going to try and reproduce this issue can you first enable the Testing logging mode in the plugin so I have more information on what is happening under the covers.

Tried 3 times to reproduce this and couldn't. Can't get a forced shutdown when I want it.

Link to comment
  • 1 month later...

I prefer not to operate my server overnight. I started a manual parity check from the Main tab and got to about 8.5%. Then I shut down. Later I restarted and started a parity check, which proceeded to start from 0. Is this expected behavior or am I doing something wrong? I was assuming that I would be able to resume. I am on the latest version of Unraid and am using the default settings with the exception of "Use increments for manual parity check:" which is set to Y. I have "Scheduled parity check:" set to disabled.

Edited by mikela
Link to comment
4 hours ago, mikela said:

I prefer not to operate my server overnight. I started a manual parity check from the Main tab and got to about 8.5%. Then I shut down. Later I restarted and started a parity check, which proceeded to start from 0. Is this expected behavior or am I doing something wrong? I was assuming that I would be able to resume. I am on the latest version of Unraid and am using the default settings with the exception of "Use increments for manual parity check:" which is set to Y. I have "Scheduled parity check:" set to disabled.

You cannot restart a check manually except from the start.

 

What SHOULD work is that in the plugin's settings you set the option to automatically restart a parity check the next time the array starts.   The plugin will then detect a parity check was in progress during the array shutdown process and save the information needed for restarting the parity check.   As long as on restart the plugin thinks it was not an untidy shutdown the plugin will automatically resume the parity check from the point it had reached when the shutdown was initiated.

 

If you cannot get that working then please turn on the Testing mode logging in the plugin's settings with the option to send the logging information to the flash drive set.  You might also want to have the syslog server set up so that you have full syslog information covering the shutdown/reboot sequence.  Then let me have that log file and I should be able to work out why it is not working for you on your system.

 

Whether you have the  "Scheduled parity check:" option set should be irrelevant as the plugin can resume both scheduled and manually initiated checks.
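As a rough sketch of the restart behaviour described above (not the plugin's real code; `on_array_stop`, `on_array_start` and their parameters are hypothetical names chosen for the illustration):

```python
from typing import Optional


def on_array_stop(check_running: bool, position_pct: float,
                  restart_enabled: bool) -> Optional[float]:
    """At array shutdown, remember the position only if a check was in
    progress and the restart option is switched on."""
    if check_running and restart_enabled:
        return position_pct
    return None


def on_array_start(saved_position: Optional[float],
                   clean_shutdown: bool) -> Optional[float]:
    """At array start, return the position to resume from, or None if
    there is nothing to resume or the shutdown was not tidy."""
    if saved_position is not None and clean_shutdown:
        return saved_position
    return None


# Check stopped at 8.5% with the restart option enabled, then a clean reboot.
resumed = on_array_start(on_array_stop(True, 8.5, True), clean_shutdown=True)

# Same shutdown, but with the restart option left off: nothing is saved.
not_saved = on_array_start(on_array_stop(True, 8.5, False), clean_shutdown=True)

# Position was saved, but the shutdown was not tidy: no automatic resume.
not_resumed = on_array_start(on_array_stop(True, 8.5, True), clean_shutdown=False)
```

The second case matches the situation in the question: without the restart option set, nothing is saved at shutdown, so a check started by hand afterwards begins from 0.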

 

 

Link to comment

Would it be possible to add an option for ending after a certain percentage instead of time? Right now I have it run from 1:30am to 7:30am. It takes 4 days to get to like 93%. Would love to be able to have it do 25% each day and get it done in 4 days. Have both, so I can tell it to pause at increments of x% or if it runs for y amount of time.

Link to comment
15 minutes ago, VKapadia said:

Would it be possible to add an option for ending after a certain percentage instead of time? Right now I have it run from 1:30am to 7:30am. It takes 4 days to get to like 93%. Would love to be able to have it do 25% each day and get it done in 4 days. Have both, so I can tell it to pause at increments of x% or if it runs for y amount of time.

I’ll think about it, but I am not convinced it would not just end up complicating the settings page without much benefit for the majority of users.   You could achieve the same effect by slightly tweaking the increment pause/resume times so that each increment can run a little longer.
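To put a rough number on "a little longer", here is a back-of-the-envelope calculation using the figures from the question (a 6 hour window reaching about 93% after 4 days). It assumes the check progresses at a constant rate, which is a simplification since real checks usually do not run at the same speed across the whole disk. The function name `hours_needed_per_day` is made up for the example.

```python
def hours_needed_per_day(window_hours: float, days_run: int,
                         percent_reached: float, target_days: int) -> float:
    """Estimate the daily window needed to finish in target_days,
    assuming a constant rate of progress (a simplifying assumption)."""
    rate_pct_per_hour = percent_reached / (window_hours * days_run)
    return (100.0 / target_days) / rate_pct_per_hour


# 1:30am to 7:30am is a 6 hour window; about 93% is reached after 4 days.
needed = hours_needed_per_day(window_hours=6, days_run=4,
                              percent_reached=93, target_days=4)
extra_minutes = round((needed - 6) * 60)

print(f"{needed:.2f} hours per day, about {extra_minutes} minutes more than now")
```

On those numbers the increment would need to run roughly half an hour longer each day, so it is worth adding some margin on top of that when adjusting the pause/resume times.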

Link to comment
