Unraid

[Plugin] Parity Check Tuning

My flat experiences frequent power outages, occurring every few days. I recently changed my array configuration and started a parity-sync, configuring the plugin to use increments for Parity-Sync/Data Rebuild operations. However, after a recent power outage, the sync restarted from 0%. Any idea why this happened?

 

The only explanation I can think of is that I configured the plugin during the ongoing parity-sync operation, so perhaps the changes only take effect for parity syncs initiated after the configuration was applied.

 

Below are the history entries for the last two incomplete parity-syncs.

 

Action | Date | Size | Duration | Speed | Status | Errors | Elapsed Time | Increments
Parity-Sync | 2024-10-03, 17:01:01 (Thursday) | 24 TB | 16 hr, 1 min, 15 sec | Unavailable | Canceled | 17189393 | |
Parity-Sync | 2024-10-01, 07:16:14 (Tuesday) | 24 TB | 6 day, 16 hr, 1 min, 10 sec | 41.7 MB/s | Canceled | 3443815 | 6 day, 16 hr, 1 min, 10 sec | 1
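
As a quick sanity check on the history entries above, the reported 41.7 MB/s appears consistent with 24 TB divided by the listed duration. The sketch below is only an illustration of that arithmetic: it assumes decimal units (1 TB = 10**12 bytes, 1 MB = 10**6 bytes), and the helper names are hypothetical rather than anything taken from the plugin.

```python
import re

# Seconds per unit, for duration strings written like the history entries.
UNIT_SECONDS = {"day": 86400, "hr": 3600, "min": 60, "sec": 1}


def duration_to_seconds(text: str) -> int:
    """Convert a string such as '6 day, 16 hr, 1 min, 10 sec' to seconds."""
    total = 0
    for amount, unit in re.findall(r"(\d+)\s*(day|hr|min|sec)", text):
        total += int(amount) * UNIT_SECONDS[unit]
    return total


def average_speed_mb_s(size_tb: float, duration: str) -> float:
    """Average speed in MB/s, assuming decimal units (an assumption)."""
    return size_tb * 1e12 / duration_to_seconds(duration) / 1e6


seconds = duration_to_seconds("6 day, 16 hr, 1 min, 10 sec")
speed = average_speed_mb_s(24, "6 day, 16 hr, 1 min, 10 sec")
print(seconds, round(speed, 1))  # 576070 41.7
```

Because the speed is simply size over time, any error in the recorded time feeds straight into the MB/s figure.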

 

Edit: 17:01:01 is the time of the last power outage. Maybe I'm reading it incorrectly, but the shutdown initiated by the UPS seems to trigger a cancellation of the parity-sync instead of pausing it.

Edited by Gico


  • Author
21 hours ago, Gico said:

However, after a recent power outage, the sync restarted from 0%. Any idea why this happened?

The parity sync will only resume from where it had reached if you managed to get a clean shutdown.  If you had a power outage, are you sure your UPS runs long enough for the server to shut down tidily?   Did the plugin warn you that there had been an unclean shutdown?

  • Author
On 10/3/2024 at 4:21 AM, mftovey said:

So does anyone have any thoughts on this? Do I just have a defective drive that does not properly report its temperature, or is this going to be a continual issue?

There has been a known issue with Samsung SSDs where they can intermittently report excessively high temperatures.  I thought that it was fixed by a firmware update.

 

Note that you can configure individual warning/critical temperatures for each drive by clicking on it on the Main tab and the Parity Check Tuning plugin will take those into account.

 

On 10/3/2024 at 4:21 AM, mftovey said:

Found that there is a parameter in the configuration that would allow me to change the wait time: "parityTuningHeatTooLong"

I must admit I never expected that option to be used in anger :)  Not sure I want to add it to the GUI, but you can edit the plugin's parity.check.tuning.cfg file in the plugins folder on the flash drive and it will then start being used going forward.   However it should not stop the parity check - it is just meant to be a warning that you might need to look into providing better cooling for your system.

On 8/16/2024 at 10:55 AM, itimpi said:

I have uploaded a new version of the plugin that should fix the issue of the Parity History not displaying correctly.    
 

@terag1e you will notice that this update ignores any history records older than 2018 (which existed in the sample history file you provided me), as those entries do not have the year as part of the date field.   I could display them giving just the month + day, but there does not seem to be much point.

Once again, thanks for the help. It certainly worked for a few months. In my October parity check it only captured one increment which is very much not the case. I was trying to move the parity check along, so I resumed during low server activity periods during the daylight hours. It made it through the parity check a bit faster than just the few hours each night which was my goal. However, the history now only shows one increment (there were at least 3 and likely 4-5), so the speed calculation is WAYYYY wrong! I only wish I had that speed!

 

At any rate, if there is anything I can provide, please let me know. I have had to restart the server, and am currently in the process of rebuilding a drive - so things captured in the server log (Unraid) may have gotten lost in the restart shuffle.

 

Again - love the plugin! I run a few VMs and they would always begin to suffer terrible lag when parity checks were active. Now I just run the parity checks during my sleep hours and the VMs run great! 

  • Author
3 hours ago, terag1e said:

At any rate, if there is anything I can provide, please let me know. I have had to restart the server, and am currently in the process of rebuilding a drive - so things captured in the server log (Unraid) may have gotten lost in the restart shuffle.

If there is any chance of enabling the plugin's Testing mode of logging the next time you do something similar, and sending the diagnostics through at the end, that would allow me to determine for sure why the speed calculation is wrong. The plugin SHOULD recognise manual pause/resumes once the check has started, but your description makes it seem that this may not be registering correctly for some reason.

 

In the meantime I will try some testing at my end to see if I can reproduce your symptoms.

On 10/4/2024 at 4:06 PM, itimpi said:

The parity sync will only resume from where it had reached if you managed to get a clean shutdown.  If you had a power outage, are you sure your UPS runs long enough for the server to shut down tidily?   Did the plugin warn you that there had been an unclean shutdown?

I didn't notice such a warning, but it might have happened because my UPS batteries only last a few minutes. I paused the parity sync, backed up the flash drive, stopped the array, yet still received the "Parity Sync/Data rebuild cancelled" message, and this parity sync is listed as cancelled in the plugin history. Why is that? It was a clean stop.

Unfortunately syslog was full so nothing was written to it.

 

I'm planning to replace the UPS batteries, restore the flash from the backup, and hopefully be able to resume the parity sync.

  • Author
1 hour ago, Gico said:

I paused the parity sync, backed up the flash drive, stopped the array, yet still received the "Parity Sync/Data rebuild cancelled" message, and this parity sync is listed as cancelled in the plugin history. Why is that? It was a clean stop.

I have no idea as in my testing this works for me.   Enabling the plugin's testing mode logging and then posting diagnostics after recreating the issue might allow me to determine why.

Diagnostics attached. The first parity-sync was in Debug mode, then another one in Testing mode. Both were reported as cancelled, but I paused them, then stopped the server.

 

Edit: After the second time, when I started the array, I could (manually) resume the parity sync. This suggests that the plugin works correctly in Testing mode, but not in other modes, at least on my server.

 

Edit2: BTW, I also have diagnostics from after the first array stop, when the plugin was in Debug mode. Let me know if you need me to post it.

juno-diagnostics-20241007-1442.zip

Edited by Gico

  • Author
3 hours ago, Gico said:

Edit: After the second time, when I started the array, I could (manually) resume the parity sync.

There should be no functional difference - just extra information logged as the plugin is running.

 

3 hours ago, Gico said:

Edit2: BTW, I also have diagnostics from after the first array stop, when the plugin was in Debug mode. Let me know if you need me to post it.

Not sure it will show anything extra but would not do any harm to post them anyway so I can check them.

  • Author

@Gico  Looking at the diagnostics you posted I can see that there is no restart information saved when you boot the system.     This suggests that whatever went wrong happened during the earlier shutdown of the system.   I can see that when you stopped the array without doing a shutdown that the information to restart the parity check when you start the array WAS saved.

 

If you change the logging state of the plugin to be Testing mode but include logging to flash rather than just the syslog, this will create a parity.check.tuning.log file in the plugin’s folder on the flash drive that will include entries from the server shutdown phase that might help shed light on what is happening.

5 minutes ago, itimpi said:

@Gico  Looking at the diagnostics you posted I can see that there is no restart information saved when you boot the system.     This suggests that whatever went wrong happened during the earlier shutdown of the system. 

 

The previous server shutdown was not clean, which explains the lack of restart information. However, after the restart, a new parity sync began. Why would pausing it, stopping the array, then starting it cause the parity sync's resumption to still depend on an abnormal shutdown that occurred before the parity sync started?

 

12 minutes ago, itimpi said:

If you change the logging state of the plugin to be Testing mode but include logging to flash rather than just the syslog

The plugin is still in Testing mode. I changed "Mirror syslog to flash" to "Yes," and "Copy syslog to flash on shutdown" was already set to "Yes." Is this sufficient?

I've replaced the UPS batteries, so there shouldn't be any more abnormal shutdowns. However, if the parity sync starts from the beginning again, I'll post the diagnostics.

 

I'm still facing an issue with read errors during the sync, which I haven't resolved yet.

 

Attached are the diagnostics from after the first parity sync.

juno-diagnostics-20241007-1420.zip

  • Author
37 minutes ago, Gico said:

The plugin is still in Testing mode. I changed "Mirror syslog to flash" to "Yes," and "Copy syslog to flash on shutdown" was already set to "Yes." Is this sufficient?

No.   That will not necessarily copy what happens during the shutdown, as on a reboot Unraid resets the syslog (although the information might end up in syslog-previous).     The option within the plugin’s Testing mode logging to write to flash causes a log file to be written to the plugin’s folder on the flash drive. That file is only ever appended to, so it always survives reboots.

OK got it: It's a plugin setting, not the Unraid syslog setting.

I know that it defeats the purpose, but is it possible with this to start/stop a parity check manually?

 

Possible feature suggestion: would it be possible to have it run the parity check only when the server is just sitting there with nothing happening? Or, if this setting is already there and I just don't understand the settings, please let me know.

On 10/4/2024 at 6:18 AM, itimpi said:

There has been a known issue with Samsung SSDs where they can intermittently report excessively high temperatures.  I thought that it was fixed by a firmware update.

My SSD is Silicon Power and I am having some doubts about them.  I think I will look around for some better options.

 

On 10/4/2024 at 6:18 AM, itimpi said:

Note that you can configure individual warning/critical temperatures for each drive by clicking on it on the Main tab and the Parity Check Tuning plugin will take those into account.

It took me a while to work out how to make that work. The "Warning disk temperature threshold" field was initially greyed out and uneditable on my system.  I finally found out about smart-one.cfg, but it had not changed on my system since 2021, and it listed only three disks (my system has six), two of which were swapped out long ago.  It turns out that the format of the file changed a few years back, but somehow the file did not get updated on my system at that time.  I created a new file, manually added the disks to it, then added 'hotTemp = "49"' to the SSD entry.  That translates to "120" in the GUI, and that somehow translates to a cool/warm threshold of 106 in parity.check.tuning.php.  That is greater than the "104" that the SSD is always displaying, so now the parity check can run to completion (with cool down pauses).
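
The figures in this post look like a Celsius value stored in the config file and Fahrenheit values shown in the GUI. That reading is an assumption based only on the numbers quoted here, and the sketch below just shows the unit conversion; it does not attempt to reproduce how the plugin arrives at its 106 threshold.

```python
def c_to_f(celsius: float) -> float:
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


def f_to_c(fahrenheit: float) -> float:
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (fahrenheit - 32) * 5 / 9


print(round(c_to_f(49)))      # 120  (hotTemp = "49" as shown in the GUI)
print(round(f_to_c(104), 1))  # 40.0 (the temperature the SSD keeps reporting)
print(round(f_to_c(106), 1))  # 41.1 (the cool/warm threshold quoted above)
```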

 

On 10/4/2024 at 6:18 AM, itimpi said:

I must admit I never expected that option to be used in anger :)  Not sure I want to add it to the GUI, but you can edit the plugin's parity.check.tuning.cfg file in the plugins folder on the flash drive and it will then start being used going forward.   However it should not stop the parity check - it is just meant to be a warning that you might need to look into providing better cooling for your system.

After deeper analysis of the code, I see that this piece of the code does nothing to the functionality of the parity check.  It just spits out a warning message and then breaks out.  So waiting a little longer might reduce the number of messages that are output but nothing more.

 

So, I have a working solution.  My next task will be to figure out how to get better cooling for the drives.  Thanks for your help!

 

-Mark

 

 

Hello all, 

 

I've recently attempted to run a parity check with the parity check tuning plugin installed and using the increments setting to try and split the parity check into a few days of work. However, I seem to be running into a weird situation. It has been over 4 days now and it seems that it keeps starting over from the beginning rather than resuming from where it was paused. Allow me to explain a bit from the start of my journey. 

 

I configured the parity check and Parity Check Tuning settings so that it runs every quarter (Jan, Apr, Jul, Oct) on any day or week of those months. I would like it to run in daily increments from 00:15 to 09:30. I set increments to be used for scheduled and manual parity checks only, as selecting the other operations throws up an alert that it's not wise to set increments for those. 

 

When the check first ran automatically, I noticed that it did not pause at 09:30 and I thought I had done something wrong. But then I noticed that my timezone was incorrect (PDT instead of EST) and so it would've paused at 12:30 instead. At this point I had already manually paused the check. The progress at this point was already around 31-32% complete. 

 

After fixing the timezone issue, I watched the following day to see whether the check would automatically pause at 09:30 EST, and sure enough, it did. The progress showed somewhere around 64%, and I thought everything was good at this point and the whole parity check would complete in the next day or so. I recall that at some point in the night I had manually resumed the parity check earlier than scheduled to allow the check to complete sooner. 

 

However, yesterday when I checked on the progress, the percentage count had gone back down to 30ish %, seemingly indicating that the process had started all over again. I was annoyed but was like, that's alright, let's just get through this. 

 

But today, I woke up to see the same thing happening again. The progress is back to the 30ish % again. At this point, I'm not sure what is causing this. My server didn't restart or lose power (to the best of my knowledge), and I can't see a cause for this other than it starting over when it resumes. It is possible that my settings are incorrect and that my action of manually resuming it caused some kind of issue, but it is still odd that it has started over twice now. I'm worried that my parity check will never complete at this point unless I remove the increment settings. 

 

I've attached my most recent diagnostics, as well as screenshots of my settings. I hope there's a solution to this. Thank you.

 

[Attached: three screenshots of the settings]

magi-unraid-diagnostics-20241011-0857.zip

  • Author

Looking at the settings you have for the standard parity check it looks like you have it set to restart every day during October.   This is what I am seeing in the syslog.  You need to correct this before the plugin can handle the pause/resumes as you want it to.

 

BTW:  I notice that you have the scheduled check set to be correcting.   We normally recommend that this is set to non-correcting for the scheduled checks so that if you have a hardware issue you have not yet noticed you do not inadvertently end up corrupting parity.   Then only run a correcting check when you think the hardware is fine and you have reason to think parity needs correcting.

 

18 minutes ago, itimpi said:

We normally recommend that this is set to non-correcting for the scheduled checks so that if you have a hardware issue you have not yet noticed you do not inadvertently end up corrupting parity.

I've run some tests recently, and I think the issue was caused by the firmware in some older Seagate drives, over 12 years old. I can easily reproduce it with those disks and never with different models/brands. That would also explain why I haven't seen a real-world case in many years, as those disks are probably mostly out of circulation by now. So for now, and unless I see a new case with more modern disks, I'm changing my recommendation to always run a correcting check.

1 hour ago, itimpi said:

Looking at the settings you have for the standard parity check it looks like you have it set to restart every day during October.   This is what I am seeing in the syslog.  You need to correct this before the plugin can handle the pause/resumes as you want it to.

 

BTW:  I notice that you have the scheduled check set to be correcting.   We normally recommend that this is set to non-correcting for the scheduled checks so that if you have a hardware issue you have not yet noticed you do not inadvertently end up corrupting parity.   Then only run a correcting check when you think the hardware is fine and you have reason to think parity needs correcting.

 

hi @itimpi thank you for your reply. Are you referring to this portion of the Parity Check settings? 

[Attached: screenshot of the Parity Check settings]

 

I was thinking that this meant that the check can run at any day of any week of the selected months. If that's not the case, then I am gravely incorrect lol! 

 

Should I just manually select a day and a week that I want it to run, rather than leaving it as "every xxxx "? 

 

And after I change the settings, should I start the check over again from scratch, or will the plugin pause/resume normally after the change?

1 hour ago, JorgeB said:

I've run some tests recently, and I think the issue was caused by the firmware in some older Seagate drives, over 12 years old. I can easily reproduce it with those disks and never with different models/brands. That would also explain why I haven't seen a real-world case in many years, as those disks are probably mostly out of circulation by now. So for now, and unless I see a new case with more modern disks, I'm changing my recommendation to always run a correcting check.

 

Hi @JorgeB I'm almost certain I don't have any 12 year old Seagates in my array (knock on wood) so I guess it would be fine to leave correcting check on :D

  • Author

Yes - you need to set it so that only 1 day in the month is selected.   Currently it is set to ‘every’ rather than ‘any’.

 

The parity check tuning plugin will pick up any parity check that is in progress when one of its monitor points kicks in so no need to restart it.   Note that the plugin does not initiate a check - only pause/resume one that is started independently of the plugin.
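
To make the ‘every’ versus single-day distinction concrete, here is a rough sketch of how many times a check would be started in October 2024 under each setting. It is a simplified model and not how Unraid actually evaluates its schedule; the `scheduled_starts` function and its `day_of_month` parameter are hypothetical names used only for illustration.

```python
from datetime import date, timedelta


def scheduled_starts(year, month, day_of_month=None):
    """Return the dates in a month on which a new check would be started.

    day_of_month=None models an "every day" schedule; an integer models
    a schedule restricted to a single day of the month.
    """
    current = date(year, month, 1)
    starts = []
    while current.month == month:
        if day_of_month is None or current.day == day_of_month:
            starts.append(current)
        current += timedelta(days=1)
    return starts


print(len(scheduled_starts(2024, 10)))     # 31 new checks, one per day
print(len(scheduled_starts(2024, 10, 1)))  # 1 check, which the plugin can pause/resume
```

Under the first setting a fresh check begins every day, which would match the progress repeatedly dropping back instead of carrying on from the previous increment.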

1 hour ago, itimpi said:

Yes - you need to set it so that only 1 day in the month is selected.   Currently it is set to ‘every’ rather than ‘any’.

 

The parity check tuning plugin will pick up any parity check that is in progress when one of its monitor points kicks in so no need to restart it.   Note that the plugin does not initiate a check - only pause/resume one that is started independently of the plugin.

Got it, thank you very much for the assist :)

  • 5 weeks later...
On 10/5/2024 at 12:35 AM, terag1e said:

Once again, thanks for the help. It certainly worked for a few months. In my October parity check it only captured one increment which is very much not the case. I was trying to move the parity check along, so I resumed during low server activity periods during the daylight hours. It made it through the parity check a bit faster than just the few hours each night which was my goal. However, the history now only shows one increment (there were at least 3 and likely 4-5), so the speed calculation is WAYYYY wrong! I only wish I had that speed!

 

At any rate, if there is anything I can provide, please let me know. I have had to restart the server, and am currently in the process of rebuilding a drive - so things captured in the server log (Unraid) may have gotten lost in the restart shuffle.

 

Again - love the plugin! I run a few VMs and they would always begin to suffer terrible lag when parity checks were active. Now I just run the parity checks during my sleep hours and the VMs run great! 

Continuing saga: Beyond having to rebuild the server (YAY), the last manual parity check I ran still appears to have incorrect calculations for running time and total time. I have attached the relevant entries from the system log and the parity check log. For the parity check in question, the elapsed time (by log entry) is just short of 2 days, yet it is displayed as nearly 3 1/2 days, with a similar error in the running time. This particular parity check had 4 pauses. When there are no pauses in the parity check, the times appear to be accurate. The incorrect time calculations directly impact the MB/s metric.

Parity Check Tuning log entries.txt

  • Author

For me to check things out I would need the ‘parity.check.tuning.progress.save’ file from the plugins folder on the flash drive as that is used for calculating the times.   

  • 2 weeks later...
On 11/10/2024 at 2:08 PM, itimpi said:

For me to check things out I would need the ‘parity.check.tuning.progress.save’ file from the plugins folder on the flash drive as that is used for calculating the times.   

Is that file just appended to with each parity check, or is it rewritten fresh for each parity check? I will likely wait until my next scheduled parity check (first of Dec) and let it run with my current settings for the tuning plug-in. Are the parity-checks.log and parity.check.tuning.progress.save files the only ones needed? I currently have the Parity Check Tuning logging set to Basic. Should that be set to something else (Debug or Testing)?

  • Author

A new progress file is created (without a .save extension) every time a new check starts.    A line is added to it every time the plugin detects a pause/resume.   When the check completes this file is analysed to work out the times; a history record created; and the file renamed to have a .save extension.

 

This means that the .save file always contains details of the last check run.    Normally that is the only file needed to work out the times.   

 

If it appears that incorrect details are being recorded then it may be necessary to have a different logging level to work out why.
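
As a rough illustration of the description above, the sketch below derives elapsed time, running time, and the increment count from a list of start/pause/resume/end records. The record layout (ISO timestamp plus an event name) and the `summarise` function are assumptions made for this example; the real progress file format is not shown in this thread, and this is not the plugin's code.

```python
from datetime import datetime


def summarise(records):
    """Summarise (iso_timestamp, event) records for one check.

    Events are START, PAUSE, RESUME and END (hypothetical names).
    Returns (elapsed_seconds, running_seconds, increments).
    """
    times = [(datetime.fromisoformat(stamp), event) for stamp, event in records]
    # Elapsed time is wall-clock time from the first record to the last.
    elapsed = (times[-1][0] - times[0][0]).total_seconds()
    running = 0.0
    increments = 0
    started_at = None
    for when, event in times:
        if event in ("START", "RESUME"):
            started_at = when
        elif event in ("PAUSE", "END") and started_at is not None:
            # Close the current increment and add its duration.
            running += (when - started_at).total_seconds()
            increments += 1
            started_at = None
    return elapsed, running, increments


records = [
    ("2024-11-01T00:00:00", "START"),
    ("2024-11-01T08:00:00", "PAUSE"),
    ("2024-11-02T00:00:00", "RESUME"),
    ("2024-11-02T08:00:00", "END"),
]
elapsed, running, increments = summarise(records)
print(elapsed / 3600, running / 3600, increments)  # 32.0 16.0 2
```

In this model a missing or misplaced pause/resume record changes both the running time and the increment count, which is why the progress file is the first thing to inspect when the reported times look wrong.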

 

 
