[Plugin] Parity Check Tuning



My parity check isn't resuming with all drives classified as cool. Any ideas?

 

Sep  5 16:45:01 tower parity.check.tuning.php: DEBUG: -----------MONITOR start------
Sep  5 16:45:01 tower parity.check.tuning.php: DEBUG: Parity check appears to be paused
Sep  5 16:45:01 tower parity.check.tuning.php: TESTING: parity temp=43 (settings are: hot=47, cool=43))
Sep  5 16:45:01 tower parity.check.tuning.php: TESTING: disk1 temp=42 (settings are: hot=47, cool=43))
Sep  5 16:45:01 tower parity.check.tuning.php: TESTING: disk2 temp=39 (settings are: hot=47, cool=43))
Sep  5 16:45:01 tower parity.check.tuning.php: TESTING: disk3 temp=38 (settings are: hot=47, cool=43))
Sep  5 16:45:01 tower parity.check.tuning.php: TESTING: disk4 temp=38 (settings are: hot=47, cool=43))
Sep  5 16:45:01 tower parity.check.tuning.php: DEBUG: array drives=5, hot=0, warm=0, cool=5
Sep  5 16:45:01 tower parity.check.tuning.php: DEBUG: Array operation paused but drives not cooled enough to resume
Sep  5 16:45:01 tower parity.check.tuning.php: DEBUG: -----------MONITOR end-------
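For readers following along, here is a minimal sketch of how the hot/warm/cool counts in the log above could be reproduced. It is not the plugin's code. The `classify` function is hypothetical, and the boundary rules are inferred: the log counts a drive at exactly the cool threshold (parity at 43) as cool, while the treatment of a drive sitting exactly at the hot threshold is an assumption.

```python
def classify(temp, hot, cool):
    """Return 'hot', 'warm' or 'cool' for a single drive temperature."""
    if temp >= hot:   # assumption: the logs never show a drive exactly at 'hot'
        return "hot"
    if temp <= cool:  # inferred: the log counts temp == cool as cool
        return "cool"
    return "warm"

# Temperatures and thresholds copied from the log lines above.
temps = {"parity": 43, "disk1": 42, "disk2": 39, "disk3": 38, "disk4": 38}
counts = {"hot": 0, "warm": 0, "cool": 0}
for name, temp in temps.items():
    counts[classify(temp, hot=47, cool=43)] += 1

print(counts)  # {'hot': 0, 'warm': 0, 'cool': 5}
```

Under these rules all five drives count as cool, which matches the `cool=5` line and is why the "not cooled enough to resume" message looks contradictory.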

 

Link to post

The Parity Check Tuning plugin is primarily designed to allow you to split a parity check into increments and then specify when those increments should be run.


On 9/5/2020 at 4:47 PM, robobub said:

My parity check isn't resuming with all drives classified as cool. Any ideas?

I haven't tested enough to be certain, but I think that if the difference between the hot pause and cool resume settings isn't large enough, the check won't resume.

 

With

Hot pause: 2 below

Cool resume: 5 below

(Difference of 3)

It didn't resume

 

With

Hot pause: 2 below

Cool resume: 7 below

(Difference of 5)

It did resume

 

But it's possible I misinterpreted and something else caused the resume failure.
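A small sketch of how "N below" settings might translate into the absolute thresholds that show up in the syslog. The `thresholds` function is hypothetical, and the reference temperature of 42 is only inferred from the values logged later in this thread (hot=40, cool=37, then cool=35). The thread does not say what the plugin actually uses as its reference.

```python
def thresholds(reference, hot_below, cool_below):
    """Convert 'N below' offsets into absolute (hot, cool) thresholds."""
    hot = reference - hot_below
    cool = reference - cool_below
    if cool >= hot:
        raise ValueError("cool threshold must be lower than hot threshold")
    return hot, cool

# reference=42 is an assumption inferred from the logged thresholds.
print(thresholds(42, hot_below=2, cool_below=5))  # (40, 37)
print(thresholds(42, hot_below=2, cool_below=7))  # (40, 35)
```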

Link to post
1 hour ago, CS01-HS said:

I haven't tested enough to be certain, but I think that if the difference between the hot pause and cool resume settings isn't large enough, the check won't resume.

 


This suggests an error somewhere as the size of the difference should not matter (as long as it gets to the lower temperature).    
 

It would be useful if you could get me some diagnostic information covering the two cases. To get this, turn on the “testing” level of debugging in the plugin settings and then see if you can repeat the scenarios. After doing that, a copy of your syslog (or the system diagnostics zip file, as that includes the syslog) should allow me to see what is happening. You will then want to disable this level of debugging to avoid filling up your syslog with diagnostic messages.
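If you only want to post the relevant part of the syslog, something like the following sketch can trim it down. The `plugin_lines` helper is hypothetical. It keeps lines tagged `parity.check.tuning.php:` plus the kernel `md`/`mdcmd` lines and drops everything else. The sample lines are taken from logs posted in this thread.

```python
def plugin_lines(lines):
    """Keep lines logged by the plugin plus kernel md/mdcmd lines."""
    keep = ("parity.check.tuning.php:", "kernel: mdcmd", "kernel: md:")
    return [ln.rstrip("\n") for ln in lines if any(k in ln for k in keep)]

sample = [
    "Sep  7 13:45:02 NAS kernel: mdcmd (218): nocheck PAUSE",
    "Sep  7 13:50:08 NAS autofan: Board temp is 39C, hottest disk is 40C (/dev/sdh), setting fan speed to: 165 (76% @ 1605rpm)",
    "Sep  7 13:50:01 NAS parity.check.tuning.php: DEBUG: Parity check appears to be paused",
]

for ln in plugin_lines(sample):
    print(ln)
```

On a live server the same function could be fed an open file, for example `open("/var/log/syslog")`. That path is an assumption and may differ on your system.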

Link to post

Here are the relevant lines from the syslog. I can send the entire log after some cleanup if necessary.

It looks like I was wrong about the difference between hot/cold settings but something's causing a failure to resume even when all drive temperatures reach "cool."

 

Start of Parity check (difference of 3 between hot and cool):

Sep  7 13:10:23 NAS parity.check.tuning.php: DEBUG: -----------UPDATECRON start------
Sep  7 13:10:23 NAS parity.check.tuning.php: DEBUG: created cron entries for running increments
Sep  7 13:10:23 NAS parity.check.tuning.php: DEBUG: created cron entry for monitoring disk temperatures
Sep  7 13:10:23 NAS parity.check.tuning.php: TESTING: updated cron settings are in /boot/config/plugins/parity.check.tuning/parity.check.tuning.cron
Sep  7 13:10:23 NAS parity.check.tuning.php: DEBUG: -----------UPDATECRON end-------
Sep  7 13:10:31 NAS kernel: mdcmd (217): check Resume
Sep  7 13:10:31 NAS kernel: md: recovery thread: check P ...
Sep  7 13:15:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
Sep  7 13:15:01 NAS parity.check.tuning.php: TESTING: parity temp=36 (settings are: hot=40, cool=37))
Sep  7 13:15:01 NAS parity.check.tuning.php: TESTING: disk1 temp=37 (settings are: hot=40, cool=37))
Sep  7 13:15:01 NAS parity.check.tuning.php: TESTING: disk2 temp=38 (settings are: hot=40, cool=37))
Sep  7 13:15:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=0, warm=1, cool=2
Sep  7 13:15:01 NAS parity.check.tuning.php: DEBUG: Non-Correcting Parity Check with all drives below temperature threshold for a Pause
Sep  7 13:15:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------
Sep  7 13:18:57 NAS autofan: Board temp is 46C, hottest disk is 40C (/dev/sdh), setting fan speed to: 165 (76% @ 1588rpm)
Sep  7 13:20:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
Sep  7 13:20:01 NAS parity.check.tuning.php: TESTING: parity temp=36 (settings are: hot=40, cool=37))
Sep  7 13:20:01 NAS parity.check.tuning.php: TESTING: disk1 temp=37 (settings are: hot=40, cool=37))
Sep  7 13:20:01 NAS parity.check.tuning.php: TESTING: disk2 temp=38 (settings are: hot=40, cool=37))
Sep  7 13:20:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=0, warm=1, cool=2
Sep  7 13:20:01 NAS parity.check.tuning.php: DEBUG: Non-Correcting Parity Check with all drives below temperature threshold for a Pause
Sep  7 13:20:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------
Sep  7 13:25:02 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
Sep  7 13:25:02 NAS parity.check.tuning.php: TESTING: parity temp=36 (settings are: hot=40, cool=37))
Sep  7 13:25:02 NAS parity.check.tuning.php: TESTING: disk1 temp=37 (settings are: hot=40, cool=37))
Sep  7 13:25:02 NAS parity.check.tuning.php: TESTING: disk2 temp=38 (settings are: hot=40, cool=37))
Sep  7 13:25:02 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=0, warm=1, cool=2
Sep  7 13:25:02 NAS parity.check.tuning.php: DEBUG: Non-Correcting Parity Check with all drives below temperature threshold for a Pause
Sep  7 13:25:02 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------
Sep  7 13:25:11 NAS autofan: Board temp is 45C, hottest disk is 42C (/dev/sdh), setting fan speed to: 195 (90% @ 1819rpm)
Sep  7 13:30:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
Sep  7 13:30:01 NAS parity.check.tuning.php: TESTING: parity temp=36 (settings are: hot=40, cool=37))
Sep  7 13:30:01 NAS parity.check.tuning.php: TESTING: disk1 temp=37 (settings are: hot=40, cool=37))
Sep  7 13:30:01 NAS parity.check.tuning.php: TESTING: disk2 temp=38 (settings are: hot=40, cool=37))
Sep  7 13:30:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=0, warm=1, cool=2
Sep  7 13:30:01 NAS parity.check.tuning.php: DEBUG: Non-Correcting Parity Check with all drives below temperature threshold for a Pause
Sep  7 13:30:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------
Sep  7 13:35:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
Sep  7 13:35:01 NAS parity.check.tuning.php: TESTING: parity temp=36 (settings are: hot=40, cool=37))
Sep  7 13:35:01 NAS parity.check.tuning.php: TESTING: disk1 temp=37 (settings are: hot=40, cool=37))
Sep  7 13:35:01 NAS parity.check.tuning.php: TESTING: disk2 temp=38 (settings are: hot=40, cool=37))
Sep  7 13:35:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=0, warm=1, cool=2
Sep  7 13:35:01 NAS parity.check.tuning.php: DEBUG: Non-Correcting Parity Check with all drives below temperature threshold for a Pause
Sep  7 13:35:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------
Sep  7 13:40:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
Sep  7 13:40:01 NAS parity.check.tuning.php: TESTING: parity temp=36 (settings are: hot=40, cool=37))
Sep  7 13:40:01 NAS parity.check.tuning.php: TESTING: disk1 temp=37 (settings are: hot=40, cool=37))
Sep  7 13:40:01 NAS parity.check.tuning.php: TESTING: disk2 temp=38 (settings are: hot=40, cool=37))
Sep  7 13:40:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=0, warm=1, cool=2
Sep  7 13:40:01 NAS parity.check.tuning.php: DEBUG: Non-Correcting Parity Check with all drives below temperature threshold for a Pause
Sep  7 13:40:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------

 

Drives overheat, check is paused, drives begin cool-down:

Sep  7 13:45:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
Sep  7 13:45:01 NAS parity.check.tuning.php: TESTING: parity temp=39 (settings are: hot=40, cool=37))
Sep  7 13:45:01 NAS parity.check.tuning.php: TESTING: disk1 temp=42 (settings are: hot=40, cool=37))
Sep  7 13:45:01 NAS parity.check.tuning.php: TESTING: disk2 temp=42 (settings are: hot=40, cool=37))
Sep  7 13:45:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=2, warm=1, cool=0
Sep  7 13:45:01 NAS parity.check.tuning.php: Paused Non-Correcting Parity Check  (39.6% completed) : Following drives overheated: 42 42
Sep  7 13:45:02 NAS parity.check.tuning.php: DEBUG: written PAUSE (HOT) record to  /boot/config/plugins/parity.check.tuning/parity.check.tuning.progress
Sep  7 13:45:02 NAS parity.check.tuning.php: DEBUG: -----------MDCMD start------
Sep  7 13:45:02 NAS parity.check.tuning.php: DEBUG: detected that mdcmd had been called from sh with command mdcmd nocheck PAUSE
Sep  7 13:45:02 NAS parity.check.tuning.php: DEBUG: -----------MDCMD end-------
Sep  7 13:45:02 NAS kernel: mdcmd (218): nocheck PAUSE
Sep  7 13:45:02 NAS kernel:
Sep  7 13:45:02 NAS kernel: md: recovery thread: exit status: -4
Sep  7 13:45:02 NAS parity.check.tuning.php: TESTING: Heat notifications disabled so Pause Following drives overheated: 42 42  not sent
Sep  7 13:45:02 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------
Sep  7 13:50:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
Sep  7 13:50:01 NAS parity.check.tuning.php: DEBUG: Parity check appears to be paused
Sep  7 13:50:01 NAS parity.check.tuning.php: TESTING: parity temp=39 (settings are: hot=40, cool=37))
Sep  7 13:50:01 NAS parity.check.tuning.php: TESTING: disk1 temp=42 (settings are: hot=40, cool=37))
Sep  7 13:50:01 NAS parity.check.tuning.php: TESTING: disk2 temp=42 (settings are: hot=40, cool=37))
Sep  7 13:50:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=2, warm=1, cool=0
Sep  7 13:50:01 NAS parity.check.tuning.php: DEBUG: Array operation paused with some drives still too hot to resume
Sep  7 13:50:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------
Sep  7 13:50:08 NAS autofan: Board temp is 39C, hottest disk is 40C (/dev/sdh), setting fan speed to: 165 (76% @ 1605rpm)
Sep  7 13:55:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
Sep  7 13:55:01 NAS parity.check.tuning.php: DEBUG: Parity check appears to be paused
Sep  7 13:55:01 NAS parity.check.tuning.php: TESTING: parity temp=39 (settings are: hot=40, cool=37))
Sep  7 13:55:01 NAS parity.check.tuning.php: TESTING: disk1 temp=42 (settings are: hot=40, cool=37))
Sep  7 13:55:01 NAS parity.check.tuning.php: TESTING: disk2 temp=42 (settings are: hot=40, cool=37))
Sep  7 13:55:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=2, warm=1, cool=0
Sep  7 13:55:01 NAS parity.check.tuning.php: DEBUG: Array operation paused with some drives still too hot to resume
Sep  7 13:55:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------
Sep  7 14:00:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
Sep  7 14:00:01 NAS parity.check.tuning.php: DEBUG: Parity check appears to be paused
Sep  7 14:00:01 NAS parity.check.tuning.php: TESTING: parity temp=39 (settings are: hot=40, cool=37))
Sep  7 14:00:01 NAS parity.check.tuning.php: TESTING: disk1 temp=42 (settings are: hot=40, cool=37))
Sep  7 14:00:01 NAS parity.check.tuning.php: TESTING: disk2 temp=42 (settings are: hot=40, cool=37))
Sep  7 14:00:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=2, warm=1, cool=0
Sep  7 14:00:01 NAS parity.check.tuning.php: DEBUG: Array operation paused with some drives still too hot to resume
Sep  7 14:00:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------
Sep  7 14:00:01 NAS root: Cache used space threshhold (75) not exceeded.  Used Space: 72.  Not moving files
Sep  7 14:05:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
Sep  7 14:05:01 NAS parity.check.tuning.php: DEBUG: Parity check appears to be paused
Sep  7 14:05:01 NAS parity.check.tuning.php: TESTING: parity temp=39 (settings are: hot=40, cool=37))
Sep  7 14:05:01 NAS parity.check.tuning.php: TESTING: disk1 temp=42 (settings are: hot=40, cool=37))
Sep  7 14:05:01 NAS parity.check.tuning.php: TESTING: disk2 temp=42 (settings are: hot=40, cool=37))
Sep  7 14:05:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=2, warm=1, cool=0
Sep  7 14:05:01 NAS parity.check.tuning.php: DEBUG: Array operation paused with some drives still too hot to resume
Sep  7 14:05:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------
Sep  7 14:10:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
Sep  7 14:10:01 NAS parity.check.tuning.php: DEBUG: Parity check appears to be paused
Sep  7 14:10:01 NAS parity.check.tuning.php: TESTING: parity temp=39 (settings are: hot=40, cool=37))
Sep  7 14:10:01 NAS parity.check.tuning.php: TESTING: disk1 temp=42 (settings are: hot=40, cool=37))
Sep  7 14:10:01 NAS parity.check.tuning.php: TESTING: disk2 temp=42 (settings are: hot=40, cool=37))
Sep  7 14:10:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=2, warm=1, cool=0
Sep  7 14:10:01 NAS parity.check.tuning.php: DEBUG: Array operation paused with some drives still too hot to resume
Sep  7 14:10:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------
Sep  7 14:15:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
Sep  7 14:15:01 NAS parity.check.tuning.php: DEBUG: Parity check appears to be paused
Sep  7 14:15:01 NAS parity.check.tuning.php: TESTING: parity temp=37 (settings are: hot=40, cool=37))
Sep  7 14:15:01 NAS parity.check.tuning.php: TESTING: disk1 temp=34 (settings are: hot=40, cool=37))
Sep  7 14:15:01 NAS parity.check.tuning.php: TESTING: disk2 temp=38 (settings are: hot=40, cool=37))
Sep  7 14:15:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=0, warm=1, cool=2
Sep  7 14:15:01 NAS parity.check.tuning.php: DEBUG: Array operation paused but drives not cooled enough to resume
Sep  7 14:15:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------
Sep  7 14:15:07 NAS autofan: Board temp is 41C, hottest disk is 38C (/dev/sdh), setting fan speed to: 135 (62% @ 1386rpm)
Sep  7 14:20:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
Sep  7 14:20:01 NAS parity.check.tuning.php: DEBUG: Parity check appears to be paused
Sep  7 14:20:01 NAS parity.check.tuning.php: TESTING: parity temp=37 (settings are: hot=40, cool=37))
Sep  7 14:20:01 NAS parity.check.tuning.php: TESTING: disk1 temp=34 (settings are: hot=40, cool=37))
Sep  7 14:20:01 NAS parity.check.tuning.php: TESTING: disk2 temp=38 (settings are: hot=40, cool=37))
Sep  7 14:20:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=0, warm=1, cool=2
Sep  7 14:20:01 NAS parity.check.tuning.php: DEBUG: Array operation paused but drives not cooled enough to resume
Sep  7 14:20:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------
Sep  7 14:25:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
Sep  7 14:25:01 NAS parity.check.tuning.php: DEBUG: Parity check appears to be paused
Sep  7 14:25:01 NAS parity.check.tuning.php: TESTING: parity temp=37 (settings are: hot=40, cool=37))
Sep  7 14:25:01 NAS parity.check.tuning.php: TESTING: disk1 temp=34 (settings are: hot=40, cool=37))
Sep  7 14:25:01 NAS parity.check.tuning.php: TESTING: disk2 temp=38 (settings are: hot=40, cool=37))
Sep  7 14:25:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=0, warm=1, cool=2
Sep  7 14:25:01 NAS parity.check.tuning.php: DEBUG: Array operation paused but drives not cooled enough to resume
Sep  7 14:25:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------
Sep  7 14:27:40 NAS autofan: Board temp is 40C, hottest disk is 36C (/dev/sdh), setting fan speed to: 105 (48% @ 1117rpm)
Sep  7 14:30:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
Sep  7 14:30:01 NAS parity.check.tuning.php: DEBUG: Parity check appears to be paused
Sep  7 14:30:01 NAS parity.check.tuning.php: TESTING: parity temp=37 (settings are: hot=40, cool=37))
Sep  7 14:30:01 NAS parity.check.tuning.php: TESTING: disk1 temp=34 (settings are: hot=40, cool=37))
Sep  7 14:30:01 NAS parity.check.tuning.php: TESTING: disk2 temp=38 (settings are: hot=40, cool=37))
Sep  7 14:30:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=0, warm=1, cool=2
Sep  7 14:30:01 NAS parity.check.tuning.php: DEBUG: Array operation paused but drives not cooled enough to resume
Sep  7 14:30:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------
Sep  7 14:35:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
Sep  7 14:35:01 NAS parity.check.tuning.php: DEBUG: Parity check appears to be paused
Sep  7 14:35:01 NAS parity.check.tuning.php: TESTING: parity temp=37 (settings are: hot=40, cool=37))
Sep  7 14:35:01 NAS parity.check.tuning.php: TESTING: disk1 temp=34 (settings are: hot=40, cool=37))
Sep  7 14:35:01 NAS parity.check.tuning.php: TESTING: disk2 temp=38 (settings are: hot=40, cool=37))
Sep  7 14:35:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=0, warm=1, cool=2
Sep  7 14:35:01 NAS parity.check.tuning.php: DEBUG: Array operation paused but drives not cooled enough to resume
Sep  7 14:35:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------
Sep  7 14:40:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
Sep  7 14:40:01 NAS parity.check.tuning.php: DEBUG: Parity check appears to be paused
Sep  7 14:40:01 NAS parity.check.tuning.php: TESTING: parity temp=37 (settings are: hot=40, cool=37))
Sep  7 14:40:01 NAS parity.check.tuning.php: TESTING: disk1 temp=34 (settings are: hot=40, cool=37))
Sep  7 14:40:01 NAS parity.check.tuning.php: TESTING: disk2 temp=38 (settings are: hot=40, cool=37))
Sep  7 14:40:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=0, warm=1, cool=2
Sep  7 14:40:01 NAS parity.check.tuning.php: DEBUG: Array operation paused but drives not cooled enough to resume
Sep  7 14:40:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------
Sep  7 14:45:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
Sep  7 14:45:01 NAS parity.check.tuning.php: DEBUG: Parity check appears to be paused
Sep  7 14:45:01 NAS parity.check.tuning.php: TESTING: parity temp=37 (settings are: hot=40, cool=37))
Sep  7 14:45:01 NAS parity.check.tuning.php: TESTING: disk1 temp=34 (settings are: hot=40, cool=37))
Sep  7 14:45:01 NAS parity.check.tuning.php: TESTING: disk2 temp=38 (settings are: hot=40, cool=37))
Sep  7 14:45:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=0, warm=1, cool=2
Sep  7 14:45:01 NAS parity.check.tuning.php: DEBUG: Array operation paused but drives not cooled enough to resume
Sep  7 14:45:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------

 

Drives all "cool" but check still paused

Sep  7 14:50:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
Sep  7 14:50:01 NAS parity.check.tuning.php: DEBUG: Parity check appears to be paused
Sep  7 14:50:01 NAS parity.check.tuning.php: TESTING: parity temp=33 (settings are: hot=40, cool=37))
Sep  7 14:50:01 NAS parity.check.tuning.php: TESTING: disk1 temp=35 (settings are: hot=40, cool=37))
Sep  7 14:50:01 NAS parity.check.tuning.php: TESTING: disk2 temp=35 (settings are: hot=40, cool=37))
Sep  7 14:50:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=0, warm=0, cool=3
Sep  7 14:50:01 NAS parity.check.tuning.php: DEBUG: Array operation paused but drives not cooled enough to resume
Sep  7 14:50:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------
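To spot this symptom in a long syslog without reading every cycle, a rough sketch like the one below can flag MONITOR cycles where the counts line says every drive is cool but the plugin still reports "not cooled enough to resume". The `contradictions` function and its regular expression are illustrative only, written against the message wording seen in these logs.

```python
import re

COUNTS = re.compile(r"array drives=(\d+), hot=(\d+), warm=(\d+), cool=(\d+)")
NOT_COOL = "drives not cooled enough to resume"

def contradictions(lines):
    """Return timestamps of cycles that report all drives cool yet do not resume."""
    found = []
    all_cool = False
    for line in lines:
        match = COUNTS.search(line)
        if match:
            drives, hot, warm, cool = map(int, match.groups())
            all_cool = drives > 0 and cool == drives
        elif "MONITOR start" in line:
            all_cool = False  # reset at the start of each monitor cycle
        elif NOT_COOL in line and all_cool:
            # First three whitespace-separated fields are month, day, time.
            found.append(" ".join(line.split()[:3]))
    return found

log = [
    "Sep  7 14:45:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------",
    "Sep  7 14:45:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=0, warm=1, cool=2",
    "Sep  7 14:45:01 NAS parity.check.tuning.php: DEBUG: Array operation paused but drives not cooled enough to resume",
    "Sep  7 14:50:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------",
    "Sep  7 14:50:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=0, warm=0, cool=3",
    "Sep  7 14:50:01 NAS parity.check.tuning.php: DEBUG: Array operation paused but drives not cooled enough to resume",
]

print(contradictions(log))  # ['Sep 7 14:50:01']
```

The 14:45 cycle is not flagged because one drive is still warm. Only the 14:50 cycle, where all three drives are cool, is reported.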

 

Parity Check Tuning settings re-saved without changing any values, in the hope this might jump-start it (it didn't):

Sep  7 14:52:31 NAS ool www[2754]: /usr/local/emhttp/plugins/parity.check.tuning/parity.check.tuning.php 'updatecron'
Sep  7 14:52:31 NAS parity.check.tuning.php: DEBUG: -----------UPDATECRON start------
Sep  7 14:52:31 NAS parity.check.tuning.php: DEBUG: created cron entries for running increments
Sep  7 14:52:31 NAS parity.check.tuning.php: DEBUG: created cron entry for monitoring disk temperatures
Sep  7 14:52:31 NAS parity.check.tuning.php: TESTING: updated cron settings are in /boot/config/plugins/parity.check.tuning/parity.check.tuning.cron
Sep  7 14:52:31 NAS parity.check.tuning.php: DEBUG: -----------UPDATECRON end-------
Sep  7 14:55:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
Sep  7 14:55:01 NAS parity.check.tuning.php: DEBUG: Parity check appears to be paused
Sep  7 14:55:01 NAS parity.check.tuning.php: TESTING: parity temp=33 (settings are: hot=40, cool=37))
Sep  7 14:55:01 NAS parity.check.tuning.php: TESTING: disk1 temp=35 (settings are: hot=40, cool=37))
Sep  7 14:55:01 NAS parity.check.tuning.php: TESTING: disk2 temp=35 (settings are: hot=40, cool=37))
Sep  7 14:55:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=0, warm=0, cool=3
Sep  7 14:55:01 NAS parity.check.tuning.php: DEBUG: Array operation paused but drives not cooled enough to resume
Sep  7 14:55:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------
Sep  7 15:00:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
Sep  7 15:00:01 NAS parity.check.tuning.php: DEBUG: Parity check appears to be paused
Sep  7 15:00:01 NAS parity.check.tuning.php: TESTING: parity temp=33 (settings are: hot=40, cool=37))
Sep  7 15:00:01 NAS parity.check.tuning.php: TESTING: disk1 temp=35 (settings are: hot=40, cool=37))
Sep  7 15:00:01 NAS parity.check.tuning.php: TESTING: disk2 temp=35 (settings are: hot=40, cool=37))
Sep  7 15:00:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=0, warm=0, cool=3
Sep  7 15:00:01 NAS parity.check.tuning.php: DEBUG: Array operation paused but drives not cooled enough to resume
Sep  7 15:00:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------

 

Parity Check Tuning settings updated to lower "cool" by 2 (difference of 5 now between hot and cool); check still doesn't resume:

Sep  7 15:02:52 NAS ool www[19347]: /usr/local/emhttp/plugins/parity.check.tuning/parity.check.tuning.php 'updatecron'
Sep  7 15:02:52 NAS parity.check.tuning.php: DEBUG: -----------UPDATECRON start------
Sep  7 15:02:52 NAS parity.check.tuning.php: DEBUG: created cron entries for running increments
Sep  7 15:02:52 NAS parity.check.tuning.php: DEBUG: created cron entry for monitoring disk temperatures
Sep  7 15:02:52 NAS parity.check.tuning.php: TESTING: updated cron settings are in /boot/config/plugins/parity.check.tuning/parity.check.tuning.cron
Sep  7 15:02:52 NAS parity.check.tuning.php: DEBUG: -----------UPDATECRON end-------
Sep  7 15:05:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
Sep  7 15:05:01 NAS parity.check.tuning.php: DEBUG: Parity check appears to be paused
Sep  7 15:05:01 NAS parity.check.tuning.php: TESTING: parity temp=33 (settings are: hot=40, cool=35))
Sep  7 15:05:01 NAS parity.check.tuning.php: TESTING: disk1 temp=35 (settings are: hot=40, cool=35))
Sep  7 15:05:01 NAS parity.check.tuning.php: TESTING: disk2 temp=35 (settings are: hot=40, cool=35))
Sep  7 15:05:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=0, warm=0, cool=3
Sep  7 15:05:01 NAS parity.check.tuning.php: DEBUG: Array operation paused but drives not cooled enough to resume
Sep  7 15:05:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------

 

 

 

Link to post

The diagnostics posted allowed me to track down a bug in my code. The version of the plugin I have just uploaded should fix the case where a check that was paused because drives overheated did not resume once the drives had cooled sufficiently.

 

If any other anomalies are spotted please report them so I can get them fixed.

Link to post

Still happening for me. I've had to turn off overheat protection for the time being; I had the upper temperature set to 45C.

 

Sep 8 11:25:01 Tower parity.check.tuning.php: Paused Non-Correcting Parity Check (8.2% completed) : Following drives overheated: 39 38 35 26
Sep 8 11:25:01 Tower parity.check.tuning.php: DEBUG: written PAUSE (HOT) record to /boot/config/plugins/parity.check.tuning/parity.check.tuning.progress
Sep 8 11:25:01 Tower parity.check.tuning.php: DEBUG: -----------MDCMD start------
Sep 8 11:25:01 Tower parity.check.tuning.php: DEBUG: detected that mdcmd had been called from sh with command mdcmd nocheck PAUSE
Sep 8 11:25:01 Tower parity.check.tuning.php: DEBUG: -----------MDCMD end-------
Sep 8 11:25:01 Tower kernel: mdcmd (2687): nocheck PAUSE
Sep 8 11:25:01 Tower kernel:
Sep 8 11:25:01 Tower kernel: md: recovery thread: exit status: -4
Sep 8 11:25:01 Tower parity.check.tuning.php: DEBUG: Sent notification message: Non-Correcting Parity Check (8.2% completed) Pause
Sep 8 11:25:01 Tower parity.check.tuning.php: DEBUG: -----------MONITOR end-------
Sep 8 11:30:01 Tower parity.check.tuning.php: DEBUG: -----------MONITOR start------
Sep 8 11:30:01 Tower parity.check.tuning.php: DEBUG: Parity check appears to be paused
Sep 8 11:30:01 Tower parity.check.tuning.php: DEBUG: array drives=10, hot=4, warm=0, cool=6
Sep 8 11:30:01 Tower parity.check.tuning.php: DEBUG: Array operation paused with some drives still too hot to resume
Sep 8 11:30:01 Tower parity.check.tuning.php: DEBUG: -----------MONITOR end-------
Sep 8 11:35:01 Tower parity.check.tuning.php: DEBUG: -----------MONITOR start------
Sep 8 11:35:01 Tower parity.check.tuning.php: DEBUG: Parity check appears to be paused
Sep 8 11:35:01 Tower parity.check.tuning.php: DEBUG: array drives=10, hot=4, warm=0, cool=6
Sep 8 11:35:01 Tower parity.check.tuning.php: DEBUG: Array operation paused with some drives still too hot to resume
Sep 8 11:35:01 Tower parity.check.tuning.php: DEBUG: -----------MONITOR end-------
Sep 8 11:36:28 Tower ool www[11819]: /usr/local/emhttp/plugins/parity.check.tuning/parity.check.tuning.php 'updatecron'
Sep 8 11:36:28 Tower parity.check.tuning.php: DEBUG: -----------UPDATECRON start------
Sep 8 11:36:28 Tower parity.check.tuning.php: DEBUG: created cron entries for running increments
Sep 8 11:36:28 Tower parity.check.tuning.php: DEBUG: -----------UPDATECRON end-------
Sep 8 11:36:35 Tower kernel: mdcmd (2688): check Resume
Sep 8 11:36:35 Tower kernel: md: recovery thread: check P ...

 

Link to post

 

32 minutes ago, Spies said:

Still happening for me. I've had to turn off overheat protection for the time being; I had the upper temperature set to 45C.

Are you sure it's not user error? The latest update fixed it for me, thanks itimpi!

Sep  8 06:55:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
Sep  8 06:55:01 NAS parity.check.tuning.php: DEBUG: Parity check appears to be paused
Sep  8 06:55:01 NAS parity.check.tuning.php: TESTING: parity temp=32 (settings are: hot=40, cool=35))
Sep  8 06:55:01 NAS parity.check.tuning.php: TESTING: disk1 temp=36 (settings are: hot=40, cool=35))
Sep  8 06:55:01 NAS parity.check.tuning.php: TESTING: disk2 temp=35 (settings are: hot=40, cool=35))
Sep  8 06:55:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=0, warm=1, cool=2
Sep  8 06:55:01 NAS parity.check.tuning.php: DEBUG: Array operation paused but drives not cooled enough to resume
Sep  8 06:55:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR end-------
Sep  8 07:00:01 NAS parity.check.tuning.php: DEBUG: -----------MONITOR start------
Sep  8 07:00:01 NAS parity.check.tuning.php: DEBUG: Parity check appears to be paused
Sep  8 07:00:01 NAS parity.check.tuning.php: TESTING: parity temp=33 (settings are: hot=40, cool=35))
Sep  8 07:00:01 NAS parity.check.tuning.php: TESTING: disk1 temp=35 (settings are: hot=40, cool=35))
Sep  8 07:00:01 NAS parity.check.tuning.php: TESTING: disk2 temp=35 (settings are: hot=40, cool=35))
Sep  8 07:00:01 NAS parity.check.tuning.php: DEBUG: array drives=3, hot=0, warm=0, cool=3
Sep  8 07:00:01 NAS parity.check.tuning.php: Resumed Non-Correcting Parity Check  (77.7% completed)  as drives now cooled down
Sep  8 07:00:01 NAS root: Cache used space threshhold (75) not exceeded.  Used Space: 72.  Not moving files
Sep  8 07:00:01 NAS parity.check.tuning.php: DEBUG: written RESUME (COOL) record to  /boot/config/plugins/parity.check.tuning/parity.check.tuning.progress
Sep  8 07:00:02 NAS parity.check.tuning.php: DEBUG: -----------MDCMD start------
Sep  8 07:00:02 NAS parity.check.tuning.php: DEBUG: detected that mdcmd had been called from sh with command mdcmd check RESUME
Sep  8 07:00:02 NAS parity.check.tuning.php: DEBUG: -----------MDCMD end-------
Sep  8 07:00:02 NAS kernel: mdcmd (236): check RESUME
Sep  8 07:00:02 NAS kernel:

 

Link to post
1 hour ago, Spies said:

Still happening for me. I've had to turn off overheat protection for the time being; I had the upper temperature set to 45C.

 



 

That log snippet shows that you still have drives that are too hot to resume.  A Resume needs all the drives to be registered as 'cool'.

Link to post
50 minutes ago, itimpi said:

That log snippet shows that you still have drives that are too hot to resume.  A Resume needs all the drives to be registered as 'cool'.

They're all under 45c (which was my hot threshold).

 

Why am I not seeing the testing portion in the log?

Edited by Spies
Link to post
27 minutes ago, Spies said:

They're all under 45c (which was my hot threshold).

They all need to be "cool" which is determined by the Resume setting, not the Pause setting ("hot")

30 minutes ago, Spies said:

Why am I not seeing the testing portion in the log?

Set debug logging to "Testing"

Link to post
4 hours ago, CS01-HS said:

They all need to be "cool" which is determined by the Resume setting, not the Pause setting ("hot")

Set debug logging to "Testing"

 

Sep 8 18:20:02 Tower parity.check.tuning.php: DEBUG: -----------MONITOR start------
Sep 8 18:20:02 Tower parity.check.tuning.php: DEBUG: Parity check appears to be paused
Sep 8 18:20:02 Tower parity.check.tuning.php: TESTING: parity temp=26 (settings are: hot=0, cool=6))
Sep 8 18:20:02 Tower parity.check.tuning.php: TESTING: disk1 temp=35 (settings are: hot=0, cool=6))
Sep 8 18:20:02 Tower parity.check.tuning.php: TESTING: disk2 temp=31 (settings are: hot=0, cool=6))
Sep 8 18:20:02 Tower parity.check.tuning.php: TESTING: disk3 temp=* (settings are: hot=0, cool=6))
Sep 8 18:20:02 Tower parity.check.tuning.php: TESTING: disk4 temp=32 (settings are: hot=0, cool=6))
Sep 8 18:20:02 Tower parity.check.tuning.php: TESTING: disk5 temp=30 (settings are: hot=0, cool=6))
Sep 8 18:20:02 Tower parity.check.tuning.php: TESTING: disk6 temp=28 (settings are: hot=0, cool=6))
Sep 8 18:20:02 Tower parity.check.tuning.php: TESTING: disk7 temp=30 (settings are: hot=0, cool=6))
Sep 8 18:20:02 Tower parity.check.tuning.php: TESTING: disk8 temp=* (settings are: hot=0, cool=6))
Sep 8 18:20:02 Tower parity.check.tuning.php: TESTING: disk9 temp=27 (settings are: hot=0, cool=6))
Sep 8 18:20:02 Tower parity.check.tuning.php: DEBUG: array drives=10, hot=8, warm=0, cool=2
Sep 8 18:20:02 Tower parity.check.tuning.php: DEBUG: Array operation paused with some drives still too hot to resume
Sep 8 18:20:02 Tower parity.check.tuning.php: DEBUG: -----------MONITOR end-------

Drives with * are spun down due to inactivity.
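For anyone comparing their own logs against the ones posted here, the TESTING lines can be pulled apart with a short script. This is only a sketch based on the line format visible in this thread; the regular expression and the function name `parse_testing_line` are my own and are not part of the plugin. A temperature of `*` (spun-down drive) is returned as `None`.

```python
import re

# Pattern inferred from the TESTING lines quoted in this thread;
# it is an assumption that other plugin versions use the same wording.
TESTING_RE = re.compile(
    r"TESTING: (?P<disk>\S+) temp=(?P<temp>\*|\d+) "
    r"\(settings are: hot=(?P<hot>\d+), cool=(?P<cool>\d+)\)"
)


def parse_testing_line(line):
    """Return (disk, temp, hot, cool), or None if the line does not match.

    temp is None when the log shows '*' (no temperature reported).
    """
    match = TESTING_RE.search(line)
    if match is None:
        return None
    temp = None if match["temp"] == "*" else int(match["temp"])
    return match["disk"], temp, int(match["hot"]), int(match["cool"])


if __name__ == "__main__":
    sample = ("Sep 8 18:20:02 Tower parity.check.tuning.php: TESTING: "
              "disk3 temp=* (settings are: hot=0, cool=6))")
    print(parse_testing_line(sample))
```

Printing the hot and cool values for each drive makes it easy to spot thresholds that look wrong, such as very small numbers.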

 

It worked in the past just fine.

 

My settings are pause at 45C and resume at 39C.

Edited by Spies
Link to post

My hot/cool:

Sep  8 06:55:01 NAS parity.check.tuning.php: TESTING: parity temp=32 (settings are: hot=40, cool=35))

and yours

Sep 8 18:20:02 Tower parity.check.tuning.php: TESTING: parity temp=26 (settings are: hot=0, cool=6))

are very different.

 

Are you maybe setting absolute temperatures on the plugin page?

They should be relative:

[Screenshot of the plugin settings]

 

Edited by CS01-HS
Link to post

I have been trying to work out how those messages are being generated.   I expected them to look like those reported by @CS01-HS, so I am going to add a log message that shows the plugin settings before listing each drive, to confirm that each drive is being categorised correctly.

 

FYI:   Any drive that does not currently report a temperature is automatically treated as falling into the ‘cool’ category.
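To make the categorisation concrete, here is a rough sketch of the rule as described in this thread: a resume needs every drive to be 'cool', and a drive with no temperature counts as 'cool'. The function names and the exact comparisons (`>=` for hot, `<=` for cool) are assumptions chosen to match the log output above; the plugin's actual code may differ.

```python
from collections import Counter


def classify(temp, hot, cool):
    """Classify one drive as 'hot', 'warm' or 'cool'.

    temp is None for a drive that reports no temperature (shown as '*'
    in the log); per the thread such a drive is treated as 'cool'.
    The boundary comparisons are assumptions, not taken from the plugin.
    """
    if temp is None:
        return "cool"
    if temp >= hot:
        return "hot"
    if temp <= cool:
        return "cool"
    return "warm"


def can_resume(temps, hot, cool):
    """A paused operation may resume only when every drive is 'cool'."""
    return all(classify(t, hot, cool) == "cool" for t in temps)


if __name__ == "__main__":
    # Temperatures from the 18:20:02 log above; None marks the '*' entries.
    temps = [26, 35, 31, None, 32, 30, 28, 30, None, 27]
    print(Counter(classify(t, hot=0, cool=6) for t in temps))
    print(can_resume(temps, hot=0, cool=6))
```

With hot=0 and cool=6, every drive that reports a temperature lands in 'hot', which matches the "hot=8, warm=0, cool=2" summary line in that log.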

Link to post
3 hours ago, CS01-HS said:

My hot/cool:


Sep  8 06:55:01 NAS parity.check.tuning.php: TESTING: parity temp=32 (settings are: hot=40, cool=35))

and yours


Sep 8 18:20:02 Tower parity.check.tuning.php: TESTING: parity temp=26 (settings are: hot=0, cool=6))

are very different.

 

Are you maybe setting absolute temperatures on the plugin page?

They should be relative:

[Screenshot of the plugin settings]

 

Interesting. Is relative temperature a new addition? As I said, this used to work fine. I'll try changing the values and see if things start working as they should.

Link to post
36 minutes ago, Spies said:

Interesting. Is relative temperature a new addition? As I said, this used to work fine. I'll try changing the values and see if things start working as they should.

It has always been specified that way.    Bugs have been fixed in the resume code where it was not working as intended, so maybe at some point it was working for you more by accident than by design.

Link to post
10 hours ago, itimpi said:

It has always been specified that way.    Bugs have been fixed in the resume code where it was not working as intended, so maybe at some point it was working for you more by accident than by design.

Ah, haha, well at least it's set up correctly now!

Link to post
  • 1 month later...

An option to shut down the server if any array or cache drive reaches a threshold you define has been added.   If set, this functions independently of whether any array or cache operation is active.  The prime use case is protecting your drives if your Unraid server's cooling fails for any reason.
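As an illustration only, the check described above amounts to something like the following. The function name `should_shut_down`, the use of `>=` for "reaches", and skipping drives that report no temperature are all assumptions made for this sketch and are not taken from the plugin's code.

```python
def should_shut_down(drive_temps, shutdown_threshold):
    """Return True if any drive has reached the shutdown threshold.

    drive_temps maps a drive name to its temperature, or to None when
    the drive reports no temperature (assumed here to be ignored).
    The check does not look at whether an array operation is running.
    """
    return any(
        temp is not None and temp >= shutdown_threshold
        for temp in drive_temps.values()
    )


if __name__ == "__main__":
    # Hypothetical readings and threshold, for illustration only.
    temps = {"parity": 41, "disk1": 38, "cache": None}
    print(should_shut_down(temps, shutdown_threshold=55))
```

The point of the sketch is that the shutdown test looks only at drive temperatures, so it can trigger even when no parity check is in progress.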

Link to post
On 10/23/2020 at 3:36 AM, itimpi said:

An option to shut down the server if any array or cache drive reaches a threshold you define has been added.   If set, this functions independently of whether any array or cache operation is active.  The prime use case is protecting your drives if your Unraid server's cooling fails for any reason.

Suggestion - have this default to "No" rather than "Yes."

Reason - grumpy text message from wife asking why Plex wasn't working :) I do like the feature though! I just had to tweak some settings.

Link to post
7 minutes ago, ClunkClunk said:

Suggestion - have this default to "No" rather than "Yes."

Reason - grumpy text message from wife asking why Plex wasn't working :) I do like the feature though! I just had to tweak some settings.

I thought it did :) (and was certainly meant to).   I’ll check this out as you are almost certainly right if it did not default correctly for you.

Link to post

I have the following debug messages every five minutes in my log:

Quote

parity.check.tuning.php: DEBUG: -----------MONITOR start------
parity.check.tuning.php: DEBUG: No drives appear to have reached shutdown threshold   
parity.check.tuning.php: DEBUG: No array operation currently in progress
DeathStar parity.check.tuning.php: DEBUG: -----------MONITOR end-------

I did verify that I have logging disabled.

[Screenshot of the plugin settings]

This started after the most recent update - 2020.10.23

Link to post

I can confirm that debug level logging is active when it should not be. 

There is another buglet: for existing users of the plugin, the new Shutdown option defaults to enabled rather than disabled (the default is correct for new users). You might want to check this, as that logging shows the option is active, which you may not have intended.

 

I will get both of these issues addressed and an update issued later today.

Link to post
