
parity check completion notification


ljm42


I have a small request... right now a notification is sent when a parity check starts; can we send one when it is done too?  Ideally it would contain all of the same information as the standard "array health report".  I'd basically like confirmation that it completed, and a record of what temp all the drives were at when it finished.

 

Thanks for considering it!


The current implementation sends a warning notification when the parity check starts, and a notice or warning notification (depending on the result) when the parity check finishes.

 

You are not receiving both notifications?

 


I'm not getting both, but I don't have email ticked for notifications, so that probably explains it.  I'm not aware of having turned that off, though, so I wonder if it is that way by default?  I've often wondered the same about a completion email.

I think email notifications are probably off by default.  After all, until you have set up the details of an email account, there is nowhere to send such notifications.

Correct. To make email notifications work, you need to configure the SMTP settings and enable email notifications where desired.

 


Ah, I had "notices" disabled because they were mostly noise.  But a simple "parity check complete" notification is only part of what I'd like to know...

 

 

On parity check day I get the following notifications:

 

The night before, a status report at 12:20 AM saying all is well:

 

Event: unRAID Status
Subject: Notice [TOWER] - array health report [PASS]
Description: Array has 5 disks (including parity & cache)
Importance: normal

Parity - ST4000VN000-1H4168_Z300T071 (sdl) - standby [OK]
Disk 1 - ST4000VN000-1H4168_Z300T01R (sdi) - standby [OK]
Disk 2 - ST4000VN000-1H4168_Z300T009 (sdj) - standby [OK]
Disk 3 - ST4000VN000-1H4168_Z30139M5 (sdk) - standby [OK]
Cache - Samsung_SSD_850_PRO_512GB_S1SXNSAF920020W (sdh) - active 31 C [OK]

Parity is valid
Last checked on Friday, September  2, 2016, 06:12 AM (29 days ago), finding 0 errors.
Duration: 8 hours, 12 minutes, 37 seconds. Average speed: 135.4 MB/s

 

Then a "parity check started" warning at 10:00 pm:

 

Event: unRAID Parity check
Subject: Notice [TOWER] - Parity check started
Description: Size: 4 TB
Importance: warning

 

Then a series of warnings as each disk goes hot:

 

Event: unRAID Parity disk temperature
Subject: Warning [TOWER] - Parity disk is hot (48 C)
Description: ST4000VN000-1H4168_Z300T071 (sdl)
Importance: warning

 

Then at 12:20 am, a failed status check:

 

Event: unRAID Status
Subject: Notice [TOWER] - array health report [FAIL]
Description: Array has 5 disks (including parity & cache)
Importance: warning

Parity - ST4000VN000-1H4168_Z300T071 (sdl) - active 48 C (disk is hot) [NOK]
Disk 1 - ST4000VN000-1H4168_Z300T01R (sdi) - active 48 C (disk is hot) [NOK]
Disk 2 - ST4000VN000-1H4168_Z300T009 (sdj) - active 50 C (disk is hot) [NOK]
Disk 3 - ST4000VN000-1H4168_Z30139M5 (sdk) - active 49 C (disk is hot) [NOK]
Cache - Samsung_SSD_850_PRO_512GB_S1SXNSAF920020W (sdh) - active 33 C [OK]

Parity check in progress.
Total size: 4 TB
Elapsed time: 2 hours, 20 minutes
Current position: 1.39 TB (34.7 %)
Estimated speed: 162.3 MB/sec
Estimated finish: 4 hours, 28 minutes
Sync errors corrected: 0
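
As a side note, the "Estimated finish" figure in that report looks like plain arithmetic: remaining bytes divided by the current speed. Here is a minimal sketch of that calculation, assuming decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and using the rounded values from the email. The function names are made up for illustration; this is not how unRAID itself computes it.

```python
def estimate_finish_seconds(total_bytes, position_bytes, speed_bytes_per_s):
    """Seconds left if the check keeps running at the given speed."""
    if speed_bytes_per_s <= 0:
        raise ValueError("speed must be positive")
    return (total_bytes - position_bytes) / speed_bytes_per_s


def format_duration(seconds):
    """Render a duration as 'H hours, M minutes', dropping leftover seconds."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes = remainder // 60
    return f"{hours} hours, {minutes} minutes"


TB = 10**12  # assumption: decimal terabytes, as drive vendors use
MB = 10**6   # assumption: decimal megabytes

# Values copied from the report above: 4 TB total, 1.39 TB done, 162.3 MB/sec.
remaining = estimate_finish_seconds(4 * TB, 1.39 * TB, 162.3 * MB)
print(format_duration(remaining))  # 4 hours, 28 minutes
```

The estimate only holds if the speed stays constant, and it usually drops toward the inner tracks of the disks, which would explain why the check actually took longer than the report predicted.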

 

And then nothing until the following night at 12:20 am, when the status check confirms it finished and temps are back to normal:

 

Event: unRAID Status
Subject: Notice [TOWER] - array health report [PASS]
Description: Array has 5 disks (including parity & cache)
Importance: normal

Parity - ST4000VN000-1H4168_Z300T071 (sdl) - standby [OK]
Disk 1 - ST4000VN000-1H4168_Z300T01R (sdi) - active 34 C [OK]
Disk 2 - ST4000VN000-1H4168_Z300T009 (sdj) - standby [OK]
Disk 3 - ST4000VN000-1H4168_Z30139M5 (sdk) - standby [OK]
Cache - Samsung_SSD_850_PRO_512GB_S1SXNSAF920020W (sdh) - active 30 C [OK]

Parity is valid
Last checked on Sunday, October  2, 2016, 06:39 AM (yesterday), finding 0 errors.
Duration: 8 hours, 39 minutes, 35 seconds. Average speed: 128.3 MB/s

 

Unfortunately, I only know the temp of the disks when they first cross the threshold and whatever they happened to be at 12:20 am. I have no idea how hot they ultimately got during the parity check.

 

What I would really like to see is another "unRAID status" message when the parity check completes.  That would both confirm it was done, and show me the temp of the disks at the end of the run.  For people whose disks are all the same size, that would pretty much show the max temps the drives reached.

 

It wouldn't be quite as helpful for people with drives of different sizes, since some of them would have spun down before the end.  It would be more work, but to really take care of those people the system could track the max temp each drive reached during the parity check and report *that* temp in the final email instead of the current temp.
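
To make the max-temp idea concrete, here is a rough sketch of the bookkeeping it would need. It assumes something polls the drives periodically during the check and hands over one reading per disk, with `None` for a disk that is in standby. The function names, the sample format, and the 45 C threshold are all placeholders of mine, not anything taken from unRAID.

```python
from typing import Dict, Iterable, Optional


def track_max_temps(samples: Iterable[Dict[str, Optional[int]]]) -> Dict[str, int]:
    """Return the highest temperature seen per disk across all samples.

    Each sample maps a disk name to its temperature in Celsius, or None
    when the disk is spun down and reports no temperature.
    """
    max_temps: Dict[str, int] = {}
    for sample in samples:
        for disk, temp in sample.items():
            if temp is None:
                continue  # standby disk: keep whatever maximum it already had
            if disk not in max_temps or temp > max_temps[disk]:
                max_temps[disk] = temp
    return max_temps


def format_report(max_temps: Dict[str, int], hot_threshold: int = 45) -> str:
    """Build report lines in roughly the style of the health report."""
    lines = []
    for disk, temp in max_temps.items():
        status = "NOK" if temp >= hot_threshold else "OK"  # threshold is an assumption
        lines.append(f"{disk} - max {temp} C [{status}]")
    return "\n".join(lines)


# Two readings taken from the emails above: mid-check, then after spin-down.
samples = [
    {"Parity": 48, "Disk 1": 48, "Disk 2": 50, "Disk 3": 49, "Cache": 33},
    {"Parity": None, "Disk 1": 34, "Disk 2": None, "Disk 3": None, "Cache": 30},
]
print(format_report(track_max_temps(samples)))
```

Because a spun-down disk is simply skipped, a smaller drive that finished early would still show the peak it reached while it was being read, which is the case the current-temp report would miss.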

 

Parity checks stress our systems on a regular basis, and this would give us more insight into how well they can take it.


Archived

This topic is now archived and is closed to further replies.
