goinsnoopin Posted March 26

During a storm we lost power. I have a UPS, but for some reason it did not shut Unraid down cleanly like it has in the past. On reboot this triggered a parity check. My parity check runs over several days because of the hours I restrict this activity to. It is currently about 90% complete and there are 25,752 sync errors. On my monthly scheduled parity checks I set corrections to No; I am not sure what the setting is for a check triggered by an unclean shutdown. Logs are attached. Any suggestions on how to proceed? Should I cancel the balance of the parity check?

tower-diagnostics-20240326-1000.zip
trurl Posted March 26

Syslog says it is correcting, but I think the parity check after an unclean shutdown is supposed to be non-correcting. I don't know whether the plugin affects that or not. That seems like more sync errors than I would expect from an unclean shutdown. Did previous parity checks show zero sync errors?
goinsnoopin Posted March 26

Yes, I just double-checked the history: monthly parity checks for the last year have all been 0. I saw that in the logs and was concerned as well. It was an ice storm and the power went on and off several times in a 5 hour window. So it's possible the UPS battery ran down on the first outage and got minimal charge before the next outage. Unfortunately I was not home, so I am going by what my kids told me.

Dan
trurl Posted March 26

The UPS should only be on battery long enough to get past a brief outage; if power doesn't come back very soon, shut down. You don't want to run on UPS, only shut down on UPS.
goinsnoopin Posted March 26

I realize that, and have the settings so it does a shutdown with 5 minutes remaining on battery. I think the issue was that the server was brought back up after utility power had been on for an hour, and then the UPS settings shut it down again with 5 minutes remaining on the UPS. This cycle repeated itself a couple of times. If I had been home, I would have just left the server off. Any suggestions? Should I cancel the parity check? It will start again at midnight.
trurl Posted March 26

All your array data disks are mounted and no I/O errors logged. Your docker.img has corruption so you will probably have to recreate and reinstall, but that isn't related to parity. Since parity has none of your data, might as well let it finish correcting sync errors.
goinsnoopin Posted March 27

@trurl The parity check completed and I got an email when it finished indicating that there were 0 errors?? I have attached current diagnostics. I also attached a screenshot of the parity history that shows the zero errors and the sync errors that were corrected. There was also a second email that read as follows (what is the error code -4 listed after the sync errors?):

Event: Unraid Status
Subject: Notice [TOWER] - array health report [PASS]
Description: Array has 9 disks (including parity & pools)
Importance: normal

Parity - WDC_WD140EDGZ-11B2DA2_3GKH2J1F (sdk) - active 32 C [OK]
Parity 2 - WDC_WD140EDFZ-11A0VA0_9LG37YDA (sdm) - active 32 C [OK]
Disk 1 - WDC_WD120EMFZ-11A6JA0_QGKYB4RT (sdc) - active 32 C [OK]
Disk 2 - WDC_WD20EFRX-68EUZN0_WD-WMC4M1062491 (sdf) - standby [OK]
Disk 3 - WDC_WD40EFRX-68N32N0_WD-WCC7K4PLUR7A (sdi) - active 28 C [OK]
Disk 4 - WDC_WD30EFRX-68EUZN0_WD-WCC4NEUA5L20 (sdd) - standby [OK]
Disk 5 - WDC_WD30EFRX-68EUZN0_WD-WMC4N0M6V0HC (sdj) - standby [OK]
Disk 6 - WDC_WD40EFRX-68N32N0_WD-WCC7K3PZZ7Y7 (sdh) - standby [OK]
Cache - Samsung_SSD_860_EVO_500GB_S598NJ0NA53226M (sde) - active 37 C [OK]

Last check incomplete on Tue 26 Mar 2024 06:30:01 AM EDT (yesterday), finding 25752 errors.
Error code: -4

tower-diagnostics-20240327-1952.zip
trurl Posted March 28

53 minutes ago, goinsnoopin said:
    Last check incomplete on Tue 26 Mar 2024 06:30:01 AM EDT (yesterday), finding 25752 errors.
    Error code: -4

That seems to correspond with when parity check was paused.

    Mar 26 06:30:01 Tower kernel: mdcmd (39): nocheck pause
    Mar 26 06:30:01 Tower kernel:
    Mar 26 06:30:01 Tower kernel: md: recovery thread: exit status: -4

Later on syslog says

    Mar 27 00:30:01 Tower kernel: md: recovery thread: check P Q ...
    ...
    Mar 27 03:57:12 Tower kernel: md: sync done. time=12431sec
    Mar 27 03:57:12 Tower kernel: md: recovery thread: exit status: 0
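For anyone who wants to check this in their own diagnostics, here is a minimal Python sketch that pulls the recovery-thread exit statuses out of syslog text and measures the last resumed segment. It only relies on the message formats quoted above; the function names are made up for illustration, the year is an assumption (syslog lines carry no year; 2024 is taken from the notification email), and reading -4 as "paused" is an inference from the matching timestamps, not documented behavior.

```python
import re
from datetime import datetime

# Sample lines copied from the syslog excerpts quoted in this thread.
SYSLOG = """\
Mar 26 06:30:01 Tower kernel: mdcmd (39): nocheck pause
Mar 26 06:30:01 Tower kernel: md: recovery thread: exit status: -4
Mar 27 00:30:01 Tower kernel: md: recovery thread: check P Q ...
Mar 27 03:57:12 Tower kernel: md: sync done. time=12431sec
Mar 27 03:57:12 Tower kernel: md: recovery thread: exit status: 0
"""

LINE_RE = re.compile(r"^(\w{3} +\d+ \d{2}:\d{2}:\d{2}) \S+ kernel: (.*)$")
EXIT_RE = re.compile(r"md: recovery thread: exit status: (-?\d+)")
DONE_RE = re.compile(r"md: sync done\. time=(\d+)sec")


def parse(text, year=2024):
    """Return (timestamp, message) pairs for kernel lines.

    The year is an assumption: syslog timestamps do not include one.
    """
    events = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue
        stamp = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
        events.append((stamp, m.group(2)))
    return events


def exit_statuses(events):
    """Return (timestamp, status) for every recovery-thread exit line."""
    found = []
    for stamp, msg in events:
        m = EXIT_RE.search(msg)
        if m:
            found.append((stamp, int(m.group(1))))
    return found


def resumed_segment_seconds(events):
    """Seconds from the most recent 'check' start to the next 'sync done'."""
    start = None
    for stamp, msg in events:
        if msg.startswith("md: recovery thread: check"):
            start = stamp
        elif start is not None and DONE_RE.search(msg):
            return int((stamp - start).total_seconds())
    return None


if __name__ == "__main__":
    events = parse(SYSLOG)
    for stamp, status in exit_statuses(events):
        print(stamp, "exit status", status)
    print("resumed segment seconds:", resumed_segment_seconds(events))
```

On the lines quoted above, 00:30:01 to 03:57:12 is 12431 seconds, the same figure the kernel logged as time=12431sec, so that value appears to cover only the final resumed segment rather than the whole multi-day check.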