-C- (Author) Posted January 12, 2023
Sync of parity 2 has now completed without error. I find it worrying that Firefox has had issues with the Unraid GUI since at least v6.11.1, yet there is no message warning Firefox users of the potential problems or that it's safest to avoid the browser. I've spent many hours on this and have avoided using my server since building it nearly a month ago, until this was resolved. I know Firefox is a marginal browser nowadays, but I'd imagine the percentage of Firefox users among the Unraid customer base is higher than it is globally. There must be a fair few people like me experiencing weird issues, and the last thing you'd imagine in 2022/3 is that the browser you're using could cause technical issues with your server's functionality.
-C- (Author) Posted March 11, 2023
Over the last couple of months I've shut the server down twice. Both times a parity check has auto-started on startup with a message about an unclean shutdown. Each time I stopped the automatically started sync (as I'd had trouble with that before) and started a correcting check again using Edge. The first time the check completed without error, but I've done the same thing again and I'm back to where I was when I started the check with Firefox:
Mar 10 18:06:11 Tower Parity Check Tuning: DEBUG: Manual Correcting Parity-Check running
Mar 10 18:09:37 Tower kernel: md: recovery thread: P corrected, sector=39063584664
Mar 10 18:09:37 Tower kernel: md: recovery thread: P corrected, sector=39063584696
Mar 10 18:09:37 Tower kernel: md: sync done. time=148148sec
Mar 10 18:09:37 Tower kernel: md: recovery thread: exit status: 0
These are the same errors on the same sectors as before. I'm at a loss as to what to do now; it appears not to be related to the browser I'm using. Is there anything else I can try?
A new issue has also appeared: when I click the History button under Array Operations I get a blank box overlaid with no means of closing it, and I have to go to another page and back to view the Main page again. This happens in both Firefox and Edge.
JorgeB Posted March 11, 2023
Unraid will save diagnostics to the /logs folder on the flash drive if it cannot do a clean shutdown. Post those; they might show why that is happening.
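For reference, the flash drive is mounted at /boot, so those saved diagnostics can be pulled from the console along these lines (a minimal sketch; the zip filename shown is a placeholder and will differ on your system):
    # list any diagnostics zips saved after unclean shutdowns, newest first
    ls -lt /boot/logs/
    # copy the newest one to a user share so it can be downloaded and attached to a post
    cp /boot/logs/tower-diagnostics-YYYYMMDD-HHMM.zip /mnt/user/<some-share>/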
itimpi Posted March 11, 2023
11 hours ago, -C- said: A new issue has also appeared: when I click the History button under Array Operations I get a blank box overlaid with no means of closing it, and I have to go to another page and back to view the Main page again. This happens in both Firefox and Edge.
I see you have the Parity Check Tuning plugin installed. This plugin replaces the built-in code for displaying history, so the blank dialog box could be down to a problem there. To let me check whether I can reproduce your symptoms, could you let me have:
- the version number of Unraid you are using
- the version number of the plugin you are using
- a copy of the config/parity-checks.log file from the flash drive
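If it helps, all three of those can be grabbed from the console in one go. A rough sketch, assuming the plugin's .plg file is named parity.check.tuning.plg (the exact filename may differ):
    cat /etc/unraid-version                                            # Unraid OS version
    grep -m1 -i version /boot/config/plugins/parity.check.tuning.plg   # plugin version string (path assumed)
    cp /boot/config/parity-checks.log /mnt/user/<some-share>/          # copy the history file off the flash drive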
-C- Posted March 11, 2023 Author Share Posted March 11, 2023 4 hours ago, JorgeB said: Unraid will save diags on the flash drive /logs folder if it cannot do a clean shutdown, post those, they might show why that is happening. Here is the log from the shutdown signal to the last entry before it shut down: Mar 7 14:43:38 Tower shutdown[8597]: shutting down for system halt Mar 7 14:43:38 Tower init: Switching to runlevel: 0 Mar 7 14:43:38 Tower flash_backup: stop watching for file changes Mar 7 14:43:38 Tower init: Trying to re-exec init Mar 7 14:43:59 Tower Parity Check Tuning: DEBUG: Array stopping Mar 7 14:43:59 Tower Parity Check Tuning: DEBUG: No array operation in progress so no restart information saved Mar 7 14:43:59 Tower kernel: mdcmd (36): nocheck cancel Mar 7 14:44:00 Tower emhttpd: Spinning up all drives... Mar 7 14:44:00 Tower emhttpd: spinning up /dev/sdh Mar 7 14:44:00 Tower emhttpd: spinning up /dev/sdg Mar 7 14:44:00 Tower emhttpd: spinning up /dev/sdd Mar 7 14:44:00 Tower emhttpd: spinning up /dev/sde Mar 7 14:44:00 Tower emhttpd: spinning up /dev/sdf Mar 7 14:44:00 Tower emhttpd: spinning up /dev/sdi Mar 7 14:44:00 Tower emhttpd: spinning up /dev/sda Mar 7 14:44:17 Tower kernel: ata5: SATA link up 6.0 Gbps (SStatus 133 SControl 300) Mar 7 14:44:17 Tower kernel: ata5.00: configured for UDMA/133 Mar 7 14:44:17 Tower emhttpd: sdspin /dev/sdh up: 1 Mar 7 14:44:17 Tower emhttpd: read SMART /dev/sdj Mar 7 14:44:17 Tower emhttpd: read SMART /dev/sdk Mar 7 14:44:17 Tower emhttpd: read SMART /dev/sdh Mar 7 14:44:17 Tower emhttpd: read SMART /dev/sdg Mar 7 14:44:17 Tower emhttpd: read SMART /dev/sdd Mar 7 14:44:17 Tower emhttpd: read SMART /dev/sde Mar 7 14:44:17 Tower emhttpd: read SMART /dev/sdb Mar 7 14:44:17 Tower emhttpd: read SMART /dev/sdf Mar 7 14:44:17 Tower emhttpd: read SMART /dev/nvme0n1 Mar 7 14:44:17 Tower emhttpd: read SMART /dev/nvme1n1 Mar 7 14:44:17 Tower emhttpd: read SMART /dev/sdi Mar 7 14:44:17 Tower emhttpd: read SMART /dev/sda Mar 7 14:44:17 Tower emhttpd: Stopping services... Mar 7 14:44:38 Tower emhttpd: shcmd (9923955): /etc/rc.d/rc.docker stop Mar 7 14:44:39 Tower kernel: docker0: port 9(vethb92db4c) entered disabled state Mar 7 14:44:39 Tower kernel: vetha796224: renamed from eth0 Mar 7 14:44:39 Tower avahi-daemon[10171]: Interface vethb92db4c.IPv6 no longer relevant for mDNS. Mar 7 14:44:39 Tower avahi-daemon[10171]: Leaving mDNS multicast group on interface vethb92db4c.IPv6 with address fe80::84b5:c3ff:fe35:1c52. Mar 7 14:44:39 Tower kernel: docker0: port 9(vethb92db4c) entered disabled state Mar 7 14:44:39 Tower kernel: device vethb92db4c left promiscuous mode Mar 7 14:44:39 Tower kernel: docker0: port 9(vethb92db4c) entered disabled state Mar 7 14:44:39 Tower avahi-daemon[10171]: Withdrawing address record for fe80::84b5:c3ff:fe35:1c52 on vethb92db4c. Mar 7 14:44:39 Tower kernel: veth520a485: renamed from eth0 Mar 7 14:44:39 Tower kernel: docker0: port 6(vethc2c8bcf) entered disabled state Mar 7 14:44:39 Tower avahi-daemon[10171]: Interface vethc2c8bcf.IPv6 no longer relevant for mDNS. Mar 7 14:44:39 Tower avahi-daemon[10171]: Leaving mDNS multicast group on interface vethc2c8bcf.IPv6 with address fe80::f0eb:9cff:fe48:b5f0. 
Mar 7 14:44:39 Tower kernel: docker0: port 6(vethc2c8bcf) entered disabled state Mar 7 14:44:39 Tower kernel: device vethc2c8bcf left promiscuous mode Mar 7 14:44:39 Tower kernel: docker0: port 6(vethc2c8bcf) entered disabled state Mar 7 14:44:39 Tower avahi-daemon[10171]: Withdrawing address record for fe80::f0eb:9cff:fe48:b5f0 on vethc2c8bcf. Mar 7 14:44:39 Tower kernel: veth359095c: renamed from eth0 Mar 7 14:44:39 Tower kernel: docker0: port 1(veth11635d1) entered disabled state Mar 7 14:44:39 Tower avahi-daemon[10171]: Interface veth11635d1.IPv6 no longer relevant for mDNS. Mar 7 14:44:39 Tower avahi-daemon[10171]: Leaving mDNS multicast group on interface veth11635d1.IPv6 with address fe80::c8d0:34ff:fe40:b86c. Mar 7 14:44:39 Tower kernel: docker0: port 1(veth11635d1) entered disabled state Mar 7 14:44:39 Tower kernel: device veth11635d1 left promiscuous mode Mar 7 14:44:39 Tower kernel: docker0: port 1(veth11635d1) entered disabled state Mar 7 14:44:39 Tower avahi-daemon[10171]: Withdrawing address record for fe80::c8d0:34ff:fe40:b86c on veth11635d1. Mar 7 14:44:43 Tower kernel: docker0: port 8(vethba1f846) entered disabled state Mar 7 14:44:43 Tower kernel: veth39aff71: renamed from eth0 Mar 7 14:44:43 Tower avahi-daemon[10171]: Interface vethba1f846.IPv6 no longer relevant for mDNS. Mar 7 14:44:43 Tower avahi-daemon[10171]: Leaving mDNS multicast group on interface vethba1f846.IPv6 with address fe80::d40f:c0ff:fe86:60e8. Mar 7 14:44:43 Tower kernel: docker0: port 8(vethba1f846) entered disabled state Mar 7 14:44:43 Tower kernel: device vethba1f846 left promiscuous mode Mar 7 14:44:43 Tower kernel: docker0: port 8(vethba1f846) entered disabled state Mar 7 14:44:43 Tower avahi-daemon[10171]: Withdrawing address record for fe80::d40f:c0ff:fe86:60e8 on vethba1f846. Mar 7 14:44:43 Tower kernel: docker0: port 2(vethb13e418) entered disabled state Mar 7 14:44:43 Tower kernel: veth8acea87: renamed from eth0 Mar 7 14:44:43 Tower kernel: veth82bed5c: renamed from eth0 Mar 7 14:44:43 Tower kernel: docker0: port 5(veth59668a6) entered disabled state Mar 7 14:44:43 Tower avahi-daemon[10171]: Interface vethb13e418.IPv6 no longer relevant for mDNS. Mar 7 14:44:43 Tower avahi-daemon[10171]: Leaving mDNS multicast group on interface vethb13e418.IPv6 with address fe80::88b4:78ff:fe8f:4348. Mar 7 14:44:43 Tower kernel: docker0: port 2(vethb13e418) entered disabled state Mar 7 14:44:43 Tower kernel: device vethb13e418 left promiscuous mode Mar 7 14:44:43 Tower kernel: docker0: port 2(vethb13e418) entered disabled state Mar 7 14:44:43 Tower avahi-daemon[10171]: Withdrawing address record for fe80::88b4:78ff:fe8f:4348 on vethb13e418. Mar 7 14:44:43 Tower avahi-daemon[10171]: Interface veth59668a6.IPv6 no longer relevant for mDNS. Mar 7 14:44:43 Tower avahi-daemon[10171]: Leaving mDNS multicast group on interface veth59668a6.IPv6 with address fe80::5c8f:c0ff:fe00:838. Mar 7 14:44:43 Tower kernel: docker0: port 5(veth59668a6) entered disabled state Mar 7 14:44:43 Tower kernel: device veth59668a6 left promiscuous mode Mar 7 14:44:43 Tower kernel: docker0: port 5(veth59668a6) entered disabled state Mar 7 14:44:43 Tower avahi-daemon[10171]: Withdrawing address record for fe80::5c8f:c0ff:fe00:838 on veth59668a6. Mar 7 14:44:43 Tower kernel: docker0: port 7(veth3623bf7) entered disabled state Mar 7 14:44:43 Tower kernel: vethe14a813: renamed from eth0 Mar 7 14:44:43 Tower avahi-daemon[10171]: Interface veth3623bf7.IPv6 no longer relevant for mDNS. 
Mar 7 14:44:43 Tower avahi-daemon[10171]: Leaving mDNS multicast group on interface veth3623bf7.IPv6 with address fe80::84f7:d3ff:fe68:350b. Mar 7 14:44:43 Tower kernel: docker0: port 7(veth3623bf7) entered disabled state Mar 7 14:44:43 Tower kernel: device veth3623bf7 left promiscuous mode Mar 7 14:44:43 Tower kernel: docker0: port 7(veth3623bf7) entered disabled state Mar 7 14:44:43 Tower avahi-daemon[10171]: Withdrawing address record for fe80::84f7:d3ff:fe68:350b on veth3623bf7. Mar 7 14:44:43 Tower kernel: docker0: port 3(veth2739f34) entered disabled state Mar 7 14:44:43 Tower kernel: vethb683262: renamed from eth0 Mar 7 14:44:43 Tower avahi-daemon[10171]: Interface veth2739f34.IPv6 no longer relevant for mDNS. Mar 7 14:44:43 Tower avahi-daemon[10171]: Leaving mDNS multicast group on interface veth2739f34.IPv6 with address fe80::4884:77ff:feb7:a969. Mar 7 14:44:43 Tower kernel: docker0: port 3(veth2739f34) entered disabled state Mar 7 14:44:43 Tower kernel: device veth2739f34 left promiscuous mode Mar 7 14:44:43 Tower kernel: docker0: port 3(veth2739f34) entered disabled state Mar 7 14:44:43 Tower avahi-daemon[10171]: Withdrawing address record for fe80::4884:77ff:feb7:a969 on veth2739f34. Mar 7 14:44:43 Tower kernel: docker0: port 4(veth5bc1dc8) entered disabled state Mar 7 14:44:43 Tower kernel: vethea5fbb3: renamed from eth0 Mar 7 14:44:43 Tower avahi-daemon[10171]: Interface veth5bc1dc8.IPv6 no longer relevant for mDNS. Mar 7 14:44:43 Tower avahi-daemon[10171]: Leaving mDNS multicast group on interface veth5bc1dc8.IPv6 with address fe80::8ae:8eff:fede:a0fe. Mar 7 14:44:43 Tower kernel: docker0: port 4(veth5bc1dc8) entered disabled state Mar 7 14:44:43 Tower kernel: device veth5bc1dc8 left promiscuous mode Mar 7 14:44:43 Tower kernel: docker0: port 4(veth5bc1dc8) entered disabled state Mar 7 14:44:43 Tower avahi-daemon[10171]: Withdrawing address record for fe80::8ae:8eff:fede:a0fe on veth5bc1dc8. Mar 7 14:44:43 Tower kernel: br-8038ba180b14: port 1(veth7a733d2) entered disabled state Mar 7 14:44:43 Tower kernel: veth7f5366a: renamed from eth0 Mar 7 14:44:43 Tower avahi-daemon[10171]: Interface veth7a733d2.IPv6 no longer relevant for mDNS. Mar 7 14:44:43 Tower avahi-daemon[10171]: Leaving mDNS multicast group on interface veth7a733d2.IPv6 with address fe80::a89e:7dff:fe9b:6b6. Mar 7 14:44:43 Tower kernel: br-8038ba180b14: port 1(veth7a733d2) entered disabled state Mar 7 14:44:43 Tower kernel: device veth7a733d2 left promiscuous mode Mar 7 14:44:43 Tower kernel: br-8038ba180b14: port 1(veth7a733d2) entered disabled state Mar 7 14:44:43 Tower avahi-daemon[10171]: Withdrawing address record for fe80::a89e:7dff:fe9b:6b6 on veth7a733d2. Mar 7 14:44:48 Tower kernel: docker0: port 10(veth82f62ce) entered disabled state Mar 7 14:44:48 Tower kernel: veth7af951d: renamed from eth0 Mar 7 14:44:49 Tower avahi-daemon[10171]: Interface veth82f62ce.IPv6 no longer relevant for mDNS. Mar 7 14:44:49 Tower avahi-daemon[10171]: Leaving mDNS multicast group on interface veth82f62ce.IPv6 with address fe80::c86c:beff:fefd:e3c4. Mar 7 14:44:49 Tower kernel: docker0: port 10(veth82f62ce) entered disabled state Mar 7 14:44:49 Tower kernel: device veth82f62ce left promiscuous mode Mar 7 14:44:49 Tower kernel: docker0: port 10(veth82f62ce) entered disabled state Mar 7 14:44:49 Tower avahi-daemon[10171]: Withdrawing address record for fe80::c86c:beff:fefd:e3c4 on veth82f62ce. Mar 7 14:44:49 Tower root: stopping dockerd ... Mar 7 14:44:50 Tower root: waiting for docker to die ... 
Mar 7 14:44:51 Tower avahi-daemon[10171]: Interface docker0.IPv6 no longer relevant for mDNS. Mar 7 14:44:51 Tower avahi-daemon[10171]: Leaving mDNS multicast group on interface docker0.IPv6 with address fe80::42:c2ff:fe45:3fc5. Mar 7 14:44:51 Tower avahi-daemon[10171]: Interface docker0.IPv4 no longer relevant for mDNS. Mar 7 14:44:51 Tower avahi-daemon[10171]: Leaving mDNS multicast group on interface docker0.IPv4 with address 172.17.0.1. Mar 7 14:44:51 Tower avahi-daemon[10171]: Withdrawing address record for fe80::42:c2ff:fe45:3fc5 on docker0. Mar 7 14:44:51 Tower avahi-daemon[10171]: Withdrawing address record for 172.17.0.1 on docker0. Mar 7 14:44:51 Tower emhttpd: shcmd (9923956): umount /var/lib/docker Mar 7 14:44:52 Tower cache_dirs: Stopping cache_dirs process 4448 Mar 7 14:44:53 Tower cache_dirs: cache_dirs service rc.cachedirs: Stopped Mar 7 14:45:04 Tower unassigned.devices: Unmounting All Devices... Mar 7 14:45:04 Tower unassigned.devices: Unmounting partition 'sda2' at mountpoint '/mnt/disks/WD_Green_4TB_714'... Mar 7 14:45:04 Tower unassigned.devices: Unmount cmd: /sbin/umount -fl '/dev/sda2' 2>&1 Mar 7 14:45:04 Tower ntfs-3g[15177]: Unmounting /dev/sda2 (WD Green 4TB 714) Mar 7 14:45:04 Tower unassigned.devices: Successfully unmounted 'sda2' Mar 7 14:45:04 Tower sudo: pam_unix(sudo:session): session closed for user root Mar 7 14:45:05 Tower emhttpd: shcmd (9923957): /etc/rc.d/rc.samba stop Mar 7 14:45:05 Tower wsdd2[9999]: 'Terminated' signal received. Mar 7 14:45:05 Tower winbindd[10075]: [2023/03/07 14:45:05.569343, 0] ../../source3/winbindd/winbindd_dual.c:1957(winbindd_sig_term_handler) Mar 7 14:45:05 Tower winbindd[10075]: Got sig[15] terminate (is_parent=1) Mar 7 14:45:05 Tower winbindd[10077]: [2023/03/07 14:45:05.569373, 0] ../../source3/winbindd/winbindd_dual.c:1957(winbindd_sig_term_handler) Mar 7 14:45:05 Tower winbindd[10077]: Got sig[15] terminate (is_parent=0) Mar 7 14:45:05 Tower winbindd[11433]: [2023/03/07 14:45:05.569416, 0] ../../source3/winbindd/winbindd_dual.c:1957(winbindd_sig_term_handler) Mar 7 14:45:05 Tower winbindd[11433]: Got sig[15] terminate (is_parent=0) Mar 7 14:45:05 Tower wsdd2[9999]: terminating. Mar 7 14:45:05 Tower emhttpd: shcmd (9923958): rm -f /etc/avahi/services/smb.service Mar 7 14:45:05 Tower avahi-daemon[10171]: Files changed, reloading. Mar 7 14:45:05 Tower avahi-daemon[10171]: Service group file /services/smb.service vanished, removing services. Mar 7 14:45:05 Tower emhttpd: Stopping mover... Mar 7 14:45:05 Tower emhttpd: shcmd (9923960): /usr/local/sbin/mover stop Mar 7 14:45:05 Tower root: mover: not running Mar 7 14:45:05 Tower emhttpd: Sync filesystems... Mar 7 14:45:05 Tower emhttpd: shcmd (9923961): sync Mar 7 14:45:06 Tower ProFTPd: Running unmountscript.sh... I checked the log after startup, and can't see anything related to the array until this entry: Mar 7 20:12:03 Tower Parity Check Tuning: DEBUG: Automatic Correcting Parity-Check running Quote Link to comment
-C- (Author) Posted March 11, 2023
1 hour ago, itimpi said: I see you have the Parity Check Tuning plugin installed. This plugin replaces the built-in code for displaying history, so the blank dialog box could be down to a problem there. To let me check whether I can reproduce your symptoms, could you let me have: the version number of Unraid you are using, the version number of the plugin you are using, and a copy of the config/parity-checks.log file from the flash drive.
I'm running Unraid 6.11.1. I can't see a version number for the plugin, but it's dated 2023.03.01. Here are all the entries from parity-checks.log:
2022 Nov 11 01:35:22|2|0|-4|0|recon P|17578328012
2022 Nov 12 20:18:51|120861|148933137|0|0|recon P|17578328012
2022 Nov 30 22:54:57|128424|140162336|0|0|check P|17578328012
2022 Dec 7 19:33:32|3|0|-4|0|recon Q|19531792332
2022 Dec 8 19:05:46|266984|74.9MB/s|0|0|recon Q|269251|2|AUTOMATIC Parity Sync/Data Rebuild
2022 Dec 10 11:47:13|141961|140887676|0|0|check Q|19531792332
2022 Dec 11 17:04:58|95747|208.9MB/s|0|0|clear|95747|1|AUTOMATIC Disk Clear
2022 Dec 12 19:06:47|116|0|-4|0|recon P|19531792332
2022 Dec 13 22:37:14|2|0|-4|0|recon P|19531792332
2022 Dec 15 00:57:26|252391|79.2MB/s|0|0|recon P|252760|2|AUTOMATIC Parity Sync/Data Rebuild
2022 Dec 21 18:45:10|171468|116.6MB/s|0|2|check P Q|171468|1|MANUAL Correcting Parity Check
2022 Dec 25 11:04:39|328646|60.9MB/s|0|2|check P Q|338803|2|MANUAL Non-Correcting Parity Check
2022 Dec 31 00:42:17|2786|0|-4|0|check P Q|19531825100
2023 Jan 1 17:58:32|148553|134.6MB/s|0|2|check P Q|148553|1|MANUAL Correcting Parity Check
2023 Jan 3 12:49:22|148056|135.1MB/s|0|2|check P Q|148056|1|MANUAL Correcting Parity Check
2023 Jan 6 05:02:08|423315|47.2MB/s|0|2|check P Q|423648|2|MANUAL Non-Correcting Parity Check
2023 Jan 7 00:25:47|19|0|-4|0|check P|19531825100
2023 Jan 8 16:31:21|144317|138587893|0|2|check P|19531825100
2023 Jan 10 13:02:42|142749|140.1MB/s|0|0|check P|142749|1|MANUAL Correcting Parity Check
2023 Jan 10 13:21:39|130|0|-4|0|recon Q|19531825100
2023 Jan 12 17:17:53|60312|331.6MB/s|0|0|recon Q|60312|1|AUTOMATIC Parity Sync/Data Rebuild
2023 Jan 31 06:25:45|145405|137550902|0|2|check P Q|19531825100
2023 Feb 2 19:41:30|153414|130.4MB/s|0|0|check P Q|153414|1|MANUAL Correcting Parity Check
2023 Mar 8 12:13:04|24865|0|-4|0|check P Q|19531825100
2023 Mar 10 18:09:37|148148|135.0 MB/s|0|2|check P Q|148148|1|Manual Correcting Parity-Check
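The file is pipe-delimited, so it can be skimmed from the console with awk. A rough sketch, with the field meanings inferred from the values above (finish time, duration in seconds, speed, status, sync errors, operation type) rather than from any documentation:
    # print finish time, error count and operation type for each history record
    awk -F'|' '{ printf "%-22s errors=%-4s %s\n", $1, $5, $6 }' /boot/config/parity-checks.log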
-C- (Author) Posted July 8, 2023
My issue with the 2 errors being found during parity checks remains. I've now got a failing drive and have a new one to replace it with, and I've successfully moved everything off the old drive. I had an unclean shutdown recently, and when Unraid came back up it ran an automatic correcting check, which finished today. This is the result from the log:
Jul 8 03:18:43 Tower Parity Check Tuning: DEBUG: Automatic Correcting Parity-Check running
Jul 8 03:19:25 Tower kernel: md: recovery thread: P corrected, sector=39063584664
Jul 8 03:19:25 Tower kernel: md: recovery thread: P corrected, sector=39063584696
Jul 8 03:19:25 Tower kernel: md: sync done. time=1844sec
Jul 8 03:19:25 Tower kernel: md: recovery thread: exit status: 0
The problem is with the same 2 sectors on parity P that have been coming up as bad since the middle of December, though not on every check. Both parity drives completed their SMART short self-tests without error.
I'm unsure how best to proceed. My largest data disk is 18TB, the parities are 20TB, and these 2 problem sectors are right at the end of the 20TB, so outside the area holding data, and I have moved all of the data off the disk that I want to replace. Do I just ignore the parity errors and then follow this guide: https://docs.unraid.net/unraid-os/manual/storage-management#replacing-a-disk-to-increase-capacity or is there something else I can try?
tower-diagnostics-20230708-1356.zip
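The arithmetic backs up that reasoning. A back-of-envelope check, assuming the md driver reports standard 512-byte sectors (an assumption, not something taken from the diagnostics):
    # byte offset of the first flagged sector: roughly 20.0 TB into the parity drive
    echo $(( 39063584664 * 512 ))        # 20000555347968
    # highest sector an 18 TB data disk can occupy, i.e. the end of the area holding data
    echo $(( 18000000000000 / 512 ))     # 35156250000
Since 39063584664 is well past 35156250000, both flagged sectors sit beyond the region any data disk maps to.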
JorgeB Posted July 9, 2023
16 hours ago, -C- said: do I just ignore the parity errors
I would for the replacement. Then, since the errors are on P only, I would try a different disk there, or swap P with Q and see if the error follows the disk.
trurl Posted July 9, 2023
Since when does an unclean shutdown run a correcting parity check?
trurl Posted July 9, 2023
Just now, trurl said: Since when does an unclean shutdown run a correcting parity check?
Is that a "feature" of the Parity Check Tuning plugin?
JorgeB Posted July 9, 2023
33 minutes ago, trurl said: Is that a "feature" of the Parity Check Tuning plugin?
Not sure, @itimpi?
itimpi Posted July 9, 2023
1 hour ago, trurl said: Is that a "feature" of the Parity Check Tuning plugin?
No. The Parity Check Tuning plugin never initiates a parity check from the beginning; it relies on Unraid to do that, and the plugin then handles pause/resume. The only time the plugin initiates anything is when there was an array operation in progress at the time of the shutdown AND you have set the option to restart operations from the point reached AND the shutdown was a clean shutdown. Even then, whether it is correcting or non-correcting will depend on what it was before the shutdown.
Starting a correcting parity check after an unclean shutdown looks like new behaviour at the Unraid level. I am sure this check used to be non-correcting, so I'm not sure whether this is a bug or by design.
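Put as shell-style pseudocode purely to illustrate the three conditions described above (this is not the plugin's actual code, and the variable names are invented):
    # illustration only - the plugin restarts an operation only when all three conditions hold
    if [ "$operation_in_progress_at_shutdown" = yes ] && \
       [ "$restart_from_point_reached_enabled" = yes ] && \
       [ "$shutdown_was_clean" = yes ]; then
        echo "restart the array operation from the point reached"   # keeps its previous correcting/non-correcting mode
    fi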
JorgeB Posted July 9, 2023
34 minutes ago, itimpi said: Starting a correcting parity check after an unclean shutdown looks like new behaviour at the Unraid level.
I didn't know this had changed, but after an unclean shutdown I do prefer a correcting check: some sync errors are usually normal after one, so you might as well correct them on the first pass.
Kilrah Posted July 9, 2023
Is it really correcting? I seem to remember that at some point it would say it was but wasn't.
trurl Posted July 9, 2023
2 hours ago, JorgeB said: I didn't know this had changed, but after an unclean shutdown I do prefer a correcting check: some sync errors are usually normal after one, so you might as well correct them on the first pass.
If you really want it to correct, you can just cancel the non-correcting check and manually run it as correcting. I can imagine scenarios (bad RAM?) that might result in an unclean shutdown where you wouldn't want it to change parity.
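From the console that sequence would look roughly like the sketch below; it can equally be done from the Main page in the GUI. The mdcmd keywords are partly from the syslog excerpts earlier in this thread and partly from memory, so treat them as an assumption and verify before relying on them:
    mdcmd nocheck cancel    # cancel the running non-correcting check
    mdcmd check             # start a fresh check; without the NOCORRECT option it writes corrections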
JorgeB Posted July 10, 2023
15 hours ago, trurl said: you can just cancel the non-correcting check and manually run it as correcting.
Yeah, but most users won't know about that, so they will either wait for it to finish and then run another one, or assume that the errors were corrected. Though there are good arguments for doing it either way.
-C- (Author) Posted July 11, 2023
On 7/10/2023 at 8:20 AM, JorgeB said: most users won't know about that
I'm still not 100% sure about what's going on with all this 😜 Here's an update on what happened. I followed the guide to replace the failing disk, and the rebuild onto the new disk appears to have gone well, with no errors reported.
What's strange is that there's nothing in the logs at the 10:00 timestamp that the parity result shows as the rebuild end time:
Jul 10 06:45:11 Tower emhttpd: spinning down /dev/sde
Jul 10 09:15:08 Tower autofan: Highest disk temp is 43C, adjusting fan speed from: 230 (90% @ 833rpm) to: 205 (80% @ 854rpm)
Jul 10 09:20:14 Tower autofan: Highest disk temp is 44C, adjusting fan speed from: 205 (80% @ 868rpm) to: 230 (90% @ 834rpm)
Jul 10 09:39:17 Tower emhttpd: read SMART /dev/sdh
Jul 10 09:59:53 Tower webGUI: Successful login user root from 192.168.34.42
Jul 10 10:00:43 Tower kernel: md: sync done. time=132325sec
Jul 10 10:00:43 Tower kernel: md: recovery thread: exit status: 0
Jul 10 10:05:23 Tower autofan: Highest disk temp is 43C, adjusting fan speed from: 230 (90% @ 869rpm) to: 205 (80% @ 907rpm)
Jul 10 10:09:42 Tower emhttpd: spinning down /dev/sdh
Jul 10 10:14:57 Tower webGUI: Successful login user root from 192.168.34.42
Jul 10 10:15:29 Tower autofan: Highest disk temp is 42C, adjusting fan speed from: 205 (80% @ 869rpm) to: 180 (70% @ 854rpm)
Jul 10 10:30:00 Tower webGUI: Successful login user root from 192.168.34.42
Jul 10 10:30:34 Tower autofan: Highest disk temp is 41C, adjusting fan speed from: 180 (70% @ 850rpm) to: 155 (60% @ 853rpm)
Jul 10 10:30:44 Tower emhttpd: spinning down /dev/sdg
I can see this in the log when the rebuild starts:
Jul 8 21:17:28 Tower Parity Check Tuning: DEBUG: Parity Sync/Data Rebuild running
Jul 8 21:17:28 Tower Parity Check Tuning: Parity Sync/Data Rebuild detected
Jul 8 21:17:28 Tower Parity Check Tuning: DEBUG: Created cron entry for 6 minute interval monitoring
Then I get the update every 6 minutes as expected:
Jul 9 02:24:34 Tower Parity Check Tuning: DEBUG: Parity Sync/Data Rebuild running
Jul 9 02:30:20 Tower Parity Check Tuning: DEBUG: Parity Sync/Data Rebuild running
Jul 9 02:36:33 Tower Parity Check Tuning: DEBUG: Parity Sync/Data Rebuild running
Until here:
Jul 9 02:42:20 Tower Parity Check Tuning: DEBUG: Parity Sync/Data Rebuild running
Jul 9 02:42:20 Tower Parity Check Tuning: DEBUG: detected that mdcmd had been called from sh with command mdcmd nocheck PAUSE
Which happens a couple of minutes after this:
Jul 9 02:40:01 Tower root: mover: started
There are no further parity-related entries after that. I'm not sure whether I can consider things OK now, or whether I should be investigating further.
itimpi Posted July 11, 2023
1 hour ago, -C- said: What's strange is that there's nothing in the logs at the 10:00 timestamp that the parity result shows as the rebuild end time:
Yes there is: the standard Unraid messages that appear when an array operation completes look like this:
Jul 10 10:00:43 Tower kernel: md: sync done. time=132325sec
Jul 10 10:00:43 Tower kernel: md: recovery thread: exit status: 0
The messages that look like this:
Jul 8 21:17:28 Tower Parity Check Tuning: DEBUG: Parity Sync/Data Rebuild running
Jul 8 21:17:28 Tower Parity Check Tuning: Parity Sync/Data Rebuild detected
Jul 8 21:17:28 Tower Parity Check Tuning: DEBUG: Created cron entry for 6 minute interval monitoring
are from the Parity Check Tuning plugin, which is not a standard part of Unraid. The plugin currently has an issue that I do not understand, where the monitor task seems to stop running and I do not know why. The latest version of the plugin is 2023-07-08, but I suspect you did not have that installed at the time.
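A quick way to see where that monitor task goes quiet is to pull the plugin's entries out of the syslog and look at when the regular 6-minute updates stop, for example:
    # show the most recent Parity Check Tuning entries from the live syslog
    grep "Parity Check Tuning" /var/log/syslog | tail -n 20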
-C- (Author) Posted July 12, 2023
Thanks Dave, that makes things clearer. If only the standard messages were as descriptive as the Parity Check Tuning ones. I check in on my server most days and try to stay on top of app & plugin updates as soon as they become available. The Parity Check Tuning plugin is indeed on 2023.07.08, and I believe it was updated before I replaced the disk, but I'm not certain. Good luck with finding the cause of the monitoring task stopping. In my case all seemed good until the daily mover operation started.
itimpi Posted July 13, 2023
8 hours ago, -C- said: Good luck with finding the cause of the monitoring task stopping. In my case all seemed good until the daily mover operation started.
The (undesirable) side-effect of the monitor task stopping is that, if you have the plugin set to pause a parity check while mover or appdata backup is running, then when either of those is detected the plugin is likely to execute the pause but never do the resume, so you need a manual resume to continue.
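If that happens, the paused operation can be resumed from the Main page in the GUI, or from the console with something along these lines (the exact mdcmd keyword is from memory, so treat it as an assumption and check before relying on it):
    mdcmd check RESUME    # resume a paused parity check or rebuild from where it was paused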
-C- (Author) Posted July 13, 2023
4 hours ago, itimpi said: if you have the plugin set to pause a parity check while mover or appdata backup is running, then when either of those is detected the plugin is likely to execute the pause but never do the resume, so you need a manual resume to continue
I have both of those running daily, and although the PCT log entries stopped just after the mover started, the actual rebuild continued and completed seemingly successfully without any interaction on my part.