superloopy1 Posted April 16, 2021 (edited)
Hi, I've been trying to run a parity check on my recently built system for the past couple of nights, after loading the data onto it. Each time the system has hung and needed a hard reboot to restore access, so parity is never established. Today I set up a syslog server and 'babysat' a fresh attempt. About a third of the way in, at approximately 35%, the parity check began to 'die': the speed dropped dramatically to 8.2 MB/s and the estimated time to complete climbed to 8 days. At that point I paused it (it could never be resumed) and dumped the diagnostics, attached. It looks like it hit a kernel panic. Does anyone have any ideas?
Attachment: tower-diagnostics-20210416-1547.zip
SimonF Posted April 16, 2021 (edited)
Suggest logging a bug report, as the devs may need to look at the md process.
Apr 16 15:41:00 Tower kernel: kernel BUG at drivers/md/unraid.c:310!
Apr 16 15:41:00 Tower kernel: invalid opcode: 0000 [#1] SMP PTI
Apr 16 15:41:00 Tower kernel: CPU: 5 PID: 8217 Comm: mdrecoveryd Tainted: P O 5.10.21-Unraid #1
Apr 16 15:41:00 Tower kernel: Hardware name: Supermicro Super Server/X10DRi-LN4+, BIOS 3.3 10/24/2020
Apr 16 15:41:00 Tower kernel: RIP: 0010:_get_active_stripe+0x13c/0x339 [md_mod]
Apr 16 15:41:00 Tower kernel: Code: 00 00 00 e9 e4 00 00 00 48 89 ef e8 50 fe ff ff 48 85 c0 49 89 c6 0f 84 61 ff ff ff 48 8b 48 20 8b 81 70 05 00 00 85 c0 75 02 <0f> 0b 4d 89 7e 30 49 8d 86 38 01 00 00 31 ff 39 79 10 7e 46 4c 8b
Apr 16 15:41:00 Tower kernel: RSP: 0018:ffffc90000e4fcd0 EFLAGS: 00010046
Apr 16 15:41:00 Tower kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX:
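For anyone who wants to check their own syslog for the same signature, here is a minimal Python sketch. It assumes the log lines look like the ones quoted above; the function name `find_kernel_bugs` and the `lookahead` parameter are made up for illustration and are not part of Unraid or any of its tools.

```python
import re
from typing import Iterable, Iterator

# Patterns are derived only from the log lines quoted in this thread (an assumption,
# not an official log format specification).
BUG_RE = re.compile(r"kernel BUG at (?P<file>\S+):(?P<line>\d+)!")
CPU_RE = re.compile(r"CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) Comm: (?P<comm>\S+)")


def find_kernel_bugs(lines: Iterable[str], lookahead: int = 5) -> Iterator[dict]:
    """Yield one dict per 'kernel BUG at' line, plus PID/Comm if found shortly after."""
    buffered = list(lines)
    for i, text in enumerate(buffered):
        bug = BUG_RE.search(text)
        if bug is None:
            continue
        report = {"file": bug["file"], "line": int(bug["line"]), "comm": None, "pid": None}
        # The CPU/PID/Comm line normally follows within a few lines of the BUG line.
        for follow in buffered[i + 1 : i + 1 + lookahead]:
            cpu = CPU_RE.search(follow)
            if cpu:
                report["comm"] = cpu["comm"]
                report["pid"] = int(cpu["pid"])
                break
        yield report


# Sample lines copied from the post above.
sample = [
    "Apr 16 15:41:00 Tower kernel: kernel BUG at drivers/md/unraid.c:310!",
    "Apr 16 15:41:00 Tower kernel: invalid opcode: 0000 [#1] SMP PTI",
    "Apr 16 15:41:00 Tower kernel: CPU: 5 PID: 8217 Comm: mdrecoveryd Tainted: P O 5.10.21-Unraid #1",
]

for report in find_kernel_bugs(sample):
    print(report)
```

To run it against a real log, pass an open file instead of `sample`, for example `find_kernel_bugs(open("syslog.txt"))`. A hit in `drivers/md/unraid.c` with `Comm: mdrecoveryd` would match what is reported in this thread.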
superloopy1 (Author) Posted April 16, 2021
And again following a reboot, one further attempt at a parity check.
Attachment: tower-diagnostics-20210416-1830.zip
superloopy1 (Author) Posted April 16, 2021
Now raised as a bug report (SimonF), a first for me. It's still happening: the parity check runs for a random length of time after starting, then the speed drops like a stone. The telling info is in the syslogs and diagnostics posted above.
SimonF Posted April 16, 2021
superloopy1 said: "Now raised as a bug report"
This will require @limetech to review the bug report.
superloopy1 (Author) Posted April 16, 2021
So it just sits there for the time being? This is all new to me. I've had Unraid for years and it has been rock solid until now. I thought about going to 6.9.2. I can't see anything related in the changelogs, but if it still happens on the latest release then at least I'd find out, yes?
superloopy1 (Author) Posted April 16, 2021
I've noticed this in the logs. I have no idea whether it's relevant; does anyone know what it means?
'kernel: pmd_set_huge: Cannot satisfy [mem 0xc0000000-0xc0200000] with a huge-page mapping due to MTRR override'
It isn't flagged as an error or even a warning, but it reads like one. MTRR override?
JorgeB Posted April 17, 2021
You can downgrade to the release it was working with; this will also confirm it's not a hardware issue.
Andy Castille Posted May 19, 2023
This just started happening to me after I set up a parity drive. I had been using Unraid without a parity drive for about a year, and then I added one a little over a month ago. The hang happened during the first parity check, but the check finished successfully after a reboot. Now, one month later, the second parity check is failing with this problem repeatedly (3 times today). I have changed parity checks to yearly until I can figure out how to stop the entire system from hanging. I have attached a log from the next successful boot, and here are the messages it printed before the crash, which led me to this thread:
kernel BUG at drivers/md/unraid.c:310!
invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 13 PID: 4587 Comm: mdrecoveryd Not tainted 5.19.17-Unraid #2
Attachment: syslog.txt
JorgeB Posted May 19, 2023 (Solution)
Try updating to v6.12-rc6 and try again.
Andy Castille Posted May 20, 2023
On 5/19/2023 at 1:05 AM, JorgeB said: "Try updating to v6.12-rc6 and try again."
Installing the RC version seems to have fixed it. A parity check finished successfully. Thank you!