Nodiaque Posted April 28, 2022
Hello everyone, I just woke up this morning without Internet (Pi-hole dead). I connected to my Unraid server and there was nothing under Docker. I checked Settings ==> Docker, but it doesn't load; the page stays blank. I restarted the Unraid server, same thing. When I click to see the log, I only have this:
Quote
Apr 28 07:21:36 ServRaid kernel: Call Trace:
Apr 28 07:21:36 ServRaid kernel: ? _raw_spin_unlock_irqrestore+0xd/0xe
Apr 28 07:21:36 ServRaid kernel: ? __kthread_should_park+0x5/0x10
Apr 28 07:21:36 ServRaid kernel: ? ksoftirqd_running+0x28/0x32
Apr 28 07:21:36 ServRaid kernel: ? __irq_exit_rcu+0x58/0x80
Apr 28 07:21:36 ServRaid kernel: ? sysvec_apic_timer_interrupt+0x87/0x95
Apr 28 07:21:36 ServRaid kernel: ? asm_sysvec_apic_timer_interrupt+0x12/0x20
Apr 28 07:21:36 ServRaid kernel: ? _nv018915rm+0x24f/0x280 [nvidia]
Apr 28 07:21:36 ServRaid kernel: ? _nv018915rm+0x24f/0x280 [nvidia]
Apr 28 07:21:36 ServRaid kernel: ? _nv029397rm+0xe7/0x120 [nvidia]
Apr 28 07:21:36 ServRaid kernel: ? _nv026141rm+0x60/0xc0 [nvidia]
Apr 28 07:21:36 ServRaid kernel: ? _nv010182rm+0x173/0x2c0 [nvidia]
Apr 28 07:21:36 ServRaid kernel: ? _nv026153rm+0x202/0x390 [nvidia]
Apr 28 07:21:36 ServRaid kernel: ? _nv026154rm+0x57/0x70 [nvidia]
Apr 28 07:21:36 ServRaid kernel: ? _nv018931rm+0x6d/0xe0 [nvidia]
Apr 28 07:21:36 ServRaid kernel: ? _nv018926rm+0x2c/0x90 [nvidia]
Apr 28 07:21:36 ServRaid kernel: ? _nv021908rm+0x6c/0xc0 [nvidia]
Apr 28 07:21:36 ServRaid kernel: ? _nv021906rm+0x488/0x870 [nvidia]
Apr 28 07:21:36 ServRaid kernel: ? _nv021907rm+0x210/0x380 [nvidia]
Apr 28 07:21:36 ServRaid kernel: ? _nv022130rm+0x37/0x120 [nvidia]
Apr 28 07:21:36 ServRaid kernel: ? _nv000643rm+0x13a3/0x20b0 [nvidia]
Apr 28 07:21:36 ServRaid kernel: ? rm_init_adapter+0xc5/0xe0 [nvidia]
Apr 28 07:21:36 ServRaid kernel: ? nv_open_device+0x456/0x686 [nvidia]
Apr 28 07:21:36 ServRaid kernel: ? nvidia_open+0x2bf/0x421 [nvidia]
Apr 28 07:21:36 ServRaid kernel: ? nvidia_frontend_open+0x62/0x8d [nvidia]
Apr 28 07:21:36 ServRaid kernel: ? chrdev_open+0x150/0x187
Apr 28 07:21:36 ServRaid kernel: ? cdev_put+0x19/0x19
Apr 28 07:21:36 ServRaid kernel: ? do_dentry_open+0x184/0x289
Apr 28 07:21:36 ServRaid kernel: ? path_openat+0x85e/0x937
Apr 28 07:21:36 ServRaid kernel: ? shmem_getpage_gfp.isra.0+0x166/0x543
Apr 28 07:21:36 ServRaid kernel: ? atime_needs_update+0x6d/0xcc
Apr 28 07:21:36 ServRaid kernel: ? do_filp_open+0x4c/0xa9
Apr 28 07:21:36 ServRaid kernel: ? _cond_resched+0x1b/0x1e
Apr 28 07:21:36 ServRaid kernel: ? getname_flags+0x24/0x146
Apr 28 07:21:36 ServRaid kernel: ? kmem_cache_alloc+0x108/0x130
Apr 28 07:21:36 ServRaid kernel: ? do_sys_openat2+0x6f/0xec
Apr 28 07:21:36 ServRaid kernel: ? do_sys_open+0x35/0x4f
Apr 28 07:21:36 ServRaid kernel: ? do_syscall_64+0x5d/0x6a
Apr 28 07:21:36 ServRaid kernel: ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
It seems to be a trace from a failure, but I don't know where to go from there. Thank you.
JorgeB Posted April 28, 2022
You should post the diagnostics, but the error appears to be GPU related. See if it works without the Nvidia GPU, or without loading the Nvidia driver.
Nodiaque (Author) Posted April 28, 2022
My GUI stopped responding; I cannot log in to it anymore, but SSH works. I was going to run the diagnostics.
JorgeB Posted April 28, 2022
You can get the diagnostics on the console.
Nodiaque (Author) Posted April 28, 2022
Oh yeah, I missed that. I guess this takes time? It went to "starting diagnostics collection" 5 min ago.
JorgeB Posted April 28, 2022
1 minute ago, Nodiaque said: "5 min ago"
Unlikely to finish then.
Nodiaque (Author) Posted April 28, 2022
It seems something is stuck. I tried to start a shutdown and it just doesn't shut down. It does send the broadcast message, but the system is still up after 2 minutes... I think I'll have to do a force shutdown. Poor array.
trurl Posted April 28, 2022
5 minutes ago, Nodiaque said: "force shutdown"
Get diagnostics as soon as you reboot. Then set up syslog server if you can.
Nodiaque (Author) Posted April 28, 2022
OK. So it seems the server was stuck somewhere. I had to force a power-off, which means a parity check has now begun on the 64TB array... gonna take a week at least to complete, but everything is back up! I do have diagnostics, but it's post reboot, so I guess it doesn't serve any purpose anymore.
trurl Posted April 28, 2022
18 minutes ago, Nodiaque said: "gonna take a week at least to complete"
Why? That itself makes me wonder if you have hardware problems. You say 64TB array, but it is the size of parity that determines how long parity checks take, assuming no bottlenecks such as port multipliers (or even worse, trying to use USB for array disks). Typically parity checks take 2-3 hours per TB of parity.
18 minutes ago, Nodiaque said: "I do have diagnostics, but it's post reboot, so I guess it doesn't serve any purpose anymore."
24 minutes ago, trurl said: "Get diagnostics as soon as you reboot. Then set up syslog server if you can."
I wouldn't have asked for them post reboot if they didn't serve any purpose. Attach them to your NEXT post in this thread.
Nodiaque (Author) Posted April 28, 2022
Last time it took about 22 hours when I had only 2 drives. Right now it says 24h, we'll see. Here are the diagnostics I ran once I was back online. I can run another if needed. Thanks a lot for the help.
servraid-diagnostics-20220428-0810.zip
trurl Posted April 28, 2022
11 minutes ago, Nodiaque said: "Last time it took about 22 hours when I had only 2 drives"
Doesn't matter how many drives.
16 minutes ago, trurl said: "it is the size of parity that determines how long parity checks take, assuming no bottlenecks such as port multipliers (or even worse, trying to use USB for array disks). Typically parity checks take 2-3 hours per TB of parity."
13 minutes ago, Nodiaque said: "Right now it says 24h"
You have 16TB parity. I would expect it to take longer than 24 hours; it typically gets slower as it gets to the shorter inner tracks. But you were expecting a week, which didn't make any sense.
44 minutes ago, trurl said: "set up syslog server"
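The rule of thumb quoted above (2-3 hours per TB of parity) can be turned into a quick estimate. This is only a rough sketch: the function name `parity_check_hours` is made up for illustration, and it assumes no bottlenecks such as port multipliers or USB-attached disks.

```python
def parity_check_hours(parity_tb, hours_per_tb=(2.0, 3.0)):
    """Return a (low, high) estimate in hours for a parity check.

    parity_tb is the size of the parity disk in TB, not the total array size.
    hours_per_tb is the 2-3 hours per TB rule of thumb from this thread.
    """
    low_rate, high_rate = hours_per_tb
    return parity_tb * low_rate, parity_tb * high_rate


# Example: the 16TB parity disk discussed in this thread.
low, high = parity_check_hours(16)
print(f"16 TB parity: roughly {low:.0f} to {high:.0f} hours")
```

With a 16TB parity disk this gives a range of about 32 to 48 hours, which is why a 24h estimate looks optimistic and a week looks far too long.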
Nodiaque (Author) Posted April 28, 2022
OK, good to know then. I'll enable the syslog server, thanks!
Nodiaque (Author) Posted April 28, 2022
Just saw it's already enabled.
trurl Posted April 28, 2022
Since you are not Mirroring to flash (not recommended for extended use), you have to specify the Local syslog folder and the Remote syslog server (your Unraid server if you want).
Nodiaque (Author) Posted April 28, 2022
AH! OK, I've set up a new share on cache for that and it started to log.
Nodiaque (Author) Posted May 1, 2022
OK, it happened again today. Here is the syslog. I couldn't access anything on the server this time: the GUI was unresponsive, I couldn't access the shares, and when I tried to ls the mnt folder over SSH, it froze. Also, here are the new diagnostics, run from the GUI after the reboot.
syslog-192.168.0.4.log
servraid-diagnostics-20220501-1943.zip
JorgeB Posted May 2, 2022
Still seems Nvidia related, so my original advice remains.
Nodiaque (Author) Posted May 2, 2022
OK, I wasn't sure since I was seeing something about a CPU stall. I'll swap the card and see. Thanks.
Nodiaque (Author) Posted May 27, 2022
OK, so the problem happened again today. According to my syslog, it seems to have happened while it was verifying the backup. Something like a CPU stall.
syslog-192.168.0.4.log
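One way to narrow down a saved syslog like the one attached above is to count the lines that mention the Nvidia module or a stall. The following is a minimal sketch, not an official Unraid tool: the helper name `count_matches` and the two search patterns are assumptions chosen for illustration, and the sample lines are copied from the trace in the first post.

```python
import re
from collections import Counter

# Hypothetical search patterns: lines tagged with the nvidia kernel module,
# and any line mentioning a stall (case-insensitive).
PATTERNS = {
    "nvidia": re.compile(r"\[nvidia\]"),
    "stall": re.compile(r"stall", re.IGNORECASE),
}


def count_matches(lines):
    """Count how many lines match each pattern in PATTERNS."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


# Two sample lines taken from the call trace in the first post.
sample = [
    "Apr 28 07:21:36 ServRaid kernel: ? rm_init_adapter+0xc5/0xe0 [nvidia]",
    "Apr 28 07:21:36 ServRaid kernel: ? chrdev_open+0x150/0x187",
]
print(count_matches(sample))
```

To run it against a real file, open the log and pass the file object in, for example `with open("syslog-192.168.0.4.log") as f: print(count_matches(f))`. A high count for either pattern does not prove the cause, but it shows where to start reading.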
Zonediver Posted May 27, 2022
Normal time for a parity check is 8h/4TB, so 16TB takes ~32h. My parity check with a 12TB parity disk takes ~24h.
Nodiaque (Author) Posted May 27, 2022
3 minutes ago, Zonediver said: "Normal time for a parity check is 8h/4TB. So 16TB takes ~32h My parity check with a 12TB parity disk takes ~24h"
Yeah, it took about 32h. But that's OK since everything else keeps running while the parity check runs. But my main issue is back.