liquidrt Posted April 2, 2023 (edited)
I built an Unraid server about 6 weeks ago, and it worked flawlessly until two days ago. Since then, the server has been becoming unresponsive: I cannot access network shares, SSH to the server, or access the GUI. The LAN lights are flashing and the switch port it is plugged into is still lit. The only resolution is to power the server off and turn it back on. Because of where the server is located, I have not been able to attach a console or connect directly in any other way when it crashes to determine whether there is any GUI output. Here are the only warning/error messages in the syslog:
Apr 1 21:03:58 NAS kernel: ACPI: Early table checksum verification disabled
Apr 1 21:03:58 NAS kernel: floppy0: no floppy controllers found
Apr 1 21:03:58 NAS kernel: i915 0000:00:02.0: [drm] failed to retrieve link info, disabling eDP
Apr 1 21:04:02 NAS mcelog: failed to prefill DIMM database from DMI data
Apr 1 21:04:14 NAS rpc.statd[2134]: Failed to read /var/lib/nfs/state: Success
Apr 1 21:04:28 NAS kernel: nvme 0000:05:00.0: VPD access failed. This is likely a firmware bug on this device. Contact the card vendor for a firmware update
Apr 1 21:04:28 NAS kernel: nvme 0000:05:00.0: failed VPD read at offset 1
Edited April 5, 2023 by liquidrt
JorgeB Posted April 2, 2023
Enable the syslog server and post that after a crash.
liquidrt Posted April 3, 2023 (edited)
Thanks for the help, @JorgeB! I just experienced a crash about 20 minutes ago. The system ran fine for about 12 hours today. I started the Plex container, and 20 minutes later it crashed.
Edited April 5, 2023 by liquidrt
JorgeB Posted April 3, 2023
Did you enable the syslog server?
liquidrt Posted April 3, 2023
Yes @JorgeB, I enabled syslog mirroring to flash. For this, should I set up an external syslog server? Do the diagnostics I uploaded include the syslog that was mirrored to flash?
JorgeB Posted April 3, 2023
The diags won't include the mirrored syslog; it will be saved to wherever it was set to.
liquidrt Posted April 3, 2023 (edited)
Thanks @JorgeB. I did set up syslog mirroring to the flash. For some reason, when I run:
tail -f -n 500000 /var/log/syslog
I still only see messages from directly after the reboot. No logs from before the crash or reboot are persisting. Maybe I am not looking in the correct spot. Since the logs are not persisting, I am going to keep reading the syslog messages while the server is operating and hopefully catch the condition before it crashes next time.
Edited April 3, 2023 by liquidrt
JorgeB Posted April 3, 2023
The persistent syslog will be in the /logs folder on the flash drive.
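For readers who want to locate the mirrored file programmatically, here is a minimal Python sketch. It is only an illustration, not something posted in the thread. The function name `newest_syslogs` is hypothetical, and so is the idea that the relevant files have names starting with `syslog`; the folder is passed in as a parameter because the exact mount point of the flash drive is not stated above (on many Unraid systems the flash is mounted at `/boot`, which would make the folder `/boot/logs`, but treat that as an assumption to verify).

```python
from pathlib import Path


def newest_syslogs(log_dir):
    """Return files in log_dir whose names start with 'syslog', newest first.

    Assumption: the persisted log files are named 'syslog...'; adjust the
    pattern if your files are named differently.
    """
    folder = Path(log_dir)
    # Keep regular files only, so sub-folders with matching names are skipped.
    files = [p for p in folder.glob("syslog*") if p.is_file()]
    # Sort by modification time, most recently written file first.
    return sorted(files, key=lambda p: p.stat().st_mtime, reverse=True)
```

Calling `newest_syslogs("/boot/logs")` (path assumed, see above) would return the candidate files with the most recently written one first, which is usually the one to open after a crash.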
liquidrt Posted April 3, 2023 (edited)
Thanks @JorgeB, that showed up! I am not seeing anything directly before the crashes that looks concerning. Please see below:
Apr 2 08:55:22 GNAS root: Reloading Nginx configuration...
Apr 2 21:02:31 GNAS kernel: eth0: renamed from veth69d7509
Apr 2 21:02:34 GNAS kernel: eth0: renamed from veth92c1b6c
Apr 2 22:06:14 GNAS kernel: microcode: microcode updated early to revision 0xf0, date = 2021-11-15
Apr 2 22:06:14 GNAS kernel: Linux version 5.19.17-Unraid (root@Develop) (gcc (GCC) 12.2.0, GNU ld version 2.39-slack151) #2 SMP PREEMPT_DYNAMIC Wed Nov 2 11:54:15 PDT 2022
Apr 2 22:06:14 GNAS kernel: Command line: BOOT_IMAGE=/bzimage initrd=/bzroot
Apr 3 07:25:20 GNAS kernel: eth0: renamed from veth278b16c
Apr 3 08:23:21 GNAS kernel: md: sync done. time=36997sec
Apr 3 08:23:21 GNAS kernel: md: recovery thread: exit status: 0
Apr 3 11:00:24 GNAS kernel: microcode: microcode updated early to revision 0xf0, date = 2021-11-15
Apr 3 11:00:24 GNAS kernel: Linux version 5.19.17-Unraid (root@Develop) (gcc (GCC) 12.2.0, GNU ld version 2.39-slack151) #2 SMP PREEMPT_DYNAMIC Wed Nov 2 11:54:15 PDT 2022
Apr 3 11:00:24 GNAS kernel: Command line: BOOT_IMAGE=/bzimage initrd=/bzroot
Apr 3 11:00:24 GNAS kernel: x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
I do see some panic messages from earlier on, and some memory messages:
Apr 3 01:19:19 GNAS smbd[2800]: [2023/04/03 01:19:19.395299, 0] ../../source3/smbd/close.c:312(close_remove_share_mode)
Apr 3 01:19:19 GNAS smbd[2800]: close_remove_share_mode: Could not get share mode lock for file 2023/phone/rob/PXL_20230402_174218674.jpg.tacitpart
Apr 3 01:19:19 GNAS smbd[2800]: [2023/04/03 01:19:19.395377, 0] ../../source3/smbd/fd_handle.c:39(fd_handle_destructor)
Apr 3 01:19:19 GNAS smbd[2800]: PANIC: assert failed at ../../source3/smbd/fd_handle.c(39): (fh->fd == -1) || (fh->fd == AT_FDCWD)
Apr 3 01:19:19 GNAS smbd[2800]: [2023/04/03 01:19:19.395387, 0] ../../lib/util/fault.c:173(smb_panic_log)
Apr 3 01:19:19 GNAS smbd[2800]: ===============================================================
Apr 3 01:19:19 GNAS smbd[2800]: [2023/04/03 01:19:19.395399, 0] ../../lib/util/fault.c:174(smb_panic_log)
Apr 3 01:19:19 GNAS smbd[2800]: INTERNAL ERROR: assert failed: (fh->fd == -1) || (fh->fd == AT_FDCWD) in pid 2800 (4.17.3)
Apr 3 01:19:19 GNAS smbd[2800]: [2023/04/03 01:19:19.395406, 0] ../../lib/util/fault.c:178(smb_panic_log)
Apr 3 01:19:19 GNAS smbd[2800]: If you are running a recent Samba version, and if you think this problem is not yet fixed in the latest versions, please consider reporting this bug, see https://wiki.samba.org/index.php/Bug_Reporting
Apr 3 01:19:19 GNAS smbd[2800]: [2023/04/03 01:19:19.395414, 0] ../../lib/util/fault.c:183(smb_panic_log)
Apr 3 01:19:19 GNAS smbd[2800]: ===============================================================
Apr 3 01:19:19 GNAS smbd[2800]: [2023/04/03 01:19:19.395426, 0] ../../lib/util/fault.c:184(smb_panic_log)
Apr 3 01:19:19 GNAS smbd[2800]: PANIC (pid 2800): assert failed: (fh->fd == -1) || (fh->fd == AT_FDCWD) in 4.17.3
Apr 3 01:19:19 GNAS smbd[2800]: [2023/04/03 01:19:19.395669, 0] ../../lib/util/fault.c:292(log_stack_trace)
Apr 3 01:19:19 GNAS smbd[2800]: BACKTRACE: 32 stack frames:
Apr 3 00:51:19 GNAS nginx: 2023/04/03 00:51:19 [crit] 2237#2237: ngx_slab_alloc() failed: no memory
Apr 3 00:51:19 GNAS nginx: 2023/04/03 00:51:19 [error] 2237#2237: shpool alloc failed
Apr 3 00:51:19 GNAS nginx: 2023/04/03 00:51:19 [error] 2237#2237: nchan: Out of shared memory while allocating message of size 387. Increase nchan_max_reserved_memory.
Apr 3 00:51:19 GNAS nginx: 2023/04/03 00:51:19 [error] 2237#2237: *82179 nchan: error publishing message (HTTP status code 500), client: unix:, server: , request: "POST /pub/cpuload?buffer_length=1 HTTP/1.1", host: "localhost"
Apr 3 00:51:19 GNAS nginx: 2023/04/03 00:51:19 [error] 2237#2237: MEMSTORE:00: can't create shared message for channel /cpuload
Apr 3 00:51:20 GNAS nginx: 2023/04/03 00:51:20 [crit] 2237#2237: ngx_slab_alloc() failed: no memory
Apr 3 00:51:20 GNAS nginx: 2023/04/03 00:51:20 [error] 2237#2237: shpool alloc failed
Apr 3 00:51:20 GNAS nginx: 2023/04/03 00:51:20 [error] 2237#2237: nchan: Out of shared memory while allocating message of size 11163. Increase nchan_max_reserved_memory.
Apr 3 00:51:20 GNAS nginx: 2023/04/03 00:51:20 [error] 2237#2237: *82180 nchan: error publishing message (HTTP status code 500), client: unix:, server: , request: "POST /pub/devices?buffer_length=1 HTTP/1.1", host: "localhost"
Apr 3 00:51:20 GNAS nginx: 2023/04/03 00:51:20 [error] 2237#2237: MEMSTORE:00: can't create shared message for channel /devices
Apr 3 00:51:20 GNAS nginx: 2023/04/03 00:51:20 [crit] 2237#2237: ngx_slab_alloc() failed: no memory
Apr 3 00:51:20 GNAS nginx: 2023/04/03 00:51:20 [error] 2237#2237: shpool alloc failed
Apr 3 00:51:20 GNAS nginx: 2023/04/03 00:51:20 [error] 2237#2237: nchan: Out of shared memory while allocating message of size 234. Increase nchan_max_reserved_memory.
Apr 3 00:51:20 GNAS nginx: 2023/04/03 00:51:20 [error] 2237#2237: *82181 nchan: error publishing message (HTTP status code 500), client: unix:, server: , request: "POST /pub/arraymonitor?buffer_length=1 HTTP/1.1", host: "localhost"
Apr 3 00:51:20 GNAS nginx: 2023/04/03 00:51:20 [error] 2237#2237: MEMSTORE:00: can't create shared message for channel /arraymonitor
Apr 3 00:51:20 GNAS nginx: 2023/04/03 00:51:20 [crit] 2237#2237: ngx_slab_alloc() failed: no memory
Apr 3 00:51:20 GNAS nginx: 2023/04/03 00:51:20 [error] 2237#2237: shpool alloc failed
Apr 3 00:51:20 GNAS nginx: 2023/04/03 00:51:20 [error] 2237#2237: nchan: Out of shared memory while allocating message of size 311. Increase nchan_max_reserved_memory.
Apr 3 00:51:20 GNAS nginx: 2023/04/03 00:51:20 [error] 2237#2237: *82182 nchan: error publishing message (HTTP status code 500), client: unix:, server: , request: "POST /pub/parity?buffer_length=1 HTTP/1.1", host: "localhost"
Apr 3 00:51:20 GNAS nginx: 2023/04/03 00:51:20 [error] 2237#2237: MEMSTORE:00: can't create shared message for channel /parity
Apr 3 00:51:20 GNAS nginx: 2023/04/03 00:51:20 [crit] 2237#2237: ngx_slab_alloc() failed: no memory
Apr 3 00:51:20 GNAS nginx: 2023/04/03 00:51:20 [error] 2237#2237: shpool alloc failed
Apr 3 00:51:20 GNAS nginx: 2023/04/03 00:51:20 [error] 2237#2237: nchan: Out of shared memory while allocating message of size 492. Increase nchan_max_reserved_memory.
Apr 3 00:51:20 GNAS nginx: 2023/04/03 00:51:20 [error] 2237#2237: *82183 nchan: error publishing message (HTTP status code 500), client: unix:, server: , request: "POST /pub/diskload?buffer_length=1 HTTP/1.1", host: "localhost"
Apr 3 00:51:20 GNAS nginx: 2023/04/03 00:51:20 [error] 2237#2237: MEMSTORE:00: can't create shared message for channel /diskload
Apr 3 00:51:20 GNAS nginx: 2023/04/03 00:51:20 [crit] 2237#2237: ngx_slab_alloc() failed: no memory
Apr 3 00:51:20 GNAS nginx: 2023/04/03 00:51:20 [error] 2237#2237: shpool alloc failed
Apr 3 00:51:20 GNAS nginx: 2023/04/03 00:51:20 [error] 2237#2237: nchan: Out of shared memory while allocating message of size 3596. Increase nchan_max_reserved_memory.
Apr 3 00:51:20 GNAS nginx: 2023/04/03 00:51:20 [error] 2237#2237: *82184 nchan: error publishing message (HTTP status code 500), client: unix:, server: , request: "POST /pub/var?buffer_length=1 HTTP/1.1", host: "localhost"
Apr 3 00:51:20 GNAS nginx: 2023/04/03 00:51:20 [error] 2237#2237: MEMSTORE:00: can't create shared message for channel /var
Apr 3 00:51:20 GNAS nginx: 2023/04/03 00:51:20 [crit] 2237#2237: ngx_slab_alloc() failed: no memory
Edited April 5, 2023 by liquidrt
JorgeB Posted April 3, 2023
Nothing relevant was logged before the crashes, which usually suggests a hardware problem. One thing you can try is to boot the server in safe mode with all Docker containers/VMs disabled and let it run as a basic NAS for a few days. If it still crashes, it's likely a hardware problem; if it doesn't, start turning the other services back on one by one.
liquidrt Posted April 3, 2023
Thanks! I am assuming the usual culprit for hardware problems is RAM? This used to be a desktop computer that ran fine for a year and a half or so. The only things added to this server were a generic PCI-to-SATA card from Amazon and some third-party RAM. I will start there and see what I can come up with, thank you!
JorgeB Posted April 3, 2023
Could be; start by running memtest, but it could also be other hardware.
TurkeyPerson Posted April 29, 2023
Did you get anywhere? I'm getting similar errors but only managed to get a screenshot of the log. I'm upgrading off stable and will see if it reappears in the coming days.
trurl Posted April 29, 2023
5 minutes ago, TurkeyPerson said: "Getting similar errors"
Start your own thread with your diagnostics, and set up the syslog server.