Hello guys, I need help with a problem on my server. It had already happened once, maybe 1-2 months ago, and then never again until yesterday. My diagnostics zip is attached.
I don't understand how, but the machine becomes unresponsive: it is still powered on, but the web UI, the container UIs and the SMB shares are all inaccessible. If I try to ping it, the host is reported as unreachable. To my ears, the HDDs seem to be spun down.
The only solution (unfortunately) seems to be a hard shutdown followed by powering on again. Last time that worked: the server ran a parity check afterwards and found no errors. This time I managed to reboot it the same way, and the parity check is in progress.
After the first occurrence I enabled local rsyslog to an RPi, so I was able to collect the full log, which as far as I can tell doesn't say much.
I am sure that the machine was working fine until 19:15, then I unfortunately was not at home until this morning.
In kernel.log there is a long run of the two lines below. Then it stops (I suppose at the time of the crash; unfortunately I was not home), and the next line is from this morning's power-on after the hard shutdown.
2023-01-08T23:19:21+00:00 BerroServer kernel: pcieport 0000:00:1c.1: Enabling MPC IRBNCE
2023-01-08T23:19:21+00:00 BerroServer kernel: pcieport 0000:00:1c.1: Intel PCH root port ACS workaround enabled
2023-01-09T11:17:29+00:00 BerroServer kernel: md: unRAID driver 2.9.25 installed
It looks like 23:19 might have been the time of the crash (?)
Another file in rsyslog folder, named ".log", maybe tells a bit more. Here's the tail:
2023-01-08T16:26:04+00:00 BerroServer emhttpd: read SMART /dev/sdg
2023-01-08T16:26:11+00:00 BerroServer emhttpd: read SMART /dev/sde
2023-01-08T16:27:34+00:00 BerroServer shfs: share cache full
2023-01-08T16:27:34+00:00 BerroServer emhttpd: read SMART /dev/sdf
2023-01-08T16:27:41+00:00 BerroServer shfs: share cache full
2023-01-08T16:27:49+00:00 BerroServer message repeated 148 times: [ shfs: share cache full]
2023-01-08T16:32:48+00:00 BerroServer shfs: share cache full
2023-01-08T16:32:58+00:00 BerroServer message repeated 9 times: [ shfs: share cache full]
2023-01-08T16:57:45+00:00 BerroServer emhttpd: spinning down /dev/sdg
2023-01-08T17:05:15+00:00 BerroServer emhttpd: spinning down /dev/sde
2023-01-08T17:05:15+00:00 BerroServer emhttpd: spinning down /dev/sdf
2023-01-09T11:17:28+00:00 BerroServer sshd[7822]: Server listening on 0.0.0.0 port 22.
In this log the last entry before the reboot is dated 17:05, so way before the supposed crash time of 23:19.
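For anyone who wants to repeat this check on the full rsyslog export, here is a minimal sketch of how the "last entry before the reboot" can be located automatically. It assumes every line starts with an ISO 8601 timestamp, as in the excerpts above. The function names are just illustrative, and the sample list contains only the kernel.log lines quoted earlier.

```python
from datetime import datetime

def parse_timestamp(line):
    """Return the leading ISO 8601 timestamp of a log line, or None if absent."""
    token = line.split(" ", 1)[0]
    try:
        return datetime.fromisoformat(token)
    except ValueError:
        return None

def largest_gap(lines):
    """Return (gap, line_before, line_after) for the longest silence
    between two consecutive timestamped lines, or None if fewer than two."""
    best = None
    prev_ts, prev_line = None, None
    for line in lines:
        ts = parse_timestamp(line)
        if ts is None:
            continue  # skip lines without a parsable timestamp
        if prev_ts is not None:
            gap = ts - prev_ts
            if best is None or gap > best[0]:
                best = (gap, prev_line, line)
        prev_ts, prev_line = ts, line
    return best

# Lines copied from the kernel.log excerpt above.
kernel_tail = [
    "2023-01-08T23:19:21+00:00 BerroServer kernel: pcieport 0000:00:1c.1: Enabling MPC IRBNCE",
    "2023-01-08T23:19:21+00:00 BerroServer kernel: pcieport 0000:00:1c.1: Intel PCH root port ACS workaround enabled",
    "2023-01-09T11:17:29+00:00 BerroServer kernel: md: unRAID driver 2.9.25 installed",
]

gap, before, after = largest_gap(kernel_tail)
print("silence of", gap)
print("last line before:", before)
print("first line after:", after)
```

To run it on a real file, the list can be replaced with the lines read from that file. The line before the longest gap is only the last message that reached the log, which is not necessarily the exact moment of the crash.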
I searched the forum for the message "shfs: share cache full", and it looks like it shouldn't be the cause of this problem.
The only similar issue I found was the post unraid-became-mostly-unresponsive, which unfortunately went nowhere because there was no log.
If you need it, I can provide the full zip export of rsyslog.
Could it be a hardware-related problem? I have 4x8GB ECC RAM; would it be useful to run a memtest (after the parity check completes, of course)?
Thanks in advance for the support.
berroserver-diagnostics-20230109-1238.zip