October 15, 2020

Hello,

my media server recently developed a problem: it crashes at random times. It usually runs for 3-24 hours, and after that it is not accessible in any way: the web GUI is down, SSH is down, and all the Docker containers are down. The monitor connected via HDMI still shows the same information it displayed after start-up (command-line mode).

So I enabled the syslog server to record all the necessary information. This is what I extracted from syslog.txt (complete version below):

Aug 11 05:33:23 RaidByte root: umount: /mnt/disk1: target is busy.
Aug 11 05:33:23 RaidByte emhttpd: shcmd (105): exit status: 32
Aug 11 05:33:23 RaidByte emhttpd: Retry unmounting disk share(s)...
Aug 11 05:33:28 RaidByte emhttpd: Unmounting disks...
Aug 11 05:33:28 RaidByte emhttpd: shcmd (106): umount /mnt/disk1
Aug 11 05:33:28 RaidByte root: umount: /mnt/disk1: target is busy.
Aug 11 05:33:28 RaidByte emhttpd: shcmd (106): exit status: 32
Aug 11 05:33:28 RaidByte emhttpd: Retry unmounting disk share(s)...
Aug 11 05:33:31 RaidByte root: Status of all loop devices
Aug 11 05:33:31 RaidByte root: /dev/loop1: [2049]:4 (/boot/bzfirmware)
Aug 11 05:33:31 RaidByte root: /dev/loop0: [2049]:3 (/boot/bzmodules)
Aug 11 05:33:31 RaidByte root: /dev/loop3: [2305]:6442451073 (/mnt/disk1/system/libvirt/libvirt.img)
Aug 11 05:33:31 RaidByte root: Active pids left on /mnt/*
Aug 11 05:33:31 RaidByte root: USER PID ACCESS COMMAND
Aug 11 05:33:31 RaidByte root: /mnt/disk1: root kernel mount /mnt/disk1
Aug 11 05:33:31 RaidByte root: /mnt/disks: root kernel mount /mnt/disks
Aug 11 05:33:31 RaidByte root: Active pids left on /dev/md*
Aug 11 05:33:31 RaidByte root: USER PID ACCESS COMMAND
Aug 11 05:33:31 RaidByte root: /dev/md1: root kernel mount /mnt/disk1
Aug 11 05:33:31 RaidByte root: Generating diagnostics...

So it seems to unmount the disks for whatever reason and fails because a disk is busy. Does anyone have an idea why that could be?
Also, I recently had problems stopping the array because one random disk always failed to unmount (a different one every time), so I had to force it via the command line. (Maybe this is connected.) I should also mention that I'm currently on the beta version; that could be the problem too, but doesn't have to be. What do you think?

Thanks for any help!

System Info:
Version: 6.9.0-beta22
MOBO: MSI X470 GAMING PLUS MAX
CPU: Ryzen 5 1600
GPU: GT 710 (for Win10 VM passthrough)
HDDs: 3x 4TB (one Barracuda, one WD Blue, one WD Red)
SSDs: newly installed Kingston A2000 (1TB) as cache and a SanDisk Ultra (256 GB)

Diagnostics: raidbyte-diagnostics-20200811-0533.zip

October 16, 2020

See here, make sure you're using the correct "power supply idle control" setting.

October 16, 2020 (Author)

9 hours ago, JorgeB said:
See here, make sure you're using the correct "power supply idle control" setting.

Thanks for the suggestion! It seems related, since I also have a Gen 1 CPU, and those are prone to this problem on Linux machines. I found the exact "Power Supply Idle Control" setting on the MSI motherboard and set it to "typical current idle" as suggested. I will let you know if the issue persists. I can't give a definitive answer yet, since it happens quite randomly.

October 16, 2020 (Author)

12 hours ago, JorgeB said:
See here, make sure you're using the correct "power supply idle control" setting.

OK, sadly it just crashed again. I didn't have syslog turned on, so there are no new logs, but they probably wouldn't have helped more than the logs recorded previously anyway (see the first post). Any other ideas?

October 17, 2020

11 hours ago, xBotRaid said:
(see in first post)

I don't see any crash there, just a failure to unmount the disks because something was still using them. You can try this.
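
To narrow down what "something" might be, the syslog excerpt from the first post can be cross-checked: an attached loop device keeps the filesystem that holds its backing file busy, and the excerpt lists the loop devices right after the failed umount. The following is only a minimal Python sketch, not an Unraid tool. The function name `find_busy_mount_holders` and both regular expressions are my own assumptions, written to match the line shapes seen in this particular log. A hit is a candidate to investigate, not proof of the cause.

```python
import re

# Excerpt copied from the syslog in the first post.
SYSLOG = """\
Aug 11 05:33:28 RaidByte emhttpd: shcmd (106): umount /mnt/disk1
Aug 11 05:33:28 RaidByte root: umount: /mnt/disk1: target is busy.
Aug 11 05:33:31 RaidByte root: /dev/loop1: [2049]:4 (/boot/bzfirmware)
Aug 11 05:33:31 RaidByte root: /dev/loop0: [2049]:3 (/boot/bzmodules)
Aug 11 05:33:31 RaidByte root: /dev/loop3: [2305]:6442451073 (/mnt/disk1/system/libvirt/libvirt.img)
"""

# Assumed line shapes, taken from the excerpt above (hypothetical patterns).
BUSY_RE = re.compile(r"umount: (?P<mount>\S+): target is busy")
LOOP_RE = re.compile(r"(?P<dev>/dev/loop\d+): \[\d+\]:\d+ \((?P<backing>[^)]+)\)")


def find_busy_mount_holders(log_text):
    """Map each mount that failed to unmount to the loop devices
    whose backing file lives underneath that mount point."""
    busy_mounts = set()
    loops = []
    for line in log_text.splitlines():
        busy = BUSY_RE.search(line)
        if busy:
            busy_mounts.add(busy.group("mount"))
        loop = LOOP_RE.search(line)
        if loop:
            loops.append((loop.group("dev"), loop.group("backing")))
    return {
        mount: [
            (dev, backing)
            for dev, backing in loops
            if backing.startswith(mount.rstrip("/") + "/")
        ]
        for mount in sorted(busy_mounts)
    }


if __name__ == "__main__":
    for mount, holders in find_busy_mount_holders(SYSLOG).items():
        print(mount, holders)
```

On this excerpt the sketch pairs /mnt/disk1 with /dev/loop3, whose backing file is /mnt/disk1/system/libvirt/libvirt.img, so the libvirt image would be the first thing to look at when the disk refuses to unmount.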

October 22, 2020 (Author)

On 10/17/2020 at 8:48 AM, JorgeB said:
Don't see any crash there, just failure to unmount the disks because something was still using them, you can try this.

Just as an update: I set up your recommended logging method, but the issue has not occurred again since then. If it does occur again, I'll post the log here. Thanks for the suggestion.

Edited October 22, 2020 by xBotRaid