The_N4RF Posted January 22, 2020
Attachment: syslog.zip
My "onprem" Unraid server is acting strangely. Every day or two the webgui starts showing strange data and fails to update fields.
Strange data: shows 3 of 8 threads pegged at 100%, while htop shows no activity.
Fail to load: the Unassigned Devices section of Main, Docker, and Shares are all missing data or don't load at all.
If I click around the webgui a few times it generally crashes the GUI completely and the webpage won't load anymore ("internal server error" or "Gateway Time-out"). Meanwhile SMB shares still respond as normal and SSH works, except for "diagnostics" and any of the shutdown/restart commands. The server broadcasts that it is collecting diagnostic data, or shutting down, but nothing actually happens. The only way to get the server back to normal temporarily is to hard-reset and face the dreaded parity check.
I was able to manually copy the syslog.txt file (attached). I'm not running any VMs, just three containers (pihole, zoneminder, plex). Thanks so much for the help!
Hardware is quite old; I can try to get more specific if needed:
i7 3770
32GB RAM
Array: 4 x HDDs, 3 x SSDs (Cache)
1 x Unassigned Device (USB data traveler)
Edited January 22, 2020 by The_N4RF
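One way to settle the disagreement between the webgui and htop is to read the kernel's own per-core counters from `/proc/stat` and compute the busy percentage directly. The following is only a rough sketch, assuming Python 3 is available on the server (it may need to be installed separately); the function names `parse_proc_stat`, `busy_percent`, and `sample` are made up for illustration and are not part of Unraid.

```python
import time


def parse_proc_stat(text):
    """Return {cpu_name: (busy_ticks, total_ticks)} for each per-core line."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        # Per-core lines look like "cpu0 ..."; skip the aggregate "cpu" line
        # and everything else (intr, ctxt, btime, ...).
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        # Columns: user nice system idle iowait irq softirq steal
        values = [int(v) for v in fields[1:9]]
        idle = values[3] + (values[4] if len(values) > 4 else 0)
        total = sum(values)
        counters[fields[0]] = (total - idle, total)
    return counters


def busy_percent(before, after):
    """Compare two snapshots and return {cpu_name: percent busy in between}."""
    usage = {}
    for name, (busy1, total1) in after.items():
        busy0, total0 = before.get(name, (0, 0))
        delta_total = total1 - total0
        if delta_total > 0:
            usage[name] = 100.0 * (busy1 - busy0) / delta_total
        else:
            usage[name] = 0.0
    return usage


def sample(interval=1.0, path="/proc/stat"):
    """Take two snapshots `interval` seconds apart and return per-core usage."""
    with open(path) as f:
        before = parse_proc_stat(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_proc_stat(f.read())
    return busy_percent(before, after)
```

Calling `sample()` over SSH while the webgui claims cores are pegged returns a dictionary of per-core percentages. If those values are near zero, the kernel agrees with htop and the webgui numbers are probably stale rather than real load.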
trurl Posted January 22, 2020
Can you get diagnostics before it starts getting weird?
The_N4RF Posted January 22, 2020
After rebooting last night it already failed again today. I'll reboot again when I can. Thanks!
The_N4RF Posted January 22, 2020
Attachment: onprem-diagnostics-20200122-1537.zip
Got the diagnostics before starting the array. Forgot to mention I'm on Unraid 6.8.1.
trurl Posted January 22, 2020
Diagnostics taken after starting the array would have given more information. Are you booting from a USB2 port? You should.
The_N4RF Posted January 23, 2020
Booting from a USB2 thumbdrive installed in a USB3 port. Thanks for the tip, I will switch it on the next reboot.
OK, diagnostics after starting the array: onprem-diagnostics-20200122-1743.zip
trurl Posted January 23, 2020
Your symptoms suggest a flash problem, so switching the port may be the solution.
Your "system" shares (appdata, domains, system) have files on the array. You probably created dockers/VMs before installing cache, so they got created on the array. It is best if they are all on cache and set to stay on cache. Your system share specifically is set to get moved to the array. Mover can't move open files, so the Docker and VM services would have to be disabled to get them moved. Do you understand the Use cache settings?
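To see which of those shares still have files on array disks, each disk can be checked directly instead of going through the merged user share view. This is a minimal sketch, assuming Python 3 is available on the server and assuming the usual Unraid layout where the cache pool is mounted at `/mnt/cache` and array disks at `/mnt/disk1`, `/mnt/disk2`, and so on; the helper names `share_locations` and `shares_off_cache` are hypothetical.

```python
from pathlib import Path


def share_locations(share, mnt_root="/mnt"):
    """Return the devices under mnt_root that hold at least one file for `share`."""
    found = []
    for device in sorted(Path(mnt_root).iterdir()):
        # Only inspect the cache pool and individual array disks (disk1, disk2, ...),
        # not the merged user share view or other mount points.
        is_array_disk = device.name.startswith("disk") and device.name[4:].isdigit()
        if device.name != "cache" and not is_array_disk:
            continue
        share_dir = device / share
        # An empty leftover folder does not count; look for at least one file.
        if share_dir.is_dir() and any(p.is_file() for p in share_dir.rglob("*")):
            found.append(device.name)
    return found


def shares_off_cache(shares, mnt_root="/mnt"):
    """Return {share: [array disks]} for shares that have files outside cache."""
    report = {}
    for share in shares:
        on_array = [d for d in share_locations(share, mnt_root) if d != "cache"]
        if on_array:
            report[share] = on_array
    return report
```

Running `shares_off_cache(["appdata", "domains", "system"])` should return an empty dictionary once everything lives only on cache. Any share that still appears needs the Docker and VM services stopped before mover can relocate its files.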
The_N4RF Posted January 24, 2020
Thanks, I'll try to get those moved as well. I'll update after these changes.
The_N4RF Posted January 25, 2020
OK, the USB flash drive is moved to a USB2 port, and all of the data for appdata, system, and domains is hosted only on the cache. Thanks for the help, we'll see how it goes!
I did notice this week that if I left my Win10 VM running, the server seemed to be fine. But if I stopped or force-stopped the VM, I would find the Unraid server unusable soon after. This has only been tested twice, though.
The_N4RF Posted January 26, 2020
Failed again (VM was stopped). Will try again and leave the VM running.
- Webgui showing several CPUs maxed, but htop is showing idle.
- Unassigned Devices won't load (there should be a 2.5" USB hard drive here).
These are the initial display errors. Eventually the page will not load at all anymore.
The_N4RF Posted January 31, 2020
Yup, it worked for 2 days with the VM running. All I did was stop the VM, then 8 hours later I found the Unraid server effectively crashed. Nonsense CPU load reporting. The webgui eventually stops responding. SSH connects and most commands seem to work, but the server will not shut down or restart.
Next step? It seems to be isolated to the VM. What changes when I stop the VM?