Matanceros Posted March 1, 2021 So hi there! I'm pretty new to the whole unRAID thing, but I'm loving learning every step of the way! I've run into something that I can't solve. The server fails at seemingly random intervals while write tasks are running. These failures require hard restarts: I can't access the server from my browser, the keyboard/mouse connected to the server is unresponsive, etc. Basically the system is powered up but unresponsive on all fronts. I've tried cancelling parity and all (seems to help, but it still fails). Some solutions recommend a cache drive (getting one in the mail in a few days), but I'm looking for other solutions in the meantime. Any help is greatly appreciated! System specs: Dell T110ii, E3 1270-v2, 16GB 1600MHz ECC, 6 TB SAS x3
Hoopster Posted March 1, 2021 43 minutes ago, Matanceros said: "Any help is greatly appreciated!" Without diagnostics, any attempt to help will be a shot in the dark, as there is very little information to go on at this point. Get your system diagnostics (Tools --> Diagnostics from the GUI or 'diagnostics' from the command line) and attach them to your next post, preferably after the system has been running for a while and before it locks up. Diagnostics taken right after a reboot don't often contain a lot of meaningful information. You may also want to set up a syslog server to capture information even in the event of a lockup and reboot. Edited March 1, 2021 by Hoopster
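For readers who want to see what the "syslog server" suggestion amounts to, here is a minimal sketch of a UDP syslog receiver that could run on a second machine and keep logging even if the server locks up. It is an illustration, not unRAID's own syslog feature. The output directory `remote-syslog`, the port 5514 (standard syslog uses UDP 514, which normally needs root), and the helper names `append_message` and `serve` are all assumptions made for this example. The per-sender file name only mirrors the attachment names seen later in this thread.

```python
import socketserver
from datetime import datetime
from pathlib import Path

LOG_DIR = Path("remote-syslog")   # hypothetical output directory
LISTEN_ADDR = ("0.0.0.0", 5514)   # assumed port; standard syslog is UDP 514


def append_message(log_dir: Path, sender_ip: str, payload: bytes) -> Path:
    """Append one received syslog datagram to a per-sender log file."""
    log_dir.mkdir(parents=True, exist_ok=True)
    log_file = log_dir / f"syslog-{sender_ip}.log"
    text = payload.decode("utf-8", errors="replace").rstrip("\r\n")
    stamp = datetime.now().isoformat(timespec="seconds")
    # Open and close the file for every message so that lines are not
    # held in a long-lived buffer when the sending machine dies.
    with log_file.open("a", encoding="utf-8") as fh:
        fh.write(f"{stamp} {text}\n")
    return log_file


class SyslogHandler(socketserver.BaseRequestHandler):
    def handle(self) -> None:
        # For UDP servers, self.request is a (data, socket) pair.
        data, _sock = self.request
        append_message(LOG_DIR, self.client_address[0], data)


def serve() -> None:
    """Listen for syslog datagrams until interrupted (blocks forever)."""
    with socketserver.UDPServer(LISTEN_ADDR, SyslogHandler) as server:
        server.serve_forever()
```

Calling `serve()` on the receiving machine starts the listener. The crashing server would then need to be pointed at that machine's address and port in its remote syslog settings.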
Matanceros Posted March 2, 2021 8 hours ago, Hoopster said: "Without diagnostics, any attempt to help will be a shot in the dark, as there is very little information to go on at this point. Get your system diagnostics (Tools --> Diagnostics from the GUI or 'diagnostics' from the command line) and attach them to your next post, preferably after the system has been running for a while and before it locks up. Diagnostics taken right after a reboot don't often contain a lot of meaningful information. You may also want to set up a syslog server to capture information even in the event of a lockup and reboot." Thanks for the reply! I left the rig on overnight to try to get some info, but it seems that it crashed again this morning. I'm setting up a syslog server now and will hopefully be able to produce something to upload by end of day.
Matanceros Posted March 2, 2021 Woke up today with an unresponsive server; I restarted and my disks were missing. A second restart brought the disks back. Log attached. What's ioctl? Attachment: syslog-192.168.31.88.log
JorgeB Posted March 3, 2021 If you're using SAS disks, disable spin down to get rid of those errors.
Matanceros Posted March 3, 2021 2 hours ago, JorgeB said: "If you're using SAS disks, disable spin down to get rid of those errors." Aite, I'll try disabling spin down. Hopefully it'll solve my crashing issue. Will report back with more on the next crash! Edit: Alternatively, would this help with the issue? Edited March 3, 2021 by Matanceros (New info)
JorgeB Posted March 3, 2021 5 minutes ago, Matanceros said: "Alternatively, would this help with the issue?" Yes. The spin down errors are likely unrelated to the crashing, but it will at least stop them from spamming the log.
Matanceros Posted March 3, 2021 Just crashed again. Uploading the new log. It seems that the log collected is not very verbose. Are there any settings to improve the verbosity of the log? As an added measure, I've also attached the syslog downloaded via the Tools page (diagnostics). Any and all help is appreciated. Attachments: syslog-192.168.31.88-20210303-2310hrs.log, syslog.txt
trurl Posted March 3, 2021 9 minutes ago, Matanceros said: "I've also attached the syslog downloaded via the tools page (diagnostics)." Those are not the Diagnostics, but they are available on the Tools page. Diagnostics includes syslog, SMART for all disks, configuration, hardware and other information in one nice neat package. I seldom look at syslog before looking at other things in the Diagnostics. Please attach the complete Diagnostics ZIP file to your NEXT post in this thread.
JorgeB Posted March 3, 2021 Try this, then post that log after a crash.
Matanceros Posted March 3, 2021 3 minutes ago, trurl said: "Those are not the Diagnostics, but they are available on the Tools page. Diagnostics includes syslog, SMART for all disks, configuration, hardware and other information in one nice neat package. I seldom look at syslog before looking at other things in the Diagnostics. Please attach the complete Diagnostics ZIP file to your NEXT post in this thread." Re-exported and attached. 3 minutes ago, JorgeB said: "Try this, then post that log after a crash." Yes, I have this set up. The syslog-192.168.31.88.log file I've been attaching is the syslog I'm getting from it. Thanks for the help! Attachment: tower-diagnostics-20210303-2329.zip
JorgeB Posted March 3, 2021 1 hour ago, Matanceros said: "Yes, I have this set up. The syslog-192.168.31.88.log file I've been attaching is the syslog I'm getting from it." I don't see anything crash related in that log, which usually suggests a hardware problem. One thing you can try is to boot the server in safe mode with all Docker containers and VMs disabled and let it run as a basic NAS for a few days. If it still crashes, it's likely a hardware problem. If it doesn't, start turning the other services back on one by one.
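As a rough illustration of what checking a log for "anything crash related" can look like, the sketch below scans a captured syslog for strings the Linux kernel commonly prints around a crash. The marker list is an assumption chosen for this example and is far from exhaustive. The function names `find_crash_lines` and `scan_file` are hypothetical. An empty result does not prove the log is clean; it is only consistent with the hardware suspicion above, since a hard lockup often leaves no trace at all.

```python
import re
from pathlib import Path

# Assumed, non-exhaustive set of strings the Linux kernel commonly logs
# around a crash. Matching is case-sensitive so that "BUG:" does not
# also match harmless text such as "debug:".
CRASH_MARKERS = re.compile(
    r"Kernel panic|Call Trace|Oops|BUG:|[Mm]achine [Cc]heck|mce:"
    r"|Out of memory|soft lockup|hard LOCKUP"
)


def find_crash_lines(lines):
    """Return (line_number, text) pairs for lines matching a crash marker."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if CRASH_MARKERS.search(line)
    ]


def scan_file(path):
    """Scan a syslog file on disk and return its suspicious lines."""
    with Path(path).open(encoding="utf-8", errors="replace") as fh:
        return find_crash_lines(fh)
```

For example, `scan_file("syslog-192.168.31.88.log")` would return the matching lines from the log attached earlier in the thread, assuming the file is in the current directory.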
Matanceros Posted March 4, 2021 Aite, thanks! It stayed alive all night last night, so I'm formulating a plan to check the hardware. First would be my SAS drives. Is it alright if I take one offline per day and see how the server performs? Would unRAID have any problems with that? All advice and pointers are deeply appreciated!
JorgeB Posted March 4, 2021 You can, but it's unlikely to be the disks crashing the server.
Matanceros Posted March 6, 2021 Crashed again today, but this time the screen froze with two notifications. Right now the machine is booted, but disk 2 has been automatically disabled. I'm keeping an eye on whether the machine fails again and may look into swapping out disk 2.