
6.8.3 - unRAID fails randomly



So hi there!
I'm pretty new to the whole unRAID thing, but I'm loving learning every step of the way!

Right now I've run into something that I can't really solve.
The server locks up at seemingly random intervals while write tasks are being performed.

These failures require hard restarts: I can't access the server from my browser, and the keyboard and mouse connected to the server are unresponsive. Basically the system is powered up, but it is unresponsive on all fronts.

 

I've tried cancelling parity (it seems to help, but the server still fails).

Some solutions recommend a cache drive (I'm getting one in the mail in a few days), but I'm looking for other solutions in the meantime.

 

Any help is greatly appreciated!

 

System specs:

Dell T110ii
E3 1270-v2
16GB 1600MHz ECC
6 TB SAS x3
 

Link to comment
43 minutes ago, Matanceros said:

Any help is greatly appreciated!

Without diagnostics, any attempt to help will be a shot in the dark as there is very little information to go on at this point.

 

Get your system diagnostics (Tools --> Diagnostics from the GUI, or 'diagnostics' from the command line) and attach them to your next post, preferably after the system has been running for a while and before it locks up. Diagnostics taken right after a reboot don't often contain a lot of meaningful information.

 

You may also want to set up a syslog server to capture information even in the event of a lockup and reboot.
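
For anyone wondering what a syslog server actually does here: it is just another machine that receives the log lines over the network and writes them to disk, so they survive when the sending box locks up. Below is a minimal Python sketch of that idea, not the unRAID feature itself. The port 5514, the function names, and the `syslog-<ip>.log` file naming are my own assumptions for illustration (standard syslog uses UDP port 514, which normally needs root to bind).

```python
import socket
from datetime import datetime
from pathlib import Path


def parse_priority(data: bytes):
    """Return (facility, severity) from a leading '<PRI>' field, or None if absent."""
    if not data.startswith(b"<"):
        return None
    end = data.find(b">")
    if end == -1 or not data[1:end].isdigit():
        return None
    pri = int(data[1:end])
    # Syslog encodes priority as facility * 8 + severity.
    return pri // 8, pri % 8


def format_record(sender_ip: str, data: bytes, now=None) -> str:
    """Build one log line: receive time, sender address, decoded priority, raw message."""
    now = now or datetime.now()
    text = data.decode("utf-8", errors="replace").rstrip("\r\n\x00")
    pri = parse_priority(data)
    tag = f"facility={pri[0]} severity={pri[1]}" if pri else "no-priority"
    return f"{now.isoformat(timespec='seconds')} {sender_ip} [{tag}] {text}\n"


def serve(host: str = "0.0.0.0", port: int = 5514, log_dir: str = ".") -> None:
    """Receive UDP syslog datagrams forever and append them to one file per sender.

    Port 5514 is a hypothetical unprivileged port chosen for this sketch.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((host, port))
        while True:
            data, (sender_ip, _sender_port) = sock.recvfrom(8192)
            log_file = Path(log_dir) / f"syslog-{sender_ip}.log"
            # Open, append, and close per message so each line reaches disk
            # promptly even if the sender dies right afterwards.
            with log_file.open("a", encoding="utf-8") as fh:
                fh.write(format_record(sender_ip, data))


if __name__ == "__main__":
    serve()
```

Run it on a second machine that stays up, then point the server's remote syslog target at that machine's IP and the chosen port. After a lockup, the last lines in the file are whatever the server managed to send before it froze.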

 

Edited by Hoopster
Link to comment
8 hours ago, Hoopster said:

Without diagnostics, any attempt to help will be a shot in the dark as there is very little information to go on at this point.

 

Get your system diagnostics (Tools --> Diagnostics from the GUI, or 'diagnostics' from the command line) and attach them to your next post, preferably after the system has been running for a while and before it locks up. Diagnostics taken right after a reboot don't often contain a lot of meaningful information.

 

You may also want to set up a syslog server to capture information even in the event of a lockup and reboot.

 

Thanks for the reply!
I left the rig on overnight to try to get some info, but it had crashed again by this morning.

I'm setting up a syslog server now and will hopefully be able to produce something to upload by the end of the day.

Link to comment
2 hours ago, JorgeB said:

If you're using SAS disks disable spin down to get rid of those errors.

Aite, I'll try disabling spin down. Hopefully it'll solve my crashing issue. Will report back with more on next crash!

 

Edit: Alternatively, would this help with the issue? 

 

Edited by Matanceros
New info
Link to comment
9 minutes ago, Matanceros said:

I've also attached the syslog downloaded via the tools page (diagnostics).

Those are not the Diagnostics, but they are available on the Tools page. Diagnostics includes syslog, SMART for all disks, configuration, hardware and other information in one nice neat package. I seldom look at syslog before looking at other things in the Diagnostics.

 

Please attach the complete Diagnostics ZIP file to your NEXT post in this thread.

 

Link to comment
3 minutes ago, trurl said:

Those are not the Diagnostics, but they are available on the Tools page. Diagnostics includes syslog, SMART for all disks, configuration, hardware and other information in one nice neat package. I seldom look at syslog before looking at other things in the Diagnostics.

 

Please attach the complete Diagnostics ZIP file to your NEXT post in this thread.

 

Re-exported and attached.

 

 

3 minutes ago, JorgeB said:

Try this then post that log after a crash.

Yes, I have this set up. The syslog-192.168.31.88.log file I've been attaching is the syslog I'm getting from it.

image.thumb.png.5d207103d775549550809e74be7e510a.png

 

 

Thanks for the help!

 

tower-diagnostics-20210303-2329.zip

Link to comment
1 hour ago, Matanceros said:

Yes, I have this set up. The syslog-192.168.31.88.log file I've been attaching is the syslog I'm getting from it.

I don't see anything crash-related in that log, which usually suggests a hardware problem. One thing you can try is to boot the server in safe mode with all Docker containers/VMs disabled and let it run as a basic NAS for a few days. If it still crashes, it's likely a hardware problem; if it doesn't, start turning the other services back on one by one.

Link to comment

aite, thanks!

It stayed alive all night last night, so I'm formulating a plan to check the hardware. First would be my SAS drives.

 

Is it alright if I take one offline a day, and see how it performs?
Would unRAID find any problems with that?

 

All advice and pointers are deeply appreciated!

Link to comment
