[SOLVED] unRaid crashing - Disk file system corruption



Hi guys,

Hoping to get some assistance.

I have been running unRaid 5 since its stable release and have had no issues except a drive failure (an easy swap), but lately I am not sure what is going on. When I turn my server on, it runs like normal. I check SAB and sickbeard, and then 2 minutes later it's gone. I cannot get to /tower, sickbeard, SAB or unmenu.

If I get in quickly enough and stop the array, things seem to be fine, but the syslog shows me nothing that I can understand. By that point I have already hard rebooted it, so the syslog from when it failed is gone.

So I connected a monitor and keyboard to get the syslog that way, but I cannot type anything. It does not show the usual Tower login prompt (whatever it is when you log in); all it showed was a list of different [<C------------>] BLAH BLAH BLAH lines (I had no way of capturing this, sorry). Attached is what I could get, but I am not sure it is going to help.

Your advice is appreciated.

syslog.txt

Link to comment

I suspect what you saw was a Call Trace, possibly a kernel 'OOPS'.  That plus 2 segfaults in the syslog, neither associated with dependency issues, points to a memory fault as the most likely cause.  Reboot and run the Memtest from the boot menu, and let it run for multiple passes or until errors appear.
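
For anyone who wants to check their own syslog for the same signs, here is a rough Python sketch that counts segfault and Call Trace lines. The search patterns are my assumption about typical Linux kernel log wording, and `count_crash_markers` and `scan_syslog` are just illustrative names, not anything that ships with unRAID.

```python
import re
from collections import Counter

# Assumed wording of typical kernel log lines; adjust if your syslog differs.
PATTERNS = {
    "segfault": re.compile(r"\bsegfault at\b"),
    "call_trace": re.compile(r"\bCall Trace:"),
    "oops": re.compile(r"\bOops\b", re.IGNORECASE),
}


def count_crash_markers(lines):
    """Count how many lines match each crash-related pattern."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


def scan_syslog(path):
    """Scan a saved syslog file (hypothetical path supplied by the caller)."""
    with open(path, errors="replace") as handle:
        return count_crash_markers(handle)
```

Calling `scan_syslog("syslog.txt")` on a saved copy gives a quick tally, but it only tells you the markers are there. It cannot tell a memory fault from any other cause, so Memtest is still the real check.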

Link to comment

Thanks RobJ,

Running memtest now: 3 passes so far and no errors. I'll let it run for a few more hours and see what happens.

Any other thoughts? This same problem has happened a few times.

Would upgrading to v6 give any benefit, i.e. different tools or finer troubleshooting? The only reason I haven't is the hassle of transferring all my sickbeard and SAB settings.

Again, the help is appreciated.

Link to comment

So I ran the memory check out to 10 passes with no errors.

Thought I would see how things go. Everything started, so I jumped onto SAB and sickbeard to see what I had missed. And then it was gone.

I took a photo of what the screen shows when a monitor is connected to the server. Hope that gives some more ideas.

Thanks

unRaidCrash.jpg.a1b6f7f644237644a2bbe692d4495487.jpg

Link to comment

Thanks,

So I ran the file system check across my drives, and one came back with this:

Comparing bitmaps..Bad nodes were found, Semantic pass skipped

10 found corruptions can be fixed only when running with --rebuild-tree

So I looked at doing the rebuild, but I read it has its risks. What are your thoughts?

And can someone explain why this may only cause crashes when the array is running, while the server is perfectly stable with the array stopped or in maintenance mode?

Cheers

SDG_Disk4_Error.txt

Link to comment

Yes, you have a badly corrupted file system on that disk, with multiple corruptions dictating the need for the --rebuild-tree option. I assume you've been reading Check Disk File systems? Go ahead and proceed with that option, but be aware there's a good chance of some data loss, and even what it recovers may require some handwork to restore the right file name and put it in the right folder.

It's rare, but there have been other occasions when certain forms of corruption in the Reiser file system could actually crash the system. The crashes could of course only happen when you were actually using the corrupt file system on that drive.
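
If you save the check report to a text file, the decision above can be sketched in a few lines of Python. This is only an illustration: the --rebuild-tree wording is taken from the report quoted earlier in the thread, the --fix-fixable branch assumes the report names that option in the same way, and both function names are made up for the example.

```python
import re


def recommend_next_step(check_output):
    """Return the reiserfsck option the check report asks for, or None."""
    text = check_output.lower()
    # --rebuild-tree takes priority: the report says nothing less will do.
    if "--rebuild-tree" in text:
        return "--rebuild-tree"
    # Assumption: milder damage is reported with a --fix-fixable hint.
    if "--fix-fixable" in text:
        return "--fix-fixable"
    return None


def corruption_count(check_output):
    """Extract the number from a 'N found corruptions' summary line."""
    match = re.search(r"(\d+) found corruptions", check_output)
    return int(match.group(1)) if match else 0
```

Fed the two lines from the Disk4 report, `recommend_next_step` returns "--rebuild-tree" and `corruption_count` returns 10. Treat the result as a reminder of what the report already says, not as a substitute for reading it before running a rebuild.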

Link to comment

OK, thanks all for your advice.

So far so good, but it has only been 10 minutes. I have ended up with only 7 files in my lost+found folder, so it should be easy to sort. I can't seem to get access to it, though. Any advice on this? :)

I'll see how the server goes for a week, and if I don't have any more issues we can call this solved.

Link to comment

The lost+found folder will not have the correct permissions to allow network access.  This can be corrected by running the 'newperms' command against that folder from a telnet session.
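
To see which recovered entries are locked down before and after running newperms, a small Python sketch like the one below can list them. It assumes that "not readable by group or others" is a fair stand-in for "not reachable over the network", which may not match your share settings exactly, and `find_restricted` is a hypothetical helper name.

```python
import os
import stat


def find_restricted(root):
    """Yield (path, mode string) for entries that neither group nor others can read."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            mode = os.lstat(path).st_mode
            # Assumption: no group/other read bit means no network access.
            if not mode & (stat.S_IRGRP | stat.S_IROTH):
                yield path, stat.filemode(mode)
```

Run it against the lost+found folder on the affected disk. An empty result after newperms suggests the permissions were opened up, though the ownership may also matter for network access.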
Link to comment
