[SOLVED]: New build keeps dying


I have a new build, and every time I try to transfer data to the NAS it dies after a while. When I check the console, it has a bunch of memory addresses and processes dumped to the screen. I can't get any syslogs or see what's causing it, because the box is totally locked up at that point.

 

Link to comment

Run a memtest (it is an option on the boot menu, instead of booting unRAID).

 

What is your hardware?  Give us a breakdown of everything in your server.

Link to comment

Memtest is running right now (50% done, no errors yet).

 

Hardware:

Gigabyte G41M-ES2L motherboard

Q6600 processor

4GB ram DDR2-800

PCI sil3114 sata controller flashed to IDE mode

Cruzer USB 4gb flash drive

parity  WDC_WD20EARS No jumper, formatted as MBR 4k aligned

disk1  WDC_WD10EACS

disk2  ST3500630AS

disk3  ST3500630AS

disk4  ST3500630AS

disk5  ST3500630AS

cache  WDC_WD740ADFS

Link to comment

See this thread that is stickied at the top of the forum.  There is a command in there that will allow you to follow the syslog as things are written to it.  Use that command once the memtest has run for at least 8 hours.

 

Let the "tail" command run until the server crashes again, then copy and paste the output into a text file that you can attach to your next post.

Link to comment

Trust me, I've tried the tail. The only thing on the screen is exactly what I showed you in the screenshot.

 

I've tried both 4.7 and the 5.6a beta, and the exact same thing occurs with both. I've been trying to build this unRAID box for 3 days now, but it just keeps crashing.

Link to comment

Your stability problems are most likely a hardware issue: memory, motherboard, power supply, controller, or a hard disk. The best way to troubleshoot these sorts of problems is to swap out parts. The couple of times I have had similar issues, it turned out to be a problem hard drive.

 

You have another lurking problem, and that is HPA:

 

http://lime-technology.com/wiki/index.php?title=UnRAID_Topical_Index#HPA

 

I have a similar Gigabyte board (G31M-ES2L) that I use with Windows. HPA cannot be disabled on my board, and that is most likely the case with your board as well. If so, the G41M-ES2L is not suitable for use with unRAID.

Link to comment

I checked out HPA before I purchased the board for this system a week ago.  The board defaults to HPA off meaning this board is fine.

 

I will begin swapping out parts to see if I can get rid of the issue.

  • Is your motherboard made by Gigabyte? If so, you need to check to see if your motherboard has an HPA 'feature'. This 'feature' (or curse in the unRAID community) can make your motherboard incompatible with unRAID in a very subtle and nefarious way. You can read more about HPA here. HPA goes by many names, such as 'Save a copy of BIOS to HDD' and 'Backup BIOS Image to HDD'. Look around in your board's BIOS for something like that - it is generally in the 'Advanced BIOS Features' tab.
     
     
     
     
  • If your motherboard has HPA enabled by default, this is very bad. Try upgrading the BIOS. If the most up-to-date version of the BIOS still has HPA enabled by default, then the motherboard is incompatible with unRAID. Throw it on the ground. Actually, don't, because it will most likely still work just fine as a desktop, HTPC, or any other purpose - just not for unRAID.
     
     
  • If the motherboard has HPA disabled by default, then it is most likely compatible with unRAID. Most Gigabyte motherboards manufactured in 2009 and later have HPA disabled by default. To be doubly-extra sure, you can perform this simple test - shut down the computer/server, disconnect the power supply, and clear the CMOS (either by removing the CMOS battery for 30 mins or more, or CAREFULLY connecting the two Clear_CMOS jumpers on the motherboard with either a flathead screwdriver or a jumper). After clearing the CMOS, boot into BIOS and see if HPA is still disabled. If it is, you are good to go. If not, try updating the BIOS and running this test again.
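
Besides looking in the BIOS, the drives themselves can be checked for an HPA from the unRAID console. The following is only a rough sketch, not an official procedure. It assumes that `hdparm` is available on the server and that `hdparm -N` prints a line of the form `max sectors = <visible>/<native>, HPA is ...`. The function name `hpa_status` and the device name `/dev/sdX` are placeholders made up for this example.

```shell
# hpa_status: read the output of `hdparm -N <device>` on stdin and report
# whether the drive exposes fewer sectors than its native maximum.
# Assumed input line format (not verified against every hdparm version):
#   " max sectors   = <visible>/<native>, HPA is enabled|disabled"
hpa_status() {
    awk '
        /max sectors/ {
            # Everything after "=" holds "<visible>/<native>, HPA is ..."
            split($0, parts, "=")
            split(parts[2], counts, "[/,]")
            visible = counts[1] + 0
            native  = counts[2] + 0
            found = 1
        }
        END {
            if (!found)                print "unknown"
            else if (visible < native) print "hpa-present"
            else                       print "no-hpa"
        }
    '
}
```

It would be used as `hdparm -N /dev/sdX | hpa_status`, once per drive. A result of `hpa-present` would suggest that something (possibly the BIOS) has reserved space at the end of that disk. `unknown` just means the expected line was not found, so the raw `hdparm` output should be read by eye.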
     

Link to comment

Good to hear, and good luck. If you don't have any critical data, I would recommend starting with just one data disk and no parity. That way you can rule out your PCI SATA controller first and then the motherboard SATA controller. The tail command is something to try, but most times it won't provide meaningful insight into a problem like yours.

Link to comment

What version of unRAID are you running??

 

And the tail should work just fine. It will get overwritten slightly, but it will still give you something.

 

Something to the effect of:

tail -f --lines=100 /var/log/syslog

 

 

 

Run this command in a telnet window. Then you can scroll back up.
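
If scrolling back in the telnet window is not enough, the same tail can also be copied to a file as it runs. This is just a sketch under some assumptions: the helper name `capture_log` is made up, and the destination path in the usage line assumes the flash drive is mounted at `/boot`. On a hard lockup the last few lines may still never reach the disk, so the telnet window remains the more reliable record.

```shell
# capture_log SRC DEST [LINES] [follow|once]
# Show the last LINES lines of SRC (default 100) and append them to DEST.
# In "follow" mode (the default) it keeps running and copies new lines as
# they are written, until interrupted with Ctrl-C or the server locks up.
capture_log() {
    src=$1
    dest=$2
    lines=${3:-100}
    mode=${4:-follow}
    if [ "$mode" = "follow" ]; then
        tail -n "$lines" -f "$src" | tee -a "$dest"
    else
        # One-shot snapshot, handy for checking the paths before a long run.
        tail -n "$lines" "$src" | tee -a "$dest"
    fi
}
```

For example: `capture_log /var/log/syslog /boot/syslog-capture.txt 100`. After the crash and a reboot, the file can be attached to a post.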

Link to comment

At this point it does appear to be my WD20EARS drive that was causing the kernel panics.

 

I have removed the WD20EARS and am currently stress testing the system.  I am at 5 hours stable while transferring data.  Previously I could not run for more than 30 minutes.

 

Wonder what makes a particular drive cause kernel panics?

 

The drive does not have any bad sectors, and no errors are reported in SMART.
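
For anyone wanting to repeat that SMART check from the console, here is a minimal sketch. It assumes `smartctl` (from smartmontools) is present on the server and that `smartctl -A` prints its usual attribute table with the raw value in the tenth column. The function name `smart_problem_counts` and the device `/dev/sdX` are placeholders. As this thread shows, clean numbers here do not prove the drive is innocent.

```shell
# smart_problem_counts: read `smartctl -A <device>` output on stdin and print
# the raw values of attributes commonly associated with sector or cabling
# problems, one "NAME=RAW" pair per line.
# Assumed table layout: ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED
# WHEN_FAILED RAW_VALUE (so the raw value is field 10).
smart_problem_counts() {
    awk '
        $2 == "Reallocated_Sector_Ct"  ||
        $2 == "Current_Pending_Sector" ||
        $2 == "Offline_Uncorrectable"  ||
        $2 == "UDMA_CRC_Error_Count" {
            print $2 "=" $10
        }
    '
}
```

Usage would be `smartctl -A /dev/sdX | smart_problem_counts`. Non-zero raw values for the sector attributes are a reason to suspect the disk, while a rising CRC error count tends to point at cables or the controller instead.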

Link to comment

So after nearly a week of troubleshooting and swapping out and stress testing every single piece of hardware... I've come to the conclusion that unraid is entirely too unstable an OS for me to trust my data to.  I will be searching for another product.  Just wish I could get my $$ back now.

 

GL to all you who trust your data in this OS.

Link to comment


Then you have people like me, who built a system from scratch with no Linux knowledge, using all WD 2 TB EARS drives. It has been running for well over a month now, with only one reboot when I added a cache drive, and hasn't missed a beat.

Link to comment

Unfortunately for you but fortunately for all the rest of us, the problem is NOT the OS, it's YOUR hardware.

 

You're frustrated because of your hardware issues, but storming off throwing a tantrum will not help you. You haven't even gone beyond 2 days of troubleshooting. You must be willing to try to work through your issues to solve the problem. Moving to a different OS or software setup will NOT fix your issues. Not asking the community for pointers and listening to what they have to say will NOT fix your issues.

 

You also have others who have been using this OS for well over 2 years without a single incident of data loss and multiple incidents of data protection.

 

Link to comment
I've come to the conclusion that unraid is entirely too unstable an OS for me to trust my data to.

 

unRAID is not an OS, as such.  It is a system built on a Slackware Linux distribution.  Slackware, itself, uses a standard Linux kernel.

 

I will be searching for another product.

 

I wish you well in your search - you'll be looking for a product that isn't built on a Linux platform then (especially not Slackware).

Link to comment
  • 1 month later...

Those can certainly be a pain in the ass to troubleshoot!

 

Thanks for letting us know what the issue turned out to be. It's always helpful to close the loop on weird stuff like this to help the next person who may have similar issues!! :)

 

Hopefully this means you've changed your mind about UnRaid and come back to the community!

 

Personally, I can't imagine my network without it at this point.... I'd have to *shudder* go back to external drives and..... *gulp* dvd-roms!!! Backing up critical data on plastic and dye?? No redundancy? No parity???

 

What was I thinking??!!! ;)

Link to comment


Yep, I'm back to unraid now..

 

My next challenge is to try and get CrashPlan running on my unRAID box.

Link to comment
