
Old box has trouble booting/checking parity


rockytt


Running 5.0.5 with 10+1(parity) disks -

Have had the beast for quite a while now, and periodically it locks up - no response to anything/no access from the network. The only way forward has been an unclean shutdown (no other option) -

 

The last time I had to restart (a week ago), it wouldn't get through the parity check (sometimes 5%, sometimes 75%) before freezing again. Currently, I can't even start it in maintenance mode w/o it locking up.

 

I'm hopeful that something in the last log file I could grab will be helpful and someone could point to something I might be able to check.

 

syslog.txt is the last one I was able to grab from the web interface, while syslog2.txt was pulled from the flash drive after the last unsuccessful attempt at starting the array.

 

Thanks!!

syslog.txt

syslog2.txt

Link to comment
  • 2 weeks later...

Rocky,

 

The Log button appears in the upper right of the unRAID webGUI. This page appears in a web browser on your PC, Mac, or other client computer. Click the Log button immediately after starting unRAID and the log will be updated in the browser window. Keep this window open until the server crashes; it will remain open after the crash. Your browser may allow you to save the contents of the window to a file, or you can copy and paste the entire contents into a text file. Attach the text file to a post.
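Once the log is saved as a text file, a quick scan can help pick out lines worth a closer look before posting. This is only a rough sketch: the keyword list and the function name `suspicious_lines` are my own assumptions for illustration, not anything unRAID provides, and a line matching a keyword is not necessarily the cause of a crash.

```python
import re

# Hypothetical keyword list, chosen for illustration only; adjust to taste.
SUSPECT = re.compile(r"error|fail|timeout|call trace|oops", re.IGNORECASE)


def suspicious_lines(lines, pattern=SUSPECT):
    """Return (line_number, text) pairs for every line matching the pattern."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if pattern.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_file(path):
    """Scan a saved log file; undecodable bytes are replaced, not fatal."""
    with open(path, errors="replace") as handle:
        return suspicious_lines(handle)
```

Calling `scan_file("syslog.txt")` on the saved file returns the matching lines with their line numbers, which makes it easier to quote the relevant part of a long log in a post.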

Link to comment

One thing to do is to open up the box, clean the dust and dirt out.  Pay particular attention to the CPU fan, CPU cooling fins and the power supply.  (You might want to do this outside of the house as you will probably be amazed at the amount of crap in there!)

 

Make sure that all of the fans are running.  Almost all power supplies and CPUs have thermal protection and will shut down if they get too hot.  Occasionally, the thermal paste between the CPU and the heatsink will dry out.  The only way to tell is to remove the heatsink from the CPU.  Of course, this would then require cleaning the old paste off both parts and applying a fresh coating.

 

While you have the case open, check the capacitors on the motherboard and see if any of them have swollen tops.  (A few years back there was a manufacturing problem with capacitors that caused them to fail prematurely.)

Link to comment

OK - two issues now...

1) Uploaded a new syslog - this was the very latest one I could grab which was about a second before the box froze. Currently it's freezing up within a second or two of a parity check.

 

2) On the off chance that a version update might help, I attempted a (most likely ill-advised) change from 5.0.5 to 5.0.6. Did it as a fresh install - reformatted the flash drive/copied the files/brought over my saved "config" folder/made the drive bootable. During the boot process I get a whole string of "edd: error 0100 reading sector xxxxxxxxx" lines before the machine boots. (I have run the "scan/fix" on the drive and it always returns 0 errors - in spite of this, does it sound like the flash drive has died?)

 

All the help is much appreciated-

syslog.txt

Link to comment

much craziness-

different usb stick and still the same errors...(even formatted on a different pc to make sure my laptop wasn't acting up)

 

unraid still boots btw, and if I cancel the parity check immediately after starting the array it works "fine" (can read all files/no red balls/etc) - although it crashes when I try and check parity later-

 

Demon-possession perhaps?

 

(I have attached the latest syslog - this after starting the array, but cancelling the parity check. Couldn't grab a later one after attempting a parity check - so this is as much as we're going to get)

syslog.txt

Link to comment

nope - running one now though (and I did replace the memory a couple months ago on the off-chance it was causing issues)

 

Still really strange about those errors during booting though

 

I'll report back tonight/tomorrow after the memtest - can't hurt to run it...

Link to comment

Yup - I've moved the usb stick around and swapped memory sticks (actual sticks as well as slots) to try and get rid of this "hanging" problem that has really been going on for a few months now. (The errors during boot are new)

The problem seems to resolve itself and I think "aha!" and then it crops up again.

 

In an attempt to trace the problem recently, it appeared that there were problems writing to one of the drives. I was able to replace it and successfully restore the data. ("Aha!")

Unfortunately, when I went to check parity the array froze again :( (which brings us to today)

Link to comment

Yeah - I figured as much - I think the core of this system is over 8 years old now (and some of it was certainly not cutting edge when I put it together!) so I guess it should be expected.

Probably be a couple of weeks before I can rebuild it - but one last favor would be for someone to take a look at the last bit of this latest syslog I just uploaded. This time I started the array w/o mounting the disks and the parity check hasn't yet frozen (knock on wood!) - anyone see anything suspicious in there?

 

Thanks again for all the help, and I'll probably be posting again (new thread) when I get some new bits.

syslog.txt

Link to comment

Archived

This topic is now archived and is closed to further replies.
