Last night before I went to bed I started a parity check on my server (RC5). This morning I found what I believe is a drive failure. I've read a bit about the red dot meaning the drive failed a write request, but I wanted to make sure by asking here.

 

It currently states there is a parity rebuild in progress. Once that has finished, can I swap the drive out without losing data? How do I find out where unRAID has placed the data that was on the failed disk?

 

I'll try to get a SMART report on the disk before I take it out, as suggested in other threads.

 

Kind regards

Untitled.jpg.4bad26048b2aaca93e6c234fea471577.jpg

Link to comment

Check out the speed and estimated finish and the number of writes. I don't think Dan201 will have the patience to wait almost 82 days.

I missed that...  :-[

 

I did see it was almost half way done.  (the number of "writes" did not bother me)

 

Clearly something is unusual. He needs to examine the syslog to see what is happening.

 

Joe L.

Link to comment

When I type in 'smartctl  -a  -d  ata  /dev/sda' I get the following message;

 

Failed: No such device.

 

C is definitely the failed drive.

 

Remove the "-d ata" option (doesn't work with some controllers), 'smartctl -a /dev/sda' should be enough.

 

What we really want is a report from sdc (the disk having issues, by the looks of things), but it wouldn't hurt to check all the drives.
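If you want to grab reports from every drive in one go, something like this rough Python sketch could do it. It only wraps the same 'smartctl -a /dev/sdX' command given above; the function names are made up for illustration, and the device list in the usage comment is an assumption (only sda and sdc are mentioned in this thread).

```python
import subprocess


def smartctl_command(device):
    """Build the argument list for 'smartctl -a <device>' (no '-d ata')."""
    return ["smartctl", "-a", device]


def collect_reports(devices):
    """Run smartctl for each device and return {device: report text}.

    Assumes smartctl is installed and on the PATH; a missing device
    simply yields whatever error text smartctl prints.
    """
    reports = {}
    for device in devices:
        result = subprocess.run(
            smartctl_command(device), capture_output=True, text=True
        )
        reports[device] = result.stdout
    return reports


# Hypothetical usage (device names are assumptions, adjust to your system):
# reports = collect_reports(["/dev/sda", "/dev/sdb", "/dev/sdc"])
# print(reports["/dev/sdc"])
```

Start with sdc, since that is the one showing problems.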

Link to comment

I tried to stop the array but the server crashed and I had to manually power it down with the button.

 

The good news is I got a smart file for the drive in question. Hurray!

 

I also double checked all the cables and everything seemed fine but I unplugged and reconnected everything just in case.

 

Does the smart report say anything interesting? The server has automatically gone into a parity check when powered on.

smart.txt

Link to comment

You mistyped the command. (Did you even look at the SMART report?)

 

smartctl -a

not

smartctl a-

 

Joe L.

Link to comment

Yes, I did read it, as I said in an above post. But I didn't realise I had typed the command incorrectly. It just said I had used 2 device names.

 

Here is what I believe to be a successful report.

The only oddity of that SMART report is the very high temperature of the drive.

194 Temperature_Celsius    0x0022  102  101  000    Old_age  Always      -      50

 

Most of us try to keep drives below 40C. (low to mid 30s as a goal)

Some drives will fail if they get above 50C.

 

You need to work on the array cooling.
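For anyone who wants to check this automatically, here is a minimal Python sketch that pulls the raw value out of an attribute line like the one above and compares it with the 40C and 50C marks mentioned in this post. The function names and the "ok" / "warm" / "hot" labels are my own invention, and treating exactly 50C as "hot" is an assumption.

```python
def parse_smart_attribute(line):
    """Split one row of the smartctl attribute table into named fields."""
    fields = line.split()
    return {
        "id": int(fields[0]),
        "name": fields[1],
        "flag": fields[2],
        "value": int(fields[3]),
        "worst": int(fields[4]),
        "thresh": int(fields[5]),
        "type": fields[6],
        "updated": fields[7],
        "when_failed": fields[8],
        "raw": int(fields[9]),  # for attribute 194 this is degrees Celsius
    }


def classify_temperature(celsius, target=40, danger=50):
    """Label a drive temperature against the thresholds from this thread."""
    if celsius >= danger:  # assumption: 50C itself already counts as hot
        return "hot"
    if celsius >= target:
        return "warm"
    return "ok"


line = "194 Temperature_Celsius    0x0022  102  101  000    Old_age  Always      -      50"
attr = parse_smart_attribute(line)
print(attr["name"], attr["raw"], classify_temperature(attr["raw"]))
```

The raw value is the last column of the row; the normalised VALUE and WORST columns (102 and 101 here) are not degrees.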

Link to comment

OK, I'll try to bring the temps down with some new fans. Could the drive have somehow dropped out of the array if it overheated?

 

It seems to be working now. The drive that had the red dot doesn't have any data on it, so I would have thought it would be one of the cooler ones, as it's not often spun up.

Link to comment

Looking closer, there is this line:

192 Power-Off_Retract_Count 0x0032  200  200  000    Old_age  Always      -      20

 

That attribute usually indicates the drive had to retract the disk heads in an unexpected power down. On your disk that has happened 20 times so far (but the drive has only been power cycled 51 times).

 

Unless you know something about the prior use of this drive where power was just shut off rather than the OS cleanly shut down, you might inspect the power connections to the drive. 

(most disks fail to operate properly after power is removed  ;D)
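To put those two numbers in proportion, a quick Python sketch of the arithmetic (the function name is just for illustration; the counts are the ones quoted above):

```python
def retract_ratio(power_off_retract_count, power_cycle_count):
    """Fraction of power cycles that ended in an emergency head retract."""
    if power_cycle_count <= 0:
        raise ValueError("power_cycle_count must be positive")
    return power_off_retract_count / power_cycle_count


# 20 retracts out of 51 power cycles, as read from the SMART report
ratio = retract_ratio(20, 51)
print(f"{ratio:.0%} of power cycles ended in an unexpected power-off")
```

That works out to roughly 39% of the power cycles, which is why the power connections are worth a look.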

 

Joe L.

Link to comment

OK, I'll try to bring the temps down with some new fans.

Most cases seem engineered to circulate air through the whole case for the CPU and video card, and leave the drives in somewhat stagnant air. Many times you would do well to make sure all the intake air for the case is forced over the drives, to cool them first. That normally means taping up case holes that don't have fans, and possibly reversing fans to exhaust instead of intake air. Sometimes you need to create ductwork with pieces of cardboard to seal leaks that allow air to go around drives without cooling them. If you don't have an engineering mind, feel free to take pics of an overview of the inside of your case and post them here. Someone will probably make suggestions to help you with the airflow.

Link to comment
