Parity disk red balled


Looks like my parity disk is disabled:

 

[Attached screenshot: Untitled.jpg]

 

Not sure what is going on.  Under health, the parity disk shows it failing a SMART command.

 

Where do I start?  This is totally new to me.

 

Thanks.

You may have an improperly seated drive or cable ... or perhaps a bad SATA cable.

 

Try reseating the cables (ideally replacing the SATA cable) ... then Start the array; Stop it; unassign the parity drive; Start the Array; Stop the Array; re-assign the same drive to parity; and Start the array.

 

That should force parity to be rebuilt on the same drive.

 

If that doesn't resolve it, you need to replace the drive.

 

  • Author

Interesting.  I replaced the cable and the system told me "new parity disk found" (my old one) and now a parity check is in progress.

 

Thanks for the help.

 

Hopefully that fixes things and it's not a failing disk.  The syslog showed nothing related to disk errors.

 

Assuming it was just a bad cable (the system has been up for 62 days with this parity drive) -- it's interesting how a cable can just go bad.

 

Thanks again.

Have you moved your system at all?

 

Was the original cable a locking cable?  [Hopefully the new one is]

 

Your cable may not have been bad -- just not seated completely.  Reseating it may have been all that was necessary; but if I suspect any cable issues I always just replace the cable with a nice new locking cable.

 

  • Author

System didn't move, but it's a floor-sitting server, and the kids could have easily bumped it with the vacuum cleaner etc.

 

The old cable was locking -- the new cable is too.  I've got a drive cage (iStarUSA) but I believe the drive was seated fine in it.  Frankly, when I removed the old cable, both ends were locked up tight.  But I had an extra NIB cable laying around so what the heck.

 

What's the SMART output? Do you have a SMART history report available? If not, are you able to telnet into your unRAID server?

 

If you're able to telnet into your unraid server run this command and post the output ...

 

 

smartctl -A /dev/hdc

 

This will give more information on which SMART attributes failed. However, if SMART has failed, your drive is either toast, or will shortly be toast - aka don't trust it. At all. While there is a small chance the cable for data or the cable for power has failed, there is a greater chance the drive has failed.

 

While this is bad - this is why we have parity calculating arrays. It supports a single drive failure.

 

Right now, if I was in your shoes I'd buy a new drive, and pray nothing else fails while you wait for shipping and a parity rebuild.

It's quite possible the smart output will lead us to a bad power/data cable rendering my fear mongering invalid ...

 

However, it's always a good idea to keep a spare drive (as large if not larger than your parity drive) for just such issues.

 

 


  • Author

I have no idea if this will mean anything, but here's what I get from the Health -> Disk Attributes tab:

 

Attached to port: sdc

ID# ATTRIBUTE NAME FLAG VALUE WORST THRESH TYPE UPDATED FAILED RAW VALUE

1 Raw Read Error Rate 0x000f 117 099 006 Pre-fail Always Never 155874392

3 Spin Up Time 0x0003 092 091 000 Pre-fail Always Never 0

4 Start Stop Count 0x0032 100 100 020 Old age Always Never 203

5 Reallocated Sector Ct 0x0033 100 100 010 Pre-fail Always Never 0

7 Seek Error Rate 0x000f 062 060 030 Pre-fail Always Never 1743018

9 Power On Hours 0x0032 099 099 000 Old age Always Never 1575

10 Spin Retry Count 0x0013 100 100 097 Pre-fail Always Never 0

12 Power Cycle Count 0x0032 100 100 020 Old age Always Never 10

183 Runtime Bad Block 0x0032 100 100 000 Old age Always Never 0

184 End-to-End Error 0x0032 100 100 099 Old age Always Never 0

187 Reported Uncorrect 0x0032 100 100 000 Old age Always Never 0

188 Command Timeout 0x0032 100 100 000 Old age Always Never 0

189 High Fly Writes 0x003a 098 098 000 Old age Always Never 2

190 Airflow Temperature Cel 0x0022 067 057 045 Old age Always Never 33 (Min/Max 33/41)

191 G-Sense Error Rate 0x0032 100 100 000 Old age Always Never 0

192 Power-Off Retract Count 0x0032 100 100 000 Old age Always Never 6

193 Load Cycle Count 0x0032 100 100 000 Old age Always Never 834

194 Temperature Celsius 0x0022 033 043 000 Old age Always Never 33 (0 25 0 0)

197 Current Pending Sector 0x0012 100 100 000 Old age Always Never 0

198 Offline Uncorrectable 0x0010 100 100 000 Old age Offline Never 0

199 UDMA CRC Error Count 0x003e 200 200 000 Old age Always Never 0

240 Head Flying Hours 0x0000 100 253 000 Old age Offline Never 104728482546885

241 Total LBAs Written 0x0000 100 253 000 Old age Offline Never 21661333656

242 Total LBAs Read 0x0000 100 253 000 Old age Offline Never 53856634739

 

 

The command you listed to run resulted in "no such device" being returned. 

 

Thanks for the followup replies.  I may end up with a new disk anyway just to have a spare on hand, this morning's sheer panic feeling tells me I need a spare handy.

 

 

Hang on, I'll run the short test now.

 

 

ETA the short test has been at 90% for about 15 minutes.... is that normal?

From what I see (and really, take internet advice with a grain of salt) ...

 

The current pending sector is 0.

The Reallocated sector count is 0.

 

So, it looks like your drive decided the sectors it had previously thought were iffy weren't, and said 'fuck it, good enough for me'.

 

the one bit I don't know about is: 7  Seek Error Rate  0x000f  062  060  030  Pre-fail  Always  Never  1743018

 

Otherwise, from what you've posted, your HD has been powered up for about 65 days (1575 hours), has had zero reallocated sectors, and has a current pending sector count of 0.

 

Anyone else care to chime in?

 

If I'm reading these correctly, I'd run it as is, with a backup drive available in case of failure.

I would say that the drive is fine.

 

My understanding is that "Raw Read Error Rate" has to do with the drive's own internal error correction.  This apparently happens all the time and is normal given the high data densities of modern hard drives.  Drives from different manufacturers will also report this value differently. I have read that Seagates tend to report high values for this but other drives will not.

 

The "Seek Error Rate" means that the drive is over- or under-shooting the correct track when it moves the heads, and it has to do another (small) re-seek to acquire the track before it can read or write the data. A problem with the drive in this area is more of a performance concern rather than a concern with data integrity.

 

The more important thing to realize with either of these attributes is that the "Raw Value" they report is actually a rate and not an absolute count of actual errors.  The best way to gauge what's going on is to look at the other columns for those attributes.  The "VALUE" (not RAW VALUE) column can be considered like a score where 100 (or above) would be considered really good.  The "WORST" column states what has been the worst recorded score for the drive.  The "THRESH" column states at what score the drive would be considered to be failing for that attribute.  So for this particular drive I would say the "Raw Read Error Rate" has a fantastic score while the "Seek Error Rate" is just alright and still in the OK range.
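
To make the VALUE / WORST / THRESH comparison concrete, here is a rough Python sketch that reads rows in the layout `smartctl -A` prints and reports how far each normalized VALUE sits above its THRESH. The helper names (`parse_attributes`, `margin`) are made up for illustration, and the two sample rows are copied from the smartctl output posted later in this thread. Treat it as a reading aid, not a health verdict.

```python
import re

# One attribute row of `smartctl -A`: ID, NAME, FLAG, VALUE, WORST, THRESH, ...
ROW = re.compile(r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+(\d+)\s+(\d+)\s+(\d+)\s")

def parse_attributes(text):
    """Return {name: (value, worst, thresh)} for every attribute row found."""
    attrs = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            _attr_id, name, value, worst, thresh = m.groups()
            attrs[name] = (int(value), int(worst), int(thresh))
    return attrs

def margin(attrs, name):
    """Points between the current normalized VALUE and the failure THRESH."""
    value, _worst, thresh = attrs[name]
    return value - thresh

# Two rows taken from the output posted in this thread.
sample = """\
  1 Raw_Read_Error_Rate    0x000f  117  099  006    Pre-fail  Always      -      155874392
  7 Seek_Error_Rate        0x000f  062  060  030    Pre-fail  Always      -      1813923
"""

attrs = parse_attributes(sample)
for name in attrs:
    print(name, "is", margin(attrs, name), "points above its threshold")
```

With these rows, Raw_Read_Error_Rate comes out 111 points above its threshold and Seek_Error_Rate 32 points above, which matches the "fantastic" versus "just alright" reading above.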

 

Ultimately, as far as data integrity goes, you want to pay more attention to the "Reallocated Sector Ct" and "Current Pending Sector" values, as they indicate failures to read the data from the disk itself.  Those are solid indicators of the health and reliability of the drive.
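
For the two sector counters, the raw value is the thing to watch. Below is a small Python sketch that flags either one when its RAW_VALUE is above zero. It assumes the ten-column layout that `smartctl -A` prints; the function name `sector_problems` is hypothetical, and the "bad" row at the end is invented purely to show what a hit would look like.

```python
# Attribute IDs for the two counters discussed above.
SECTOR_IDS = {5, 197}

def sector_problems(text):
    """Return {attribute_name: raw_value} for watched attributes with raw > 0."""
    problems = {}
    for line in text.splitlines():
        fields = line.split()
        # Columns: ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) >= 10 and fields[0].isdigit() and int(fields[0]) in SECTOR_IDS:
            raw = int(fields[9])
            if raw > 0:
                problems[fields[1]] = raw
    return problems

# Rows as posted in this thread: both counters are zero.
posted = """\
  5 Reallocated_Sector_Ct  0x0033  100  100  010    Pre-fail  Always      -      0
197 Current_Pending_Sector  0x0012  100  100  000    Old_age  Always      -      0
"""

# Invented row, NOT from this drive, to show what a warning would look like.
made_up = "197 Current_Pending_Sector  0x0012  100  100  000    Old_age  Always      -      8\n"

print(sector_problems(posted))
print(sector_problems(made_up))
```

An empty result for the posted rows is consistent with the advice in this thread that the drive itself looks healthy.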

 

 

  • Author

Thanks to all for the replies.  I finally got the command to work:

 

 

root@ffs1:~# /usr/sbin/smartctl -A /dev/sdc

smartctl 5.40 2010-10-16 r3189 [i486-slackware-linux-gnu] (local build)

Copyright © 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

 

=== START OF READ SMART DATA SECTION ===

SMART Attributes Data Structure revision number: 10

Vendor Specific SMART Attributes with Thresholds:

ID# ATTRIBUTE_NAME          FLAG    VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE

  1 Raw_Read_Error_Rate    0x000f  117  099  006    Pre-fail  Always      -      155874392

  3 Spin_Up_Time            0x0003  092  091  000    Pre-fail  Always      -      0

  4 Start_Stop_Count        0x0032  100  100  020    Old_age  Always      -      203

  5 Reallocated_Sector_Ct  0x0033  100  100  010    Pre-fail  Always      -      0

  7 Seek_Error_Rate        0x000f  062  060  030    Pre-fail  Always      -      1813923

  9 Power_On_Hours          0x0032  099  099  000    Old_age  Always      -      1579

10 Spin_Retry_Count        0x0013  100  100  097    Pre-fail  Always      -      0

12 Power_Cycle_Count      0x0032  100  100  020    Old_age  Always      -      10

183 Runtime_Bad_Block      0x0032  100  100  000    Old_age  Always      -      0

184 End-to-End_Error        0x0032  100  100  099    Old_age  Always      -      0

187 Reported_Uncorrect      0x0032  100  100  000    Old_age  Always      -      0

188 Command_Timeout        0x0032  100  100  000    Old_age  Always      -      0

189 High_Fly_Writes        0x003a  098  098  000    Old_age  Always      -      2

190 Airflow_Temperature_Cel 0x0022  068  057  045    Old_age  Always      -      32 (Min/Max 32/41)

191 G-Sense_Error_Rate      0x0032  100  100  000    Old_age  Always      -      0

192 Power-Off_Retract_Count 0x0032  100  100  000    Old_age  Always      -      6

193 Load_Cycle_Count        0x0032  100  100  000    Old_age  Always      -      834

194 Temperature_Celsius    0x0022  032  043  000    Old_age  Always      -      32 (0 25 0 0)

197 Current_Pending_Sector  0x0012  100  100  000    Old_age  Always      -      0

198 Offline_Uncorrectable  0x0010  100  100  000    Old_age  Offline      -      0

199 UDMA_CRC_Error_Count    0x003e  200  200  000    Old_age  Always      -      0

240 Head_Flying_Hours      0x0000  100  253  000    Old_age  Offline      -      198539158226121

241 Total_LBAs_Written      0x0000  100  253  000    Old_age  Offline      -      23794604480

242 Total_LBAs_Read        0x0000  100  253  000    Old_age  Offline      -      53856634739

 

 

Interesting to see the "head flying hours" show 22.66 BILLION years.  Ha!
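
A possible explanation for that huge number, offered as an assumption rather than something confirmed in this thread: on Seagate drives the Head_Flying_Hours raw value appears to be a packed 48-bit field, and as far as I know newer smartmontools releases treat the low 32 bits as the hour count. Under that assumption, a short Python sketch decodes the two raw values posted above:

```python
def low32_hours(raw):
    """Keep only the low 32 bits, assumed here to hold the hour count."""
    return raw & 0xFFFFFFFF

# (Power_On_Hours, Head_Flying_Hours raw) pairs from the two readings in this thread.
readings = [
    (1575, 104728482546885),
    (1579, 198539158226121),
]

for power_on, raw in readings:
    print("power-on hours:", power_on, "decoded head flying hours:", low32_hours(raw))
```

This gives 1221 and 1225 hours, which sits plausibly below the power-on hours and advances by the same 4 hours between the two readings.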

 

The self test is still running, but the parity check is almost done, so maybe that's why it's going so slow.  The "short test" has now been running for hours and hours.  If it's not finished in the morning I'm not sure what I'm going to do.  Probably reboot the server and cross my fingers.

 

I don't have twins, and my kids are older.  The server sits next to the entertainment center, across from the dog's bed.  So any number of things could have bumped the server, although it is unlikely as the kids know better and the dog is lazy and sleeps a lot.

 

Still not exactly sure what happened initially, but assuming parity completes and all looks good, I'll likely watch it for a few days and see what happens.  Still will probably order another drive to use as a spare though. 

The SMART data all looks good.    Different manufacturers report some parameters slightly differently than others => I'd be concerned, for example, with the Seek Error Rate value (62) on a SMART report for a WD drive;  but values in the 50's & 60's aren't at all unusual for Seagate drives -- so you're fine.

 

 

  • Author

Cool.  So probably a bad/unseated cable after all?


 

Yes, that seems to be a good assumption.

 

  • Author

Great.  Many thanks to all for the help and advice!!!

Archived

This topic is now archived and is closed to further replies.
