guitarlp

Bad disk? Smartctl attached


I added a new cache disk today. It's actually an old disk of mine... but before I decided to start using it I ran Smartctl and this is what it output:

 

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   200   001   051    Pre-fail  Always   In_the_past 0
  3 Spin_Up_Time            0x0007   132   104   021    Pre-fail  Always       -       3900
  4 Start_Stop_Count        0x0032   099   099   040    Old_age   Always       -       1343
  5 Reallocated_Sector_Ct   0x0033   163   163   140    Pre-fail  Always       -       581
  7 Seek_Error_Rate         0x000b   200   200   051    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   082   082   000    Old_age   Always       -       13412
10 Spin_Retry_Count        0x0013   100   100   051    Pre-fail  Always       -       0
11 Calibration_Retry_Count 0x0013   100   100   051    Pre-fail  Always       -       0
12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -       1297
194 Temperature_Celsius     0x0022   117   253   000    Old_age   Always       -       33
196 Reallocated_Event_Count 0x0032   185   185   000    Old_age   Always       -       15
197 Current_Pending_Sector  0x0012   198   175   000    Old_age   Always       -       102
198 Offline_Uncorrectable   0x0012   174   167   000    Old_age   Always       -       1084
199 UDMA_CRC_Error_Count    0x000a   200   253   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0009   167   155   051    Pre-fail  Offline      -       1133

 

I'm pretty sure that's a bad disk, right? Multi_Zone_Error_Rate is 1133, Offline_Uncorrectable is 1084, and Current_Pending_Sector is 102. From what I understand, none of those are good to have.
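For anyone who wants to pull these counters out of a long report, here is a minimal sketch that filters a smartctl attribute table for the reallocated, pending, and uncorrectable sector counts and prints any whose RAW_VALUE is nonzero. The function name `flag_bad_sectors` is made up for illustration, and treating any nonzero count as a warning is a rule of thumb, not a threshold defined by smartctl.

```shell
# Hypothetical helper: reads "smartctl -A" style output on stdin and prints
# attributes 5, 197 and 198 when their raw value (10th column) is nonzero.
flag_bad_sectors() {
    awk '($1 == 5 || $1 == 197 || $1 == 198) && $10 + 0 > 0 {
        print $2 "=" $10
    }'
}

# Usage (device name is an example only):
#   smartctl -A /dev/sda | flag_bad_sectors
```

Run against the table above, this would list all three counters, which matches the suspicion that the disk is in poor shape.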


I would not trust my data on this drive.

 

You can possibly get the drive to work for a while by dd'ing zeros to the whole drive.

If the pending sector count goes down to 0, you may be OK for a while.

 

After writing zeros to the whole drive, I would run a long self-test (smartctl -t long) on the drive to be sure.
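The sequence described above can be sketched roughly as follows. This is an illustration, not a tested procedure: the function names and the DRY_RUN switch are my own, the device path is a placeholder, and writing zeros DESTROYS everything on the target disk. By default the sketch only prints the commands it would run.

```shell
# Hypothetical wrapper: prints the command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# WARNING: with DRY_RUN=0 this overwrites the entire device given as $1.
zero_and_test() {
    dev="$1"
    run dd if=/dev/zero of="$dev" bs=1M   # write zeros over every sector
    run smartctl -t long "$dev"           # start the extended self-test
    run smartctl -A "$dev"                # re-read the attribute table
}

# Usage: zero_and_test /dev/sdX           (prints the three commands)
```

Note that dd normally stops with a "No space left on device" message when it reaches the end of the disk, and that the long self-test runs in the background, so the attribute table and `smartctl -l selftest` are worth checking again once it has finished.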

 


I figured it was no good. It's an old disk that I haven't used in over a year or two but I thought I may be able to get some use out of it. Oh well... Looks like I'll pick up a cheap drive from Fry's so I can start using the cache feature.

 

It's not even worth trying to see if I can make it work. I would hate to copy a bunch of stuff to unRAID only to find out the next morning that the disk had failed and everything I thought I had copied was lost.

 

Better to be safe than sorry though :)


I'm running 4.3.beta6 Pro.

 

I have anomalous temperature readings on two of my disks and didn't really think too much of it, since I've read in other threads that it's not that big of a deal.

 

It was nagging at me, however, and after reading this thread I ran smartctl on all of the disks in my array.  Being a Linux noob, I'm not sure how to interpret the results, but I see some things that concern me.  Could someone examine these attachments and help me to identify any problems?

 

 


The rest of the attachments:

 

The parity and data1 disks are reporting temperatures of 65 Celsius, whereas the rest of the disks are at about 38 degrees.

 


I don't see anything that gives me big concerns in your SMART captures.  I (personally) do not like to run my drives too hot.  Even 40C is too hot IMO.  Consider adding some active fans blowing air between your drives.  If those 65C temperatures are to be believed, that would be way too hot.  Not sure what attribute 190 is, but smartctl is definitely not too happy about it.  My guess is that it is not a big issue.

 

I wonder if there is a newer version of smartctl that understands some of the newer high capacity drives and can actually identify this attribute?


Run a long self-test (smartctl -t long) on the drives in question.

65C is way too hot. I'm not sure if it is accurate; I know some of my drives will shut down at 55C.

I know there is a later version of smartctl, 5.38. Here's the output on one of my drives.

 

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   115   099   006    Pre-fail  Always       -       84590200
  3 Spin_Up_Time            0x0003   092   087   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       77
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   066   060   030    Pre-fail  Always       -       3680587
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       798
10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       2
12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       32
184 Unknown_Attribute       0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Unknown_Attribute       0x0032   100   098   000    Old_age   Always       -       131079
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   065   058   045    Old_age   Always       -       35 (Lifetime Min/Max 21/38)
194 Temperature_Celsius     0x0022   035   042   000    Old_age   Always       -       35 (0 19 0 0)
195 Hardware_ECC_Recovered  0x001a   040   033   000    Old_age   Always       -       84590200
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
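
One thing worth noticing in the 5.38 output above: attribute 190 is named Airflow_Temperature_Cel and shows a normalized VALUE of 065 next to a raw reading of 35, while attribute 194 also reports 35. That pattern is consistent with the normalized value being 100 minus the temperature, in which case a "65" read from the VALUE column of attribute 190 would really be about 35C. A small sketch of that conversion, assuming this convention holds for the drive in question (the function name is illustrative):

```shell
# Assumption: for attribute 190, VALUE = 100 - temperature in Celsius.
airflow_temp_from_value() {
    # Strip leading zeros so a value like 065 is not treated as octal.
    v=$(echo "$1" | sed 's/^0*//')
    echo $(( 100 - ${v:-0} ))
}

# Usage: airflow_temp_from_value 065
```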

 


What about this smartctl test on a new drive I just added?

 

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   105   100   006    Pre-fail  Always       -       9551287
  3 Spin_Up_Time            0x0003   096   096   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       5
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   100   253   030    Pre-fail  Always       -       13083
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       0
10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       5
187 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
189 Unknown_Attribute       0x003a   100   100   000    Old_age   Always       -       0
190 Unknown_Attribute       0x0022   064   064   045    Old_age   Always       -       606076964
194 Temperature_Celsius     0x0022   036   040   000    Old_age   Always       -       36 (Lifetime Min/Max 0/24)
195 Hardware_ECC_Recovered  0x001a   077   060   000    Old_age   Always       -       10801851
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
202 TA_Increase_Count       0x0032   100   253   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

 

There are no bad sectors... but I wonder why the raw values for "Raw_Read_Error_Rate," "Seek_Error_Rate," and the "Unknown_Attribute" entries are so high.

 

I added this disk as a cache disk. Once added, unRAID formatted the drive, but it didn't do its normal clearing like it does with my data drives. Is this normal, or should I worry about the errors?

 

Edit:

 

I copied an 866 MB file from the cache disk to my PC about 5 times. Afterwards I captured the smartctl log again. Here's the output:

 

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   108   100   006    Pre-fail  Always       -       19086480
  3 Spin_Up_Time            0x0003   096   096   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       5
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   100   253   030    Pre-fail  Always       -       19925
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       0
10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       5
187 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
189 Unknown_Attribute       0x003a   100   100   000    Old_age   Always       -       0
190 Unknown_Attribute       0x0022   063   063   045    Old_age   Always       -       622854181
194 Temperature_Celsius     0x0022   037   040   000    Old_age   Always       -       37 (Lifetime Min/Max 0/24)
195 Hardware_ECC_Recovered  0x001a   064   060   000    Old_age   Always       -       21051929
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
202 TA_Increase_Count       0x0032   100   253   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

 

Looks like all the values I mentioned before went up considerably.

 

Bad disk?

 

Edit 2:

 

I realized that copying an 866 MB file from unRAID to my PC wasn't reading the data from the drive... it was reading the data from memory, since the file was so small.

 

I copied an 8 GB movie to the cache drive and compared it to the file on my PC. It passed with no errors for 5 tests.
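
A comparison like this can be scripted with `cmp`, which checks two files byte for byte. The sketch below is only an illustration; the function name and the example paths are assumptions, and as noted above the file should be large enough that the read-back actually comes from the disk and not from memory.

```shell
# Hypothetical helper: byte-for-byte comparison of an original and its copy.
verify_copy() {
    if cmp -s "$1" "$2"; then
        echo "match"
    else
        echo "DIFFER"
        return 1
    fi
}

# Usage (paths are examples only):
#   verify_copy /mnt/cache/movie.mkv /mnt/pc/movie.mkv
```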

 

Right now I'm running smartctl -tlong so hopefully when that's done I'll have some more information.

 

Edit 3:

 

Hmmm... found this link from another thread:

http://en.wikipedia.org/wiki/S.M.A.R.T.#Known_S.M.A.R.T._attributes

 

it states:

 

"Do note that Seagate drives often report a raw value, that does not mean it is in failure and show high value even as a new drive."

 

So maybe these high values are ok because the drive is a 500 GB Seagate drive.

 

Thoughts?

 


After -tlong test:

 

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   109   100   006    Pre-fail  Always       -       47216747
  3 Spin_Up_Time            0x0003   095   095   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       6
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   100   253   030    Pre-fail  Always       -       904164
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       4
10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       5
187 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
189 Unknown_Attribute       0x003a   100   100   000    Old_age   Always       -       0
190 Unknown_Attribute       0x0022   067   060   045    Old_age   Always       -       555810849
194 Temperature_Celsius     0x0022   033   040   000    Old_age   Always       -       33 (Lifetime Min/Max 0/24)
195 Hardware_ECC_Recovered  0x001a   062   060   000    Old_age   Always       -       132126835
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
202 TA_Increase_Count       0x0032   100   253   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

"Do note that Seagate drives often report a raw value, that does not mean it is in failure and show high value even as a new drive."

 

So maybe these high values are ok because the drive is a 500 GB Seagate drive.

 

Thoughts?

 

I've seen this before with brand new Seagate drives.

Best to run SeaTools just to be sure.


I don't think you have anything to worry about.  I have two of these Seagate 1 TB drives.  I've posted the SMART output below.

 

These RAW values are difficult to interpret without documentation.  For example, it could be that different bits in the value mean different things, and seeing it as a single number is totally meaningless.  I'm just not sure.  If the drive is testing as okay, I just wouldn't lose too much sleep.  If you're seeing lots of remaps and/or drive errors - or if the drive is telling you it is failing - THEN I'd be worried!
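
To illustrate the point about packed bits: some of the raw values in this thread look far less alarming when split into smaller fields. For instance, the low byte of Disk1's attribute 190 raw value (505020444) is 28, the same as that disk's Temperature_Celsius reading, and Disk1's attribute 188 raw value (4295032834) splits into the 16-bit words 1, 1 and 2. The sketch below does that splitting. The field layout is a guess based on these numbers, not vendor documentation, the function names are illustrative, and it assumes a shell with 64-bit arithmetic.

```shell
# Lowest 8 bits of a raw value.
raw_low_byte() {
    echo $(( $1 & 255 ))
}

# Split the low 48 bits of a raw value into three 16-bit words (high to low).
raw_words16() {
    echo "$(( ($1 >> 32) & 65535 )) $(( ($1 >> 16) & 65535 )) $(( $1 & 65535 ))"
}

# Usage:
#   raw_low_byte 505020444
#   raw_words16 4295032834
```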

 

WeeboTech - do you have a link to the newer version of smartctl?  Perhaps it would display different results if the drive is in the smartctl database.

 

Disk 1:

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   117   100   006    Pre-fail  Always       -       125573338
  3 Spin_Up_Time            0x0003   091   089   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       112
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       1
  7 Seek_Error_Rate         0x000f   052   051   030    Pre-fail  Always       -       47246454664
  9 Power_On_Hours          0x0032   098   098   000    Old_age   Always       -       1826
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       4
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       69
184 Unknown_Attribute       0x0032   100   100   099    Old_age   Always       -       0
187 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
188 Unknown_Attribute       0x0032   100   099   000    Old_age   Always       -       4295032834
189 Unknown_Attribute       0x003a   100   100   000    Old_age   Always       -       0
190 Unknown_Attribute       0x0022   072   061   045    Old_age   Always       -       505020444
194 Temperature_Celsius     0x0022   028   040   000    Old_age   Always       -       28 (Lifetime Min/Max 0/17)
195 Hardware_ECC_Recovered  0x001a   048   036   000    Old_age   Always       -       125573338
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0

 

Disk 2:

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   119   100   006    Pre-fail  Always       -       205589542
  3 Spin_Up_Time            0x0003   091   086   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       112
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       1
  7 Seek_Error_Rate         0x000f   066   060   030    Pre-fail  Always       -       4739075
  9 Power_On_Hours          0x0032   098   098   000    Old_age   Always       -       1821
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       2
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       64
184 Unknown_Attribute       0x0032   100   100   099    Old_age   Always       -       0
187 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
188 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
189 Unknown_Attribute       0x003a   100   100   000    Old_age   Always       -       0
190 Unknown_Attribute       0x0022   072   065   045    Old_age   Always       -       521797660
194 Temperature_Celsius     0x0022   028   040   000    Old_age   Always       -       28 (Lifetime Min/Max 0/17)
195 Hardware_ECC_Recovered  0x001a   050   039   000    Old_age   Always       -       205589542
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0

 


Here is version 5.38. It requires libstdc++.so.6.

 

root@Atlas:~# ldd /usr/sbin/smartctl
        linux-gate.so.1 =>  (0xb7fc2000)
        libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0xb7eda000)
        libm.so.6 => /lib/libm.so.6 (0xb7eb3000)
        libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0xb7ea8000)
        libc.so.6 => /lib/libc.so.6 (0xb7d66000)
        /lib/ld-linux.so.2 (0xb7fc3000)

root@Atlas:/usr/sbin# grep libstdc++.so.6 /var/log/packages/*
/var/log/packages/cxxlibs-6.0.8-i486-4:usr/lib/libstdc++.so.6.0.8

 

Which is in the cxxlibs-6.0.8-i486-4 package.

http://packages.slackware.it/package.php?q=12.0/cxxlibs-6.0.8-i486-4

http://packages.slackware.it/package.php?q=current/cxxlibs-6.0.9-i486-1

 

I'm using 6.0.8-i486-4 right now. There seems to be a later version, 6.0.9-i486-1, but I'm not sure how well that will work.
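
If the library is not installed, ldd marks it as "not found" instead of showing a path. A quick way to list only the missing libraries is sketched below; the function name is made up for illustration, and it simply filters whatever ldd prints.

```shell
# Hypothetical helper: reads ldd output on stdin and prints the names of
# shared libraries that could not be resolved.
missing_libs() {
    awk '/not found/ { print $1 }'
}

# Usage: ldd /usr/sbin/smartctl | missing_libs
```

An empty result means every dependency was resolved.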


Thanks for the replies. I ran SeaTools and the disk passed both the long and short tests.

 

I'll add the disk back to my unRAID server and start using it as my cache drive. Finally... write speeds better than 12.6 MB/sec :)



Many thanks to WeeboTech for posting this!!!

 

I finally got around to installing this today.  Since I know so little about Unix, even something simple like this took some time for me.  So I decided to post this to help anyone else that wants to use this version.  Note that smartctl 5.38 recognizes newer drives much better than 5.36.  I highly recommend upgrading to this version.

 

Here is what you need to do:

1 - Download this library http://packages.slackware.it/package.php?q=12.0/cxxlibs-6.0.8-i486-4 and put it on your flash disk, in a directory called "/custom/usr/share/packages".

2 - Rename the file to have a ".tgz" extension instead of a ".gz" extension.

3 - Go to a telnet (PuTTY) prompt and enter the command "installpkg /boot/custom/usr/share/packages/cxxlibs-6.0.8-i486-4.tgz"

4 - It should not give any errors.

5 - Download the updated smartctl program (HERE) and unzip the program inside.  Put it in the root of your flash disk.

6 - Run the command, for example, "/boot/smartctl -a -d ata /dev/sda" (substitute your own device name).

7 - (To be able to run it after a reboot.)  Edit the "go" script in the "config" directory of your flash disk, and add the "installpkg" command from step 3.
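
For step 7, a possible variation on adding the single "installpkg" line is to have the "go" script install every package it finds in the package directory. This is only a sketch under assumptions: the function name and the INSTALLPKG override are mine (the override exists so the loop can be tried with `echo` before touching the system), and the default directory is the one used in step 3.

```shell
# Hypothetical helper for the "go" script: install each .tgz package found
# in the package directory on the flash disk.
install_boot_packages() {
    dir="${1:-/boot/custom/usr/share/packages}"
    for pkg in "$dir"/*.tgz; do
        [ -f "$pkg" ] || continue           # nothing matched the pattern
        ${INSTALLPKG:-installpkg} "$pkg"    # installpkg unless overridden
    done
}

# Dry run that only prints the package paths:
#   INSTALLPKG=echo install_boot_packages
```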

 

Copying files from your windows computer to your flash is easy, just navigate to "//tower/flash" from Windows Explorer.  You can then just drag and drop or cut and paste.

 

Update:  Package directory changed to match the standard.

 


The suggested directories are

 

/boot/custom/usr/share/packages.

 

/boot/custom/bin

 

In the future there will be some of us who will install packages in those structures.

 

I even have an idea for a package manager to allow you to enable/disable them from a browser interface.

Hopefully one day we'll have a CGI capable http server so I can implement that.

 

See this Wiki article for more information.

http://lime-technology.com/wiki/index.php?title=Third_Party_Boot_Flash_Plugin_Architecture


Package directory updated in the prior post to follow suggested directory structure standard.

