
Question about EARS drive syslog error messages


Hello - I am new to the forums. I have been able to get all my questions answered since starting with unRAID and I am grateful to the active user community and to the folks at Lime Technology for the great support.

 

I am running Pro 4.5.4.  I originally installed amongst my drive pool a couple of 1.5 TB WD EARS Green drives.  Since they were the largest to enter the pool, I have one running parity.  I installed these before all the talk of the issues relating to Advanced Format Drives.  When all the discussions started earlier this year, and responses strongly recommending installing them with the jumper, I followed the procedure to install the jumpers, one at a time, clear the drives and rebuild the array.  I am happy to report that the process worked exactly as expected.

 

My question - the one that I have not found answered in any topics - is: what about all the boot time error messages in syslog?  Between the 2 drives I get in excess of 100 errors and more than 1000 lines of error text in the log.  They are a repeating series of the following messages:

 

Jul 15 21:55:02 Tower kernel: ata6.00: status: { DRDY ERR }

Jul 15 21:55:02 Tower kernel: ata6.00: error: { UNC }

Jul 15 21:55:02 Tower kernel: ata6.00: configured for UDMA/100

Jul 15 21:55:02 Tower kernel: sd 6:0:0:0: [sdf] Unhandled sense code

Jul 15 21:55:02 Tower kernel: sd 6:0:0:0: [sdf] Result: hostbyte=0x00 driverbyte=0x08

Jul 15 21:55:02 Tower kernel: sd 6:0:0:0: [sdf] Sense Key : 0x3 [current] [descriptor]

Jul 15 21:55:02 Tower kernel: Descriptor sense data with sense descriptors (in hex):

Jul 15 21:55:02 Tower kernel:        72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00

Jul 15 21:55:02 Tower kernel:        ae a8 7b 2f

Jul 15 21:55:02 Tower kernel: sd 6:0:0:0: [sdf] ASC=0x11 ASCQ=0x4

Jul 15 21:55:02 Tower kernel: sd 6:0:0:0: [sdf] CDB: cdb[0]=0x28: 28 00 ae a8 7b 28 00 00 08 00

Jul 15 21:55:02 Tower kernel: end_request: I/O error, dev sdf, sector 2930277167

Jul 15 21:55:02 Tower kernel: Buffer I/O error on device sdf, logical block 366284645

Jul 15 21:55:02 Tower kernel: ata6: EH complete

Jul 15 21:55:02 Tower kernel: ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0

Jul 15 21:55:02 Tower kernel: ata6.00: BMDMA stat 0x24

Jul 15 21:55:02 Tower kernel: ata6.00: failed command: READ DMA EXT

Jul 15 21:55:02 Tower kernel: ata6.00: cmd 25/00:08:28:7b:a8/00:00:ae:00:00/e0 tag 0 dma 4096 in

Jul 15 21:55:02 Tower kernel:          res 51/01:00:2f:7b:a8/01:00:ae:00:00/e0 Emask 0x1 (device error)

Jul 15 21:55:02 Tower kernel: ata6.00: status: { DRDY ERR }

Jul 15 21:55:02 Tower kernel: ata6.00: configured for UDMA/100

Jul 15 21:55:02 Tower kernel: ata6: EH complete

Jul 15 21:55:02 Tower kernel: ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0

Jul 15 21:55:02 Tower kernel: ata6.00: BMDMA stat 0x24

Jul 15 21:55:02 Tower kernel: ata6.00: failed command: READ DMA EXT

Jul 15 21:55:02 Tower kernel: ata6.00: cmd 25/00:08:28:7b:a8/00:00:ae:00:00/e0 tag 0 dma 4096 in

Jul 15 21:55:02 Tower kernel:          res 51/01:00:2f:7b:a8/01:00:ae:00:00/e0 Emask 0x1 (device error)

Jul 15 21:55:02 Tower kernel: ata6.00: status: { DRDY ERR }

Jul 15 21:55:02 Tower kernel: ata6.00: configured for UDMA/100

Jul 15 21:55:02 Tower kernel: ata6: EH complete

Jul 15 21:55:02 Tower kernel: ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0

Jul 15 21:55:02 Tower kernel: ata6.00: BMDMA stat 0x24

Jul 15 21:55:02 Tower kernel: ata6.00: failed command: READ DMA EXT

Jul 15 21:55:02 Tower kernel: ata6.00: cmd 25/00:08:28:7b:a8/00:00:ae:00:00/e0 tag 0 dma 4096 in

Jul 15 21:55:02 Tower kernel:          res 51/01:00:2f:7b:a8/01:00:ae:00:00/e0 Emask 0x1 (device error)

Jul 15 21:55:02 Tower kernel: ata6.00: status: { DRDY ERR }

Jul 15 21:55:02 Tower kernel: ata6.00: configured for UDMA/100

Jul 15 21:55:02 Tower kernel: ata6: EH complete

Jul 15 21:55:02 Tower kernel: ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0

Jul 15 21:55:02 Tower kernel: ata6.00: BMDMA stat 0x24

Jul 15 21:55:02 Tower kernel: ata6.00: failed command: READ DMA EXT

Jul 15 21:55:02 Tower kernel: ata6.00: cmd 25/00:08:28:7b:a8/00:00:ae:00:00/e0 tag 0 dma 4096 in

Jul 15 21:55:02 Tower kernel:          res 51/01:00:2f:7b:a8/01:00:ae:00:00/e0 Emask 0x1 (device error)

Jul 15 21:55:02 Tower kernel: ata6.00: status: { DRDY ERR }

Jul 15 21:55:02 Tower kernel: ata6.00: configured for UDMA/100

Jul 15 21:55:02 Tower kernel: ata6: EH complete

Jul 15 21:55:02 Tower kernel: ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0

Jul 15 21:55:02 Tower kernel: ata6.00: BMDMA stat 0x24

Jul 15 21:55:02 Tower kernel: ata6.00: failed command: READ DMA EXT

Jul 15 21:55:02 Tower kernel: ata6.00: cmd 25/00:08:28:7b:a8/00:00:ae:00:00/e0 tag 0 dma 4096 in

Jul 15 21:55:02 Tower kernel:          res 51/40:00:2f:7b:a8/40:00:ae:00:00/e0 Emask 0x9 (media error)
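The numbers in the excerpt above can be cross-checked against each other. The sketch below is not part of the original post; it assumes the usual libata layout of the `cmd` and `res` strings (command or status, then feature:count:LBA-low:LBA-mid:LBA-high, then the same five fields for the high-order bytes, then the device byte) and assumes the "logical block" in the buffer error is counted in 4096-byte units. The function name `parse_taskfile` is made up for illustration.

```python
def parse_taskfile(taskfile):
    """Return (lba, sector_count) from a libata cmd/res string.

    Assumed layout: XX/feat:count:lbal:lbam:lbah/hob_feat:hob_count:hob_lbal:hob_lbam:hob_lbah/dev
    """
    fields = taskfile.split("/")
    cur = [int(b, 16) for b in fields[1].split(":")]
    hob = [int(b, 16) for b in fields[2].split(":")]
    low = cur[2] | (cur[3] << 8) | (cur[4] << 16)    # LBA bits 0-23
    high = hob[2] | (hob[3] << 8) | (hob[4] << 16)   # LBA bits 24-47
    count = (hob[1] << 8) | cur[1]
    return (high << 24) | low, count

# Strings copied from the syslog excerpt.
CMD = "25/00:08:28:7b:a8/00:00:ae:00:00/e0"
RES = "51/40:00:2f:7b:a8/40:00:ae:00:00/e0"

start_lba, sector_count = parse_taskfile(CMD)
failed_lba, _ = parse_taskfile(RES)   # the count field of a result is not a request size

print(start_lba, sector_count)   # 2930277160 8
print(failed_lba)                # 2930277167
print(failed_lba // 8)           # 366284645
```

Under those assumptions every retry is the same 8-sector (4096-byte) read starting at sector 2930277160, and it fails on the last sector of that read, 2930277167. That matches the `end_request` sector, the `ae a8 7b 2f` bytes in the sense data, and logical block 366284645.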

 

 

My boot time is running at about 5 minutes now, and, generally speaking, it can't be good to be getting these errors.

 

My question is whether there is a way to stop the startup processing that generates these errors, or some proper means of eliminating them.

 

Thanks.

 

 

I would run a SMART report on that drive.  From your excerpt there appear to be "media" errors.  Those are unreadable sectors on the disk and will show as sectors pending re-allocation (or already re-allocated).

 

type:

smartctl -d ata -a /dev/sdf
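For anyone who would rather not scan the attribute table by eye, here is a minimal sketch that pulls out the three attributes most often tied to unreadable sectors (IDs 5, 197 and 198). It is an illustration, not a tool from this thread: the function names are invented, and it assumes each attribute row sits on a single line with the ten columns smartctl normally prints. The sample rows are copied from the report for this drive that appears further down the thread, joined back onto single lines.

```python
# Attribute IDs that usually point at unreadable or remapped sectors.
WATCHED_IDS = {5, 197, 198}

def parse_attributes(report):
    """Return {id: (name, raw_value)} for each attribute row in a smartctl report."""
    attributes = {}
    for line in report.splitlines():
        parts = line.split()
        # An attribute row has at least 10 columns and starts with a numeric ID.
        if len(parts) >= 10 and parts[0].isdigit():
            attributes[int(parts[0])] = (parts[1], parts[9])
    return attributes

def nonzero_watched(report):
    """Return {name: raw_value} for watched attributes whose raw value is above zero."""
    flagged = {}
    for attr_id, (name, raw) in parse_attributes(report).items():
        if attr_id in WATCHED_IDS and raw.isdigit() and int(raw) > 0:
            flagged[name] = int(raw)
    return flagged

SAMPLE_ROWS = """\
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
"""

print(nonzero_watched(SAMPLE_ROWS))  # {'Current_Pending_Sector': 1}
```

To use it on a live system, the text passed in would be the saved output of the smartctl command above.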

  • Author

Thanks Joe.  I did actually find a similar post here: http://lime-technology.com/forum/index.php?topic=5384.60.  You were in conversation in that post, but again there is no answer for why all these messages are being generated.  These messages did not show up for months, only immediately after applying the jumper, just as was the case for rd_blair. 

 

I ran the command on the indicated drive (the data drive) and got the following output:

 

=== START OF INFORMATION SECTION ===

Device Model:    WDC WD15EARS-00Z5B1

Serial Number:    WD-WMAVU1709886

Firmware Version: 80.00A80

User Capacity:    1,500,301,910,016 bytes

Device is:        Not in smartctl database [for details use: -P showall]

ATA Version is:  8

ATA Standard is:  Exact ATA specification draft version not indicated

Local Time is:    Thu Jul 15 20:18:35 2010 CDT

SMART support is: Available - device has SMART capability.

SMART support is: Enabled

 

=== START OF READ SMART DATA SECTION ===

SMART overall-health self-assessment test result: PASSED

 

General SMART Values:

Offline data collection status:  (0x84) Offline data collection activity

                                        was suspended by an interrupting command  from host.

                                        Auto Offline Data Collection: Enabled.

Self-test execution status:      (  0) The previous self-test routine completed

                                        without error or no self-test has ever

                                        been run.

Total time to complete Offline

data collection:                (33000) seconds.

Offline data collection

capabilities:                    (0x7b) SMART execute Offline immediate.

                                        Auto Offline data collection on/off support.

                                        Suspend Offline collection upon new command.

                                        Offline surface scan supported.

                                        Self-test supported.

                                        Conveyance Self-test supported.

                                        Selective Self-test supported.

SMART capabilities:            (0x0003) Saves SMART data before entering

                                        power-saving mode.

                                        Supports SMART auto save timer.

Error logging capability:        (0x01) Error logging supported.

                                        General Purpose Logging supported.

Short self-test routine

recommended polling time:        (  2) minutes.

Extended self-test routine

recommended polling time:        ( 255) minutes.

Conveyance self-test routine

recommended polling time:        (  5) minutes.

SCT capabilities:              (0x3031) SCT Status supported.

                                        SCT Feature Control supported.

                                        SCT Data Table supported.

 

SMART Attributes Data Structure revision number: 16

Vendor Specific SMART Attributes with Thresholds:

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   182   181   021    Pre-fail  Always       -       5866
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       183
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       918
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       50
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       25
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       840
194 Temperature_Celsius     0x0022   120   102   000    Old_age   Always       -       30
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

 

SMART Error Log Version: 1

No Errors Logged

 

SMART Self-test log structure revision number 1

No self-tests have been logged.  [To run self-tests, use: smartctl -t]

 

 

SMART Selective self-test log data structure revision number 1

SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS

    1        0        0  Not_testing

    2        0        0  Not_testing

    3        0        0  Not_testing

    4        0        0  Not_testing

    5        0        0  Not_testing

Selective self-test flags (0x0):

  After scanning selected spans, do NOT read-scan remainder of disk.

If Selective self-test is pending on power-up, resume after 0 minute delay.


 

Nothing sticks out.  I will run smartctl -t on that drive and post the results.  The full battery of tests run by this command takes 4 hours for one of these drives.  I suspect it will not flag anything - the size posted in unRAID remained exactly the same without and later with the jumper.  If the unRAID disk size is a value that is calculated, then I did not have any sectors go bad through this process. 

 

I am thinking that there is something about the installed jumper that is giving the unRAID OS' boot process some indigestion. 

 

What do the Linux gurus out there think? 

 

Why is the jumper inserted on a WD EARS Advanced Format Drive causing all these errors to be generated?

 

 

 

Actually, you do have one sector marked for re-allocation when it is next written:

197 Current_Pending_Sector  0x0032  200  200  000    Old_age  Always      -  1

 

I don't think the reported size of the disk ever changes when the jumper is added.

 

If you issue a "long" test on that drive I think it will take more than 4 hours.

It says in your smart report: Total time to complete Offline data collection: (33000) seconds.

Be sure to disable any spin-down timers or you'll cause it to abort when the drive is forced to spin down in the middle of its test.

33,000 seconds = a bit over 9 hours.
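The conversion is easy to check. This small sketch is an illustration only (the helper name is made up) and uses the two "Total time to complete Offline data collection" values that appear in the reports in this thread.

```python
def describe_duration(seconds):
    """Format a number of seconds as whole hours and minutes."""
    hours, remainder = divmod(seconds, 3600)
    return f"{hours} h {remainder // 60} min"

print(describe_duration(33000))  # 9 h 10 min
print(describe_duration(32400))  # 9 h 0 min
```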

 

 

 

  • Author

Thanks again for the response.  Yes - good thing I didn't wait for it to run. 

 

Here is the completed test report (with the same error you previously caught):

 

root@Tower:~# smartctl -a /dev/sdf

smartctl version 5.38 [i486-slackware-linux-gnu] Copyright © 2002-8 Bruce Allen

Home page is http://smartmontools.sourceforge.net/

 

=== START OF INFORMATION SECTION ===

Device Model:    WDC WD15EARS-00Z5B1

Serial Number:    WD-WMAVU1709886

Firmware Version: 80.00A80

User Capacity:    1,500,301,910,016 bytes

Device is:        Not in smartctl database [for details use: -P showall]

ATA Version is:  8

ATA Standard is:  Exact ATA specification draft version not indicated

Local Time is:    Fri Jul 16 07:43:51 2010 CDT

SMART support is: Available - device has SMART capability.

SMART support is: Enabled

 

=== START OF READ SMART DATA SECTION ===

SMART overall-health self-assessment test result: PASSED

 

General SMART Values:

Offline data collection status:  (0x85) Offline data collection activity

                                        was aborted by an interrupting command from host.

                                        Auto Offline Data Collection: Enabled.

Self-test execution status:      ( 241) Self-test routine in progress...

                                        10% of test remaining.

Total time to complete Offline

data collection:                (33000) seconds.

Offline data collection

capabilities:                    (0x7b) SMART execute Offline immediate.

                                        Auto Offline data collection on/off support.

                                        Suspend Offline collection upon new

                                        command.

                                        Offline surface scan supported.

                                        Self-test supported.

                                        Conveyance Self-test supported.

                                        Selective Self-test supported.

SMART capabilities:            (0x0003) Saves SMART data before entering

                                        power-saving mode.

                                        Supports SMART auto save timer.

Error logging capability:        (0x01) Error logging supported.

                                        General Purpose Logging supported.

Short self-test routine

recommended polling time:        (  2) minutes.

Extended self-test routine

recommended polling time:        ( 255) minutes.

Conveyance self-test routine

recommended polling time:        (  5) minutes.

SCT capabilities:              (0x3031) SCT Status supported.

                                        SCT Feature Control supported.

                                        SCT Data Table supported.

 

SMART Attributes Data Structure revision number: 16

Vendor Specific SMART Attributes with Thresholds:

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   249   181   021    Pre-fail  Always       -       2508
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       192
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       929
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       50
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       25
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       865
194 Temperature_Celsius     0x0022   120   102   000    Old_age   Always       -       30
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

 

SMART Error Log Version: 1

No Errors Logged

 

SMART Self-test log structure revision number 1

No self-tests have been logged.  [To run self-tests, use: smartctl -t]

 

 

SMART Selective self-test log data structure revision number 1

SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS

    1        0        0  Not_testing

    2        0        0  Not_testing

    3        0        0  Not_testing

    4        0        0  Not_testing

    5        0        0  Not_testing

Selective self-test flags (0x0):

  After scanning selected spans, do NOT read-scan remainder of disk.

If Selective self-test is pending on power-up, resume after 0 minute delay.

 


 

But here is something interesting - I ran the same sequence of tests on the second EARS drive and received a very similar error but with one single sector already disabled:

 

root@Melnych1:~# smartctl -a /dev/sde

smartctl version 5.38 [i486-slackware-linux-gnu] Copyright © 2002-8 Bruce Allen

Home page is http://smartmontools.sourceforge.net/

 

=== START OF INFORMATION SECTION ===

Device Model:    WDC WD15EARS-00Z5B1

Serial Number:    WD-WMAVU1945232

Firmware Version: 80.00A80

User Capacity:    1,500,301,910,016 bytes

Device is:        Not in smartctl database [for details use: -P showall]

ATA Version is:  8

ATA Standard is:  Exact ATA specification draft version not indicated

Local Time is:    Fri Jul 16 07:53:05 2010 CDT

SMART support is: Available - device has SMART capability.

SMART support is: Enabled

 

=== START OF READ SMART DATA SECTION ===

SMART overall-health self-assessment test result: PASSED

 

General SMART Values:

Offline data collection status:  (0x82) Offline data collection activity

                                        was completed without error.

                                        Auto Offline Data Collection: Enabled.

Self-test execution status:      (  0) The previous self-test routine completed

 

                                        without error or no self-test has ever

                                        been run.

Total time to complete Offline

data collection:                (32400) seconds.

Offline data collection

capabilities:                    (0x7b) SMART execute Offline immediate.

                                        Auto Offline data collection on/off support.

                                        Suspend Offline collection upon new

                                        command.

                                        Offline surface scan supported.

                                        Self-test supported.

                                        Conveyance Self-test supported.

                                        Selective Self-test supported.

SMART capabilities:            (0x0003) Saves SMART data before entering

                                        power-saving mode.

                                        Supports SMART auto save timer.

Error logging capability:        (0x01) Error logging supported.

                                        General Purpose Logging supported.

Short self-test routine

recommended polling time:        (  2) minutes.

Extended self-test routine

recommended polling time:        ( 255) minutes.

Conveyance self-test routine

recommended polling time:        (  5) minutes.

SCT capabilities:              (0x3031) SCT Status supported.

                                        SCT Feature Control supported.

                                        SCT Data Table supported.

 

SMART Attributes Data Structure revision number: 16

Vendor Specific SMART Attributes with Thresholds:

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   184   182   021    Pre-fail  Always       -       5800
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       192
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       929
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       50
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       25
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       799
194 Temperature_Celsius     0x0022   119   101   000    Old_age   Always       -       31
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       1
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

 

SMART Error Log Version: 1

No Errors Logged

 

SMART Self-test log structure revision number 1

No self-tests have been logged.  [To run self-tests, use: smartctl -t]

 

 

SMART Selective self-test log data structure revision number 1

SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS

    1        0        0  Not_testing

    2        0        0  Not_testing

    3        0        0  Not_testing

    4        0        0  Not_testing

    5        0        0  Not_testing

Selective self-test flags (0x0):

  After scanning selected spans, do NOT read-scan remainder of disk.

If Selective self-test is pending on power-up, resume after 0 minute delay.

 

 


 

So with all this is there any correlation between this output and my syslog errors?  Should I be doing anything?
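One correlation can be checked with arithmetic alone. The sketch below is an editorial illustration, not something posted in the thread: it takes the "User Capacity" that smartctl reports for these drives and the sector number from the `end_request: I/O error, dev sdf, sector ...` line in the syslog, and assumes the drive presents 512-byte logical sectors.

```python
SECTOR_BYTES = 512                     # assumed logical sector size
capacity_bytes = 1_500_301_910_016     # "User Capacity" from the smartctl reports
failing_sector = 2_930_277_167         # sector from the syslog end_request line

total_sectors, remainder = divmod(capacity_bytes, SECTOR_BYTES)
last_lba = total_sectors - 1           # sectors are numbered from 0

print(total_sectors, remainder)        # 2930277168 0
print(failing_sector == last_lba)      # True
print(total_sectors % 8)               # 0
```

So the sector that keeps failing on sdf is the very last addressable sector of the drive, and the drive holds a whole number of 4096-byte blocks. One possible reading, and it is only a guess from these numbers, is that if the jumper shifts every logical sector by one, the last logical sector no longer has a complete physical sector behind it. Nothing in the thread confirms that, but it would fit the errors appearing only after the jumper was installed.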

 

 

 

 

 

I had an EARS drive and a Samsung drive that were intermittently triggering errors like these in my syslog. After doing a bunch of troubleshooting, testing, and googling to no avail, what did fix it was replacing the SATA cable to the drive. I've been getting cheap cables from Meritline, and it seems the quality control isn't great. But they're so cheap I don't mind switching out the odd cable.

  • Author

Here is the latest update on my WD EARS advanced format drive saga.

 

I continued to have error messages in syslog so I decided to install unmenu for better/easier access to SMART data.  I'm not much of a Linux expert.  Once I did this the error messages became even more frequent - to the point of making the system unresponsive.  When that happened, I started up a telnet session to connect to the server to see what was happening and all of a sudden the system rebooted.  Subsequent reboots did not solve the problem.  I uninstalled unmenu and all the packages from the USB Key and rebooted.  Everything came back but of course the errors were still flooding syslog.

 

I was not too concerned about data loss because the messages were always for the same disk sector.  What was concerning me is that these drives were being used for data and parity so if anything further happened I risked losing a lot of data.  Fearing a bad situation I went to my local supplier Memory Express and picked up 2 Seagate 2TB drives (not advanced format drives - the regular 512 byte sector drives).  They told me to bring in the WD drives and they would initiate the RMA process for me.

 

So I replaced the drives one at a time.  I am back running normally again.  Throughout this process I never lost access to the data so I am certainly happy with the way unRAID kept me in business. 

 

I am concerned about using these WD drives with the jumper in place.  I did not experience any problems running without the jumpers for around 2 months, and I was doing lots of IO moving large video files to and from the unRAID server.

 

I guess in the end I am most puzzled with the drives not being able to deal with the errors.  I would have thought that there would be some auto correction to mark the sectors bad and not make any more reference to them. 

 

The RMA process if successful will at best result in me getting 2 more of the same drives.  I don't think I want them back in my unRAID server though.

 

Anyone need 3 TB of disk?

 

 

 

 

 

 

 

The problem stems from formatting the drive without the jumper and then installing the jumper later.  The drive firmware does not seem to like that at all.  All the posts I have seen about this problem involve a drive that was first cleared and formatted without a jumper and had the jumper installed afterwards.

Yes there is.

 

Step 1. Install the Jumper on pins 7-8 before installing into server.

Step 2. Run PreClear to ensure drive has no issues.

Step 3. Assign to unRAID array.

 

Step 2 is optional but highly advised to sort out any issues/deaths early on.

  • 2 weeks later...
  • Author

I will try your suggestion and report back.  Thanks.

  • 2 weeks later...
  • Author

Here is my last update on this topic. 

 

I tried to run the preclear but there were simply too many errors flooding the log and too many drive resets for the preclear to run effectively. I was getting less than 5 MB/sec progress.

 

I took out the drives and put them into a Windows system where I tried to run the WD Data Lifeguard tool.  It simply would not work on either drive.  I repeated this without the jumper in place and everything tested fine.

 

I think this has the potential to nip at the heels of Lime Technology until support for 4K AFD drives is incorporated into a new release.  Many people will be purchasing 4K AFD drives and may run into the same problems I did - perhaps as innocently as redeploying a used drive that has been used elsewhere.

 

I returned the drives to where I purchased them.  They informed me that they are no longer recommending those drives for Linux systems as too many people have been having problems - especially when they do as I did - using the drives first without the jumper.  They had no way of guaranteeing me that they could get the RMA process started on them.  They were kind enough to give me back 75% of the value of the drives.  I purchased another Seagate 2Tb drive.  It is just time to move on.  I've wasted too much time with this.

 

Archived

This topic is now archived and is closed to further replies.
