Preclear.sh results - Questions about your results? Post them here.


unMENU shows the preclear as not successful. Here are the report and a screenshot.

That is the final SMART report.  It still shows no error.

 

Where is the preclear report itself?  (I have no idea what myMain does, so I can offer no help with the screenshot.)

 

What version of the preclear_disk.sh did you use?  Type:

preclear_disk.sh -v

to find out.

 

So far, you've shown me nothing that says the disk failed the pre-clear.  A SMART report does not indicate anything about the pre-clear process. 

 

The preclear_reports folder should have had three files.  The initial SMART report, the final SMART report, and the preclear results. 

 

Post the contents of the results file, not the SMART reports.

 

Joe L.

Link to comment

(I have no idea what myMain does, so I can offer no help with the screenshot.)

 

myMain echoes the status returned from preclear. My guess is there was a post-read verification error, but I will have to look at the actual preclear report to be sure.

Link to comment

I have now put all 3 files into the zip file.

It says that the post-read detected the disk did not have all zeros when read.

 

This could be caused by almost anything: bad memory, a bad disk, a bad disk controller, a bad motherboard chipset (early nForce had this), or a bad power supply.

 

These types of errors are exactly why the test for zeros was added to the preclear post-read.  They cause hair loss (because you will pull your hair out trying to find elusive parity errors if the drive is added to the array).

 

About the only hardware you can eliminate is the mouse.  It is unlikely to be the cause of the errors.  ;)

 

I see another preclear_disk.sh run in your future.

 

Joe L.

Link to comment

I cleared 4 drives at the same time and 3 completed successfully. All 4 drives are connected to the same 8-port controller (AOC-SASLP-MV8), so it's unlikely to be the controller, and the same goes for the power supply and chipset. This is an i3 processor with an Intel chipset. I will run a single preclear on the drive again and see if that completes.

Link to comment

I'm happy it might be as simple as that.  As I said, in the past those with similar drives that "randomly" returned inconsistent values (other than zeros) when subsequently read would drive their array owners insane, as once the disk is in the array the only symptom would be random parity errors when parity is checked, and there would be absolutely no way, other than by process of elimination, to figure out the hardware that was faulty.

 

Joe L.

Link to comment

I have now put all 3 files into the zip file.

 

As I had thought, the post-read verify failed.  What preclear does is first read the entire disk, then zero the entire disk, and then read the entire disk again, verifying that it is full of zeros.  What happened is that the last step found some location(s) where the data was not zero.  It could literally have been 1 bit in 20 trillion; that's all it takes.
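The final verification step described above can be approximated with standard tools. This is only a minimal sketch, not the code preclear_disk.sh actually uses: `is_all_zeros` is a hypothetical helper name, it assumes a `cmp` that supports `-n` (as GNU cmp does), and it takes the size from `wc -c`, so it is meant for a regular file or disk image rather than a raw device.

```shell
# Hypothetical helper (not part of preclear_disk.sh):
# succeed only if every byte of the given file is zero.
is_all_zeros() {
    file=$1
    # Size in bytes; the arithmetic expansion strips any padding wc may add.
    size=$(( $(wc -c < "$file") ))
    # Compare the first $size bytes against the endless zero stream.
    # -s keeps cmp quiet; the exit status is 0 only if no byte differs.
    cmp -s -n "$size" "$file" /dev/zero
}
```

A single non-zero byte anywhere in the file makes the function return non-zero, which is the same all-or-nothing behaviour the post-read check has.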

 

As Joe L. says, there are numerous things that can cause this.  I'd recommend running an overnight memory test as a starting point.  If you have bad or misconfigured memory, it can cause problems during the write or the post-read phase.  Could also be a bad or loose data cable or some incompatibility.  These can be hard to find, but not always.  Do the memory test and based on the results we can suggest additional tests to try to narrow it down.

 

Link to comment

I don't understand why preclear is reporting a smartctl problem. When I run smartctl manually it seems to work fine.

 

 

root@192:/boot# preclear_disk.sh /dev/sdc

Pre-Clear unRAID Disk /dev/sdc
################################################################## 1.11

smartctl may not be able to run on /dev/sdc with the -d ata option.
however this should not affect the clearing of a disk.
smartctl exit status = 4
smartctl 5.39.1 2010-01-28 r3054 [i486-slackware-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Seagate DB35.3 Series
Device Model:     ST3160215SCE
Serial Number:    5RX1MFTP
Firmware Version: 3.ACF
User Capacity:    160,041,885,696 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Wed Jun 29 13:55:16 2011 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

Error SMART Status command failed
Please get assistance from http://smartmontools.sourceforge.net/
Register values returned from SMART Status command are:
ST =0x40
ERR=0x00
NS =0x00
SC =0xa0
CL =0x9e
CH =0xa1
SEL=0x40
A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.$
Do you wish to continue?
(Answer Yes to continue. Capital 'Y', lower case 'es'): 
root@192:/boot# smartctl -d ata -a /dev/sdc
smartctl 5.39.1 2010-01-28 r3054 [i486-slackware-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Seagate DB35.3 Series
Device Model:     ST3160215SCE
Serial Number:    5RX1MFTP
Firmware Version: 3.ACF
User Capacity:    160,041,885,696 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Wed Jun 29 13:55:48 2011 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.

General SMART Values:
Offline data collection status:  (0x82)	Offline data collection activity
				was completed without error.
				Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)	The previous self-test routine completed
				without error or no self-test has ever 
				been run.
Total time to complete Offline 
data collection: 		 (15556) seconds.
Offline data collection
capabilities: 			 (0x5b) SMART execute Offline immediate.
				Auto Offline data collection on/off support.
				Suspend Offline collection upon new
				command.
				Offline surface scan supported.
				Self-test supported.
				No Conveyance Self-test supported.
				Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
				power-saving mode.
				Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
				General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   1) minutes.
Extended self-test routine
recommended polling time: 	 (  54) minutes.
SCT capabilities: 	       (0x0031)	SCT Status supported.
				SCT Feature Control supported.
				SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
 1 Raw_Read_Error_Rate     0x000f   115   072   006    Pre-fail  Always       -       92652310
 3 Spin_Up_Time            0x0003   098   097   000    Pre-fail  Always       -       0
 4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       79
 5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
 7 Seek_Error_Rate         0x000f   087   060   030    Pre-fail  Always       -       498343952
 9 Power_On_Hours          0x0032   086   086   000    Old_age   Always       -       12374
10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       79
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   064   044   045    Old_age   Always   In_the_past 36 (Lifetime Min/Max 35/36)
194 Temperature_Celsius     0x0022   036   056   000    Old_age   Always       -       36 (0 11 0 0)
195 Hardware_ECC_Recovered  0x001a   115   064   000    Old_age   Always       -       92652310
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
202 Data_Address_Mark_Errs  0x0032   100   253   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1

SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
   1        0        0  Not_testing
   2        0        0  Not_testing
   3        0        0  Not_testing
   4        0        0  Not_testing
   5        0        0  Not_testing
Selective self-test flags (0x0):
 After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

root@192:/boot# 

 

Why am I seeing this problem?

When you run the command, are you checking its exit status?  The preclear script is, and your smartctl is exiting with a non-zero exit status.
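For anyone checking the exit status by hand: the smartctl man page documents it as a bit mask, so a value of 4 means bit 2 is set. The helper below is a small sketch for decoding it; the function name `decode_smartctl_status` is made up for illustration, and the bit descriptions are paraphrased from the man page rather than quoted.

```shell
# Hypothetical helper: print the meaning of each bit set in a smartctl exit status.
decode_smartctl_status() {
    status=$1
    bit=0
    for meaning in \
        "command line did not parse" \
        "device open failed or device did not identify itself" \
        "a SMART or other ATA command failed, or a SMART data checksum error" \
        "SMART status check returned DISK FAILING" \
        "prefail attributes at or below threshold" \
        "attributes at or below threshold at some time in the past" \
        "device error log contains records of errors" \
        "device self-test log contains records of errors"
    do
        # Test one bit of the status at a time, lowest bit first.
        if [ $(( (status >> bit) & 1 )) -eq 1 ]; then
            echo "bit $bit: $meaning"
        fi
        bit=$((bit + 1))
    done
}
```

Typical use would be to run `smartctl -a /dev/sdX` and then, as the very next command, `decode_smartctl_status $?`.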

 

Try version 1.12 of the preclear_disk.sh script attached to this post:  http://lime-technology.com/forum/index.php?topic=4068.msg128289#msg128289

I had originally posted it there to allow those with 3TB drives to give it a try, but it should fix your issue too (as long as your disk will report with just smartctl -a /dev/sdX).

 

I would really appreciate some help because I am still receiving the error with version 1.12. The disk being tested is now connected to a SASLP card, yet I receive normal output when I run the command smartctl -a /dev/sdb.

 

Furthermore, I wrote a small bash script that looks as follows:

http://pastebin.com/42nSRWXZ

It exits with a code of 0. Yet when I modified your preclear script and inserted the line "echo $smartstat" between lines 1456 and 1457, I got an exit code of 4.  Can you please advise me as to what exactly is occurring and how I may resolve the issue?

Link to comment

Have you tried the "-D" option to preclear_disk.sh?

 

Link to comment

Yes. Same results exit code 4, when it should be zero.

I've no idea...

Best I can offer is to suggest you invoke it as

sh -xv preclear_disk.sh /dev/sdX

 

and see what it is doing.  I certainly cannot fix your exit status if your smart command is exiting abnormally.

 

Joe L.

Link to comment


Thanks Joe L.

 

Just thought I'd ask, since it only occurs within the script, not from the command line or from within my test script, which is practically identical to the offending lines in the preclear script. Anyhow, I'll do some more tests and run the command as you suggested.

 

Edit:

 

Did more tests. I added a 5 second sleep at line 1450 and that seems to resolve the issue.

Link to comment

Yes, but it sounds like the logic is now looking at the exit status of the added "sleep" command, and sleep is always successful at sleeping.

 

Happy it works for you, but it does not solve the issue.  It will not affect the preclear regardless, as that smartctl call is just used to get the disk temperature.
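The concern above is easy to reproduce in isolation: `$?` always holds the status of the most recently run command, so anything placed between a command and the test of its status replaces the value. The two functions below are illustrative stand-ins, not lines from preclear_disk.sh; `sh -c 'exit 4'` plays the part of a failing smartctl call.

```shell
# Status captured right after the command: the failure (4) is preserved.
status_checked_immediately() {
    sh -c 'exit 4'      # stand-in for a smartctl call that fails
    smartstat=$?
    sleep 0
    echo "$smartstat"
}

# Status captured after an intervening sleep: sleep succeeded, so $? is 0.
status_checked_after_sleep() {
    sh -c 'exit 4'      # same stand-in failure
    sleep 0
    smartstat=$?
    echo "$smartstat"
}
```

The first function prints 4 and the second prints 0, which is why the position of an added sleep relative to the smartctl call matters.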

Link to comment

I don't believe so, since sleep is run before line 1451 which is

echo $clearscreen$goto_top${bold}Pre-Clear unRAID Disk $theDisk$norm

 

So the exit code being stored in $smartstat should be that of smartctl. If this is not the case, please forgive my statement. On another note, I noticed that when I did run preclear, all the reports looked valid, so it's only this check that causes issues for people and gives them a confusing error that may not be an error at all, as in my particular case.

 

unraid.png

 

 

PS:

 

Here is the preclear report from this disk using v1.11, before I thought of inserting the 5 second delay:

http://pastebin.com/JJBE52gp

Preclear issued the error at the start ("smartctl may not be able to run on...." and so on), yet as you can see, the report clearly shows that smartctl was able to run.

 

Thanks again Joe L. for putting up with my pestering. I love your script though, and like to contribute whenever possible.

Link to comment

All I can say is I'm happy it helped, but adding the "sleep 5" where you did should have had no effect on the subsequent smartctl exit status, yet you say it does.

 

The only way that could happen is if smartctl is not properly initializing one of the variables it uses internally and is somehow inheriting non-zero memory contents from a prior command.  Or perhaps it is a side effect of running the hdparm command on the disk just a few lines above to get its size, and the disk took some time to recover from that operation... I really don't know.

 

It certainly does no harm to add the sleep where you did.

 

Initially, I thought you were adding it between the invocation of smartctl and the evaluation of its exit status.
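If the disk really does need a moment to settle after an earlier command, one alternative to a fixed sleep is to retry the failing command after a short delay. The function below is a generic sketch under that assumption; `retry_with_delay` is a hypothetical name and nothing like it is claimed to exist in preclear_disk.sh.

```shell
# Hypothetical helper: run a command, retrying after a delay if it fails.
# Usage: retry_with_delay DELAY_SECONDS MAX_TRIES command [args...]
retry_with_delay() {
    delay=$1
    tries=$2
    shift 2
    attempt=1
    while :; do
        # Success on any attempt ends the loop immediately.
        "$@" && return 0
        status=$?
        # Out of attempts: hand back the last failure status unchanged.
        [ "$attempt" -ge "$tries" ] && return "$status"
        sleep "$delay"
        attempt=$((attempt + 1))
    done
}
```

For example, `retry_with_delay 5 3 smartctl -a /dev/sdX` would try up to three times, five seconds apart, and still report smartctl's own exit status if every attempt fails.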

 

Joe L.

Link to comment

hi. i am a noob to unraid and pre-clear. i mainly wanted to use unraid and pre-clear to clear the drive and verify that it's 'good to go' for other uses. so i run unraid from a usb stick and do one drive at a time on a pretty decent machine w/ 5 gigs of ram (memtested and passed). i ran pre-clear from the 'console' w/o remoting.

 

i did one 2TB drive and it all worked well. it took a little over 24 hrs, which seemed normal, and it seemed easy to do.

 

now i am trying to pre-clear a different drive, a 2TB WD20EADS (32MB cache), and the Pre-Read and Post-Read steps take a very, very long time. the first time i attempted it, it took a few days and seemed to freeze on Post-Read, so i rebooted and tried again.

 

now it's been about 18 hrs and Pre-Read is only 1% done. bytes read and elapsed time don't update often.

 

So... should i forget about this drive then? is it fubared now?

Link to comment

we don't know, you did not attach a system log. 
Link to comment

you probably already saw this, but there are tons of un-readable sectors on that disk.

Get a smart report to see the full statistics.

smartctl -a /dev/sda

look for sectors pending re-allocation.

 

Jan  2 16:33:13 Tower kernel:          res 51/40:00:f0:c1:9a/40:00:ad:00:00/e0 Emask 0x9 (media error)

Jan  2 16:33:13 Tower kernel: ata3.00: status: { DRDY ERR }

Jan  2 16:33:13 Tower kernel: ata3.00: error: { UNC }

Jan  2 16:33:18 Tower kernel: ata3.00: configured for UDMA/133

Jan  2 16:33:18 Tower kernel: ata3: EH complete

Jan  2 16:33:20 Tower kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0

Jan  2 16:33:20 Tower kernel: ata3.00: BMDMA stat 0x4

Jan  2 16:33:20 Tower kernel: ata3.00: failed command: READ DMA EXT

Jan  2 16:33:20 Tower kernel: ata3.00: cmd 25/00:08:f0:c1:9a/00:00:ad:00:00/e0 tag 0 dma 4096 in

Jan  2 16:33:20 Tower kernel:          res 51/40:00:f0:c1:9a/40:00:ad:00:00/e0 Emask 0x9 (media error)

Jan  2 16:33:20 Tower kernel: ata3.00: status: { DRDY ERR }

Jan  2 16:33:20 Tower kernel: ata3.00: error: { UNC }

Jan  2 16:33:23 Tower kernel: ata3.00: configured for UDMA/133

Jan  2 16:33:23 Tower kernel: sd 4:0:0:0: [sda] Unhandled sense code

Jan  2 16:33:23 Tower kernel: sd 4:0:0:0: [sda] Result: hostbyte=0x00 driverbyte=0x08

Jan  2 16:33:23 Tower kernel: sd 4:0:0:0: [sda] Sense Key : 0x3 [current] [descriptor]

Jan  2 16:33:23 Tower kernel: Descriptor sense data with sense descriptors (in hex):

Jan  2 16:33:23 Tower kernel:        72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00

Jan  2 16:33:23 Tower kernel:        ad 9a c1 f0

Jan  2 16:33:23 Tower kernel: sd 4:0:0:0: [sda] ASC=0x11 ASCQ=0x4

Jan  2 16:33:23 Tower kernel: sd 4:0:0:0: [sda] CDB: cdb[0]=0x28: 28 00 ad 9a c1 f0 00 00 08 00

Jan  2 16:33:23 Tower kernel: end_request: I/O error, dev sda, sector 2912600560

Jan  2 16:33:23 Tower kernel: Buffer I/O error on device sda, logical block 364075070

Jan  2 16:33:23 Tower kernel: ata3: EH complete

Jan  2 16:33:36 Tower kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0

Jan  2 16:33:36 Tower kernel: ata3.00: BMDMA stat 0x5

Jan  2 16:33:36 Tower kernel: ata3.00: failed command: READ DMA

Jan  2 16:33:36 Tower kernel: ata3.00: cmd c8/00:00:c8:56:41/00:00:00:00:00/e0 tag 0 dma 131072 in

Jan  2 16:33:36 Tower kernel:          res 51/40:4f:77:57:41/40:00:ad:00:00/e0 Emask 0x9 (media error)

Jan  2 16:33:36 Tower kernel: ata3.00: status: { DRDY ERR }

Jan  2 16:33:36 Tower kernel: ata3.00: error: { UNC }

Jan  2 16:33:40 Tower kernel: ata3.00: configured for UDMA/133

Jan  2 16:33:40 Tower kernel: ata3: EH complete

Jan  2 16:33:43 Tower kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0

Jan  2 16:33:43 Tower kernel: ata3.00: BMDMA stat 0x5

Jan  2 16:33:43 Tower kernel: ata3.00: failed command: READ DMA

Jan  2 16:33:43 Tower kernel: ata3.00: cmd c8/00:00:c8:56:41/00:00:00:00:00/e0 tag 0 dma 131072 in

Jan  2 16:33:43 Tower kernel:          res 51/40:4f:77:57:41/40:00:ad:00:00/e0 Emask 0x9 (media error)

Jan  2 16:33:43 Tower kernel: ata3.00: status: { DRDY ERR }

Jan  2 16:33:43 Tower kernel: ata3.00: error: { UNC }

Jan  2 16:33:47 Tower kernel: ata3.00: configured for UDMA/133

Jan  2 16:33:47 Tower kernel: ata3: EH complete

Jan  2 16:33:49 Tower kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0

Jan  2 16:33:49 Tower kernel: ata3.00: BMDMA stat 0x5

Jan  2 16:33:49 Tower kernel: ata3.00: failed command: READ DMA

Jan  2 16:33:49 Tower kernel: ata3.00: cmd c8/00:00:c8:56:41/00:00:00:00:00/e0 tag 0 dma 131072 in

Jan  2 16:33:49 Tower kernel:          res 51/40:4f:77:57:41/40:00:ad:00:00/e0 Emask 0x9 (media error)

Jan  2 16:33:49 Tower kernel: ata3.00: status: { DRDY ERR }

Jan  2 16:33:49 Tower kernel: ata3.00: error: { UNC }

Jan  2 16:33:52 Tower kernel: ata3.00: configured for UDMA/133

Jan  2 16:33:52 Tower kernel: ata3: EH complete

Jan  2 16:33:55 Tower kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0

Jan  2 16:33:55 Tower kernel: ata3.00: BMDMA stat 0x5

Jan  2 16:33:55 Tower kernel: ata3.00: failed command: READ DMA

Jan  2 16:33:55 Tower kernel: ata3.00: cmd c8/00:00:c8:56:41/00:00:00:00:00/e0 tag 0 dma 131072 in

Jan  2 16:33:55 Tower kernel:          res 51/40:4f:77:57:41/40:00:ad:00:00/e0 Emask 0x9 (media error)

Jan  2 16:33:55 Tower kernel: ata3.00: status: { DRDY ERR }

Jan  2 16:33:55 Tower kernel: ata3.00: error: { UNC }
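A long syslog like the one above repeats the same failures many times, so it can help to reduce it to the distinct sectors the kernel reported. This is a small sketch that assumes the `end_request: I/O error, dev ..., sector ...` line format shown in this log; `failed_sectors` is a hypothetical helper name.

```shell
# Hypothetical helper: read kernel log text on stdin and print each
# distinct "device sector" pair that had an I/O error.
failed_sectors() {
    sed -n 's/.*I\/O error, dev \([a-z0-9]*\), sector \([0-9][0-9]*\).*/\1 \2/p' | sort -u
}
```

It could be fed the system log on standard input, for example `failed_sectors < syslog`, where `syslog` stands for wherever the log was saved.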

 

 

Link to comment

thanks for help.

to be honest i'm not sure how to read a smart report or what it means. attached here.

smartlog.txt

Link to comment

I am running preclear and I used screen.  I reconnected and it looks frozen.  Problem is I don't know if it truly is locked up or if this is a problem with the interface.  When I run ps -al I see the following:

 

0 S     0  3150  2665  0  80   0 -   880 wait   pts/1    00:00:26 preclear_disk.s

1 S     0 24628  3150  0  80   0 -   880 pipe_w pts/1    00:00:00 preclear_disk.s

1 S     0 24634 24628  0  80   0 -   880 wait   pts/1    00:00:00 preclear_disk.s

0 D     0 24635 24634 50  80   0 -  2457 -      pts/1    03:37:41 dd

0 R     0 24636 24634 24  80   0 -   434 pipe_w pts/1    01:44:07 sed

0 S     0 24637 24634  0  80   0 -   524 pipe_w pts/1    00:00:00 awk

4 S     0 24668 24654  0  80   0 -   627 pause  pts/0    00:00:00 screen

4 R     0 24691 24670  0  80   0 -   519 -      pts/2    00:00:00 ps

 

 

Not sure if I screwed this up or if this drive is suspect.  It was lying around, and I thought it had been a problem in previous installs, so I wanted to run it through its paces before I considered using it again.

 

Any suggestions would be greatly appreciated.

 

Thanks,

 

Neil

 

Link to comment

thanks for help.

to be honest i'm not sure how to read smart report or what it means. attached here.

 

  5 Reallocated_Sector_Ct  0x0033  170  170  140    Pre-fail  Always      -      236

197 Current_Pending_Sector  0x0032  197  196  000    Old_age  Always      -      1029

 

There are 1029 sectors pending re-allocation and 236 that have already been re-allocated.  (Those pending are waiting for a subsequent "write" so the disk knows what should be in the sector it re-locates.)

That disk has failed.  Do not trust it with your data.  RMA it.
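For those unsure how to find these two numbers in a SMART report, the lines can be picked out by attribute ID. The sketch below assumes the usual ten-column attribute table that `smartctl -a` prints, as in the two lines quoted above; `smart_sector_counts` is a hypothetical helper name.

```shell
# Hypothetical helper: read a SMART report on stdin and print the raw values
# of attribute 5 (reallocated sectors) and 197 (sectors pending re-allocation).
smart_sector_counts() {
    # Column 1 is the attribute ID, column 2 its name, column 10 the raw value.
    awk '$1 == 5 || $1 == 197 { print $2 "=" $10 }'
}
```

It can be used as `smartctl -a /dev/sda | smart_sector_counts`. Non-zero raw values on either line are the ones to worry about.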

 

Joe L.

Link to comment

i am a noob to unraid. i've been trying to preclear some 2TB western digital green drives and they keep stopping with a segmentation fault

 

syslog is attached

Looks to me like you are either running out of memory, have faulty memory, or have memory whose voltage, timing, or clock speed is not set properly in the BIOS. (Some BIOSes get it right automatically, some do not.)

 

I suggest a memory test first.

 

Jul 11 10:59:51 Tower kernel: preclear_disk.s[2005]: segfault at 0 ip 0804e595 sp bf9e20d0 error 6 in bash[8048000+a0000]

Jul 11 11:28:14 Tower kernel: preclear_disk.s[9047]: segfault at 7e5f810 ip 07e5f810 sp bf8a8dfc error 14 in bash[8048000+a0000]

Jul 11 11:28:14 Tower kernel: preclear_disk.s[9046]: segfault at 7e5f810 ip 07e5f810 sp bf8a8dcc error 14 in bash[8048000+a0000]

Jul 11 11:28:14 Tower kernel: preclear_disk.s[9052]: segfault at 7e5f810 ip 07e5f810 sp bf8a8fdc error 14 in bash[8048000+a0000]

Jul 11 11:44:40 Tower kernel: preclear_disk.s[11567]: segfault at 7e5f810 ip 07e5f810 sp bf8a8dfc error 14 in bash[8048000+a0000]

Jul 11 11:44:40 Tower kernel: preclear_disk.s[11566]: segfault at 7e5f810 ip 07e5f810 sp bf8a8dcc error 14 in bash[8048000+a0000]

Jul 11 11:44:40 Tower kernel: preclear_disk.s[11572]: segfault at 7e5f810 ip 07e5f810 sp bf8a8fdc error 14 in bash[8048000+a0000]

Jul 11 11:46:16 Tower kernel: swap_free: Bad swap file entry 00002000

Jul 11 11:46:16 Tower kernel: BUG: Bad page map in process preclear_disk.s  pte:4000000000000 pmd:126bc9067

Jul 11 11:46:16 Tower kernel: addr:b78a1000 vm_flags:08000075 anon_vma:(null) mapping:f7086f78 index:144

Jul 11 11:46:16 Tower kernel: vma->vm_ops->fault: filemap_fault+0x0/0x305

Jul 11 11:46:16 Tower kernel: vma->vm_file->f_op->mmap: generic_file_mmap+0x0/0x3f

Jul 11 11:46:16 Tower kernel: Pid: 11805, comm: preclear_disk.s Not tainted 2.6.32.9-unRAID #8

Jul 11 11:46:16 Tower kernel: Call Trace:

Jul 11 11:46:16 Tower kernel:  [<c1057e18>] print_bad_pte+0x182/0x194

Jul 11 11:46:16 Tower kernel:  [<c1058cb5>] unmap_vmas+0x42f/0x64c

Jul 11 11:46:16 Tower kernel:  [<c105c802>] exit_mmap+0x8a/0x102

Jul 11 11:46:16 Tower kernel:  [<c10227ac>] mmput+0x28/0x96

Jul 11 11:46:16 Tower kernel:  [<c1025b53>] exit_mm+0xd3/0xdb

Jul 11 11:46:16 Tower kernel:  [<c1026bbf>] do_exit+0x152/0x508

Jul 11 11:46:16 Tower kernel:  [<c106d659>] ? fput+0x17/0x19

Jul 11 11:46:16 Tower kernel:  [<c1026fdc>] do_group_exit+0x67/0x8d

Jul 11 11:46:16 Tower kernel:  [<c1027011>] sys_exit_group+0xf/0x13

Jul 11 11:46:16 Tower kernel:  [<c1002935>] syscall_call+0x7/0xb

Jul 11 11:46:16 Tower kernel: Disabling lock debugging due to kernel taint

Jul 11 12:04:33 Tower kernel: preclear_disk.s[14549]: segfault at 29e5dd8e ip 0808a400 sp bf8a6180 error 4 in bash[8048000+a0000]

Jul 11 12:08:56 Tower kernel: preclear_disk.s[15210]: segfault at 803db25 ip 0803db25 sp bf8a9040 error 14 in bash[8048000+a0000]
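To see how often this is happening before and after a memory test or BIOS change, the segfault lines in a syslog like the one above can simply be counted. This is a minimal sketch assuming the `segfault at` wording shown in these kernel messages; `count_segfaults` is a hypothetical helper name.

```shell
# Hypothetical helper: read log text on stdin and print how many
# lines report a segfault.
count_segfaults() {
    grep -c 'segfault at'
}
```

For example, `count_segfaults < syslog`, where `syslog` stands for the saved log file.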

Link to comment
