System locks up during Parity-Sync or Pre-Clear


Hi all

I'm a newbie to Unraid, trying to set up my first system with three WD 2TB drives. So far I've got everything installed following the Configuration Tutorial.

Right now I'm at the Pre-Clear step. Everything was going fine until about 65%, when everything just stopped working. I cannot telnet into the tower, and I cannot wake the monitor to see anything on screen.

The LAN light is still flashing and the fans are still running  :-\

I'm stumped. The only way I can get back in is to do a hard reboot.

I've also tried just doing a Parity-Sync before the Pre-Clear, and again, when it's about 60% done (or 6 hours in), everything locks up.

I've retried 4 times now, trying different things, and I still get the same result.

 

One of my last e-mails from the tower was this:

 

Pre Read in progress on /dev/sdc: 25% complete.

(  500,200,243,200  of  2,000,398,934,016  bytes read )at 37.6 MB/s

Disk Temperature: -->44<--C,

Using Block size of  1,032,192  Bytes

Next report at 50%

Calculated Read Speed: 36 MB/s

Elapsed Time of current cycle: 3:49:14

Total Elapsed time: 3:49:15
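
As a sanity check on that report, the figures are self-consistent: bytes read divided by elapsed time gives the calculated read speed. Here is a minimal Python sketch of that arithmetic. The helper names are made up for illustration and are not part of the preclear script, and it assumes the script counts 1 MB as 1,000,000 bytes, which is what agrees with the numbers in the e-mail.

```python
def hms_to_seconds(hms: str) -> int:
    """Convert an 'H:MM:SS' elapsed-time string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def average_speed_mb_s(bytes_read: int, elapsed: str) -> float:
    """Average throughput in MB/s, assuming 1 MB = 1,000,000 bytes."""
    return bytes_read / hms_to_seconds(elapsed) / 1_000_000


def percent_complete(bytes_read: int, total_bytes: int) -> float:
    """Share of the disk read so far, as a percentage."""
    return 100 * bytes_read / total_bytes


# Figures copied from the /dev/sdc report above.
speed = average_speed_mb_s(500_200_243_200, "3:49:14")
pct = percent_complete(500_200_243_200, 2_000_398_934_016)
print(f"{pct:.1f}% read, average {speed:.1f} MB/s")
```

The average comes out at roughly 36 MB/s, in line with the "Calculated Read Speed" line. The 37.6 MB/s figure in the report is presumably the rate at the moment of reporting rather than the average.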

 

Could Unraid shut down the system if a drive gets too hot?

I started the Pre-Clear on all 3 drives at the same time and got the above report 3 hours in.

 

This was the last e-mail I got, 5 hours in, so I assume /dev/sdc was still working away:

 

Pre Read in progress on /dev/sdd: 50% complete.

(  1,000,400,486,400  of  2,000,398,934,016  bytes read )at 58.4 MB/s

Disk Temperature: 40C,

Using Block size of  1,032,192  Bytes

Next report at 75%

Calculated Read Speed: 54 MB/s

Elapsed Time of current cycle: 5:06:33

Total Elapsed time: 5:06:33

 

 

Any ideas?

syslog.txt

Link to comment

My guess would be a problem with a hard drive. Try preclearing just one at a time (do not power up the other drives). Another easy thing to try is a memory check: run memtest overnight from the boot menu. Although 44 C is a little warm, it should not shut everything down. Can you direct additional air at the drives with an external fan as a test?

Link to comment

Thanks for the info.

 

I've unplugged the two other drives and am in the process of doing a Pre-Clear on the newest drive I have. The two drives that were getting a little too warm are in a steel cage that is the farthest away from the fan. When I pre-clear them I'll modify the chassis so that more air can pass through the drives.

 

When I started the Pre-Clear I noticed this: "Partition 1 does not end on cylinder boundary."

Is that okay?  ???

 

 

 

Tower login: root

Linux 3.0.3-unRAID.

root@Tower:~# cd /boot

root@Tower:/boot# screen

Pre-Clear unRAID Disk /dev/sda

################################################################## 1.13

Device Model:    WDC WD20EARX-00PASB0

Serial Number:    WD-WCAZA8005251

Firmware Version: 51.0AB51

User Capacity:    2,000,398,934,016 bytes

 

Disk /dev/sda: 2000.4 GB, 2000398934016 bytes

1 heads, 63 sectors/track, 62016336 cylinders, total 3907029168 sectors

Units = sectors of 1 * 512 = 512 bytes

Sector size (logical/physical): 512 bytes / 4096 bytes

I/O size (minimum/optimal): 4096 bytes / 4096 bytes

Disk identifier: 0x00000000

 

  Device Boot      Start        End      Blocks  Id  System

/dev/sda1              64  3907029167  1953514552  83  Linux

 

Partition 1 does not end on cylinder boundary.

########################################################################

invoked as  ./preclear_disk.sh -A -M 4 /dev/sda

########################################################################

(-A option elected, partition will start on sector 64)

Are you absolutely sure you want to clear this drive?

(Answer Yes to continue. Capital 'Y', lower case 'es'):
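
For context on the "-A" line in that output: the drive reports 512-byte logical sectors on 4096-byte physical sectors, so a partition start is aligned only when its byte offset is a multiple of 4096. The Python sketch below checks that for sector 64, and for sector 63, the traditional MBR start, as a comparison. The function names are illustrative and the numbers come from the fdisk output above.

```python
LOGICAL_SECTOR_BYTES = 512    # from "Sector size (logical/physical)"
PHYSICAL_SECTOR_BYTES = 4096


def is_4k_aligned(start_sector: int) -> bool:
    """True if the partition's first byte falls on a physical-sector boundary."""
    return (start_sector * LOGICAL_SECTOR_BYTES) % PHYSICAL_SECTOR_BYTES == 0


def partition_blocks_kib(start_sector: int, end_sector: int) -> int:
    """Partition size in 1 KiB blocks, the unit of fdisk's 'Blocks' column."""
    sectors = end_sector - start_sector + 1
    return sectors * LOGICAL_SECTOR_BYTES // 1024


print(is_4k_aligned(64))   # True: 64 * 512 = 32768, a multiple of 4096
print(is_4k_aligned(63))   # False: 63 * 512 = 32256 leaves a remainder
print(partition_blocks_kib(64, 3907029167))  # 1953514552, as fdisk reports
```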

 

 

Link to comment

When I started the Pre-Clear I noticed this: "Partition 1 does not end on cylinder boundary."

Is that okay?  ???

If you were partitioning a drive 30 years ago, it would have mattered. Today NO disk has a fixed number of sectors per cylinder, so the message is meaningless on any disk produced in about the past 30 years.

 

The number of "cylinders" reported by the disk is completely fictitious; it exists only to satisfy legacy utilities such as "fdisk".

In some legacy OSes a partition had to start on a cylinder boundary, and space was wasted if it did not end on one.

 

You can safely ignore the warning.
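
To illustrate how arbitrary that geometry is, here is a small Python sketch using the numbers from the fdisk output earlier in the thread. The reported "1 heads, 63 sectors/track, 62016336 cylinders" is just one way of factoring the real quantity, the total sector count. The second geometry (255 heads, 63 sectors/track) is only a comparison picked for this example; it does not come from the drive.

```python
TOTAL_SECTORS = 3_907_029_168  # "total 3907029168 sectors" from fdisk


def cylinders_for(total_sectors: int, heads: int, sectors_per_track: int):
    """Return (whole cylinders, leftover sectors) for a made-up geometry."""
    sectors_per_cylinder = heads * sectors_per_track
    return divmod(total_sectors, sectors_per_cylinder)


# The geometry fdisk printed: it divides the disk exactly.
print(cylinders_for(TOTAL_SECTORS, heads=1, sectors_per_track=63))

# Another equally fictitious geometry: the same disk, a different
# "cylinder" count, and some sectors left over.
print(cylinders_for(TOTAL_SECTORS, heads=255, sectors_per_track=63))
```

Neither answer describes the physical platters. Only the total sector count is real, which is why a partition that misses a "cylinder boundary" loses nothing.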

Link to comment

I'd been running the Pre-Clear for over 12 hours, getting e-mails like a charm... then all of a sudden they stopped during the writing of zeros.

 

Got home this evening and found out that our area had a blackout at 10 am.  >:(

 

Lucky me gets to start this process all over again!!

First things first: I put the Unraid box on a battery backup. Let's try again!

This is going to be a really long haul, doing one drive at a time to see which one is at fault.

 

In the meantime I'm running the other drives through WD Data Lifeguard Diagnostics to see if there is an issue.

 

Now that I think back, when I bought the first 2TB drive, I wanted it for my PCH (Popcorn Hour C200).

I had a ton of issues with the system locking up and just not working after I installed the drive. I ended up calling tech support, spent almost an hour with them, and they finally decided it was the MB. I sent it in for repair or replacement. They called me back a few days later and told me that they couldn't find anything wrong with the MB...? They said they would replace the MB just to be on the safe side.

Got the unit back and still had issues.

By process of elimination, I started to blame the drive.

Did some research, and people were bitching about the EARS drives and how they sucked... I ended up testing the PCH with another drive and didn't have any issues with a non-Green 1.5TB drive.

I was now convinced that the drive was defective. I took it back to where I bought it and told them my story. They took it and tested it. It came back with an A+ ???

Really? You're kidding me, right?

Put my tail between my legs and went home.

Left the 1.5TB non-Green drive in the PCH and it has been working like a charm.

 

Forgetting all about the issues I'd had, I ended up putting the 2TB drive into a D-Link DNS-323. I needed a second drive in order to mirror the first one.

So I bought another one of the exact same brand and size.

Played with the D-Link for a few months and got so fed up with the poor speed and lack of support that I was ready for something else.

I was just about ready to shell out some cash for a QNAP, until one day a coworker and I were shooting the BS and we got on the subject of storage. That is when he told me all about Unraid.

Well, here I am... again with this one mysterious drive that doesn't want to behave and is causing me a lot of grief!

 

Just hoping that WD Data Lifeguard Diagnostics finds a fault in the drive so I can send it back!

 

Oh, just got an e-mail:

 

Preclear Zeroing Disk progress 25% complete. ;D

 

 

Link to comment

My WDC WD20EARX-00PASB0 passed after 28 hours.

 

So far I've run the other drives through the WD software and had no issues. But when I put one in Unraid and try a Pre-Clear, I can't get past 30-50%: the MB locks up and that's it.

The two drives are WD20EARS 2.0TB.

 

Trying one of the drives again... we'll see what happens.

 

Anyone else have issues like this?  >:(

 

Link to comment

Here is the SMART report for the first drive:

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Caviar Green (Adv. Format) family
Device Model:     WDC WD20EARS-00MVWB0
Serial Number:    WD-WMAZA1614623
Firmware Version: 51.0AB51
User Capacity:    2,000,398,934,016 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Fri Nov  4 23:38:54 2011 MDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84) Offline data collection activity
                                       was suspended by an interrupting command from host.
                                       Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                       without error or no self-test has ever
                                       been run.
Total time to complete Offline
data collection:                 (36360) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                       Auto Offline data collection on/off support.
                                       Suspend Offline collection upon new
                                       command.
                                       Offline surface scan supported.
                                       Self-test supported.
                                       Conveyance Self-test supported.
                                       Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                       power-saving mode.
                                       Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                       General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 255) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x3035) SCT Status supported.
                                       SCT Feature Control supported.
                                       SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   253   172   021    Pre-fail  Always       -       1183
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       451
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   093   093   000    Old_age   Always       -       5797
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       32
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       28
193 Load_Cycle_Count        0x0032   199   199   000    Old_age   Always       -       4973
194 Temperature_Celsius     0x0022   109   105   000    Old_age   Always       -       41
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      5795         -
# 2  Conveyance offline  Completed without error       00%      5763         -

SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
   1        0        0  Not_testing
   2        0        0  Not_testing
   3        0        0  Not_testing
   4        0        0  Not_testing
   5        0        0  Not_testing
Selective self-test flags (0x0):
 After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

 

And the other:

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Caviar Green (Adv. Format) family
Device Model:     WDC WD20EARS-00MVWB0
Serial Number:    WD-WMAZA1495369
Firmware Version: 51.0AB51
User Capacity:    2,000,398,934,016 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Fri Nov  4 23:42:57 2011 MDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                       was completed without error.
                                       Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                       without error or no self-test has ever
                                       been run.
Total time to complete Offline
data collection:                 (38460) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                       Auto Offline data collection on/off support.
                                       Suspend Offline collection upon new
                                       command.
                                       Offline surface scan supported.
                                       Self-test supported.
                                       Conveyance Self-test supported.
                                       Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                       power-saving mode.
                                       Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                       General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 255) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x3035) SCT Status supported.
                                       SCT Feature Control supported.
                                       SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
 1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       1
 3 Spin_Up_Time            0x0027   253   168   021    Pre-fail  Always       -       1266
 4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       439
 5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
 7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
 9 Power_On_Hours          0x0032   093   093   000    Old_age   Always       -       5807
10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       26
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       22
193 Load_Cycle_Count        0x0032   199   199   000    Old_age   Always       -       4621
194 Temperature_Celsius     0x0022   115   104   000    Old_age   Always       -       35
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Conveyance offline  Completed without error       00%      5799         -

SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
   1        0        0  Not_testing
   2        0        0  Not_testing
   3        0        0  Not_testing
   4        0        0  Not_testing
   5        0        0  Not_testing
Selective self-test flags (0x0):
 After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
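
For anyone wanting to check reports like these by script rather than by eye, here is a minimal Python sketch that parses the attribute table and applies two checks. An attribute counts as failed when its normalized VALUE is at or below a nonzero THRESH. The choice of reallocated, pending and uncorrectable sector counts (IDs 5, 197, 198) as the raw values to watch is a common rule of thumb and my assumption, not something stated in this thread. The sample rows are copied from the second drive's report above; the function and variable names are illustrative.

```python
SAMPLE = """\
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       1
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   093   093   000    Old_age   Always       -       5807
193 Load_Cycle_Count        0x0032   199   199   000    Old_age   Always       -       4621
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
"""

SECTOR_HEALTH_IDS = (5, 197, 198)  # assumption: the usual "bad sector" counters


def parse_attributes(text: str) -> dict:
    """Map attribute ID -> fields, skipping any line that is not a table row."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attrs[int(fields[0])] = {
            "name": fields[1],
            "value": int(fields[3]),
            "worst": int(fields[4]),
            "thresh": int(fields[5]),
            "raw": int(fields[9]),
        }
    return attrs


attrs = parse_attributes(SAMPLE)

# Attributes whose normalized value has reached a nonzero threshold.
failing = [a["name"] for a in attrs.values()
           if a["thresh"] > 0 and a["value"] <= a["thresh"]]

# Sector counters with a nonzero raw value.
sector_problems = [attrs[i]["name"] for i in SECTOR_HEALTH_IDS
                   if i in attrs and attrs[i]["raw"] > 0]

print("failing:", failing)
print("sector problems:", sector_problems)
print("power-on days:", attrs[9]["raw"] // 24)
```

On these rows both lists come back empty, which matches the PASSED self-assessment. That fits the picture so far: nothing in SMART points at either WD20EARS drive, even though the lock-ups only happen with them installed.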

Link to comment
