
[SOLVED] Everything is slow, slow, sllllooowwwww


odd1


Posted

Hi all,

I've had the array up for close to a year now (I think). It consists of 9 disks, including parity, of various sizes. All are SATA. No cache drive (now). I run YAMJ and primarily use the array to feed the 7 Popcorn Hour players around the house. Recently, I've been having problems with speed on the array: mounting, unmounting, writing, etc. I say recently because this all started after I incorporated my cache drive into the array; I had run out of space on the other drives and never really used the cache drive. Did I miss something in that process?

 

As I write this, I am copying a 5.5GB file from my main PC to my tower/movies dir and am getting speeds of between 250KB/s and 1.8MB/s! It has gotten to the point where I can't even download anything directly to the array because it can't write fast enough and it keeps giving me errors in my torrent client. I tried to stop the array the other day to reboot and it took over an hour to unmount all of the drives! It now takes me over 1.5 hours to do a YAMJ scan! What the heck is going on?!

 

I set this up and have not had any problems until now. I have not had to tinker with anything so I am not as savvy as some of you all are with the inner workings of Linux and UnRaid. In order to help me at all, I know you will need more info, just let me know what you need & I will get it.

 

I really need to figure out what is happening. Please help!!! My family has gotten used to a certain level of convenience with the media delivery and now I am not able to keep up! ARGGGGGHHHHH....

 

thanks all....

Posted

All drives but my newly converted cache drive are reading 99-100% full. Could this be the problem? If so, is it just a matter of moving files from the full drives to the empty one or is there something more I need to do?

 

Right now the same file is still copying and none of the other drives are even spinning, only the one semi-empty one... Oh! it just finished the copy. 5.45GB took 1:21:27!

 

Thanks,

Posted

OK I cleaned up some stuff and moved some things around and no drive is near 100% any more. The problem is still there though. I've reset all routers and switches and rebooted both machines many times. It's not a network problem.

 

I tested moving a 421MB file from the tower to my PC and it took 10 sec. The same file took 6 minutes to be moved back to the tower. It seems to only slow down during writes not reads.
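To put numbers on that, here is a rough Python sketch that converts the transfer times quoted so far into average throughput. It assumes decimal units (1 MB = 10^6 bytes, 1 GB = 1000 MB); the function name is made up for illustration.

```python
def throughput_mb_s(size_mb: float, seconds: float) -> float:
    """Average transfer rate in MB/s for size_mb megabytes moved in `seconds`."""
    return size_mb / seconds

# 421 MB file: 10 s from the tower to the PC, 6 min back to the tower.
read_rate = throughput_mb_s(421, 10)
write_rate = throughput_mb_s(421, 6 * 60)

# 5.45 GB file written to the tower in 1:21:27.
big_copy_rate = throughput_mb_s(5.45 * 1000, 1 * 3600 + 21 * 60 + 27)

print(f"read:         {read_rate:.1f} MB/s")
print(f"write:        {write_rate:.2f} MB/s")
print(f"5.45 GB copy: {big_copy_rate:.2f} MB/s")
print(f"reads are about {read_rate / write_rate:.0f}x faster than writes")
```

Both writes land a little above 1 MB/s while the read runs at roughly 42 MB/s, which fits the idea that only the write path is affected.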

 

Anyone have any ideas?

 

Thanks...

Posted

sdc is my parity drive. I did both a short and long SMART self test today and these are the results as reported in the status report:

 

 

SMART Self-test log structure revision number 1
Num  Test_Description    Status                     Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure    90%        5944             158494197
# 2  Short offline       Completed: read failure    10%        5944             245286409

 

Could someone help me interpret this?

 

 

 

 

The complete report:

smartctl 5.39.1 2010-01-28 r3054 [i486-slackware-linux-gnu] (local build)

Copyright © 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

 

=== START OF INFORMATION SECTION ===

Model Family:    Western Digital Caviar Green family

Device Model:    WDC WD20EADS-00W4B0

Serial Number:    WD-WCAVY6087956

Firmware Version: 01.00A01

User Capacity:    2,000,398,934,016 bytes

Device is:        In smartctl database [for details use: -P show]

ATA Version is:  8

ATA Standard is:  Exact ATA specification draft version not indicated

Local Time is:    Wed Jan 11 15:54:44 2012 CST

SMART support is: Available - device has SMART capability.

SMART support is: Enabled

 

=== START OF READ SMART DATA SECTION ===

SMART overall-health self-assessment test result: PASSED

 

General SMART Values:

Offline data collection status:  (0x84) Offline data collection activity
                                        was suspended by an interrupting command from host.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      ( 121) The previous self-test completed having
                                        the read element of the test failed.
Total time to complete Offline
data collection:                (41580) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 255) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x3035) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

 

SMART Attributes Data Structure revision number: 16

Vendor Specific SMART Attributes with Thresholds:

ID# ATTRIBUTE_NAME           FLAG    VALUE  WORST  THRESH  TYPE      UPDATED  WHEN_FAILED  RAW_VALUE
  1 Raw_Read_Error_Rate      0x002f  200    200    051     Pre-fail  Always   -            0
  3 Spin_Up_Time             0x0027  237    234    021     Pre-fail  Always   -            10125
  4 Start_Stop_Count         0x0032  100    100    000     Old_age   Always   -            434
  5 Reallocated_Sector_Ct    0x0033  177    177    140     Pre-fail  Always   -            181
  7 Seek_Error_Rate          0x002e  200    200    000     Old_age   Always   -            0
  9 Power_On_Hours           0x0032  092    092    000     Old_age   Always   -            5952
 10 Spin_Retry_Count         0x0032  100    100    000     Old_age   Always   -            0
 11 Calibration_Retry_Count  0x0032  100    253    000     Old_age   Always   -            0
 12 Power_Cycle_Count        0x0032  100    100    000     Old_age   Always   -            23
192 Power-Off_Retract_Count  0x0032  200    200    000     Old_age   Always   -            8
193 Load_Cycle_Count         0x0032  136    136    000     Old_age   Always   -            193856
194 Temperature_Celsius      0x0022  120    116    000     Old_age   Always   -            32
196 Reallocated_Event_Count  0x0032  080    080    000     Old_age   Always   -            120
197 Current_Pending_Sector   0x0032  197    197    000     Old_age   Always   -            1001
198 Offline_Uncorrectable    0x0030  198    198    000     Old_age   Offline  -            753
199 UDMA_CRC_Error_Count     0x0032  200    200    000     Old_age   Always   -            0
200 Multi_Zone_Error_Rate    0x0008  200    184    000     Old_age   Offline  -            22

 

SMART Error Log Version: 1

No Errors Logged

 

SMART Self-test log structure revision number 1

Num  Test_Description    Status                     Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure    90%        5944             158494197
# 2  Short offline       Completed: read failure    10%        5944             245286409

 

SMART Selective self-test log data structure revision number 1

SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS

    1        0        0  Not_testing

    2        0        0  Not_testing

    3        0        0  Not_testing

    4        0        0  Not_testing

    5        0        0  Not_testing

Selective self-test flags (0x0):

  After scanning selected spans, do NOT read-scan remainder of disk.

If Selective self-test is pending on power-up, resume after 0 minute delay.

 

Posted

Your drive is dying.  Both your LONG and SHORT tests aborted on read errors when they encountered un-readable sectors.

 

It has 1001 unreadable sectors pending re-allocation when next written, and 181 unreadable sectors it has already re-allocated.

  5 Reallocated_Sector_Ct    0x0033  177    177    140     Pre-fail  Always   -            181
197 Current_Pending_Sector   0x0032  197    197    000     Old_age   Always   -            1001

 

Can you say RMA?    I would replace the drive with a good one as soon as possible.
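
If you want to check for this without reading the whole table by eye, the attribute rows can be picked out with a few lines of Python. This is only a sketch: the rows are pasted from the report in this thread, the function names are hypothetical, and treating any non-zero raw value on attributes 5, 197 and 198 as a warning is a common rule of thumb rather than anything smartctl itself enforces.

```python
# Attribute rows copied from the smartctl report posted in this thread.
REPORT = """\
  5 Reallocated_Sector_Ct    0x0033  177  177  140  Pre-fail  Always   -  181
197 Current_Pending_Sector   0x0032  197  197  000  Old_age   Always   -  1001
198 Offline_Uncorrectable    0x0030  198  198  000  Old_age   Offline  -  753
"""

# Attribute IDs that describe bad or suspect sectors (rule-of-thumb selection).
WATCHED = {5, 197, 198}

def parse_attributes(text):
    """Return {id: (name, raw_value)} for each attribute row found in text."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # An attribute row has 10 columns and starts with a numeric ID.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        try:
            raw = int(fields[9])
        except ValueError:
            continue  # skip rows whose raw value is not a plain integer
        attrs[int(fields[0])] = (fields[1], raw)
    return attrs

def sector_warnings(attrs):
    """List watched attributes whose raw value is above zero."""
    return [f"{name} = {raw}"
            for attr_id, (name, raw) in sorted(attrs.items())
            if attr_id in WATCHED and raw > 0]

for warning in sector_warnings(parse_attributes(REPORT)):
    print("WARNING:", warning)
```

On this drive all three watched attributes are non-zero, which is why the overall "PASSED" line in the report should not be taken as reassurance.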

Posted

That's what I was afraid of. This is a brand new drive! Less than 3 months old!

 

Does the spin-up/spin-down process shorten the lives of these drives? I've had drives in my main PC (pre Unraid) that stayed up all the time and lasted for 5 years or more.  Since I moved to unraid, I've lost 3 drives!

Posted

It's been stated that spinning up and spinning down can shorten the life of drives, but if that were true, manufacturers would suggest leaving them spinning.

Instead, there is firmware to spin down idle drives.

 

What affects drives a lot is any kind of power fluctuation. It can ruin a few sectors.

 

You could try running badblocks in destructive write mode to force reallocation of the sectors.

But I'm not sure it would be worth it if the SMART report showed FAILING_NOW.

I did not see any FAILING_NOW status, so you could try. It may help refresh the format on the drive.

 

FYI, you would need to take it out of the array before running badblocks.

It will take about 2-3 days to do a 4 pass badblocks test.

After that you can check the SMART output and decide if you want to RMA the drive.
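
For what it's worth, the 2-3 day figure can be sanity-checked with some back-of-the-envelope arithmetic. The sketch below takes the capacity from the smartctl report above and assumes an average sequential rate of 80 MB/s, which is a guess on my part and not a measured number; a drive that keeps hitting bad sectors could be a good deal slower. badblocks in write mode runs four patterns by default, each written across the disk and then read back.

```python
CAPACITY_BYTES = 2_000_398_934_016  # "User Capacity" from the smartctl report
PATTERNS = 4                        # badblocks -w default patterns
SWEEPS_PER_PATTERN = 2              # one write sweep plus one read-back sweep
ASSUMED_MB_S = 80                   # ASSUMPTION: average sequential throughput

def estimate_hours(capacity_bytes, mb_per_s, patterns=PATTERNS):
    """Hours needed to sweep the whole disk for every write and read pass."""
    total_bytes = capacity_bytes * patterns * SWEEPS_PER_PATTERN
    return total_bytes / (mb_per_s * 1_000_000) / 3600

hours = estimate_hours(CAPACITY_BYTES, ASSUMED_MB_S)
print(f"about {hours:.0f} hours, or {hours / 24:.1f} days")
```

At the assumed rate that comes out a little over two days, so the estimate above is in the right range for a 2 TB drive.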

 

Posted

Fixed it. Did a parity check. Took two days to complete, corrected one sync error and now everything is running normally. Transfers to the array are now running 25-30MB/s.

 

I'll keep an eye on the drive. Thanks to all who helped me track this down.

