jonlai9


Posts posted by jonlai9

  1. I just precleared an old Maxtor 200GB drive, which passed, and tried to mount it as the cache drive. Surprisingly, the drive showed up as unformatted when I started the array. The drive is sde. Could there be a filesystem problem?

     

    I've attached the syslog, starting from when I started the array, and the SMART report produced after preclear finished, which looks fine to me...

     

    Clicking on the cache drive, it shows this, which suggests it should be formatted (I'm assuming):

    Partition format: MBR: 4K-aligned (factory-erased)

     

    What should I do now? Thanks.

    syslog.txt

    smart.txt

  2. I just got the RMA replacement for one of my drives and started running preclear on it.

     

    I'm hearing seek noises from this recertified EARS that are just a tad louder than I'd like, and since I'm a bit paranoid, I'm wondering if there's anything wrong with it, considering it's recertified. I did a short SMART test before running preclear and everything looked fine. I know the preclear script is very intensive on hard drives - would that be the cause of the louder seek noises? As a comparison, the noise level is about the same as (or just a tad quieter than) my 120GB Maxtor (7 years old and still running perfectly fine!) when Windows XP boots up. It's just that the rest of my EADS and Seagate Green drives were virtually silent during preclear.

     

    I'll wait and see what happens at the end of the preclear. BTW, does preclear run the short or long smart test?

     

    Would you trust your parity to a recertified drive? That is what this drive was intended to replace.

     

    If it helps any, I have two 2TB EARS drives and I cannot hear them at all. My unRaid server is very quiet as well.

     

    It's in the post-read portion of the preclear now and I can't hear any seeking anymore; it was only audible during pre-read. Weird. Let's see what the SMART tests say :)

     

    Seriously, I'm being extra paranoid here because it's a re-certified RMA replacement. My last RMA from WD was a new drive :(

  3. I just got the RMA replacement for one of my drives and started running preclear on it.

     

    I'm hearing seek noises from this recertified EARS that are just a tad louder than I'd like, and since I'm a bit paranoid, I'm wondering if there's anything wrong with it, considering it's recertified. I did a short SMART test before running preclear and everything looked fine. I know the preclear script is very intensive on hard drives - would that be the cause of the louder seek noises? As a comparison, the noise level is about the same as (or just a tad quieter than) my 120GB Maxtor (7 years old and still running perfectly fine!) when Windows XP boots up. It's just that the rest of my EADS and Seagate Green drives were virtually silent during preclear.

     

    I'll wait and see what happens at the end of the preclear. BTW, does preclear run the short or long smart test?

     

    Would you trust your parity to a recertified drive? That is what this drive was intended to replace.

  4. I've attached an updated version of the sleep script. The main aim was to repackage the various bits of functionality that different people have proposed already. There's little new functionality, but everything has been nicely parameterized for easy configuration. It should also be easy to re-code the various activity checks.

     

    The central logic of the script is that countdown to server sleep proceeds in three consecutive steps

     

    0) unRAID puts the HDDs to sleep, absent access to their (uncached) content

    1) a timeout after last HDD goes to sleep [original sleep counter]

    2) a timeout after last external activity, currently

        * TCP access over some 30sec window within the current 1-minute countdown tick

        * ping of specific IP addresses, to ascertain whether media players, etc., are online

    -) the countdown may be suspended altogether at certain hours.

     

    If any previously timed-out conditions are re-activated, subsequent time-out counters are reset.
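
    The staged countdown described above can be sketched roughly as follows. This is not the attached script, only a minimal illustration of the reset rule; the function name `tick`, the variable names, and both timeout values are assumptions, and the "suspended at certain hours" rule is left out.

```shell
#!/bin/sh
# Hypothetical sketch of the staged countdown; all names and values are assumptions.

hdd_timeout=15   # minutes to wait after the last HDD has spun down (assumed)
net_timeout=10   # minutes to wait after the last external activity (assumed)

hdd_count=$hdd_timeout
net_count=$net_timeout
should_sleep=0

# tick DRIVES_ACTIVE NET_ACTIVE
# Advances the countdown by one minute. Each argument is 1 (active) or 0 (idle).
# Sets should_sleep=1 once both stages have timed out, otherwise 0.
tick() {
  should_sleep=0
  if [ "$1" -eq 1 ]; then
    # Stage 1 re-activated: reset it and every later stage.
    hdd_count=$hdd_timeout
    net_count=$net_timeout
    return 0
  fi
  if [ "$hdd_count" -gt 0 ]; then
    # Stage 1 still counting down; stage 2 has not started yet.
    hdd_count=$((hdd_count - 1))
    net_count=$net_timeout
    return 0
  fi
  if [ "$2" -eq 1 ]; then
    # Stage 2 re-activated: reset only the stage 2 counter.
    net_count=$net_timeout
    return 0
  fi
  if [ "$net_count" -gt 0 ]; then
    net_count=$((net_count - 1))
  fi
  if [ "$net_count" -le 0 ]; then
    should_sleep=1
  fi
  return 0
}
```

    A caller would run `tick` once a minute, passing in whatever drive and network checks it uses, and go to sleep when `should_sleep` becomes 1.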

     

    The attached script should make it straightforward to configure whether and how to do each of these and for how long, whether to renew the DHCP lease and renegotiate a gigabit connection upon wake-up, and more.

     

     

    Can someone give me step-by-step instructions on how to install s2ram, and on what to edit in this script file and in my 'go' script to make it work with s2ram? Thanks.

  5. Here's an updated version -- this one you'd have to background from the go script rather than cron-launch.  It has a 15 minute (programmable) countdown from spindown to sleep.

     

    While the old one wouldn't corrupt anything, I feel its control mechanism is pretty klutzy - you could get weird cases where, upon wakeup, it might immediately re-sleep if the drives didn't spin up.  Or you hit the spindown button on the GUI and shortly afterwards the whole thing shuts down.  Now the time intervals are tightly tied together and more predictable for the user:

     

    EDIT:  Threw in a DHCP lease renewal - my router box seems to forget about the lease (it expires), and it doesn't seem like I'm getting an immediate renewal upon wakeup if the lease has expired.  This rectifies the situation.

     

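    #!/bin/bash
    # Put the server to sleep once all listed drives have been spun down
    # for $timeout consecutive minutes.
    
    drives="/dev/hda /dev/hdb /dev/sda /dev/sdb"
    timeout=15          # minutes from spindown to sleep
    
    count=$timeout
    while true
    do
      # hdparm -C reports "active/idle" for any drive that is still spinning.
      # $drives is left unquoted on purpose so it splits into one argument per drive.
      if hdparm -C $drives | grep -q active
      then
        count=$timeout            # a drive is spinning: restart the countdown
      else
        count=$((count - 1))      # all drives spun down: count down one minute
      fi
      if [ "$count" -le 0 ]
      then
        # Do pre-sleep activities
        sleep 5
    
        # Go to sleep
        echo 3 > /proc/acpi/sleep
    
        # Do post-sleep activities
        # Force a DHCP renewal (shouldn't be used for static-ip boxes)
        /sbin/dhcpcd -n
        sleep 5
    
        count=$timeout
      fi
      # Wait a minute
      echo "COUNT $count"
      sleep 60
    done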
    

     

     

    Just add this to the 'go' script and it should sleep 15 minutes after all drives are spun down, right?

     

    /boot/custom/bin/s3.sh

     

    Just wondering:

    This script doesn't seem to stop the array; it just waits for all drives to be spun down and then sleeps. Does the array need to be stopped before sleeping?

     

    Edit: I just did some testing by running echo 3 > /proc/acpi/sleep at the prompt. I stopped the array before sleeping, and I noticed that when the server woke up, it was still stopped. Was I not supposed to stop the server before sleeping? Would it then have remained started when I woke it over LAN? It just felt sensible to stop everything before sleeping, since sleeping is pretty much like shutting down the server.

     

    Thanks.

  6. The RAW values have meaning ONLY to the manufacturer.   Only a few represent actual counts we can interpret.

     

    As an example, the "head flying hours" on your first disk has a raw value of

    234552459001957

    Now, even if the unit were not "hours" but seconds, it would indicate the drive was about 7.4 million years old (read as minutes, about 446 million years).  Now, it might be... but it is very unlikely.
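
    The arithmetic is easy to check with shell integer math. This is only a sketch: the function name `years_from_raw` is made up for illustration, and a year is assumed to be 365 days.

```shell
#!/bin/sh
# Convert the quoted raw "head flying hours" value into years under
# different assumptions about its unit. Integer division, 365-day years.

raw=234552459001957
hours_per_year=$((24 * 365))

# years_from_raw RAW UNITS_PER_HOUR
#   UNITS_PER_HOUR is 1 for hours, 60 for minutes, 3600 for seconds.
years_from_raw() {
  echo $(( $1 / $2 / hours_per_year ))
}

echo "read as hours:   $(years_from_raw "$raw" 1) years"
echo "read as minutes: $(years_from_raw "$raw" 60) years"
echo "read as seconds: $(years_from_raw "$raw" 3600) years"
```

    Whichever unit is assumed, the result is millions of years or more, so the raw number cannot be a plain elapsed-time count.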

     

    There is NO standard for the raw values.  You can only compare the NORMALIZED "VALUE" to its affiliated failure "THRESHOLD"

    If higher than the threshold, the parameter is NOT failing.
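
    As a rough sketch of that comparison, the function below reads attribute rows in the column layout pasted in this thread and prints the name of any attribute whose normalized VALUE is not higher than its THRESH. The function name `failing_attributes` is hypothetical, and the two sample rows are copied from the EARS report in this thread.

```shell
#!/bin/sh
# failing_attributes: reads SMART attribute rows on stdin and prints the
# ATTRIBUTE_NAME of every row whose VALUE (column 4) is at or below
# THRESH (column 6). Header and other non-attribute lines are skipped.
failing_attributes() {
  awk '$1 ~ /^[0-9]+$/ && NF >= 10 { if ($4 + 0 <= $6 + 0) print $2 }'
}

# Two rows with large RAW_VALUEs; neither VALUE has dropped to its THRESH,
# so nothing is printed for them.
failing_attributes <<'EOF'
  5 Reallocated_Sector_Ct   0x0033   149   149   140    Pre-fail  Always       -       403
197 Current_Pending_Sector  0x0032   198   197   000    Old_age   Always       -       939
EOF
```

    This only mirrors the VALUE-versus-THRESH rule stated above; it does not interpret the raw counts.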

     

    All your disks are perfectly fine.   The "big" numbers are meaningless.

     

     

     

    Thanks Joe, that's definitely great to hear. What do all the Old_age and Pre-fail types mean then? When I first saw them, they worried me, until I saw that others with good drives had them too. Why would the drives have such statuses if they are fine? :S

     

    I'm concerned about my WD EARS then - the last drive I posted. When I was migrating data from the disk in Windows, some files would not copy, and Windows reported an I/O error, 0x8007045d. They weren't important files, and I thought maybe preclear would fix it, since it was probably a corrupted sector or something. But when I tried to preclear it, it just hung. It even brought down unMENU with it, but that was probably just a coincidence. After a few minutes, when I got back into unMENU, it reported:

    kernel: end_request: I/O error, dev sdd, sector 1495592 (Errors)

    kernel: Buffer I/O error on device sdd, logical block 186949 (Errors)

     

    That can't be good, right?

     

    Should I run a long SMART on this drive? Thanks.

  7. Just wondering, what significance do the 'threshold' and 'type' columns have? I see a lot of people have values higher than the threshold where the type says Pre-fail or Old_age... are these of any concern, or is RAW_VALUE the only thing we care about?

     

    I'm curious as to why my Seagate is reporting some interesting numbers... Aren't these really big numbers?

     

    SMART Attributes Data Structure revision number: 10
    Vendor Specific SMART Attributes with Thresholds:
    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
     1 Raw_Read_Error_Rate     0x000f   116   100   006    Pre-fail  Always       -       115761536 <<<<<<<<<<<
     3 Spin_Up_Time            0x0003   094   093   000    Pre-fail  Always       -       0
     4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       18
     5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
     7 Seek_Error_Rate         0x000f   100   253   030    Pre-fail  Always       -       734918 <<<<<<<<<<<
     9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       87
    10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
    12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       9
    183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
    184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
    187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
    188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
    189 High_Fly_Writes         0x003a   099   099   000    Old_age   Always       -       1
    190 Airflow_Temperature_Cel 0x0022   066   066   045    Old_age   Always       -       34 (Lifetime Min/Max 28/34)
    191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
    192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       3
    193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       18
    194 Temperature_Celsius     0x0022   034   040   000    Old_age   Always       -       34 (0 22 0 0)
    195 Hardware_ECC_Recovered  0x001a   026   017   000    Old_age   Always       -       115761536
    197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
    198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
    199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
    240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       234552459001957
    241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       1401882945
    242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       1452924888
    
    

     

    These two WD's look alright to me, right?

     

    SMART Attributes Data Structure revision number: 16
    Vendor Specific SMART Attributes with Thresholds:
    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
     1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
     3 Spin_Up_Time            0x0027   151   150   021    Pre-fail  Always       -       9416
     4 Start_Stop_Count        0x0032   099   099   000    Old_age   Always       -       1076
     5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
     7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
     9 Power_On_Hours          0x0032   096   096   000    Old_age   Always       -       3113
    10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
    11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
    12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       248
    192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       10
    193 Load_Cycle_Count        0x0032   142   142   000    Old_age   Always       -       175274
    194 Temperature_Celsius     0x0022   117   112   000    Old_age   Always       -       35
    196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
    197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
    198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
    199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
    200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0
    
    

     

    SMART Attributes Data Structure revision number: 16
    Vendor Specific SMART Attributes with Thresholds:
    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
     1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
     3 Spin_Up_Time            0x0027   149   149   021    Pre-fail  Always       -       9516
     4 Start_Stop_Count        0x0032   099   099   000    Old_age   Always       -       1001
     5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
     7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
     9 Power_On_Hours          0x0032   096   096   000    Old_age   Always       -       3415
    10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
    11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
    12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       254
    192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       11
    193 Load_Cycle_Count        0x0032   149   149   000    Old_age   Always       -       155865
    194 Temperature_Celsius     0x0022   117   111   000    Old_age   Always       -       35
    196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
    197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
    198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
    199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
    200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0
    
    

     

    However, I just tried to preclear another WD today, an EARS, and preclear hung (I think?). I looked at the SMART report and saw the following... does this mean I should RMA it?

     

    SMART Attributes Data Structure revision number: 16
    Vendor Specific SMART Attributes with Thresholds:
    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
     1 Raw_Read_Error_Rate     0x002f   110   110   051    Pre-fail  Always       -       17898 <<<<<<<<<<<
     3 Spin_Up_Time            0x0027   162   161   021    Pre-fail  Always       -       8900
     4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       199
     5 Reallocated_Sector_Ct   0x0033   149   149   140    Pre-fail  Always       -       403
     7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
     9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       607
    10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
    11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
    12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       61
    192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       8
    193 Load_Cycle_Count        0x0032   181   181   000    Old_age   Always       -       57559
    194 Temperature_Celsius     0x0022   119   118   000    Old_age   Always       -       33
    196 Reallocated_Event_Count 0x0032   001   001   000    Old_age   Always       -       240
    197 Current_Pending_Sector  0x0032   198   197   000    Old_age   Always       -       939
    198 Offline_Uncorrectable   0x0030   198   197   000    Old_age   Offline      -       857
    199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
    200 Multi_Zone_Error_Rate   0x0008   188   188   000    Old_age   Offline      -       2480
    
    SMART Error Log Version: 1
    ATA Error Count: 17860 (device log contains only the most recent five errors)
    CR = Command Register [HEX]
    FR = Features Register [HEX]
    SC = Sector Count Register [HEX]
    SN = Sector Number Register [HEX]
    CL = Cylinder Low Register [HEX]
    CH = Cylinder High Register [HEX]
    DH = Device/Head Register [HEX]
    DC = Device Command Register [HEX]
    ER = Error register [HEX]
    ST = Status register [HEX]
    Powered_Up_Time is measured from power on, and printed as
    DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
    SS=sec, and sss=millisec. It "wraps" after 49.710 days.
    

     

    Thanks.
