aim60

Members
  • Posts

    92

Posts posted by aim60

  1. dgaschk

      One disk has 4 pending sectors, but they're the same 4 that have been there for years.  And the parity check (with all new cables) shows no hardware errors.

     

    garycase

      I will definitely look into check-summing the files.  A backup server is also worth thinking about, as is segregating the really important files so a copy can be taken off-site.
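    One way the check-summing idea could look in practice is a manifest of file digests that is rebuilt later and compared. This is only a minimal sketch: the helper names `sha256_of_file`, `build_manifest` and `changed_files` are made up for illustration, and SHA-256 is an assumed choice of hash, not something the thread settled on.

```python
import hashlib
import os


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    manifest = {}
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            full_path = os.path.join(folder, name)
            manifest[os.path.relpath(full_path, root)] = sha256_of_file(full_path)
    return manifest


def changed_files(old_manifest, new_manifest):
    """List files whose digest differs, or that are missing from the new manifest."""
    return sorted(
        path for path, digest in old_manifest.items()
        if new_manifest.get(path) != digest
    )
```

    A manifest built while the data is known to be good can be saved and compared against a fresh one after an event like a correcting parity check. Any path returned by `changed_files` is a candidate for restoring from a backup.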

     

    I've also been slowly coming to the conclusion that the only way forward is to do a correcting parity check.  And I've realized that 3000 blocks with errors is only about 3MB of data, so damage may be minimal.

     

    Thanks guys for the input.

  2. As a result of Black Friday purchases, I’ve been upgrading disks, and retiring the oldest ones.  I’ve had a few disk problems in the last 9 months, which turned out to be sata cable related.  My plan was to disturb things as little as possible, do all of the disk upgrades, and when things were stable replace all of the sata cables. Bad decision.

     

    I replaced disk6, a ST31500341AS 1.5TB, with a 2TB WDC_WD20EARX-00PASB0, and initiated a rebuild.  The result was many disk read errors on disk2.  I canceled the rebuild.  From the errors in the syslog I concluded that all of the errors were related to the sata cables.  I replaced all of the older sata cables.  While I was in the case, I also noticed that the power connector to disk2 was not fully seated, and fixed it.

     

    Before continuing, I successfully ran smartctl short disk tests on all disks.

     

    I re-initiated the rebuild on disk6.  This time the result was many read errors on disk5, and unraid marked disk5 as missing.  The syslog again indicated to me that the errors were cable/power related.  Disk5 still had one of the older sata cables, and in hindsight, it was on the loose side.  So I replaced the remaining sata cables in the system.

     

    At this point, I needed to establish confidence in the hardware.  I re-installed the original disk6, and replaced super.dat with the one from before the first disk6 replacement.  The array was set to not auto-start, and I powered up the hardware.

     

    I successfully read over 1GB from each disk with

      dd if=/dev/sdx of=/dev/null bs=65536 count=20000

    then initiated a nocorrect parity check.
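    For anyone who would rather script the same read test, here is a rough Python equivalent of the dd command. It is a sketch under assumptions: `read_test` is a hypothetical helper name, and reading a raw device such as /dev/sdx normally requires root.

```python
def read_test(path, block_size=65536, count=20000):
    """Read up to count blocks of block_size bytes from path, discarding the data.

    Returns the number of bytes read. An OSError (for example an I/O error
    from a failing disk or cable) propagates to the caller.
    """
    total = 0
    # buffering=0 keeps Python from adding its own read-ahead buffer
    with open(path, "rb", buffering=0) as device:
        for _ in range(count):
            block = device.read(block_size)
            if not block:  # reached the end of the device or file
                break
            total += len(block)
    return total
```

    With the defaults, a full pass reads 65536 x 20000 = 1,310,720,000 bytes, which matches the "over 1GB" figure. Calling it once per device and watching the syslog for new errors would mirror the manual procedure.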

     

    The hardware seems stable. The results of the parity check were:

      49 sync errors within 1 second (housekeeping area?)

      1 sync error sometime later

      3000+ sync errors after sector 2930245632

     

    If my calculations are correct, the 3000+ errors all occurred within 16MB of the end of a 1.5TB drive (disk6).  An fdisk of disk6 is attached.
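    The distance from the first of the 3000+ errors to the end of the old disk6 can be checked with a few lines of arithmetic. This sketch assumes 512-byte sectors and that the 1.5TB drive has the standard IDEMA sector count for a 1500 GB disk. The attached fdisk output would give the real figure, so treat the numbers as an estimate.

```python
SECTOR_BYTES = 512  # assumed logical sector size


def idema_sector_count(capacity_gb):
    """Standard LBA count for a drive of the given nominal capacity (IDEMA LBA1-03)."""
    return 97_696_368 + 1_953_504 * (capacity_gb - 50)


first_error_sector = 2930245632          # from the parity check results
disk_sectors = idema_sector_count(1500)  # assumed size of the 1.5TB disk6

sectors_from_end = disk_sectors - first_error_sector
bytes_from_end = sectors_from_end * SECTOR_BYTES

print(disk_sectors)       # 2930277168
print(sectors_from_end)   # 31536
print(bytes_from_end)     # 16146432, i.e. about 16 MB
```

    Under those assumptions the errors begin roughly 16 MB before the end of the 1.5TB disk.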

     

    My Question - Since the parity disk reflects the rebuild of a 1.5TB disk6 onto a 2.0TB disk6, might the 3000+ errors all reflect the reiserfs housekeeping of increasing the size of the disk? Or do I have corrupted data?

     

    In other words, can I run a correcting parity check and be reasonably confident that I have no data corruption?  I have no backups and would like as much as possible to avoid further corruption.

     

    Any suggestions on how to proceed would be greatly appreciated. 

     

    I’m thinking that once things are stable, I’ll run a reiserfsck on all of the data drives.

     

    An observation – anyone running a server without removable drive bays, that does a fair amount of moving/replacing drives, should strongly consider replacing their sata cables regularly.  The ones I just installed are Monoprice sata3 cables, and they seem more secure than any other cables I’ve used.

     

     

    5.0.2RC1, C2SEE, Celeron 1400, 4GB, Corsair VX450, (1) SIL3132 PCIx SATA controller, Intel PCI NIC, 7 drives in total.

    Syslogs_etc.zip

  3. For the people who can resolve the server by host name when it is using DHCP, but not when its address is static, this is expected behavior.  Your router is able to resolve the host name as long as the DHCP lease is active.

     

    The following works for me using a DD-WRT router, but it may not work for everybody:

     

    See if your router will allow you to assign a static DHCP address.  That's enough for DD-WRT to permanently resolve the host name.  Since you now know what IP would always be assigned to your nic by dhcp, you can go ahead and assign the address statically.
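    A quick way to confirm that the host name still resolves after switching to the static address is a small lookup check from another machine. This is a sketch only: `resolves_to` is a hypothetical helper, and the host name and IP in the usage note are placeholders.

```python
import socket


def resolves_to(hostname, expected_ip, resolver=socket.gethostbyname):
    """Return True if hostname currently resolves to expected_ip.

    A failed lookup counts as False.
    """
    try:
        return resolver(hostname) == expected_ip
    except OSError:  # socket.gaierror is a subclass of OSError
        return False
```

    For example, `resolves_to("tower", "192.168.1.50")` would be expected to return True once the router has the static DHCP entry, assuming those are the server's name and address.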

  4. Hi Joe, found a quirk.  Running unRaid 5.0-rc16c and preclear 1.13.

     

    In the unRaid disk settings, set Enable Auto Start to No, then reboot.  Run preclear_disk.sh -l.

     

    All disks, whether assigned to the array or not, are listed as available for clearing.

     

    Start the array.  Only the correct disks are listed for clearing.

     

    Stop the array.  The correct disks are still listed.

     

  5. If you are planning on using the APCUPSD plugin with a CyberPower, read this.

     

      http://lime-technology.com/forum/index.php?topic=13411.msg127182#msg127182

     

    The plugin shuts down unraid without a problem.  Suggest setting "Power Down UPS after shutdown" to NO, and dedicating the UPS to unraid.

     

    On my CP1500PFCLCD, if you set it to YES, the UPS doesn't shut down when you would expect it to.  At power fail, it sets an internal 60 minute timer, and shuts down the UPS when the timer expires.  And the timer doesn't reset if utility power is restored!  If you power up the server before it expires, be prepared for an unexpected server crash.  A manual power cycle of the UPS will clear the timer.
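    The timing described above can be written down as a tiny model, which makes it easier to see when a restarted server is still exposed. This is a sketch of the behavior as reported in this post, not of anything documented by the manufacturer. The function names are made up for illustration.

```python
from datetime import datetime, timedelta

UPS_TIMER = timedelta(minutes=60)  # delay described in the post


def ups_cutoff(power_failed_at):
    """Time at which the UPS cuts its output, counted from the power failure."""
    return power_failed_at + UPS_TIMER


def timer_still_pending(power_failed_at, server_restarted_at):
    """True if the server is powered back up before the UPS timer has expired."""
    return server_restarted_at < ups_cutoff(power_failed_at)
```

    With a power failure at 10:00, a server restarted at 10:20 would still have the timer pending and would lose power at 11:00, unless the UPS is power cycled first.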

     

    There's an alternative to APCUPSD that seems to have Cyberpower compatibility.  No idea what it would take to get it running under unraid.

     

      http://www.networkupstools.org

     

     

  6. But the real question is:

     

    If you have a flash drive with a fully functional 4.7, labeled UNRAID, license key, disks configured, shares set up, etc.

     

    And a flash with a fully functional 5.x, labeled UNRAID, license key, disks configured, shares set up, etc.

     

    Can you shut down a server under one OS, pull the flash, copy super.dat to the other flash, insert the other flash, and bring up the server under the other OS?

     

    The goal is that both OS's are happy, no irregularities, and no rebuilding of the raid drive required under either OS. Is the super.dat file the only thing that needs to change when switching flash drives?

     

    I would like to be able to switch back and forth several times before committing to 5.x, with minimal impact.

  7. Rob

     

    Thanks for the great feedback.

     

    Just for the record, I've included the previous syslog. A power outage from Hurricane Sandy forced the server down after 9 months.  Other than a few media errors and a sporadic "frozen" error from the unresolved Barracuda 7200.11 firmware issues, nothing noteworthy.

     

    UnRaid 4.7 has been so incredibly stable that I haven't wanted to upgrade.  The price I paid was having to purchase another small (2TB) drive, and not having the luxury to wait for a good sale.

     

    I'm looking forward to an equally stable 5.0 release.

    Syslog_2012-10-29.zip

  8. This Server has been totally stable for 3 years.

     

    An ST32000542AS red balled, and I’m pre-clearing a replacement.  In the meantime, I’ve been testing the drive on another pc.  Seatools for DOS comes up clean.  The UnRaid MyMain Smart web page looks good.  Smartctl doesn’t show anything critical.

     

    I’m wondering if the problem is the drive or some other component.

     

    The end of the Syslog shows the drive failure.

     

    4.7 final, C2SEE w/drive on the internal controller, Celeron 1400, 4GB, Corsair VX450, (1) SIL3132 PCIx SATA controller, Intel PCI NIC, 7 drives in total.

    Syslog_etc.zip

  9. Not a good idea.  Some linux kernels unlock the HPA.

     

    Set the drive to 2TB with HDAT2 and verified after a power off.  Booted UnRaid RC8a, and it saw a 3TB drive.

     

    Figured if I ever wanted to upgrade from 4.7 to 5.0, I should stay away from mucking with the HPA.

     

    Anyone holding off for 5.0 Final will have to bite the bullet and buy 2TB drives if they have a drive failure.  Wish I had upgraded earlier.

  10. I have a 3TB drive that went through multiple Pre-Clear passes on a test 5.0 system.  I'd like to use the HPA trick to reduce the size to match the 2TB drive, then use it to replace the failed drive.  It's on the on-board controller on a C2SEE, so I wouldn't expect controller problems.  Anyone see potential problems?

     

    HPA procedure here:  http://lime-technology.com/forum/index.php?topic=21325.msg189469#msg189469

  11. Are you single?   ;D

     

     

    Nope..  :)  That seems like a very interesting option.  As much as I don't mind kludging stuff..  I think that is going a bit too far for me.  lol..

     

    @aim60  The cheapest on ebay is just a couple bucks less than GPS, but probably will go with your link.  How long have they lasted?

     

    Battery Geek seems to be the cheapest on ebay:

    http://cgi.ebay.com/RBC25-UPS-Battery-Pack-SU1400RMXL3U-RBC8-RBC23-RBC24-/200468876592

    $52 shipped, versus $56 shipped for GPS. Also, Battery Geek is 0.5 Ah higher, but not sure that is that big of a deal for me.

     

    Thanks,

    flips

     

     

     

    They lasted 3 years.

     

    The first set they sent were their own private label batteries.  Their replacements, 3 years later, were another brand.  These guys refurb large (as in pallet jacks and fork lifts) UPS's, so I figured they wouldn't risk their reputation selling junk.

     

    Toured a large datacenter the other day.  They used UPS's without batteries.  Utility Power -> [electric motor -> large flywheel -> generator] -> load.  In case of a power outage, the flywheel keeps the UPS going long enough for diesel generators to kick in and replace utility power.  Great idea for anyone that has a diesel generator at home that can kick in in 5 seconds.

  12. Running preclear_disk version 1.7 on unRaid 5.0 Beta6a.  The default partition format is 4K aligned.  Called with

    "preclear_disk -c 2 /dev/sdc" (no -a or -A).  Cycle 1 ran with partition start 64.  Cycle 2 is running with partition start 63.

    Interesting...

     

    I don't doubt you, but I see no way for that to occur...   (In other words, I'll have to test it myself)

    If running with no option specified the "default" will be what you've specified in the unRAID "Settings" page.

     

    The partition start is set prior to entering the "cycle" loop.  It is un-changed (as far as I know) otherwise.

     

    Before I start my test,  are you sure you have 4k-aligned as the "default" set on your server?    (please double-check, so I can duplicate your situation here)

     

    Also, once the second cycle is complete, let me know what the output says.   You might even run

    preclear_disk.sh -t /dev/sdc

    and let it tell you how the disk is partitioned.  I'll be curious what it says.

     

    Joe L.

     

    The default is 4k-Aligned.  I'll run -t when the cycle is done.  Also, the server isn't needed at the moment.  I'll repeat the 2-cycle test to verify.  Any files that would be of use?

     

    Should have done more verification before posting.

     

    What I encountered was the bug that you fixed in version 1.9 (preclear defaults to 63 sector alignment on unRaid 5 with no -a or -A).
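    The difference between a partition start of 63 and 64 comes down to whether the starting byte offset is a multiple of 4096. A minimal check, assuming 512-byte logical sectors and a drive with 4096-byte physical sectors (`is_4k_aligned` is a name made up for this sketch):

```python
SECTOR_BYTES = 512      # logical sector size, assumed
PHYSICAL_BYTES = 4096   # physical sector size of a 4K ("Advanced Format") drive


def is_4k_aligned(start_sector):
    """True if a partition starting at start_sector begins on a 4096-byte boundary."""
    return (start_sector * SECTOR_BYTES) % PHYSICAL_BYTES == 0


for start in (63, 64):
    print(start, start * SECTOR_BYTES, is_4k_aligned(start))
# 63 32256 False
# 64 32768 True
```

    So a start of 64 puts the partition on a 4K boundary, while the older default of 63 does not.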

  14. I'm currently running horizontally without backplanes.  But my CM590 is filling up, and I'll need to consider 5-in-3's or a Norco.  I've heard that switching a HD's orientation after months or years of bearing wear is an invitation to premature failure.  Any feelings?