nottlv

  1. Thanks Johnnie, I figured as much. So in the last few days I ran a preclear on an 8TB Seagate archive drive and started the parity rebuild by swapping out sde. Parity finished, but as I feared, sdo is showing 28 errors in the error column on the unRAID web GUI home page. So when the parity rebuild runs into a read error on a specific disk, what exactly happens? (See the parity sketch after this list.)
  2. I noticed recently that two of the drives in my array (v5.05) were showing error counts in the GUI (I was prepping for a migration to v6). I ran SMART reports on the offending drives, and both are showing reallocated sectors, sectors pending reallocation, and entries in the SMART error log (attached). Both drives still appear responsive, but based on the SMART data they are either partially failing or going to fail very soon. What's the best strategy to maximize my chance of avoiding data loss: run a parity check on the array as-is and hope those sectors get properly reallocated on at least one of the drives (then replace the drive that couldn't be helped), or replace one of the drives first (probably sde, based on the higher number of sectors pending reallocation) and then run a parity check? (See the SMART sketch after this list.) smart.sde.txt smart.sdo.txt
  3. Thanks for the info, guys. Last night I made the BIOS change on a reboot, and when I brought the array up, all the disks were unassigned for some reason (is that due to the mdcmd command not sticking? I didn't have Joe's post when I did this last night). I had the screenshot of the drive assignments, so I just reassigned them exactly the way they were before. After doing that (I can't remember the exact verbiage), the web interface essentially asked whether I wanted to trust the array when I brought it up, so I did, and then started a parity check (not sync) in maintenance mode, with the checkbox to correct errors already checked and grayed out (so it appears I couldn't have changed that option even if I'd wanted to). I let that run overnight; when I looked this morning the check was complete with 2612467 parity updates. I then started the array normally, and everything seems to be working fine (my syslog is below if anyone wants to flag any issues). I haven't done a full parity sync (which I know takes around 18 hours or so)--is that necessary at this point, after the parity check seems to have corrected any issues? http://66.39.67.208/downloads/syslog-2012-07-03.txt
  4. Rob, thanks for the advice. I restarted the array and checked the connections; after restarting, the parity drive shows a blue ball with status PARITY NOT VALID: DISK_DSBL_NEW, and "New parity disk installed" is the array status message. I tried the "Trust Your Parity" steps (without refreshing the unRAID web menu) using the parity-check commands Joe posted for version 5.0 (http://lime-technology.com/forum/index.php?topic=19385.0), but I'm getting an error message on the parity check:

        root@NAS:~# cd /
        root@NAS:/# initconfig
        This will rename super.dat to super.bak, effectively clearing array configuration.
        The array must be in the Stopped state and it is up to you to confirm this.
        Are you sure you want to proceed? (type Yes if you do): Yes
        Completed
        root@NAS:/# /root/mdcmd set invalidslot 99
        root@NAS:/# /root/mdcmd check
        /root/mdcmd: line 11: echo: write error: Invalid argument

     The syslog is below, but I'm not sure how to proceed at this point, as I can't seem to get the parity check to work. If everything looks okay with disk2, I suppose I could run the parity sync, but would I want to run reiserfsck on disk2 first to make sure everything is fine? (See the reiserfsck note after this list.) http://66.39.67.208/downloads/syslog-2012-07-02.txt
  5. I was downloading some files from the Internet to my unRAID network share (version 5.0-rc5, recently upgraded from 4.7 so I could use 3TB drives) when I started to get disk access errors in my download manager. I checked the unRAID web page, and it shows my parity drive (a 3TB Seagate I installed two weeks ago; I ran preclear on it before installing) as disabled, and disk2 (a 1.5TB Seagate that's been in the array for a few years, I'd guess) is showing quite a few errors (819). At this point I'm unsure how to proceed: if I rebuild the parity, I could be syncing the errors from disk2 into parity and quite possibly corrupting files on that disk, and I don't think I can rebuild disk2 right now since the parity is disabled. I have a spare 1.5TB drive, but not another 3TB drive lying around that I could use. Any ideas on how to proceed, hopefully without losing the data on disk2? The syslog and screenshot files are too large to attach, so I've uploaded them here: http://66.39.67.208/downloads/syslog-2012-06-30.txt http://66.39.67.208/downloads/unRAID-2012-6-30.png Thanks.
  6. Duh... I was looking at the VALUE and not the RAW_VALUE, so I thought the drive was on its last legs. Anyway, thanks for all the help, guys.
  7. Thanks Joe, I checked the cables--everything seemed fine. I then re-seated the drives (my server uses IcyDock enclosures) just to make sure that wasn't an issue. When I powered the system up, I heard the clicking noise indicative of a bad drive. The unRAID web console was showing an odd message: disk 3 had a red dot and was marked as unavailable, but the error message was that the parity disk was smaller than the largest disk in the array. Disk 3 was showing a total size of over 2.1TB, even though it's a 1.5TB drive (my parity is a 2TB Hitachi). I then unassigned disk 3 and was able to successfully start the array. I tried to run preclear on disk 3 and got the message "Sorry: Device /dev/sdn is not responding to an fdisk -l /dev/sdn command", so I have to assume the drive is beyond recovery at this point. I guess it's another RMA for Seagate. I was still a little concerned about disk 1, so I ran a SMART check on it. Attached is the output; based on your wiki post about interpreting the SMART parameters, this drive looks like it may be on the road to failure, correct? smart_sdm.txt
  8. It appears I may have a simultaneous failure of two drives in a 15-drive array. My unRAID server is a LimeTech MD-1510/LI; the unRAID version is 4.7. The system has a mix of WD, Hitachi, and Seagate drives, including several ST31500341AS 1.5TB drives. I recently had an issue with one of these Seagate drives where the disk was showing as unavailable. I did an RMA advance replacement from Seagate, got the replacement drive in two days, ran preclear on it with a clean bill of health, and rebuilt the array. Everything was then fine. Now, four days later, it appears two of the other Seagate drives have failed. I first noticed that a directory had far fewer files in it than it should, so I checked the unMenu page, and there were I/O errors listed. I rebooted the array, and now two drives are listed as missing (see unMenu screenshot) and the array won't start. Also attached is the syslog; since the two drives are listed as missing, I haven't been able to figure out how to run a SMART test on them (if there is a way, please let me know and I'll certainly do that; see the note after this list). Is there any hope of rebuilding the array, or am I out of luck when it comes to the data on the two affected drives? syslog.txt
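
On the read-error question in post 1: unRAID's single parity is a bitwise XOR across the data disks, so every sector position on the parity disk satisfies

    P = d_1 \oplus d_2 \oplus \cdots \oplus d_n

and rebuilding a replaced disk d_k inverts that relation:

    d_k = P \oplus \bigoplus_{i \neq k} d_i

So reconstructing any given sector of the new disk requires a good read of that sector from parity and from every other data disk; a read error on one of the remaining disks (like the 28 errors on sdo) means those stripes cannot be reconstructed from redundancy alone. Exactly what the md driver writes for such stripes varies by unRAID version, so treat this as the general RAID-4 arithmetic rather than a statement of unRAID's precise error handling.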
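On the SMART questions in posts 2 and 6: the counts that matter are the raw ones, and (as post 6 discovered) you want the RAW_VALUE column, not the normalized VALUE column. A minimal sketch; /dev/sde is just the example device from post 2, substitute your own:

    # Pull the reallocation-related attributes from a full SMART report.
    # Read the RAW_VALUE (last) column:
    #   Reallocated_Sector_Ct  - sectors already remapped to spares
    #   Current_Pending_Sector - unreadable sectors waiting to be remapped
    #   Offline_Uncorrectable  - sectors that failed offline testing
    smartctl -a /dev/sde | egrep 'Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable'

A drive with a nonzero and growing Current_Pending_Sector count is the riskier one to leave in place during a rebuild, since those are exactly the sectors that will fail to read.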
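On the reiserfsck question at the end of post 4: a read-only filesystem check before the parity sync is reasonable. A sketch, assuming a v5-era array where disk2 maps to /dev/md2 and the array has been started in maintenance mode (checking the md device rather than the raw /dev/sdX keeps parity consistent if a repair is run later):

    # Read-only check; reports problems without changing anything:
    reiserfsck --check /dev/md2
    # Only if the check recommends it, run the repair pass:
    # reiserfsck --fix-fixable /dev/md2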
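On the missing-drive SMART question in post 8: unRAID lists a drive as missing when it drops out of the array, but if the disk still enumerates in Linux you can usually query it directly from the console. A sketch; sdX is a placeholder for whatever device letter the kernel assigned:

    # See which disks the kernel can actually talk to:
    fdisk -l 2>/dev/null | grep '^Disk /dev/sd'
    # If a missing drive shows up here, pull its SMART report:
    smartctl -a /dev/sdX
    # If it doesn't appear at all, SMART can't reach it; check power
    # and cabling (or re-seat the drive) and look again.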