visionmaster


  1. I've posted in the past about my issue with the Seagate 8TB ST8000VN0022 drives ( https://forums.unraid.net/topic/105350-unraid-os-version-692-available/page/5/?tab=comments#comment-981758 ), and after reviewing this thread ( https://forums.unraid.net/topic/103938-69x-lsi-controllers-ironwolf-disks-disabling-summary-fix/ ) my fix was simply to set these 2 drives to never spin down, since the errors occurred when the drives spun up from a spun-down state. Last week I noticed that the drive 12 temperature wasn't showing on the dashboard, and then the disk was disabled. I tried running SMART tests, but it wouldn't allow them. I then removed that drive and replaced it with another of the same model, ST8000VN0022 (I had 2 more of these as spares, precleared and ready to go). Drive 12 rebuilt from parity normally and completed with 0 errors. Stupidly, I set that drive to spin down at the default of 1 hour, and when it spun up again the read errors started and the disk went into a disabled state. I have not written anything to any disk since before the rebuild. I tried to find the "trust my array" procedure to re-enable the disk and use the existing parity, as it had just rebuilt with zero errors. Instead, I ended up reassigning the disks in the original configuration, and the array would only start with a parity sync. I left the box for correcting sync errors checked, expecting the sync to complete with 0 errors, since drive 12 had just been rebuilt with no errors and nothing has been written to any drive. Well, it is 7 hours in and 581 sync errors have been corrected so far. The data on drive 12 is mostly just movies, and the few movies I've tested seem to play fine. What would be the best thing to do at this point? I still have the original drive 12, which I plugged into my other test server; it reads normally and appears to have no SMART errors.
Should I let it finish rebuilding parity with all these new errors, or put the original disk 12 back into the array and rebuild parity again? Maybe a cable wasn't seated properly? It looks like I don't have any data loss at this point, and I'd like to keep it that way and get an error-free, protected array back online ASAP. Thanks for any help, as I'm still a noob with this, even though I've been running unRaid successfully for over 10 years. syslog.txt
  2. I tried updating to 6.9.1 two weeks ago from 6.8.3, which has been stable for years. I hit the Seagate drive issue: after the drives spin down and then spin back up, I start getting read errors on 2 models of my Seagate drives (ST8000VN004 and ST8000VN0022). After reverting, there were no issues and the system was stable again. I just tried upgrading to 6.9.2 and immediately got the same spin-up read errors on the same 2 drives, so I went back to 6.8.3, which is stable. Hopefully this is fixed in the next update. Those are the only Seagate models affected on my system; my 4TB Seagates work fine. My drive controllers are Supermicro (MV88SX6081 8-port SATA II PCI-X Controller and MV64460/64461/64462 System Controller, Revision B).
  3. I've tried to set this up in the scheduler on unRaid 6.5.0 without success. I'd like the parity check to run automatically 4 times a year, at midnight once every 3 months. Is this possible in the settings? Thanks! -Rich
  4. OK, thanks for the info! The parity upgrade completed and everything still reports 0 errors. So when I get home from work I will upgrade and retire disk 17, which is having read errors.
  5. Hi, I've gotten great help here in the past, so I always appreciate this forum. I can upload logs if needed, but here is the scenario; I just need to know how to proceed. I am on unRAID 6.5.0. I am planning to expand my parity from 4TB to 8TB so I can upgrade a couple of other HDDs to 8TB afterwards. I bought five 8TB HGST NAS drives on Black Friday that I just got around to preclearing in my other rig. Once the parity upgrade completed, I was going to recycle the old 4TB drive into my Windows machine, where I need it, and my goal was to add a second 8TB parity soon after. The old 4TB parity drive is sitting unused awaiting the completion of this process, and the server has not been used (100% sure no writes to it, and 98% sure no reads from it). I completed a parity sync/check with 0 errors 2 days ago. Yesterday, about 6pm, I powered down, replaced the old 4TB parity with the newly precleared 8TB, checked all cables, and started up. I assigned the new drive to the parity slot and started a rebuild. About 6 hours in, I started getting notifications that disk 17 is reporting SMART raw read errors, and the count is slowly climbing: from 32774 at 12:16am to 32956 at 8:11am. Disk 17 is a Samsung 2TB HD203WI. The main GUI screen still reports 0 errors from all drives, and the parity rebuild is still moving along, about 14 hours in and at 24.4%, with an estimated finish in 1-2 days as the speed is currently around 42 MB/sec. Disk 17 has 0 pending and 0 reallocated sectors (SMART report included). So, should I let it finish and make disk 17 the first drive upgraded? If I let it finish, am I writing corrupted data to my parity because of these raw read errors on disk 17?
Or should I stop, reinsert the old 4TB parity, set the array to trust that parity is correct (since the server has not been used), and rebuild disk 17 onto a spare 4TB I have on hand, then try the parity upgrade again, maybe adding a second parity with an 8TB drive first and then upgrading the P1 slot to 8TB? Or is there something else I should be doing or thinking about? Thanks! tower-smart-20180417-0738.zip
  6. Hi, sorry for the late reply. Just saw this. It's mostly for the slow parity check, but several months ago, I had an issue with drives dropping out during a monthly check. That was the only time and since then, no more problems, just slow. I guess I will play with the tunables and wait for now. Thanks!
  7. I haven't had issues with my server in years, but recently I have had a few problems, including slow parity checks, which may be due in part to the 2 SAS2LP cards I currently have. I'd like to swap them for Dell H310 cards flashed to the LSI firmware in IT mode. I looked on eBay, but since flashing might be an issue (I have no hardware other than my server to do it on), I'd rather save myself some time and a headache by buying cards that are already flashed. Anyone selling? I live in Florida.
  8. According to the GUI, the parity check ended with "error code:user abort" after finding 718 errors. But I didn't stop it; it stopped itself, 9.5 hours in. I guess this happened when the cable to disk 8 and several other cables got unseated? So, since it was a monthly parity check with correct errors set to yes, did incorrect parity get written? My question at this point is: is there a way to restart the array with disk 8 in a green state and recheck the parity? If it comes back all OK, then I'll chalk it up to loose cables; if there are errors this time, I'll assume disk 8 is bad and rebuild with a new disk. I'm not sure I'm explaining my question correctly, but it makes sense to me in my delirious state. Thanks.
  9. According to scheduler, in the unRAID GUI, it was set to write corrections. So does that mean that parity is incorrect and rebuilding from parity will place errors on the rebuilt disk? My gut says that the data disks are actually good (even disk 8 which is red balled).
  10. It took a while to unmount everything, and a bunch of errors popped up on several other disks. I went to the command console, forced an unmount, and then did a powerdown. All cables seemed fine, and I double-checked and replugged everything. I powered up and everything came online OK, with disk 8 in a red ball state but everything else fine. I checked SMART on all the drives, and all pass and seem good: 0 reallocated sectors and 0 pending sectors on all 23 drives. The parity check was going well for 9 and a half hours before the disks errored. I'm wondering if a cable or connection came loose and caused the issues. I'm sure nothing was written to the server since yesterday, when everything was working fine, and before the 9.5 hours of parity check. At this point, should I rebuild disk 8 with a new disk, or bring it back online and run another non-correcting parity check?
  11. Silly question, but should I powerdown and try reseating all the cables? I just looked and everything seems to be well connected. Will I be in danger of losing any data powering down at this point?
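
On the quarterly schedule asked about in post 3 (a parity check at midnight once every 3 months), here is a minimal sketch of what that schedule looks like in practice. It assumes the runs fall on the 1st of January, April, July and October, which is what the standard cron expression `0 0 1 */3 *` means. Whether the unRAID 6.5.0 scheduler accepts a custom cron expression is not confirmed anywhere in these posts, so treat that part as an assumption. The function name `quarterly_runs` and the start date are made up for illustration.

```python
from datetime import datetime


def quarterly_runs(start, count=4):
    """Return the next `count` run times strictly after `start`.

    Runs are assumed to fire at 00:00 on the 1st of Jan, Apr, Jul and Oct,
    matching the standard cron expression "0 0 1 */3 *".
    """
    quarter_months = (1, 4, 7, 10)  # months selected by */3 in cron's 1-12 range
    runs = []
    year = start.year
    while len(runs) < count:
        for month in quarter_months:
            candidate = datetime(year, month, 1, 0, 0)
            if candidate > start and len(runs) < count:
                runs.append(candidate)
        year += 1
    return runs


if __name__ == "__main__":
    # Arbitrary example start date, chosen only for illustration.
    for run in quarterly_runs(datetime(2018, 4, 17, 7, 38)):
        print(run.strftime("%Y-%m-%d %H:%M"))
```

Printing the upcoming dates like this is a quick way to confirm that a schedule really does fire 4 times a year before relying on it.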