wheel

Members
  • Content Count

    196
  • Joined

  • Last visited

Community Reputation

1 Neutral

About wheel

  • Rank
    Advanced Member

  1. I was kind of hoping that’d be the case, but felt like it’d be safest to check when playing with Parity on a massive array I haven’t moved to dual Parity yet. Thanks for the help!
  2. Same situation as OP, but I’m physically moving my parity disk to a slot currently holding a data disk. Just completed an unrelated parity check, so timing seems perfect. Anything I need to do differently, or does the swap-disks / new-config / re-order-in-GUI / trust-parity approach work just as simply for (single) parity in 6.8.3? Thanks for any guidance!
  3. Yeah, I'm just reading tea leaves at this point and hoping there's something obvious I'm missing. I have at least two (could be three in a couple of days) theoretically fine 8TBs ready to roll, plus the original 6tb that was throwing up errors before the rebuild (errors which may have nothing to do with the disk itself, now). GUI shows the rebuild ("Read-Check" listed) as paused. My next steps without a free slot to try are probably going to be: cancel the rebuild ("Read Check"); stop the array and power down; place a disk (the old 6tb? another different 8tb?) into the Disk 12 slot; try a rebuild again today (since I'm guessing unraid trying to turn the old 6tb into an 8tb but failing mid-rebuild means I can't simply re-insert the old 6tb and have unraid automatically go back to the old configuration?). Any reasons why I shouldn't, other than the fact that I'm playing with fire again with another disk potentially dying while I'm doing all these rebuilds? I'm starting to think my only options are firedancing or waiting who knows how long for an appropriate hotswap cage replacement and crossing my fingers that I'll physically rebuild everything fine (and I'm almost more willing to lose a data disk's data than risk messing up my entire operation).
  4. Unfortunately not - it's an old box (first built in 2011, I want to say?), four 5-slot Norco SS-500 hotswap cages stacked on each other in the front. Nothing ever really moves around behind the cages, and the only cable movement I can recall since I first built it was unplugging/replugging the cages' breakout cables when replacing the Marvell cards with LSIs back in December (and these issues with disk 12 started occurring maybe a quarter of a year later). The hotswap cage containing Disk 12's slot is the second up from the bottom, and could be a massive pain to replace (presuming I can find a replacement of such an old model, or one that doesn't mess up the physical spacing of the other 3 hotswap cages). Edit 2: any chance the rebuild stopping at *exactly* 6tb could be significant? Feels like a bizarre coincidence.
  5. Soooooo something may be up with the Disk 12 slot. That 6tb couldn't finish an extended smart test, so I dropped what I was pretty sure was a fine 8TB (precleared and SMART ok after being used in another box for a couple of years) into the slot for the rebuild. Had a choice between using an SMR Seagate and a CMR WD, and used the WD. The rebuild was, interestingly, exactly 75% complete (right at the 6tb mark) when the new 8tb in the Disk 12 slot started throwing up 1024 read errors and got disabled. My instinct's to throw another 8TB spare in the slot and try again, but something feels weird, so here's the diagnostics. Am I reaching a point where something's likely wrong with the hotswap cage and I'm going to need to buy / replace that whole thing again? tower-diagnostics-20200605-0534.zip
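For anyone wanting to skim a diagnostics bundle like the one above for the slot's read errors from a shell, something like the following works; the zip name is taken from the post, but the paths inside the bundle are assumptions (layouts vary a bit between Unraid releases), so treat this as a sketch rather than a recipe.

```shell
# Unpack the diagnostics bundle (name from the post above).
unzip -o tower-diagnostics-20200605-0534.zip -d diag

# Scan everything for the classic md/ata read-error lines; the
# logs/ subfolder path is an assumption about the bundle layout.
grep -ri "read error" diag | head -n 20
grep -ri "ata.*error" diag/*/logs/ 2>/dev/null | head -n 20
```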
  6. OK, running the extended test now - hate that it's consistently throwing up errors, and I need to replace a 6tb soon anyway, but I definitely don't want to throw out disks unnecessarily during what could be a weird economic time for getting new disks. Thanks for the quick response!
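For anyone following along from the command line, the extended test the GUI kicks off can also be run directly with smartctl; /dev/sdX below is a placeholder for whatever device Disk 12 maps to, not a value from the thread.

```shell
# Start a SMART extended (long) self-test; /dev/sdX is a placeholder.
smartctl -t long /dev/sdX

# Show capabilities, including the estimated completion time for the test.
smartctl -c /dev/sdX

# After it finishes (or aborts), the self-test log and attributes show
# whether it completed or died at a particular LBA.
smartctl -a /dev/sdX
```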
  7. The sync (vs disk) correcting parity check was a total brain fart on my end, and I'm hoping it turned out okay (no error messages but I'll go back to check the underlying data as soon as I can). I was just writing to disk 12 and the GUI threw up a read error, so I immediately pulled diagnostics to send here. I have a precleared 8tb spare ready to replace Disk 12's 6tb, and I'm leaning towards just shutting down and throwing that thing in there to start a Disk 12 rebuild/upgrade now - any reasons I shouldn't do that in terms of better-safe-than-sorry? Thanks for all the guidance! tower-diagnostics-20200604-1054.zip
  8. Weird Disk12 happenings again. I had an unclean shutdown when someone accidentally hit the power button on my UPS, which powered two unraid boxes. One booted back up and prompted me to parity check. The other (this one) weirdly gave me the option for a clean shutdown, which I took, then started back up. No visible issues, but I felt paranoid, so I ran a non-correcting parity check before modifying any files. ~200 read errors on Disk 12. Ran a correcting parity check. Tried collecting diagnostics at every possible opportunity in case anything weird turns up that someone else might notice:
     5-27: right after the "unclean" / clean shutdown
     5-29: after the non-correcting parity check
     5-30: after the correcting parity check
     tower-diagnostics-20200530-2053.zip tower-diagnostics-20200529-2000.zip tower-diagnostics-20200527-1017.zip
  9. Thought I'd update in case it helps anyone else searching threads: the 3.3V tape trick worked, so I'm not sure what the root problem was, but if anyone has these drives working in some SS-500s but not others, rest assured the tape trick should work on those other SS-500 cages.
  10. Nice. So the seeming disk-after-disk issues associated with slot #12 are probably just coincidental? Both the 166k error drive from March and the swiftly-disabled disk this month were pretty old (the latter being a white label I got maybe 4 years ago?), so it makes sense, but the recurrence of #12 issues definitely caught my attention in a single-parity setup.
  11. 5/12 (Diagnostics After 299 Sync Errors Non-Correcting Check)
      5/13 (Diagnostics After Correcting Check)
      5/14 (Diagnostics After Final, Non-Correcting Check)
      Hope these help figure out what's going on with the 12 slot (if anything!)
      tower-diagnostics-20200514-0549-FINAL-NONC-CHK.zip tower-diagnostics-20200513-0732-AFTER-CORR-CHK.zip tower-diagnostics-20200512-1054-AFTER-299ERROR-NONC-CHK.zip
  12. Added to the plan: extra diagnostics sets. I'll report back here with those in ~48 hours or so. Thanks a ton!
  13. That makes sense - trick is, I haven't run a correcting check since the one back in March described above. The check I ran after installing the replacement drive on Sunday/Monday was non-correcting, and that's the same one that's finishing up right now.
      It does sound like now's the time to run a correcting parity check, with a plan to run a non-correcting check after it (two checks total, starting this morning) to make sure I don't have a bigger issue specific to Disk 12's hotswap cage, given the consistent issues across disks that may or may not be coincidentally occurring there. (Really, really hope I don't need to replace a middle-of-the-tower hotswap cage in a pandemic, but it's technically easier than moving everything to a new build...) Thank you both for the help and guidance, JB & trurl!
  14. Sounds like a plan: check's almost done and about to start another one. Presuming it's best to run a non-correcting one to be safe - or should I run this one as correcting, then run another to see if new (vs additional) sync errors appear? Edit: the sync errors stopped growing after they hit 299. Looks like they've stayed stable there overnight and the check's almost done, so definitely a lower volume of errors than last time Disk 12 (or its hotswap slot) started going screwy.
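For reference, the correcting vs non-correcting checks discussed throughout this thread can also be started from the Unraid console with mdcmd; this is a sketch based on stock Unraid builds, and the exact syntax and status field names may vary by release.

```shell
# Hedged sketch of Unraid's parity-check CLI (the GUI buttons do the same).
mdcmd check NOCORRECT        # non-correcting: count sync errors, write nothing
mdcmd check                  # correcting: rewrite parity wherever it mismatches
mdcmd status | grep -i sync  # running state and current sync-error count
```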