KRiSX

Community Answers

  1. KRiSX's post in Moving drives from onboard SATA to PCIe Card was marked as the answer   
    Drives are recognised based on their serial numbers, so as long as the drives are reported the same way it should be fine (a quick way to double-check this is sketched at the bottom of this page).
  2. KRiSX's post in 4 drives randomly kicked up errors, including 2 parity drives - what to do next? was marked as the answer   
    Hey all, bit freaked out right now. I have been fast approaching finishing my build and transferring everything across from my old setup when I've just hit a whole boatload of errors on both parity drives and 2 of the drives that had data copying to them. I've stopped doing any transfers and don't want to touch anything until I know what to do next. Logs attached.
     
    Essentially I'm seeing a whole heap of "md: diskX write error" messages. I am now also seeing a lot of "Failing async write on buffer block" messages.
     
    For the record, all of the drives that now show errors have been in service for years without issue. I didn't pre-clear them; I just let Unraid do its clear and then formatted them, as that seems acceptable with known good drives.
     
    Hopefully this isn't the end of the world and I can simply resolve it with a parity rebuild or something along those lines.
     
    UPDATE #1: The array appears to not be allowing any writes at this point either. I'm going to stop all my Docker containers and have kicked off extended SMART tests on the 2 x 6TB drives that have thrown errors (a rough command-line sketch for this is at the bottom of this page) - I have done a short test on the 2 x 8TB parity drives and they are showing as OK, but maybe I need to do an extended test on those too?

    UPDATE #2: I've stopped the array as it was non-stop generating the same "Failing async write on buffer block" lines in the logs for 10 different blocks. When stopping the array I also noticed "XFS (md13): I/O Error Detected. Shutting down filesystem" and "XFS (md13): Please unmount the filesystem and rectify the problem(s)" - so perhaps disk 13 really isn't as good as I thought?
     
    UPDATE #3: Restarted the array to see what would happen. The array started, appears to be writable now, and no errors are being produced in the logs - parity is offline. I'm going to keep everything else (Docker) shut down until the SMART tests are complete on the 2 x 6TB drives, unless someone advises me otherwise.
     
    UPDATE #4: Looking at the logs a bit harder, it seems my controller (Adaptec 6805) had a bit of a meltdown, which I think is why the errors occurred. I've since restarted the server, which has cleared all the errors, but parity is still disabled. I'm going to continue running without parity until the extended SMART test finishes on the 2 x 6TB drives, and at this point may just keep it disabled until I've finished moving data across anyway. I also ran XFS checks on each disk to be sure they were all healthy. Not sure there is much else to do apart from waiting for the scans to finish and then rebuilding parity. I would still appreciate any feedback anyone may have.

    I also found this article... it seems old... but I confirmed the timeout is set to 60. Would changing it per drive as instructed cause any issues? (A sketch for checking and changing the per-drive timeout is at the bottom of this page.) https://ask.adaptec.com/app/answers/detail/a_id/15357/~/error%3A-aacraid%3A-host-adapter-abort-request
     
     
    newbehemoth-diagnostics-20220218-1114.zip
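
A quick way to double-check the serial-number point from the first answer above: Linux (and therefore Unraid) exposes /dev/disk/by-id symlinks whose names embed each drive's model and serial number, and those names stay the same regardless of which controller the drive hangs off. A minimal sketch, assuming a standard /dev/disk/by-id layout - run it before and after the controller swap and compare the output:

```python
#!/usr/bin/env python3
"""List whole-disk /dev/disk/by-id entries (model + serial) and the device
each one currently points to, so the output can be compared before and
after moving drives to a different controller."""
import os

BY_ID = "/dev/disk/by-id"

for name in sorted(os.listdir(BY_ID)):
    # Skip partition links and WWN duplicates; keep the whole-disk names.
    if "-part" in name or name.startswith("wwn-"):
        continue
    target = os.path.realpath(os.path.join(BY_ID, name))
    print(f"{name} -> {target}")
```

If the same names show up after the move, Unraid should see the drives exactly as before, even if the /dev/sdX letters change.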
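On the SMART side (UPDATE #1), the extended tests can be started and checked from the command line with smartctl. A rough sketch, assuming smartmontools is installed and the drives appear as plain /dev/sdX devices (behind some RAID controllers smartctl needs an extra -d option, so adjust for your setup); the device names below are just placeholders:

```python
#!/usr/bin/env python3
"""Start a long (extended) SMART self-test on the given drives and print
their health status and self-test log. Run as root."""
import subprocess
import sys

# Pass drives on the command line, e.g. ./smart_test.py /dev/sdb /dev/sdc
drives = sys.argv[1:] or ["/dev/sdb", "/dev/sdc"]  # placeholder defaults

for dev in drives:
    # Kick off the extended self-test (runs in the background on the drive).
    subprocess.run(["smartctl", "-t", "long", dev], check=False)
    # Show overall health plus the self-test log to check progress/results.
    subprocess.run(["smartctl", "-H", "-l", "selftest", dev], check=False)
```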
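As for the per-drive timeout from the Adaptec article (the "60" mentioned above), on Linux it lives in sysfs at /sys/block/sdX/device/timeout and can be read, and changed as root, without any vendor tooling. A minimal sketch that only prints the current values; the write is left commented out because the right value depends on what the controller vendor actually recommends:

```python
#!/usr/bin/env python3
"""Print the SCSI command timeout (in seconds) for each /dev/sdX drive.
The value lives in /sys/block/<dev>/device/timeout; writing a new number
there as root changes it until the next reboot."""
import glob

for path in sorted(glob.glob("/sys/block/sd*/device/timeout")):
    dev = path.split("/")[3]  # e.g. "sdb"
    with open(path) as f:
        print(f"{dev}: {f.read().strip()} seconds")

# Example of changing it (the value is a placeholder - check what the
# controller vendor actually recommends before applying anything):
# with open("/sys/block/sdb/device/timeout", "w") as f:
#     f.write("90")
```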