Everything posted by siege801

  1. I'm thinking of this post as the Solution.
  2. Ok, great to know. Next step for me (likely outside the scope of this thread) is to identify the cause of the Fix Common Problems alert about /mnt. This had been spamming my unRAID emails and meant I missed the more critical errors pertaining to the disks/array. @trurl, I can't thank you enough. I've sent a small token via your donate link. It's minuscule in comparison, but hopefully it can in some way support you in being the helping soldier to others. Thank you, thank you. I'll scan back through this thread and try to find which of your posts should be marked as the Solution. Any thoughts on which would be most appropriate?
  3. Thanks @trurl. Can you advise whether this needs to be addressed?
  4. Alright, that looks much better. I pulled out the existing PSU and upgraded to a unit with enough SATA connectors that I don't need splitters. It's also a more powerful PSU. So whether it was an under-powering issue or a splitter issue, I may never know, but for now it looks good. Looking at the diagnostics attached, what other cleanup is required? lucindraid-diagnostics-20240317-1749.zip
  5. Correct. That I can't remember; I'll have to check. I have been wondering whether a larger PSU / a PSU with more built-in cables would be necessary. Until I can get the cabling checked, am I right in thinking I'm still running with at least one parity drive?
  6. Hi @trurl, Ok, I've formatted Disk 5 and it's back in the array. Then, I stopped the array to unassign Parity 2 and Disk 6. I then started the array and let the rebuild commence. However, it's just paused itself. Diagnostics attached. lucindraid-diagnostics-20240308-0759.zip
  7. I'll do it whichever way you think is safest and gets me back to having some parity protection. Breaking down and clarifying the steps: #1 - Format Disk 5. I understand this as being WFL5W6LK, and I can see this disk listed under Array Operations as you suggested I would. I simply tick the box that says "Yes, I want to do this" and proceed with the format. Then complete step #5 from the "Rebuild to replacement" guide, namely "Assign the replacement disk(s) using the Unraid webGui." I note this guide states: I don't think any of my disks are emulated any more, but I know one certainly was for a time. Am I ok to proceed with the above steps at this point in time?
  8. When you can, could you confirm what my next steps need to be?
  9. Oh right, I misunderstood that re-adding was "adding", but I see the distinction. The 2Tb disk is a "new" disk but it's replacing one that failed a while ago. The 12Tb parity I believe is emulated and just needs to have parity refreshed on it. And the 12Tb that is currently mounted through UD is going back to the array as it was before.
  10. Thanks @trurl. From what I've read, I believe I can only do one operation at a time. From the documentation: NOTE: You cannot add a parity disk(s) and data disk(s) at the same time in a single operation. This needs to be split into two separate steps, one to add parity and the other to add additional data space. Just to clarify, the intended outcome is to have 2x 12Tb parity and the rest as data. Thanks again! lucindraid-diagnostics-20240229-1716.zip
  11. Hi @trurl, again, I want to repeat how thankful I am for your help. I fully intend to drop a donation via your link. I've gone ahead and recovered what I need/want from the lost+found. Would you be able to give a little more guidance on how I now get the two disks back into the array?
  12. This is real progress! Thank you so much again for the help so far. I'm very comfortable on the command line. I've just been working through the output of: du -ahx --max-depth=1 /mnt/disk6/lost+found/ | sort -k1 -rh | less So far I've determined: 3.3Tb in both /mnt/user/lost+found and /mnt/disk6/lost+found - presumably the same data, but this is to be confirmed. Approximately 9,800 sub-directories within /mnt/disk6/lost+found. Approximately 7,500 of these have a directory size > 0. 1.9Tb of this is Virtual Machine images that I have backed up anyway. Notably, I have backups of the irreplaceable data, but there is further data that is not economically feasible to back up. With that said, the more of it that I don't have to acquire through alternate means, the better. I can spend the day working through the contents of the other sizeable sub-directories of lost+found and come back to you once I've retrieved what is feasibly useful. Questions: Is it safe to leave the array running? I've stopped the Docker service. Also, is it safe to move content from /mnt/disk6/lost+found into the correct location under /mnt/user/? (A move sketch follows after this list.) In case I haven't mentioned, my sincere gratitude for your guidance so far!
  13. Diagnostics attached. lucindraid-diagnostics-20240119-0142.zip
  14. And done. Output looks reasonably clean. unRAID file system check without -n second run - disk 6.txt
  15. Ok, I've done that. It looks like it wants me to run that again? (A re-run sketch follows after this list.) End of output: Metadata corruption detected at 0x453030, xfs_bmbt block 0x10081ce0/0x1000 libxfs_bwrite: write verifier failed on xfs_bmbt bno 0x10081ce0/0x8 xfs_repair: Releasing dirty buffer to free list! cache_purge: shake on cache 0x50c6f0 left 5 nodes!? xfs_repair: Refusing to write a corrupt buffer to the data device! xfs_repair: Lost a write to the data device! fatal error -- File system metadata writeout failed, err=117. Re-run xfs_repair. Full output: unRAID file system check without -n - disk 6.txt
  16. Further updates have been moved to this thread:
  17. I've stopped the array and removed B5SX from it. I then mounted the same disk as an unassigned device, and it appears to be mounted without a problem (I can browse the directory structure). Looking at your last post, it seems you want me to "New Config B5SX back into the array as disk6 and rebuild parity instead." Could I trouble you for a breakdown of this process? Also, before I let myself get too excited that the disk is mountable: this is the disk that was showing as emulated back in September. What are the chances the mounted data is actually up to date and useful? (But alas, I'll try not to get ahead of where we're at.)
  18. Thanks for continuing to help. I really do appreciate it. I've attached the output from Disk #6 as a text file. Disk #5 is still running, but has so far only produced: Phase 1 - find and verify superblock... couldn't verify primary superblock - not enough secondary superblocks with matching geometry !!! attempting to find secondary superblock... <snip - a long line of dots> and beneath this it says the check is still Running. Neither output looks promising to me, which is sad. Just a reminder, Disk #5 has only just been added. I assume that is why it's in a broken state, as it likely hasn't been cleanly pulled into the array. unRAID file system check - disk 6.txt
  19. Thanks for the reply, here is the diagnostic with the array running in normal mode. Late-night thinking got me to realise that before I started working on the NAS this week, the situation was already: Parity 1 (12Tb) - OK; Parity 2 (12Tb) - OK; Disk 1 (2Tb) - OK; Disk 2 (2Tb) - OK; Disk 3 (2Tb) - OK; Disk 4 (2Tb) - OK; Disk 5 (2Tb) - removed (failed months ago and was removed while waiting for a warranty replacement); Disk 6 (12Tb) - in some kind of broken state (see the original post in the other thread). This week, in one move, I re-added Disk 5 with the brand-new disk and checked all cables, especially re-seating Disk 6's cables. I believe now that I shouldn't have re-added Disk 5 until the array was stable with Disk 6 first. BUT... until this recent work, I still apparently had two stable parity disks and one broken data disk. Even IF Parity #2 is faulty, this situation should still be within my fault tolerance, correct? Therefore, should I re-remove Disk 5 and allow the array to try to rebuild Disk #6 using Parity #1? Edit: FYI - the SMART Extended Self-Test completed without error on Parity Disk #2. I'll await your response(s), and thank you again for taking the time. lucindraid-diagnostics-20240117-1011.zip
  20. Hi everyone, I'm desperately looking for some help getting my data and array back to a working state. I've started a new topic because the original post I created was started months ago and I think it's being missed by the wizards in here. I link the original post here because it has the full details and history of what's been tried and tested. Essentially, I'm now at a stage where Disk #5 (2Tb) and Disk #6 (12Tb) are "unformatted". This makes sense for the 2Tb given it's new, whereas the 12Tb is the drive that was originally causing trouble at the beginning of the above post. But the real clanger at the moment is that the second parity disk (12Tb) is in a disabled state. I dare not make any other changes or attempts because I feel like I'm close to irreparable damage. I welcome any good news and guidance you can all bring. lucindraid-diagnostics-20240117-0141.zip
  21. Update #9: I seem to be missing significant amounts of data on the resumed array, which I assume is because I need at least either the 12Tb Parity #2 disk or the other 12Tb array disk online and functional in order to rebuild the array with all data. I'm going to try connecting the Parity #2 disk directly to the motherboard's onboard SATA to see if I get read errors that way. Previously when I connected it this way (see update #6) I did not receive read errors, but given the slower interface, the rebuild was threatening to take days. Maybe that's what I'm forced to endure? Update #10: Parity Disk #2 shows "Parity device is offline". I've got it running an extended SMART test as per the suggestion above. I'm going to drop in on the Discord group to see what advice I can get.
  22. Thanks! Based on the above, do you agree it's likely failing? Also, do you have any input regarding the two unformatted disks?
  23. Update #1: With the number of read errors on Parity disk #2, I figure that disk is not detecting correctly. I've powered down to re-check the cables again. Update #2: Checked the cables on Parity #2; they seem fine. Booted back up and started the sync/rebuild, same issue with numerous read errors on Parity #2. I'm powering down now to swap cables with another disk. Update #3: With the server up on the desk, I noticed what sounds like a disk repeatedly powering down / powering up coming from Parity #2. I've connected Parity #2 to a completely different power channel and the same sound occurs, and the same read errors occur. I'm beginning to suspect a failed disk. To be sure, I'll swap the data cable as well and try again. Update #4: With the power cable swapped to a previously unused power channel, and the data cable swapped with a known-good line, the issue has remained with Parity Disk #2. Unless someone has suggestions, I can't think of any cause other than a failure on this disk. I'm going to pull it out of the server, connect it to a 3.5" disk dock, and see if the power off/on sounds persist. Update #5: Having connected the disk to a USB disk dock, it seems to be running stably. The disk is currently running a "long" smartctl test (smartctl -t long). The short test passed. I'm now beginning to suspect a power supply issue in the NAS. Update #6: Sigh. I stopped the long test and connected the disk back to the NAS via a SATA cable directly to the motherboard (not the LSI RAID card). It seems to be stable. Maybe I'm back to a RAID cable issue? Update #7: I plugged the disk back into the LSI on the same port. The disk detects and can run a short test through unRAID's GUI. However, attempting to run the rebuild again returns numerous read errors on the disk (Parity #2). So I powered down again and swapped the bases of the RAID breakout cables on the LSI ports. (To clarify, the LSI has two ports for the 4-way split SATA cables; I've now swapped Cable A from LSI port 1 to LSI port 2.) Still Parity Disk #2 gives read errors, despite being on a different RAID port on the LSI. So I then scrambled the SATA cables across four disks on the same RAID channel, booted up, and attempted the rebuild. The rebuild bombed out in seconds and Parity disk #2 immediately went offline. Can I be convinced the disk is failing/has failed yet? Update #8: Ok, so the array rebuild completed with Parity Disk #2 offline. As a reminder, compared to where I started, Parity Disk #2 is now powered by a different power channel from the PSU, on a different RAID port on the LSI, and using a different SATA cable. I think I can be convinced the disk is at fault here? HOWEVER... Having brought the array online, I'm faced with the new 2Tb drive and the existing Disk #6 (12Tb) showing as Unmountable: Unsupported or no file system. I have no idea how to safely proceed from here. My outstanding issues are: Disks 5 and 6 showing as unmountable. Parity Disk #2 status unknown. How can I determine with enough certainty that the disk is faulty to make a warranty claim? (A smartctl sketch follows after this list.) For now, I've pulled the array back down and will await the wisdom of the wizards in here. Please help. Updated diagnostics attached. lucindraid-diagnostics-20240114-1641.zip
  24. So I finally got around to this (newborn entered the house, and digital life went rightly out the window). I pulled it open and checked all cables, re-seating specifically the cable connecting to the drive that was showing as disabled. Everything seemed fine. I also added a new 2Tb disk to replace the missing disk. Currently the array is rebuilding, but it seems to be stuck. The drive stats all seem to be remaining the same, and the percentage complete hasn't moved in over 30 minutes. Updated diagnostics attached. Any guidance would be greatly appreciated. lucindraid-diagnostics-20240114-1312.zip
  25. Thanks JorgeB. So the advice (at this stage) is to pull her open and check the cables?
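
For post 12, on moving recovered content out of lost+found: a minimal sketch, assuming "Media" is a hypothetical share that already exists on disk6 and "12345" stands in for one of the numbered lost+found entries (neither name comes from this thread).

    # Keep source and destination on the same disk path; mixing /mnt/diskX and
    # /mnt/user paths for the same files is a known way to lose data on unRAID.
    mkdir -p /mnt/disk6/Media/restored
    mv /mnt/disk6/lost+found/12345 /mnt/disk6/Media/restored/
    # User shares are built from the top-level folders on each disk, so anything
    # moved into /mnt/disk6/Media also appears under /mnt/user/Media.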
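
For post 15, where the output ends with "Re-run xfs_repair": a minimal sketch of the re-run, with the array started in Maintenance mode. The device name is an assumption: older unRAID releases expose Disk 6 as /dev/md6, newer ones as /dev/md6p1, and the "Check Filesystem Status" section on the disk's settings page wraps the same call.

    # Repair pass without -n (the dry-run flag), as the previous output requested.
    # Running against the md device keeps parity in sync with the repair.
    xfs_repair -v /dev/md6
    # If it refuses to run because of a dirty log and suggests -L, note that -L
    # zeroes the log and can discard the most recent metadata updates, so it is
    # worth confirming in the thread before using it.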
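
For post 23, on deciding whether Parity #2 is faulty enough for a warranty claim: a minimal sketch for reading the SMART results after the extended test. /dev/sdX is a placeholder for whatever device node Parity #2 currently has, not a name taken from the diagnostics.

    smartctl -l selftest /dev/sdX   # self-test log; the extended test should read "Completed without error"
    smartctl -a /dev/sdX            # full attribute dump
    # Reallocated_Sector_Ct, Current_Pending_Sector and Offline_Uncorrectable are
    # the attributes usually cited for a warranty claim; a rising
    # UDMA_CRC_Error_Count points at cabling rather than the disk itself.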