Posts posted by unravelit

1. Thanks again for your help - you guided me through to a working solution. The parity sync completed a few hours ago, and aside from a handful of errors on one disk, all is well.

     

That disk with the few errors is well overdue for replacement anyway, as it has been online for over 7 years (I only realised when checking its SMART info!). Now things have settled, I can work on replacing the parity drive with a newer, larger drive, and use the "old" parity drive to replace the very old drive...
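If anyone wants to do the same age check, the power-on count lives in SMART attribute 9 (Power_On_Hours). A minimal sketch below works on a sample line in the format `smartctl -A` prints; the sample values are made up, and on a real system you would pipe `smartctl -A /dev/sdX` (device name is a placeholder) into the awk instead:

```shell
# Sample Power_On_Hours line in the layout smartctl -A emits; the raw hour
# count is in column 10. On a live system, replace the printf with:
#   smartctl -A /dev/sdX | awk ...
sample='  9 Power_On_Hours          0x0032   037   037   000    Old_age   Always       -       61320'

# 8760 hours per year converts the raw count into the drive's age in years.
printf '%s\n' "$sample" | awk '/Power_On_Hours/ {printf "%.1f years online\n", $10/8760}'
```

61320 hours works out to exactly 7.0 years, which is roughly where my old disk sits.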

     

    I really appreciate your patience and help.

     

    Cheers!

  2. OK, making progress, -L completed:

     

    Phase 1 - find and verify superblock...

    Phase 2 - using internal log - zero log...

    ALERT: The filesystem has valuable metadata changes in a log which is being destroyed because the -L option was used.

    - scan filesystem freespace and inode maps...

    clearing needsrepair flag and regenerating metadata

    - found root inode chunk

Phase 3 - for each AG...

    - scan and clear agi unlinked lists...

    - process known inodes and perform inode discovery...

    - agno = 0

    - agno = 1

    - agno = 2

    - agno = 3

    - agno = 4

    - agno = 5

    - agno = 6

    - agno = 7

    - process newly discovered inodes...

    Phase 4 - check for duplicate blocks...

    - setting up duplicate extent list...

    - check for inodes claiming duplicate blocks...

    - agno = 0

    - agno = 1

    - agno = 2

    - agno = 3

    - agno = 4

    - agno = 5

    - agno = 6

    - agno = 7

    Phase 5 - rebuild AG headers and trees...

    - reset superblock...

    Phase 6 - check inode connectivity...

    - resetting contents of realtime bitmap and summary inodes

    - traversing filesystem ...

    - traversal finished ...

    - moving disconnected inodes to lost+found ...

    Phase 7 - verify and correct link counts...

    Maximum metadata LSN (67:3647140) is ahead of log (1:2).

    Format log to cycle 70.

    done

     

     

    I have not brought the array out of maintenance mode yet.
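Since Phase 6 reported moving disconnected inodes to lost+found, once the array is out of maintenance mode it is worth looking there for orphaned files. A small sketch, assuming disk2 was the repaired disk (the mount point is a placeholder - use whichever disk you actually repaired):

```shell
# Placeholder path: lost+found on the disk that xfs_repair just fixed.
lostdir=/mnt/disk2/lost+found

# If the directory exists, xfs_repair orphaned some files there; list them
# so they can be identified and moved back. If it does not exist, nothing
# was orphaned.
if [ -d "$lostdir" ]; then
    ls -la "$lostdir"
else
    echo "no lost+found directory - nothing was orphaned"
fi
```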

     

     

  3. I really appreciate your time with this. Running the check via the GUI gave this message...

     

Phase 1 - find and verify superblock...

    Phase 2 - using internal log - zero log...

    ERROR: The filesystem has valuable metadata changes in a log which needs to be replayed. Mount the filesystem to replay the log, and unmount it before re-running xfs_repair. If you are unable to mount the filesystem, then use the -L option to destroy the log and attempt a repair.

    Note that destroying the log may cause corruption -- please attempt a mount of the filesystem before doing this.
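For anyone landing here with the same message, the sequence it asks for can be sketched as a small helper that wraps the stock mount/umount/xfs_repair commands and only falls back to -L when the mount genuinely fails. The device and mount point are placeholders; on Unraid you would run this in maintenance mode against the /dev/mdX device for the affected disk:

```shell
# Try to replay the XFS log by mounting, and only zero it (-L) as a last
# resort, exactly as the xfs_repair error message advises.
repair_xfs() {
    dev="$1"   # e.g. /dev/md2 (placeholder)
    mnt="$2"   # a scratch mount point (placeholder)
    if mount -t xfs "$dev" "$mnt" 2>/dev/null; then
        # Mount succeeded: the kernel has replayed the log, so unmount
        # cleanly and run a normal repair.
        umount "$mnt"
        xfs_repair "$dev"
    else
        # Mount failed: destroy the log. This can lose the most recent
        # metadata changes, which is why the mount is attempted first.
        xfs_repair -L "$dev"
    fi
}
```

Calling it as `repair_xfs /dev/md2 /mnt/scratch` mirrors the manual steps; the -L branch is deliberately the last thing tried.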

  4. Hey all,

     

I have had Unraid running for years on my trusty HP Gen3 MicroServer. I moved the drives to a newer G7 system.

     

I had to do a New Config thanks to the SAS card fooling Unraid into thinking I had different serial numbers. So after placing the drives in their correct slots and starting the array with "parity is valid" checked, it was happy and I was able to see all the files.

     

    All seemed well until I worked out 2 of the 4 SAS channels in use are throwing CRC errors...

     

Originally it was only one, so I relocated that drive to a known good bay and it was happy. Then, in what was most likely a stupid thing to do, I let Unraid start a parity sync, as I thought it would be a good way to confirm I won't be getting any more CRC errors...

     

It did not go well: another drive threw so many errors during the sync that Unraid has now disabled it, and the parity sync has paused. Looking closer, this drive may actually be faulty beyond my original CRC issue.

     

So now I feel like I am in a precarious position, as my parity is in an unknown state and 30% of my files are not visible (i.e. Unraid is not emulating the disabled drive). That is my biggest worry - that the missing drive is not being emulated...

     

What should my next step be? I am happy to pull the potentially faulty drive and manually recover the files from it later, but am unsure how to go about this while keeping the rest of the current setup running.

     

I have a new 10 TB drive I was going to use to replace the parity drive once things had settled down... looks like things are just not going to plan...

     

     

    unraid disabled drive.png

    stuff2-diagnostics-20231107-1410.zip

  5. Last Chapter:

     

So the final story is that I must have had a loose power splitter on the parity drive, which became intermittent after the data drive upgrade. Thanks to the help here I was able to sort out the parity drive, repair parity, and finally install the new data drive and rebuild it.

     

    Unraid Parity sync / Data rebuild: 10-11-2019 17:41

    Notice [STUFF2] - Parity sync / Data rebuild finished (0 errors)
    Duration: 18 hours, 48 minutes, 23 seconds. Average speed: 118.2 MB/s

     

    All working. Thanks!

6. Just an update: I was in bed last night (it was already near 3 am here!) when I got the message, so I stopped the check and restarted it with "write corrections to parity" enabled.

     

So far the parity drive is still healthy, no slowdowns. It is at 32.9 percent, running at ~116 MB/sec (normal for this rig). 7 hrs 39 mins in, 12 to go.

     

1158 sync errors repaired; the number has not grown in the past hour or so I have been watching it.

7. OK, so I pulled the little ProLiant out and unplugged and REALLY plugged everything back in - it is an annoyingly tight case. Aaaaand.... the parity drive is back - well, at least it is visible, though not assigned. And it is not misbehaving right now (the spin up/down business), so I grabbed a diag.

     

     

My face is red; good thing you can't see it. I was sure everything was fine, and had "checked" everything several times. Just... not well.

     

    stuff2-diagnostics-20191108-1353.zip

  8. 3 minutes ago, trurl said:

    If we could get SMART for parity and it turned out to be OK then you could just proceed with your original plan. Check all connections, SATA and power, both ends. Make sure to check any power splitters along the way. Then post another diagnostic.

     

    If there really was a problem with parity, Unraid may have been able to tell you this before you tried to replace the other disk. Did you check the Dashboard for any SMART warnings?

     

    Do you have Notifications configured to alert you immediately by email or other agent when Unraid detects a problem?

    I have checked and rechecked the parity drive connections.  I will do it one more time :D

     

I have email notifications set up for immediate alerts - the parity drive gave me an overtemp warning today (it got up to 46 degrees), and does so on occasion when the weather is hot and the air con is off. No other issues reported. I had received a normal-temp alert before doing the changes.

     

    Prior to the change over work, all drives were "thumbs up" for SMART.

     

I would try it in an external bay to pull SMART, but the one I have only supports 2 TB max.

9. Apologies, I meant to say this in my first post but got distracted... Originally I had no idea the parity drive was sick, so I had assigned the new disk to slot 2 and started the array.

     

It all looked right and was working, but I noted the rebuild was going to take 200+ days - the write speed was only around 400 KB/sec. That is when I found the parity drive was dying: constantly spinning down and restarting.

     

    I stopped the rebuild.

     

    Now I am tempted to put back the original drive in slot 2, but I am thinking that unraid won't know what to do with it.

     

    Cheers

  10. 15 minutes ago, trurl said:

    Just to make sure there is no misunderstanding, what exactly do you mean by "removing it from the array"?

    got it - I was following the usual steps for replacing a drive - I did this:

    Stop the array

    Unassign the old drive if still assigned (to unassign, set it to No Device)

     

    then did a clean shutdown.

     

    Diag zip added. It took a while to boot, and the parity drive is now not even showing up. Looks like it went from sick to dead...

     

     

    stuff2-diagnostics-20191108-1140.zip

  11. Hey all,

     

    A minor issue, but one that has caused me to haul on the brakes until I get advice!

     

I pulled a 2 TB data drive (after removing it from the array) and replaced it with an 8 TB unit. Unfortunately, on power-up my older Seagate 8 TB drive, which is the parity drive, was doing the good old spin-up-and-down routine and took forever to come up. It does come up, but then throws read errors and goes offline. SMART stupidly considers the drive healthy, but it obviously is sick.

     

    I am so far not too concerned, as I think I can just "go back" to having the original 2tb drive back in, and replace the parity drive with my new 8tb drive.

     

But I also think I need to convince Unraid that the 2 TB drive belongs in the array, so it doesn't try to emulate it with a dead parity drive?! Since removing the 2 TB drive from the array, it has been untouched.

     

I'm running the latest stable release (I checked for updates prior to this excitement and was already up to date).

     

    What do I need to do to achieve this? Any advice is greatly appreciated.
