October 23, 2018
Good morning, I had a disk fail, which I replaced. However, sometime before the parity sync/rebuild finished there was a power outage. When I booted Unraid back up, the disk assignments were lost. Through my research it sounds like it's just critical that I get the parity disk correct; the other disks can be put in any slot without negative consequence. So I determined which disk was the parity and put it in the parity slot, with my other disks all in the disk # slots. What I am unsure of is whether I should now start the array as normal, start it with "parity is already valid" selected, or do something else entirely. What's throwing me off is that all of the disks are recognized as a "New Device" right now (blue square). I want it to rebuild the data on the failed disk, and trust the data on the others and the parity. How do I go about this without destroying everything? Thanks!
October 23, 2018, Community Expert
25 minutes ago, daemian said: "Through my research it sounds like it's just critical that I get the parity disk correct"
Yes, in a normal situation; during a rebuild it's not so simple. Start by posting the release you're using and whether you have single or dual parity, or post the diagnostics.
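A note on why single versus dual parity matters here. The following is a minimal toy sketch, assuming that single parity is a plain bytewise XOR across the data disks (the function names and the four-byte "disks" are made up for illustration). Under that assumption the slot order of the data disks does not change the parity, and one missing disk can be recomputed from parity plus the survivors. Dual parity adds a second calculation that is not covered by this sketch.

```python
from functools import reduce
from itertools import permutations


def xor_parity(disks):
    """Bytewise XOR of equally sized byte strings (toy single parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*disks))


def rebuild(parity, surviving_disks):
    """Recompute the one missing disk from parity plus all surviving disks."""
    return xor_parity([parity, *surviving_disks])


# Toy "disks": four bytes each instead of terabytes.
disk1 = bytes([0x10, 0x20, 0x30, 0x40])
disk2 = bytes([0x01, 0x02, 0x03, 0x04])
disk3 = bytes([0xAA, 0xBB, 0xCC, 0xDD])

parity = xor_parity([disk1, disk2, disk3])

# XOR is commutative, so any ordering of the data disks gives the same parity.
assert all(xor_parity(list(p)) == parity for p in permutations([disk1, disk2, disk3]))

# With disk1 missing, parity XOR the remaining disks yields disk1 again.
assert rebuild(parity, [disk2, disk3]) == disk1
```

The same sketch also shows why the parity assignment itself must be right: if a data disk were treated as parity, the XOR would produce bytes that match no real disk.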
October 23, 2018, Author
Sure: version 6.5.3, single parity config. Diagnostics attached. Thanks.
Attachment: dt-ur01-diagnostics-20181023-0850.zip
October 23, 2018, Community Expert
A couple of questions:
- Do you know which disk you were rebuilding? Not the old disk#, but the actual disk serial or current disk#.
- Is parity the 6TB Hitachi or one of the currently assigned data disks?
October 23, 2018, Author
Quote: "Do you know what disk you were rebuilding, not the old disk#, the actual disk serial or current disk#?"
I am pretty certain it is WCC4N0334109. I say that because I put all of the drives in as data drives and started the array (with no parity). The other 3 looked fine, but that one showed "Unmountable: No file system". I presume that would be because the power failure occurred before the parity sync finished.
Quote: "Is parity the 6TB Hitachi or one of the currently assigned data disks?"
The 6TB drive is the parity.
Edited October 23, 2018 by daemian
October 23, 2018, Community Expert
1 minute ago, daemian said: "I am pretty certain it is WCC4N0334109. I say that because I put all of the drives in as data drives and started the array (with no parity). The other 3 looked fine, but that one showed "Unmountable: No file system". I presume that would be because the power failure occurred before the parity sync finished."
If parity is the 6TB then that's likely it. It would have been best if the data disks had been mounted read-only, but this should still work:
- Tools -> New Config -> Retain current configuration: All -> Apply
- Assign any missing disk(s), like parity.
- Important: after checking the assignments, leave the browser on that page, the "Main" page.
- Open an SSH session or use the console and type (I'll assume the disk to rebuild is still disk1; if not, adjust the command):
  mdcmd set invalidslot 1 29
- Back in the GUI, without refreshing the page, just start the array. Do not check the "parity is already valid" box. Disk1 will start rebuilding. The disk should mount immediately, but if it's unmountable don't format it; wait for the rebuild to finish and then run a filesystem check.
October 23, 2018, Author
So I just want to double check. This is what the screen looks like now: [image]
I have issued this command at the CLI: [image]
I have not refreshed or left the page. Now I am going to start the array, without "Parity is already valid" selected. Is that all correct? Thank you for your help!
October 23, 2018, Author
Sorry to be a pest, but when I click Start it warns me "Parity disk(s) contents will be overwritten". You're sure, right?
October 23, 2018, Community Expert
Yes, it's normal; the GUI doesn't take the invalid slot command into account. As long as you typed the command correctly and didn't refresh the GUI, Unraid won't touch parity and will start rebuilding disk1 instead.
October 24, 2018, Author
OK, so the rebuild is completed. Now in the GUI disk 1 shows as "Unmountable: No file system".
October 24, 2018, Community Expert
59 minutes ago, daemian said: "OK, so the rebuild is completed. Now in the GUI disk 1 shows as "Unmountable: No file system"."
A rebuild does not fix an "unmountable" problem, as it works at the physical sector level, not the file system level. You normally need to run the file system repair tools to fix the unmountable state.
October 24, 2018, Community Expert
1 hour ago, daemian said: "OK, so the rebuild is completed. Now in the GUI disk 1 shows as "Unmountable: No file system"."
Possibly the result of starting the disks read-write without parity earlier or, worse, parity not being in sync. Either way, try a filesystem check:
https://wiki.unraid.net/Check_Disk_Filesystems#Drives_formatted_with_XFS
or
https://wiki.unraid.net/Check_Disk_Filesystems#Checking_and_fixing_drives_in_the_webGui
October 24, 2018, Community Expert
P.S. I didn't notice at first, since I didn't check the complete syslog, but you also have problems with your cache pool. There are read and write errors on both devices, but mainly cache1:
Oct 23 08:04:35 dt-ur01 kernel: BTRFS info (device sdi1): bdev /dev/sdi1 errs: wr 166, rd 1, flush 0, corrupt 0, gen 0
Oct 23 08:04:35 dt-ur01 kernel: BTRFS info (device sdi1): bdev /dev/sdh1 errs: wr 863327568, rd 506341990, flush 65261822, corrupt 0, gen 0
These are hardware errors, and with SSDs they are usually the result of bad cables. After replacing the cables, run a scrub and check that all errors were corrected, though if you're using any NOCOW shares there might be some undetected corruption there.
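For anyone wanting to pull these counters out of a saved syslog, here is a minimal sketch. The regular expression is written against the two log lines quoted above only; the function name `parse_btrfs_errs` is made up for illustration, and other kernel versions may format the message differently.

```python
import re

# Pattern derived from the two syslog lines quoted in this thread (an assumption
# about the format; adjust if your kernel prints the counters differently).
ERRS_RE = re.compile(
    r"bdev (?P<dev>\S+) errs: wr (?P<wr>\d+), rd (?P<rd>\d+), "
    r"flush (?P<flush>\d+), corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)


def parse_btrfs_errs(lines):
    """Return {device: {counter: value}} for every matching log line."""
    stats = {}
    for line in lines:
        match = ERRS_RE.search(line)
        if match:
            fields = match.groupdict()
            dev = fields.pop("dev")
            stats[dev] = {name: int(value) for name, value in fields.items()}
    return stats


syslog = [
    "Oct 23 08:04:35 dt-ur01 kernel: BTRFS info (device sdi1): bdev /dev/sdi1 "
    "errs: wr 166, rd 1, flush 0, corrupt 0, gen 0",
    "Oct 23 08:04:35 dt-ur01 kernel: BTRFS info (device sdi1): bdev /dev/sdh1 "
    "errs: wr 863327568, rd 506341990, flush 65261822, corrupt 0, gen 0",
]

stats = parse_btrfs_errs(syslog)
for dev, counters in stats.items():
    # Show only the counters that are non-zero for each pool member.
    print(dev, {name: value for name, value in counters.items() if value})
```

Re-running the same parse on a fresh syslog after replacing the cables is one way to see whether the counters keep growing.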
October 25, 2018, Author
Thanks for pointing out the cache drive. I will check that out when I can. For the original issue, when I try to run xfs_repair I get the following error:
root@dt-ur01:~# xfs_repair -v /dev/md1
Phase 1 - find and verify superblock...
        - block cache size set to 2290880 entries
Phase 2 - using internal log
        - zero log...
Log inconsistent (didn't find previous header)
failed to find log head
zero_log: cannot find log head/tail (xlog_find_tail=5)
ERROR: The log head and/or tail cannot be discovered. Attempt to mount the filesystem to replay the log or use the -L option to destroy the log and attempt a repair.
Do I try it with the -L option? It sounds like that may result in [more] data loss, but perhaps I don't really have any other option? Thank you again for all of your time and assistance.
October 25, 2018, Community Expert
9 minutes ago, daemian said: "Do I try it with the -L option?"
Yes, usually there's no data loss.
October 25, 2018, Author
Well, -L didn't get me any further:
root@dt-ur01:~# xfs_repair -Lv /dev/md1
Phase 1 - find and verify superblock...
        - block cache size set to 2290880 entries
Phase 2 - using internal log
        - zero log...
Log inconsistent (didn't find previous header)
failed to find log head
zero_log: cannot find log head/tail (xlog_find_tail=5)
October 25, 2018, Community Expert
This means the rebuilt disk has more serious corruption: either parity wasn't valid before, or it's the result of mounting the disks read-write before rebuilding. Like I mentioned, the disks should have been mounted read-only, since there will always be some filesystem housekeeping that won't be reflected in the existing parity, because parity wasn't assigned at the time. Btrfs will usually never survive this, reiserfs usually survives without issues, and XFS most times should survive but other times might not.
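To illustrate the point about housekeeping writes, here is a toy sketch, again assuming single parity is a bytewise XOR and using made-up eight-byte "disks". A write to a surviving data disk that is not reflected in parity makes the rebuilt disk wrong at exactly the offsets that changed.

```python
def xor_bytes(*blocks):
    """Bytewise XOR of equally sized byte strings."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, value in enumerate(block):
            out[i] ^= value
    return bytes(out)


# Toy contents; disk1 is the disk that will later be rebuilt.
disk1 = bytes(range(0, 8))
disk2 = bytes(range(100, 108))

# Parity computed while both disks were in sync.
parity = xor_bytes(disk1, disk2)

# disk2 is mounted read-write with no parity assigned: one byte changes,
# and parity is NOT updated to match.
disk2_after = bytearray(disk2)
disk2_after[0] ^= 0xFF
disk2_after = bytes(disk2_after)

# Rebuilding disk1 from the stale parity and the modified disk2.
rebuilt = xor_bytes(parity, disk2_after)

# Offsets where the rebuilt disk differs from the original disk1.
mismatches = [i for i in range(len(disk1)) if rebuilt[i] != disk1[i]]
print(mismatches)
```

In this toy model only the touched offset is damaged. Whether a real filesystem survives depends on what lives at the affected sectors, which fits the observation that some filesystems tolerate this better than others.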
October 26, 2018, Community Expert
One thing I forgot to mention: I've seen the error above as a result of hardware issues before. Looking at your diags, I see you're using the onboard Intel controller, which is good, but it's set to IDE mode. Change it to AHCI in the BIOS and try xfs_repair again.
October 26, 2018, Author
Thanks johnnie. I believe I have the controller running in AHCI mode now, but xfs_repair still fails the same way. How can I confirm that it is now running in AHCI?
October 26, 2018, Community Expert
It's correct now. A couple more things you can try: upgrade to v6.6.2, since it has a newer xfs_repair release, and if that still fails connect that disk to another PC. It would lose sync with parity, but it might be worth a try.
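On the question of confirming AHCI from the console, one possible check is sketched below. It assumes a Linux system where each SCSI/SATA host exposes its low-level driver name in `/sys/class/scsi_host/host*/proc_name`. Seeing `ahci` there suggests the controller is in AHCI mode, while Intel controllers in IDE mode typically show `ata_piix`; both of those readings are assumptions to verify on your own hardware, and the function name is made up for illustration.

```python
from pathlib import Path


def sata_host_drivers(sysfs_root="/sys/class/scsi_host"):
    """Map each host entry (e.g. 'host0') to the driver name in its proc_name file."""
    drivers = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        # Not a Linux system, or sysfs is not mounted where expected.
        return drivers
    for host in sorted(root.iterdir()):
        proc_name = host / "proc_name"
        if proc_name.is_file():
            drivers[host.name] = proc_name.read_text().strip()
    return drivers


# Print one line per host; an empty result means nothing was found.
for host, driver in sata_host_drivers().items():
    print(host, driver)
```

The `sysfs_root` parameter exists only so the function can be pointed at a copy of the directory tree for testing.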
October 26, 2018, Author
Thanks Johnnie. I upgraded to 6.6.2 and tried xfs_repair again. Still no luck. Putting this disk in another machine is not really an option for me with this one (I am remote to the site, and there is not much in the way of resources there). I think I may need to bite the bullet and just format the drive, conceding that the data from that drive is lost. It's probably not really that big of a deal. Obviously not ideal, but I don't think I have much other choice. Would I just format that drive and then run a parity check to be sure everything is OK?
October 27, 2018, Community Expert
Just formatting is enough; parity will be updated, and after that the regular scheduled checks suffice.