trurl Posted September 9, 2016

> Johnnie, why do you suggest replacing the parity drive first? My thinking is that I should fix the disk1 filesystem first, then replace the cache disk and let unRAID rebuild the drive from parity, then replace the parity drive (disk0) and finally run a parity check. This will eliminate the worry of losing data from disk1. Is the parity drive too far gone at this point? Doesn't SMART reallocate the bad sectors? I understand this drive is no longer reliable, but can't I still use it to rebuild the cache and disk1 if needed?

Perhaps I misunderstand you, but you give me the impression that you think you can rebuild the cache from parity. This is not possible.
greg_gorrell Posted September 9, 2016 (Author)

johnnie, that makes complete sense. trurl, thanks for clearing this up. You are correct about my misunderstanding that I could rebuild my cache from parity. I completely forgot that the cache drive is unprotected. I will attempt to salvage what I can tonight when I get home from work and post any more questions if I have any. Thanks for the help so far, everyone; this is all greatly appreciated.

squid, I ended up purchasing some 2TB enterprise WD RE4 drives. I was hoping to upgrade to 4TB, but I don't have the money for that right now with house repairs and such going on before summer's end. I have some NICs arriving today so I can run pfSense, so it worked out that I have some planned downtime today. We shall see how these last, but either way, I am sure anything is better than the WD Green drives. The one that failed is read from and written to daily, so they obviously can't handle this workload.
Squid Posted September 9, 2016

> I completely forgot that the cache drive is unprotected. I will attempt to salvage what I can tonight when I get home from work and post any more questions if I have any.

After you get the parity and data disk situation sorted out, set up CA Backup to back up your appdata. Then swap out the cache drive, then restore your appdata. That should minimize the problems in salvaging.
Squid Posted September 9, 2016

> I just remembered my docker.img file is on disk2. Can I copy that over to disk1 when I fix the filesystem and then update the settings to point to the new location?

Don't worry about the docker.img file at all. It's very easily recreated from scratch. And if nothing else, since you were getting those errors on loop0, it's best to just delete the image after the filesystem fix is done anyway. For dockers, appdata is the only thing that matters (Plex library, CP library, etc.), which is what CA will back up and restore for you. Although if you're not running 6.2 AND your appdata share isn't confined to the cache drive, then the backup module won't work for you. 6.1.x with appdata confined to the cache drive is OK.

BTW, docker.img really should be on the cache drive. So much faster, and redundancy on it isn't important. Redundancy on the appdata is important, but that's where CA's backup will come in handy...
greg_gorrell Posted September 9, 2016 (Author)

Alright guys, I just got home and realized my docker.img is located on disk2, which is otherwise empty. I am going to use this disk to replace my parity disk. Right now I am copying the docker.img to my laptop so I can then bring the array back down and designate disk2 as the parity drive. Once that is completed, do I bring the array up and run the xfs_repair, or is that done with the array down? Once that is run, I intend to replace the cache drive with a Maxtor Raptop drive I have lying around, but not before backing everything up with CA. I don't see the xfs_repair in the GUI anywhere. Is that something that comes up in maintenance mode?

EDIT: squid, I deleted that last post, but not before you saw it apparently, haha. I am glad you did; I will get right on to moving the drives around then. Thanks for the tip on where to keep the docker.img. I had read that it was best in an automation share when I first set my server up, so that's what I did and haven't thought much about it since.
greg_gorrell Posted September 9, 2016 (Author)

Also, it seems the reason docker is having problems is that the appdata is located on disk1, which is unmountable.
Squid Posted September 9, 2016

> Alright guys, I just got home and realized my docker.img is located on disk2, which is otherwise empty. I am going to use this disk to replace my parity disk. [...]

Maybe I'm misunderstanding something, though. It sounds to me like you want to pull out a data drive and replace the parity disk with it (and not replace the data drive). You can do this by doing a New Config and reassigning the drives (and letting parity rebuild itself), but you have to be 100% sure that there's nothing on that data disk, because once you do the New Config, there's no way to rebuild it at all.
Squid Posted September 9, 2016

> Also, it seems the reason docker is having problems is that the appdata is located on disk1, which is unmountable.

OK... Personally, if this were my machine, I would do the xfs_repair first, before pulling a data drive out and replacing parity with it. Once you pull the drive and replace parity, you've effectively lost the ability to rebuild a drive. Do the xfs_repair first and get disk1 mountable again, and then pull disk2. A tad safer way of doing things.
greg_gorrell Posted September 9, 2016 (Author)

Yes, that is correct, squid.

disk1 - unmountable, filesystem suspected to be corrupt
disk2 - empty other than the docker.img file
parity - working, has bad sectors
cache - working, has bad sectors

I want to just reconfigure the drives on the main page to move disk2 to the parity slot. I cannot back up any of my appdata as it is on the unmountable disk1. I was advised not to fix the filesystem until after I have the parity drive replaced.
greg_gorrell Posted September 9, 2016 (Author)

> Johnnie, why do you suggest replacing the parity drive first? [...]
>
> Order doesn't matter, either way works, but unassign your current parity first; you shouldn't run xfs_repair with a bad parity disk.

squid, that was my thought too, but see johnnie's post here. Also, I cannot seem to find the xfs_repair tool anywhere. Is that in maintenance mode?
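For reference, the repair can also be run from the console. A sketch of the usual sequence, assuming disk1 maps to /dev/md1 and the array is started in Maintenance mode (the commands are only assembled as strings here, since running xfs_repair against a real device modifies it):

```shell
# Sketch of the xfs_repair sequence on unRAID (assumes disk1 is /dev/md1 and
# the array is started in Maintenance mode). The commands are built as
# strings rather than executed, since this example has no real array device.
dev=/dev/md1

check="xfs_repair -n $dev"    # 1. dry run: report problems, change nothing
repair="xfs_repair $dev"      # 2. real repair, if the dry run looks sane
force="xfs_repair -L $dev"    # 3. last resort: -L zeroes the journal and can
                              #    lose the most recent metadata updates

printf '%s\n' "$check" "$repair" "$force"
```

The -L step should only be used when xfs_repair refuses to run because it cannot replay the log, which is why it comes last.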
greg_gorrell Posted September 9, 2016 (Author)

I found the xfs_repair tool and I am running it now in read-only mode:

    Phase 1 - find and verify superblock...
    Phase 2 - using internal log
            - zero log...
            - scan filesystem freespace and inode maps...
    Metadata corruption detected at xfs_agf block 0x1/0x200
    flfirst 118 in agf 0 too large (max = 118)
    agf 118 freelist blocks bad, skipping freelist scan
    sb_icount 190848, counted 193216
    sb_ifree 93, counted 105
    sb_fdblocks 247182615, counted 246124243
            - found root inode chunk
    Phase 3 - for each AG...
            - scan (but don't clear) agi unlinked lists...
            - process known inodes and perform inode discovery...
            - agno = 0
            - agno = 1
            - agno = 2
            - agno = 3
            - process newly discovered inodes...
    Phase 4 - check for duplicate blocks...
            - setting up duplicate extent list...
            - check for inodes claiming duplicate blocks...
            - agno = 0
            - agno = 1
            - agno = 2
            - agno = 3
    No modify flag set, skipping phase 5
    Phase 6 - check inode connectivity...
            - traversing filesystem ...
            - traversal finished ...
            - moving disconnected inodes to lost+found ...
    Phase 7 - verify link counts...
    No modify flag set, skipping filesystem flush and exiting.

EDIT: I ran it with the -L option the second time, as I was unable to mount the drive.
Squid Posted September 9, 2016

My opinion is that while the parity disk needs to be replaced, you're still better off with it in the system than without it. It's almost a flip-a-coin thing. I prefer to go after the big problem (unmountable drive / data missing) first rather than worry about the ability to rebuild a drive if one drops dead.
Squid Posted September 9, 2016

> I found the xfs_repair tool and I am running it now in read-only mode: [output snipped]

Is it safe to run this now without read-only mode enabled? I don't see why not, but Johnnie is the expert here.
greg_gorrell Posted September 9, 2016 (Author)

Well, I just went ahead and ran it again with the -L option. Once complete, I will make sure all is well and reassign disk2 as the parity drive.
greg_gorrell Posted September 9, 2016 (Author)

Squid, my disk is fixed and all my data is good. Now it will not let me make disk2 the parity for some reason. Looks like it has built-in detection to make sure the wrong disks aren't assigned to the wrong spots.
greg_gorrell Posted September 9, 2016 (Author)

This is what I need to accomplish (see attached). I see the New Config tool, but do I want to retain the data slots and the cache? How does it know I want to remove one disk?
Squid Posted September 9, 2016

> This is what I need to accomplish (see attached). I see the New Config tool, but do I want to retain the data slots and the cache? How does it know I want to remove one disk?

It doesn't. Make a note of the drive assignments, then do a New Config and assign them as you choose.
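One low-tech way to make that note is from the console, capturing each drive's model and serial so the slots can be reassigned unambiguously afterwards. A sketch (the lsblk columns are standard, but the temp file is a stand-in for wherever you would actually keep the note, e.g. the flash drive):

```shell
# Record which physical drive (model + serial) is in which slot before doing
# a New Config. Output goes to a temp file here; on a real server the flash
# drive (/boot) would be a sensible place to keep the note.
note=$(mktemp)
# || true: some minimal environments expose no block-device info, and the
# note is best-effort anyway.
lsblk -o NAME,SIZE,MODEL,SERIAL > "$note" 2>/dev/null || true
cat "$note"
```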
greg_gorrell Posted September 9, 2016 (Author)

Oh wow, I guess I misunderstood that tool completely. With all the warnings, I figured it would wipe the drives. Basically, all I want to do is retain the cache drive, correct? Then when I go back to the main page and select disk1 to use the same drive as before, it won't mess with the data?
Squid Posted September 9, 2016

> Oh wow, I guess I misunderstood that tool completely. With all the warnings, I figured it would wipe the drives. Basically, all I want to do is retain the cache drive, correct? Then when I go back to the main page and select disk1 to use the same drive as before, it won't mess with the data?

No, it won't. Retain the cache drive as before, disk1 as before, parity set to what disk2 was, and disk2 set to not installed.
greg_gorrell Posted September 9, 2016 (Author)

I wasn't really able to find clear instructions on this. I will write a guide to add to the wiki this weekend. Thanks for the help; it looks like everything is going to be just fine.
Squid Posted September 9, 2016

> I wasn't really able to find clear instructions on this. I will write a guide to add to the wiki this weekend. Thanks for the help; it looks like everything is going to be just fine.

I think everybody around here will agree that the wiki sucks (what would have been my contributions I put into the FAQs here on the forum instead). Everybody will appreciate your contributions.
greg_gorrell Posted September 10, 2016 (Author)

Okay, well, I have had trouble navigating the docker FAQ, for example. There is nowhere that says what the "best practice" is for placing the appdata folders. Should they be on the cache drive as well? The wiki would be nice if it were updated for version 6, since there is so much outdated and useless information in there that it makes things more cumbersome and confusing. Where would you even create an FAQ for managing the disks and/or filesystems? Is there any current documentation? I like taking notes so I have something to look back on; it wouldn't take much to format them for others to use.
greg_gorrell Posted September 10, 2016 (Author)

Something major is going on with unRAID itself. I was copying the appdata over from disk1 to the cache drive and unRAID crashed. No web GUI access, no SSH, and all network shares went down. I'm afraid to go pull the plug to reboot it.
Squid Posted September 10, 2016

> Where would you even create an FAQ for managing the disks and/or filesystems? Is there any current documentation? I like taking notes so I have something to look back on; it wouldn't take much to format them for others to use.

Send a PM to RobJ. He's the wiki guy around here.
Squid Posted September 10, 2016

> Something major is going on with unRAID itself. I was copying the appdata over from disk1 to the cache drive and unRAID crashed. No web GUI access, no SSH, and all network shares went down. I'm afraid to go pull the plug to reboot it.

If the local keyboard/monitor still work, then log in and type:

    diagnostics

Then post the resulting diagnostics file (I think it ends up on the flash drive in the logs folder).

To reboot:

    powerdown -r

If the local keyboard doesn't work, then you've pretty much got no choice but to hit the reset button.