newoski Posted June 27, 2016
Hi Guys, My saga continues; however, I wanted to start a new thread since the last one was all over the place. After swapping PCI slots, things seem to have gotten better: 6 of the 7 unmountable drives mounted. The 7th is md11, which is still red balled. I ran a disk check on it and it told me that the previous --rebuild-tree didn't complete. I tried running it again. It ran for about 6 hours, but still doesn't seem to be complete. Here's the full rebuild tree output: https://pastebin.com/mkQTUarr
Weirder still, once that process completed, or didn't, disk10 popped up with 127 read errors while the server was in Maintenance mode... So confused...
trurl Posted June 28, 2016
Not sure starting a new thread is helpful, but you should have at least included a link to any other threads you have posted that might be relevant.
RobJ Posted June 28, 2016
If you're starting a new thread, you have to assume we don't know anything about your situation (and I don't), and provide your diagnostics (Need help? Read me first!), especially when you are going to mention unmountable drives, red balls, and using the --rebuild-tree option.
If --rebuild-tree didn't finish, then there must be a hardware issue, and you have to solve that first; there's no point in retrying. At the bottom of the report, it indicates a bad block, so that has to be fixed first.
newoski Posted June 28, 2016 (Author)
> If you're starting a new thread, you have to assume we don't know anything about your situation (and I don't), and provide your diagnostics (Need help? Read me first!), especially when you are going to mention unmountable drives, red balls, and using the --rebuild-tree option.
> If --rebuild-tree didn't finish, then there must be a hardware issue, and you have to solve that first; there's no point in retrying. At the bottom of the report, it indicates a bad block, so that has to be fixed first.
Thanks very much for clarifying, and my apologies for the confusion with the new thread. That said, I've attached my Diagnostics zip. I'm unfamiliar with bad blocks. How exactly should I proceed? Do I simply run disk check again with the -B parameter? I believe I already resolved the hardware issues unrelated to the drive itself, which I pinpointed to a bad PCI slot for the expansion card. I moved the card to a new slot and that fixed all my unmountable drives, except this one, which is red balled.
tower-diagnostics-20160628-1310-bad-block.zip
JorgeB Posted June 28, 2016
> Weirder still, once that process completed, or didn't, disk10 popped up with 127 read errors while the server was in Maintenance mode... So confused...
You're using reiserfsck on an emulated disk. unRAID reads all other disks plus parity to create its content, so if disk10 or any other disk has errors, reiserfsck will fail. You could still be having issues with the controller.
Edit to add: Just looked at your diags and Disk10 has pending sectors, so it needs to be replaced. The problem is that you already have a disabled disk.
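To illustrate why an error on disk10 can break a check of the emulated disk11, here is a toy sketch of single-parity emulation. It assumes parity is the bitwise XOR of all data disks, shrinks each "disk" to one byte, and models a read error as a wrong byte (on real hardware the read would more likely fail outright). The function and variable names are hypothetical and not part of unRAID.

```shell
#!/bin/sh
# Toy model of single-parity emulation: each "disk" is one byte (0-255).
# Assumption: parity is the bitwise XOR of all data disks.

# xor_all V1 V2 ... : print the XOR of every argument.
# The body runs in a subshell so "acc" and "v" stay local.
xor_all() (
    acc=0
    for v in "$@"; do
        acc=$(( $acc ^ $v ))
    done
    echo "$acc"
)

# Three data disks and the parity computed from them.
d1=165; d2=60; d3=15
parity=$(xor_all "$d1" "$d2" "$d3")

# Emulate a missing d3 from the surviving disks plus parity.
emulated_ok=$(xor_all "$d1" "$d2" "$parity")

# Same reconstruction when d2 hands back a wrong byte (one flipped bit).
bad_d2=$(( $d2 ^ 4 ))
emulated_bad=$(xor_all "$d1" "$bad_d2" "$parity")

echo "real d3=$d3 emulated=$emulated_ok emulated_with_bad_d2=$emulated_bad"
```

Running it prints `real d3=15 emulated=15 emulated_with_bad_d2=11`: the emulated content is only as good as every other disk that contributes to it.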
newoski Posted June 28, 2016 (Author)
> > Weirder still, once that process completed, or didn't, disk10 popped up with 127 read errors while the server was in Maintenance mode... So confused...
> You're using reiserfsck on an emulated disk. unRAID reads all other disks plus parity to create its content, so if disk10 or any other disk has errors, reiserfsck will fail. You could still be having issues with the controller.
> Edit to add: Just looked at your diags and Disk10 has pending sectors, so it needs to be replaced. The problem is that you already have a disabled disk.
So after fixing the controller, to the best of my knowledge, I've run check disk on all drives. With the exception of Disk11, all have passed with No Corruptions Found... Where should I go from here?
Edit to add: I understand I'm in a tight spot, I'm just unclear on how to proceed. If Disk11 is unrecoverable, so be it. If there's something I should do to further test the controller, let me know. If I should be doing something about Disk11 and those bad blocks, let me know. I'm just unclear on how to proceed in all regards... I believe my parity is shot to hell at this point...
JorgeB Posted June 28, 2016
You can try to mount/fix the actual disk11: mount it with the unassigned devices plugin and, if needed, run reiserfsck directly on that disk, since SMART for it looks OK.
If that's successful, then you can move all the data you can from disk10 to other disks if you have the space. After that you can do a new config without disk10 (or with a new one in its place).
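For anyone who wants to look at the pending-sector count from the console rather than the diagnostics zip, a minimal sketch follows. It assumes the usual ATA attribute table printed by `smartctl -A`, where attribute 197 is named `Current_Pending_Sector` and the raw value is the tenth whitespace-separated column; `pending_sectors` is a hypothetical helper name, and drives that report SMART data in another layout will not match.

```shell
#!/bin/sh
# pending_sectors: read `smartctl -A` output on stdin and print the raw
# value of the Current_Pending_Sector attribute.
# Assumption: standard ATA attribute table, raw value in column 10.
# Prints nothing if the attribute is not present in the input.
pending_sectors() {
    awk '$2 == "Current_Pending_Sector" { print $10 }'
}
```

Typical use would be `smartctl -A /dev/sdX | pending_sectors`, with X replaced by the drive's letter. A non-zero result is the condition described above for disk10.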
newoski Posted June 28, 2016 (Author)
> You can try to mount/fix the actual disk11: mount it with the unassigned devices plugin and, if needed, run reiserfsck directly on that disk, since SMART for it looks OK.
> If that's successful, then you can move all the data you can from disk10 to other disks if you have the space. After that you can do a new config without disk10 (or with a new one in its place).
Hmmmm. So here's what I tried:
1. Stop Array
2. Unmount Disk11
3. Start Array
4. Mount Disk11 via Unassigned Devices plugin. It shows up as Hitachi...
5. Try to read disk. I get a Windows Security pop up asking for Network Password to connect to TOWER. I try root / blank but it doesn't accept those credentials...
- I'm able to read from all other mounted drives
newoski Posted June 28, 2016 (Author)
> You can try to mount/fix the actual disk11: mount it with the unassigned devices plugin and, if needed, run reiserfsck directly on that disk, since SMART for it looks OK.
> If that's successful, then you can move all the data you can from disk10 to other disks if you have the space. After that you can do a new config without disk10 (or with a new one in its place).
How do I go about running reiserfsck directly on disk11? I've got data transferring from disk10 to the rest of the array now.
newoski Posted June 28, 2016 (Author)
> > You can try to mount/fix the actual disk11: mount it with the unassigned devices plugin and, if needed, run reiserfsck directly on that disk, since SMART for it looks OK.
> > If that's successful, then you can move all the data you can from disk10 to other disks if you have the space. After that you can do a new config without disk10 (or with a new one in its place).
> How do I go about running reiserfsck directly on disk11? I've got data transferring from disk10 to the rest of the array now.
Data transfer complete. Luckily, I have most of it stored elsewhere for this disk. Now I just need an explanation of how to run the reiserfsck stuff on the unmounted Disk11.
JorgeB Posted June 28, 2016
> Data transfer complete. Luckily, I have most of it stored elsewhere for this disk. Now I just need an explanation of how to run the reiserfsck stuff on the unmounted Disk11.
You have to use the unassigned devices plugin, or temporarily assign it to your cache slot (make sure the filesystem is set to reiser), then run:
reiserfsck --check /dev/sdX1
where X is the assigned drive letter.
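The `/dev/sdX1` form is easy to mistype, since it needs both the drive letter and the trailing partition number. A minimal sketch of a guard is below; `partition_for_letter` is a hypothetical helper, not something shipped with unRAID or reiserfsprogs, and it only builds the path without touching any disk.

```shell
#!/bin/sh
# Hypothetical helper: turn a drive letter into its first-partition path,
# e.g. "s" -> /dev/sds1. Accepts one or two lowercase letters and rejects
# anything else, so a stray digit or full path is caught early.
partition_for_letter() {
    case "$1" in
        [a-z] | [a-z][a-z]) printf '/dev/sd%s1\n' "$1" ;;
        *) echo "expected a drive letter such as 's'" >&2; return 1 ;;
    esac
}
```

One possible use, assuming the disk shows up as `sds`: `part=$(partition_for_letter s) && [ -b "$part" ] && reiserfsck --check "$part"`. The `-b` test simply confirms the block device exists before the check is started.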
newoski Posted June 28, 2016 (Author)
> > Data transfer complete. Luckily, I have most of it stored elsewhere for this disk. Now I just need an explanation of how to run the reiserfsck stuff on the unmounted Disk11.
> You have to use the unassigned devices plugin, or temporarily assign it to your cache slot (make sure the filesystem is set to reiser), then run:
> reiserfsck --check /dev/sdX1
> where X is the assigned drive letter.
OK, it's running now. Isn't this going to give me the same error, eventually, about the bad blocks?
JorgeB Posted June 28, 2016
It shouldn't. The previous errors were from disk10; now it's only reading disk11.
newoski Posted June 28, 2016 (Author)
> It shouldn't. The previous errors were from disk10; now it's only reading disk11.
Sorry, so close but still a bit confused... I emptied the contents off of Drive10. That's still assigned at the moment. Drive11 is red balled. How do I go about removing both drive 10 and 11 from the array? It gives me errors about too many wrong and/or missing disks. I understand the logic of what we're doing, just uncertain about the right way to do it... Do I remove Drive 11, and then create a New Config... and then stop the array, remove Drive 10, and create a new config again?
JorgeB Posted June 28, 2016
If reiserfsck recovers disk11, and only after that, you have to do a new config (Tools > New config) and reassign all disks (take a screenshot before), including the hopefully recovered disk11 and leaving out disk10 (or using a new one in its place). A parity sync will begin when first starting the array.
newoski Posted June 28, 2016 (Author)
> If reiserfsck recovers disk11, and only after that, you have to do a new config (Tools > New config) and reassign all disks (take a screenshot before), including the hopefully recovered disk11 and leaving out disk10 (or using a new one in its place). A parity sync will begin when first starting the array.
So we're hoping that by simply removing the bad data from disk 10, the check on disk 11 will complete?
JorgeB Posted June 28, 2016
> You're using reiserfsck on an emulated disk. unRAID reads all other disks plus parity to create its content, so if disk10 or any other disk has errors, reiserfsck will fail...
You are now checking the actual disk, not the emulated one like before.
newoski Posted June 28, 2016 (Author)
> If reiserfsck recovers disk11, and only after that, you have to do a new config (Tools > New config) and reassign all disks (take a screenshot before), including the hopefully recovered disk11 and leaving out disk10 (or using a new one in its place). A parity sync will begin when first starting the array.
So interestingly, there were no errors found when I did a disk check on disk11 via /dev/sds1. What do I do now? It still says unmountable in the GUI. Should I run rebuild tree?
root@Tower:~# reiserfsck --check /dev/sds1
reiserfsck 3.6.24
Will read-only check consistency of the filesystem on /dev/sds1
Will put log info to 'stdout'
Do you want to run this program?[N/Yes] (note need to type Yes if you do):Yes
########### reiserfsck --check started at Tue Jun 28 15:53:41 2016 ###########
Replaying journal: Done.
Reiserfs journal '/dev/sds1' in blocks [18..8211]: 0 transactions replayed
Checking internal tree.. finished
Comparing bitmaps..finished
Checking Semantic tree: finished
No corruptions found
There are on the filesystem:
    Leaves 481945
    Internal nodes 2980
    Directories 114175
    Other files 180943
    Data block pointers 462166750 (0 of them are zero)
    Safe links 0
########### reiserfsck finished at Tue Jun 28 16:16:39 2016
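When a check runs for a long time, it can help to save the report to a file and test it for the verdict line instead of scrolling back. The sketch below is an assumption-laden convenience, not part of reiserfsprogs: `fsck_verdict` is a hypothetical helper that only looks for the "No corruptions found" line shown in the report above and treats anything else as something to read by hand.

```shell
#!/bin/sh
# fsck_verdict: read a saved reiserfsck --check report on stdin.
# Prints "clean" if the report contains "No corruptions found",
# otherwise "needs-review". It runs no filesystem tool itself.
fsck_verdict() {
    if grep -q 'No corruptions found'; then
        echo clean
    else
        echo needs-review
    fi
}
```

For example, `fsck_verdict < /tmp/disk11-check.log`, where the log path is only a placeholder for wherever the report was saved. A "clean" result, as in the report above, suggests the filesystem itself is intact and that the unmountable status has another cause.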
newoski Posted June 28, 2016 (Author)
Does it mount? No. Should I run rebuild tree?
EDIT: It gives me the same problem when I mount it via Unassigned Devices. Trying to access the drive in Windows results in a prompt for a username/password, but root, etc. doesn't work.
JorgeB Posted June 28, 2016
Where is the disk: cache or unassigned devices plugin?
newoski Posted June 28, 2016 (Author)
> Where is the disk: cache or unassigned devices plugin?
Unassigned Devices
newoski Posted June 28, 2016 (Author)
> Where is the disk: cache or unassigned devices plugin?
CACHE WORKED! Woohooo. Hoping this is progress?? What do I do now? Copy all the data off the drive and then shrink the array??
JorgeB Posted June 28, 2016
Strange, it should also mount in UD, but never mind, now you can do a new config:
- take a screenshot of current array
- tools > new config
- reassign all disks including disk11 (leave out old disk10), double check parity disk is in the parity slot
- start array to begin parity sync
newoski Posted June 28, 2016 (Author)
> Strange, it should also mount in UD, but never mind, now you can do a new config:
> - take a screenshot of current array
> - tools > new config
> - reassign all disks including disk11 (leave out old disk10), double check parity disk is in the parity slot
> - start array to begin parity sync
Wait, what? The former disk11 is assigned as my cache disk right now. I can't assign it to disk11. What am I missing? Shouldn't I copy all the data off it first, while it's assigned as the cache drive, and then essentially just assign it as a new disk, once I shrink the array to remove disk 10 and the drive formerly known as disk11?