June 26, 2016
Hi Guys, Not sure where to start here. After upgrading some hardware, my drives were all unassigned when I powered on again. After reassigning them as they were previously, I started the array. Below is a snippet of the syslog I was able to capture from PuTTY. It already ran through this once and ended with 7 drives unmountable. After stopping the array and rebooting, only 1 was unmountable. I tried to run a reiserfs check on that drive, md11, but it didn't seem to give me any output. After rebooting again, it's back to doing what appears to be checking each disk sequentially before starting the array. Help would be monumentally appreciated: http://pastebin.com/FqFU8Q9L
June 26, 2016
Diagnostics would be a good start. But with 7 drives down (and 1 disabled), it tends to sound like a controller card and/or the power delivery system to those drives.
June 26, 2016 Author
Hi Squid, Should I run them with the array stopped? Or started in Maintenance mode? I'm reseating all my connections to see if that helps, and then I'll grab diagnostics.
June 26, 2016
I would say started. But whatever you do, do NOT hit the format button.
June 26, 2016
"After rebooting again, it's back to doing what appears to be checking each disk sequentially before starting the array."
Reiser sometimes takes what seems like an age to replay through the transaction log to make sure the filesystem is in a consistent state before it finishes mounting the drive. I would leave the array alone for at LEAST 2 hours, then check back on the main page and see if your drives have finished mounting. The red x will need to be dealt with sooner rather than later, but I think you need to let the array fully settle and boot, then pull another diagnostics and post back.
June 26, 2016 Author
So after unseating and reseating the connections, all but one of the drives -- Drive 11 -- have come back. The Cache drive isn't formatted, but that is a brand new cache drive that I just added, so I think that's to be expected. Diagnostics are attached. How should I proceed with the unmountable md11? Should I rebuild it from parity?
tower-diagnostics-20160626-1703.zip
June 26, 2016
Rebuilding an unmountable drive will simply result in an unmountable drive. Give it a bit for people to check out the diagnostics.
June 26, 2016
Stop the array, restart it in maintenance mode, and then run the file system checks on disk 11 (main - disk 11 - check file system).
June 26, 2016 Author
What command line would that be? I've tried the reiserfsck --check /dev/md11 command and it just spits out that it replayed 0 transactions and then nothing else happens. After the 0 transactions message, should I just leave it alone? Is it still doing something at that point?
June 26, 2016 Author
Hmmm, so after one reboot it started the array right up with a red X on disk 11, and now it's back to running that reiserfs check...
June 26, 2016
I'm not the real file system checking guy around here. That label would belong to RobJ / johnny.black. I'm sure they'll pipe in. It's been forever and a day since I had to run a fs check. (And the command line you entered was correct, but you can do everything through the GUI since you're running v6.)
June 26, 2016 Author
7 disks still unmountable. I'm running disk check on 11 via the GUI. Can I run simultaneous checks on the other disks?
June 26, 2016 Author
Should I run rebuild tree? And can I run check disk on other drives at the same time?

reiserfsck 3.6.24
Will read-only check consistency of the filesystem on /dev/md11
Will put log info to 'stdout'
###########
reiserfsck --check started at Sun Jun 26 17:37:02 2016
###########
Replaying journal: Replaying journal: Done.
Reiserfs journal '/dev/md11' in blocks [18..8211]: 0 transactions replayed
Checking internal tree.. finished
Comparing bitmaps..Bad nodes were found, Semantic pass skipped
1 found corruptions can be fixed only when running with --rebuild-tree
###########
reiserfsck finished at Sun Jun 26 17:40:10 2016
###########
Zero bit found in on-disk bitmap after the last valid bit.
block 366280812: The level of the node (60316) is not correct, (4) expected
the problem in the internal node occured (366280812), whole subtree is skipped
vpf-10640: The on-disk and the correct bitmaps differs.
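For anyone reading a report like the one above, the two lines that matter most are the journal replay count and the "can be fixed only when running with --rebuild-tree" message. As a rough illustration only, here is a small Python sketch that pulls those two facts out of captured reiserfsck output. The helper name summarize_reiserfsck and the SAMPLE text are made up for this example (the sample lines are copied from the post above), and the patterns assume the message wording shown in this thread for reiserfsck 3.6.24; other versions may word things differently.

```python
import re


def summarize_reiserfsck(output: str) -> dict:
    """Extract a few facts from text printed by `reiserfsck --check`.

    Hypothetical helper: it only recognizes the message wording quoted
    in this thread, not every message the tool can print.
    """
    # e.g. "... in blocks [18..8211]: 0 transactions replayed"
    replayed = re.search(r"(\d+) transactions replayed", output)
    # e.g. "1 found corruptions can be fixed only when running with --rebuild-tree"
    fatal = re.search(
        r"(\d+) found corruptions can be fixed only when running with --rebuild-tree",
        output,
    )
    return {
        "transactions_replayed": int(replayed.group(1)) if replayed else None,
        "tree_corruptions": int(fatal.group(1)) if fatal else 0,
        "needs_rebuild_tree": fatal is not None,
    }


# Lines copied from the check report posted above.
SAMPLE = """\
Reiserfs journal '/dev/md11' in blocks [18..8211]: 0 transactions replayed
Checking internal tree.. finished
Comparing bitmaps..Bad nodes were found, Semantic pass skipped
1 found corruptions can be fixed only when running with --rebuild-tree
"""

if __name__ == "__main__":
    print(summarize_reiserfsck(SAMPLE))
```

For the report in this thread, the summary says the journal replayed 0 transactions and that the check itself is asking for --rebuild-tree. It says nothing about whether the underlying hardware is healthy enough to run a rebuild safely.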
June 26, 2016
Are you still getting read errors on those disks? In the screenshot you posted, besides the disabled disk 11, there are read errors on disks 12, 13, 14, 16, 17 and 18. This points to a hardware issue. What do these disks have in common? The controller? I think you should try to resolve that first. Maybe all disks but disk 11 will mount, and only then check the filesystem on disk 11.
June 26, 2016 Author
All the drives that are having issues are on the same SATA expansion card (2 x 4-way breakouts). I'll try swapping the card and the cables tomorrow and report back. Running rebuild-tree on Disk 11 in the meantime.
June 26, 2016 Author
How long does rebuild tree take? It's currently sitting at 0% and has been for some time...
June 27, 2016 Author
Seems to be a damaged PCI slot. Switched to the other slot and all but md11 are back. Running rebuild tree on md11 now.