Teekno Posted January 28
Hi! I recently added a new drive, but now one of my existing drives in the array shows as unreadable. I have replaced its cable and tried different ports to rule out any mistakes on my part, but I am getting out of my comfort zone on this recovery. The main screen shows that Disk 5 is "Unmountable: unsupported or no file system" and the drive shows as emulated. I have taken the array down and brought it up in maintenance mode, and the drive is still emulated, so I am not sure where to go from here, and frankly I am afraid that I'll screw up and lose data. I have attached the latest diagnostics. Hoping someone who knows more about this than I do can see something obvious here.
tower-diagnostics-20240128-1016.zip
trurl Posted January 28
Looks like you started rebuild of the unmountable filesystem onto the same disk5 (more than once). Is that correct? We usually like to repair the filesystem before rebuild, especially if rebuilding to the same disk. SMART for disk5 is OK but it looks like there might have been connection problems during some of this. The last attempt had just been started so I don't know how it is doing now. Post new diagnostics.
itimpi Posted January 28
Note that 'unmountable' is different from 'disabled' (which your drive must also be, since it is being emulated), and the two states require different actions to clear them. Handling of unmountable drives is covered here in the online documentation, accessible via the 'Manual' link at the bottom of the GUI or the DOCS link at the top of each forum page. The Unraid OS->Manual section in particular covers most features of the current Unraid release.
Teekno Posted January 28
Well, I came back to run new diagnostics, but I can't get into the server. Nothing on screen, keyboard unresponsive, can't ping it. But I can hear the drives chattering along. If it doesn't respond soon I may have to hard power it down. I may have a whole new set of problems.
Teekno Posted January 28
I did have to hard stop it and start it back up. It came back with the array stopped. I have made a new diagnostics dump in case somebody sees something useful in there. Any help is appreciated.
tower-diagnostics-20240128-1322.zip
trurl Posted January 28
Can't tell much without the array started. Unassign disk5, start the array in normal (not maintenance) mode, then post new diagnostics.
Teekno Posted January 28
OK. I have started the array in normal (non-maintenance) mode, and here is a new diagnostics run.
tower-diagnostics-20240128-1337.zip
trurl Posted January 28
The array is emulating missing disk5, but it is unmountable. Leave physical disk5 unassigned. That disk should appear in your Unassigned Devices. Check filesystem on emulated disk5 in the array. Be sure to use the webUI and not the command line. Capture the output and post it.
trurl Posted January 28
Sorry, cancel that idea. I see you are having connection problems with disk1, which will be a problem for emulation. Check connections on disk1, then post new diagnostics as before.
Teekno Posted January 28
2 hours ago, trurl said: Sorry, cancel that idea. I see you are having connection problems with disk1, which will be a problem for emulation. Check connections on disk1, then post new diagnostics as before.
Well, now I am seeing other drives disappear. I swapped all cables of the affected drives. I think it's a failing SATA card. I've ordered a new one and I'll follow up here after it arrives and I get it installed.
Teekno Posted January 30
OK. Replaced (and upgraded) my SATA card. At this point I see Disk 5 emulated, like I did before. Under array options I now also see the emulated disk listed as unmountable, and I get this option: "Format will create a file system in all Unmountable disks." If I choose this, would it reformat the drive and rebuild it as part of the array? I'd appreciate any assistance. I've attached the latest diagnostics in case there's something else I should be looking at.
tower-diagnostics-20240129-1920.zip
trurl Posted January 30
1 hour ago, Teekno said: would it reformat the drive, and rebuild it as part of the array?
NO!!! If you format, it will rebuild a formatted disk. Don't do anything, let me look at diagnostics.
trurl Posted January 30
Check filesystem on disk5. Be sure to do it from the webUI and not the command line. Post the output.
Teekno Posted January 30
OK. I didn't do anything stupid. Here's the output of the file system check.
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
Log inconsistent (didn't find previous header)
failed to find log head
zero_log: cannot find log head/tail (xlog_find_tail=5)
        - scan filesystem freespace and inode maps...
sb_fdblocks 220269811, counted 219118455
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 2
        - agno = 4
        - agno = 6
        - agno = 1
        - agno = 5
        - agno = 7
        - agno = 3
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
Maximum metadata LSN (24:107341) is ahead of log (0:0).
Would format log to cycle 27.
No modify flag set, skipping filesystem flush and exiting.
trurl Posted January 30
Check filesystem on disk5, this time without the -n. If it asks for it, use -L. Be sure to do it from the webUI and not the command line. Post the output.
Teekno Posted January 30
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
Log inconsistent (didn't find previous header)
failed to find log head
zero_log: cannot find log head/tail (xlog_find_tail=5)
ERROR: The log head and/or tail cannot be discovered. Attempt to mount the
filesystem to replay the log or use the -L option to destroy the log and
attempt a repair.
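The output above is the case trurl's "If it asks for it, use -L" refers to. As a rough illustration only, here is a minimal Python sketch that inspects pasted xfs_repair output for that request. The function name `repair_requests_log_zeroing` and the `SAMPLE` string are hypothetical and not part of Unraid or xfsprogs; the sketch only reads text and does not run any repair, which the thread advises doing from the webUI.

```python
def repair_requests_log_zeroing(output: str) -> bool:
    """Return True if pasted xfs_repair output asks for the -L option.

    Hypothetical helper for illustration; it only inspects text.
    """
    # Collapse whitespace so the check survives line wrapping in a paste.
    normalized = " ".join(output.split())
    return "use the -L option to destroy the log" in normalized


# Output copied from the post above.
SAMPLE = """\
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
Log inconsistent (didn't find previous header)
failed to find log head
zero_log: cannot find log head/tail (xlog_find_tail=5)
ERROR: The log head and/or tail cannot be discovered. Attempt to mount the
filesystem to replay the log or use the -L option to destroy the log and
attempt a repair.
"""

if __name__ == "__main__":
    print(repair_requests_log_zeroing(SAMPLE))  # True
```

Matching on the error text rather than on an exit status keeps the sketch independent of how the check was launched.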
trurl Posted January 30
30 minutes ago, trurl said: If it asks for it, use -L
Post output.
Teekno Posted January 30
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
Log inconsistent (didn't find previous header)
failed to find log head
zero_log: cannot find log head/tail (xlog_find_tail=5)
        - scan filesystem freespace and inode maps...
clearing needsrepair flag and regenerating metadata
sb_fdblocks 220269811, counted 219118455
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 2
        - agno = 3
        - agno = 7
        - agno = 5
        - agno = 4
        - agno = 6
        - agno = 1
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
Maximum metadata LSN (24:107349) is ahead of log (1:2).
Format log to cycle 27.
done
trurl Posted January 30
Start the array in normal (not maintenance) mode and post diagnostics.
Teekno Posted January 30
OK. And I really appreciate the help!
tower-diagnostics-20240130-0826.zip
trurl Posted January 30 Share Posted January 30 Diagnostics shows disk5 mounted and with plenty of contents, rebuild underway with no apparent problems yet. Check your lost+found share for anything repair couldn't figure out. Quote Link to comment
Teekno Posted January 30
OK, thanks. The estimated rebuild time is currently around five days, but I'll see where it shakes out.
trurl Posted January 30
41 minutes ago, Teekno said: Currently rebuild time is around five days but I'll see where it shakes out.
Usual estimate is 2-3 hours per TB unless there are controller bottlenecks.
Teekno Posted January 30
Yeah, I thought that looked odd. It's running at about 28 MB/sec with periods down to 6 MB/sec.
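To put the two figures side by side, here is a small Python sketch of the arithmetic, assuming decimal units (1 TB = 1,000,000 MB) and a steady speed for the whole rebuild. The function names are made up for illustration, and the disk size is not stated in the thread, so the example works per TB.

```python
def rebuild_hours(disk_tb: float, speed_mb_s: float) -> float:
    """Hours to process a whole disk at a steady speed (decimal units)."""
    if speed_mb_s <= 0:
        raise ValueError("speed must be positive")
    total_mb = disk_tb * 1_000_000  # assumption: 1 TB = 1,000,000 MB
    return total_mb / speed_mb_s / 3600


def implied_speed_mb_s(hours_per_tb: float) -> float:
    """Average MB/s implied by an 'hours per TB' rule of thumb."""
    if hours_per_tb <= 0:
        raise ValueError("hours_per_tb must be positive")
    return 1_000_000 / (hours_per_tb * 3600)


if __name__ == "__main__":
    # The 2-3 hours per TB rule of thumb, expressed as a speed range.
    print(round(implied_speed_mb_s(3)), round(implied_speed_mb_s(2)))  # 93 139
    # One TB at the reported 28 MB/sec and at the 6 MB/sec dips.
    print(round(rebuild_hours(1, 28), 1))  # 9.9
    print(round(rebuild_hours(1, 6), 1))   # 46.3
```

Under these assumptions the rule of thumb corresponds to roughly 93 to 139 MB/sec, so a sustained 28 MB/sec is several times slower than usual, which is consistent with a bottleneck somewhere in the chain.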
Teekno Posted January 30
OK, one disk in particular is showing very high iowait, around 94%. I am thinking of stopping the rebuild, shutting down, and checking or replacing its cable. Does that sound like something that might work, or is there another approach I should try?
Edited January 30 by Teekno