Sandwich Posted March 11, 2021

Trying to set up an unRaid box to replace my Drobo 5N (which, thankfully, still works after 5+ years). I've got a Gigabyte GA-Z97MX-Gaming 5 motherboard (not sure what SATA controller it has, as the specs page doesn't seem to say). I have a WD Red 6Tb drive which was in the Drobo for 2.5 years and recently (supposedly) crashed; the Drobo ejected it from the array for some reason. I took it in for warranty replacement, where they did a thorough test that took a couple of days and reported the access time for each and every sector of the disk. It apparently passed with flying colors, so I'm not sure why it was ejected from the Drobo. In any case, I gave it a complete, full NTFS reformat in my Windows PC and ran `chkdsk /R`, both of which completed with no errors or issues. At this point I'm very confused as to why it was ejected from the Drobo.

So I plugged it into my otherwise-empty unRaid box to see what happens. As expected, it reported that the NTFS-formatted disk was unmountable, so I clicked to have unRaid format it. Now the "FS" column says "xfs", but it reports that the disk is "Unmountable: No file system". I have no idea what's going on. Finally, the disk log shows a lot of errors, but I don't know what's going on with those either. I'm attaching both the SMART report and the copied disk log (is there a better way to extract that log than just copy-paste?). Any ideas? Thanks so much for any help you can give me.

6tb disk log.txt
cube-smart-20210311-1119.zip
JorgeB Posted March 11, 2021

Try replacing or swapping both cables; if there are still issues, please post the complete diags: Tools -> Diagnostics
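(If it's easier from a terminal: running the `diagnostics` command should produce the same zip in the logs folder on the flash drive, and a single drive's SMART report can be saved with smartctl. A minimal sketch, where the device name is just an example:

smartctl -a /dev/sdb > /boot/smart-report.txt

That also answers the copy-paste question above: redirecting command output to a file on /boot makes it easy to grab from the flash share.)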
Sandwich Posted March 11, 2021

I've swapped SATA cables with new ones out of the bag; no immediate change. I then stopped the array to try formatting the drive again. After about 2-3 minutes it just stopped formatting, still reporting "Unmountable: No file system", with a notification on the side:

Quote: Unraid array errors: 11-03-2021 18:22 Warning [CUBE] - array has errors. Array has 1 disk with read errors

The disk log is attached below, as is the full diagnostics as requested. Thank you so much for your time and assistance with this!

6tb new cables formatting stopped errors.txt
cube-diagnostics-20210311-1824.zip
JorgeB Posted March 11, 2021

If the new cables didn't help, it's almost certainly a disk problem. Sometimes errors are weird; I just recently had a disk that passed every SMART test, and you could make a perfect copy of it with dd, but try to mount it with any filesystem and you got error after error.
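(A dd read test along those lines, as a rough sketch; replace /dev/sdX with the actual device. With of=/dev/null nothing is written to the disk, it just forces every sector to be read, so any unreadable sector shows up as an I/O error:

dd if=/dev/sdX of=/dev/null bs=1M status=progress

)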
Sandwich Posted March 11, 2021

Hmm. Ok, here's the extra wrench in the works: when the Drobo kicked out the 6Tb, I immediately bought an 8Tb "replacement" before realizing the 6Tb was still under warranty. Then, since the Drobo was down to single-disk redundancy, I ran the 8Tb through a week-long `badblocks` test (which it passed), and then installed it into my empty unRaid box. It mounted fine, I made it a share (or whatever the terminology is around creating shares on drives), and began another week-long process: using `rsync` to copy over all the data (just under 6Tb of data) from the Drobo to the 8Tb drive in unRaid. That completed successfully, and for a while (5-10 days?), unRaid was working super-fast (relative to the slowpoke Drobo) on the network with just the 8Tb drive.

Then, just in the last few days, while trying to figure out the issue with the 6Tb drive, the 8Tb drive suddenly stopped being recognized by unRaid as well. The exact same reported issue: "Unmountable: No file system".

Does any of that shed any more light on what might be going on here? Is it possible that, instead of the drive being bad, the motherboard/controller is bad? But if so, why would it copy 6Tb of data over without a hitch? And if the drives are bad, why does every test under the sun, except attempting to mount them in unRaid, say the drives are fine? 🤔
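(For reference, the rough shape of the badblocks and rsync steps above; the device name and paths here are placeholders, and note that badblocks -w is a destructive write test, only safe because the 8Tb was empty:

badblocks -b 4096 -wsv /dev/sdX
rsync -avh --progress /mnt/drobo/ /mnt/user/data/

The -b 4096 keeps the block count within badblocks' 32-bit limit on drives this large; -w is the four-pass write test, -s shows progress, -v is verbose.)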
JorgeB Posted March 11, 2021

If more drives are causing similar ATA errors then there might be some other issue; diags showing that might help.
Sandwich Posted March 11, 2021

19 minutes ago, JorgeB said: "diags showing that might help."

Ok, so tell me what to do.
JorgeB Posted March 11, 2021

In the diags posted there's only 1 disk assigned; post new ones, taken after there's been a problem mounting the other disk.
Sandwich Posted March 11, 2021

Ah, of course, sorry! Here:

cube-diagnostics-20210311-1945.zip
JorgeB Posted March 11, 2021

Disk2 only shows filesystem corruption, not ATA errors; that should be fixable by checking the filesystem: https://wiki.unraid.net/Check_Disk_Filesystems#Checking_and_fixing_drives_in_the_webGui
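(The webGui check runs xfs_repair under the hood; the rough command-line equivalent, assuming the array is started in maintenance mode and this is disk2, would be:

xfs_repair -n /dev/md2

where -n means check only, making no modifications. Running it against the md device rather than the raw disk keeps parity in sync.)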
Sandwich Posted March 11, 2021

Odd... the filesystem of the 8Tb shows as "auto".

EDIT: Ahh, but I do have a screenshot of that drive in unRaid 3 days ago, showing it as having `xfs`. I'll continue on that assumption.
Sandwich Posted March 11, 2021

It's not showing me the option to check filesystem status (presumably because the detected filesystem is "auto"). :-/
JorgeB Posted March 11, 2021

Click on that disk (with the array stopped) and change fs to xfs.
Sandwich Posted March 11, 2021 Author Share Posted March 11, 2021 Phase 1 - find and verify superblock... - block cache size set to 703632 entries Phase 2 - using internal log - zero log... zero_log: head block 2058495 tail block 2058491 ALERT: The filesystem has valuable metadata changes in a log which is being ignored because the -n option was used. Expect spurious inconsistencies which may be resolved by first mounting the filesystem to replay the log. - scan filesystem freespace and inode maps... - found root inode chunk Phase 3 - for each AG... - scan (but don't clear) agi unlinked lists... - process known inodes and perform inode discovery... - agno = 0 - agno = 1 - agno = 2 - agno = 3 - agno = 4 - agno = 5 - agno = 6 - agno = 7 - process newly discovered inodes... Phase 4 - check for duplicate blocks... - setting up duplicate extent list... - check for inodes claiming duplicate blocks... - agno = 2 - agno = 0 - agno = 3 - agno = 1 - agno = 4 - agno = 5 - agno = 6 - agno = 7 No modify flag set, skipping phase 5 Phase 6 - check inode connectivity... - traversing filesystem ... - agno = 0 - agno = 1 - agno = 2 - agno = 3 - agno = 4 - agno = 5 - agno = 6 - agno = 7 - traversal finished ... - moving disconnected inodes to lost+found ... Phase 7 - verify link counts... No modify flag set, skipping filesystem flush and exiting. XFS_REPAIR Summary Thu Mar 11 20:28:53 2021 Phase Start End Duration Phase 1: 03/11 20:26:33 03/11 20:26:34 1 second Phase 2: 03/11 20:26:34 03/11 20:26:34 Phase 3: 03/11 20:26:34 03/11 20:27:46 1 minute, 12 seconds Phase 4: 03/11 20:27:46 03/11 20:27:46 Phase 5: Skipped Phase 6: 03/11 20:27:46 03/11 20:28:53 1 minute, 7 seconds Phase 7: 03/11 20:28:53 03/11 20:28:53 Total run time: 2 minutes, 20 seconds Does the above indicate it found issues that need to be repaired? The manual seemed to indicate there'd be a clearly stated option to use for re-running the check command, but I don't see anything that matches that description above. Quote Link to comment
itimpi Posted March 11, 2021

You need to rerun the check, removing the -n (no modify) flag so that fixing is allowed, and add the -L flag.
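(In the webGui that means replacing -n with -L in the options box; from the command line the rough equivalent, again assuming disk2, would be:

xfs_repair -L /dev/md2

Note that -L zeroes the metadata log, which can drop the most recent in-flight changes, so it's only used when the log can't be replayed by mounting.)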
Sandwich Posted March 11, 2021 Author Share Posted March 11, 2021 Geez, now I know where all that Holywood tech speak in movies comes from. O.O Phase 1 - find and verify superblock... - block cache size set to 703632 entries Phase 2 - using internal log - zero log... zero_log: head block 2058495 tail block 2058491 ALERT: The filesystem has valuable metadata changes in a log which is being destroyed because the -L option was used. - scan filesystem freespace and inode maps... - found root inode chunk Phase 3 - for each AG... - scan and clear agi unlinked lists... - process known inodes and perform inode discovery... - agno = 0 - agno = 1 - agno = 2 - agno = 3 - agno = 4 - agno = 5 - agno = 6 - agno = 7 - process newly discovered inodes... Phase 4 - check for duplicate blocks... - setting up duplicate extent list... - check for inodes claiming duplicate blocks... - agno = 0 - agno = 3 - agno = 2 - agno = 1 - agno = 4 - agno = 5 - agno = 6 - agno = 7 Phase 5 - rebuild AG headers and trees... - agno = 0 - agno = 1 - agno = 2 - agno = 3 - agno = 4 - agno = 5 - agno = 6 - agno = 7 - reset superblock... Phase 6 - check inode connectivity... - resetting contents of realtime bitmap and summary inodes - traversing filesystem ... - agno = 0 - agno = 1 - agno = 2 - agno = 3 - agno = 4 - agno = 5 - agno = 6 - agno = 7 - traversal finished ... - moving disconnected inodes to lost+found ... Phase 7 - verify and correct link counts... Maximum metadata LSN (1:2058555) is ahead of log (1:2). Format log to cycle 4. XFS_REPAIR Summary Thu Mar 11 20:48:18 2021 Phase Start End Duration Phase 1: 03/11 20:45:54 03/11 20:45:54 Phase 2: 03/11 20:45:54 03/11 20:46:07 13 seconds Phase 3: 03/11 20:46:07 03/11 20:47:03 56 seconds Phase 4: 03/11 20:47:03 03/11 20:47:03 Phase 5: 03/11 20:47:03 03/11 20:47:03 Phase 6: 03/11 20:47:03 03/11 20:47:52 49 seconds Phase 7: 03/11 20:47:52 03/11 20:47:52 Total run time: 1 minute, 58 seconds done Great, so... now what? Do I stop the array from maintenance mode and restart normally? Currently, the Main screen still shows both disks as "Unmountable: No file system", although at least there's progress that the 8Tb's FS is "xfs" now instead of just "auto". ¯\_(ツ)_/¯ Also, if at any point in all this there's any indication whether the issue would be due to a failing drive vs failing MB vs random, please do let me know. Quote Link to comment
itimpi Posted March 11, 2021

Yes, stop the array and restart in normal mode. The status does not get changed until the system next tries to mount the drive, and I would expect the disk to now mount OK.
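(Since the repair output above mentioned "moving disconnected inodes to lost+found", it may also be worth checking for a lost+found folder on that disk once it mounts, in case any files were orphaned; the path here assumes the 8Tb is disk2:

ls /mnt/disk2/lost+found

)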
Sandwich Posted March 11, 2021

Ok, so that does seem to have brought the 8Tb back, although I'm still scratching my head as to how it got borked in the first place. Nevertheless, thank you!!

As for the original issue, the 6Tb drive... as best as you can tell, that does seem to be either a cable or drive issue, correct? And since I swapped out to new cables....
Sandwich Posted March 12, 2021

BTW, when I run a check on the 6Tb drive with `-n`, the result is this (canceled after a few minutes of whole-lotta-nothin):

Phase 1 - find and verify superblock...
bad primary superblock - filesystem mkfs-in-progress bit set !!!

attempting to find secondary superblock...
.......................................................................................................................................................................................

And the line with the dots just grows and grows.
itimpi Posted March 12, 2021

Did you do this from the command line? If so, what is the exact command that you used?
Sandwich Posted March 12, 2021

No, from running the array in maintenance mode, clicking the disk, and running the check.
itimpi Posted March 12, 2021

25 minutes ago, Sandwich said: "No, from running the array in maintenance mode, clicking the disk, and running the check."

I have never seen that particular error message before, so I do not know what it means. I think you are going to need to let it scan the disk to see if it can find a valid superblock (which can take hours on a large disk). Maybe someone else will have a suggestion?
JorgeB Posted March 12, 2021

That disk appears to be failing, and xfs failed to format it correctly.
Sandwich Posted March 12, 2021

Ok, that's actually good news (since it's both empty and under warranty); it means there was a reason the Drobo kicked it out, and a reason unRAID can't make use of it. Thank you all for your help, you've been great!!
Sandwich Posted March 18, 2021

Bit of an update and further puzzlement: I've gotten the 6Tb drive replaced, and the tech at the store assured me that the drive was error-free. Great! Same thing he said about the original 6Tb drive. 🙄

So I installed the replacement 6Tb and, unlike the original one, unRAID was able to format it. Yay! Then, since I need the largest drive in the array to be the parity disk, I `rsync`ed everything from the 8Tb (which had everything `rsync`ed from the Drobo 5N) to the replacement 6Tb. That process took about half a day and completed successfully. I then created a new drive config (Tools -> New Config) and assigned the 8Tb as the parity disk and the 6Tb as data. It started to rebuild the parity on the 8Tb, which it said would take about a day, so I left it like that last night.

This morning I came back to find that the process had paused partway through, with an error notification on the screen. If I click on the disk log, the modal window just loads screen after screen of log rows, with errors scattered everywhere.

At this point, I have no idea what to think. Full diagnostic attached.

cube-diagnostics-20210318-0833.zip