live4soccer7 Posted June 30, 2022

I woke up this morning to Docker and VM Manager being down. The cache drive (XFS filesystem), which these files are stored on, is now "unmountable" (wrong or no file system). I put the array into maintenance mode and attempted a repair; you can see the results below. My backup of it is from a few months ago, so I can recover most of what was on it, but if possible I would definitely like to get this functional again.

On the "Main" menu it reads out the temperature, says active and "Healthy". I'm not sure if these are accurate or just the last readings of the disk.

I was on 6.10.0. I pushed the update to 6.10.3 today (the disk failed before this). Any help would be greatly appreciated. I use the VM for work and have lots of other things down that are somewhat important and time sensitive.

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ALERT: The filesystem has valuable metadata changes in a log which is
being ignored because the -n option was used.  Expect spurious
inconsistencies which may be resolved by first mounting the filesystem
to replay the log.
        - scan filesystem freespace and inode maps...
agi unlinked bucket 23 is 73394135 in ag 1 (inode=1147135959)
sb_icount 1021248, counted 1021376
sb_ifree 8167, counted 6971
sb_fdblocks 164425577, counted 166053595
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 1
        - agno = 0
        - agno = 2
        - agno = 3
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
disconnected inode 1147135959, would move to lost+found
Phase 7 - verify link counts...
would have reset inode 1147135959 nlinks from 0 to 1
No modify flag set, skipping filesystem flush and exiting.
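For reference, the output above is from xfs_repair's check-only mode; from the Unraid console the equivalent command would be something like this (the device name is an assumption, verify which partition holds your cache pool first):

    xfs_repair -n /dev/nvme0n1p1

The -n flag reports problems without writing anything, which is why the log ends with "No modify flag set".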
JorgeB Posted June 30, 2022

Run it again without -n, or nothing will be done.
JorgeB Posted June 30, 2022

And use -L if it asks for it.
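A typical invocation from the console would be something like this (device name is an assumption; substitute the partition your cache pool lives on):

    xfs_repair /dev/nvme0n1p1

and, if it complains about the dirty log and asks for it:

    xfs_repair -L /dev/nvme0n1p1

Note that -L zeroes the log, so any metadata changes still sitting in it are lost.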
live4soccer7 Posted June 30, 2022

Thanks, running now.
live4soccer7 Posted June 30, 2022

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ALERT: The filesystem has valuable metadata changes in a log which is
being destroyed because the -L option was used.
        - scan filesystem freespace and inode maps...
clearing needsrepair flag and regenerating metadata
agi unlinked bucket 23 is 73394135 in ag 1 (inode=1147135959)
sb_icount 1021248, counted 1021376
sb_ifree 8167, counted 6971
sb_fdblocks 164425577, counted 166053595
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 1
        - agno = 3
        - agno = 2
        - agno = 0
clearing reflink flag on inodes when possible
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
disconnected inode 1147135959, moving to lost+found
Phase 7 - verify and correct link counts...
Maximum metadata LSN (72:1879323) is ahead of log (1:2).
Format log to cycle 75.
xfs_repair: Flushing the data device failed, err=61!
Cannot clear needsrepair due to flush failure, err=61.
xfs_repair: Flushing the data device failed, err=61!
fatal error -- File system metadata writeout failed, err=61.  Re-run xfs_repair.
JorgeB Posted June 30, 2022

Try again; xfs_repair should always finish, with more or less data loss. Do you remember if the device had some free space, or was it close to fully used?
live4soccer7 Posted June 30, 2022

It is a 2TB NVMe drive; it should have about 1TB free. Should I run it with any flags?
live4soccer7 Posted June 30, 2022

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
clearing needsrepair flag and regenerating metadata
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 2
        - agno = 3
        - agno = 1
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
xfs_repair: Flushing the data device failed, err=61!
Cannot clear needsrepair due to flush failure, err=61.
xfs_repair: Flushing the data device failed, err=61!
fatal error -- File system metadata writeout failed, err=61.  Re-run xfs_repair.

I ran it without any flags a couple of times and got this. It's always possible a log or something filled the drive up and caused this problem.
live4soccer7 Posted June 30, 2022

Is there a way to check previous notifications in Unraid? I may be able to see if something filled it to the "gills" last night. If not, there should be a lot of free space.
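For the drive itself, the full NVMe SMART report can be pulled from the console with something like this (controller name is an assumption; adjust for your system):

    smartctl -a /dev/nvme0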
live4soccer7 Posted June 30, 2022 Author Share Posted June 30, 2022 smartctl 7.3 2022-02-28 r5338 [x86_64-linux-5.15.46-Unraid] (local build) Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org === START OF INFORMATION SECTION === Model Number: Samsung SSD 980 PRO 2TB Serial Number: S6B0NG0R405728R Firmware Version: 2B2QGXA7 PCI Vendor/Subsystem ID: 0x144d IEEE OUI Identifier: 0x002538 Total NVM Capacity: 2,000,398,934,016 [2.00 TB] Unallocated NVM Capacity: 0 Controller ID: 6 NVMe Version: 1.3 Number of Namespaces: 1 Namespace 1 Size/Capacity: 2,000,398,934,016 [2.00 TB] Namespace 1 Utilization: 1,496,877,862,912 [1.49 TB] Namespace 1 Formatted LBA Size: 512 Namespace 1 IEEE EUI-64: 002538 b41150549f Local Time is: Thu Jun 30 09:39:16 2022 PDT Firmware Updates (0x16): 3 Slots, no Reset required Optional Admin Commands (0x0017): Security Format Frmw_DL Self_Test Optional NVM Commands (0x0057): Comp Wr_Unc DS_Mngmt Sav/Sel_Feat Timestmp Log Page Attributes (0x0f): S/H_per_NS Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg Maximum Data Transfer Size: 128 Pages Warning Comp. Temp. Threshold: 82 Celsius Critical Comp. Temp. Threshold: 85 Celsius Supported Power States St Op Max Active Idle RL RT WL WT Ent_Lat Ex_Lat 0 + 8.49W - - 0 0 0 0 0 0 1 + 4.48W - - 1 1 1 1 0 200 2 + 3.18W - - 2 2 2 2 0 1000 3 - 0.0400W - - 3 3 3 3 2000 1200 4 - 0.0050W - - 4 4 4 4 500 9500 Supported LBA Sizes (NSID 0x1) Id Fmt Data Metadt Rel_Perf 0 + 512 0 0 === START OF SMART DATA SECTION === SMART overall-health self-assessment test result: FAILED! - available spare has fallen below threshold - media has been placed in read only mode SMART/Health Information (NVMe Log 0x02) Critical Warning: 0x09 Temperature: 36 Celsius Available Spare: 0% Available Spare Threshold: 10% Percentage Used: 56% Data Units Read: 3,017,526,991 [1.54 PB] Data Units Written: 2,839,436,501 [1.45 PB] Host Read Commands: 5,464,158,312 Host Write Commands: 4,063,944,349 Controller Busy Time: 45,841 Power Cycles: 457 Power On Hours: 4,340 Unsafe Shutdowns: 28 Media and Data Integrity Errors: 9,994 Error Information Log Entries: 9,994 Warning Comp. Temperature Time: 0 Critical Comp. Temperature Time: 0 Temperature Sensor 1: 36 Celsius Temperature Sensor 2: 49 Celsius Error Information (NVMe Log 0x01, 16 of 64 entries) No Errors Logged Quote Link to comment
live4soccer7 Posted June 30, 2022

Could the "FAILED" status have anything to do with the unmountable drive/filesystem? I wouldn't think so, but figured I would ask. The frontend says it is healthy, so I'm not sure why that would be.
JorgeB Posted June 30, 2022

4 minutes ago, live4soccer7 said:
    Could the "FAILED" status have anything to do with the unmountable drive/filesystem?

9 minutes ago, live4soccer7 said:
    - media has been placed in read only mode

The device is in read-only mode; that's why xfs_repair is failing to write the corrections. You'll need to replace it and restore the data from backups, if available.
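For what it's worth, those two SMART failure lines match the Critical Warning value in the health log above; per the NVMe spec it is a bit field (standard bit assignments, not Unraid-specific), so 0x09 decodes as:

    bit 0 (0x01) = available spare has fallen below threshold
    bit 3 (0x08) = media has been placed in read-only mode
    0x01 + 0x08 = 0x09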
live4soccer7 Posted June 30, 2022

What would cause it to go into read-only mode? If it is in read-only mode, can I still extract data off it? I do have a backup, but it is from February, so not terribly new. Still way better than no backup.
JorgeB Posted June 30, 2022

6 minutes ago, live4soccer7 said:
    What would cause it to go into read-only mode?

31 minutes ago, live4soccer7 said:
    - available spare has fallen below threshold

7 minutes ago, live4soccer7 said:
    If it is in read-only mode, can I still extract data off it?

It would be easy if the filesystem was still mounting. Since it's not, and it can't be fixed, there are basically two options: use a file recovery utility like UFS Explorer, or clone it to another device and then run xfs_repair.
live4soccer7 Posted June 30, 2022

Thanks! Is "available spare" referring to space?
JorgeB Posted June 30, 2022

Flash devices come with some spare cells to replace ones that turn bad. For that device, once the spare space gets below 10% you get a SMART warning; it's now at 0%, which I assume is the reason the device is read-only.

1 hour ago, live4soccer7 said:
    Percentage Used: 56%

Also note that according to this the device was only a little over half way through its predicted life, but this is just an indication; I have one currently at 187% and still going strong.
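To keep an eye on just those wear fields on a replacement drive, something like this works from the console (device name is an assumption):

    smartctl -A /dev/nvme0 | grep -Ei 'spare|percentage used'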
live4soccer7 Posted June 30, 2022

Thanks, that makes sense. How are you determining it is past its predicted life when it says Percentage Used: 56%? I'm trying to learn on this one, not questioning your statement.
JorgeB Posted June 30, 2022

2 minutes ago, live4soccer7 said:
    How are you determining it is past its predicted life when it says Percentage Used: 56%?

Sorry, typo; I meant a little over half way, not past.
live4soccer7 Posted June 30, 2022

OK, thank you for the clarification. This drive will get RMA'd for sure; it is just over a year old. If using a recovery tool like UFS Explorer, would this impair the ability to clone the drive? Just wondering which one I should attempt first.
live4soccer7 Posted June 30, 2022

What would be the preferred program or method to clone it? I am handy and comfortable in the terminal if it is possible to clone/copy it to the array, or I can move it to a Windows machine and clone it to a mechanical drive there. Thanks.
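For a failing device, GNU ddrescue is the usual tool, since it logs bad areas and can resume. A first pass cloning to a spare disk would look something like this (device names are assumptions; the destination is overwritten, so triple-check it):

    ddrescue -f -n /dev/nvme0n1 /dev/sdX /boot/ddrescue.map

The -n flag skips the slow scraping of bad areas on the first pass; re-running the same command without it retries them, and the map file lets ddrescue pick up where it left off.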
live4soccer7 Posted July 1, 2022

Thanks! Running ddrescue now. Fingers crossed!
JorgeB Posted July 1, 2022

12 hours ago, live4soccer7 said:
    If using a recovery tool like UFS Explorer, would this impair the ability to clone the drive?

No, but if you have a spare device available, cloning would be my first option; no need to buy another program. Just note that if you clone to a larger device it won't mount with Unraid, but you can use UD (Unassigned Devices).
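Once the clone finishes, the repair can be re-run against the clone, which is writable (device and mount point are assumptions):

    xfs_repair -L /dev/sdX1
    mkdir -p /mnt/recovered
    mount /dev/sdX1 /mnt/recovered

or simply mount the cloned disk through the Unassigned Devices plugin and copy the data off.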