June 17, 2018

Hi,

I was having trouble copying a folder off my Unraid server via Samba to Windows 10. I was getting frustrated because I kept getting error 0x8007003B on my Windows machine, on that folder only. I must have mis-clicked and deleted most of the files on my array. I know this is usually the end, and it very well might be for me. I had years of photos in there, some of them backed up to my Google Drive.

I got to digging and noticed that my free space hasn't opened up. A little searching on Google suggested that my Samba server likely had those files tied up as in use, and that they could be sitting somewhere as fuse_hidden XXXX files waiting for a reboot to clear.

I looked at my log to try and see what happened, and it says

Fatal error: Allowed memory size of 134217728 bytes exhausted

so no log, or at least not a full one.

I have Unraid v6.5.2. I have not rebooted the server or unmounted the array. The only thing I have done is stop all Dockers and VMs. I have found the Recycle Bin plugin and will be using it in the future.

Is there anything I can do to recover this? As I understand it, the FUSE files are holding the disk space until a reboot or remount (hopefully I'm correct about how that works). Are there any options, including pulling the drives and trying to recover the files that way? And what info do you need from me? Is UFS Explorer or Raise Data Recovery an option?

Thank you
Ryan

Edited June 17, 2018 by Trippett
June 17, 2018

Does lsof list any process with a large number of open files that are under "mnt"?

Processes that have open files expose the file handles of those files as /proc/<pid>/fd/<fd>

For normal files and directories, the fd appears in the directory listing as a link to the name of the file that was opened. But the link can also claim that the file descriptor points to an anonymous inode.

Anyway, you want to be very careful about introducing any writes to the disks involved in this incident, to maximize the chances that recovery software can find the deleted files.

And you should put at the top of your list the fact that parity is not a replacement for backup. Parity is a way to improve the availability. Anything you don't want to lose should be stored on multiple media and really recommended to be stored at multiple physical locations. Backup should not be seen as an optional extra step unless the intended goal is a scientific project to evaluate the time until data loss.
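For anyone who wants to script the /proc/<pid>/fd check described above, here is a minimal Python sketch. It assumes a Linux system, where the link for a file that was deleted while still open reads as the old path followed by " (deleted)". The function name, the "/mnt/" prefix and the proc_root parameter are my own choices for illustration and are not part of Unraid. It only reads /proc, so it does not write to the array disks.

```python
import os


def find_deleted_open_files(proc_root="/proc", prefix="/mnt/"):
    """Return (pid, fd, target) for open descriptors whose file was deleted.

    Assumption: on Linux, /proc/<pid>/fd/<fd> is a symlink whose target ends
    with " (deleted)" when the file was unlinked while still open.
    """
    hits = []
    for pid in os.listdir(proc_root):
        if not pid.isdigit():
            continue  # skip non-process entries such as "meminfo" or "self"
        fd_dir = os.path.join(proc_root, pid, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we are not allowed to look
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor was closed in the meantime
            # Anonymous inodes (e.g. "anon_inode:[eventpoll]") fail the prefix test.
            if target.startswith(prefix) and target.endswith(" (deleted)"):
                hits.append((int(pid), int(fd), target))
    return hits
```

Run as root and print the result of find_deleted_open_files() to see which processes, if any, still hold deleted files under /mnt. An empty list means nothing is being held open this way, at least as far as /proc can show.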
June 17, 2018

Post your full diagnostics and don't do anything else until someone who actually is good at this stuff comes along (not me). I suspect there's a good chance you'll get them recovered, as you haven't tried X, Y & Z in a panic like some do.
June 17, 2018 Author

Here is my diagnostics file. I did go ahead and stop the array to be absolutely sure that there was no more write activity. After reading this forum post, it seems that I do stand an OK chance of recovering the files, but I am in no hurry, as I will have to buy some drives to put the data on (6 TB worth).

tower-diagnostics-20180617-0848.zip

Edited June 17, 2018 by Trippett
June 17, 2018 Author

The logs would seem to indicate that I have a dying drive... I think. They're chock full of this:

Jun 6 10:02:46 Tower kernel: ffff88014f507000: 42 4d 41 33 00 00 00 d6 ff ff ff ff ff ff ff ff BMA3............
Jun 6 10:02:46 Tower kernel: ffff88014f507010: 00 00 00 00 35 e7 a4 69 00 00 00 01 8c 0d d3 88 ....5..i........
Jun 6 10:02:46 Tower kernel: ffff88014f507020: 00 00 00 04 00 36 02 69 3d e1 46 51 eb 9c 45 9f .....6.i=.FQ..E.
Jun 6 10:02:46 Tower kernel: ffff88014f507030: 81 73 3b e4 ea fb 7a 48 00 00 00 01 ae a9 77 5f .s;...zH......w_
Jun 6 10:02:46 Tower kernel: XFS (md2): metadata I/O error: block 0x18c0dd388 ("xfs_trans_read_buf_map") error 74 numblks 8
Jun 6 10:02:46 Tower kernel: XFS (md2): Corruption warning: Metadata has LSN (4:3539561) ahead of current LSN (1:2507828). Please unmount and run xfs_repair (>= v4.3) to resolve.
Jun 6 10:02:46 Tower kernel: XFS (md2): Metadata CRC error detected at xfs_buf_ioend+0x49/0x9c [xfs], xfs_bmbt block 0x18c0dd388
Jun 6 10:02:46 Tower kernel: XFS (md2): Unmount and run xfs_repair
Jun 6 10:02:46 Tower kernel: XFS (md2): First 64 bytes of corrupted metadata buffer:
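When a syslog is "chock full" of lines like the ones above, a quick tally per device can show which disk the filesystem is complaining about. The following is a rough Python sketch, assuming the kernel lines keep the "XFS (device): message" shape seen in this excerpt. The function name and the return format are hypothetical, chosen for illustration only.

```python
import re
from collections import Counter

# Matches kernel lines of the form "... XFS (md2): some message"
XFS_LINE = re.compile(r"XFS \((?P<dev>\w+)\): (?P<msg>.+)")


def summarize_xfs_errors(log_text):
    """Count XFS kernel messages per device and list devices asking for xfs_repair."""
    counts = Counter()
    needs_repair = set()
    for line in log_text.splitlines():
        match = XFS_LINE.search(line)
        if not match:
            continue  # hex dump lines and unrelated messages are ignored
        counts[match.group("dev")] += 1
        if "xfs_repair" in match.group("msg"):
            needs_repair.add(match.group("dev"))
    return dict(counts), sorted(needs_repair)
```

Fed the excerpt above, this would count five XFS messages, all on md2, and flag md2 as asking for a repair. Note that these messages come from the filesystem layer, so on their own they point at metadata corruption on that device rather than proving the physical drive is dying.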
June 17, 2018 Author

I mounted the array in maintenance mode and ran the check below. I believe I may be fighting some file system corruption as well as human error, and I'm not sure how much each contributes. I believe most of my missing or deleted data is on disk 2 and disk 3. At this point I'm not entirely sure whether the data was just deleted or became lost.

This is the output of xfs_repair with the -nv flags:

DISK 1

Phase 1 - find and verify superblock...
        - block cache size set to 248912 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 1864057 tail block 1864057
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 2
        - agno = 1
        - agno = 3
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.

        XFS_REPAIR Summary    Sun Jun 17 10:14:06 2018

Phase      Start            End              Duration
Phase 1:   06/17 10:13:57   06/17 10:13:57
Phase 2:   06/17 10:13:57   06/17 10:13:58   1 second
Phase 3:   06/17 10:13:58   06/17 10:14:04   6 seconds
Phase 4:   06/17 10:14:04   06/17 10:14:04
Phase 5:   Skipped
Phase 6:   06/17 10:14:04   06/17 10:14:06   2 seconds
Phase 7:   06/17 10:14:06   06/17 10:14:06

Total run time: 9 seconds

DISK 2

Phase 1 - find and verify superblock...
        - block cache size set to 241416 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 2508027 tail block 2507843
ALERT: The filesystem has valuable metadata changes in a log which is being
ignored because the -n option was used. Expect spurious inconsistencies
which may be resolved by first mounting the filesystem to replay the log.
        - scan filesystem freespace and inode maps...
sb_fdblocks 486913653, counted 487894697
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
Maximum metadata LSN (31:1316489) is ahead of log (1:2508027).
Would format log to cycle 34.
No modify flag set, skipping filesystem flush and exiting.

        XFS_REPAIR Summary    Sun Jun 17 10:19:07 2018

Phase      Start            End              Duration
Phase 1:   06/17 10:17:06   06/17 10:17:06
Phase 2:   06/17 10:17:06   06/17 10:17:09   3 seconds
Phase 3:   06/17 10:17:09   06/17 10:18:42   1 minute, 33 seconds
Phase 4:   06/17 10:18:42   06/17 10:18:43   1 second
Phase 5:   Skipped
Phase 6:   06/17 10:18:43   06/17 10:19:07   24 seconds
Phase 7:   06/17 10:19:07   06/17 10:19:07

Total run time: 2 minutes, 1 second

DISK 3

Phase 1 - find and verify superblock...
        - block cache size set to 241376 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 1412704 tail block 1412704
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.

        XFS_REPAIR Summary    Sun Jun 17 10:27:48 2018

Phase      Start            End              Duration
Phase 1:   06/17 10:23:33   06/17 10:23:33
Phase 2:   06/17 10:23:33   06/17 10:23:36   3 seconds
Phase 3:   06/17 10:23:36   06/17 10:26:54   3 minutes, 18 seconds
Phase 4:   06/17 10:26:54   06/17 10:26:55   1 second
Phase 5:   Skipped
Phase 6:   06/17 10:26:55   06/17 10:27:48   53 seconds
Phase 7:   06/17 10:27:48   06/17 10:27:48

Total run time: 4 minutes, 15 seconds

DISK 4

Phase 1 - find and verify superblock...
        - block cache size set to 248808 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 969082 tail block 969082
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 3
        - agno = 2
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.

        XFS_REPAIR Summary    Sun Jun 17 10:24:55 2018

Phase      Start            End              Duration
Phase 1:   06/17 10:23:43   06/17 10:23:43
Phase 2:   06/17 10:23:43   06/17 10:23:44   1 second
Phase 3:   06/17 10:23:44   06/17 10:24:20   36 seconds
Phase 4:   06/17 10:24:20   06/17 10:24:21   1 second
Phase 5:   Skipped
Phase 6:   06/17 10:24:21   06/17 10:24:55   34 seconds
Phase 7:   06/17 10:24:55   06/17 10:24:55

Total run time: 1 minute, 12 seconds

Edited June 17, 2018 by Trippett
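With four disks' worth of dry-run output, the interesting parts are easy to miss. Below is a small Python sketch that pulls out the two signals that set disk 2 apart in the output above: a log head block that differs from the tail block (the case in which xfs_repair printed its ALERT about unreplayed log changes) and a mismatch between the superblock's free block count and the counted value. The function name and the result keys are made up for illustration, and it expects the output of one disk at a time.

```python
import re


def check_repair_dry_run(output):
    """Extract dirty-log and free-block signals from one disk's `xfs_repair -n` output."""
    result = {"dirty_log": False, "free_block_delta": None}

    # Head and tail of the log differ when there are changes not yet replayed.
    log = re.search(r"zero_log: head block (\d+) tail block (\d+)", output)
    if log and log.group(1) != log.group(2):
        result["dirty_log"] = True

    # Superblock free block count versus what the scan actually counted.
    free = re.search(r"sb_fdblocks (\d+), counted (\d+)", output)
    if free:
        result["free_block_delta"] = int(free.group(2)) - int(free.group(1))

    return result
```

For the disk 2 output this reports a dirty log and a difference of 981044 blocks, while disks 1, 3 and 4 come back clean on both counts. As the ALERT text itself says, some inconsistencies reported with -n may be spurious until the log has been replayed by mounting the filesystem.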
June 17, 2018 Author

I ran xfs_repair -v on disks 1, 3 and 4, and xfs_repair -vL on disk 2, and I have everything back. It looks like this was less human error and more file system corruption.

Thank you for your help
Ryan
June 17, 2018

I'm happy that you got your files back. But now would be a good time for:

8 hours ago, pwm said:
And you should put at the top of your list the fact that parity is not a replacement for backup. Parity is a way to improve the availability. Anything you don't want to lose should be stored on multiple media and really recommended to be stored at multiple physical locations. Backup should not be seen as an optional extra step unless the intended goal is a scientific project to evaluate the time until data loss.
This topic is now archived and is closed to further replies.