AngelEyes Posted February 2, 2020 (edited)

Hello, I woke to this message from my server this morning. It runs 24/7, so there was no unsafe shutdown or anything like that. I also have some messages in Fix Common Problems about failing to write to two other drives, which seem to be running ok. Running Unraid 6.7.2. Any help gratefully received. Thank you. Adam

server-diagnostics-20200202-0911.zip

Edited February 7, 2020 by AngelEyes
JorgeB Posted February 2, 2020

There were read errors on disks 10 and 12, and both look like an actual disk problem, though possibly an intermittent one. Since disk3 is disabled, it can't be emulated correctly while there are errors on other disks, hence the filesystem corruption. Reboot to clear the errors, check the filesystem on the emulated disk3, and watch for more read errors on the other disks. Even if there are no more errors for now on those disks, I would recommend using a new disk to rebuild disk3, in case it happens again and results in a corrupt rebuild.
AngelEyes Posted February 2, 2020

Thank you, I will order a couple of disks to be on the safe side and follow the instructions to check the file system.
AngelEyes Posted February 2, 2020

Ok, I ran the filesystem check with -nv, which showed some errors but no suggestions on what to do next, so I ran it again with no options, which seemed to correct the errors. However, the disk is now showing "partition format: error" and is unassigned with no file system. Any ideas?
JorgeB Posted February 2, 2020

Please post current diags.
AngelEyes Posted February 2, 2020

Ok, thank you.

server-diagnostics-20200202-1128.zip
JorgeB Posted February 2, 2020

More read errors on disks 10 and 12; run an extended SMART test on both. It's not possible to fix the filesystem or rebuild disk3 while these errors persist. There are also errors with disk3, but these look like a connection problem; check/replace cables, or if you plan to rebuild to a new disk, disconnect it.
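(For anyone following along: the extended test can be started from each disk's page in the webGUI, or from the console with smartctl. A sketch, where /dev/sdj and /dev/sdl are hypothetical device names; match them to disks 10 and 12 on your Main page first.)

```shell
# Kick off a long (extended) self-test on each suspect disk.
# The test runs inside the drive itself and can take many hours
# on large disks; the array can stay up while it runs.
for dev in /dev/sdj /dev/sdl; do
    if [ -b "$dev" ] && command -v smartctl >/dev/null; then
        smartctl -t long "$dev"
    fi
done

# Later, read the self-test log; the most recent run is entry #1 and
# should show "Completed without error":
# smartctl -l selftest /dev/sdj
```

The webGUI route (the self-test controls on the disk's SMART page) ends up running the same drive-internal test.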
AngelEyes Posted February 4, 2020

Ok, finally completed the extended SMART test on both drives; it says completed without error on both. I have included both results just in case, rebooted, and included a diagnostic report. What should I try next?

server-smart-20200202-1351 (1).zip
server-smart-20200204-0824.zip

Thank you.

server-diagnostics-20200204-0831.zip
JorgeB Posted February 4, 2020

OK, the disks appear to be fine, but you're still having controller issues; disk12 dropped offline:

Feb 4 08:30:03 Server kernel: ata3: SATA link down (SStatus 0 SControl 310)
Feb 4 08:30:03 Server kernel: ata3.00: disabled

And there are still read errors on disk10. Replace/swap cables on both disks 10 and 12, preferably connecting them to a different controller. Also:

On 2/2/2020 at 12:02 PM, johnnie.black said:
There are also errors with disk3, but these look like a connection problem, check/replace cables, or if you're plan to rebuild to a new disk disconnect it.
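(Link resets like these are easy to spot by grepping the syslog. A quick sketch; the sample snippet reuses the kernel lines quoted above, and on a live server you would point grep at /var/log/syslog instead.)

```shell
# Demo on a sample snippet; on Unraid, grep /var/log/syslog directly.
cat > /tmp/sample-syslog <<'EOF'
Feb  4 08:30:03 Server kernel: ata3: SATA link down (SStatus 0 SControl 310)
Feb  4 08:30:03 Server kernel: ata3.00: disabled
Feb  4 08:30:05 Server kernel: usb 1-1: new high-speed USB device number 2
EOF

# Lines matching these patterns usually point at the cable or controller
# (the ata link) rather than the platters themselves.
grep -E 'ata[0-9]+(\.[0-9]+)?: (SATA link down|disabled|failed command)' /tmp/sample-syslog
```

The ataN number maps to a controller port, which is another way to work out which physical connection is acting up.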
AngelEyes Posted February 4, 2020

Ok, thanks, I'll have a check. Is there a quick way to see that the connection is the problem? Also, how do I identify which disk is which physically? Just remove them and check the serial number? Thanks
JorgeB Posted February 4, 2020

15 minutes ago, AngelEyes said:
Is there a way to see quickly that the connection is a problem?

Best way is to replace/swap the cables to rule them out. This case is kind of strange in that the errors on both disks are logged like an actual disk problem, but according to the SMART tests there isn't one.

16 minutes ago, AngelEyes said:
Also, how do identify which disk is which physically? Just remove them and check serial number?

It's good to know which disk is which by where it sits in the case, but if you don't know, then yes, you need to check the serial numbers.
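(A sketch of the command-line route for matching the serials shown in the Unraid GUI to physical drives, so you only pull the one you need:)

```shell
# List block devices with size, model and serial; the Unraid GUI
# identifies each slot by the same serial number.
if command -v lsblk >/dev/null; then
    lsblk -d -o NAME,SIZE,MODEL,SERIAL
fi

# The by-id symlinks also encode model and serial in their names:
if [ -d /dev/disk/by-id ]; then
    ls /dev/disk/by-id/
fi
```

Writing the serial on a sticker on each caddy or bay the first time saves this dance later.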
AngelEyes Posted February 4, 2020

I appreciate your help, thank you.
AngelEyes Posted February 4, 2020

Ok, I replaced the SATA cables and now on boot I have this message:

Notice [] - array turned good
Array has 0 disks with read errors

Fix Common Problems shows no errors. How should I proceed next? Thanks.

server-diagnostics-20200204-1145.zip
AngelEyes Posted February 4, 2020

I also have this message, which refers to disk 3, the one that had the XFS filesystem problem:

Unraid device sdk SMART message [199]: 04-02-2020 11:46
Notice [SERVER] - udma crc error count returned to normal value
WDC_WD60EFAX-68SHWN0_WD-WX21D39PLYY0 (sdk)
JorgeB Posted February 4, 2020

Read errors on disks 10 and 12 appear to be resolved, at least for now. You need to check the filesystem on the emulated disk3: https://wiki.unraid.net/Check_Disk_Filesystems#Checking_and_fixing_drives_in_the_webGui

Remove -n or nothing will be done, and if it asks for -L, use it. If the emulated disk3 then mounts, check that the contents look correct, and also look in the lost+found folder for any partial or lost files. If all looks good, the next step is to rebuild the disk. The old disk3 looks fine and the problem was most likely cable related, but if possible I would recommend rebuilding to a new disk, in case something goes wrong during the rebuild; with the new cables there were no errors on boot, but there could still be some during the rebuild, which is much more I/O intensive.
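(A sketch of that sequence from the console, assuming disk3 maps to /dev/md3 — the md number follows the slot number — with the array started in maintenance mode. The Check button on the disk's page in the webGUI runs the same tool.)

```shell
DEV=/dev/md3   # emulated disk3; adjust the number to your slot

if [ -b "$DEV" ] && command -v xfs_repair >/dev/null; then
    # Dry run first: -n only reports what would be fixed.
    xfs_repair -n "$DEV"

    # Real repair: -n removed so the fixes are actually written.
    xfs_repair "$DEV"

    # Only if it refuses with a dirty-log message and suggests -L:
    # zeroing the log can lose the most recent metadata updates.
    # xfs_repair -L "$DEV"
fi
```

Repairing the md device rather than the raw sdX device keeps parity in sync, which matters here since disk3 is being emulated from parity.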
AngelEyes Posted February 4, 2020

Hi, it seemed to need to move a lot to lost+found; should I try to mount it?

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
out-of-order cnt btree record 2 (2337603 6) block 2/1
out-of-order cnt btree record 8 (112371903 5681) block 2/1
out-of-order cnt btree record 9 (58051330 487331) block 2/1
out-of-order cnt btree record 10 (59395320 1232111) block 2/1
out-of-order cnt btree record 11 (11282207 14058) block 2/1
invalid length 16012 in record 12 of cnt btree block 2/1
out-of-order cnt btree record 13 (65090439 24189) block 2/1
out-of-order cnt btree record 14 (73005734 278253) block 2/1
out-of-order cnt btree record 15 (76268660 277824) block 2/1
out-of-order cnt btree record 16 (74523991 52005) block 2/1
out-of-order cnt btree record 17 (119192546 1043443) block 2/1
out-of-order cnt btree record 19 (58910657 11037649) block 2/1
block (2,60009375-60009375) multiply claimed by cnt space tree, state - 2
block (2,65090466-65090466) multiply claimed by cnt space tree, state - 2
out-of-order cnt btree record 20 (5589277 10835157) block 2/1
block (2,11248432-11248432) multiply claimed by cnt space tree, state - 7
block (2,11282276-11282276) multiply claimed by cnt space tree, state - 2
out-of-order cnt btree record 22 (26107066 461) block 2/1
out-of-order cnt btree record 23 (26137120 196) block 2/1
out-of-order cnt btree record 24 (3006250 1479) block 2/1
agf_freeblks 14913719, counted 50299413 in ag 2
agf_longest 10794676, counted 11949980 in ag 2
agf_freeblks 23160413, counted 23160363 in ag 0
agf_freeblks 16470, counted 123049 in ag 1
agf_freeblks 1479880, counted 1479752 in ag 3
agf_freeblks 24190093, counted 24190128 in ag 7
agi_freecount 60, counted 37 in ag 2
agi_freecount 60, counted 37 in ag 2 finobt
agi_freecount 111, counted 96 in ag 7
agi_freecount 111, counted 96 in ag 7 finobt
agi_freecount 64, counted 63 in ag 0
agi_freecount 64, counted 63 in ag 0 finobt
agi_freecount 78, counted 68 in ag 1
agi_freecount 78, counted 68 in ag 1 finobt
agi_freecount 25, counted 36 in ag 3
agi_freecount 25, counted 36 in ag 3 finobt
sb_ifree 733, counted 763
sb_fdblocks 188439944, counted 219963766
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
Bad ctime nsec 1012814891 on inode 100, resetting to zero
imap claims a free inode 101 is in use, correcting imap and clearing inode
cleared inode 101
imap claims a free inode 103 is in use, correcting imap and clearing inode
cleared inode 103
Bad ctime nsec 1037099873 on inode 107, resetting to zero
Bad ctime nsec 1037265133 on inode 122, resetting to zero
        - agno = 1
Bad mtime nsec 1046438165 on inode 1073741921, resetting to zero
data fork in ino 1749280503 claims free block 342959447
data fork in ino 1749280506 claims free block 353058787
data fork in ino 1749280506 claims free block 353058788
data fork in ino 1749280506 claims free block 353517899
data fork in ino 1749280506 claims free block 353517900
        - agno = 2
corrected directory 2149391474 size, was 207, now 78
Bad ctime nsec 1029773995 on inode 2149391479, resetting to zero
data fork in ino 2149394521 claims free block 269494593
data fork in ino 2149394526 claims free block 270773059
data fork in ino 2149394526 claims free block 271441706
data fork in ino 2149394526 claims free block 274024733
data fork in ino 2156096933 claims free block 326486786
data fork in ino 2156096933 claims free block 327830776
data fork in ino 2156096933 claims free block 327830777
data fork in ino 2156096933 claims free block 332151002
data fork in ino 2156096933 claims free block 332151003
data fork in ino 2156097306 claims free block 333525895
data fork in ino 2156097306 claims free block 333525896
data fork in ino 2156124920 claims free block 327346113
data fork in ino 2228743995 claims free block 282312399
data fork in ino 2228743995 claims free block 282312400
data fork in ino 2228743995 claims free block 284515493
data fork in ino 2228743995 claims free block 284515494
data fork in ino 2237471045 claims free block 294542522
data fork in ino 2237471045 claims free block 294572576
data fork in ino 2237471050 claims free block 279717663
data fork in ino 2237471050 claims free block 279717664
data fork in ino 2237471051 claims free block 279683889
        - agno = 3
Bad ctime nsec 1044474694 on inode 3221225583, resetting to zero
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
bad inode format in inode 7516192865
Bad mtime nsec 1022742958 on inode 7516192864, resetting to zero
corrected directory 7516192864 size, was 276, now 268
bogus .. inode number (433791696992) in directory inode 7516192864, clearing inode number
bad inode format in inode 7516192865
cleared inode 7516192865
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 2
        - agno = 7
        - agno = 3
        - agno = 5
        - agno = 6
        - agno = 10
        - agno = 8
entry "PLAYLIST" in shortform directory 6442451040 references free inode 101
junking entry "PLAYLIST" in directory inode 6442451040
        - agno = 1
entry "BDMV" in shortform directory 6442451041 references free inode 7516192865
junking entry "BDMV" in directory inode 6442451041
corrected i8 count in directory 6442451041, was 2, now 1
        - agno = 9
        - agno = 11
        - agno = 4
entry ".." at block 0 offset 80 in directory inode 102 references free inode 7516192865
corrected i8 count in directory 7516192864, was 10, now 9
bogus .. inode number (0) in directory inode 7516192864, clearing inode number
entry ".." at block 0 offset 80 in directory inode 11811160161 references free inode 7516192865
entry ".." at block 0 offset 80 in directory inode 4294967394 references free inode 7516192865
entry "10000" in shortform directory 1182777507 references free inode 103
junking entry "10000" in directory inode 1182777507
entry ".." at block 0 offset 80 in directory inode 3573508162 references free inode 7516192865
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
entry ".." in directory inode 102 points to free inode 7516192865
bad hash table for directory inode 102 (no data entry): rebuilding
rebuilding directory inode 102
entry ".." in directory inode 3573508162 points to free inode 7516192865
bad hash table for directory inode 3573508162 (no data entry): rebuilding
rebuilding directory inode 3573508162
entry "ANY!" in dir ino 4294967392 doesn't have a .. entry, will set it in ino 7516192864.
entry ".." in directory inode 4294967394 points to free inode 7516192865
bad hash table for directory inode 4294967394 (no data entry): rebuilding
rebuilding directory inode 4294967394
entry ".." in directory inode 11811160161 points to free inode 7516192865
bad hash table for directory inode 11811160161 (no data entry): rebuilding
rebuilding directory inode 11811160161
setting .. in sf dir inode 7516192864 to 4294967392
Metadata corruption detected at 0x46217c, inode 0x1c0000060 data fork
xfs_repair: warning - iflush_int failed (-117)
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
disconnected dir inode 102, moving to lost+found
disconnected inode 115, moving to lost+found
disconnected inode 116, moving to lost+found
disconnected inode 117, moving to lost+found
disconnected inode 118, moving to lost+found
disconnected inode 119, moving to lost+found
disconnected inode 120, moving to lost+found
disconnected inode 121, moving to lost+found
disconnected inode 122, moving to lost+found
disconnected inode 123, moving to lost+found
disconnected inode 124, moving to lost+found
disconnected inode 125, moving to lost+found
disconnected inode 126, moving to lost+found
disconnected inode 127, moving to lost+found
disconnected inode 128, moving to lost+found
disconnected inode 129, moving to lost+found
disconnected inode 130, moving to lost+found
disconnected inode 131, moving to lost+found
disconnected inode 132, moving to lost+found
disconnected inode 133, moving to lost+found
disconnected inode 134, moving to lost+found
disconnected inode 135, moving to lost+found
disconnected inode 136, moving to lost+found
disconnected inode 137, moving to lost+found
disconnected inode 138, moving to lost+found
disconnected inode 139, moving to lost+found
disconnected inode 140, moving to lost+found
disconnected inode 141, moving to lost+found
disconnected inode 142, moving to lost+found
disconnected inode 143, moving to lost+found
disconnected inode 144, moving to lost+found
disconnected inode 145, moving to lost+found
disconnected inode 146, moving to lost+found
disconnected inode 147, moving to lost+found
disconnected inode 148, moving to lost+found
disconnected inode 149, moving to lost+found
disconnected inode 150, moving to lost+found
disconnected inode 151, moving to lost+found
disconnected inode 152, moving to lost+found
disconnected inode 153, moving to lost+found
disconnected inode 154, moving to lost+found
disconnected inode 155, moving to lost+found
disconnected inode 156, moving to lost+found
disconnected inode 157, moving to lost+found
disconnected inode 717352277, moving to lost+found
disconnected inode 717352278, moving to lost+found
disconnected dir inode 1182777507, moving to lost+found
disconnected inode 2149391479, moving to lost+found
disconnected inode 2149391480, moving to lost+found
disconnected inode 2149391481, moving to lost+found
disconnected dir inode 3119473582, moving to lost+found
disconnected dir inode 3573508162, moving to lost+found
disconnected dir inode 4294967394, moving to lost+found
disconnected inode 7516192888, moving to lost+found
disconnected inode 7516192889, moving to lost+found
disconnected inode 7516192890, moving to lost+found
disconnected dir inode 9663676513, moving to lost+found
disconnected dir inode 10737418337, moving to lost+found
disconnected dir inode 11811160161, moving to lost+found
Phase 7 - verify and correct link counts...
resetting inode 101 nlinks from 2 to 10
resetting inode 6442451040 nlinks from 7 to 6
resetting inode 1073741920 nlinks from 6 to 1
resetting inode 6442451041 nlinks from 4 to 3
resetting inode 1073741921 nlinks from 0 to 1
resetting inode 1182777507 nlinks from 10 to 9
Maximum metadata LSN (1:412401) is ahead of log (1:411757).
Format log to cycle 4.
done
JorgeB Posted February 4, 2020

Just start the array and check the data on disk3.
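(Once the array is up, a sketch of what to look at on disk3; paths follow Unraid's layout, with /mnt/disk3 for this slot:)

```shell
DISK=/mnt/disk3   # the repaired disk3

if [ -d "$DISK" ]; then
    # Does the top-level share layout look right?
    ls "$DISK"

    # Files xfs_repair could not reconnect land here, named only by
    # inode number; 'file' helps identify what each one actually is.
    if [ -d "$DISK/lost+found" ]; then
        ls -la "$DISK/lost+found"
        # file "$DISK/lost+found"/*
    fi
fi
```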
AngelEyes Posted February 4, 2020

Ok, sorry for the questions, but disk 3 has been unassigned this whole period, so when I select the correct disk it shows a blue square and says 'New Device'. I assume if I start the array now it will have to rebuild?
JorgeB Posted February 4, 2020

1 minute ago, AngelEyes said:
I assume if I start the array now it will have to rebuild?

Yes, but like mentioned I would recommend rebuilding to a new disk.
AngelEyes Posted February 4, 2020

Ok, I will put in a new disk. Is it worth hanging onto the old disk 3, maybe running a preclear to see if it is ok, or should I bin it? Thanks.
JorgeB Posted February 4, 2020

Disk3 looks healthy. I suggested using a new one just in case something goes wrong during the rebuild, so you can still copy the data from the old disk if needed. After this is resolved you can reuse that disk however you want.
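(If the rebuild did go wrong, a sketch of pulling data back from the old disk; /mnt/disks/old-disk3 is a hypothetical mount point, e.g. the old drive mounted read-only via the Unassigned Devices plugin:)

```shell
SRC=/mnt/disks/old-disk3   # hypothetical mount of the old disk3
DST=/mnt/disk3             # the rebuilt array disk

if [ -d "$SRC" ] && [ -d "$DST" ] && command -v rsync >/dev/null; then
    # -a preserves permissions/ownership/timestamps, -v lists files,
    # --progress shows per-file transfer status. The trailing slash on
    # SRC copies the directory's contents rather than the directory.
    rsync -av --progress "$SRC/" "$DST/"
fi
```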
AngelEyes Posted February 4, 2020

Fantastic, thanks so much for your patience and help.
AngelEyes Posted February 7, 2020

All sorted now, thank you!