nickyhvm Posted June 23, 2020
Hello, I'm really getting desperate at this point. My Unraid server had been working fine for a few weeks until I was gaming on a VM and everything froze. I had to hard shut down my PC using the button and boot again. Everything worked fine until it froze again after a couple of hours. I was seeing some errors in my logs, and after googling them it seemed like I had to reformat my cache drives. I made a couple of attempts to copy the current data from the cache to my array, but with no luck because it kept freezing on me. (Once the server froze, it kept running but nothing would work: I couldn't get to the UI, couldn't SSH, couldn't connect to Docker containers that were running, etc.) So I gave up on copying my cache, thinking I would only lose 1 day of data by doing a reformat anyway. Sadly enough, I lost my VMs and all my appdata. Pissed off as I was already, that was not the worst part: my server still keeps freezing in less than a few hours! Now, this last time that I got it to run again, it isn't mounting my cache 1 drive because it says filesystem not found. I have no idea where to look anymore! Here are my diagnostics, can someone please help me? tower-diagnostics-20200623-1809.zip
JorgeB Posted June 23, 2020
The cache filesystem is corrupt and needs to be re-done. You also need to check the filesystem on disk4, but these are likely a consequence of all the crashing, not the reason for it. After it's fixed you can enable the syslog server/mirror feature to see if it catches something, but if the problem is hardware related it likely won't.
nickyhvm Posted June 23, 2020 Author
Thanks for the quick answer! I have two cache drives. Can I recover one using the other, or do I have to format both again and lose my appdata again? The wiki states that you have to use the scrub, but this button is disabled for me now.
Edited June 23, 2020 by nickyhvm
JorgeB Posted June 23, 2020
Since the filesystem is corrupt, it affects both devices; redundancy can only help when a device fails. Look for the CA appdata backup/restore plugin.
nickyhvm Posted June 23, 2020 Author
Okay, thanks! I have reformatted the drives now, checked disk4 and made it repair itself, and enabled mirroring the logs to flash. I will post new diagnostics if the server freezes again, and I will definitely install that plugin! Thanks!
nickyhvm Posted June 23, 2020 Author
So I was creating my Docker containers again and I started having issues again, just like last time. I see a lot of errors in the system log but I can't make any sense of them! It ended with "reboot needed". Can anyone tell me what is going on? tower-diagnostics-20200623-1902.zip
JorgeB Posted June 23, 2020
You still need to do this (quoting johnnie.black, 42 minutes ago): "check filesystem on disk4"
JorgeB Posted June 23, 2020
https://wiki.unraid.net/Check_Disk_Filesystems#Checking_and_fixing_drives_in_the_webGui
nickyhvm Posted June 23, 2020 Author
The freeze happened a few minutes after the previous diagnostics that I posted. Mirror syslog to flash is enabled, but I can't seem to find the correct logs on my flash drive. Also, taking new diagnostics only gives the logs starting from a fresh boot? In the meantime I will look again at disk4 and post the results.
nickyhvm Posted June 23, 2020 Author
This is the output of xfs_repair -v on disk4:

Phase 1 - find and verify superblock...
        - block cache size set to 741776 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 725875 tail block 725875
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 2
        - agno = 1
        - agno = 3
Phase 5 - rebuild AG headers and trees...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...

        XFS_REPAIR Summary    Tue Jun 23 19:22:26 2020

Phase       Start             End               Duration
Phase 1:    06/23 19:22:13    06/23 19:22:13
Phase 2:    06/23 19:22:13    06/23 19:22:14    1 second
Phase 3:    06/23 19:22:14    06/23 19:22:20    6 seconds
Phase 4:    06/23 19:22:20    06/23 19:22:20
Phase 5:    06/23 19:22:20    06/23 19:22:20
Phase 6:    06/23 19:22:20    06/23 19:22:25    5 seconds
Phase 7:    06/23 19:22:25    06/23 19:22:25

Total run time: 12 seconds
done
nickyhvm Posted June 23, 2020 Author
Now I'm getting this in my logs:

Jun 23 19:53:26 Tower kernel: BTRFS critical (device loop2): corrupt leaf: root=266 block=126418944 slot=1 ino=1377 file_offset=0, invalid offset for file extent, have 2, should be aligned to 4096
Jun 23 19:53:26 Tower kernel: BTRFS critical (device loop2): corrupt leaf: root=266 block=126418944 slot=1 ino=1377 file_offset=0, invalid offset for file extent, have 2, should be aligned to 4096
Jun 23 19:53:26 Tower kernel: BTRFS error (device loop2): error loading props for ino 1377 (root 266): -5
Jun 23 19:53:26 Tower kernel: BTRFS critical (device loop2): corrupt leaf: root=266 block=126418944 slot=1 ino=1377 file_offset=0, invalid offset for file extent, have 2, should be aligned to 4096
Jun 23 19:53:26 Tower kernel: BTRFS critical (device

Is my cache getting corrupted again?
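When a log fills up with messages like the ones above, it can help to tally them by device and severity before deciding which filesystem to look at first. The following is only a minimal Python sketch, not an Unraid tool: the function name `tally_btrfs_messages` and the regular expression are illustrative assumptions, and the sample lines are copied from the post above.

```python
import re
from collections import Counter

# Matches kernel lines such as:
#   "... kernel: BTRFS critical (device loop2): corrupt leaf: ..."
# The pattern is an assumption based on the lines quoted in this thread.
BTRFS_LINE = re.compile(
    r"kernel: BTRFS (?P<level>critical|error|warning|info) "
    r"\(device (?P<device>[^)]+)\)"
)


def tally_btrfs_messages(lines):
    """Count BTRFS kernel messages per (device, severity) pair."""
    counts = Counter()
    for line in lines:
        match = BTRFS_LINE.search(line)
        if match:  # truncated or unrelated lines are skipped
            counts[(match.group("device"), match.group("level"))] += 1
    return counts


# Sample lines taken from the post above.
sample = [
    "Jun 23 19:53:26 Tower kernel: BTRFS critical (device loop2): corrupt leaf: root=266 block=126418944 slot=1 ino=1377 file_offset=0, invalid offset for file extent, have 2, should be aligned to 4096",
    "Jun 23 19:53:26 Tower kernel: BTRFS critical (device loop2): corrupt leaf: root=266 block=126418944 slot=1 ino=1377 file_offset=0, invalid offset for file extent, have 2, should be aligned to 4096",
    "Jun 23 19:53:26 Tower kernel: BTRFS error (device loop2): error loading props for ino 1377 (root 266): -5",
    "Jun 23 19:53:26 Tower kernel: BTRFS critical (device loop2): corrupt leaf: root=266 block=126418944 slot=1 ino=1377 file_offset=0, invalid offset for file extent, have 2, should be aligned to 4096",
]

btrfs_counts = tally_btrfs_messages(sample)
for (device, level), count in sorted(btrfs_counts.items()):
    print(f"{device:10s} {level:10s} {count}")
```

Grouping by device makes it easier to tell whether the errors come from a loop device or from the cache pool devices themselves.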
JorgeB Posted June 23, 2020
loop2 is usually the docker image. Was that newly created?
nickyhvm Posted June 23, 2020 Author
Yes, that was newly created after the format of the cache. It is also preventing me from starting and deleting one container, and doing a scrub in my Docker settings tells me "no errors".
Edited June 23, 2020 by nickyhvm
nickyhvm Posted June 23, 2020 Author
I removed the docker image and started adding containers again, but got stuck at adding a container and got a lot of errors in my log that I cannot make any sense of! tower-diagnostics-20200623-2028.zip
nickyhvm Posted June 23, 2020 Author
6656 errors within 2 minutes. I guess I should get this sorted first.
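To see how quickly errors pile up (as with the "6656 errors within 2 minutes" above), one option is to bucket error lines by the minute in their syslog timestamp. This is a rough Python sketch under stated assumptions: the helper name `errors_per_minute` and the keyword list are hypothetical, it assumes the classic "Mon DD HH:MM:SS" syslog prefix seen in this thread, and the last sample line is invented purely to show a non-error line being ignored.

```python
from collections import Counter


def errors_per_minute(lines, keywords=("error", "critical", "corrupt")):
    """Count lines containing any keyword, grouped by 'Mon DD HH:MM'."""
    per_minute = Counter()
    for line in lines:
        lowered = line.lower()
        if any(keyword in lowered for keyword in keywords):
            # "Jun 23 19:53:26 Tower ..." -> "Jun 23 19:53"
            per_minute[line[:12]] += 1
    return per_minute


sample_log = [
    # First two lines are from the thread; the third is a made-up non-error line.
    "Jun 23 19:53:26 Tower kernel: BTRFS critical (device loop2): corrupt leaf: root=266 block=126418944 slot=1 ino=1377 file_offset=0, invalid offset for file extent, have 2, should be aligned to 4096",
    "Jun 23 19:53:26 Tower kernel: BTRFS error (device loop2): error loading props for ino 1377 (root 266): -5",
    "Jun 23 19:54:01 Tower example: hypothetical informational message",
]

minute_counts = errors_per_minute(sample_log)
for minute, count in sorted(minute_counts.items()):
    print(minute, count)
```

A sudden jump in the per-minute count can help pin down roughly when the trouble started, which is useful when comparing against what was being done on the server at that time.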