Can only boot into safe mode, and server crashes instantly when the array starts


Solved by JorgeB


Hey, so my server crashed and stopped working, and I've been trying to troubleshoot it with no luck.

As the title says, I can only boot into safe mode and cannot start the array. I've never had this issue before. Also, 2 of my drives are now being emulated even though SMART says they passed. I cannot replace these drives because I cannot start the array. I ran Memtest86+ and the RAM has no errors. I tried different SATA and power cables and nothing worked. One drive is on the motherboard and the other is connected to a SATA card, so I doubt either connection is the problem. I have two parity drives, so I haven't lost anything yet, but I can't get it working. All the drives are less than 2 years old, so this shouldn't really happen. Any advice would be great, thanks.
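
For reference, the "passed" result mentioned here is only the drive's overall SMART verdict, which is fairly coarse; the attribute table from `smartctl -A` gives more detail. A minimal sketch for pulling that verdict out of `smartctl -H` output, assuming the ATA/SATA wording of the result line; `smart_overall` is a made-up helper name and `/dev/sdX` is a placeholder:

```shell
# Hypothetical helper: extract the overall-health verdict from `smartctl -H` output.
# Assumption: ATA/SATA drives, where smartctl prints a line starting with
# "SMART overall-health self-assessment test result:".
smart_overall() {
    # Reads smartctl output on stdin and prints only the verdict (e.g. PASSED).
    sed -n 's/^SMART overall-health self-assessment test result: *//p'
}

# Example (not executed here; /dev/sdX is a placeholder device name):
#   smartctl -H /dev/sdX | smart_overall
```

If the helper prints nothing, the result line was not in the output, which can happen with drives that report health in a different format.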

Link to comment
1 minute ago, JorgeB said:

Start the array, then grab the diags on the console by typing "diagnostics", it might log the problem.

Sadly it freezes instantly when I start the array. I just tried opening the console in another tab and running it, but the server had already frozen and I couldn't type anything. It was letting me type right before I started the array. Thanks though, anything I can try is useful.

Link to comment

Hey, so I tried xfs_repair.

The other options crashed at Phase 1 and -L crashed at Phase 2. This is the log:

 

Linux 4.19.107-Unraid.
Last login: Tue Mar 16 22:15:11 +0000 2021 on /dev/pts/0.
root@unBaud:~# xfs_repair -L /dev/md2
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...


 

Link to comment

Sorry, had work.

OK, so it doesn't crash when I run it on other drives (log below), and there is some progress.

I ran -L today and it didn't crash, but I think the message at the bottom appeared because something stopped it. Let me know what to do next; I'd rather it didn't crash now that there is some progress, but I don't mind it crashing if we need to try something else. Thanks for the continued support, this is a very strange situation that I can't find anything about online.

 

Linux 4.19.107-Unraid.
root@unBaud:~# xfs_repair -V /dev/md1
xfs_repair version 5.4.0
root@unBaud:~# xfs_repair -v /dev/md1
Phase 1 - find and verify superblock...
        - block cache size set to 3055776 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 1041894 tail block 1041894
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 2
        - agno = 1
        - agno = 3
Phase 5 - rebuild AG headers and trees...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...

        XFS_REPAIR Summary    Wed Mar 17 19:09:52 2021

Phase           Start           End             Duration
Phase 1:        03/17 19:09:48  03/17 19:09:48
Phase 2:        03/17 19:09:48  03/17 19:09:49  1 second
Phase 3:        03/17 19:09:49  03/17 19:09:50  1 second
Phase 4:        03/17 19:09:50  03/17 19:09:50
Phase 5:        03/17 19:09:50  03/17 19:09:50
Phase 6:        03/17 19:09:50  03/17 19:09:51  1 second
Phase 7:        03/17 19:09:51  03/17 19:09:51

Total run time: 3 seconds
done
root@unBaud:~# xfs_repair -v /dev/md2
Phase 1 - find and verify superblock...
        - block cache size set to 3055776 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 769341 tail block 769337
ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed.  Mount the filesystem to replay the log, and unmount it before
re-running xfs_repair.  If you are unable to mount the filesystem, then use
the -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a mount
of the filesystem before doing this.
root@unBaud:~# xfs_repair -L /dev/md2
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ALERT: The filesystem has valuable metadata changes in a log which is being
destroyed because the -L option was used.
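
In the two runs above, /dev/md1 reports the same head and tail block and proceeds, while /dev/md2 reports different values and stops with the "valuable metadata changes" message. A small sketch for checking this from a saved copy of the verbose output, assuming the `zero_log:` line has the same form as in these logs; `log_state` and the file name in the example are made up:

```shell
# Hypothetical helper: classify the XFS log state from saved `xfs_repair -v` output.
# Assumption: the output contains a line of the form
#   zero_log: head block <N> tail block <M>
# and that differing head/tail values correspond to a log with pending changes.
log_state() {
    # Reads the saved output on stdin; prints "clean", "dirty" or "unknown".
    awk '
        /^zero_log: head block/ {
            # Fields: zero_log: head block <N> tail block <M>
            print (($4 == $7) ? "clean" : "dirty")
            found = 1
            exit
        }
        END { if (!found) print "unknown" }
    '
}

# Example (not executed here; the file name is a placeholder):
#   log_state < xfs_repair_md2_output.txt
```

The helper only reads text, so it can be run on pasted output without touching the disk.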
     

Link to comment

Did that, log below. I tried it again as well and got a new error. Going to try starting the array, as new things are happening.

 

root@unBaud:~# xfs_repair -L /dev/md2
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ALERT: The filesystem has valuable metadata changes in a log which is being
destroyed because the -L option was used.
xfs_repair: libxfs_device_zero write failed: Input/output error
root@unBaud:~# xfs_repair -L /dev/md2
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
totally zeroed log
xfs_repair: libxfs_device_zero write failed: Input/output error
root@unBaud:~# 
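
The "Input/output error" above comes from xfs_repair itself; any matching kernel messages would normally be in the syslog. A minimal sketch for listing such lines, assuming the log is readable at `/var/log/syslog`; `io_errors` is a made-up helper name, and the search patterns are only a guess at what the relevant lines contain:

```shell
# Hypothetical helper: list lines that mention I/O errors in a log file.
io_errors() {
    # $1: path to a log file.
    # -i ignores case, -n prefixes each match with its line number,
    # -E enables the alternation between the two patterns.
    grep -inE 'i/o error|input/output error' "$1"
}

# Example (not executed here):
#   io_errors /var/log/syslog
```

grep exits with a non-zero status when nothing matches, so an empty result here means only that neither pattern was found, not that the disks are fine.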

Link to comment

OK, so I don't understand. After running xfs_repair -L, I went back to start the array. I stopped the array and was about to start it when disk 5 went missing, along with a drive that I use as an unassigned device. I rebooted the server and tried starting normally (not in safe mode), and it crashed. I started again in safe mode, and disk 5 and the unassigned device were there. I don't know if the drives going missing for a brief moment is useful, but that's what happened.

Link to comment

Were the diags grabbed after the I/O error? They appear to be from just after array start, and I don't see anything logged. If yes, post new ones.

 

Note that I'm going offline in a few minutes, will check back tomorrow morning, I usually only check the forum from around 7:30AM to around 7:30PM UTC time.

Link to comment

I don't know; that was the most up-to-date diagnostics, I literally downloaded it right after it crashed. Every time I do something now, something different happens right before it crashes. Below is the new log from xfs_repair on /dev/md2; that is as far as it got. I don't really understand what is causing the crashes, but I guess it doesn't get logged. The last diagnostics were from right before I did this, and I'll add the diagnostics from right after I got it restarted. I start work early tomorrow so will be on earlier.

 

Linux 4.19.107-Unraid.
Last login: Wed Mar 17 19:20:45 +0000 2021 on /dev/pts/0.
root@unBaud:~# xfs_repair -v  /dev/md2
Phase 1 - find and verify superblock...
        - block cache size set to 3055776 entries
Phase 2 - using internal log
        - zero log...
totally zeroed log
zero_log: head block 0 tail block 0
        - scan filesystem freespace and inode maps...
sb_fdblocks 14537429, counted 21352345
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
bad CRC for inode 2170963486
bad CRC for inode 2170963486, will rewrite
free inode 2170963486 contains errors, corrected
        - agno = 2

unbaud-diagnostics-20210317-1943.zip
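
Since the output above stops where the server froze, one option is to write the repair output to a file on the flash drive as it is produced, so that at least part of it survives a reboot. A rough sketch, assuming the flash drive is mounted at `/boot`; `run_logged` and the log file name are made up, and whether the last lines before a freeze actually reach the disk depends on buffering, so treat it as best effort:

```shell
# Hypothetical helper: run a command and append everything it prints to a log file.
run_logged() {
    log_file=$1
    shift
    # Merge stderr into stdout so error messages land in the file as well,
    # and keep showing the output on the terminal via tee.
    "$@" 2>&1 | tee -a "$log_file"
    # Ask the kernel to flush cached writes once the command returns.
    # Note: the exit status of the wrapped command is not preserved.
    sync
}

# Example (not executed here; the log file name is a placeholder):
#   run_logged /boot/xfs_repair_md2.log xfs_repair -v /dev/md2
```

Appending with `tee -a` keeps the output of earlier attempts in the same file, which helps when each run fails at a different point.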

Link to comment

Hey, so I'm at work on a break. When I get home I am going to take out everything used for VMs (graphics card, USB card, etc.) and clean everything. Then I'll reset the motherboard, because I think something weird is happening with it. After that I'll have only the bare essentials installed for the array, so I can test it properly. I'll run loads of tests then and may update my BIOS, so no worries if you can't reply tonight, because it might be late by the time I've finished everything.

Link to comment

So I have a feeling the integrated GPU on my CPU is dead, or the CPU is dying, or the motherboard is dying. I don't know which, but it has to be one of them. Shortly after the tests the other day, my PC went from showing the BIOS startup screen to not showing anything. When I took the GPUs and everything else out, there was still no BIOS screen. I tried a different HDMI cable, still nothing. I installed one GPU and the BIOS screen showed up. Any idea what is broken and what I should do? I don't have a spare CPU to narrow it down any further.

 

My Unraid was using the CPU's integrated GPU for the web UI, so I need that working for my system to work.

Link to comment
