SKAntoniou Posted March 16, 2021

Hey, so my server crashed and I've been trying to troubleshoot it with no luck. As the title says, I can only boot into safe mode and cannot start the array. I've never had this issue before. Also, 2 of my drives are now being emulated even though SMART says they passed. I can't replace these drives because I can't start the array. I ran Memtest86+ and the RAM has no errors. I tried different SATA and power cables and nothing worked. One drive is on the motherboard and the other is connected to a SATA card, so I doubt they would both be the problem. I had two parity drives so I haven't lost anything yet, but I can't get it working. All the drives are less than 2 years old, so this really shouldn't be happening. Any advice would be great, thanks.
ChatNoir Posted March 16, 2021

Hello,

The more information you provide, the more detailed the answer you can expect. Your diagnostics can be a good starting point. You can get them in Tools / Diagnostics and attach the zip to your next post.
SKAntoniou Posted March 16, 2021 Author Share Posted March 16, 2021 thanks, yeh didn't know how to get into the diagnostics. Found logs but no errors would show up on the log. unbaud-diagnostics-20210316-1858.zip Quote Link to comment
JorgeB Posted March 16, 2021

Start the array, then grab the diags on the console by typing "diagnostics", it might log the problem.
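For reference, the console sequence looks roughly like this (a sketch; the exact zip name includes your server name and a timestamp, and on stock Unraid it lands in the logs folder of the flash drive):

root@unBaud:~# diagnostics
# gathers the system logs and configuration into an anonymized zip,
# written to /boot/logs/<servername>-diagnostics-<date>-<time>.zip
root@unBaud:~# ls /boot/logs/
unbaud-diagnostics-20210316-1858.zip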
SKAntoniou Posted March 16, 2021

1 minute ago, JorgeB said:
Start the array, then grab the diags on the console by typing "diagnostics", it might log the problem.

Sadly it freezes instantly when I start the array. I tried pulling up the console in another tab and running it, but it had already frozen and I couldn't type anything. It was letting me type right before I started the array. Thanks though, anything I can try is useful.
JorgeB Posted March 16, 2021

That's unusual, does it start in maintenance mode?
SKAntoniou Posted March 16, 2021

Just checked, maintenance mode works. What do you suggest I do from here? I ran the diagnostics, but do you know how to download them?
JorgeB Posted March 16, 2021

Check the file system on all the array drives, without -n (assuming they are XFS). See the sketch below.
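In maintenance mode that means running xfs_repair against the md devices from the console, one per data disk (a sketch, assuming disk 1 is /dev/md1, disk 2 is /dev/md2, and so on; using the mdX devices rather than the raw sdX disks keeps parity in sync):

root@unBaud:~# xfs_repair -n /dev/md1   # -n only reports problems, changes nothing
root@unBaud:~# xfs_repair /dev/md1      # without -n it actually repairs
root@unBaud:~# xfs_repair /dev/md2
# repeat for each data disk in the array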
SKAntoniou Posted March 16, 2021

Thanks, but I'm not sure how to do that; under Shares it doesn't show anything. The logs were a bit interesting. There are 2 drives emulated at the moment: 1 has a bunch of errors and the other looks normal, which is weird. I have attached the logs so they're easier to read. The parity one is the one full of errors.

Disk 1 (Parity Disk 1) Log.rtf
Disk 4 (Array Disk Emulated) Log.rtf
SKAntoniou Posted March 16, 2021

Also, both my cache drives now show a CRC error count of 1. One is literally brand new (a few days old), so I don't understand that.
SKAntoniou Posted March 16, 2021

Hey, so I tried xfs_repair. The other options crashed at phase 1, and -L crashed at phase 2. This is the log:

Linux 4.19.107-Unraid.
Last login: Tue Mar 16 22:15:11 +0000 2021 on /dev/pts/0.
root@unBaud:~# xfs_repair -L /dev/md2
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
JorgeB Posted March 17, 2021

Does it crash for disk 2 only? Post the diags after it does.
SKAntoniou Posted March 17, 2021

Sorry, I had work. OK, so it doesn't crash when I run it on the other drives (log below). Also some progress: I ran -L today and it didn't crash, but I think the message at the bottom means it stopped for some reason. Let me know what to do next, as I don't want it to crash now that there's some progress; I don't mind it crashing if we need to try something else. Thanks for the continued support, this is a very strange situation that I can't find anything about online.

Linux 4.19.107-Unraid.
root@unBaud:~# xfs_repair -V /dev/md1
xfs_repair version 5.4.0
root@unBaud:~# xfs_repair -v /dev/md1
Phase 1 - find and verify superblock...
        - block cache size set to 3055776 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 1041894 tail block 1041894
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 2
        - agno = 1
        - agno = 3
Phase 5 - rebuild AG headers and trees...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...

        XFS_REPAIR Summary    Wed Mar 17 19:09:52 2021

Phase           Start           End             Duration
Phase 1:        03/17 19:09:48  03/17 19:09:48
Phase 2:        03/17 19:09:48  03/17 19:09:49  1 second
Phase 3:        03/17 19:09:49  03/17 19:09:50  1 second
Phase 4:        03/17 19:09:50  03/17 19:09:50
Phase 5:        03/17 19:09:50  03/17 19:09:50
Phase 6:        03/17 19:09:50  03/17 19:09:51  1 second
Phase 7:        03/17 19:09:51  03/17 19:09:51

Total run time: 3 seconds
done
root@unBaud:~# xfs_repair -v /dev/md2
Phase 1 - find and verify superblock...
        - block cache size set to 3055776 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 769341 tail block 769337
ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed. Mount the filesystem to replay the log, and unmount it before
re-running xfs_repair. If you are unable to mount the filesystem, then use
the -L option to destroy the log and attempt a repair. Note that destroying
the log may cause corruption -- please attempt a mount of the filesystem
before doing this.
root@unBaud:~# xfs_repair -L /dev/md2
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ALERT: The filesystem has valuable metadata changes in a log which is being
destroyed because the -L option was used.
JorgeB Posted March 17, 2021

Just now, SKAntoniou said:
xfs_repair -v /dev/md2

Use -L
SKAntoniou Posted March 17, 2021

I did, at the bottom; log below. I tried it again as well and got a new error. Going to try starting the array, as new things are happening.

root@unBaud:~# xfs_repair -L /dev/md2
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ALERT: The filesystem has valuable metadata changes in a log which is being
destroyed because the -L option was used.
xfs_repair: libxfs_device_zero write failed: Input/output error
root@unBaud:~# xfs_repair -L /dev/md2
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
totally zeroed log
xfs_repair: libxfs_device_zero write failed: Input/output error
root@unBaud:~#
JorgeB Posted March 17, 2021

5 minutes ago, SKAntoniou said:
xfs_repair: libxfs_device_zero write failed: Input/output error

This suggests a disk problem, please post diags.
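An input/output error while zeroing the log means a write to the underlying device itself failed, so it's worth checking the physical disk directly. A quick sketch from the console (sdX is a placeholder; check the Main page for disk 2's actual device letter):

root@unBaud:~# smartctl -a /dev/sdX        # full SMART report; watch for reallocated or pending sectors
root@unBaud:~# smartctl -t short /dev/sdX  # start a short self-test; results show up in the report afterwards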
SKAntoniou Posted March 17, 2021

OK, so I don't understand this: after running xfs_repair -L, I went back to start the array. I stopped the array and was about to start it, then disk 5 was missing, plus a drive I was using with Unassigned Devices. I rebooted the server and tried starting it not in safe mode: crashed. Started again in safe mode, and disk 5 and the unassigned device are there. I don't know if the drives going missing for a brief moment is useful, but that's what happened.
SKAntoniou Posted March 17, 2021

New diagnostics:

unbaud-diagnostics-20210317-1923.zip
JorgeB Posted March 17, 2021

Were the diags grabbed after the I/O error? They appear to be from just after array start, and I don't see anything logged; if they were taken after the error, post new ones. Note that I'm going offline in a few minutes and will check back tomorrow morning; I usually only check the forum from around 7:30 AM to around 7:30 PM UTC.
SKAntoniou Posted March 17, 2021

I don't know; that was the most up-to-date diagnostics, I literally downloaded it right after it crashed. Every time I do something now, something different happens right before it crashes. New log from the xfs_repair on disk 2 below; that is as far as it got. I don't really understand what is causing the crashes, but I guess it doesn't get logged. The last diagnostics were from right before I did this, and I'll add the diagnostics from right after, when I got it restarted. I start work early tomorrow so I'll be on earlier.

Linux 4.19.107-Unraid.
Last login: Wed Mar 17 19:20:45 +0000 2021 on /dev/pts/0.
root@unBaud:~# xfs_repair -v /dev/md2
Phase 1 - find and verify superblock...
        - block cache size set to 3055776 entries
Phase 2 - using internal log
        - zero log...
totally zeroed log
zero_log: head block 0 tail block 0
        - scan filesystem freespace and inode maps...
sb_fdblocks 14537429, counted 21352345
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
bad CRC for inode 2170963486
bad CRC for inode 2170963486, will rewrite
free inode 2170963486 contains errors, corrected
        - agno = 2

unbaud-diagnostics-20210317-1943.zip
JorgeB Posted March 18, 2021

Still nothing logged. One thing you can try is to check if the actual disk 2 mounts with UD; if yes, do a new config instead of rebuilding.
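If you'd rather test from the console than through the UD plugin, a read-only mount of disk 2's partition checks the same thing (a sketch; sdX1 is a placeholder for the disk's actual partition, and the norecovery option skips the log replay so nothing is written to the disk):

root@unBaud:~# mkdir -p /mnt/test
root@unBaud:~# mount -o ro,norecovery -t xfs /dev/sdX1 /mnt/test
root@unBaud:~# ls /mnt/test    # if your data is visible here, the on-disk filesystem is usable
root@unBaud:~# umount /mnt/test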
SKAntoniou Posted March 18, 2021

Hey, so I'm at work on a break. When I get home I am going to take out all the stuff for VMs, like the graphics card, USB card, etc., and clean everything. Then I'll reset the motherboard, because I think something weird is happening with it. Then I'll have only the bare essentials installed for the array so I can test it properly. I'll do loads of tests, and may update my BIOS, so no worries if you can't reply tonight because it might be late when I've finished everything.
SKAntoniou Posted March 19, 2021

So I have a feeling the integrated GPU on my CPU is dead, or the CPU is dying, or the motherboard is dying. I don't know which, but it has to be one of them. Shortly after the tests the other day, my PC went from showing the BIOS start-up screen to not showing anything. When I took the GPUs out and everything, still no BIOS screen. Tried a different HDMI cable, still nothing. Installed one GPU and the BIOS screen showed up. Any idea what is broken and what I should do? I don't have a spare CPU to narrow it down any more. My Unraid was using the CPU's integrated GPU for the webUI, so I need that working for my system to work.
SKAntoniou Posted March 19, 2021 Author Share Posted March 19, 2021 any* idea what is broken and what I should do? Quote Link to comment
SKAntoniou Posted March 19, 2021 Author Share Posted March 19, 2021 still crashes when array starts up with gpu configuration. Motherboard or cpu must be faulty. Quote Link to comment