
Kernel panic when disk is used


coppit

Hi all,

Coming back from vacation, I found that commands like "ls /mnt/user/Media" were hanging. I tried rebooting the server to unwedge things, but unraid couldn't unmount the disks because a bash process was holding a file handle open on one of them. "ps aux" reported that the hung bash was in state "D", and "kill -9" couldn't kill it.
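For context, state "D" is uninterruptible sleep: the process is blocked inside the kernel, usually waiting on I/O, and signals (SIGKILL included) are not acted on until that wait returns. Below is a minimal Python sketch for listing such processes. It assumes input in the three-column shape printed by `ps -eo pid,stat,comm`; the function name and the sample output are hypothetical and not taken from this server.

```python
def find_uninterruptible(ps_output):
    """Return (pid, command) pairs whose state column starts with "D".

    Expects three whitespace-separated columns per line: PID, STAT, COMMAND.
    Lines whose first field is not a number (such as the header) are skipped.
    """
    hung = []
    for line in ps_output.splitlines():
        fields = line.split(None, 2)
        if len(fields) != 3 or not fields[0].isdigit():
            continue
        pid, stat, command = fields
        if stat.startswith("D"):
            hung.append((int(pid), command.strip()))
    return hung


# Hypothetical sample in the shape of `ps -eo pid,stat,comm` output.
SAMPLE_PS_OUTPUT = (
    "  PID STAT COMMAND\n"
    "    1 Ss   init\n"
    " 4242 D    bash\n"
    " 4250 D+   ls\n"
)

print(find_uninterruptible(SAMPLE_PS_OUTPUT))
```

On a live system, the same function could be fed the captured output of `ps -eo pid,stat,comm` instead of the sample string.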

 

So I held my breath and hard rebooted the server. Now unraid has a kernel panic within seconds of the login prompt. If I remove disk 4, it boots okay. (Although oddly, it says it can't start the array due to too many wrong or missing disks, but there's just one missing disk. What's up with that? In my futzing, somehow I got into a state where disk 1 was also unassigned, but I replaced super.dat with super.old and now it just says that disk 4 is missing.)

Before the kernel panic, there's an error:

Internal error xfs_mountfs_int(2) at line 840 of file fs/xfs/xfs_mount.c. Caller xfs_fs_fill_super+0x33c/0x3bb

After getting the prompt, there's a ton of stuff dumped to the console as part of the kernel panic. LMK if you need the last page -- I took a picture. Here are the last few lines:

[hex stuff] tick_nohz_idle_enter+hex stuff
[hex stuff] cpu_startup_entry+hex stuff
[hex stuff] start_secondary+hex stuff
---[ end trace 8513738cffbb74b7 ]---

The dump also mentions page faults and semaphores.

Any ideas? I'm thinking I'll just replace the drive and hope unraid will rebuild the data. But I'm a little worried that won't work, since it's saying that I have too many wrong or missing disks...

I'm running the latest versions of unraid and the controller BIOS.

P.S. It would be super nice if there was an option to have rsyslog write to the cache drive as well as /var/log/syslog. That way I would have a way of getting the syslog for the kernel panic to you guys.

Link to comment

Try starting the array in Maintenance Mode (with the drive assigned).

Thanks for the reply. File system corruption would make sense, given that I had to do a hard powerdown. I did notice a message about the dirty bit being set.

But how can I start it in maintenance mode? The kernel panic prevents me from booting. Is there a way to tell unraid to boot straight into maintenance mode?

Link to comment

File system corruption should not be visible until the drive is mounted, which normally happens on array start. So the solution is probably to edit config/disk.cfg and change startArray from "yes" to "no". This should disable Auto Start of the array, allowing you to check the Maintenance Mode box before manually starting the array.

This would have to be done with the server shut down and the flash drive in another machine. The disk.cfg file is plain text, so any text editor should work, and line endings don't matter for this file.
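The same edit can also be scripted rather than done by hand. This is a minimal Python sketch, assuming disk.cfg holds one key="value" pair per line (an assumption about the file layout, not something confirmed in this thread). The function name and the second key in the sample are made up for illustration.

```python
def disable_array_autostart(cfg_text):
    """Return cfg_text with the startArray line set to "no".

    Assumes one key="value" pair per line. All other lines, and each
    line's original ending (LF or CRLF), are left untouched.
    """
    out = []
    for line in cfg_text.splitlines(keepends=True):
        key, sep, _value = line.partition("=")
        if sep and key.strip() == "startArray":
            # Preserve whatever line ending the original line had.
            ending = line[len(line.rstrip("\r\n")):]
            line = 'startArray="no"' + ending
        out.append(line)
    return "".join(out)


# Hypothetical file contents; "someOtherSetting" is a placeholder key.
SAMPLE_CFG = 'startArray="yes"\r\nsomeOtherSetting="1"\r\n'

print(disable_array_autostart(SAMPLE_CFG))
```

To apply it, read config/disk.cfg from the flash drive, pass the text through the function, and write the result back, ideally after saving a copy of the original file.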

Link to comment

Thanks Rob. Since I was already booted without disk4, I set startArray to "no", rebooted, then checked the xfs file system on all my disks. None had any errors, as far as I could tell. But I went ahead and did a "writable" repair of disk4. I then started the array in normal mode.
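The check-then-repair order described here can be sketched as below. It assumes the standard xfs_repair tool, whose -n option inspects the file system without modifying it and which expects the file system to be unmounted. The function name is hypothetical, and the device path in the usage lines is a placeholder, not a path taken from this thread.

```python
def xfs_repair_command(device, modify=False):
    """Build the argument list for an xfs_repair run on `device`.

    With modify=False the -n (no-modify) option is added, so the tool only
    reports problems. With modify=True the option is omitted and the tool
    is allowed to write repairs.
    """
    cmd = ["xfs_repair"]
    if not modify:
        cmd.append("-n")
    cmd.append(device)
    return cmd


# Placeholder device path; substitute the real device for the disk.
print(xfs_repair_command("/dev/example_disk4"))
print(xfs_repair_command("/dev/example_disk4", modify=True))
```

The resulting list could be handed to `subprocess.run` once the read-only pass has been reviewed.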

 

One thing though, and perhaps this is a separate issue... When I started the array in normal mode, it warned me that the array would be unprotected. After the start, it says "Unmountable disk present: Disk 2 • ()". There is literally nothing between the parens. I don't have a disk2 installed. In the "Array Devices" section, disk 2 says both "Not installed" and "Unmountable".

Perhaps this is some sort of cruft from when I removed disk2 before... After verifying that the disk contents look okay, should I re-initialize the array and recompute parity?

Link to comment

> I don't have a disk2 installed. In the "Array Devices" section, disk 2 says both "Not installed" and "Unmountable".
>
> Perhaps this is some sort of cruft from when I removed disk2 before...

I suspect you are right. Restoring an older super.dat was a good idea, but the variables for array drives are stored in both super.dat and disk.cfg, so it might have been better to restore both. In the Diagnostics, check the Disk 2 section of disk.cfg and the third disk position in super.dat.

> After verifying that the disk contents look okay, should I re-initialize the array and recompute parity?

That sounds right.

Link to comment
