
4.5.4 Kernel bug



I got the following while the server was running; I couldn't save more than that. I took a screenshot of the console, and there is a "kernel bug" message among the lines. The server had halted: it still responded to ping, but it didn't allow any login or any input on the console.

 

Does anyone know why this happened (the config is in my sig)? Thanks for any hints.

 

After the reboot it shows "Parity Check in progress". Is this only a check, or a recalculation?

I was only reading from the server, no writing at all to the filesystem.

syslog-message.txt

screenshot-small.jpg.zip

Link to comment

In the past, there have been times when file-system corruption uncovered a kernel bug/crash.

 

You might check each of your data disks with reiserfsck as described in the wiki:

http://lime-technology.com/wiki/index.php?title=Check_Disk_Filesystems

Don't run the check on the parity disk, as it does not have a file-system.

 

Also, send [email protected] an e-mail pointing Tom to this thread, in case it indicates an issue he should look at.

 

Link to comment

Do you know whether the Parity Check that appeared after rebooting is a check only, or an actual recalculation of the parity?

 

It's both... it reads all the disks, including parity. If a parity mismatch is found, the 'parity errors' counter is incremented, and the parity disk is then rewritten with the correct parity.
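To make the "check and correct" behaviour concrete, here is a minimal toy sketch in Python. It assumes single-parity XOR across the data disks and models each disk as a list of byte blocks; the function names (`xor_blocks`, `parity_check`) and the `correct` flag are made up for illustration and are not unRAID's actual driver code.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def parity_check(data_disks, parity, correct=True):
    """Toy parity check: return (mismatch_count, parity_after_check).

    data_disks: list of disks, each a list of byte blocks (hypothetical model).
    parity:     list of byte blocks, one per block position.
    correct:    if True, overwrite mismatching parity blocks (assumed behaviour
                of the normal check); if False, only count mismatches.
    """
    errors = 0
    new_parity = list(parity)
    for i, stored in enumerate(parity):
        # Parity is recomputed from the data disks, which are trusted as-is.
        expected = xor_blocks([disk[i] for disk in data_disks])
        if expected != stored:
            errors += 1                 # the 'parity errors' counter
            if correct:
                new_parity[i] = expected
    return errors, new_parity

disk1 = [b"\x01\x02", b"\x0f\x0f"]
disk2 = [b"\x10\x20", b"\xf0\xf0"]
good_parity = [xor_blocks([disk1[i], disk2[i]]) for i in range(2)]
stale_parity = [good_parity[0], b"\x00\x00"]   # second block is out of date

errors, fixed = parity_check([disk1, disk2], stale_parity)
print(errors, fixed == good_parity)
```

Note that the sketch never asks whether the data or the parity is the wrong side: it simply recomputes parity from whatever the data disks return.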

Link to comment

If a parity mismatch is found, the 'parity errors' counter is incremented, and the parity disk is then rewritten with the correct parity.

 

So in the event there is a mismatch, how does the system distinguish between the parity data being wrong and the hard disk data being wrong?

Link to comment

It always assumes the data is correct and parity incorrect. 

 

That is why we, as users, requested the NOCORRECT option for the parity check. With it, the check is non-destructive, and if a specific disk is really bad (and you can work out which disk it is), you can still use the parity disk to reconstruct its replacement.
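A small Python sketch of why that matters, again assuming simple XOR parity over two data disks. The disk contents and variable names are invented for illustration; this is not how unRAID itself is implemented, only the arithmetic behind the argument.

```python
def xor_blocks(*blocks):
    """XOR equal-length byte strings together, byte by byte."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

# Healthy array: parity was computed while disk1 still held good data.
original_disk1 = b"\x01\x02\x03"
disk2 = b"\x10\x20\x30"
parity = xor_blocks(original_disk1, disk2)

# Later, disk1 starts returning bad data for one byte.
bad_disk1 = b"\x01\xff\x03"

# Non-correcting check: parity is left alone, so a replacement for disk1
# can still be rebuilt from parity plus the remaining good disk.
rebuilt_nocorrect = xor_blocks(parity, disk2)

# Correcting check: parity is rewritten from the bad data, so a later
# rebuild faithfully reproduces the bad contents instead.
corrected_parity = xor_blocks(bad_disk1, disk2)
rebuilt_after_correct = xor_blocks(corrected_parity, disk2)

print(rebuilt_nocorrect == original_disk1)
print(rebuilt_after_correct == bad_disk1)
```

In other words, once a correcting check has run against a disk that is returning bad data, the parity no longer holds the information needed to recover the original contents.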

 

Unfortunately, there is no way to invoke the NOCORRECT version of the parity check from the web interface, and if it finds errors, there is no way to know which address on the disk had the mismatch.

 

For now, it is a command-line command, or you can invoke it from the unMENU Array Management page.

 

Joe L.

 

 

Link to comment

Archived

This topic is now archived and is closed to further replies.
