Server unresponsive (4.7)


This is a strange one.

My 4.6 server has started becoming unresponsive when left running overnight. I run it headless over the network (no monitor or keyboard), and in the morning it no longer responds to HTTP, telnet, SMB or anything else. Rebooting clears the log, so I can't see what it's freezing on.

 

I know this is super ambiguous, but does anyone have any suggestions?

 

Thanks,

Ben

Link to comment

I know this is super ambiguous, but does anyone have any suggestions?

Yes, give us a hardware breakdown of what is in the machine and let us know if anything new has been added lately.

 

You can also try updating to 4.7, though I would not suggest it until we figure out what might be going on.

 

There is a sticky at the top of this forum that specifically tells you how to get a syslog from the computer if possible and if the server hangs.
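
For anyone reading along who can't find that sticky: it isn't quoted in this thread, so what follows is only a rough sketch of the general idea, not the sticky's actual instructions. It assumes the live log is at /var/log/syslog and that the flash drive is mounted at /boot (the same place the config folder lives). The function name save_syslog is made up for illustration.

```shell
#!/bin/sh
# Copy the in-RAM syslog to persistent storage so it survives a reboot.
# Assumptions (not confirmed in this thread): the log is /var/log/syslog
# and the flash drive is mounted at /boot.
save_syslog() {
    src=${1:-/var/log/syslog}
    destdir=${2:-/boot}
    # Timestamped name without colons, since the flash drive is usually FAT.
    dest="$destdir/syslog-$(date +%Y%m%d-%H%M%S).txt"
    # Print the path of the copy only if the copy succeeded.
    cp "$src" "$dest" && echo "$dest"
}
```

Running save_syslog from a telnet session while the server still answers prints the path of the copy. It obviously can't help once the box has hung completely, which is why capturing the log from another machine is the usual fallback.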

Link to comment

I've caught my server mid-crash. I moved the syslog to a 100 MB ramdisk so that a runaway syslog can't fill up and bring the whole system down.

 

The unRAID web interface is showing errors for many of the drives. Going by the syslog errors, I'm guessing these are the drives connected to my AOC-SASLP-MV8. A reboot fixed this last time. Any suggestions as to what might be going wrong?

 

Below is a clip of the syslog.

Mar 28 21:49:33 Tower kernel: ata7: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Mar 28 21:49:33 Tower kernel: ata7: status=0x50 { DriveReady SeekComplete }
Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132
Mar 28 21:49:33 Tower kernel: ata7: no sense translation for status: 0x50
Mar 28 21:49:33 Tower kernel: ata7: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Mar 28 21:49:33 Tower kernel: ata7: status=0x50 { DriveReady SeekComplete }
Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132
Mar 28 21:49:33 Tower kernel: ata7: no sense translation for status: 0x50
Mar 28 21:49:33 Tower kernel: ata7: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Mar 28 21:49:33 Tower kernel: ata7: status=0x50 { DriveReady SeekComplete }
Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132
Mar 28 21:49:33 Tower kernel: ata7: no sense translation for status: 0x50
Mar 28 21:49:33 Tower kernel: ata7: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Mar 28 21:49:33 Tower kernel: ata7: status=0x50 { DriveReady SeekComplete }
Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132
Mar 28 21:49:33 Tower kernel: ata7: no sense translation for status: 0x50
Mar 28 21:49:33 Tower kernel: ata7: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Mar 28 21:49:33 Tower kernel: ata7: status=0x50 { DriveReady SeekComplete }
Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132
Mar 28 21:49:33 Tower kernel: ata7: no sense translation for status: 0x50
Mar 28 21:49:33 Tower kernel: ata7: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Mar 28 21:49:33 Tower kernel: ata7: status=0x50 { DriveReady SeekComplete }
Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132
Mar 28 21:49:33 Tower kernel: ata7: no sense translation for status: 0x50
Mar 28 21:49:33 Tower kernel: ata7: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Mar 28 21:49:33 Tower kernel: ata7: status=0x50 { DriveReady SeekComplete }
Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132
Mar 28 21:49:33 Tower kernel: ata3: no sense translation for status: 0x50
Mar 28 21:49:33 Tower kernel: ata3: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Mar 28 21:49:33 Tower kernel: ata3: status=0x50 { DriveReady SeekComplete }
Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132
Mar 28 21:49:33 Tower kernel: ata3: no sense translation for status: 0x50
Mar 28 21:49:33 Tower kernel: ata3: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Mar 28 21:49:33 Tower kernel: ata3: status=0x50 { DriveReady SeekComplete }
Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132
Mar 28 21:49:33 Tower kernel: ata3: no sense translation for status: 0x50
Mar 28 21:49:33 Tower kernel: ata3: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Mar 28 21:49:33 Tower kernel: ata3: status=0x50 { DriveReady SeekComplete }
Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132
Mar 28 21:49:33 Tower kernel: ata3: no sense translation for status: 0x50
Mar 28 21:49:33 Tower kernel: ata3: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Mar 28 21:49:33 Tower kernel: ata3: status=0x50 { DriveReady SeekComplete }
Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132
Mar 28 21:49:33 Tower kernel: ata3: no sense translation for status: 0x50
Mar 28 21:49:33 Tower kernel: ata3: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Mar 28 21:49:33 Tower kernel: ata3: status=0x50 { DriveReady SeekComplete }
Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132
Mar 28 21:49:33 Tower kernel: ata3: no sense translation for status: 0x50
Mar 28 21:49:33 Tower kernel: ata3: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Mar 28 21:49:33 Tower kernel: ata3: status=0x50 { DriveReady SeekComplete }
Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132
Mar 28 21:49:33 Tower kernel: ata3: no sense translation for status: 0x50
Mar 28 21:49:33 Tower kernel: ata3: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Mar 28 21:49:33 Tower kernel: ata3: status=0x50 { DriveReady SeekComplete }
Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132
Mar 28 21:49:33 Tower kernel: ata3: no sense translation for status: 0x50
Mar 28 21:49:33 Tower kernel: ata3: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Mar 28 21:49:33 Tower kernel: ata3: status=0x50 { DriveReady SeekComplete }
Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132
Mar 28 21:49:33 Tower kernel: ata3: no sense translation for status: 0x50
Mar 28 21:49:33 Tower kernel: ata3: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Mar 28 21:49:33 Tower kernel: ata3: status=0x50 { DriveReady SeekComplete }
Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132
Mar 28 21:49:33 Tower kernel: ata3: no sense translation for status: 0x50
Mar 28 21:49:33 Tower kernel: ata3: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Mar 28 21:49:33 Tower kernel: ata3: status=0x50 { DriveReady SeekComplete }
Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132
Mar 28 21:49:33 Tower kernel: ata3: no sense translation for status: 0x50
Mar 28 21:49:33 Tower kernel: ata3: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Mar 28 21:49:33 Tower kernel: ata3: status=0x50 { DriveReady SeekComplete }
Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132
Mar 28 21:49:33 Tower kernel: ata3: no sense translation for status: 0x50
Mar 28 21:49:33 Tower kernel: ata3: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Mar 28 21:49:33 Tower kernel: ata3: status=0x50 { DriveReady SeekComplete }
Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132
Mar 28 21:49:33 Tower kernel: ata6: no sense translation for status: 0x50
Mar 28 21:49:33 Tower kernel: ata6: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Mar 28 21:49:33 Tower kernel: ata6: status=0x50 { DriveReady SeekComplete }
Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132
Mar 28 21:49:33 Tower kernel: ata6: no sense translation for status: 0x50
Mar 28 21:49:33 Tower kernel: ata6: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Mar 28 21:49:33 Tower kernel: ata6: status=0x50 { DriveReady SeekComplete }
Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132
Mar 28 21:49:33 Tower kernel: ata6: no sense translation for status: 0x50
Mar 28 21:49:33 Tower kernel: ata6: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Mar 28 21:49:33 Tower kernel: ata6: status=0x50 { DriveReady SeekComplete }
Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132
Mar 28 21:49:33 Tower kernel: ata6: no sense translation for status: 0x50
Mar 28 21:49:33 Tower kernel: ata6: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Mar 28 21:49:33 Tower kernel: ata6: status=0x50 { DriveReady SeekComplete }
Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132
Mar 28 21:49:33 Tower kernel: ata6: no sense translation for status: 0x50
Mar 28 21:49:33 Tower kernel: ata6: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Mar 28 21:49:33 Tower kernel: ata6: status=0x50 { DriveReady SeekComplete }
Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132
Mar 28 21:49:33 Tower kernel: ata6: no sense translation for status: 0x50
Mar 28 21:49:33 Tower kernel: ata6: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Mar 28 21:49:33 Tower kernel: ata6: status=0x50 { DriveReady SeekComplete }
Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132
Mar 28 21:49:33 Tower kernel: ata6: no sense translation for status: 0x50
Mar 28 21:49:33 Tower kernel: ata6: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Mar 28 21:49:33 Tower kernel: ata6: status=0x50 { DriveReady SeekComplete }
Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132
Mar 28 21:49:33 Tower kernel: ata6: no sense translation for status: 0x50
Mar 28 21:49:33 Tower kernel: ata6: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Mar 28 21:49:33 Tower kernel: ata6: status=0x50 { DriveReady SeekComplete }
Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132
Mar 28 21:49:33 Tower kernel: ata6: no sense translation for status: 0x50
Mar 28 21:49:33 Tower kernel: ata6: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Mar 28 21:49:33 Tower kernel: ata6: status=0x50 { DriveReady SeekComplete }
Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132
Mar 28 21:49:33 Tower kernel: ata6: no sense translation for status: 0x50
Mar 28 21:49:33 Tower kernel: ata6: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Mar 28 21:49:33 Tower kernel: ata6: status=0x50 { DriveReady SeekComplete }
Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132
Mar 28 21:49:33 Tower kernel: ata6: no sense translation for status: 0x50
Mar 28 21:49:33 Tower kernel: ata6: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Mar 28 21:49:33 Tower kernel: ata6: status=0x50 { DriveReady SeekComplete }
Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132
Mar 28 21:49:33 Tower kernel: ata6: no sense translation for status: 0x50
Mar 28 21:49:33 Tower kernel: ata6: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Mar 28 21:49:33 Tower kernel: ata6: status=0x50 { DriveReady SeekComplete }
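
With a clip this repetitive it can help to boil it down to a per-port tally before posting. A minimal sketch, assuming the standard syslog layout seen above (month, day, time, host, "kernel:", then "ataN:"); the function name count_ata_messages is only illustrative:

```shell
#!/bin/sh
# Count kernel "ataN:" messages per port in a syslog read from stdin.
# Prints one "port count" pair per line, sorted by port name.
count_ata_messages() {
    awk '$5 == "kernel:" && $6 ~ /^ata[0-9]+:$/ {
             port = $6
             sub(/:$/, "", port)   # strip the trailing colon
             n[port]++
         }
         END { for (p in n) print p, n[p] }' | sort
}
```

Feeding it the log, for example count_ata_messages < /var/log/syslog, shows at a glance which ports are involved. In the clip above that would be ata3, ata6 and ata7, which fits the guess that the errors all come from drives on one controller.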

 

Link to comment

Hi, sorry for the delay. I don't use a UPS. Do you think this could be a power issue? It only started happening once I had 15 drives in place.

 

My motherboard is a Gigabyte GA-MA74GM-S2

My PSU is a CORSAIR CMPSU-750TX (single rail)

 

By PSU spec I should be OK. All of my drives are 7200 RPM and are left spun up 24/7. I suppose it could be a household current problem; the lights in my house do dim slightly when my furnace first kicks in. Are these errors possibly the SAS card losing power temporarily and dropping into some kind of unknown state?

 

Thanks.

 

p.s. yes, I do fly RC. Do I know you from that world?

 

You did not provide the hardware breakdown requested previously, so I have a few questions:

 

1. Do you use a UPS?

 

2. How old is your PSU? And your motherboard?

 

 

And a third, unrelated to unRAID: do you fly RC planes?  ;)

Link to comment

I woke up this morning and the server is unresponsive again. Here's the last of the log before the server died altogether.

Apr  1 20:19:30 Tower kernel: ata6: no sense translation for status: 0x50
Apr  1 20:19:30 Tower kernel: ata6: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Apr  1 20:19:30 Tower kernel: ata6: status=0x50 { DriveReady SeekComplete }
Apr  1 20:19:30 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Apr  1 20:19:30 Tower kernel: sas: lldd_execute_task returned: -132
Apr  1 20:19:30 Tower kernel: ata6: no sense translation for status: 0x50
Apr  1 20:19:30 Tower kernel: ata6: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Apr  1 20:19:30 Tower kernel: ata6: status=0x50 { DriveReady SeekComplete }
Apr  1 20:19:30 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Apr  1 20:19:30 Tower kernel: sas: lldd_execute_task returned: -132
Apr  1 20:19:30 Tower kernel: ata6: no sense translation for status: 0x50
Apr  1 20:19:30 Tower kernel: ata6: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Apr  1 20:19:30 Tower kernel: ata6: status=0x50 { DriveReady SeekComplete }
Apr  1 20:19:40 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Apr  1 20:19:40 Tower kernel: sas: lldd_execute_task returned: -132
Apr  1 20:19:40 Tower kernel: ata5: no sense translation for status: 0x50
Apr  1 20:19:40 Tower kernel: ata5: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Apr  1 20:19:40 Tower kernel: ata5: status=0x50 { DriveReady SeekComplete }
Apr  1 20:19:40 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Apr  1 20:19:40 Tower kernel: sas: lldd_execute_task returned: -132
Apr  1 20:19:40 Tower kernel: ata5: no sense translation for status: 0x50
Apr  1 20:19:40 Tower kernel: ata5: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Apr  1 20:19:40 Tower kernel: ata5: status=0x50 { DriveReady SeekComplete }
Apr  1 20:19:40 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Apr  1 20:19:40 Tower kernel: sas: lldd_execute_task returned: -132
Apr  1 20:19:40 Tower kernel: ata5: no sense translation for status: 0x50
Apr  1 20:19:40 Tower kernel: ata5: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Apr  1 20:19:40 Tower kernel: ata5: status=0x50 { DriveReady SeekComplete }
Apr  1 20:19:40 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Apr  1 20:19:40 Tower kernel: sas: lldd_execute_task returned: -132
Apr  1 20:19:40 Tower kernel: ata5: no sense translation for status: 0x50
Apr  1 20:19:40 Tower kernel: ata5: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Apr  1 20:19:40 Tower kernel: ata5: status=0x50 { DriveReady SeekComplete }
Apr  1 20:19:40 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Apr  1 20:19:40 Tower kernel: sas: lldd_execute_task returned: -132
Apr  1 20:19:40 Tower kernel: ata5: no sense translation for status: 0x50
Apr  1 20:19:40 Tower kernel: ata5: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Apr  1 20:19:40 Tower kernel: ata5: status=0x50 { DriveReady SeekComplete }
Apr  1 20:19:40 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Apr  1 20:19:40 Tower kernel: sas: lldd_execute_task returned: -132
Apr  1 20:19:40 Tower kernel: ata5: no sense translation for status: 0x50
Apr  1 20:19:40 Tower kernel: ata5: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Apr  1 20:19:40 Tower kernel: ata5: status=0x50 { DriveReady SeekComplete }
Apr  1 20:19:40 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Apr  1 20:19:40 Tower kernel: sas: lldd_execute_task returned: -132
Apr  1 20:19:40 Tower kernel: ata5: no sense translation for status: 0x50
Apr  1 20:19:40 Tower kernel: ata5: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Apr  1 20:19:40 Tower kernel: ata5: status=0x50 { DriveReady SeekComplete }
Apr  1 20:19:40 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Apr  1 20:19:40 Tower kernel: sas: lldd_execute_task returned: -132
Apr  1 20:19:40 Tower kernel: ata5: no sense translation for status: 0x50
Apr  1 20:19:40 Tower kernel: ata5: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Apr  1 20:19:40 Tower kernel: ata5: status=0x50 { DriveReady SeekComplete }
Apr  1 20:19:40 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[

 

Link to comment

This thread discusses the exact same bug:

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/554398

 

It seems to be in the mvsas driver. You might have better luck with unRAID 4.7, or even 5.0beta6a, since it has a newer kernel.

 

Note that 5.0beta6a has specific upgrade instructions in its release notes.  Be sure to follow them.  Step 1 is probably to STOP the array and copy your entire /boot/config folder to a safe place while the array is stopped. This will allow you to restore it if needed to revert to version 4.6 of unRAID.

 

If you do try 5.0beta6a, do NOT start the array if any of the disks show as MBR-unknown. Instead, post the output of these commands:

(Where sdX = the appropriate three-letter device name for any disk showing as MBR-unknown.)

sfdisk -lus /dev/sdX

cat /sys/block/sdX/size

dd status=noxfer count=1 if=/dev/sdX | od -Ad -t x1

 

If all disks show as MBR-unaligned you can start the array.

 

Joe L.
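
For the curious, the dd/od command above dumps the disk's first sector, which holds the MBR partition table. A hedged sketch of how the first partition's starting sector could be read out of such a dump: the byte offsets are the standard MBR layout, but the function name mbr_part1_start and the idea of saving the sector to a file first are my own assumptions, and this thread does not spell out how unRAID itself decides which MBR label to show.

```shell
#!/bin/sh
# Print the starting sector (LBA) of the first partition in an MBR dump.
# $1 is a file holding at least the first 512 bytes of the disk.
mbr_part1_start() {
    # The first partition entry begins at byte 446; its 4-byte
    # little-endian start LBA sits 8 bytes into the entry (bytes 454-457).
    set -- $(od -An -t u1 -j 454 -N 4 "$1")
    echo $(( $1 + ($2 << 8) + ($3 << 16) + ($4 << 24) ))
}
```

Usage would be something like dd count=1 if=/dev/sdX of=/tmp/mbr.bin followed by mbr_part1_start /tmp/mbr.bin. A result of 63 is the traditional start sector, which is not a multiple of 8 and so not 4K-aligned with 512-byte sectors, while 64 is. Either way, still post the raw output as asked above rather than relying on this.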

 

Link to comment

Thanks for the help. I just upgraded to 4.7. A drive red-balled when I cold restarted the computer so I used the "trust array" procedure. A parity check is running now. I've got a telnet session piping out to a text file, so if it happens again I should have a log of the beginning of the failure.

 

Link to comment
  • 1 month later...

The server had been very stable on 4.7 for the past few weeks, until today. It became unresponsive, but luckily I was tailing the syslog to a log file over telnet. Below is the excerpt from where things began to go wrong up to the point where the log file cut off; I was unable to reconnect to the server until I hard rebooted.

 

Any suggestions on what the problem might be?

 

May  4 16:50:53 Tower ntpd[1601]: synchronized to 169.229.70.183, stratum 3
May  4 17:04:38 Tower ntpd[1601]: time reset +1.753126 s
May  4 17:06:05 Tower ntpd[1601]: synchronized to 169.229.70.183, stratum 3
May  4 17:21:01 Tower ntpd[1601]: time reset +1.839623 s
May  4 17:21:57 Tower ntpd[1601]: synchronized to 169.229.70.183, stratum 3
May  4 17:38:00 Tower ntpd[1601]: time reset +1.874318 s
May  4 17:38:29 Tower ntpd[1601]: synchronized to 169.229.70.183, stratum 3
May  4 17:53:20 Tower kernel: sas: command 0xc6b640c0, task 0xc4186a00, timed out: BLK_EH_NOT_HANDLED
May  4 17:53:20 Tower kernel: sas: Enter sas_scsi_recover_host
May  4 17:53:20 Tower kernel: sas: trying to find task 0xc4186a00
May  4 17:53:20 Tower kernel: sas: sas_scsi_find_task: aborting task 0xc4186a00
May  4 17:53:20 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1701:mvs_abort_task:rc= 5
May  4 17:53:20 Tower kernel: sas: sas_scsi_find_task: querying task 0xc4186a00
May  4 17:53:20 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1645:mvs_query_task:rc= 5
May  4 17:53:20 Tower kernel: sas: sas_scsi_find_task: task 0xc4186a00 failed to abort
May  4 17:53:20 Tower kernel: sas: task 0xc4186a00 is not at LU: I_T recover
May  4 17:53:20 Tower kernel: sas: I_T nexus reset for dev 0300000000000000
May  4 17:53:20 Tower kernel: sas: I_T 0300000000000000 recovered
May  4 17:53:20 Tower kernel: sas: --- Exit sas_scsi_recover_host
May  4 17:53:51 Tower kernel: sas: command 0xc6b640c0, task 0xc419c780, timed out: BLK_EH_NOT_HANDLED
May  4 17:53:51 Tower kernel: sas: Enter sas_scsi_recover_host
May  4 17:53:51 Tower kernel: sas: trying to find task 0xc419c780
May  4 17:53:51 Tower kernel: sas: sas_scsi_find_task: aborting task 0xc419c780
May  4 17:53:51 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1701:mvs_abort_task:rc= 5
May  4 17:53:51 Tower kernel: sas: sas_scsi_find_task: querying task 0xc419c780
May  4 17:53:51 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1645:mvs_query_task:rc= 5
May  4 17:53:51 Tower kernel: sas: sas_scsi_find_task: task 0xc419c780 failed to abort
May  4 17:53:51 Tower kernel: sas: task 0xc419c780 is not at LU: I_T recover
May  4 17:53:51 Tower kernel: sas: I_T nexus reset for dev 0300000000000000
May  4 17:53:51 Tower kernel: sas: I_T 0300000000000000 recovered
May  4 17:53:51 Tower kernel: sas: --- Exit sas_scsi_recover_host
May  4 17:54:21 Tower kernel: sas: command 0xc6b640c0, task 0xc4186dc0, timed out: BLK_EH_NOT_HANDLED
May  4 17:54:21 Tower kernel: sas: Enter sas_scsi_recover_host
May  4 17:54:21 Tower kernel: sas: trying to find task 0xc4186dc0
May  4 17:54:21 Tower kernel: sas: sas_scsi_find_task: aborting task 0xc4186dc0
May  4 17:54:21 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1701:mvs_abort_task:rc= 5
May  4 17:54:21 Tower kernel: sas: sas_scsi_find_task: querying task 0xc4186dc0
May  4 17:54:21 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1645:mvs_query_task:rc= 5
May  4 17:54:21 Tower kernel: sas: sas_scsi_find_task: task 0xc4186dc0 failed to abort
May  4 17:54:21 Tower kernel: sas: task 0xc4186dc0 is not at LU: I_T recover
May  4 17:54:21 Tower kernel: sas: I_T nexus reset for dev 0300000000000000
May  4 17:54:21 Tower kernel: sas: I_T 0300000000000000 recovered
May  4 17:54:21 Tower kernel: sas: --- Exit sas_scsi_recover_host
May  4 17:54:25 Tower ntpd[1601]: time reset +1.820599 s
May  4 17:54:54 Tower kernel: sas: command 0xc6b640c0, task 0xc4186dc0, timed out: BLK_EH_NOT_HANDLED
May  4 17:54:54 Tower kernel: sas: Enter sas_scsi_recover_host
May  4 17:54:54 Tower kernel: sas: trying to find task 0xc4186dc0
May  4 17:54:54 Tower kernel: sas: sas_scsi_find_task: aborting task 0xc4186dc0
May  4 17:54:54 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1701:mvs_abort_task:rc= 5
May  4 17:54:54 Tower kernel: sas: sas_scsi_find_task: querying task 0xc4186dc0
May  4 17:54:54 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1645:mvs_query_task:rc= 5
May  4 17:54:54 Tower kernel: sas: sas_scsi_find_task: task 0xc4186dc0 failed to abort
May  4 17:54:54 Tower kernel: sas: task 0xc4186dc0 is not at LU: I_T recover
May  4 17:54:54 Tower kernel: sas: I_T nexus reset for dev 0300000000000000
May  4 17:54:54 Tower kernel: sas: I_T 0300000000000000 recovered
May  4 17:54:54 Tower kernel: sas: --- Exit sas_scsi_recover_host
May  4 17:54:58 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1841:port has not device.
May  4 17:54:58 Tower kernel: sas: sas_ata_task_done: SAS error 8a
May  4 17:54:58 Tower kernel: ata4: translated ATA stat/err 0x01/04 to SCSI SK/ASC/ASCQ 0xb/00/00
May  4 17:54:58 Tower kernel: ata4: status=0x01 { Error }
May  4 17:54:58 Tower kernel: ata4: error=0x04 { DriveStatusError }
May  4 17:55:28 Tower kernel: sas: command 0xc6b640c0, task 0xc419c500, timed out: BLK_EH_NOT_HANDLED
May  4 17:55:28 Tower kernel: sas: Enter sas_scsi_recover_host
May  4 17:55:28 Tower kernel: sas: trying to find task 0xc419c500
May  4 17:55:28 Tower kernel: sas: sas_scsi_find_task: aborting task 0xc419c500
May  4 17:55:28 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1701:mvs_abort_task:rc= 5
May  4 17:55:28 Tower kernel: sas: sas_scsi_find_task: querying task 0xc419c500
May  4 17:55:28 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1645:mvs_query_task:rc= 5
May  4 17:55:28 Tower kernel: sas: sas_scsi_find_task: task 0xc419c500 failed to abort
May  4 17:55:28 Tower kernel: sas: task 0xc419c500 is not at LU: I_T recover
May  4 17:55:28 Tower kernel: sas: I_T nexus reset for dev 0300000000000000
May  4 17:55:28 Tower kernel: sas: I_T 0300000000000000 recovered
May  4 17:55:28 Tower kernel: sas: --- Exit sas_scsi_recover_host
May  4 17:56:09 Tower kernel: sas: command 0xf41ed0c0, task 0xc4187900, timed out: BLK_EH_NOT_HANDLED
May  4 17:56:09 Tower kernel: sas: Enter sas_scsi_recover_host
May  4 17:56:09 Tower kernel: sas: trying to find task 0xc4187900
May  4 17:56:09 Tower kernel: sas: sas_scsi_find_task: aborting task 0xc4187900
May  4 17:56:09 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1701:mvs_abort_task:rc= 5
May  4 17:56:09 Tower kernel: sas: sas_scsi_find_task: querying task 0xc4187900
May  4 17:56:09 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1645:mvs_query_task:rc= 5
May  4 17:56:09 Tower kernel: sas: sas_scsi_find_task: task 0xc4187900 failed to abort
May  4 17:56:09 Tower kernel: sas: task 0xc4187900 is not at LU: I_T recover
May  4 17:56:09 Tower kernel: sas: I_T nexus reset for dev 0300000000000000
May  4 17:56:09 Tower kernel: sas: I_T 0300000000000000 recovered
May  4 17:56:09 Tower kernel: sas: --- Exit sas_scsi_recover_host
May  4 17:56:40 Tower kernel: sas: command 0xf41ed0c0, task 0xc4187900, timed out: BLK_EH_NOT_HANDLED
May  4 17:56:40 Tower kernel: sas: Enter sas_scsi_recover_host
May  4 17:56:40 Tower kernel: sas: trying to find task 0xc4187900
May  4 17:56:40 Tower kernel: sas: sas_scsi_find_task: aborting task 0xc4187900
May  4 17:56:40 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1701:mvs_abort_task:rc= 5
May  4 17:56:40 Tower kernel: sas: sas_scsi_find_task: querying task 0xc4187900
May  4 17:56:40 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1645:mvs_query_task:rc= 5
May  4 17:56:40 Tower kernel: sas: sas_scsi_find_task: task 0xc4187900 failed to abort
May  4 17:56:40 Tower kernel: sas: task 0xc4187900 is not at LU: I_T recover
May  4 17:56:40 Tower kernel: sas: I_T nexus reset for dev 0300000000000000
May  4 17:56:40 Tower kernel: sas: I_T 0300000000000000 recovered
May  4 17:56:40 Tower kernel: sas: --- Exit sas_scsi_recover_host
May  4 17:57:11 Tower kernel: sas: command 0xf41ed0c0, task 0xc4187900, timed out: BLK_EH_NOT_HANDLED
May  4 17:57:11 Tower kernel: sas: Enter sas_scsi_recover_host
May  4 17:57:11 Tower kernel: sas: trying to find task 0xc4187900
May  4 17:57:11 Tower kernel: sas: sas_scsi_find_task: aborting task 0xc4187900
May  4 17:57:11 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1701:mvs_abort_task:rc= 5
May  4 17:57:11 Tower kernel: sas: sas_scsi_find_task: querying task 0xc4187900
May  4 17:57:11 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1645:mvs_query_task:rc= 5
May  4 17:57:11 Tower kernel: sas: sas_scsi_find_task: task 0xc4187900 failed to abort
May  4 17:57:11 Tower kernel: sas: task 0xc4187900 is not at LU: I_T recover
May  4 17:57:11 Tower kernel: sas: I_T nexus reset for dev 0300000000000000
May  4 17:57:11 Tower kernel: sas: I_T 0300000000000000 recovered
May  4 17:57:11 Tower kernel: sas: --- Exit sas_scsi_recover_host
May  4 17:57:34 Tower ntpd[1601]: synchronized to 169.229.70.183, stratum 3
May  4 17:57:42 Tower kernel: sas: command 0xf41ed0c0, task 0xc4187900, timed out: BLK_EH_NOT_HANDLED
May  4 17:57:42 Tower kernel: sas: Enter sas_scsi_recover_host
May  4 17:57:42 Tower kernel: sas: trying to find task 0xc4187900
May  4 17:57:42 Tower kernel: sas: sas_scsi_find_task: aborting task 0xc4187900
May  4 17:57:42 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1701:mvs_abort_task:rc= 5
May  4 17:57:42 Tower kernel: sas: sas_scsi_find_task: querying task 0xc4187900
May  4 17:57:42 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1645:mvs_query_task:rc= 5
May  4 17:57:42 Tower kernel: sas: sas_scsi_find_task: task 0xc4187900 failed to abort
May  4 17:57:42 Tower kernel: sas: task 0xc4187900 is not at LU: I_T recover
May  4 17:57:42 Tower kernel: sas: I_T nexus reset for dev 0300000000000000
May  4 17:57:42 Tower kernel: sas: I_T 0300000000000000 recovered
May  4 17:57:42 Tower kernel: sas: --- Exit sas_scsi_recover_host
May  4 17:58:13 Tower kernel: sas: command 0xf41ed0c0, task 0xc4187900, timed out: BLK_EH_NOT_HANDLED
May  4 17:58:13 Tower kernel: sas: Enter sas_scsi_recover_host
May  4 17:58:13 Tower kernel: sas: trying to find task 0xc4187900
May  4 17:58:13 Tower kernel: sas: sas_scsi_find_task: aborting task 0xc4187900
May  4 17:58:13 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1701:mvs_abort_task:rc= 5
May  4 17:58:13 Tower kernel: sas: sas_scsi_find_task: querying task 0xc4187900
May  4 17:58:13 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1645:mvs_query_task:rc= 5
May  4 17:58:13 Tower kernel: sas: sas_scsi_find_task: task 0xc4187900 failed to abort
May  4 17:58:13 Tower kernel: sas: task 0xc4187900 is not at LU: I_T recover
May  4 17:58:13 Tower kernel: sas: I_T nexus reset for dev 0300000000000000
May  4 17:58:13 Tower kernel: sas: I_T 0300000000000000 recovered
May  4 17:58:13 Tower kernel: sas: --- Exit sas_scsi_recover_host
May  4 17:58:44 Tower kernel: sas: command 0xf41ed0c0, task 0xc4187900, timed out: BLK_EH_NOT_HANDLED
May  4 17:58:44 Tower kernel: sas: Enter sas_scsi_recover_host
May  4 17:58:44 Tower kernel: sas: trying to find task 0xc4187900
May  4 17:58:44 Tower kernel: sas: sas_scsi_find_task: aborting task 0xc4187900
May  4 17:58:44 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1701:mvs_abort_task:rc= 5
May  4 17:58:44 Tower kernel: sas: sas_scsi_find_task: querying task 0xc4187900
May  4 17:58:44 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1645:mvs_query_task:rc= 5
May  4 17:58:44 Tower kernel: sas: sas_scsi_find_task: task 0xc4187900 failed to abort
May  4 17:58:44 Tower kernel: sas: task 0xc4187900 is not at LU: I_T recover
May  4 17:58:44 Tower kernel: sas: I_T nexus reset for dev 0300000000000000
May  4 17:58:44 Tower kernel: sas: I_T 0300000000000000 recovered
May  4 17:58:44 Tower kernel: sas: --- Exit sas_scsi_recover_host
May  4 17:59:25 Tower kernel: sas: command 0xf41ede40, task 0xc4187cc0, timed out: BLK_EH_NOT_HANDLED
May  4 17:59:25 Tower kernel: sas: Enter sas_scsi_recover_host
May  4 17:59:25 Tower kernel: sas: trying to find task 0xc4187cc0
May  4 17:59:25 Tower kernel: sas: sas_scsi_find_task: aborting task 0xc4187cc0
May  4 17:59:25 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1701:mvs_abort_task:rc= 5
May  4 17:59:25 Tower kernel: sas: sas_scsi_find_task: querying task 0xc4187cc0
May  4 17:59:25 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1645:mvs_query_task:rc= 5
May  4 17:59:25 Tower kernel: sas: sas_scsi_find_task: task 0xc4187cc0 failed to abort
May  4 17:59:25 Tower kernel: sas: task 0xc4187cc0 is not at LU: I_T recover
May  4 17:59:25 Tower kernel: sas: I_T nexus reset for dev 0300000000000000
May  4 17:59:25 Tower kernel: sas: I_T 0300000000000000 recovered
May  4 17:59:25 Tower kernel: sas: --- Exit sas_scsi_recover_host

 

Link to comment

Thanks for the help. I took a look and found nothing conclusive. The log doesn't seem to contain any indication of which drive might have been causing the problems.

 

Ben, the log points to 0300000000000000, which is the hard drive attached to the third channel (00 to 03 are on the first SFF-8087 port, and 04 to 07 are on the other port).
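If you want to check this against a full syslog rather than by eye, here is a rough Python sketch. It counts the "I_T nexus reset" lines per device address and applies the mapping described above. Treating the first two hex digits of the address as the channel number is an assumption based on that reading of the log, not something taken from the mvsas documentation, and the function names `count_resets` and `connector_for` are made up for illustration.

```python
import re
from collections import Counter

# Matches the libsas recovery lines seen in the syslog above.
RESET_RE = re.compile(r"sas: I_T nexus reset for dev ([0-9a-fA-F]{16})")


def count_resets(lines):
    """Count I_T nexus resets per device address in an iterable of syslog lines."""
    counts = Counter()
    for line in lines:
        match = RESET_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def connector_for(dev):
    """Map a device address to (channel, connector).

    Assumption: the first byte of the address is the channel number,
    with 00-03 on the first SFF-8087 connector and 04-07 on the second.
    """
    channel = int(dev[:2], 16)
    if 0 <= channel <= 3:
        return channel, "first SFF-8087 connector"
    if 4 <= channel <= 7:
        return channel, "second SFF-8087 connector"
    return channel, "unknown"


# Two lines copied from the log in this thread, plus one unrelated line.
sample = [
    "May  4 17:56:09 Tower kernel: sas: I_T nexus reset for dev 0300000000000000",
    "May  4 17:56:40 Tower kernel: sas: I_T nexus reset for dev 0300000000000000",
    "May  4 17:57:34 Tower ntpd[1601]: synchronized to 169.229.70.183, stratum 3",
]

if __name__ == "__main__":
    for dev, n in count_resets(sample).most_common():
        channel, connector = connector_for(dev)
        print(f"{dev}: {n} resets, channel {channel:02d}, {connector}")
```

To run it on a real log, replace `sample` with the open file, for example `count_resets(open("/var/log/syslog"))`. If only one address ever shows up, that points to a single drive, cable or port rather than the whole controller.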

 

Since you have a single controller in your system, this is different from the other cases with dual controllers.

 

Apart from the immature/buggy driver for this card, the possible explanations I can see are a power glitch (you do not have a UPS, and you obviously have power problems when your furnace kicks in) or a bit flip in the DRAM, since your motherboard does not support ECC memory and you are not using it. A bit flip will corrupt your OS, as it runs from memory, and you cannot recover until you restart the server.

 

Just hope you do not keep the TEMAC/EMFSO backup on this server.  :-X

Link to comment

Archived

This topic is now archived and is closed to further replies.
