bfeist Posted March 18, 2011
This is a strange one. My unRAID 4.6 server has started becoming unresponsive when left overnight. I run it headless over the network, with no monitor or keyboard, and in the morning it doesn't respond to HTTP, telnet, SMB or anything else. Rebooting clears the log, so I can't see what it's freezing on. I know this is super ambiguous, but does anyone have any suggestions?
Thanks,
Ben
prostuff1 Posted March 18, 2011
"I know this is super ambiguous, but does anyone have any suggestions?"
Yes: give us a hardware breakdown of what is in the machine, and let us know if anything new has been added lately. You could also try updating to 4.7, though I would not suggest it until we figure out what might be going on. There is a sticky at the top of this forum that explains how to capture a syslog from the server, including when it hangs.
bfeist Posted March 21, 2011
I caught the server mid-crash yesterday. It was logging thousands of duplicate object errors. I tried to view the log midstream, but it was so huge that it crashed my browser. I'm going to try the suggestion listed here: http://lime-technology.com/forum/index.php?topic=3352.msg28691#msg28691 and will also try to get to the bottom of the duplicate object problem.
bfeist Posted March 29, 2011 Author Share Posted March 29, 2011 I've caught my server mid-crash. I moved the syslog to a 100mb ramdisk so the problem can't bring the whole system down by filling the syslog. The unraid web interface is showing errors for many of the drives, i'm guessing the drives that are connected to my AOC-SASLP-MV8 going by the syslog error. A reboot fixed this last time. Any suggestions as to what might be going wrong? Below is a clip of the syslog. Mar 28 21:49:33 Tower kernel: ata7: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00 Mar 28 21:49:33 Tower kernel: ata7: status=0x50 { DriveReady SeekComplete } Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]! Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132 Mar 28 21:49:33 Tower kernel: ata7: no sense translation for status: 0x50 Mar 28 21:49:33 Tower kernel: ata7: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00 Mar 28 21:49:33 Tower kernel: ata7: status=0x50 { DriveReady SeekComplete } Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]! Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132 Mar 28 21:49:33 Tower kernel: ata7: no sense translation for status: 0x50 Mar 28 21:49:33 Tower kernel: ata7: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00 Mar 28 21:49:33 Tower kernel: ata7: status=0x50 { DriveReady SeekComplete } Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]! Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132 Mar 28 21:49:33 Tower kernel: ata7: no sense translation for status: 0x50 Mar 28 21:49:33 Tower kernel: ata7: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00 Mar 28 21:49:33 Tower kernel: ata7: status=0x50 { DriveReady SeekComplete } Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]! 
Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132 Mar 28 21:49:33 Tower kernel: ata7: no sense translation for status: 0x50 Mar 28 21:49:33 Tower kernel: ata7: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00 Mar 28 21:49:33 Tower kernel: ata7: status=0x50 { DriveReady SeekComplete } Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]! Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132 Mar 28 21:49:33 Tower kernel: ata7: no sense translation for status: 0x50 Mar 28 21:49:33 Tower kernel: ata7: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00 Mar 28 21:49:33 Tower kernel: ata7: status=0x50 { DriveReady SeekComplete } Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]! Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132 Mar 28 21:49:33 Tower kernel: ata7: no sense translation for status: 0x50 Mar 28 21:49:33 Tower kernel: ata7: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00 Mar 28 21:49:33 Tower kernel: ata7: status=0x50 { DriveReady SeekComplete } Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]! Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132 Mar 28 21:49:33 Tower kernel: ata3: no sense translation for status: 0x50 Mar 28 21:49:33 Tower kernel: ata3: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00 Mar 28 21:49:33 Tower kernel: ata3: status=0x50 { DriveReady SeekComplete } Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]! Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132 Mar 28 21:49:33 Tower kernel: ata3: no sense translation for status: 0x50 Mar 28 21:49:33 Tower kernel: ata3: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00 Mar 28 21:49:33 Tower kernel: ata3: status=0x50 { DriveReady SeekComplete } Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]! 
Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132 Mar 28 21:49:33 Tower kernel: ata3: no sense translation for status: 0x50 Mar 28 21:49:33 Tower kernel: ata3: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00 Mar 28 21:49:33 Tower kernel: ata3: status=0x50 { DriveReady SeekComplete } Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]! Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132 Mar 28 21:49:33 Tower kernel: ata3: no sense translation for status: 0x50 Mar 28 21:49:33 Tower kernel: ata3: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00 Mar 28 21:49:33 Tower kernel: ata3: status=0x50 { DriveReady SeekComplete } Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]! Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132 Mar 28 21:49:33 Tower kernel: ata3: no sense translation for status: 0x50 Mar 28 21:49:33 Tower kernel: ata3: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00 Mar 28 21:49:33 Tower kernel: ata3: status=0x50 { DriveReady SeekComplete } Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]! Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132 Mar 28 21:49:33 Tower kernel: ata3: no sense translation for status: 0x50 Mar 28 21:49:33 Tower kernel: ata3: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00 Mar 28 21:49:33 Tower kernel: ata3: status=0x50 { DriveReady SeekComplete } Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]! Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132 Mar 28 21:49:33 Tower kernel: ata3: no sense translation for status: 0x50 Mar 28 21:49:33 Tower kernel: ata3: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00 Mar 28 21:49:33 Tower kernel: ata3: status=0x50 { DriveReady SeekComplete } Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]! 
Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132 Mar 28 21:49:33 Tower kernel: ata3: no sense translation for status: 0x50 Mar 28 21:49:33 Tower kernel: ata3: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00 Mar 28 21:49:33 Tower kernel: ata3: status=0x50 { DriveReady SeekComplete } Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]! Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132 Mar 28 21:49:33 Tower kernel: ata3: no sense translation for status: 0x50 Mar 28 21:49:33 Tower kernel: ata3: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00 Mar 28 21:49:33 Tower kernel: ata3: status=0x50 { DriveReady SeekComplete } Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]! Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132 Mar 28 21:49:33 Tower kernel: ata3: no sense translation for status: 0x50 Mar 28 21:49:33 Tower kernel: ata3: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00 Mar 28 21:49:33 Tower kernel: ata3: status=0x50 { DriveReady SeekComplete } Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]! Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132 Mar 28 21:49:33 Tower kernel: ata3: no sense translation for status: 0x50 Mar 28 21:49:33 Tower kernel: ata3: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00 Mar 28 21:49:33 Tower kernel: ata3: status=0x50 { DriveReady SeekComplete } Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]! Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132 Mar 28 21:49:33 Tower kernel: ata3: no sense translation for status: 0x50 Mar 28 21:49:33 Tower kernel: ata3: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00 Mar 28 21:49:33 Tower kernel: ata3: status=0x50 { DriveReady SeekComplete } Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]! 
Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132 Mar 28 21:49:33 Tower kernel: ata6: no sense translation for status: 0x50 Mar 28 21:49:33 Tower kernel: ata6: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00 Mar 28 21:49:33 Tower kernel: ata6: status=0x50 { DriveReady SeekComplete } Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]! Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132 Mar 28 21:49:33 Tower kernel: ata6: no sense translation for status: 0x50 Mar 28 21:49:33 Tower kernel: ata6: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00 Mar 28 21:49:33 Tower kernel: ata6: status=0x50 { DriveReady SeekComplete } Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]! Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132 Mar 28 21:49:33 Tower kernel: ata6: no sense translation for status: 0x50 Mar 28 21:49:33 Tower kernel: ata6: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00 Mar 28 21:49:33 Tower kernel: ata6: status=0x50 { DriveReady SeekComplete } Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]! Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132 Mar 28 21:49:33 Tower kernel: ata6: no sense translation for status: 0x50 Mar 28 21:49:33 Tower kernel: ata6: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00 Mar 28 21:49:33 Tower kernel: ata6: status=0x50 { DriveReady SeekComplete } Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]! Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132 Mar 28 21:49:33 Tower kernel: ata6: no sense translation for status: 0x50 Mar 28 21:49:33 Tower kernel: ata6: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00 Mar 28 21:49:33 Tower kernel: ata6: status=0x50 { DriveReady SeekComplete } Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]! 
Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132 Mar 28 21:49:33 Tower kernel: ata6: no sense translation for status: 0x50 Mar 28 21:49:33 Tower kernel: ata6: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00 Mar 28 21:49:33 Tower kernel: ata6: status=0x50 { DriveReady SeekComplete } Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]! Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132 Mar 28 21:49:33 Tower kernel: ata6: no sense translation for status: 0x50 Mar 28 21:49:33 Tower kernel: ata6: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00 Mar 28 21:49:33 Tower kernel: ata6: status=0x50 { DriveReady SeekComplete } Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]! Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132 Mar 28 21:49:33 Tower kernel: ata6: no sense translation for status: 0x50 Mar 28 21:49:33 Tower kernel: ata6: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00 Mar 28 21:49:33 Tower kernel: ata6: status=0x50 { DriveReady SeekComplete } Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]! Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132 Mar 28 21:49:33 Tower kernel: ata6: no sense translation for status: 0x50 Mar 28 21:49:33 Tower kernel: ata6: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00 Mar 28 21:49:33 Tower kernel: ata6: status=0x50 { DriveReady SeekComplete } Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]! Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132 Mar 28 21:49:33 Tower kernel: ata6: no sense translation for status: 0x50 Mar 28 21:49:33 Tower kernel: ata6: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00 Mar 28 21:49:33 Tower kernel: ata6: status=0x50 { DriveReady SeekComplete } Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]! 
Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132 Mar 28 21:49:33 Tower kernel: ata6: no sense translation for status: 0x50 Mar 28 21:49:33 Tower kernel: ata6: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00 Mar 28 21:49:33 Tower kernel: ata6: status=0x50 { DriveReady SeekComplete } Mar 28 21:49:33 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]! Mar 28 21:49:33 Tower kernel: sas: lldd_execute_task returned: -132 Mar 28 21:49:33 Tower kernel: ata6: no sense translation for status: 0x50 Mar 28 21:49:33 Tower kernel: ata6: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00 Mar 28 21:49:33 Tower kernel: ata6: status=0x50 { DriveReady SeekComplete } Link to comment
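For readers who want to try the ramdisk idea mentioned in the post above, here is a minimal sketch. It is not the procedure from the linked forum post; it assumes a size-limited tmpfs mounted over the log directory, and `run`, `mount_log_ramdisk` and the `DRY_RUN` variable are hypothetical helper names invented for illustration.

```shell
#!/bin/sh
# Sketch only: cap log growth by keeping the log directory on a size-limited tmpfs.
# DRY_RUN, run and mount_log_ramdisk are illustrative names, not part of unRAID.

# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "$DRY_RUN" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: mount_log_ramdisk <directory> <size>, for example /var/log 100m
mount_log_ramdisk() {
    dir=$1
    size=$2
    run mount -t tmpfs -o "size=$size" tmpfs "$dir"
}

# Show what would be executed without touching the system.
( DRY_RUN=1; mount_log_ramdisk /var/log 100m )
```

A syslog daemon that already has its file open will probably keep writing to the old file, so it would need to be restarted after the mount, and anything already in the directory is hidden while the tmpfs is mounted.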
bfeist Posted March 29, 2011
FYI, I just rebooted the system and the logging frenzy has stopped; the error counts on the unRAID page are all back to 0. It seems to do this after being up for about a week, maybe more.
bcbgboy13 Posted March 29, 2011
You did not provide the hardware breakdown requested earlier, so I have a few questions:
1. Do you use a UPS?
2. How old is your PSU? And your motherboard?
3. Unrelated to unRAID: do you fly RC planes?
bfeist Posted March 31, 2011
Hi, sorry for the delay. I don't use a UPS. Do you think this could be a power issue? It only started happening once I had 15 drives in place.
My motherboard is a Gigabyte GA-MA74GM-S2.
My PSU is a CORSAIR CMPSU-750TX (single rail).
By the PSU spec I should be OK. All of my drives are 7200 RPM and are left spun up 24/7. I suppose it could be a household current problem; the lights in my house do dim slightly when my furnace first kicks in. Could these errors be the SAS card losing power temporarily and dropping into some kind of unknown state?
Thanks.
P.S. Yes, I do fly RC. Do I know you from that world?
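Whether 15 drives fit the PSU budget comes down to simple arithmetic on the +12 V rail. A rough sketch, assuming about 2 A per drive at spin-up (a hypothetical rule-of-thumb figure; the drive datasheets give the real number). The result would then be compared with the +12 V rating printed on the PSU label. `spinup_amps` is an illustrative helper name.

```shell
#!/bin/sh
# Back-of-envelope +12 V load when all drives spin up at once.
# The 2 A per drive figure is an assumed value, not a measurement.

# Usage: spinup_amps <drive_count> <amps_per_drive>
spinup_amps() {
    echo $(( $1 * $2 ))
}

drives=15
amps=$(spinup_amps "$drives" 2)
watts=$(( amps * 12 ))
echo "$drives drives: about $amps A on +12 V ($watts W) at spin-up"
```

This is a worst case at power-on; drives that are already spinning draw less, so a budget that passes this check would not by itself rule out dips in the household supply.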
bfeist Posted April 2, 2011
I woke up this morning and the server is unresponsive again. Here's the last of the log before the server died altogether.
Apr 1 20:19:30 Tower kernel: ata6: no sense translation for status: 0x50
Apr 1 20:19:30 Tower kernel: ata6: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Apr 1 20:19:30 Tower kernel: ata6: status=0x50 { DriveReady SeekComplete }
Apr 1 20:19:30 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Apr 1 20:19:30 Tower kernel: sas: lldd_execute_task returned: -132
Apr 1 20:19:30 Tower kernel: ata6: no sense translation for status: 0x50
Apr 1 20:19:30 Tower kernel: ata6: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Apr 1 20:19:30 Tower kernel: ata6: status=0x50 { DriveReady SeekComplete }
Apr 1 20:19:30 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Apr 1 20:19:30 Tower kernel: sas: lldd_execute_task returned: -132
Apr 1 20:19:30 Tower kernel: ata6: no sense translation for status: 0x50
Apr 1 20:19:30 Tower kernel: ata6: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Apr 1 20:19:30 Tower kernel: ata6: status=0x50 { DriveReady SeekComplete }
Apr 1 20:19:40 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Apr 1 20:19:40 Tower kernel: sas: lldd_execute_task returned: -132
Apr 1 20:19:40 Tower kernel: ata5: no sense translation for status: 0x50
Apr 1 20:19:40 Tower kernel: ata5: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Apr 1 20:19:40 Tower kernel: ata5: status=0x50 { DriveReady SeekComplete }
Apr 1 20:19:40 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Apr 1 20:19:40 Tower kernel: sas: lldd_execute_task returned: -132
Apr 1 20:19:40 Tower kernel: ata5: no sense translation for status: 0x50
Apr 1 20:19:40 Tower kernel: ata5: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Apr 1 20:19:40 Tower kernel: ata5: status=0x50 { DriveReady SeekComplete }
Apr 1 20:19:40 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Apr 1 20:19:40 Tower kernel: sas: lldd_execute_task returned: -132
Apr 1 20:19:40 Tower kernel: ata5: no sense translation for status: 0x50
Apr 1 20:19:40 Tower kernel: ata5: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Apr 1 20:19:40 Tower kernel: ata5: status=0x50 { DriveReady SeekComplete }
Apr 1 20:19:40 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Apr 1 20:19:40 Tower kernel: sas: lldd_execute_task returned: -132
Apr 1 20:19:40 Tower kernel: ata5: no sense translation for status: 0x50
Apr 1 20:19:40 Tower kernel: ata5: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Apr 1 20:19:40 Tower kernel: ata5: status=0x50 { DriveReady SeekComplete }
Apr 1 20:19:40 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Apr 1 20:19:40 Tower kernel: sas: lldd_execute_task returned: -132
Apr 1 20:19:40 Tower kernel: ata5: no sense translation for status: 0x50
Apr 1 20:19:40 Tower kernel: ata5: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Apr 1 20:19:40 Tower kernel: ata5: status=0x50 { DriveReady SeekComplete }
Apr 1 20:19:40 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Apr 1 20:19:40 Tower kernel: sas: lldd_execute_task returned: -132
Apr 1 20:19:40 Tower kernel: ata5: no sense translation for status: 0x50
Apr 1 20:19:40 Tower kernel: ata5: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Apr 1 20:19:40 Tower kernel: ata5: status=0x50 { DriveReady SeekComplete }
Apr 1 20:19:40 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Apr 1 20:19:40 Tower kernel: sas: lldd_execute_task returned: -132
Apr 1 20:19:40 Tower kernel: ata5: no sense translation for status: 0x50
Apr 1 20:19:40 Tower kernel: ata5: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Apr 1 20:19:40 Tower kernel: ata5: status=0x50 { DriveReady SeekComplete }
Apr 1 20:19:40 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Apr 1 20:19:40 Tower kernel: sas: lldd_execute_task returned: -132
Apr 1 20:19:40 Tower kernel: ata5: no sense translation for status: 0x50
Apr 1 20:19:40 Tower kernel: ata5: translated ATA stat/err 0x50/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Apr 1 20:19:40 Tower kernel: ata5: status=0x50 { DriveReady SeekComplete }
Apr 1 20:19:40 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[
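When a flood like this makes the raw syslog hard to read, counting kernel messages per ATA port shows which devices are involved. A minimal sketch, assuming the syslog has been saved to a file; `count_ata_errors` is a hypothetical helper name, and the sample lines are copied from the excerpt in this thread.

```shell
#!/bin/sh
# Count kernel messages per ATA port in a saved syslog file.
count_ata_errors() {
    grep -o 'kernel: ata[0-9][0-9]*:' "$1" \
        | awk '{ sub(/:$/, "", $2); n[$2]++ } END { for (p in n) print p, n[p] }' \
        | sort
}

# Demo on a few lines taken from the log in this thread.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Apr 1 20:19:30 Tower kernel: ata6: no sense translation for status: 0x50
Apr 1 20:19:30 Tower kernel: ata6: status=0x50 { DriveReady SeekComplete }
Apr 1 20:19:40 Tower kernel: mvsas 0000:02:00.0: mvsas exec failed[-132]!
Apr 1 20:19:40 Tower kernel: ata5: no sense translation for status: 0x50
EOF
count_ata_errors "$sample"
rm -f "$sample"
```

Lines that do not name an ATA port, such as the `mvsas` and `sas:` messages, are ignored by the pattern, so the counts only reflect per-port messages.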
Joe L. Posted April 2, 2011
This thread discusses the exact same bug: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/554398
It seems to be in the mvsas driver. You might have better luck with unRAID 4.7, or even 5.0beta6a, since it has a newer kernel.
Note that 5.0beta6a has specific upgrade instructions in its release notes. Be sure to follow them. Step 1 is probably to STOP the array and copy your entire /boot/config folder to a safe place while the array is stopped. This will allow you to restore it if you need to revert to version 4.6 of unRAID.
If you do try 5.0beta6a, do NOT start the array if any of the disks show as MBR-unknown. Instead, post the output of these commands (where sdX is the appropriate three-letter device name for any disk showing as MBR-unknown):
    sfdisk -lus /dev/sdX
    cat /sys/block/sdX/size
    dd status=noxfer count=1 if=/dev/sdX | od -Ad -t x1
If all disks show as MBR-unaligned you can start the array.
Joe L.
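The steps in the post above can be sketched as a small script. `backup_config`, `dump_first_sector` and `mbr_report` are hypothetical helper names, and the backup destination is a placeholder; the three diagnostic commands are the ones listed in the post.

```shell
#!/bin/sh
# Sketch of the pre-upgrade steps: back up the config folder, then collect
# partition details for any disk reported as MBR-unknown.

# Usage: backup_config <source_dir> <dest_dir>; run with the array stopped.
backup_config() {
    mkdir -p "$2" && cp -r "$1" "$2"/
}

# Hex dump of the first 512-byte sector of a device or file.
dump_first_sector() {
    dd status=noxfer count=1 if="$1" | od -Ad -t x1
}

# Usage: mbr_report sdX (device name without the /dev/ prefix).
mbr_report() {
    sfdisk -lus "/dev/$1"
    cat "/sys/block/$1/size"
    dump_first_sector "/dev/$1"
}

# Example calls, commented out because they need real devices and paths:
# backup_config /boot/config /path/to/safe/place
# mbr_report sdb
```

Keeping the sector dump in its own function means it can be tried on an ordinary file first, before pointing it at a real disk.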
bfeist Posted April 2, 2011
Thanks for the help. I just upgraded to 4.7. A drive red-balled when I cold restarted the computer so I used the "trust array" procedure. A parity check is running now. I've got a telnet session piping out to a text file, so if it happens again I should have a log of the beginning of the failure.
StevenD Posted April 2, 2011
I was also seeing this mvsas error. I've gotten it twice in the last three weeks. I upgraded to 4.7 today. I'll keep my fingers crossed.
bfeist Posted May 4, 2011
Running 4.7, the server has been very stable for the past few weeks, until today. It became unresponsive, but luckily I was tailing the syslog to a log file over telnet. Below is the excerpt from where things began to go wrong up to where the log file cut off; after that I was unable to reconnect to the server until I hard rebooted. Any suggestions on what the problem might be?
May 4 16:50:53 Tower ntpd[1601]: synchronized to 169.229.70.183, stratum 3
May 4 17:04:38 Tower ntpd[1601]: time reset +1.753126 s
May 4 17:06:05 Tower ntpd[1601]: synchronized to 169.229.70.183, stratum 3
May 4 17:21:01 Tower ntpd[1601]: time reset +1.839623 s
May 4 17:21:57 Tower ntpd[1601]: synchronized to 169.229.70.183, stratum 3
May 4 17:38:00 Tower ntpd[1601]: time reset +1.874318 s
May 4 17:38:29 Tower ntpd[1601]: synchronized to 169.229.70.183, stratum 3
May 4 17:53:20 Tower kernel: sas: command 0xc6b640c0, task 0xc4186a00, timed out: BLK_EH_NOT_HANDLED
May 4 17:53:20 Tower kernel: sas: Enter sas_scsi_recover_host
May 4 17:53:20 Tower kernel: sas: trying to find task 0xc4186a00
May 4 17:53:20 Tower kernel: sas: sas_scsi_find_task: aborting task 0xc4186a00
May 4 17:53:20 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1701:mvs_abort_task:rc= 5
May 4 17:53:20 Tower kernel: sas: sas_scsi_find_task: querying task 0xc4186a00
May 4 17:53:20 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1645:mvs_query_task:rc= 5
May 4 17:53:20 Tower kernel: sas: sas_scsi_find_task: task 0xc4186a00 failed to abort
May 4 17:53:20 Tower kernel: sas: task 0xc4186a00 is not at LU: I_T recover
May 4 17:53:20 Tower kernel: sas: I_T nexus reset for dev 0300000000000000
May 4 17:53:20 Tower kernel: sas: I_T 0300000000000000 recovered
May 4 17:53:20 Tower kernel: sas: --- Exit sas_scsi_recover_host
May 4 17:53:51 Tower kernel: sas: command 0xc6b640c0, task 0xc419c780, timed out: BLK_EH_NOT_HANDLED
May 4 17:53:51 Tower kernel: sas: Enter sas_scsi_recover_host
May 4 17:53:51 Tower kernel: sas: trying to find task 0xc419c780
May 4 17:53:51 Tower kernel: sas: sas_scsi_find_task: aborting task 0xc419c780
May 4 17:53:51 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1701:mvs_abort_task:rc= 5
May 4 17:53:51 Tower kernel: sas: sas_scsi_find_task: querying task 0xc419c780
May 4 17:53:51 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1645:mvs_query_task:rc= 5
May 4 17:53:51 Tower kernel: sas: sas_scsi_find_task: task 0xc419c780 failed to abort
May 4 17:53:51 Tower kernel: sas: task 0xc419c780 is not at LU: I_T recover
May 4 17:53:51 Tower kernel: sas: I_T nexus reset for dev 0300000000000000
May 4 17:53:51 Tower kernel: sas: I_T 0300000000000000 recovered
May 4 17:53:51 Tower kernel: sas: --- Exit sas_scsi_recover_host
May 4 17:54:21 Tower kernel: sas: command 0xc6b640c0, task 0xc4186dc0, timed out: BLK_EH_NOT_HANDLED
May 4 17:54:21 Tower kernel: sas: Enter sas_scsi_recover_host
May 4 17:54:21 Tower kernel: sas: trying to find task 0xc4186dc0
May 4 17:54:21 Tower kernel: sas: sas_scsi_find_task: aborting task 0xc4186dc0
May 4 17:54:21 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1701:mvs_abort_task:rc= 5
May 4 17:54:21 Tower kernel: sas: sas_scsi_find_task: querying task 0xc4186dc0
May 4 17:54:21 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1645:mvs_query_task:rc= 5
May 4 17:54:21 Tower kernel: sas: sas_scsi_find_task: task 0xc4186dc0 failed to abort
May 4 17:54:21 Tower kernel: sas: task 0xc4186dc0 is not at LU: I_T recover
May 4 17:54:21 Tower kernel: sas: I_T nexus reset for dev 0300000000000000
May 4 17:54:21 Tower kernel: sas: I_T 0300000000000000 recovered
May 4 17:54:21 Tower kernel: sas: --- Exit sas_scsi_recover_host
May 4 17:54:25 Tower ntpd[1601]: time reset +1.820599 s
May 4 17:54:54 Tower kernel: sas: command 0xc6b640c0, task 0xc4186dc0, timed out: BLK_EH_NOT_HANDLED
May 4 17:54:54 Tower kernel: sas: Enter sas_scsi_recover_host
May 4 17:54:54 Tower kernel: sas: trying to find task 0xc4186dc0
May 4 17:54:54 Tower kernel: sas: sas_scsi_find_task: aborting task 0xc4186dc0
May 4 17:54:54 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1701:mvs_abort_task:rc= 5
May 4 17:54:54 Tower kernel: sas: sas_scsi_find_task: querying task 0xc4186dc0
May 4 17:54:54 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1645:mvs_query_task:rc= 5
May 4 17:54:54 Tower kernel: sas: sas_scsi_find_task: task 0xc4186dc0 failed to abort
May 4 17:54:54 Tower kernel: sas: task 0xc4186dc0 is not at LU: I_T recover
May 4 17:54:54 Tower kernel: sas: I_T nexus reset for dev 0300000000000000
May 4 17:54:54 Tower kernel: sas: I_T 0300000000000000 recovered
May 4 17:54:54 Tower kernel: sas: --- Exit sas_scsi_recover_host
May 4 17:54:58 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1841:port has not device.
May 4 17:54:58 Tower kernel: sas: sas_ata_task_done: SAS error 8a
May 4 17:54:58 Tower kernel: ata4: translated ATA stat/err 0x01/04 to SCSI SK/ASC/ASCQ 0xb/00/00
May 4 17:54:58 Tower kernel: ata4: status=0x01 { Error }
May 4 17:54:58 Tower kernel: ata4: error=0x04 { DriveStatusError }
May 4 17:55:28 Tower kernel: sas: command 0xc6b640c0, task 0xc419c500, timed out: BLK_EH_NOT_HANDLED
May 4 17:55:28 Tower kernel: sas: Enter sas_scsi_recover_host
May 4 17:55:28 Tower kernel: sas: trying to find task 0xc419c500
May 4 17:55:28 Tower kernel: sas: sas_scsi_find_task: aborting task 0xc419c500
May 4 17:55:28 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1701:mvs_abort_task:rc= 5
May 4 17:55:28 Tower kernel: sas: sas_scsi_find_task: querying task 0xc419c500
May 4 17:55:28 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1645:mvs_query_task:rc= 5
May 4 17:55:28 Tower kernel: sas: sas_scsi_find_task: task 0xc419c500 failed to abort
May 4 17:55:28 Tower kernel: sas: task 0xc419c500 is not at LU: I_T recover
May 4 17:55:28 Tower kernel: sas: I_T nexus reset for dev 0300000000000000
May 4 17:55:28 Tower kernel: sas: I_T 0300000000000000 recovered
May 4 17:55:28 Tower kernel: sas: --- Exit sas_scsi_recover_host
May 4 17:56:09 Tower kernel: sas: command 0xf41ed0c0, task 0xc4187900, timed out: BLK_EH_NOT_HANDLED
May 4 17:56:09 Tower kernel: sas: Enter sas_scsi_recover_host
May 4 17:56:09 Tower kernel: sas: trying to find task 0xc4187900
May 4 17:56:09 Tower kernel: sas: sas_scsi_find_task: aborting task 0xc4187900
May 4 17:56:09 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1701:mvs_abort_task:rc= 5
May 4 17:56:09 Tower kernel: sas: sas_scsi_find_task: querying task 0xc4187900
May 4 17:56:09 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1645:mvs_query_task:rc= 5
May 4 17:56:09 Tower kernel: sas: sas_scsi_find_task: task 0xc4187900 failed to abort
May 4 17:56:09 Tower kernel: sas: task 0xc4187900 is not at LU: I_T recover
May 4 17:56:09 Tower kernel: sas: I_T nexus reset for dev 0300000000000000
May 4 17:56:09 Tower kernel: sas: I_T 0300000000000000 recovered
May 4 17:56:09 Tower kernel: sas: --- Exit sas_scsi_recover_host
May 4 17:56:40 Tower kernel: sas: command 0xf41ed0c0, task 0xc4187900, timed out: BLK_EH_NOT_HANDLED
May 4 17:56:40 Tower kernel: sas: Enter sas_scsi_recover_host
May 4 17:56:40 Tower kernel: sas: trying to find task 0xc4187900
May 4 17:56:40 Tower kernel: sas: sas_scsi_find_task: aborting task 0xc4187900
May 4 17:56:40 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1701:mvs_abort_task:rc= 5
May 4 17:56:40 Tower kernel: sas: sas_scsi_find_task: querying task 0xc4187900
May 4 17:56:40 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1645:mvs_query_task:rc= 5
May 4 17:56:40 Tower kernel: sas: sas_scsi_find_task: task 0xc4187900 failed to abort
May 4 17:56:40 Tower kernel: sas: task 0xc4187900 is not at LU: I_T recover
May 4 17:56:40 Tower kernel: sas: I_T nexus reset for dev 0300000000000000
May 4 17:56:40 Tower kernel: sas: I_T 0300000000000000 recovered
May 4 17:56:40 Tower kernel: sas: --- Exit sas_scsi_recover_host
May 4 17:57:11 Tower kernel: sas: command 0xf41ed0c0, task 0xc4187900, timed out: BLK_EH_NOT_HANDLED
May 4 17:57:11 Tower kernel: sas: Enter sas_scsi_recover_host
May 4 17:57:11 Tower kernel: sas: trying to find task 0xc4187900
May 4 17:57:11 Tower kernel: sas: sas_scsi_find_task: aborting task 0xc4187900
May 4 17:57:11 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1701:mvs_abort_task:rc= 5
May 4 17:57:11 Tower kernel: sas: sas_scsi_find_task: querying task 0xc4187900
May 4 17:57:11 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1645:mvs_query_task:rc= 5
May 4 17:57:11 Tower kernel: sas: sas_scsi_find_task: task 0xc4187900 failed to abort
May 4 17:57:11 Tower kernel: sas: task 0xc4187900 is not at LU: I_T recover
May 4 17:57:11 Tower kernel: sas: I_T nexus reset for dev 0300000000000000
May 4 17:57:11 Tower kernel: sas: I_T 0300000000000000 recovered
May 4 17:57:11 Tower kernel: sas: --- Exit sas_scsi_recover_host
May 4 17:57:34 Tower ntpd[1601]: synchronized to 169.229.70.183, stratum 3
May 4 17:57:42 Tower kernel: sas: command 0xf41ed0c0, task 0xc4187900, timed out: BLK_EH_NOT_HANDLED
May 4 17:57:42 Tower kernel: sas: Enter sas_scsi_recover_host
May 4 17:57:42 Tower kernel: sas: trying to find task 0xc4187900
May 4 17:57:42 Tower kernel: sas: sas_scsi_find_task: aborting task 0xc4187900
May 4 17:57:42 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1701:mvs_abort_task:rc= 5
May 4 17:57:42 Tower kernel: sas: sas_scsi_find_task: querying task 0xc4187900
May 4 17:57:42 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1645:mvs_query_task:rc= 5
May 4 17:57:42 Tower kernel: sas: sas_scsi_find_task: task 0xc4187900 failed to abort
May 4 17:57:42 Tower kernel: sas: task 0xc4187900 is not at LU: I_T recover
May 4 17:57:42 Tower kernel: sas: I_T nexus reset for dev 0300000000000000
May 4 17:57:42 Tower kernel: sas: I_T 0300000000000000 recovered
May 4 17:57:42 Tower kernel: sas: --- Exit sas_scsi_recover_host
May 4 17:58:13 Tower kernel: sas: command 0xf41ed0c0, task 0xc4187900, timed out: BLK_EH_NOT_HANDLED
May 4 17:58:13 Tower kernel: sas: Enter sas_scsi_recover_host
May 4 17:58:13 Tower kernel: sas: trying to find task 0xc4187900
May 4 17:58:13 Tower kernel: sas: sas_scsi_find_task: aborting task 0xc4187900
May 4 17:58:13 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1701:mvs_abort_task:rc= 5
May 4 17:58:13 Tower kernel: sas: sas_scsi_find_task: querying task 0xc4187900
May 4 17:58:13 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1645:mvs_query_task:rc= 5
May 4 17:58:13 Tower kernel: sas: sas_scsi_find_task: task 0xc4187900 failed to abort
May 4 17:58:13 Tower kernel: sas: task 0xc4187900 is not at LU: I_T recover
May 4 17:58:13 Tower kernel: sas: I_T nexus reset for dev 0300000000000000
May 4 17:58:13 Tower kernel: sas: I_T 0300000000000000 recovered
May 4 17:58:13 Tower kernel: sas: --- Exit sas_scsi_recover_host
May 4 17:58:44 Tower kernel: sas: command 0xf41ed0c0, task 0xc4187900, timed out: BLK_EH_NOT_HANDLED
May 4 17:58:44 Tower kernel: sas: Enter sas_scsi_recover_host
May 4 17:58:44 Tower kernel: sas: trying to find task 0xc4187900
May 4 17:58:44 Tower kernel: sas: sas_scsi_find_task: aborting task 0xc4187900
May 4 17:58:44 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1701:mvs_abort_task:rc= 5
May 4 17:58:44 Tower kernel: sas: sas_scsi_find_task: querying task 0xc4187900
May 4 17:58:44 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1645:mvs_query_task:rc= 5
May 4 17:58:44 Tower kernel: sas: sas_scsi_find_task: task 0xc4187900 failed to abort
May 4 17:58:44 Tower kernel: sas: task 0xc4187900 is not at LU: I_T recover
May 4 17:58:44 Tower kernel: sas: I_T nexus reset for dev 0300000000000000
May 4 17:58:44 Tower kernel: sas: I_T 0300000000000000 recovered
May 4 17:58:44 Tower kernel: sas: --- Exit sas_scsi_recover_host
May 4 17:59:25 Tower kernel: sas: command 0xf41ede40, task 0xc4187cc0, timed out: BLK_EH_NOT_HANDLED
May 4 17:59:25 Tower kernel: sas: Enter sas_scsi_recover_host
May 4 17:59:25 Tower kernel: sas: trying to find task 0xc4187cc0
May 4 17:59:25
Tower kernel: sas: sas_scsi_find_task: aborting task 0xc4187cc0 May 4 17:59:25 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1701:mvs_abort_task:rc= 5 May 4 17:59:25 Tower kernel: sas: sas_scsi_find_task: querying task 0xc4187cc0 May 4 17:59:25 Tower kernel: /usr/src/sas/trunk/mvsas_tgt/mv_sas.c 1645:mvs_query_task:rc= 5 May 4 17:59:25 Tower kernel: sas: sas_scsi_find_task: task 0xc4187cc0 failed to abort May 4 17:59:25 Tower kernel: sas: task 0xc4187cc0 is not at LU: I_T recover May 4 17:59:25 Tower kernel: sas: I_T nexus reset for dev 0300000000000000 May 4 17:59:25 Tower kernel: sas: I_T 0300000000000000 recovered May 4 17:59:25 Tower kernel: sas: --- Exit sas_scsi_recover_host
dgaschk Posted May 5, 2011 There are many reports of this issue. Search the forum for "BLK_EH_NOT_HANDLED".
bfeist Posted May 5, 2011 Thanks for the help. I took a look and found nothing conclusive. The log doesn't seem to contain any indication of which drive might have been causing the problems.
dgaschk Posted May 5, 2011 Do you have more than one card? One user seemed to make progress by moving the card to another slot. More precisely, they exchanged the locations of two cards: one was a video card and the other an AOC-SASLP-MV8.
bcbgboy13 Posted May 5, 2011 bfeist wrote: "Thanks for the help. I took a look and found nothing conclusive. The log doesn't seem to contain any indication which drive it was that might have been causing problems." Ben, the log points to 0300000000000000, and this is the hard drive attached to channel 03 (00 to 03 are on the first SFF8087, and 04 to 07 are on the other port). Since you have a single controller in your system, this is different from the other cases with dual controllers. Apart from the immature / buggy driver for this card, the possible explanations to me are a power glitch (you do not have a UPS, and you obviously have power problems when your furnace kicks in) or a bit flip in the DRAM, as your motherboard does not support ECC memory and you are not using it. A bit flip would corrupt your OS, since it runs from memory, and you could not recover until you restart the server. Just hope you do not keep the TEMAC/EMFSO backup on this server.
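For anyone who wants to pull the affected device out of a large syslog without reading it by eye, here is a minimal Python sketch. It assumes, following the post above, that the first byte of the "dev" ID in the "I_T nexus reset" lines is the channel number and that channels 00 to 03 sit on the first SFF8087 connector and 04 to 07 on the second. The function names and the SAMPLE text are made up for illustration; the sample lines are copied from the log posted earlier in this thread.

```python
import re
from collections import Counter

# Matches libsas recovery lines such as:
#   "sas: I_T nexus reset for dev 0300000000000000"
RESET_RE = re.compile(r"sas: I_T nexus reset for dev ([0-9a-fA-F]{16})")

# Two entries taken from the syslog excerpt in this thread.
SAMPLE = (
    "May 4 17:54:54 Tower kernel: sas: I_T nexus reset for dev 0300000000000000\n"
    "May 4 17:54:54 Tower kernel: sas: I_T 0300000000000000 recovered\n"
    "May 4 17:55:28 Tower kernel: sas: I_T nexus reset for dev 0300000000000000\n"
)


def count_resets(syslog_text):
    """Count I_T nexus resets per device ID found in the syslog text."""
    return Counter(RESET_RE.findall(syslog_text))


def describe(dev_id):
    """Map a device ID to (channel, connector).

    Assumption (from the forum post, not from driver documentation):
    the leading byte is the channel, 00-03 on the first SFF8087
    connector and 04-07 on the second.
    """
    channel = int(dev_id[:2], 16)
    if 0 <= channel <= 3:
        connector = "first"
    elif 4 <= channel <= 7:
        connector = "second"
    else:
        connector = "unknown"
    return channel, connector


if __name__ == "__main__":
    for dev, resets in count_resets(SAMPLE).most_common():
        channel, connector = describe(dev)
        print(f"dev {dev}: {resets} resets, channel {channel:02d}, {connector} connector")
```

To run it against a real log, read the file contents (for example the copy of the syslog kept on the ramdisk) and pass the text to count_resets instead of SAMPLE. If one device ID dominates the count, that drive, its cable, or its port is the first thing to check.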