Treytor Posted November 29, 2011 Posted November 29, 2011
Running unraid 5.0 beta 13 and a drive came up with some errors today. I replaced it and started a rebuild, but it seems to be "stuck". The web UI is responsive, but very slow. There is very little HDD activity, and one of my drives has spun down. Clicking on the log I see this
Nov 28 22:30:41 Cooper kernel: ata22: sas eh calling libata port error handler
Nov 28 22:30:41 Cooper kernel: sas: --- Exit sas_scsi_recover_host
Nov 28 22:30:41 Cooper kernel: drivers/scsi/mvsas/mv_sas.c 1904:port 1 slot 0 rx_desc 30000 has error info8000000080000000.
Nov 28 22:30:41 Cooper kernel: mdcmd (67): spindown 17
Nov 28 22:30:43 Cooper kernel: drivers/scsi/mvsas/mv_sas.c 1904:port 1 slot 0 rx_desc 30000 has error info0000000080000002.
Nov 28 22:31:20 Cooper last message repeated 4 times
Nov 28 22:32:25 Cooper last message repeated 7 times
Nov 28 22:33:29 Cooper last message repeated 7 times
Nov 28 22:34:24 Cooper last message repeated 6 times
Nov 28 22:34:24 Cooper kernel: mdcmd (68): spindown 18
Nov 28 22:34:34 Cooper kernel: drivers/scsi/mvsas/mv_sas.c 1904:port 1 slot 0 rx_desc 30000 has error info0000000080000002.
Nov 28 22:34:52 Cooper last message repeated 2 times
Nov 28 22:34:53 Cooper emhttp: Spinning up all drives...
Nov 28 22:35:01 Cooper kernel: drivers/scsi/mvsas/mv_sas.c 1904:port 1 slot 0 rx_desc 30000 has error info0000000080000002.
Nov 28 22:35:38 Cooper last message repeated 4 times
Nov 28 22:36:43 Cooper last message repeated 7 times
Nov 28 22:37:47 Cooper last message repeated 7 times
Nov 28 22:38:52 Cooper last message repeated 7 times
I've tried beta 14 as well, same problem. Any ideas? Thanks!
Treytor Posted November 29, 2011 Author Posted November 29, 2011 Here is my full system log. system-log.zip
Treytor Posted November 30, 2011 Author Posted November 30, 2011 Doing a mem test, and everything is checking out okay so far.
sdumas Posted November 30, 2011 Posted November 30, 2011 I am starting to see a pattern here... It seems to be a recurring occurrence. You're the third person that has a similar problem. Did you have issues before this happened? I had several disk related issues. But if I believe the logs - for me - it would be my fourth disk out of ten that failed... highly improbable... I have a parity check right now that is running so slow it should take over two months to complete - I guess I'll have to live without parity until someone figures this out. I have enjoyed my unRAID for a few years now without issues and then suddenly - 4 disks within days of each other that failed - ... I don't think so.
prostuff1 Posted November 30, 2011 Posted November 30, 2011 "I have enjoyed my unRAID for a few years now without issues and then suddenly - 4 disks within days of each other that failed - ... I don't think so." You would be surprised... when it rains it pours!! Were the drives by the same manufacturer? bought at the same time? how old were they?
Treytor Posted November 30, 2011 Author Posted November 30, 2011 I didn't have issues before this. Memcheck checked out fine, and I've tried 3 different drives (all should be good). It starts out okay rebuilding, then gets stuck at random points. I'll post a new syslog when (if) I can get it to come up.
sdumas Posted November 30, 2011 Posted November 30, 2011 "Were the drives by the same manufacturer? bought at the same time? how old were they?" Nope - different drives, bought at different times - some Seagates - some WD - black, blue and greens. Changed a few over the years - so four in a row sounds highly improbable. See the new thread I started: http://lime-technology.com/forum/index.php?topic=16968.0
Treytor Posted November 30, 2011 Author Posted November 30, 2011 The WEB UI is unresponsive so I can't get a log through that. Interestingly enough the shares are responding however...
dgaschk Posted November 30, 2011 Posted November 30, 2011 Telnet to the server and copy the log to a visible share.
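For anyone following along, the copy step can be sketched as below. This is only a sketch under assumptions: it presumes a Python interpreter is available on the server (a stock unRAID install may not have one, in which case a plain `cp` from the telnet session does the same job), that the live log is at /var/log/syslog, and that the destination is a share directory you can see over the network. The function name `copy_syslog` and the output file name `syslog.txt` are made up for illustration.

```python
import shutil
from pathlib import Path


def copy_syslog(src="/var/log/syslog", dest_dir="/mnt/user/your_share"):
    """Copy the system log into a directory that is exported as a share.

    Both default paths are assumptions; replace dest_dir with a share
    that actually exists on your server.
    """
    dest = Path(dest_dir) / "syslog.txt"
    shutil.copyfile(src, dest)  # copies file contents only, not permissions
    return dest
```

Calling `copy_syslog(dest_dir=...)` with a real share path should leave a `syslog.txt` there that can be zipped and attached to a post.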
Treytor Posted November 30, 2011 Author Posted November 30, 2011 Got it. Looks like the same errors as before. syslog.zip
dgaschk Posted November 30, 2011 Posted November 30, 2011
Nov 29 18:02:04 Cooper kernel: drivers/scsi/mvsas/mv_sas.c 2108:phy 1 ctrl sts=0x00199800.
Nov 29 18:02:04 Cooper kernel: drivers/scsi/mvsas/mv_sas.c 2110:phy 1 irq sts = 0x01000000
This is a known bug. Please read the Announcements thread.
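To check whether a saved syslog is full of these mvsas driver messages, something like the following rough sketch can count them. It is not an official tool: the function name `count_mvsas` and the matching pattern are assumptions based only on the log lines quoted in this thread, and it treats each "last message repeated N times" line as N more copies of the message before it.

```python
import re

# Matches kernel lines emitted from the mvsas driver source file,
# e.g. "drivers/scsi/mvsas/mv_sas.c 1904:port 1 slot 0 ..."
MVSAS = re.compile(r"drivers/scsi/mvsas/mv_sas\.c \d+:")
# Matches syslog's compression of identical consecutive messages.
REPEAT = re.compile(r"last message repeated (\d+) times")


def count_mvsas(lines):
    """Count mvsas messages, expanding 'last message repeated N times'."""
    total = 0
    last_was_mvsas = False
    for line in lines:
        repeat = REPEAT.search(line)
        if repeat:
            # Only count repeats that follow an mvsas message.
            if last_was_mvsas:
                total += int(repeat.group(1))
            continue
        last_was_mvsas = bool(MVSAS.search(line))
        if last_was_mvsas:
            total += 1
    return total
```

Pass it an open file, for example `count_mvsas(open("syslog.txt"))`. A count that keeps climbing during a rebuild would be consistent with the driver bug described above rather than with the drives themselves, though it does not prove it.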
Treytor Posted December 1, 2011 Author Posted December 1, 2011 I couldn't find that specifically while searching, but I'm going to assume you are referring to the "nasty LSI Controller issues" mentioned in the beta 14 thread. Trying again with beta 12a. Thanks!
dgaschk Posted December 1, 2011 Posted December 1, 2011 No. It's this one: http://lime-technology.com/forum/index.php?topic=16125.msg152449;topicseen#msg152449
Treytor Posted December 1, 2011 Author Posted December 1, 2011 Okay, thank you. I'm not getting that error any more with Beta 12b, but it seems both replacement drives are (red ball) erroring out. I'm going to RMA them both, and try again once I get a replacement. Damn drive prices have gone up a lot since last time I bought them.
Archived
This topic is now archived and is closed to further replies.