Sinister Posted January 21, 2022 I am not sure why I am currently getting this error. I have not changed anything recently that I am aware of, and the issue persists even after adding a script to my go file and restarting. tower-diagnostics-20220121-1404.zip
trurl Posted January 21, 2022 Lots of entries in syslog like this:
Jan 21 14:04:53 Tower kernel: sd 9:0:4:0: [sdt] tag#1354 Sense Key : 0x1 [current]
Jan 21 14:04:53 Tower kernel: sd 9:0:4:0: [sdt] tag#1354 ASC=0x0 ASCQ=0x0
Jan 21 14:04:53 Tower kernel: sd 9:0:19:0: [sdai] tag#1355 Sense Key : 0x1 [current]
Jan 21 14:04:53 Tower kernel: sd 9:0:19:0: [sdai] tag#1355 ASC=0x0 ASCQ=0x0
Jan 21 14:04:53 Tower kernel: sd 9:0:5:0: [sdu] tag#1360 Sense Key : 0x1 [current]
Jan 21 14:04:53 Tower kernel: sd 9:0:5:0: [sdu] tag#1360 ASC=0x0 ASCQ=0x0
Jan 21 14:04:53 Tower kernel: sd 9:0:7:0: [sdw] tag#1443 Sense Key : 0x1 [current]
Jan 21 14:04:53 Tower kernel: sd 9:0:7:0: [sdw] tag#1443 ASC=0x0 ASCQ=0x0
I suspect it is related to this controller:
82:00.0 Serial Attached SCSI controller [0107]: Broadcom / LSI SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] [1000:0072] (rev 03)
Syslog isn't really large enough to fill logspace though. What do you get from the command line with this?
du -h /var/log
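To see which disks are generating the Sense Key lines quoted above, something like the following could be used. This is only a rough sketch: the function name `sense_key_counts` is made up for illustration, and it assumes the kernel lines keep the `[sdX] ... Sense Key` shape shown in the excerpt.

```shell
#!/bin/sh
# Count "Sense Key" kernel lines per block device in a syslog file.
# Hypothetical helper; assumes lines look like the excerpt, e.g.
#   ... kernel: sd 9:0:4:0: [sdt] tag#1354 Sense Key : 0x1 [current]
# Usage: sense_key_counts /var/log/syslog
sense_key_counts() {
    grep 'Sense Key' "$1" |
        # Keep only the device name inside the brackets (sdt, sdai, ...).
        sed -n 's/.*\[\(sd[a-z]*\)\].*/\1/p' |
        sort | uniq -c | sort -rn |
        # Normalize to "count device", most frequent first.
        awk '{print $1, $2}'
}
```

If many different disks show up with similar counts, that points more toward something they share (controller, cabling, power) than toward a single bad disk.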
Sinister Posted January 21, 2022 This is what I am getting.
trurl Posted January 21, 2022 OK, what do you get with this?
ls -lah /var/log
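As a complement to `du -h` and `ls -lah`, a small helper can list the biggest files under the log directory directly. This is a sketch rather than anything Unraid-specific: `largest_files` is a hypothetical name, and it relies only on standard `find`, `du`, `sort` and `head`.

```shell
#!/bin/sh
# List the N largest regular files under a directory, largest first.
# Sizes are in KiB as reported by du -k. Hypothetical helper name.
# Usage: largest_files /var/log 5
largest_files() {
    dir=$1
    n=${2:-5}
    find "$dir" -type f -exec du -k {} + | sort -rn | head -n "$n"
}
```

Running `df -h /var/log` alongside it shows how full the log filesystem is overall. As far as I know, /var/log on Unraid is a small RAM-backed filesystem, which is why one runaway syslog can fill it.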
Sinister Posted January 21, 2022 11 minutes ago, trurl said: What do you get from command line with this? du -h /var/log
Sinister Posted January 21, 2022 1 minute ago, trurl said: OK, what do you get with this? ls -lah /var/log
trurl Posted January 21, 2022 So syslog really is that large, but not all of it is included in diagnostics. It is probably just full of lines similar to those I already posted, which points to a likely controller issue.
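One way to check whether the full syslog really is dominated by the same few messages is to group lines by their text while ignoring the timestamp and the changing tag numbers. The sketch below assumes the traditional `Mon DD HH:MM:SS host` prefix seen in the excerpt; `top_messages` is an illustrative name, not an existing tool.

```shell
#!/bin/sh
# Show the N most frequent syslog messages, ignoring the timestamp/host
# prefix and masking tag numbers so repeated errors collapse together.
# Hypothetical helper. Usage: top_messages /var/log/syslog 10
top_messages() {
    file=$1
    n=${2:-10}
    awk '{
        $1 = $2 = $3 = $4 = ""           # blank out month, day, time, host
        sub(/^ +/, "")                   # trim the leading spaces left behind
        gsub(/tag#[0-9]+/, "tag#N")      # mask per-command tag numbers
        print
    }' "$file" | sort | uniq -c | sort -rn | head -n "$n"
}
```

If the top few entries account for nearly all lines, the log growth is coming from that one repeating error rather than from normal activity.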
Sinister Posted January 21, 2022 (edited) 5 minutes ago, trurl said: So syslog really is that large, but all of that isn't included in diagnostics. Probably it is just full of lines similar to those I already posted, a likely controller issue. The HBA controller? I don't understand. I just switched from a NetApp controller, which was disabling disks in the array every few hours, to an LSI one where I have no issues. What should I do?
trurl Posted January 21, 2022 Solution Reseat controller(s). Check connections, both ends, SAS/SATA and power, including splitters.
Sinister Posted January 23, 2022 On 1/21/2022 at 3:56 PM, trurl said: Reseat controller(s). Check connections, both ends, SAS/SATA and power, including splitters. Powering off the server, removing the card entirely, then reinstalling it seems to have done the trick. Thank you.
Sinister Posted January 31, 2022 I guess I spoke too soon, because now the logs show a different error that is filling up /var/log. It says SMBus busy or something to that effect. I think I read in the forums that unloading drivers in the Dynamix temp plugin resolves this? I'm not sure. tower-diagnostics-20220131-0633.zip
trurl Posted January 31, 2022 Looks like you're still having problems with connections to multiple disks. Are you sure the controller has adequate cooling?
Sinister Posted January 31, 2022 2 hours ago, trurl said: Looks like you're still having problems with connections to multiple disks. Are you sure the controller has adequate cooling? I think it does; the problem even stopped for a while when I switched controllers. I did, however, have to re-add one disk to the array twice, while all the others are fine. Could one disk cause this?
trurl Posted January 31, 2022 Any power splitters involved?
Sinister Posted February 7, 2022 On 1/31/2022 at 5:55 PM, trurl said: Any power splitters involved? I went to the Dynamix temp plugin and clicked unload driver, and I have not had this issue since, which is why I waited a while to post back.