capino Posted March 22, 2020
I see a lot of kernel errors in the system log. I have 8 disks attached to a Dell PERC H200 with LSI firmware. My system log is attached: optimus-diagnostics-20200322-1807.zip
JorgeB Posted March 23, 2020
Looks like they are spin-down related. Disable spin down for a few hours and see if they go away. Some users with LSI HBAs get a few errors during spin up or spin down; I'm not sure why, but they look harmless.
capino (Author) Posted March 24, 2020
Disabling the spin down made the errors go away. Unfortunately, spin down is one of the nice features of Unraid and I would like to keep it enabled. Are there any other things to do to get rid of these error messages?
JorgeB Posted March 24, 2020
9 minutes ago, capino said: Are there any other things to do to get rid of these error messages?
Sorry, I don't know. You're already using the latest LSI firmware. It's strange that this only happens to a few users; it's probably related to some hardware combination.
capino (Author) Posted March 24, 2020
I will swap the LSI card sometime this week. I have a second controller that's not in use at the moment. Maybe I'll also swap the cables. Let's see if that fixes the problem.
capino (Author) Posted March 27, 2020
I replaced the controller this morning and re-enabled the spin down. Unfortunately, I immediately received the same errors. I'm now looking to replace the cables.
henris Posted May 25, 2020
Yesterday I noticed similar error messages in my syslog adjacent to drive spindowns. Looking at the syslog as a whole, the affected drive varied. The issue was not consistent: not all spindowns resulted in error messages, and the same drive sometimes caused error messages and sometimes did not.

Since I was quite certain I did not have these earlier, I looked at my server changelog, and the only change I had made recently was the addition of the HDDTemp docker to store drive temps to Telegraf/InfluxDB/Grafana. I stopped the docker and tested spinning down drives manually: no errors anymore. I also waited through the night and saw no errors either.

It appears that at least one potential source for this is HDDTemp. One could speculate that similar apps/dockers performing SMART requests could cause issues. I haven't performed any additional tests, like manually running SMART on a recently spun-down disk, but I have "never" seen this with stock unRAID. Since HDDTemp is already not ideal for me due to non-persistent device naming, I will simply stop using it.

Syslog snippet:
device sdn -> Disk 12
May 24 16:05:31 TMS-740 kernel: mdcmd (127): spindown 11
May 24 16:08:54 TMS-740 kernel: mdcmd (128): spindown 10
May 24 16:43:04 TMS-740 kernel: mdcmd (129): spindown 13
May 24 17:00:50 TMS-740 kernel: mdcmd (130): spindown 12
May 24 17:01:12 TMS-740 kernel: sd 1:0:12:0: attempting task abort! scmd(00000000c75493da)
May 24 17:01:12 TMS-740 kernel: sd 1:0:12:0: [sdn] tag#518 CDB: opcode=0x85 85 06 20 00 d8 00 00 00 00 00 4f 00 c2 00 b0 00
May 24 17:01:12 TMS-740 kernel: scsi target1:0:12: handle(0x001d), sas_address(0x443322110e000000), phy(14)
May 24 17:01:12 TMS-740 kernel: scsi target1:0:12: enclosure logical id(0x500062b200794840), slot(13)
May 24 17:01:12 TMS-740 kernel: sd 1:0:12:0: task abort: SUCCESS scmd(00000000c75493da)
May 24 17:02:42 TMS-740 kernel: sd 1:0:12:0: attempting task abort! scmd(00000000acc00b30)
May 24 17:02:42 TMS-740 kernel: sd 1:0:12:0: [sdn] tag#519 CDB: opcode=0x85 85 06 20 00 d8 00 00 00 00 00 4f 00 c2 00 b0 00
May 24 17:02:42 TMS-740 kernel: scsi target1:0:12: handle(0x001d), sas_address(0x443322110e000000), phy(14)
May 24 17:02:42 TMS-740 kernel: scsi target1:0:12: enclosure logical id(0x500062b200794840), slot(13)
May 24 17:02:42 TMS-740 kernel: sd 1:0:12:0: task abort: SUCCESS scmd(00000000acc00b30)
May 24 17:04:12 TMS-740 kernel: sd 1:0:12:0: attempting task abort! scmd(00000000b88d3273)
May 24 17:04:12 TMS-740 kernel: sd 1:0:12:0: [sdn] tag#521 CDB: opcode=0x85 85 06 20 00 d8 00 00 00 00 00 4f 00 c2 00 b0 00
May 24 17:04:12 TMS-740 kernel: scsi target1:0:12: handle(0x001d), sas_address(0x443322110e000000), phy(14)
May 24 17:04:12 TMS-740 kernel: scsi target1:0:12: enclosure logical id(0x500062b200794840), slot(13)
May 24 17:04:12 TMS-740 kernel: sd 1:0:12:0: task abort: SUCCESS scmd(00000000b88d3273)
May 24 17:05:42 TMS-740 kernel: sd 1:0:12:0: attempting task abort! scmd(00000000c75493da)
May 24 17:05:42 TMS-740 kernel: sd 1:0:12:0: [sdn] tag#518 CDB: opcode=0x85 85 06 20 00 d8 00 00 00 00 00 4f 00 c2 00 b0 00
May 24 17:05:42 TMS-740 kernel: scsi target1:0:12: handle(0x001d), sas_address(0x443322110e000000), phy(14)
May 24 17:05:42 TMS-740 kernel: scsi target1:0:12: enclosure logical id(0x500062b200794840), slot(13)
May 24 17:05:42 TMS-740 kernel: sd 1:0:12:0: task abort: SUCCESS scmd(00000000c75493da)
May 24 17:07:12 TMS-740 kernel: sd 1:0:12:0: attempting task abort! scmd(00000000c92f385e)
May 24 17:07:12 TMS-740 kernel: sd 1:0:12:0: [sdn] tag#523 CDB: opcode=0x85 85 06 20 00 d8 00 00 00 00 00 4f 00 c2 00 b0 00
May 24 17:07:12 TMS-740 kernel: scsi target1:0:12: handle(0x001d), sas_address(0x443322110e000000), phy(14)
May 24 17:07:12 TMS-740 kernel: scsi target1:0:12: enclosure logical id(0x500062b200794840), slot(13)
May 24 17:07:12 TMS-740 kernel: sd 1:0:12:0: task abort: SUCCESS scmd(00000000c92f385e)
May 24 17:08:42 TMS-740 kernel: sd 1:0:12:0: attempting task abort! scmd(00000000c75493da)
May 24 17:08:42 TMS-740 kernel: sd 1:0:12:0: [sdn] tag#518 CDB: opcode=0x85 85 06 20 00 d8 00 00 00 00 00 4f 00 c2 00 b0 00
May 24 17:08:42 TMS-740 kernel: scsi target1:0:12: handle(0x001d), sas_address(0x443322110e000000), phy(14)
May 24 17:08:42 TMS-740 kernel: scsi target1:0:12: enclosure logical id(0x500062b200794840), slot(13)
May 24 17:08:42 TMS-740 kernel: sd 1:0:12:0: task abort: SUCCESS scmd(00000000c75493da)
May 24 17:10:12 TMS-740 kernel: sd 1:0:12:0: attempting task abort! scmd(00000000acc00b30)
May 24 17:10:12 TMS-740 kernel: sd 1:0:12:0: [sdn] tag#519 CDB: opcode=0x85 85 06 20 00 d8 00 00 00 00 00 4f 00 c2 00 b0 00
May 24 17:10:12 TMS-740 kernel: scsi target1:0:12: handle(0x001d), sas_address(0x443322110e000000), phy(14)
May 24 17:10:12 TMS-740 kernel: scsi target1:0:12: enclosure logical id(0x500062b200794840), slot(13)
May 24 17:10:12 TMS-740 kernel: sd 1:0:12:0: task abort: SUCCESS scmd(00000000acc00b30)
May 24 17:11:42 TMS-740 kernel: sd 1:0:12:0: attempting task abort! scmd(00000000acc00b30)
May 24 17:11:42 TMS-740 kernel: sd 1:0:12:0: [sdn] tag#519 CDB: opcode=0x85 85 06 20 00 d8 00 00 00 00 00 4f 00 c2 00 b0 00
May 24 17:11:42 TMS-740 kernel: scsi target1:0:12: handle(0x001d), sas_address(0x443322110e000000), phy(14)
May 24 17:11:42 TMS-740 kernel: scsi target1:0:12: enclosure logical id(0x500062b200794840), slot(13)
May 24 17:11:42 TMS-740 kernel: sd 1:0:12:0: task abort: SUCCESS scmd(00000000acc00b30)
May 24 17:13:12 TMS-740 kernel: sd 1:0:12:0: attempting task abort! scmd(00000000c92f385e)
May 24 17:13:12 TMS-740 kernel: sd 1:0:12:0: [sdn] tag#523 CDB: opcode=0x85 85 06 20 00 d8 00 00 00 00 00 4f 00 c2 00 b0 00
May 24 17:13:12 TMS-740 kernel: scsi target1:0:12: handle(0x001d), sas_address(0x443322110e000000), phy(14)
May 24 17:13:12 TMS-740 kernel: scsi target1:0:12: enclosure logical id(0x500062b200794840), slot(13)
May 24 17:13:12 TMS-740 kernel: sd 1:0:12:0: task abort: SUCCESS scmd(00000000c92f385e)
May 24 17:14:42 TMS-740 kernel: sd 1:0:12:0: attempting task abort! scmd(00000000c75493da)
May 24 17:14:42 TMS-740 kernel: sd 1:0:12:0: [sdn] tag#518 CDB: opcode=0x85 85 06 20 00 d8 00 00 00 00 00 4f 00 c2 00 b0 00
May 24 17:14:42 TMS-740 kernel: scsi target1:0:12: handle(0x001d), sas_address(0x443322110e000000), phy(14)
May 24 17:14:42 TMS-740 kernel: scsi target1:0:12: enclosure logical id(0x500062b200794840), slot(13)
May 24 17:14:42 TMS-740 kernel: sd 1:0:12:0: task abort: SUCCESS scmd(00000000c75493da)
May 24 17:15:02 TMS-740 kernel: sd 1:0:12:0: Power-on or device reset occurred
May 24 17:15:02 TMS-740 rc.diskinfo[10781]: SIGHUP received, forcing refresh of disks info.
May 24 17:16:12 TMS-740 kernel: sd 1:0:12:0: attempting task abort! scmd(00000000acc00b30)
May 24 17:16:12 TMS-740 kernel: sd 1:0:12:0: [sdn] tag#519 CDB: opcode=0x85 85 06 20 00 d8 00 00 00 00 00 4f 00 c2 00 b0 00
May 24 17:16:12 TMS-740 kernel: scsi target1:0:12: handle(0x001d), sas_address(0x443322110e000000), phy(14)
May 24 17:16:12 TMS-740 kernel: scsi target1:0:12: enclosure logical id(0x500062b200794840), slot(13)
May 24 17:16:12 TMS-740 kernel: sd 1:0:12:0: task abort: SUCCESS scmd(00000000acc00b30)
May 24 17:16:59 TMS-740 kernel: sd 1:0:12:0: Power-on or device reset occurred
May 24 17:16:59 TMS-740 rc.diskinfo[10781]: SIGHUP received, forcing refresh of disks info.
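Editor's note: the CDB that the kernel prints for every aborted command in henris's snippet can be decoded by hand, and doing so supports the SMART hypothesis. Opcode 0x85 is the SCSI ATA PASS-THROUGH (16) command; in this CDB the wrapped ATA command byte is 0xB0 (SMART) with feature 0xD8 (SMART ENABLE OPERATIONS) and the usual 0x4F/0xC2 SMART signature in the LBA mid/high bytes. The Python sketch below is only an illustration of that decoding, not a tool used in the thread; the function name and the small feature table are the editor's own choices, and it covers only the fields needed here.

```python
# Minimal sketch: decode the fields of a 16-byte ATA PASS-THROUGH (16) CDB
# as printed by the kernel ("CDB: opcode=0x85 85 06 20 ...").
# Hypothetical helper; only a handful of SMART subcommands are named.

SMART_FEATURES = {
    0xD0: "SMART READ DATA",
    0xD8: "SMART ENABLE OPERATIONS",
    0xD9: "SMART DISABLE OPERATIONS",
    0xDA: "SMART RETURN STATUS",
}


def decode_ata_passthrough16(cdb_hex: str) -> dict:
    """Return the interesting fields of an ATA PASS-THROUGH (16) CDB."""
    cdb = bytes.fromhex(cdb_hex)  # fromhex ignores the spaces between bytes
    if len(cdb) != 16 or cdb[0] != 0x85:
        raise ValueError("not a 16-byte ATA PASS-THROUGH (16) CDB")

    command = cdb[14]   # ATA command register
    feature = cdb[4]    # low byte of the features field
    lba_mid = cdb[10]   # low byte of LBA mid
    lba_high = cdb[12]  # low byte of LBA high

    # A SMART command is ATA command 0xB0 with the 0x4F/0xC2 signature.
    is_smart = command == 0xB0 and lba_mid == 0x4F and lba_high == 0xC2
    if is_smart:
        description = SMART_FEATURES.get(feature, "unknown SMART subcommand")
    else:
        description = "not a SMART command"

    return {
        "protocol": (cdb[1] >> 1) & 0x0F,  # 3 means non-data
        "ck_cond": bool(cdb[2] & 0x20),    # caller asked for ATA status back
        "feature": feature,
        "command": command,
        "description": description,
    }


if __name__ == "__main__":
    # CDB copied from the syslog snippet in the post above.
    print(decode_ata_passthrough16(
        "85 06 20 00 d8 00 00 00 00 00 4f 00 c2 00 b0 00"))
```

So the command that times out and gets aborted is a SMART request sent through the HBA, which fits a temperature poller such as HDDTemp talking to a drive that has just been spun down. It does not by itself prove which process issued the request.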
capino (Author) Posted June 5, 2020
Thank you for replying. Shutting down the HDDTemp docker also fixed the problem for me.
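Editor's note: one way to check a fix like this from the syslog alone is to look at how regularly the aborts recur. In henris's snippet the "attempting task abort!" lines for sd 1:0:12:0 arrive exactly 90 seconds apart, which is what a periodic poller would produce. The sketch below is a hedged illustration of that check in Python; the function name, the regular expression, and the `year` parameter (syslog timestamps carry no year) are assumptions made for the example, and it expects English month abbreviations.

```python
# Minimal sketch: measure the gaps between "attempting task abort!" events
# per SCSI address in a classic syslog. Hypothetical helper, not part of Unraid.
import re
from collections import defaultdict
from datetime import datetime

ABORT_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) \S+ kernel: "
    r"sd (?P<addr>[\d:]+): attempting task abort!"
)


def abort_intervals(lines, year=2020):
    """Map each SCSI address to the seconds between consecutive aborts."""
    times = defaultdict(list)
    for line in lines:
        match = ABORT_RE.match(line)
        if match:
            stamp = datetime.strptime(
                f"{year} {match['ts']}", "%Y %b %d %H:%M:%S")
            times[match["addr"]].append(stamp)
    return {
        addr: [int((b - a).total_seconds()) for a, b in zip(ts, ts[1:])]
        for addr, ts in times.items()
    }


# Three abort lines taken from the syslog snippet earlier in the thread,
# plus one unrelated line that the parser should ignore.
SAMPLE = [
    "May 24 17:00:50 TMS-740 kernel: mdcmd (130): spindown 12",
    "May 24 17:01:12 TMS-740 kernel: sd 1:0:12:0: attempting task abort! scmd(00000000c75493da)",
    "May 24 17:02:42 TMS-740 kernel: sd 1:0:12:0: attempting task abort! scmd(00000000acc00b30)",
    "May 24 17:04:12 TMS-740 kernel: sd 1:0:12:0: attempting task abort! scmd(00000000b88d3273)",
]

if __name__ == "__main__":
    print(abort_intervals(SAMPLE))
```

A steady interval points at something polling the drive on a timer; an empty result after stopping the suspected container is consistent with that container having been the source.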
keshavdaboss Posted August 9, 2020
Are you able to get spindown working on an LSI SAS card?
On 5/24/2020 at 11:02 PM, henris said: I yesterday noticed that I had similar error messages in my syslog adjacent to drive spindowns. [...]
ArveVM Posted August 9, 2023
On 5/25/2020 at 8:02 AM, henris said: I looked on the server changelog and the only change I had made recently
Wow, @henris, is there a server changelog in unRaid? Or are you actually tracking changes like a pro in an external change system?
henris Posted September 22, 2023
On 8/9/2023 at 10:50 PM, ArveVM said: Wow, @henris, is there a server changelog in unRaid? Or are you actually tracking changes like a pro in an external change system?
No changelog in UnRaid to my knowledge. I'm keeping my changelog in OneNote: a few lines for each "trivial change" and a separate subpage for larger changes or complex troubleshooting. I also keep some tailored pages or tables for things like disks and more complex dockers. It is just so much easier to have a compressed, logical description of the changes than to try to reverse-engineer it from logs, if that is even possible. In my work I use things like Jira for change management, but I don't like it for personal use (it feels too much like work).

99% of all problems come from changes. I can document my own changes and I can try to control other changes with scheduled updates. I have docker updates running on Friday/Saturday night so I have the whole weekend to fix things.

To emphasize the point, I just ran into the 1% and have a failed cache pool drive and a potentially corrupted cache pool. That is why I'm in this forum right now, to make a new troubleshooting thread. The last time I had to do troubleshooting was 11.4.2022. I've been running UnRaid since 2009. I just love it; I can let it run for months and months without any manual intervention. Sometimes things just break.