Everything posted by michaelmcq

  1. After a month or so of this not happening, I've had it 3 times this week, so I'm back to investigating 😞 Any suggestions for the best way to identify the cause? I don't really want to replace parts that are working, and I suspect one of: backplane, motherboard, PSU.
  2. Could it be PCIe lanes? I don't understand them well enough, but I have 2 HBAs running (14 drives) plus 4 SSDs; could that cause this problem?
  3. Thanks. I think there might be a correlation between this happening and me hammering the SSDs at the same time. They're not on the HBAs but on the motherboard (Z490-A-PRO), so I was thinking it's either power draw or something motherboard-related when its onboard drives are working hard. Off to research power consumption!
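One way to look into the lane question is to compare the link width each PCIe device has actually negotiated with the width it supports. The sketch below reads the standard Linux sysfs attributes `current_link_width` and `max_link_width`; the function name and the optional directory argument are illustrative assumptions, not anything from the original post. A card running narrower than it supports is not a fault in itself and would not prove that lanes cause the dropouts, but it is a cheap thing to rule out.

```shell
#!/usr/bin/env bash
# List PCIe devices whose negotiated link width is below the width they support.
# The optional first argument (an assumption for illustration) lets the function
# be pointed at a different directory; by default it reads the real sysfs tree.
pcie_downgraded_links() {
    local root="${1:-/sys/bus/pci/devices}"
    local dev cur max
    for dev in "$root"/*; do
        # Only PCIe devices expose both attributes; skip everything else.
        [ -r "$dev/current_link_width" ] && [ -r "$dev/max_link_width" ] || continue
        cur=$(cat "$dev/current_link_width")
        max=$(cat "$dev/max_link_width")
        # Ignore empty or non-numeric values.
        case "$cur" in ''|*[!0-9]*) continue ;; esac
        case "$max" in ''|*[!0-9]*) continue ;; esac
        if [ "$cur" -lt "$max" ]; then
            printf '%s x%s (supports x%s)\n' "$(basename "$dev")" "$cur" "$max"
        fi
    done
}
```

Running `pcie_downgraded_links` with no arguments prints one line per device that has trained at a narrower width; the PCI addresses can then be matched to the HBAs with `lspci`.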
  4. Thank you. I took the server down, reseated the card and thought all was sorted, but it's gone again tonight, and this time it was both cards and all drives. Could they both fail at pretty much the same time? That seems unlikely. I don't really know what else to look for in the logs; there was nothing else in the log immediately before both cards failed. Again, they're all back after a reboot.

     Jul 15 19:05:44 Tower root: Total Spundown: 1
     Jul 15 19:05:44 Tower root: Entering Turbo Mode
     Jul 15 19:05:44 Tower kernel: mdcmd (160): set md_write_method 1
     Jul 15 19:05:44 Tower kernel:
     Jul 15 19:10:44 Tower root: Total Spundown: 1
     ### [PREVIOUS LINE REPEATED 4 TIMES] ###
     Jul 15 19:30:46 Tower emhttpd: spinning down /dev/sds
     Jul 15 19:31:06 Tower emhttpd: spinning down /dev/sdl
     Jul 15 19:31:18 Tower emhttpd: spinning down /dev/sdj
     Jul 15 19:31:29 Tower emhttpd: spinning down /dev/sdo
     Jul 15 19:31:39 Tower emhttpd: spinning down /dev/sdp
     Jul 15 19:31:43 Tower emhttpd: spinning down /dev/sdk
     Jul 15 19:31:57 Tower emhttpd: spinning down /dev/sdh
     Jul 15 19:31:57 Tower emhttpd: spinning down /dev/sdi
     Jul 15 19:35:44 Tower root: Total Spundown: 9
     Jul 15 19:35:44 Tower root: Entering Normal Mode
     Jul 15 19:35:44 Tower kernel: mdcmd (161): set md_write_method 0
     Jul 15 19:35:44 Tower kernel:
     Jul 15 19:40:44 Tower root: Total Spundown: 9
     ### [PREVIOUS LINE REPEATED 8 TIMES] ###
     Jul 15 20:23:48 Tower emhttpd: read SMART /dev/sdo
     Jul 15 20:25:44 Tower root: Total Spundown: 8
     ### [PREVIOUS LINE REPEATED 2 TIMES] ###
     Jul 15 20:36:11 Tower kernel: mpt2sas_cm0: SAS host is non-operational !!!!
     Jul 15 20:36:11 Tower kernel: mpt2sas_cm1: SAS host is non-operational !!!!
     Jul 15 20:36:12 Tower kernel: mpt2sas_cm0: SAS host is non-operational !!!!
     Jul 15 20:36:12 Tower kernel: mpt2sas_cm1: SAS host is non-operational !!!!
     Jul 15 20:36:13 Tower kernel: mpt2sas_cm0: SAS host is non-operational !!!!
     Jul 15 20:36:13 Tower kernel: mpt2sas_cm1: SAS host is non-operational !!!!
     Jul 15 20:36:14 Tower kernel: mpt2sas_cm0: SAS host is non-operational !!!!
     Jul 15 20:36:14 Tower kernel: mpt2sas_cm1: SAS host is non-operational !!!!
     Jul 15 20:36:15 Tower kernel: mpt2sas_cm0: SAS host is non-operational !!!!
     Jul 15 20:36:15 Tower kernel: mpt2sas_cm1: SAS host is non-operational !!!!
     Jul 15 20:36:16 Tower kernel: mpt2sas_cm0: SAS host is non-operational !!!!
     Jul 15 20:36:16 Tower kernel: mpt2sas_cm0: _base_fault_reset_work: Running mpt3sas_dead_ioc thread success !!!!
     Jul 15 20:36:16 Tower kernel: sd 8:0:0:0: [sdf] Synchronizing SCSI cache
     Jul 15 20:36:16 Tower kernel: sd 8:0:0:0: [sdf] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=0x00
     Jul 15 20:36:16 Tower kernel: sd 8:0:4:0: [sdj] tag#803 UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=0x00 cmd_age=5s
     Jul 15 20:36:16 Tower kernel: sd 8:0:4:0: [sdj] tag#803 CDB: opcode=0x85 85 06 20 00 00 00 00 00 00 00 00 00 00 40 e5 00
     Jul 15 20:36:16 Tower kernel: sd 8:0:4:0: [sdj] tag#804 UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=0x00 cmd_age=0s
     Jul 15 20:36:16 Tower kernel: sd 8:0:4:0: [sdj] tag#804 CDB: opcode=0x85 85 06 20 00 00 00 00 00 00 00 00 00 00 40 98 00
     Jul 15 20:36:16 Tower kernel: sd 8:0:5:0: [sdk] tag#805 UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=0x00 cmd_age=0s
     Jul 15 20:36:16 Tower kernel: sd 8:0:5:0: [sdk] tag#805 CDB: opcode=0x85 85 06 20 00 00 00 00 00 00 00 00 00 00 40 e5 00
     Jul 15 20:36:16 Tower kernel: sd 8:0:5:0: [sdk] tag#806 UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=0x00 cmd_age=0s
     Jul 15 20:36:16 Tower kernel: sd 8:0:5:0: [sdk] tag#806 CDB: opcode=0x85 85 06 20 00 00 00 00 00 00 00 00 00 00 40 98 00
     Jul 15 20:36:16 Tower kernel: sd 8:0:2:0: [sdh] tag#807 UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=0x00 cmd_age=0s
     Jul 15 20:36:16 Tower kernel: sd 8:0:2:0: [sdh] tag#807 CDB: opcode=0x85 85 06 20 00 00 00 00 00 00 00 00 00 00 40 e5 00
     Jul 15 20:36:16 Tower kernel: sd 8:0:2:0: [sdh] tag#808 UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=0x00 cmd_age=0s
     Jul 15 20:36:16 Tower kernel: sd 8:0:2:0: [sdh] tag#808 CDB: opcode=0x85 85 06 20 00 00 00 00 00 00 00 00 00 00 40 98 00

     tower-diagnostics-20210715-2051.zip
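When scanning a long syslog for this kind of failure, it can help to reduce it to one line per controller: when the first "SAS host is non-operational" message appeared and how many followed. The sketch below is a minimal example, assuming the syslog line layout shown in the excerpt above; the function name and the log path in the usage note are hypothetical. If both controllers first fault within the same second, as here, that might point towards something they share rather than two independent card failures, though the summary alone cannot confirm that.

```shell
#!/usr/bin/env bash
# Summarise "SAS host is non-operational" messages per controller.
# Assumes the layout "Mon DD HH:MM:SS host kernel: mptXsas_cmN: message".
sas_fault_summary() {
    local logfile="$1"
    grep -E 'mpt[23]sas_cm[0-9]+: SAS host is non-operational' "$logfile" |
        awk '{
            ctrl = $6; sub(/:$/, "", ctrl)   # controller name, e.g. mpt2sas_cm0
            stamp = $1 " " $2 " " $3         # timestamp of this message
            if (!(ctrl in first)) first[ctrl] = stamp
            count[ctrl]++
        }
        END {
            for (c in count)
                printf "%s first=%s count=%d\n", c, first[c], count[c]
        }' |
        sort
}
```

For example, `sas_fault_summary /var/log/syslog` (path assumed) prints one line per controller, sorted by name.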
  5. Over the last couple of days I've started seeing drive errors, and I don't always notice straight away. Sometimes it's 4 drives, but other times all drives. Rebooting the server brings everything back as it should be.

     Jul 5 12:23:02 Tower kernel: mpt2sas_cm1: SAS host is non-operational !!!!
     Jul 5 12:23:03 Tower kernel: mpt2sas_cm1: SAS host is non-operational !!!!
     Jul 5 12:23:04 Tower kernel: mpt2sas_cm1: SAS host is non-operational !!!!
     Jul 5 12:23:05 Tower kernel: mpt2sas_cm1: SAS host is non-operational !!!!
     Jul 5 12:23:06 Tower kernel: mpt2sas_cm1: SAS host is non-operational !!!!
     Jul 5 12:23:07 Tower kernel: mpt2sas_cm1: SAS host is non-operational !!!!
     Jul 5 12:23:07 Tower kernel: mpt2sas_cm1: _base_fault_reset_work: Running mpt3sas_dead_ioc thread success !!!!
     Jul 5 12:23:07 Tower kernel: sd 9:0:0:0: [sdn] Synchronizing SCSI cache
     Jul 5 12:23:07 Tower kernel: sd 9:0:0:0: [sdn] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=0x00
     Jul 5 12:23:07 Tower kernel: sd 9:0:1:0: [sdo] Synchronizing SCSI cache
     Jul 5 12:23:07 Tower kernel: sd 9:0:1:0: [sdo] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=0x00
     Jul 5 12:23:07 Tower kernel: sd 9:0:4:0: [sdr] tag#731 UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=0x00 cmd_age=4s
     Jul 5 12:23:07 Tower kernel: sd 9:0:4:0: [sdr] tag#731 CDB: opcode=0x85 85 06 20 00 00 00 00 00 00 00 00 00 00 40 e5 00
     Jul 5 12:23:07 Tower kernel: sd 9:0:4:0: [sdr] tag#732 UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=0x00 cmd_age=0s
     Jul 5 12:23:07 Tower kernel: sd 9:0:4:0: [sdr] tag#732 CDB: opcode=0x85 85 06 20 00 00 00 00 00 00 00 00 00 00 40 98 00
     Jul 5 12:23:07 Tower kernel: sd 9:0:5:0: [sds] tag#733 UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=0x00 cmd_age=0s
     Jul 5 12:23:07 Tower kernel: sd 9:0:5:0: [sds] tag#733 CDB: opcode=0x85 85 06 20 00 00 00 00 00 00 00 00 00 00 40 e5 00
     Jul 5 12:23:07 Tower kernel: sd 9:0:5:0: [sds] tag#734 UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=0x00 cmd_age=0s
     Jul 5 12:23:07 Tower kernel: sd 9:0:5:0: [sds] tag#734 CDB: opcode=0x85 85 06 20 00 00 00 00 00 00 00 00 00 00 40 98 00
     Jul 5 12:23:07 Tower kernel: sd 9:0:3:0: [sdq] tag#735 UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=0x00 cmd_age=0s
     Jul 5 12:23:07 Tower kernel: sd 9:0:3:0: [sdq] tag#735 CDB: opcode=0x85 85 06 20 00 00 00 00 00 00 00 00 00 00 40 e5 00
     Jul 5 12:23:07 Tower kernel: sd 9:0:3:0: [sdq] tag#736 UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=0x00 cmd_age=0s
     Jul 5 12:23:07 Tower kernel: sd 9:0:3:0: [sdq] tag#736 CDB: opcode=0x85 85 06 20 00 00 00 00 00 00 00 00 00 00 40 98 00
     Jul 5 12:23:07 Tower kernel: sd 9:0:2:0: [sdp] Synchronizing SCSI cache
     Jul 5 12:23:07 Tower kernel: sd 9:0:2:0: [sdp] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=0x00
     Jul 5 12:23:07 Tower emhttpd: read SMART /dev/sdr
     Jul 5 12:23:07 Tower emhttpd: read SMART /dev/sds
     Jul 5 12:23:07 Tower emhttpd: read SMART /dev/sdq
     Jul 5 12:23:07 Tower kernel: sd 9:0:3:0: [sdq] Synchronizing SCSI cache
     Jul 5 12:23:07 Tower kernel: sd 9:0:3:0: [sdq] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=0x00
     Jul 5 12:23:07 Tower kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
     Jul 5 12:23:07 Tower kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
     Jul 5 12:23:07 Tower kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
     Jul 5 12:23:07 Tower kernel: sd 9:0:4:0: [sdr] Synchronizing SCSI cache
     Jul 5 12:23:07 Tower kernel: sd 9:0:4:0: [sdr] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=0x00
     Jul 5 12:23:07 Tower unassigned.devices: Warning: Can't get rotational setting of '/dev/sdq'.
     Jul 5 12:23:07 Tower unassigned.devices: Warning: Can't get rotational setting of '/dev/sdq'.
     Jul 5 12:23:07 Tower unassigned.devices: Warning: Can't get rotational setting of '/dev/sdr'.
     Jul 5 12:23:07 Tower unassigned.devices: Warning: Can't get rotational setting of '/dev/sdr'.
     Jul 5 12:23:07 Tower kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
     Jul 5 12:23:07 Tower kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
     Jul 5 12:23:07 Tower kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
     Jul 5 12:23:07 Tower kernel: sd 9:0:5:0: [sds] Synchronizing SCSI cache
     Jul 5 12:23:07 Tower kernel: sd 9:0:5:0: [sds] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=0x00
     Jul 5 12:23:07 Tower kernel: mpt2sas_cm1: mpt3sas_transport_port_remove: removed: sas_addr(0x4433221100000000)
     Jul 5 12:23:07 Tower kernel: mpt2sas_cm1: removing handle(0x0009), sas_addr(0x4433221100000000)
     Jul 5 12:23:07 Tower kernel: mpt2sas_cm1: enclosure logical id(0x5b8ca3a0f0160c00), slot(3)
     Jul 5 12:23:07 Tower kernel: mpt2sas_cm1: mpt3sas_transport_port_remove: removed: sas_addr(0x4433221101000000)
     Jul 5 12:23:07 Tower kernel: mpt2sas_cm1: removing handle(0x000a), sas_addr(0x4433221101000000)
     Jul 5 12:23:07 Tower kernel: mpt2sas_cm1: enclosure logical id(0x5b8ca3a0f0160c00), slot(2)
     Jul 5 12:23:07 Tower kernel: mpt2sas_cm1: mpt3sas_transport_port_remove: removed: sas_addr(0x4433221102000000)
     Jul 5 12:23:07 Tower kernel: mpt2sas_cm1: removing handle(0x000b), sas_addr(0x4433221102000000)
     Jul 5 12:23:07 Tower kernel: mpt2sas_cm1: enclosure logical id(0x5b8ca3a0f0160c00), slot(1)
     Jul 5 12:23:07 Tower kernel: mpt2sas_cm1: mpt3sas_transport_port_remove: removed: sas_addr(0x4433221104000000)
     Jul 5 12:23:07 Tower kernel: mpt2sas_cm1: removing handle(0x000c), sas_addr(0x4433221104000000)
     Jul 5 12:23:07 Tower kernel: mpt2sas_cm1: enclosure logical id(0x5b8ca3a0f0160c00), slot(7)
     Jul 5 12:23:07 Tower kernel: mpt2sas_cm1: mpt3sas_transport_port_remove: removed: sas_addr(0x4433221105000000)
     Jul 5 12:23:07 Tower kernel: mpt2sas_cm1: removing handle(0x000d), sas_addr(0x4433221105000000)
     Jul 5 12:23:07 Tower kernel: mpt2sas_cm1: enclosure logical id(0x5b8ca3a0f0160c00), slot(6)
     Jul 5 12:23:07 Tower kernel: mpt2sas_cm1: mpt3sas_transport_port_remove: removed: sas_addr(0x4433221103000000)
     Jul 5 12:23:07 Tower kernel: mpt2sas_cm1: removing handle(0x000e), sas_addr(0x4433221103000000)
     Jul 5 12:23:07 Tower kernel: mpt2sas_cm1: enclosure logical id(0x5b8ca3a0f0160c00), slot(0)
     Jul 5 12:23:07 Tower kernel: mpt2sas_cm1: unexpected doorbell active!
     Jul 5 12:23:07 Tower kernel: mpt2sas_cm1: sending diag reset !!
     Jul 5 12:23:08 Tower kernel: mpt2sas_cm1: Invalid host diagnostic register value
     Jul 5 12:23:08 Tower kernel: mpt2sas_cm1: System Register set:
     Jul 5 12:23:08 Tower kernel: 00000000: ffffffff
     Jul 5 12:23:08 Tower kernel: 00000004: ffffffff
     Jul 5 12:23:08 Tower kernel: 00000008: ffffffff
     Jul 5 12:23:08 Tower kernel: 0000000c: ffffffff
     Jul 5 12:23:08 Tower kernel: 00000010: ffffffff
     Jul 5 12:23:08 Tower kernel: 00000014: ffffffff
     Jul 5 12:23:08 Tower kernel: 00000018: ffffffff
     Jul 5 12:23:08 Tower kernel: 0000001c: ffffffff
     <REPEATED>
     Jul 5 12:23:08 Tower kernel: 000000f8: ffffffff
     Jul 5 12:23:08 Tower kernel: 000000fc: ffffffff
     Jul 5 12:23:08 Tower kernel: mpt2sas_cm1: diag reset: FAILED
     Jul 5 12:26:40 Tower root: Total Spundown: 8
     Jul 5 12:31:41 Tower root: Total Spundown: 8
     Jul 5 12:33:00 Tower kernel: md: disk5 read error, sector=9102416
     Jul 5 12:33:00 Tower kernel: md: disk2 read error, sector=9102416
     Jul 5 12:33:00 Tower kernel: md: disk4 read error, sector=9102416
     Jul 5 12:33:00 Tower kernel: md: disk6 read error, sector=9102416
     Jul 5 12:33:10 Tower emhttpd: read SMART /dev/sdj
     Jul 5 12:33:10 Tower emhttpd: read SMART /dev/sdk
     Jul 5 12:33:10 Tower emhttpd: read SMART /dev/sdg
     Jul 5 12:33:10 Tower kernel: XFS (md5): metadata I/O error in "xfs_da_read_buf+0x9e/0xfe [xfs]" at daddr 0x8ae450 len 8 error 5
     Jul 5 12:33:10 Tower kernel: XFS (md5): metadata I/O error in "xfs_da_read_buf+0x9e/0xfe [xfs]" at daddr 0x8ae450 len 8 error 5
     Jul 5 12:33:10 Tower emhttpd: read SMART /dev/sdf
     Jul 5 12:33:10 Tower emhttpd: read SMART /dev/sdl
     Jul 5 12:33:10 Tower emhttpd: read SMART /dev/sdi

     I have 2 SAS controller cards. I'd initially thought one of them might be failing, but when all drives went I thought it must be motherboard-related. Nothing has changed on the machine recently and it's been running quite nicely for a while. I'm a bit stuck as to where to look next. Diagnostics attached: tower-diagnostics-20210705-1244.zip
  6. I don't actively use this any more. I guess you'd need to update the main script, or you may be able to update the options file to set defaults, either via the CLI with --prefs-add or by editing the file directly. It'll be in your config folder, and you just want the line: subtitles 1
  7. The maximum resolution available to get_iplayer is 720p; the BBC don't make anything higher available through the web version of iPlayer, only on smart TVs etc.
  8. Just an update from me: with my troublesome docker turned off, everything is working fine. It's not Pi-Hole for me but rather this: https://github.com/chrisns/docker-node-sonos-http-api. Perhaps there are similarities, although not obviously to me! I've not had a chance to update to RC2 yet.
  9. I can't stop them all. One of them, a node docker for the Sonos API that I added, wouldn't stop; the others all have. I've subsequently tried to restart the docker service with /etc/rc.d/rc.docker restart, but that's not working. It looks like I might be doing a forced reboot in a bit and then disabling that particular docker.
  10. Is there a docker command to stop them all? I can't do that at the minute, but will try over the weekend at some point. As an aside, I ran "docker stats" from SSH and it doesn't return anything; I have to Ctrl-C to get a prompt back. "docker ps" does return successfully.
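Editing the options file by hand amounts to making sure it contains exactly one line for the preference. The sketch below shows one way to do that idempotently; the function name is hypothetical, and the file location in the usage note is an assumption, since the post only says the file lives in the config folder.

```shell
#!/usr/bin/env bash
# Ensure an options file contains exactly one "key value" line for a preference.
# Usage: ensure_pref <options-file> <key> <value>
ensure_pref() {
    local file="$1" key="$2" value="$3" tmp
    touch "$file"
    tmp=$(mktemp)
    # Keep every line whose first field is not this key, dropping any old setting.
    awk -v k="$key" '$1 != k' "$file" > "$tmp"
    # Append the setting in the "key value" form the post describes.
    printf '%s %s\n' "$key" "$value" >> "$tmp"
    # Write back through the existing file so its ownership and mode are kept.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

For example, `ensure_pref /path/to/config/options subtitles 1` (path assumed) leaves the file with a single `subtitles 1` line however many times it is run. The CLI route mentioned in the post would presumably be something like `get_iplayer --prefs-add --subtitles`, but check the get_iplayer documentation for your version.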
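There is no single built-in "stop everything" subcommand, but the usual approach is to feed the IDs from `docker ps -q` to `docker stop`. The sketch below wraps that in a function; the function name and the 30-second default grace period are illustrative choices. If the Docker daemon itself is unresponsive, as the hanging "docker stats" might suggest, this could hang in the same way.

```shell
#!/usr/bin/env bash
# Stop every running container, allowing each a grace period before it is killed.
# Usage: stop_all_containers [grace-seconds]   (defaults to 30)
stop_all_containers() {
    local ids
    ids=$(docker ps -q)   # IDs of running containers, one per line
    if [ -z "$ids" ]; then
        echo "no running containers"
        return 0
    fi
    # Word splitting is intentional here: each ID becomes its own argument.
    # shellcheck disable=SC2086
    docker stop -t "${1:-30}" $ids
}
```

Calling `stop_all_containers 10` asks Docker to stop all running containers, waiting up to 10 seconds for each before killing it.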
  11. Not that it's any better than holding down the power button, but if you can still get an SSH session, you can do the following to force a reboot:

      echo 1 > /proc/sys/kernel/sysrq
      echo b > /proc/sysrq-trigger

      https://major.io/2009/01/29/linux-emergency-reboot-or-shutdown-with-magic-commands/
  12. It stopped and started without issue but didn't seem to make any difference to the docker tab unfortunately
  13. I've posted these over on the main release thread, but here are my diagnostics for the same issue: https://lime-technology.com/applications/core/interface/file/attachment.php?id=39269 I wouldn't know where to start with sifting through them, but I notice we all have the same error in our docker logs, specifically: level=error msg="stream copy error: reading from a closed fifo". I don't know what a log looks like without this issue, so that could be a red herring. I couldn't see anything at the same time in my syslog.
  14. Here's my diag file too with the same issue tower-diagnostics-20180319-1556.zip
  15. I do; that doesn't run either, it just sits there scanning. Interesting to see Yippy3000 had to force a reboot, as I did too. The GUI was stuck on stopping services.
  16. No offence taken :-). I rebooted, and the docker and "App Store" tabs worked after the reboot, but they're now back to not working. It flows through to the Dashboard too, as I don't see my dockers or VMs there (the VM tab does load). I wonder if it's to do with checking for updates (although my Plugins page does work and shows updates).
  17. I've upgraded and everything appears to be working fine, but I can't access any information about my dockers. I know they're running, but when I visit the docker page I just get the spinner and eventually see this in Chrome developer tools: "http://192.168.1.114:8008/plugins/dynamix.docker.manager/include/DockerUpdate.php 504 (Gateway Time-out)". I've tried different browsers with the same result, but have not restarted the server yet (I have some maintenance to do in a few days so I'm waiting for that, especially as the dockers are working). The Plugins page works and various plugins have had updates.
  18. I went for the easy option, HDHomeRun, and it's all up and running. I figured that if I had any issues I could play with adding it to Plex anyway.
  19. So, the master plan is to buy a second TBS card for DVB-T2; I'm assuming that'll work fine with my existing TBS S2 card? The only complication is a lack of PCIe slots, so I'm going to use a PCI -> PCIe adapter, fingers crossed. Has anyone got any experience with these adapters, or is there a USB or PCI (not Express) T2 card that'll work alongside my existing card?
  20. I struggled too, and found out I needed to create a file in the config folder called superuser with these contents:

      { "username": "USERNAME", "password": "PASSWORD" }
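The file described in that post can also be written from a shell. The sketch below is a minimal example; the function name is hypothetical, the config directory is whatever folder your container maps as its config, and no JSON escaping is done, so it assumes the username and password contain no double quotes or backslashes.

```shell
#!/usr/bin/env bash
# Write a "superuser" file containing the username/password JSON object.
# Usage: write_superuser <config-dir> <username> <password>
# Assumption: neither value needs JSON escaping (no double quotes or backslashes).
write_superuser() {
    local config_dir="$1" user="$2" pass="$3"
    mkdir -p "$config_dir"
    printf '{\n  "username": "%s",\n  "password": "%s"\n}\n' "$user" "$pass" \
        > "$config_dir/superuser"
}
```

For example, `write_superuser /path/to/config USERNAME PASSWORD` (path assumed) creates the file; restarting the application afterwards may be needed for it to be picked up.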
  21. "You could try the TBS OpenSource build on RC6 as I've added more firmware in that may help as piotrasd needed a similar setup." Dumb question: I presume I need to move the main Unraid to RC6 too, rather than just change it in the DVB settings?
  22. First things first, thank you very much for setting this up; it's been a godsend moving from my standalone tvheadend on Ubuntu to my Unraid server. I have 2 DVB devices, a TBS DVB-S2 PCI card and a USB PCTV DVB-T2 stick. I remember it being an absolute pain getting the 2 working on Ubuntu, and I can't remember how I did it. I'm currently using the open-source TBS build in 6.2.4, and the PCI card is working but not the PCTV. Is there a driver set that might give me both? Failing that, I guess a combined T2 & S2 card might be my next solution.
  23. "It might be cause you only have one adapter and its scanning for EPG and then you have no free adapters for recording." I've disabled the Over the Air EPG options and restarted, but it's not made any difference, unfortunately. I've fixed it: it was a user setting. My user interface level was set to default; I changed it to Expert and I can now see the DVR sections.