mrvnsk9

Members
  • Posts

    21

mrvnsk9's Achievements

Noob (1/14)

0

Reputation

1

Community Answers

  1. Edit: Reverting my cache pool, where my appdata is stored, from ZFS back to BTRFS fixed the issue. I rolled back to 6.12.1 after having some issues with Docker containers after updating to 6.12.2. I am getting a kernel panic when attempting to start containers using "docker compose up -d". Everything was running fine prior to updating to 6.12.2. I also tried disabling the Nvidia runtime in the docker compose file to see if I had an issue there, but it didn't make a difference. Let me know if more information is needed. Here is a screenshot of the server console, and the diagnostics are attached. dragon-diagnostics-20230630-2315.zip
  2. I'm considering dumping the Mellanox card in favor of Intel as well. I suppose I'll hold out a while for the fix. I do appreciate all the hard work the devs do as well. What Intel card are you considering? I've been looking at the X520, but I'm not sure that's the best option.
  3. Thanks for the information. I guess I missed that in the release notes and my forum searches. I'll run the on-board NIC until this is resolved.
  4. Connecting the on-board NIC restores connectivity. Looks like there's a known issue with Mellanox NICs at the moment. I'll leave the on-board NIC connected until that issue is resolved. Thanks for the help.
  5. After upgrading to 6.10.2, I am no longer getting an IP address. I am using a Mellanox MCX311A-XCAT ConnectX-3 @ eth1 on an AMD platform. IOMMU is also disabled. I didn't have any issues with this NIC prior to upgrading. The diagnostics are attached. Any help would be greatly appreciated. - Brian dragon-diagnostics-20220606-1740.zip
  6. @johnnie.black Changing the BIOS to the correct setting seems to have done the trick. I'm also going to swap out the controller card for one that's actually supported by Unraid. I'll consider this solved. Thanks for your help!
  7. I have a StarTech controller with a Marvell 88SE9230 chipset in the system, and I have to disable IOMMU or it drops drives. I'm going to remove that controller from the array and see if that improves things (I only have one drive attached to it anyway). I should probably replace it with an LSI 9300-8i or something similar.
  8. @johnnie.black After making the changes to the BIOS, the server stayed up for about 15 hours. I was using the unbalance plugin to move some files to disk6 and received the following errors before it locked up.

      Jan 6 23:33:48 Dragon kernel: ata1.00: exception Emask 0x0 SAct 0x3 SErr 0x0 action 0x6 frozen
      Jan 6 23:33:48 Dragon kernel: ata1.00: failed command: WRITE FPDMA QUEUED
      Jan 6 23:33:48 Dragon kernel: ata1.00: cmd 61/70:00:08:9d:e0/01:00:be:00:00/40 tag 0 ncq dma 188416 out
      Jan 6 23:33:48 Dragon kernel: res 40/00:ff:ff:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
      Jan 6 23:33:48 Dragon kernel: ata1.00: status: { DRDY }
      Jan 6 23:33:48 Dragon kernel: ata1.00: failed command: WRITE FPDMA QUEUED
      Jan 6 23:33:48 Dragon kernel: ata1.00: cmd 61/40:08:90:7d:e0/05:00:be:00:00/40 tag 1 ncq dma 688128 out
      Jan 6 23:33:48 Dragon kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
      Jan 6 23:33:48 Dragon kernel: ata1.00: status: { DRDY }
      Jan 6 23:33:48 Dragon kernel: ata1: hard resetting link
      Jan 6 23:33:48 Dragon kernel: ata3.00: exception Emask 0x0 SAct 0x780 SErr 0x0 action 0x6 frozen
      Jan 6 23:33:48 Dragon kernel: ata3.00: failed command: READ FPDMA QUEUED
      Jan 6 23:33:48 Dragon kernel: ata3.00: cmd 60/40:38:78:9e:e0/05:00:be:00:00/40 tag 7 ncq dma 688128 in
      Jan 6 23:33:48 Dragon kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
      Jan 6 23:33:48 Dragon kernel: ata3.00: status: { DRDY }
      Jan 6 23:33:48 Dragon kernel: ata3.00: failed command: READ FPDMA QUEUED
      Jan 6 23:33:48 Dragon kernel: ata3.00: cmd 60/40:40:b8:a3:e0/05:00:be:00:00/40 tag 8 ncq dma 688128 in
      Jan 6 23:33:48 Dragon kernel: res 40/00:ff:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
      Jan 6 23:33:48 Dragon kernel: ata3.00: status: { DRDY }
      Jan 6 23:33:48 Dragon kernel: ata3.00: failed command: READ FPDMA QUEUED
      Jan 6 23:33:48 Dragon kernel: ata3.00: cmd 60/40:48:f8:a8:e0/05:00:be:00:00/40 tag 9 ncq dma 688128 in
      Jan 6 23:33:48 Dragon kernel: res 40/00:ff:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
      Jan 6 23:33:48 Dragon kernel: ata3.00: status: { DRDY }
      Jan 6 23:33:48 Dragon kernel: ata3.00: failed command: READ FPDMA QUEUED
      Jan 6 23:33:48 Dragon kernel: ata3.00: cmd 60/78:50:38:ae:e0/01:00:be:00:00/40 tag 10 ncq dma 192512 in
      Jan 6 23:33:48 Dragon kernel: res 40/00:ff:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
      Jan 6 23:33:48 Dragon kernel: ata3.00: status: { DRDY }
      Jan 6 23:33:48 Dragon kernel: ata3: hard resetting link
      Jan 6 23:33:48 Dragon kernel: ata4.00: exception Emask 0x0 SAct 0x3c003000 SErr 0x0 action 0x6 frozen
      Jan 6 23:33:48 Dragon kernel: ata4.00: failed command: READ FPDMA QUEUED
      Jan 6 23:33:48 Dragon kernel: ata4.00: cmd 60/40:60:d0:82:e0/05:00:be:00:00/40 tag 12 ncq dma 688128 in
      Jan 6 23:33:48 Dragon kernel: res 40/00:ff:ff:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
      Jan 6 23:33:48 Dragon kernel: ata4.00: status: { DRDY }
      Jan 6 23:33:48 Dragon kernel: ata4.00: failed command: READ FPDMA QUEUED
      Jan 6 23:33:48 Dragon kernel: ata4.00: cmd 60/40:68:10:88:e0/05:00:be:00:00/40 tag 13 ncq dma 688128 in
      Jan 6 23:33:48 Dragon kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
      Jan 6 23:33:48 Dragon kernel: ata4.00: status: { DRDY }
      Jan 6 23:33:48 Dragon kernel: ata4.00: failed command: READ FPDMA QUEUED
      Jan 6 23:33:48 Dragon kernel: ata4.00: cmd 60/40:d0:78:9e:e0/05:00:be:00:00/40 tag 26 ncq dma 688128 in
      Jan 6 23:33:48 Dragon kernel: res 40/00:ff:ff:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
      Jan 6 23:33:48 Dragon kernel: ata4.00: status: { DRDY }
      Jan 6 23:33:48 Dragon kernel: ata4.00: failed command: READ FPDMA QUEUED
      Jan 6 23:33:48 Dragon kernel: ata4.00: cmd 60/40:d8:b8:a3:e0/05:00:be:00:00/40 tag 27 ncq dma 688128 in
      Jan 6 23:33:48 Dragon kernel: res 40/00:ff:ff:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
      Jan 6 23:33:48 Dragon kernel: ata4.00: status: { DRDY }
      Jan 6 23:33:48 Dragon kernel: ata4.00: failed command: READ FPDMA QUEUED
      Jan 6 23:33:48 Dragon kernel: ata4.00: cmd 60/40:e0:f8:a8:e0/05:00:be:00:00/40 tag 28 ncq dma 688128 in
      Jan 6 23:33:48 Dragon kernel: res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
      Jan 6 23:33:48 Dragon kernel: ata4.00: status: { DRDY }
      Jan 6 23:33:48 Dragon kernel: ata4.00: failed command: READ FPDMA QUEUED
      Jan 6 23:33:48 Dragon kernel: ata4.00: cmd 60/78:e8:38:ae:e0/01:00:be:00:00/40 tag 29 ncq dma 192512 in
      Jan 6 23:33:48 Dragon kernel: res 40/00:ff:82:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
      Jan 6 23:33:48 Dragon kernel: ata4.00: status: { DRDY }
      Jan 6 23:33:48 Dragon kernel: ata4: hard resetting link
      Jan 6 23:33:48 Dragon kernel: ata8.00: exception Emask 0x0 SAct 0x3c00 SErr 0x0 action 0x6 frozen
      Jan 6 23:33:48 Dragon kernel: ata8.00: failed command: READ FPDMA QUEUED
      Jan 6 23:33:48 Dragon kernel: ata8.00: cmd 60/40:50:78:9e:e0/05:00:be:00:00/40 tag 10 ncq dma 688128 in
      Jan 6 23:33:48 Dragon kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
      Jan 6 23:33:48 Dragon kernel: ata8.00: status: { DRDY }
      Jan 6 23:33:48 Dragon kernel: ata8.00: failed command: READ FPDMA QUEUED
      Jan 6 23:33:48 Dragon kernel: ata8.00: cmd 60/40:58:b8:a3:e0/05:00:be:00:00/40 tag 11 ncq dma 688128 in
      Jan 6 23:33:48 Dragon kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
      Jan 6 23:33:48 Dragon kernel: ata8.00: status: { DRDY }
      Jan 6 23:33:48 Dragon kernel: ata8.00: failed command: READ FPDMA QUEUED
      Jan 6 23:33:48 Dragon kernel: ata8.00: cmd 60/40:60:f8:a8:e0/05:00:be:00:00/40 tag 12 ncq dma 688128 in
      Jan 6 23:33:48 Dragon kernel: res 40/00:00:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
      Jan 6 23:33:48 Dragon kernel: ata8.00: status: { DRDY }
      Jan 6 23:33:48 Dragon kernel: ata8.00: failed command: READ FPDMA QUEUED
      Jan 6 23:33:48 Dragon kernel: ata8.00: cmd 60/78:68:38:ae:e0/01:00:be:00:00/40 tag 13 ncq dma 192512 in
      Jan 6 23:33:48 Dragon kernel: res 40/00:ff:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
      Jan 6 23:33:48 Dragon kernel: ata8.00: status: { DRDY }
      Jan 6 23:33:48 Dragon kernel: ata8: hard resetting link
      Jan 6 23:33:48 Dragon kernel: ata2.00: exception Emask 0x0 SAct 0x1e080 SErr 0x0 action 0x6 frozen
      Jan 6 23:33:48 Dragon kernel: ata2.00: failed command: READ FPDMA QUEUED
      Jan 6 23:33:48 Dragon kernel: ata2.00: cmd 60/40:38:10:88:e0/05:00:be:00:00/40 tag 7 ncq dma 688128 in
      Jan 6 23:33:48 Dragon kernel: res 40/00:ff:ff:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
      Jan 6 23:33:48 Dragon kernel: ata2.00: status: { DRDY }
      Jan 6 23:33:48 Dragon kernel: ata2.00: failed command: READ FPDMA QUEUED
      Jan 6 23:33:48 Dragon kernel: ata2.00: cmd 60/40:68:78:9e:e0/05:00:be:00:00/40 tag 13 ncq dma 688128 in
      Jan 6 23:33:48 Dragon kernel: res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
      Jan 6 23:33:48 Dragon kernel: ata2.00: status: { DRDY }
      Jan 6 23:33:48 Dragon kernel: ata2.00: failed command: READ FPDMA QUEUED
      Jan 6 23:33:48 Dragon kernel: ata2.00: cmd 60/40:70:b8:a3:e0/05:00:be:00:00/40 tag 14 ncq dma 688128 in
      Jan 6 23:33:48 Dragon kernel: res 40/00:ff:ff:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
      Jan 6 23:33:48 Dragon kernel: ata2.00: status: { DRDY }
      Jan 6 23:33:48 Dragon kernel: ata2.00: failed command: READ FPDMA QUEUED
      Jan 6 23:33:48 Dragon kernel: ata2.00: cmd 60/40:78:f8:a8:e0/05:00:be:00:00/40 tag 15 ncq dma 688128 in
      Jan 6 23:33:48 Dragon kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
      Jan 6 23:33:48 Dragon kernel: ata2.00: status: { DRDY }
      Jan 6 23:33:48 Dragon kernel: ata2.00: failed command: READ FPDMA QUEUED
      Jan 6 23:33:48 Dragon kernel: ata2.00: cmd 60/78:80:38:ae:e0/01:00:be:00:00/40 tag 16 ncq dma 192512 in
      Jan 6 23:33:48 Dragon kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
      Jan 6 23:33:48 Dragon kernel: ata2.00: status: { DRDY }
      Jan 6 23:33:48 Dragon kernel: ata2: hard resetting link

     Would this indicate there is an issue with disk6? I was copying the files from disk5 if that is relevant information. I've attached a new diagnostics taken after rebooting the server. It looks like there was an error on every drive in the array. Did unbalance cause this or do I have bad cables or is there another issue causing it? For reference, these are the PCI devices for the drives.

      /sys/bus/pci/devices/0000:01:00.1/ata1/host1/target1:0:0/1:0:0:0/block/sdb
      /sys/bus/pci/devices/0000:01:00.1/ata2/host2/target2:0:0/2:0:0:0/block/sdc
      /sys/bus/pci/devices/0000:01:00.1/ata3/host3/target3:0:0/3:0:0:0/block/sdd
      /sys/bus/pci/devices/0000:01:00.1/ata4/host4/target4:0:0/4:0:0:0/block/sde
      /sys/bus/pci/devices/0000:01:00.1/ata7/host7/target7:0:0/7:0:0:0/block/sdf
      /sys/bus/pci/devices/0000:01:00.1/ata8/host8/target8:0:0/8:0:0:0/block/sdg
      /sys/bus/pci/devices/0000:09:00.0/ata12/host12/target12:0:0/12:0:0:0/block/sdh

     dragon-diagnostics-20200106-2358.zip
  9. The BIOS is up to date and C-states are already disabled. The "Power Supply Idle Control" was not set to the suggested value, so I changed that. The odd thing is the server has been stable for a year and didn't start having issues until Jan 1. Probably a coincidence, but it's still odd to me. Is the "/usr/local/sbin/zenstates --c6-disable" line still required in the go file or is it no longer needed? Also, I'm using "rcu_nocbs=0-7" in the syslinux configuration.
  10. The log was still mirrored after the restart. If you scroll up to line 213 in the syslog file you should see a timestamp of "Jan 4 17:47:16". This is where the server became unresponsive.
  11. I haven't had issues with the docker image filling up. I guess the dockers hadn't finished starting when I pulled the diagnostics. I haven't done a memtest yet. I'll try that. Should I be concerned about the SMART errors on disks 3 and 6?
  12. Apologies. I forgot my array wasn't started when I pulled the first diagnostics. Of course that was the only time I pulled any diagnostics. It freezes when the array is started. It hasn't frozen with the array stopped. I attached a new diagnostics with the array started. Thanks dragon-diagnostics-20200104-2108.zip
  13. Over the last few days my server has started hanging after being up for a few hours. I looked at the SMART reports in the diagnostics and it looks like there are errors on disks 3 and 6. I'm not sure if this is the problem or if there is another cause for the lockups. I mirrored the syslog to flash before the last freeze happened. Both it and the diagnostics are attached. Any help would be greatly appreciated. Thanks in advance! dragon-diagnostics-20200104-1903.zip syslog
  14. I think I finally have the server stabilized. I ended up doing the following:
        • Replaced the ASUS board with an ASRock board (at this point, I'm pretty sure the ASUS board was bad).
        • Disabled the Global C-State controls in the BIOS.
        • Added the rcu_nocbs=0-7 setting.
      Thanks for the help.
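
The ATA timeout log and the sysfs paths in post 8 lend themselves to a quick per-port summary. What follows is a minimal Python sketch and not part of the original thread: it counts "failed command" entries per ATA port and pairs each port with the block device and PCI address taken from the sysfs paths. The function names (`count_failed_commands`, `map_ports`, `summarize`) are hypothetical, and the sample data is only a small subset of the lines quoted in the post.

```python
import re
from collections import Counter

# A few lines copied from the log in post 8 (a subset, not the full log).
SAMPLE_LOG = """\
Jan 6 23:33:48 Dragon kernel: ata1.00: exception Emask 0x0 SAct 0x3 SErr 0x0 action 0x6 frozen
Jan 6 23:33:48 Dragon kernel: ata1.00: failed command: WRITE FPDMA QUEUED
Jan 6 23:33:48 Dragon kernel: ata1.00: failed command: WRITE FPDMA QUEUED
Jan 6 23:33:48 Dragon kernel: ata1: hard resetting link
Jan 6 23:33:48 Dragon kernel: ata3.00: exception Emask 0x0 SAct 0x780 SErr 0x0 action 0x6 frozen
Jan 6 23:33:48 Dragon kernel: ata3.00: failed command: READ FPDMA QUEUED
Jan 6 23:33:48 Dragon kernel: ata3: hard resetting link
"""

# Three of the sysfs paths listed in the same post.
SYSFS_PATHS = [
    "/sys/bus/pci/devices/0000:01:00.1/ata1/host1/target1:0:0/1:0:0:0/block/sdb",
    "/sys/bus/pci/devices/0000:01:00.1/ata3/host3/target3:0:0/3:0:0:0/block/sdd",
    "/sys/bus/pci/devices/0000:09:00.0/ata12/host12/target12:0:0/12:0:0:0/block/sdh",
]

# Matches e.g. "kernel: ata1.00: failed command: WRITE FPDMA QUEUED".
FAILED_RE = re.compile(r"kernel: (ata\d+)\.\d+: failed command: (\w+)")
# Captures the PCI address, ATA port and block device name from a sysfs path.
PATH_RE = re.compile(r"/devices/([0-9a-f:.]+)/(ata\d+)/.*/block/(\w+)$")


def count_failed_commands(log_text):
    """Return a Counter of failed-command entries keyed by ATA port."""
    counts = Counter()
    for line in log_text.splitlines():
        match = FAILED_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def map_ports(paths):
    """Return {ata_port: (pci_address, block_device)} from sysfs paths."""
    mapping = {}
    for path in paths:
        match = PATH_RE.search(path)
        if match:
            pci_address, port, device = match.groups()
            mapping[port] = (pci_address, device)
    return mapping


def summarize(log_text, paths):
    """List (port, device, pci_address, failures) for ports with failures."""
    ports = map_ports(paths)
    rows = []
    for port, failures in sorted(count_failed_commands(log_text).items()):
        # Ports without a known sysfs path are reported with "?" placeholders.
        pci_address, device = ports.get(port, ("?", "?"))
        rows.append((port, device, pci_address, failures))
    return rows


if __name__ == "__main__":
    for port, device, pci_address, failures in summarize(SAMPLE_LOG, SYSFS_PATHS):
        print(f"{port} ({device}) on {pci_address}: {failures} failed command(s)")
```

To run this against a real syslog, read the file contents in place of `SAMPLE_LOG` and pass the full list of sysfs paths. Ports that share a PCI address sit behind the same controller, so grouping the failures this way may help when weighing a single disk, the cabling, or the controller as the likely cause.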