JorgeB

Moderators
  • Posts

    67,797
  • Days Won

    708

Everything posted by JorgeB

  1. Feel free to send them, but it looks like the exact same issue. We believe this is related: a lot of Dell users had to add iommu=pt to be able to boot v6.10.x, and we also believe that doing so prevents the DMAR errors. I suggest you update to v6.10.3-rc1, which now comes with iommu=pt by default and should fix this DMAR/corruption issue for all affected platforms.
  2. The "stale configuration" is the problem: the browser is not showing the actual status. Make sure all browsers are closed, though I believe this is known to happen sometimes without a clear reason.
  3. Don't see how. Please post the diagnostics.
  4. I can't confirm since I don't know that exact model, but assuming there's nothing special about it, it *should* work.
  5. Nope, there's no time code. You can try again, take note of the time it crashes and upload a new syslog. Another thing you can try is to boot the server in safe mode with all Docker/VMs disabled and let it run as a basic NAS for a few days. If it still crashes, it's likely a hardware problem; if it doesn't, start turning on the other services one by one.
  6. The most likely explanation is that the SSD was first added to the array when Unraid used sector 64 for the partition start. If you still have the old SSD intact, please post the output of:
     fdisk -l /dev/sdX
     Replace X with the correct letter.
  7. Going by how it's logged, it's a scheduled check:
     Jun 9 22:00:01 HRH-UNRAID kernel: mdcmd (38): check
     Also, there's a button to cancel; you might need to refresh the GUI first.
  8. Again nothing obvious, and no known issues with the hardware you're using that I know of. Do you know the exact time it last crashed, just in case I missed something in the syslog?
  9. That should work, one cable from the LSI to each controller.
  10. Nothing obvious in the log, please post the diagnostics so we can see the hardware used.
  11. It's available in the diags, in system/meminfo.txt. It shows both DIMMs on channel A, which means they were installed side by side. To install them in dual channel mode you need to leave one slot empty between the DIMMs. Some boards use different colors on the sockets to indicate this, but not all.
  12. Disks 2 and 4 dropped offline:
      Jun 9 22:09:42 Tower kernel: ata3: softreset failed (timeout)
      Jun 9 22:09:42 Tower kernel: ata3: reset failed, giving up
      Jun 9 22:09:42 Tower kernel: ata3.00: disabled
      ...
      Jun 9 22:10:57 Tower kernel: ata4: softreset failed (timeout)
      Jun 9 22:10:57 Tower kernel: ata4: reset failed, giving up
      Jun 9 22:10:57 Tower kernel: ata4.00: disabled
      Start by powering down and checking/replacing the cables for both disks, then post new diags after array start. If possible I would also consider replacing the old PCI controller both disks are connected to. That's a very old and slow controller, and replacing it would show whether the disks dropping is caused by the controller itself.
  13. It's also the macvlan issue:
      Jun 3 15:47:20 vkhpsrv01 kernel: macvlan_broadcast+0x116/0x144 [macvlan]
      Jun 3 15:47:20 vkhpsrv01 kernel: macvlan_process_broadcast+0xc7/0x110 [macvlan]
      Besides that, there's also filesystem corruption detected on disk1, so check the filesystem there. Btrfs is also detecting data corruption in the pool, see here how to handle that, though it's a good idea to run memtest first. You also have both DIMMs in the same channel; for better performance and stability install one in each channel.
  14. Post new diags after disabling the Bluetooth module, and next time please post the original zip.
  15. You should be able to create a RAID0 volume for each disk, but note that's not ideal and not recommended with Unraid. You should try to get a true HBA.
  16. As will everyone using a Samsung 980 NVMe device, until they fix it.
  17. Btrfs is very susceptible to RAM or any other hardware corruption issue, much more than other filesystems, so if there's an issue it's where you'll see it first. But there are many users, not just on Unraid, who have been using very large btrfs filesystems for years without issues. I myself have had roughly 200 btrfs filesystems in use for about 5 or 6 years and only had issues with one: it got trashed twice in a couple of months, and I traced it to a bad disk.
  18. Load average is crazy high, and RAM usage is at 99%. Start shutting down some services to see if you can find the culprit. If not, do it the other way around: start the array and leave all VMs/Dockers disabled, then start enabling them one by one, or a few at a time.
  19. One of the cache devices dropped offline:
      Jun 8 19:56:06 BIGDADDY kernel: ata1: hard resetting link
      Jun 8 19:56:12 BIGDADDY kernel: ata1: COMRESET failed (errno=-16)
      Jun 8 19:56:12 BIGDADDY kernel: ata1: reset failed, giving up
      Jun 8 19:56:12 BIGDADDY kernel: ata1.00: disabled
      Check/replace cables, also see here for better pool monitoring for the future.
  20. Unfortunately I'm not seeing anything relevant logged, which can make troubleshooting very difficult. Do you remember the last release it was stable? Can you go back to that one and confirm it remains stable?
  21. Looks to me like a hardware issue. One more thing you can try is to boot the server in safe mode with all Docker/VMs disabled and let it run as a basic NAS for a few days. If it still crashes, it's likely a hardware problem; if it doesn't, start turning on the other services one by one.
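
Several of the replies above (items 12 and 19) diagnose a dropped disk from kernel lines such as "reset failed, giving up" followed by "ataN.00: disabled". As a rough sketch only, and not an Unraid tool, the same pattern could be searched for in a saved syslog like this. The function name `dropped_ata_links` and the exact set of messages matched are assumptions made for illustration; real syslogs may contain other ATA error messages that this does not cover.

```python
import re
from collections import defaultdict

# Hypothetical pattern: matches only the kernel messages quoted in the posts above,
# e.g. "ata3: reset failed, giving up" or "ata3.00: disabled".
ATA_EVENT = re.compile(
    r"kernel: (ata\d+)(?:\.\d+)?: "
    r"(softreset failed|COMRESET failed|reset failed, giving up|disabled)"
)

def dropped_ata_links(lines):
    """Return {ata link: [matched events]} for links the kernel ended up disabling."""
    events = defaultdict(list)
    for line in lines:
        match = ATA_EVENT.search(line)
        if match:
            events[match.group(1)].append(match.group(2))
    # Keep only links whose device was actually reported as disabled.
    return {link: evs for link, evs in events.items() if "disabled" in evs}

# Sample lines copied from the log excerpts quoted above.
sample = [
    "Jun 9 22:09:42 Tower kernel: ata3: softreset failed (timeout)",
    "Jun 9 22:09:42 Tower kernel: ata3: reset failed, giving up",
    "Jun 9 22:09:42 Tower kernel: ata3.00: disabled",
    "Jun 8 19:56:06 BIGDADDY kernel: ata1: hard resetting link",
    "Jun 8 19:56:12 BIGDADDY kernel: ata1: COMRESET failed (errno=-16)",
]

if __name__ == "__main__":
    for link, evs in dropped_ata_links(sample).items():
        print(link, "->", ", ".join(evs))
```

In this sample only ata3 is reported, because the ata1 lines included here stop before the "disabled" message. Mapping an ataN link back to a specific disk still needs the full diagnostics.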