JorgeB

Moderators
Everything posted by JorgeB

  1. You can check the syslog to see which files are corrupt and need to be replaced/deleted.
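A hedged sketch of what that syslog check can look like. The log lines and paths below are made-up examples, not from any real system; on Unraid the live syslog is `/var/log/syslog`, and btrfs reports corrupt data with "csum failed" lines that name the inode and offset:

```shell
# Create a sample syslog snippet (illustrative btrfs checksum-error lines)
cat > /tmp/syslog.sample <<'EOF'
kernel: BTRFS warning (device nvme0n1p1): csum failed root 5 ino 257 off 0 csum 0x8941f998 expected csum 0x12345678 mirror 1
kernel: BTRFS info (device nvme0n1p1): no csum found for inode 258 start 0
EOF

# Pull out just the checksum failures; the "ino" number identifies the file
grep -o 'csum failed.*' /tmp/syslog.sample
```

On a real system you would run the grep against `/var/log/syslog` and map the inode back to a path (for example with `find /mnt/cache -inum 257`).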
  2. Btrfs was detecting data corruption before going read-only. You're using ECC RAM, so that's unlikely to be the cause, but there's probably some hardware issue; try a different NVMe device if available.
  3. I found this by accident yesterday while testing another thing, when I have some time I'm going to test with earlier releases to see if/when it changed then decide.
  4. Unlikely; after those errors the transfer should be retried, but I can't say it's impossible. The checksums are calculated when the block is written and checked every time it's read/accessed.
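The write-time/read-time checksum idea can be sketched like this. Btrfs actually stores a crc32c per data block in its metadata; the example below uses `md5sum` on a whole file purely to illustrate the compute-on-write, verify-on-read pattern (all file names are made up):

```shell
# "Write": store the data and record its checksum at write time
echo "block data" > /tmp/block
md5sum /tmp/block > /tmp/block.md5

# "Read": re-hash the data and compare against the stored checksum;
# any bit flip since the write makes this report FAILED instead of OK
md5sum -c /tmp/block.md5
```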
  5. You should back up and reformat that filesystem. You should also see here: Ryzen with overclocked RAM is known to corrupt data, and btrfs can get corrupted quickly by RAM errors.
  6. You'll need to do a new config since the disk names will change, and you will very likely get an invalid partition layout on those disks; that can be fixed, but it requires rebuilding them one by one (and obviously you'd need to have valid parity).
  7. Still suspect a power/connection issue then, try another PSU if available.
  8. Yes. For some servers that are always on, I have scripts snapshotting the shares once a day (some multiple times a day) and then replicating the differences to another server. Other scripts monitor the free space of the backup storage and delete older snapshots as needed; this way I can usually go back 30 to 60 days, depending on the share. On my "cold storage" servers, which are mostly off, I just run the script manually to snapshot the disks and replicate whenever there's enough new data to justify an incremental backup to the backup server.
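The snapshot-rotation half of that workflow can be sketched as below. Plain directories stand in for btrfs subvolume snapshots here so the sketch is runnable anywhere; on btrfs the snapshots would be created with `btrfs subvolume snapshot -r`, replicated with `btrfs send | btrfs receive`, and deleted with `btrfs subvolume delete`. The path and KEEP count are made-up examples:

```shell
# Simulate a set of dated snapshots on the backup target
mkdir -p /tmp/snaps_demo
for d in 2024-01-01 2024-01-02 2024-01-03 2024-01-04; do
  mkdir -p "/tmp/snaps_demo/$d"
done

# Retention: keep only the newest KEEP snapshots, delete the rest
KEEP=2
ls -1 /tmp/snaps_demo | sort | head -n -"$KEEP" | while read -r old; do
  rm -rf "/tmp/snaps_demo/$old"   # on btrfs: btrfs subvolume delete
done

ls -1 /tmp/snaps_demo   # only the 2 newest dated snapshots remain
```

A real script would key the deletions off free space on the backup storage rather than a fixed count, as described above.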
  9. ST8000DM004 - these disks are known to sometimes perform very badly, apparently only some of them; the older Seagate Archive SMR disks performed much better.
  10. It depends on the controller, but likely not, it's one of the reasons we don't recommend RAID controllers for Unraid.
  11. Ahh, OK. Just did a dual disk rebuild on my work server using v6.9.2 without issues, so it confirms it's not a general problem, I suspect it's what I wrote above.
  12. disk0: size: 3906985764
      disk11: size: 3907018532
      As you can see, disk0 is slightly smaller, almost certainly because of the RAID controller: it doesn't use all the space when creating the volume, it reserves some of it.
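Assuming those sizes are reported in 1 KiB blocks (as Unraid does), the amount the controller reserved works out like this:

```shell
# Difference between the two reported sizes, in 1 KiB blocks
echo $(( 3907018532 - 3906985764 ))            # prints 32768 (KiB)

# Converted to MiB
echo $(( (3907018532 - 3906985764) / 1024 ))   # prints 32 (MiB)
```

So the RAID volume is about 32 MiB smaller than the raw disk, which is why the rebuilt disk no longer fits.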
  13. Diags will confirm, but if Unraid is complaining then it's not the biggest, might be related to the RAID controller in use.
  14. It's not the first time I've noticed this. Sometimes I use that stream to quickly find a thread and some are missing, e.g. this thread: the last post was 5 hours ago by me, so it should appear in the list above:
  15. No idea, it's not something that I ever tried.
  16. It doesn't if it's just direct connections, it does if it includes a SAS expander.
  17. Probably. Why were you using a "rootshare"? That's not default behavior.
  18. "Doesn't support > 2TiB" / "Supports > 2TiB" - don't get that one, it hasn't been recommended for a long time; see here for a list:
  19. Turns out I was wrong. I'm pretty sure I've seen it work like that before, but at least as of 6.9.2 it doesn't: during the parity2 sync it didn't detect (or correct) sync errors on parity1, despite parity1 being read during the sync.
  20. Apr 29 09:10:50 MissionCtrl kernel: ata1: link is slow to respond, please be patient (ready=0)
      Apr 29 09:10:54 MissionCtrl kernel: ata1: COMRESET failed (errno=-16)
      Apr 29 09:10:54 MissionCtrl kernel: ata1: hard resetting link ...
      Apr 29 09:10:56 MissionCtrl kernel: ata2.00: cmd 61/18:b8:a0:2c:23/00:00:75:00:00/40 tag 23 ncq dma 12288 out
      Apr 29 09:10:56 MissionCtrl kernel: res 61/04:00:00:00:00/00:00:00:00:00/00 Emask 0x401 (device error) <F>
      Apr 29 09:10:56 MissionCtrl kernel: ata2.00: status: { DRDY DF ERR }
      Apr 29 09:10:56 MissionCtrl kernel: ata2.00: error: { ABRT }
      Apr 29 09:10:56 MissionCtrl kernel: ata2.00: failed to read native max address (err_mask=0x1)
      Apr 29 09:10:56 MissionCtrl kernel: ata2.00: HPA support seems broken, skipping HPA handling
      Apr 29 09:10:56 MissionCtrl kernel: ata2.00: failed to enable AA (error_mask=0x1)
      ### [PREVIOUS LINE REPEATED 1 TIMES] ###
      Apr 29 09:10:56 MissionCtrl kernel: ata2.00: configured for UDMA/133 (device error ignored)
      Apr 29 09:10:56 MissionCtrl kernel: ata2: EH complete
      Apr 29 09:10:56 MissionCtrl kernel: ata5.00: exception Emask 0x0 SAct 0xf800 SErr 0x0 action 0x0
      Apr 29 09:10:56 MissionCtrl kernel: ata5.00: irq_stat 0x40000008
      Apr 29 09:10:56 MissionCtrl kernel: ata5.00: failed command: READ FPDMA QUEUED
      Apr 29 09:10:56 MissionCtrl kernel: ata5.00: cmd 60/38:58:d0:b2:7f/00:00:95:01:00/40 tag 11 ncq dma 28672 in
      Apr 29 09:10:56 MissionCtrl kernel: res 61/04:00:00:00:00/00:00:00:00:00/00 Emask 0x401 (device error) <F>
      Apr 29 09:10:56 MissionCtrl kernel: ata5.00: status: { DRDY DF ERR }
      Apr 29 09:10:56 MissionCtrl kernel: ata5.00: error: { ABRT }
      Apr 29 09:10:56 MissionCtrl kernel: ata5.00: failed to read native max address (err_mask=0x1)
      Apr 29 09:10:56 MissionCtrl kernel: ata5.00: HPA support seems broken, skipping HPA handling
      Apr 29 09:10:56 MissionCtrl kernel: ata5.00: failed to enable AA (error_mask=0x1)
      ### [PREVIOUS LINE REPEATED 1 TIMES] ###
      Apr 29 09:10:56 MissionCtrl kernel: ata5.00: configured for UDMA/133 (device error ignored)
      Apr 29 09:10:56 MissionCtrl kernel: ata5: EH complete
      Apr 29 09:10:57 MissionCtrl kernel: ata3.00: exception Emask 0x0 SAct 0xa000 SErr 0x0 action 0x0
      Apr 29 09:10:57 MissionCtrl kernel: ata3.00: irq_stat 0x40000008
      Apr 29 09:10:57 MissionCtrl kernel: ata3.00: failed command: READ FPDMA QUEUED
      Apr 29 09:10:57 MissionCtrl kernel: ata3.00: cmd 60/18:68:90:f7:ea/00:00:5f:01:00/40 tag 13 ncq dma 12288 in
      Apr 29 09:10:57 MissionCtrl kernel: res 61/04:00:00:00:00/00:00:00:00:00/00 Emask 0x401 (device error) <F>
      Apr 29 09:10:57 MissionCtrl kernel: ata3.00: status: { DRDY DF ERR }
      Apr 29 09:10:57 MissionCtrl kernel: ata3.00: error: { ABRT }
      Errors on multiple disks, and this is the onboard controller, so I would start by testing with a different power supply.
  21. See if this applies to you: https://forums.unraid.net/topic/70529-650-call-traces-when-assigning-ip-address-to-docker-containers/ See also here: https://forums.unraid.net/bug-reports/stable-releases/690691-kernel-panic-due-to-netfilter-nf_nat_setup_info-docker-static-ip-macvlan-r1356/
  22. See if this applies to you: https://forums.unraid.net/topic/70529-650-call-traces-when-assigning-ip-address-to-docker-containers/ See also here: https://forums.unraid.net/bug-reports/stable-releases/690691-kernel-panic-due-to-netfilter-nf_nat_setup_info-docker-static-ip-macvlan-r1356/
  23. Pretty sure that won't be a general problem, but I've seen multiple Ryzen users with issues completing a parity check due to various call traces on v6.9.x; probably something to do with the new kernel and the Unraid driver, but without the diags from when it crashed it's just a guess. There already is one: