
JorgeB

Moderators
Everything posted by JorgeB

  1. Most commonly it can happen after an unclean shutdown.
  2. If you have multiple NICs you need to check which one is eth0; you can do that by booting in GUI mode. If it still doesn't work after that, please post the diagnostics.
  3. There's still a problem with parity, replace/swap both cables.
  4. SSD dropped offline again:

     Dec 9 18:54:40 vortex kernel: ata9: limiting SATA link speed to 3.0 Gbps
     Dec 9 18:54:41 vortex kernel: ata9: SATA link up 6.0 Gbps (SStatus 133 SControl 320)
     Dec 9 18:55:11 vortex kernel: ata9.00: qc timeout (cmd 0xec)
     Dec 9 18:55:11 vortex kernel: ata9.00: failed to IDENTIFY (I/O error, err_mask=0x4)
     Dec 9 18:55:11 vortex kernel: ata9.00: revalidation failed (errno=-5)
     Dec 9 18:55:11 vortex kernel: ata9.00: disabled

     Did you do what I suggested in the link above, i.e., replace the cables on that SSD? You should also try connecting the SSD to a different controller; Marvell controllers are known to drop disks without reason, and for that and other reasons they are not recommended for Unraid use.
  5. Checksum errors mean data is getting corrupted, most commonly from bad RAM, but it could also be the controller, board, CPU, etc.; it's unlikely to be the SSDs themselves.
  6. No spreadsheet, I just used an online graph site to make the chart.
  7. There was a change in the parity check/sync code that tries to auto-tune the parity check for best performance, but if it was taking 3+ days your previous settings were likely very far from optimal. 20h sounds about right for 14TB; I had a 10TB check that took around 17h.
  8. The initial transfer is cached to RAM; if you see a drop-off after a few seconds, it means the devices can't keep up with the write speed.
  9. There are some recovery options here that might help: https://forums.unraid.net/topic/46802-faq-for-unraid-v6/?do=findComment&comment=543490
  10. Depends on the damage, like I mentioned:
  11. In that case, after fixing the filesystem, I would try disabling trim for a week or so and see if it makes any difference.
  12. Reformat the pool and restore the data, and don't forget the link above for better pool monitoring.
  13. OK, so the problem is network related. If iperf doesn't get more than 2Gbits, neither will any single transfer; you need to try different things, like another NIC, another PC, etc., until iperf performs normally.
  14. First run iperf to confirm LAN bandwidth, then you can try copying to a ramdisk to rule out other bottlenecks. I can get 1GB/s reading from my cache pool, but it's slower when writing to my desktop since the NVMe devices can't keep up.
  15. It's in Settings -> Display Settings -> Show array utilization indicator
  16. That's the UD path. You could try mounting the device manually and running fstrim, just to confirm the issue is not UD related. Unmount the device in UD, then:

      mkdir /temp
      mount -t xfs /dev/sdX1 /temp
      fstrim -v /temp

      Replace X with the correct letter, and don't forget the 1 next to it.
  17. The only thing out of the ordinary are these errors every time the device is trimmed:

      Dec 1 01:00:33 MOZART kernel: dmar_fault: 20 callbacks suppressed
      Dec 1 01:00:33 MOZART kernel: DMAR: DRHD: handling fault status reg 3
      Dec 1 01:00:33 MOZART kernel: DMAR: [DMA Read] Request device [06:00.0] fault addr f9590000 [fault reason 06] PTE Read access is not set
      Dec 1 01:00:33 MOZART kernel: DMAR: DRHD: handling fault status reg 3
      Dec 1 01:00:33 MOZART kernel: DMAR: [DMA Read] Request device [06:00.0] fault addr f78ff000 [fault reason 06] PTE Read access is not set
      Dec 1 01:00:40 MOZART kernel: DMAR: DRHD: handling fault status reg 3
      Dec 1 01:00:40 MOZART kernel: DMAR: [DMA Read] Request device [06:00.0] fault addr ee08d000 [fault reason 06] PTE Read access is not set
      Dec 1 01:00:41 MOZART kernel: DMAR: DRHD: handling fault status reg 3
      Dec 1 01:00:41 MOZART kernel: DMAR: [DMA Read] Request device [06:00.0] fault addr f7903000 [fault reason 06] PTE Read access is not set
      Dec 1 01:00:42 MOZART root: /var/lib/docker: 8.1 GiB (8695169024 bytes) trimmed on /dev/loop2
      Dec 1 01:00:42 MOZART root: /mnt/cache: 397.6 GiB (426866036736 bytes) trimmed on /dev/nvme0n1p1

      I believe there's a firmware update to fix that. Though I can't see it causing data corruption, it should still be fixed; then see if it makes any difference.
  18. Set any shares with data on cache you want moved to the array to cache="yes" then run the mover.
  19. It can, but it shouldn't cause data corruption; it should correct single-bit errors or halt the system if an uncorrectable error is detected. You can post the diagnostics; maybe something else is visible.
  20. You should format, but checksum errors are more likely caused by bad RAM than the NVMe device; you should run memtest.
  21. That's because of a bug, so that v6.8 final can be released; v6.9-rc will also be released at the same time, with kernel 5.4.x.
  22. Difficult to say because of the filesystem corruption, but unlikely; make sure to try and back up anything important before attempting anything.