
JorgeB

Moderators · Posts: 67,386 · Days Won: 705

Everything posted by JorgeB

  1. "prefer" or "no" (assuming all data on those shares is already on cache)
  2. The syslog is lost on reboot, so the one from this issue is gone; if it happens again, download the diagnostics before rebooting.
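Since the syslog lives in RAM and is lost on reboot, a small helper that copies it to the flash drive before rebooting can preserve the evidence. A minimal sketch, assuming the usual Unraid paths (`/var/log/syslog` for the log, `/boot` for the flash drive); the `/boot/logs` destination is just an illustrative choice:

```shell
#!/bin/sh
# Copy the RAM-backed syslog to persistent storage before a reboot.
# Both paths default to typical Unraid locations but can be overridden.
save_syslog() {
    src="${1:-/var/log/syslog}"   # lost on reboot
    dst="${2:-/boot/logs}"        # flash drive survives reboots
    mkdir -p "$dst" || return 1
    cp "$src" "$dst/syslog-$(date +%Y%m%d-%H%M%S).txt"
}
```

Run `save_syslog` just before typing `reboot` so the log is still there for the next set of diagnostics.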
  3. It depends: if replacing the controllers is an option, do that and then rebuild; if not, at least replace or swap the cables on those disks and rebuild. If it happens again, grab the diagnostics before rebooting.
  4. Type "reboot"; if it doesn't work after a few minutes, you'll need to force it.
  5. If you have any script running at array start with the user scripts plugin, disable it; there was a similar case recently.
  6. Current cache pool is a mess:

                     Data      Data       Metadata  System    System
      Id Path        single    RAID1      RAID1     single    RAID1     Unallocated
      -- ---------   --------  ---------  --------  --------  --------  -----------
       1 missing     -         91.97GiB   1.00GiB   -         32.00MiB    -93.00GiB
       3 /dev/sdk1   45.06GiB  181.76GiB  1.00GiB   32.00MiB  32.00MiB      5.00GiB
       4 missing     -         85.00GiB   -         -         -           -85.00GiB
       5 /dev/sdj1   -         688.79GiB  1.00GiB   -         -             1.15TiB
       6 /dev/sdi1   -         684.00GiB  1.00GiB   -         -             1.15TiB
      -- ---------   --------  ---------  --------  --------  --------  -----------
         Total       45.06GiB  865.76GiB  2.00GiB   32.00MiB  32.00MiB     2.13TiB
         Used        29.10GiB  783.93GiB  1.19GiB   0.00B     160.00KiB

     You have one failing device and two missing devices, and part of the data is RAID1 while another part is single profile. Your best bet is to back up any important data still on the cache and re-format the pool with the remaining good devices.
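The pool state above comes from `btrfs filesystem usage` style output, and the missing devices can be counted straight from it. A sketch, using the table from this post as sample input; on a live system you would pipe `btrfs filesystem usage -T /mnt/cache` in instead:

```shell
# Count devices reported as "missing" in btrfs filesystem usage output.
count_missing() {
    awk '$2 == "missing" { n++ } END { print n+0 }'
}

# Sample rows taken from the pool shown above (two missing devices):
missing=$(count_missing <<'EOF'
Id Path      Data     Data      Metadata System   System
 1 missing   -        91.97GiB  1.00GiB  -        32.00MiB
 3 /dev/sdk1 45.06GiB 181.76GiB 1.00GiB  32.00MiB 32.00MiB
 4 missing   -        85.00GiB  -        -        -
 5 /dev/sdj1 -        688.79GiB 1.00GiB  -        -
 6 /dev/sdi1 -        684.00GiB 1.00GiB  -        -
EOF
)
echo "$missing"
```

Anything other than zero means the pool is running degraded and data on single-profile chunks may already be unrecoverable.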
  7. Disk looks fine, but there are already a lot of UDMA CRC errors; these indicate a connection problem, usually a bad SATA cable.
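The CRC count lives in SMART attribute 199 (UDMA_CRC_Error_Count), so it's easy to watch for it rising after swapping the cable. A sketch of pulling the raw value out of `smartctl -A /dev/sdX` output; the sample attribute line below is illustrative, not from this thread:

```shell
# Extract the raw value of SMART attribute 199 (UDMA CRC errors)
# from `smartctl -A /dev/sdX` output. The raw value is the last field.
crc_count() {
    awk '$1 == "199" { print $NF }'
}

# Illustrative sample line in the usual smartctl attribute table layout:
crc=$(crc_count <<'EOF'
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       137
EOF
)
echo "$crc"
```

The counter never resets, so what matters is whether it keeps increasing after the cable is replaced, not its absolute value.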
  8. Cache filesystem is corrupt; best bet is to re-format. If there's any important data there, you can try these recovery options first.
  9. It almost certainly wasn't a disk problem; most likely a controller issue, possibly a cable/power issue.
  10. BINGO! This certainly explains the filesystem corruption, you can't have the same filesystem mounted twice.
  11. Likely not the same issue, but please post the diagnostics: Tools -> Diagnostics
  12. I suggest you replace the controllers, not the drives. Use one of the recommended LSI models: any LSI with a SAS2008/2308/3008/3408 chipset in IT mode, e.g. 9201-8i, 9211-8i, 9207-8i, 9300-8i, 9400-8i, etc., and clones like the Dell H200/H310 and IBM M1015; these last ones need to be crossflashed.
  13. Likely a connection issue; the disks dropped offline at the same time, but it's still good to check SMART. Also, the syslog rotated and doesn't show the boot process, so check that the LSI controllers are using the latest firmware (p20.00.07.00); earlier p20 releases, for example, have known issues.
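The running firmware version is printed by the mpt2sas driver at boot, so it can be checked without rebooting into the flash utility. A sketch that extracts it and compares against the recommended p20.00.07.00 release; the sample boot line is in the format the driver typically logs (assumed here), and on a live system you would use `dmesg | grep -i fwversion`:

```shell
# Pull the LSI firmware version out of an mpt2sas boot message.
fw_version() {
    sed -n 's/.*FWVersion(\([0-9.]*\)).*/\1/p'
}

# Sample mpt2sas boot line (format assumed from typical driver output):
fw=$(fw_version <<'EOF'
mpt2sas_cm0: LSISAS2008: FWVersion(20.00.07.00), ChipRevision(0x03)
EOF
)
echo "$fw"
if [ "$fw" = "20.00.07.00" ]; then
    echo "firmware matches recommended p20.00.07.00"
fi
```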
  14. You're using SASLP controllers, which have several known issues, including the controller crashing and dropping disks without reason. I can't say for sure without the syslog, but that's likely what happened here, since both disabled disks are on the same controller, though it could also be, for example, a cable issue. In any case, if possible I suggest replacing them with LSI HBAs.
  15. Please post current diags, after the reboot.
  16. Rebooting should fix it; if it doesn't, post new diags after the reboot.
  17. Cancel the rebuild and shut down Unraid; when the power comes back on, power up and it will start rebuilding from the beginning.
  18. You should post the diagnostics. If it's an array disk, you can't use it in UD without removing it from the array, and it's also likely to have the same issue.
  19. The transfer graph suggests the LAN is working correctly; the first few GBs are fast because they are cached to RAM, and after that it looks like your cache can't keep up. Try enabling turbo write and writing directly to the array to compare.
  20. Cache1 SSD has issues, and it looks like an actual SSD problem:

      Jan 28 19:00:49 MaxisNAS kernel: BTRFS error (device sdk1): bdev /dev/sdk1 errs: wr 0, rd 866, flush 0, corrupt 0, gen 0
      Jan 28 19:00:49 MaxisNAS kernel: BTRFS error (device sdk1): bdev /dev/sdk1 errs: wr 0, rd 867, flush 0, corrupt 0, gen 0
      Jan 28 19:00:49 MaxisNAS kernel: ata5: EH complete
      Jan 28 19:00:49 MaxisNAS kernel: ata5.00: exception Emask 0x0 SAct 0x1fff0 SErr 0x0 action 0x0
      Jan 28 19:00:49 MaxisNAS kernel: ata5.00: irq_stat 0x40000008
      Jan 28 19:00:49 MaxisNAS kernel: ata5.00: failed command: READ FPDMA QUEUED
      Jan 28 19:00:49 MaxisNAS kernel: ata5.00: cmd 60/08:30:80:ae:1c/00:00:1a:00:00/40 tag 6 ncq dma 4096 in
      Jan 28 19:00:49 MaxisNAS kernel:          res 41/40:08:80:ae:1c/00:00:1a:00:00/00 Emask 0x409 (media error) <F>
      Jan 28 19:00:49 MaxisNAS kernel: ata5.00: status: { DRDY ERR }
      Jan 28 19:00:49 MaxisNAS kernel: ata5.00: error: { UNC }
      Jan 28 19:00:49 MaxisNAS kernel: ata5.00: supports DRM functions and may not be fully accessible
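Those same per-device error counters are kept by btrfs and can be read back any time with `btrfs dev stats /mnt/cache`, which is an easy way to spot new errors without digging through the syslog. A sketch parsing sample output shaped like the real command's, mirroring the rd counter logged above:

```shell
# Sum the btrfs per-device error counters, as reported by
# `btrfs dev stats <mountpoint>`. A non-zero total means device problems.
total_errs() {
    awk '/_errs/ { total += $2 } END { print total+0 }'
}

# Sample output mirroring the counters from the log above (rd 867 on sdk1):
errs=$(total_errs <<'EOF'
[/dev/sdk1].write_io_errs    0
[/dev/sdk1].read_io_errs     867
[/dev/sdk1].flush_io_errs    0
[/dev/sdk1].corruption_errs  0
[/dev/sdk1].generation_errs  0
EOF
)
echo "$errs"
```

The counters persist across reboots until reset with `btrfs dev stats -z`, so resetting them after a repair makes it obvious whether errors are still accumulating.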
  21. I was going to suggest something similar, though if moving the disk around the NetApp didn't help and identical disks work there correctly, that suggests it's not the problem. Still worth a try, and if it keeps failing you can be almost certain the disk is the problem.
  22. OK, so the PCI ROM message appears to be normal. This one just below isn't so normal, but it might be unrelated, since it was right after start:

      Jan 28 18:01:39 Claron-Cloud kernel: vfio-pci 0000:0b:00.3: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x000000007ffd3000 flags=0x0000]

      Unfortunately I'm not seeing any errors logged when it paused; maybe someone else will have an idea.
  23. Don't know if this is an error or normal:

      vfio-pci 0000:08:00.0: No more image in the PCI ROM

      Check if this is logged when starting the VM or after it pauses.
  24. It's not cache space, do you know the date/time when it last paused?