Everything posted by JorgeB

  1. There are read errors on the failing device, and because of that some data can't be moved to the other one. There's still a lot of data remaining on that device, so back up whatever you can to the array or another device, then re-create the cache.
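     If it helps, a minimal way to copy whatever is still readable from the console, assuming the pool is still mounted at /mnt/cache (the destination path is just a placeholder, use whichever array disk or share has room):
     rsync -av /mnt/cache/ /mnt/disk1/cache_backup/
     rsync will report the files it can't read and keep going, so the errors at the end also tell you what couldn't be saved.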
  2. The diags are from after rebooting; the only thing visible now is a corrupt docker image. Re-create it, and if it happens again grab the diagnostics before rebooting.
  3. Make sure the firmware on both the HBA and the expander is the latest available. If errors persist it could be a power/connection issue; diagnostics might give more clues.
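     If it's a Broadcom/LSI SAS3 HBA and the flash utility is installed, the current firmware version can be checked from the console with, for example:
     sas3flash -list
     (the tool name is only an assumption for a 93xx-series card; use the matching utility for your controller, or check it from the card's boot ROM).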
  4. Enable the syslog mirror to flash (Settings -> Syslog Server), then post that log after a crash.
  5. That is expected: the first rebuilt disk would be corrupt because of the read errors, even if it was empty, and that would then translate to corruption on the next rebuild. You can run ddrescue on the failing disk; this way you can at least know which files are corrupt after the clone.
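     As a rough sketch, assuming the failing disk is /dev/sdX and the replacement is /dev/sdY (placeholders, double-check the identifiers before running anything) with the map file saved to the flash drive:
     ddrescue -f /dev/sdX /dev/sdY /boot/ddrescue.map
     The map file lets ddrescue resume if interrupted and records which areas couldn't be read, which is what you'd use to work out the affected files afterwards.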
  6. Sorry, my fault, I forgot about the metadata; it's still raid1. First convert it to single as well:
     btrfs balance start -f -mconvert=single /mnt/cache
     Then do the above.
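     Once the balance finishes you can confirm both data and metadata are on the single profile with:
     btrfs filesystem df /mnt/cache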
  7. Looks more like a controller problem, but please post the complete diagnostics: Tools -> Diagnostics
  8. Do you mean the problem happens in certain slots? If so, it suggests a backplane issue; it could also be a power issue depending on how the backplane is powered. Don't replace or format any more disks, just try to do the parity sync after replacing the backplane/checking power; if there are still read errors on multiple disks there's still a problem.
  9. Forgot to mention: unrelated to this issue, but the server was running out of RAM for some time before this, you should also look into that.
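     A quick way to check current memory usage from the console is:
     free -m
     and the syslog will also show out-of-memory messages if it got as far as the OOM killer.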
  10. There were errors on multiple disks:
     Jul 3 21:43:36 Tower kernel: md: disk4 read error, sector=8177890032
     Jul 3 21:43:36 Tower kernel: md: disk1 read error, sector=8177890032
     Jul 3 21:43:36 Tower kernel: md: disk2 read error, sector=8177890032
     Jul 3 21:43:36 Tower kernel: md: disk6 read error, sector=8177890032
     Jul 3 21:43:36 Tower kernel: md: disk8 read error, sector=8177890032
     Jul 3 21:43:36 Tower kernel: md: disk0 read error, sector=8177890032
     Jul 3 21:43:36 Tower kernel: md: disk29 read error, sector=8177890032
     Also errors on the cache device; it could be a connection/power problem or an HBA problem.
  11. The OP's issue has nothing to do with the GUI; it's not the first time you've posted the same wrong information, please stop.
  12. See if this applies to you: https://forums.unraid.net/topic/70529-650-call-traces-when-assigning-ip-address-to-docker-containers/
     See also here: https://forums.unraid.net/bug-reports/stable-releases/690691-kernel-panic-due-to-netfilter-nf_nat_setup_info-docker-static-ip-macvlan-r1356/
  13. Please post the diagnostics: Tools -> Diagnostics
  14. That disk is already showing a lot of pending sectors; it should be replaced. The errors on disk2 look more like a power/connection problem; replace the cables and rebuild disk1 again onto a new disk.
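     To keep an eye on that attribute from the console, assuming the disk is /dev/sdX (placeholder):
     smartctl -A /dev/sdX | grep -i pending
     A non-zero and growing Current_Pending_Sector count is a good reason to replace the disk.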
  15. Any data on formatted drives would be lost. If you want more advice, for now please post the diagnostics: Tools -> Diagnostics (after array start).
  16. This issue is about the number of drives, not their size; you might have another issue or some controller bottleneck. Diags grabbed during a parity check might help.
  17. I would suggest this since there's a suspect device, so the quicker it's done the better. Note that if there are read errors there will be problems, but it would be the same if you tried to convert to raid1.
  18. Since the pool is now in single mode and has a possibly failing device you can try to remove it now instead of converting back to raid1 and then removing it, but a device can only be removed from a single profile pool manually. Before starting, it's a good idea to make sure backups are up to date, then:
     - with the array started, type in the console: btrfs dev del /dev/sdb1 /mnt/cache
     - if the command aborts with errors, post new diags; if it completes without errors and you get the cursor back, stop the array
     - unassign both cache devices
     - start the array
     - stop the array
     - assign the Samsung cache device only
     - start the array
     - done
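     After the delete finishes, and before stopping the array, you can confirm only one device remains in the pool with:
     btrfs filesystem show /mnt/cache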
  19. No, there is some problem with the pool, probably because of the failing ADATA device, but I can't see what it was in the diags posted.
  20. It's not mounting because you converted the pool to single profile and then removed a device; that's not possible, you can only remove devices from a redundant pool. This might work:
     - stop the array
     - unassign all cache devices
     - start the array
     - type in the console (if you rebooted since the diags, make sure the ADATA SSD is still sdb): btrfs-select-super -s 1 /dev/sdb1
     - stop the array
     - assign both cache devices; there can't be an "all data on this device will be deleted" warning for any of the cache devices
     - start the array
     - post new diags
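     Once the command has run, you can check the pool is being detected again (read-only, still assuming the device is sdb) with:
     btrfs filesystem show /dev/sdb1
     If that still errors out, post new diags instead of continuing with the remaining steps.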
  21. Please post current diagnostics: Tools -> Diagnostics
  22. It's unusual to get multiple corrupt filesystems at the same time without an apparent reason; there might be a hardware issue, but you can try to fix them:
     https://wiki.unraid.net/Check_Disk_Filesystems#Checking_and_fixing_drives_in_the_webGui
     Run it without -n or nothing will be done; if it asks for -L, use it.
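     If you'd rather run it from the console, the equivalent for an XFS disk, with the array started in maintenance mode, would be something along these lines (disk1 is just an example; use the md device matching your disk number as shown on the wiki page above):
     xfs_repair -v /dev/md1
     The same notes apply: -n only reports, and -L should only be used if xfs_repair asks for it.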