JorgeB

Moderators
  • Posts

    67,397
  • Days Won

    705

Everything posted by JorgeB

  1. Yep. Btrfs can be more finicky, especially with hardware issues. xfs is supposed to support reflinking in the near future, maybe it already does; I'm not sure, since I don't keep up with xfs development.
  2. Disk3 looks healthy. I suggested using a new one just in case something goes wrong during the rebuild, so you can still copy the data from the old disk if needed. After this is resolved you can reuse that disk however you want.
  3. Yes, but as mentioned I would recommend rebuilding to a new disk.
  4. It could, but if that's the case it should also start causing problems on the current slot. That's expected; once a disk is disabled it needs to be rebuilt. Disk8 is mounting correctly, so there's no need to fix the filesystem for now, just rebuild on top.
  5. Yes, but not automatically; you can use a script though. Make a copy of the vdisk when the VM is shut down in the state you want, then just copy it back, overwriting the existing one, before the VM starts. If using btrfs you can do this with reflinks, so the copy takes a couple of seconds and also uses much less space, only the differences between them.
  6. Just start the array and check the data on disk3.
  7. Read errors on disks 10 and 12 appear to be resolved, at least for now. You need to check the filesystem on the emulated disk3: https://wiki.unraid.net/Check_Disk_Filesystems#Checking_and_fixing_drives_in_the_webGui Remove -n or nothing will be done, and if it asks for -L use it. If emulated disk3 mounts after that, check that the contents look correct, and also look for a lost+found folder with any partial or lost files. If all looks good the next step is to rebuild the disk. Old disk3 looks fine and the problem was most likely cable related, but if possible I would recommend rebuilding to a new disk just in case something goes wrong during the rebuild: with the new cables there were no errors on boot, but there could still be some during the rebuild, which is much more I/O intensive.
  8. What can I do to keep my Ryzen based server from crashing/locking up with Unraid? Ryzen on Linux can lock up due to issues with C-states, and while this should mostly affect 1st gen Ryzen, there are reports that 2nd and even 3rd gen can be affected in some cases. Make sure the BIOS is up to date, then look for "Power Supply Idle Control" (or similar) and set it to "typical current idle" (or similar). If there's no such setting in the BIOS, try instead to disable C-states globally. Also note that there have been some reports that with some boards the setting above is not enough and only completely disabling C-states brings stability. Also, many of those servers seem to be running overclocked RAM. This is known to cause stability issues and even data corruption on some Ryzen/Threadripper systems, even if no errors are detected during memtest. Servers and overclocking don't go well together, so respect the max RAM speed for your config and CPU as listed in the tables below. Note: Ryzen based APUs don't follow the same generation convention as regular desktop CPUs and are generally one generation behind, so for example the Ryzen 3200G is a 2nd gen CPU. [Tables: 1st gen Ryzen; 2nd gen Ryzen; 3rd gen (3xxx) and Zen3 (5xxx) Ryzen; Zen4 (7xxx) Ryzen; Threadripper 1st Gen; Threadripper 2nd Gen; Threadripper 3rd Gen]
  9. You need to install the Dynamix trim plugin and schedule it to run regularly; once a day is usually adequate.
  10. Best way is to replace/swap cables to rule them out. This case is kind of strange in that the errors on both disks are logged like an actual disk problem, but according to the SMART tests it's not. It's good to know which disk is which by where they are in the case, but if you don't know then yes, you need to check the serial numbers.
  11. If it's on the correct share in the array, just change the share to cache="prefer" and run the mover.
  12. OK, disks appear to be fine but you're still having controller issues. Disk12 dropped offline:
      Feb 4 08:30:03 Server kernel: ata3: SATA link down (SStatus 0 SControl 310)
      Feb 4 08:30:03 Server kernel: ata3.00: disabled
      There are also still read errors on disk10. Replace/swap cables on both disks 10 and 12, preferably connecting them to a different controller. Also
  13. Please post the diagnostics: Tools -> Diagnostics
  14. Was the pool encrypted? Also please post the diagnostics: Tools -> Diagnostics
  15. If it was because of the Intel NIC issue, that was fixed in v6.8.2, so update. But your problem seems different; the LSI HBA is constantly faulting and resetting:
      Feb 3 14:20:08 Tower kernel: mpt2sas_cm0: fault_state(0x2622)!
      Feb 3 14:20:08 Tower kernel: mpt2sas_cm0: sending diag reset !!
      Feb 3 14:20:09 Tower kernel: mpt2sas_cm0: diag reset: SUCCESS
      This keeps repeating in the syslog. Check that the HBA is well seated and/or try a different slot if available. You'll then need to check the filesystem on the affected disks, but only after fixing the HBA issue.
  16. Should be, it was released specifically to fix that problem (possibly others also). It's also worth checking all connections. The first thing you want to do is rebuild disk3 (assuming it never completed), since the old rebuild was going to be mostly corrupt.
  17. If, like I suspected, transfers are full speed to the array (with turbo write) and slow to cache, it means your cache device is the problem. Make sure it's getting regularly trimmed or try a different one.
  18. Ryzen on Linux can lock up due to issues with C-states. Make sure the BIOS is up to date, then look for "Power Supply Idle Control" (or similar) and set it to "typical current idle" (or similar), or completely disable C-states. More info here: https://forums.unraid.net/bug-reports/prereleases/670-rc1-system-hard-lock-r354/ IMHO it's a very bad idea to use overclocked RAM on a server, especially since that's known to cause instability and even data corruption on some Ryzen servers. Respect the max RAM speed depending on config:
  19. Strange, the VMs are there and there are no errors in the syslog about libvirt. There are some nginx errors, so it's possibly a browser issue. You can try a different browser, or boot in safe mode to exclude plugin interference. If neither helps I have no idea what the problem is, maybe someone else knows.
  20. Please post output of: ls -lh /etc/libvirt/qemu
  21. Those are the standard diags, post just the syslog, you need to download it from where it's being saved to.
  22. libvirt is starting correctly on the diags posted, is the VM page blank or stuck?
  23. Nothing jumps out hardware wise, and there are very few plugins, so safe mode is unlikely to help. The errors in the snippet you posted are nothing to worry about. Try enabling the syslog server/mirror feature, then post the saved syslog after a crash, it might catch something.
  24. Can't say that. It was either remapped or it is in use again without error. The firmware should have cleared the pending sector but it didn't; that's not unusual, especially with WD drives.
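
The vdisk restore described in post 5 above could be scripted along these lines. This is only a minimal sketch: the function names `save_vdisk` and `restore_vdisk` and the example paths further down are made up for illustration, and it assumes GNU `cp`. It uses `--reflink=auto`, which makes a reflink copy where the filesystem supports it (e.g. btrfs) and falls back to a normal full copy elsewhere; use `--reflink=always` instead if you would rather have it fail than do a full copy.

```shell
# Hypothetical helper names, not part of Unraid. Assumes GNU cp.

# Save the current vdisk as the known-good copy.
# Run this with the VM shut down, in the state you want to return to.
save_vdisk() {
    local live="$1" golden="$2"
    # --reflink=auto: share extents if the filesystem can (btrfs),
    # otherwise fall back to a regular full copy.
    cp --reflink=auto -- "$live" "$golden"
}

# Overwrite the live vdisk with the known-good copy.
# Run this before the VM starts.
restore_vdisk() {
    local golden="$1" live="$2"
    if [ ! -f "$golden" ]; then
        echo "known-good copy not found: $golden" >&2
        return 1
    fi
    cp --reflink=auto -- "$golden" "$live"
}
```

For example, with the VM shut down you might run `save_vdisk /mnt/cache/domains/VM/vdisk1.img /mnt/cache/domains/VM/vdisk1.golden.img` once, and then `restore_vdisk /mnt/cache/domains/VM/vdisk1.golden.img /mnt/cache/domains/VM/vdisk1.img` before each VM start. Both paths are hypothetical, and the two files need to be on the same btrfs filesystem for the reflink to apply.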