Everything posted by JorgeB

  1. Start by running a single stream iperf test in both directions to check network bandwidth.
  2. Do you mean you cannot boot? If yes, where does it stop?
  3. Something strange is going on there.
  4. It's not logged as a disk problem and disk5 looks healthy. Cancel the rebuild, replace the cables or change the slot on disk5, and try again.
  5. The diags were taken after rebooting, so there's not much to see. If it happens again, grab them before rebooting.
  6. No segfaults in the diags posted. Run memtest and see if you get any more.
  7. Aug 3 11:25:40 Tower kernel: md: import_slot: 4 empty
     Aug 3 11:25:40 Tower kernel: md: import_slot: 6 empty
     The array was started without disks 4 and 6 assigned, which makes Unraid emulate those disks. They were re-assigned after an array stop:
     Aug 3 11:34:51 Tower kernel: md: import_slot: 4 replaced
     Aug 3 11:34:51 Tower kernel: md: import_slot: 6 replaced
     And that makes a rebuild required.
  8. That suggests a problem with a controller or disk. Unfortunately there's no easy way to tell which without testing them one by one.
  9. Yes, both are good and fast devices. The MX500 has a firmware issue with false pending sectors, but that can easily be "solved" by not monitoring that attribute.
  10. Do you mean just restoring the backup to a different device made Unraid start to crash again?
  11. Enable the syslog server and post that together with the complete diagnostics after a crash.
  12. When the checksums are done by the filesystem, like zfs or btrfs, they are done block by block, not by file. When done by, for example, the file integrity plugin, they are done file by file; they are always the same size and very small, so they fit in the extended attributes. There are various ways; I for example use snapshots, which are read-only and cannot be modified by that kind of attack.
  13. If this was interrupted you need to run it again. Since you replaced the disk, you can run it on the old unassigned disk to avoid having the array offline, assuming you have enough ports, and then copy the data over. In that case, take the opportunity to format the new disk as xfs.
  14. Bitrot is extremely rare, and the drives have error correction, but it can happen. IMHO checksums are mostly useful when, for example, an issue occurs during a disk rebuild, like errors on another disk, and you can see whether any files were affected, and which ones.
  15. Enable the syslog server and post that after a crash.
  16. Upsides: Mainly checksum and snapshot support. Downsides: Not as resilient as xfs, especially with bad hardware, and recovery in case of serious corruption might be more difficult. That said, in my experience (I have around 200 btrfs filesystems, all but about a dozen of them single device), single-device filesystems like the ones used in the array are more resilient against corruption than multi-device filesystems. Not against a disk failure, obviously; for that you use parity in the array.
  17. No filesystem corruption detected so far. First thing to do is check whether the files are there: browse the shares using, for example, midnight commander (mc on the console). If they are there, you should then check the permissions.
  18. Please post the diagnostics, though anything that happened before this last boot cannot be seen.
  19. Aug 3 02:15:23 black shfs: shfs: ../lib/fuse.c:1451: unlink_node: Assertion `node->nlookup > 1' failed.
      It's the issue below: Some workarounds are discussed there: mostly, disable NFS if it's not needed, or change everything to SMB. It can also be caused by Tdarr, if you use that.
  20. If the drives were disabled when the ports failed they need to be rebuilt; just using a different controller would not necessitate a rebuild.
  21. Yes, both checks were correct and found the same errors, which might suggest a problem with one of the disks. Since there were several ATA errors with disk3, replace the cables there and run another check.
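
As a rough illustration of the point in post 12 about per-file versus per-block checksums, here is a minimal Python sketch. Everything in it is an assumption made for the example: the 4096-byte block size, the use of SHA-256 and CRC32 as stand-in hash functions, and the helper names. It is not how zfs, btrfs, or any plugin actually stores its checksums; it only shows that a whole-file digest has a fixed, small size regardless of file size, while per-block checksums can narrow a mismatch down to a specific block.

```python
import hashlib
import zlib
from typing import List

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def file_checksum(data: bytes) -> str:
    """One digest for the whole file; its length does not depend on file size."""
    return hashlib.sha256(data).hexdigest()


def block_checksums(data: bytes, block_size: int = BLOCK_SIZE) -> List[int]:
    """One CRC32 per block, so a mismatch can be pinned to a single block."""
    return [
        zlib.crc32(data[offset:offset + block_size])
        for offset in range(0, len(data), block_size)
    ]


def corrupted_blocks(original: List[int], current: List[int]) -> List[int]:
    """Indexes of blocks whose checksum no longer matches the stored one."""
    return [i for i, (a, b) in enumerate(zip(original, current)) if a != b]


if __name__ == "__main__":
    good = bytes(range(256)) * 64      # 16 KiB of sample data, i.e. 4 blocks
    damaged = bytearray(good)
    damaged[5000] ^= 0x01              # flip one bit inside block 1
    damaged = bytes(damaged)

    # The per-file digest only says "this file changed".
    print(file_checksum(good) != file_checksum(damaged))
    # The per-block list says which block changed.
    print(corrupted_blocks(block_checksums(good), block_checksums(damaged)))
```

The per-file variant is what makes the "which files were affected" check after a problematic rebuild possible: compare the stored digest of each file against a freshly computed one and list the files that differ.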