JorgeB

Moderators
  • Posts: 67,814
  • Days Won: 708
Everything posted by JorgeB

  1. The auto parity check after an unclean shutdown is non-correcting, the last one you ran was correcting, so there should be no more errors on the next check.
  2. Yes, looks OK now, note that the pool is in single profile, i.e., no redundancy but you can use the total capacity from both devices, if you want to convert to raid1 (redundant but only 120GB can be used since they have different capacities) you first need to remove some data from cache1.
  3. Filesystem is going read-only due to lack of space, but I'm not seeing why it's running out of space at mount time, this usually happens when a balance was started, but there are no signs of "balance resumed" in the log. In any case please try this, with the array stopped type:
       mkdir /temp
       mount -o skip_balance /dev/sdf1 /temp
       btrfs balance cancel /temp
       umount /temp
     Then start the array and post new diags.
  4. Run chkdsk on it but the flash should be OK.
  5. That's expected, once a disk gets disabled it needs to be rebuilt: https://wiki.unraid.net/Manual/Storage_Management#Rebuilding_a_drive_onto_itself
  6. Please reboot and post new diags after array start.
  7. Aug 1 12:35:11 Tower kernel: general protection fault, probably for non-canonical address 0x9c0101000034: 0000 [#1] SMP PTI
     Aug 1 13:01:44 Tower kernel: irq 16: nobody cared (try booting with the "irqpoll" option)
     A couple of days ago there was the same error as last time and then IRQ 16 also got disabled, possibly not a big issue since the server worked for 2 more days, main issue was that this USB controller stopped working:
     Aug 3 12:35:33 Tower kernel: xhci_hcd 0000:00:14.0: xHCI host controller not responding, assume dead
     And the flash drive was using it, so after that Unraid cannot continue to work correctly.
  8. Start here: https://forums.unraid.net/topic/46802-faq-for-unraid-v6/?do=findComment&comment=819173
  9. SMART cannot always identify issues, I would suggest swapping cables again with a different disk to make sure, and if it happens again replace it.
  10. There wasn't a valid btrfs filesystem in sdu, suggesting the device was wiped at some point, because of that Unraid is first deleting the missing device. I do see a lot of these errors logged:
      Aug 3 13:13:34 Anderson kernel: BTRFS warning (device sds1): direct IO failed ino 119162 rw 0,0 sector 0xfe1a280 len 4096 err no 10
      Aug 3 13:13:34 Anderson kernel: BTRFS warning (device sds1): direct IO failed ino 119162 rw 0,0 sector 0xfe1a288 len 4096 err no 10
      Aug 3 13:13:34 Anderson kernel: BTRFS warning (device sds1): direct IO failed ino 119162 rw 0,0 sector 0xfe1a290 len 4096 err no 10
      Not sure what these mean exactly, but while the balance is going let's see if it finishes.
  11. Forgot to mention, likely sdak was old disk9 and it was replaced, so it's the same filesystem, but older generation.
  12. Aug 3 12:22:46 Anderson root: WARNING: adding device /dev/sdak1 gen 3145 but found an existing device /dev/sdj1 gen 3159
      Sdj is disk9, sdak is currently unassigned, wipe or disconnect sdak since it appears to be conflicting with the pool, then please reboot and post new diags after array start.
  13. It's not known and if it happens again grab the diagnostics before rebooting so we can see what happened.
  14. There are some crashes/timeouts related to the NVMe device, please try running without it, keep the syslog server going and post a new one if it crashes again.
  15. It should stop displaying temps for all dropped drives, that's a clue to which ones dropped, you also won't be able to get SMART attributes for them.
  16. Try a different browser other than Firefox, if still the same please try rebooting in safe mode, if you get the stale config error post new diags.
  17. Problem with the onboard SATA controller, quite common with some Ryzen servers, reboot and post new diags after array start, if this continues to happen best bet is to get an add-on controller (or use a different board).
  18. "Invalid partition layout" is not a filesystem problem, the disk lost/damaged some part of the MBR, looks like it mostly happens with WD disks, so possibly a firmware problem. To fix you can rebuild the disk, assuming parity is valid: you just unassign it, start the array, Unraid will recreate the partition correctly and emulate the disk, if all looks good you can then rebuild on top. Alternatively you could mount the disk outside the array and copy the data back to it.
  19. It's not one drive, all devices connected to the controller drop offline, you just get one disabled disk because Unraid only disables as many disks as there are parity drives, but obviously the server cannot continue to work correctly.
  20. Aug 2 11:26:53 Altair8800 kernel: macvlan_broadcast+0x116/0x144 [macvlan]
      Aug 2 11:26:53 Altair8800 kernel: macvlan_process_broadcast+0xc7/0x110 [macvlan]
      Besides the mentioned possible hardware issues this will also make Unraid crash, these are usually the result of having dockers with a custom IP address, switching to ipvlan should fix it (Settings -> Docker Settings -> Docker custom network type -> ipvlan (advanced view must be enabled, top right)).
  21. If memtest doesn't find anything I suggest updating to v6.11.0-rc2 and testing, the newer kernel might help.
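
Several of the replies above come down to spotting a specific line in the syslog. As a rough illustration only, here is a minimal Python sketch that picks out the btrfs generation conflict quoted in item 12 and flags the signatures quoted in items 7 and 20. The names `GEN_CONFLICT`, `SIGNATURES`, `find_stale_device` and `classify` are hypothetical helpers made up for this example, not part of Unraid or btrfs-progs, and the patterns are modelled only on the log lines quoted above, so other kernel or Unraid versions may word the messages differently.

```python
import re

# Pattern modelled on the single warning line quoted in item 12 (assumption:
# other versions may phrase this message differently).
GEN_CONFLICT = re.compile(
    r"WARNING: adding device (?P<new>/dev/\S+) gen (?P<new_gen>\d+) "
    r"but found an existing device (?P<old>/dev/\S+) gen (?P<old_gen>\d+)"
)

# Substrings copied from the log excerpts in items 7 and 20; the labels are
# illustrative descriptions, not official error names.
SIGNATURES = {
    "xHCI host controller not responding": "USB controller dropped",
    "nobody cared": "IRQ disabled",
    "[macvlan]": "macvlan call trace",
}


def find_stale_device(line):
    """Return (older_device, newer_device) for a generation-conflict line.

    Returns None if the line is not a conflict warning or if both devices
    report the same generation.
    """
    m = GEN_CONFLICT.search(line)
    if m is None:
        return None
    new_gen, old_gen = int(m["new_gen"]), int(m["old_gen"])
    if new_gen == old_gen:
        return None
    if new_gen < old_gen:
        return m["new"], m["old"]
    return m["old"], m["new"]


def classify(line):
    """Return the labels of every known signature found in the line."""
    return [label for needle, label in SIGNATURES.items() if needle in line]


if __name__ == "__main__":
    sample = (
        "Aug 3 12:22:46 Anderson root: WARNING: adding device /dev/sdak1 "
        "gen 3145 but found an existing device /dev/sdj1 gen 3159"
    )
    print(find_stale_device(sample))
```

The device with the lower generation is the older copy of the filesystem, which matches the advice in items 11 and 12 to wipe or disconnect sdak. This only reads log text, it does not touch any device.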