JorgeB

Moderators
  • Posts

    67,807
  • Joined

  • Last visited

  • Days Won

    708

Everything posted by JorgeB

  1. I believe there's a plan for Unraid to soon have an option to easily install out-of-tree drivers. That's the best bet for issues like these, since compiling the various out-of-tree drivers with every Unraid release was becoming very difficult for LT.
  2. After replacing the cables, start the array and check that the emulated disk is mounting and its contents look correct, then you can rebuild on top: https://wiki.unraid.net/Manual/Storage_Management#Rebuilding_a_drive_onto_itself
  3. Again, it only affects Broadcom NICs, and it's resolved on v6.10.3, no NICs are disabled there. If it's really a hardware problem the update won't fix it, but since the driver is likely newer it might help if it isn't.
  4. Yes, it does look like a power/connection problem, and disk 1 is also showing similar issues. Replace the cables and rebuild.
  5. It's up to you, but the data corruption issue is not NIC related, and even when we suspected it was, that was with Broadcom NICs, not Intel.
  6. Jul 5 09:39:17 Tower kernel: ahci 0000:03:00.0: AHCI controller unavailable!
     Problems with the Marvell controller, and all devices connected there dropped. Marvell controllers are not recommended for Unraid, so the first thing I would recommend is replacing it with a recommended controller. Once that's done (or if you don't want to do that for now, just power cycle the server), run a scrub on the pool and post new diags.
  7. There was a QEMU related crash:
     Jul 4 17:06:42 Tower kernel: qemu-event[9717]: segfault at 0 ip 0000148e82990396 sp 0000148e7e03d930 error 4 in libvirt.so.0.6005.0[148e82846000+1f6000]
     Rebooting should fix it, if it doesn't post new diags.
  8. That suggests a NIC problem. You don't need to do that for v6.10.3, and your NIC wasn't affected even when you needed to use that. No, and those are fixed in v6.10.3, though there could be some issue with your hardware.
  9. The sync thread is single threaded and the CPU usage is very high for the speed it's going. There was a similar case recently, also with a Threadripper IIRC, but I don't remember the solution, let me see if I can find it.
  10. It's running out of RAM, you'll need to limit resources or add more RAM.
  11. Try with just one DIMM at a time and see if those hardware errors go away.
  12. Since both disks are still reporting problems on the onboard SATA ports, I suggest running an extended SMART test on both. It's quite weird, but it looks like both disks failed.
  13. No, that's an impossible speed. Most likely the check is only counting the duration of the last run, but using the full parity size to calculate the average speed.
  14. You can use a Windows PC, just copy all the contents, though what really matters is the config folder.
  15. Make sure the board has the latest BIOS, also if booting UEFI try CSM, or vice versa.
  16. Not a network guy but pretty sure you should only have one gateway, remove the gateway from eth1 and try again.
  17. That's usually a flash drive problem, back up the current one and recreate it manually or using the USB tool.
  18. That's the correct way forward, but depending on the controller used with the Dell there might also be issues with the partitions, especially if it was a RAID controller. You can still do it, but if the disks don't mount due to an "invalid partition" error, don't do anything else and post the diagnostics.
  19. That error is not about the key, please post the diagnostics.
  20. Both disks are reporting a problem, though SMART looks healthy. They are connected to a controller with a SATA port multiplier, and it's a Marvell, two things that can cause issues on their own and are worse together. Connect those 2 disks and disks 2, 5 and 7 (the other ones connected to the port multiplier) to the Intel SATA ports, which are not being used, and post new diags.
  21. As long as it doesn't start crashing again you don't need to do anything. All disks should mount except disk4, which will need to be formatted. If they don't, post new diags.
  22. There's a failing NOW attribute (reallocated sector count), that would be reason enough to replace, but there are also other bad attributes:

      ID# ATTRIBUTE_NAME          FLAGS   VALUE WORST THRESH FAIL RAW_VALUE
        5 Reallocated_Sector_Ct   PO--CK  140   140   140    NOW  1785
      196 Reallocated_Event_Count -O--CK  176   176   000    -    24
      197 Current_Pending_Sector  -O--CK  200   200   000    -    2
      198 Offline_Uncorrectable   ----CK  200   200   000    -    2
      200 Multi_Zone_Error_Rate   ---R--  197   197   000    -    1318
  23. Yes, SMART looks really bad. It wasn't visible before since the disk had dropped, but I should have asked for a SMART report after rebooting. In any case you need to replace that disk.
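
For anyone who wants to check a SMART report the same way as in item 22, here is a rough Python sketch that reads an attribute table in that layout and flags the rows worth worrying about. The function names (`parse_attributes`, `flag_attributes`) and the `WATCH_RAW` set are made up for this example, and the choice of which attributes to watch is only an assumption based on the ones called out in that post, not an official Unraid or smartmontools rule. The sample rows are the ones quoted in item 22.

```python
# Sample rows copied from item 22 (attribute table as printed in a SMART report).
SMART_TABLE = """\
ID# ATTRIBUTE_NAME          FLAGS   VALUE WORST THRESH FAIL RAW_VALUE
  5 Reallocated_Sector_Ct   PO--CK  140   140   140    NOW  1785
196 Reallocated_Event_Count -O--CK  176   176   000    -    24
197 Current_Pending_Sector  -O--CK  200   200   000    -    2
198 Offline_Uncorrectable   ----CK  200   200   000    -    2
200 Multi_Zone_Error_Rate   ---R--  197   197   000    -    1318
"""

# Assumption for this sketch: a non-zero raw value on any of these IDs is
# worth a closer look (these are the attributes listed in item 22).
WATCH_RAW = {5, 196, 197, 198, 200}


def parse_attributes(text):
    """Turn each attribute row into a dict; skip the header and odd lines."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 8 or not parts[0].isdigit():
            continue
        raw = parts[7]
        rows.append({
            "id": int(parts[0]),
            "name": parts[1],
            "value": int(parts[3]),
            "worst": int(parts[4]),
            "thresh": int(parts[5]),
            "fail": parts[6],
            # Some drives print raw values that are not plain integers;
            # those are stored as None and ignored by the raw check.
            "raw": int(raw) if raw.isdigit() else None,
        })
    return rows


def flag_attributes(rows):
    """Return (name, reason) pairs for attributes that look bad."""
    flagged = []
    for r in rows:
        if r["fail"] != "-":
            # The FAIL column is "-" for healthy attributes.
            flagged.append((r["name"], "failing " + r["fail"]))
        elif r["id"] in WATCH_RAW and r["raw"]:
            flagged.append((r["name"], "non-zero raw value %d" % r["raw"]))
    return flagged


if __name__ == "__main__":
    for name, reason in flag_attributes(parse_attributes(SMART_TABLE)):
        print(name + ": " + reason)
```

This only covers the attribute table; it does not replace running an extended SMART test or posting the full diagnostics.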