Benji

Members
  • Posts

    68

Benji's Achievements

Rookie (2/14)

5

Reputation

  1. While preparing stuff to swap the disk out, the raw read errors have all disappeared. The SMART stats show no errors now and it passed an extended test without issue. Should I still be wary of this disk and swap it out anyway? Here's the most recent test:
  2. I didn't know that was an option, I shall do that now. Thank you again!
  3. I do. I'm replacing the 6TB 13 with a 10TB disk, so even removing the 2TB 16 I'd still be up 2TB overall. That one hadn't thrown errors yet, I appreciate you bringing it to my attention.
  4. The SMART test is running on disk 16, but the more I think about it, the more I'm tempted to just pre-emptively remove it. Can I replace disk 13 and remove disk 16 at the same time or would I have to replace 13 and then remove 16?
  5. I'll do that as well, thank you.
  6. I thought extended tests took longer. Either way, I got notified of a read failure and the test completed.
  7. Oh, sorry. Drive 13. I'll do that.
  8. I've just had a drive rack up some read errors. I assume the disk is toast and needs replacing, but I just wanted to double check. If it is faulty, I don't currently have a hot/cold spare. So far I've excluded it from every share that uses the array. Would the best protocol be to just leave it and rebuild when I get a new drive, or to try to copy data off it? Many thanks. apollo-diagnostics-20231204-1308.zip
  9. I swapped it to C1 and I'm getting the same errors in Unraid, but the motherboard panel now reports the error on C1. So is it safe to say it's a bad stick then?
  10. I've just migrated my Unraid setup to new hardware and haven't checked the logs in a few days. I noticed there are L3 cache and ECC errors. The Unraid logs also seem to sync up with some logs from the Gigabyte management panel. It logged this within a minute or two of each of the errors: Is all of this suggesting the RAM in DIMM_D1 is bad or is there something else going on? I realise EPYC is a bit funny about mounting pressure, so I was going to try reseating the CPU and the RAM. The CPU was preinstalled on the mobo when I got it, but I have an appropriate torque screwdriver now. I've attached diagnostics just in case. Any help would be greatly appreciated. apollo-diagnostics-20231129-0847.zip
  11. Yeah, I was afraid of that. It took a month to crash last time. I guess I need to get creative! Thanks.
  12. My server has randomly restarted 3 times now. I enabled mirroring the syslog to the flash drive after the first restart, waited nearly a month, and then it did it again just after I turned off the mirror! I re-enabled it and it restarted again last night. The syslog from the flash drive seems to be identical to the syslog in the GUI. I thought it was supposed to log more info? I've attached the syslog from the flash drive, starting from midnight. The server went down around 5am, it seems. I'd appreciate any help or suggestions people can offer. apollo-diagnostics-20230823-0809.zip syslog 23.08.txt
  13. I came home today to find my Unraid server offline. I plugged in a display and it was blank, so I reset it. It took forever to run through the BIOS and just stayed on the select boot menu or BIOS screen. I manually selected the flash drive and it went blank for 5 mins, then came back to the boot menu. I turned it off and checked the flash drive on my Windows PC; Windows detected an error and corrected it. Upon plugging it back in and starting the server, everything works as normal. My question is, how did it develop an error and what should I do? Is this a sign the flash drive is faulty or is this a symptom of something else? I've attached the diagnostics, although I'm not sure if they'll be helpful in this instance. Many thanks. apollo-diagnostics-20230707-1800.zip