Benji

Members
  • Posts: 68

Everything posted by Benji

  1. While preparing stuff to swap the disk out, the raw read errors have all disappeared. The SMART stats show no errors now and it passed an extended test without issue. Should I still be wary of this disk and swap it out anyway? Here's the most recent test:
  2. I didn't know that was an option, I shall do that now. Thank you again!
  3. I do. I'm replacing the 6TB disk 13 with a 10TB disk, so even after removing the 2TB disk 16 I'd still be up 2TB overall. That one hadn't thrown errors yet; I appreciate you bringing it to my attention.
  4. The SMART test is running on disk 16, but the more I think about it, the more I'm tempted to just pre-emptively remove it. Can I replace disk 13 and remove disk 16 at the same time or would I have to replace 13 and then remove 16?
  5. I'll do that as well, thank you.
  6. I thought extended tests took longer. Either way, I got notified of a read failure and the test completed.
  7. Oh, sorry. Drive 13. I'll do that.
  8. I've just had a drive rack up some read errors. I assume the disk is toast and needs replacing, but I just wanted to double check. If it is faulty, I don't currently have a hot/cold spare. So far I've excluded it from every share that uses the array. Would the best protocol be to just leave it and rebuild when I get a new drive, or to try to copy the data off it? Many thanks. apollo-diagnostics-20231204-1308.zip
  9. I swapped it to C1 and I'm getting the same errors in Unraid but the motherboard panel has changed the error to C1. So is it safe to say it's a bad stick then?
  10. I've just migrated my Unraid setup to new hardware and haven't checked the logs in a few days. I noticed there are L3 cache and ECC errors. The Unraid logs also seem to line up with some logs from the Gigabyte management panel. It's logged this within a minute or two of each of the errors: Is all of this suggesting the RAM in DIMM_D1 is bad, or is there something else going on? I realise EPYC is a bit funny about mounting pressure, so I was going to try reseating the CPU and the RAM. The CPU was preinstalled on the mobo when I got it, but I have an appropriate torque screwdriver now. I've attached diagnostics just in case. Any help would be greatly appreciated. apollo-diagnostics-20231129-0847.zip
  11. Yeah, I was afraid of that. It took a month to crash last time. I guess I need to get creative! Thanks.
  12. My server has randomly restarted 3 times now. I enabled mirror syslogs to flash drive after the first restart, waited nearly a month and then it did it again, just after I turned off the mirror! I re-enabled it and it just did it last night. The syslog from the flash drive seems to be identical to the syslog in the GUI. I thought it was supposed to log more info? I've attached the syslog, from the flash drive, from midnight. The server went down around 5am it seems. I'd appreciate any help or suggestions people can offer. apollo-diagnostics-20230823-0809.zip syslog 23.08.txt
  13. I came home today to find my Unraid server offline. I plugged in a display and it was blank, so I reset it. It took forever to run through the BIOS and just stayed on the select boot menu or BIOS screen. I manually selected the flash drive and it went blank for 5 mins, then came back to the boot menu. I turned it off and checked the flash drive on my Windows PC; Windows detected an error and corrected it. Upon plugging it back in and starting the server, everything works as normal. My question is, how did it develop an error and what should I do? Is this a sign the flash drive is faulty or is this a symptom of something else? I've attached the diagnostics, although I'm not sure if they'll be helpful in this instance. Many thanks. apollo-diagnostics-20230707-1800.zip
  14. Ok, I'll add a new one. Thank you for your help.
  15. I don't have one, I've triple checked! Does the extra parameter "--restart unless-stopped" have anything to do with it?
  16. Is there a way to stop it restarting if it thinks there's no connection? I'm not sure what is causing it, but this has happened twice now. I get: 2023-02-02 15:44:39.524391 [ERROR] Network is possibly down. 2023-02-02 15:44:40.540314 [INFO] Restarting container. Neither the WireGuard server nor my home connection has gone offline, so ideally I'd like it to just sit there until whatever blip passes. Thanks.
  17. They are both in enclosures, unused WD Elements enclosures. The drive that has a problem has been fine for a while like that; it was only after constantly plugging and unplugging the new one that the first went wrong. I wonder if I knocked the cables or something while I was working on the other and caused an issue. edit - For reference, I think the partition got deleted or altered. I recreated the NTFS partition in R-Studio and was able to read the entire folder and file structure; the files are being recovered to a new disk now.
  18. That wasn't the drive I put in the enclosure today. Unless you mean a problem with the enclosure anyway?
  19. I was having a problem with one of my unassigned drives that I'd just put into an external shell. That problem was resolved, but at some point the other drive stopped showing up. The mount option is grey. I'm fairly sure it also used to be Dev 1 and is now saying Dev 2; I don't know if that matters. I've tried various combinations of unplugging, replugging, clearing configs, rebooting etc. but nothing has helped. The problematic drive is: WD_Elements_25A3_57344A3058525034-0:0 (sda) WD5TB I'd appreciate any help or advice offered. apollo-diagnostics-20230101-1929.zip
  20. It was doing multiple transfers at the same time and not completing any of them. It seems whenever I start a file move from that part, it just breaks, and even stopping the process won't stop rsync. You do have to kill the rsync processes. Anyway, I tried Midnight Commander again. I realise the last time I tried it was after the other method, which meant the transfers would have still been going on in the background. On a fresh attempt, MC transferred the remaining 2TB overnight with no issue. So either there's something wrong with my setup when attempting moves from the browse disk function or it's a bug. Either way, avoiding it and using MC was the solution.
  21. Nothing that I can see. I stopped the transfer, waited a few mins, but the disk activity hasn't stopped. I checked htop and I can see rsync is still running from disk2 > disk5. Does it still finish the current transfer after stopping it? apollo-diagnostics-20221224-1306.zip
  22. I don't think it is displaying the correct speed. Since that screenshot, 36 hours ago, it's only moved 700GB. It's gone from 22% to 55% complete. That definitely points towards the Move window being the correct estimate. apollo-diagnostics-20221224-1038.zip
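
For reference, the figures in the last post above (700GB moved in 36 hours, progress going from 22% to 55%) can be turned into a rough average speed and time remaining. This is only a back-of-the-envelope sketch: it assumes decimal gigabytes (1GB = 10^9 bytes) and a constant transfer rate, and the function name `transfer_estimate` is made up for illustration.

```python
def transfer_estimate(moved_gb, hours, pct_start, pct_end):
    """Rough transfer figures from two progress readings.

    Assumes decimal GB (1e9 bytes) and a constant rate; both are
    simplifications, not something the forum post states.
    """
    # Average speed in MB/s over the observed window
    rate_mb_s = (moved_gb * 1e9) / (hours * 3600) / 1e6
    # The observed GB correspond to (pct_end - pct_start) percent of the job
    total_gb = moved_gb / ((pct_end - pct_start) / 100)
    # What is left after pct_end percent
    remaining_gb = total_gb * (100 - pct_end) / 100
    # Hours left if the same average rate holds
    hours_left = remaining_gb / (moved_gb / hours)
    return rate_mb_s, total_gb, remaining_gb, hours_left


rate, total, remaining, hours_left = transfer_estimate(700, 36, 22, 55)
print(f"average speed: {rate:.1f} MB/s")
print(f"implied total: {total:.0f} GB, remaining: {remaining:.0f} GB")
print(f"time left at this rate: {hours_left:.0f} hours")
```

On those numbers the average works out to roughly 5.4 MB/s, with about two more days to go if the rate stays the same.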