firrae

Everything posted by firrae

  1. Just happened to me on Unraid Server Pro, version 6.10.3. Couldn't even just restart NGINX, as it would disconnect my SSH client (my Mac) a moment after I connected every time. Ended up having to hard shut it down by the power button. When it came back up, a drive was being "emulated" for apparently no reason: its SMART test says it's fine and so does Unraid, other than that it didn't want to bring it up.
  2. Cool, thanks. Will do tomorrow after work. With its ominous wording I figured I'd make sure before doing it ha. Sometimes it actually means something.
  3. OK, what I care about on it is being backed up again. So it's good to die if it needs to.
  4. Hi there, I'm in a weird issue. The UnRAID disk check says my drive has filesystem corruption, but I find it interesting that the system is fine until I've written 1 or 2 GB of data to it. At that point it becomes I/O errors galore. So clearly it's messed up. I went through the wiki article here on repairing drives with issues: https://wiki.unraid.net/Check_Disk_Filesystems#Checking_and_fixing_drives_in_the_webGui. It returns the following through both the UI and an SSH session:

       xfs_repair -v /dev/md6
       Phase 1 - find and verify superblock...
               - block cache size set to 1507224 entries
       Phase 2 - using internal log
               - zero log...
       zero_log: head block 2248359 tail block 2247490
       ERROR: The filesystem has valuable metadata changes in a log which needs to be replayed. Mount the filesystem to replay the log, and unmount it before re-running xfs_repair. If you are unable to mount the filesystem, then use the -L option to destroy the log and attempt a repair.
       Note that destroying the log may cause corruption -- please attempt a mount of the filesystem before doing this.

     So now I'm not sure what that means for me. I don't think there's anything awfully important on the drive and am about to boot it up as normal to double check, but like I said I can read off it perfectly fine, even after writes start to fail, so if there is anything important on it I can presumably recover it.

     Now the crux of my question: what is the best method? I have a spare 4TB drive sitting here if replacing it 1:1 and letting the parity rebuild will work (there's maybe 20GB on the drive as far as I can remember, so not much). Otherwise, do I drop that `-L` flag on it and hope it fixes things (from the wiki the tool seems unreliable for positive outcomes at best)? I'm going to check what's on the drive now by spinning it back up, and hopefully someone can help me go from there. If you need anything else to help out, let me know.
tower-smart-20201119-2112.zip tower-diagnostics-20201119-2112.zip
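For anyone scripting around this, here is a minimal sketch of how the message above could be recognised in captured `xfs_repair` output, so a script stops and asks for a mount attempt instead of going straight to `-L`. The function name and the labels it returns are made up for illustration; only the matched phrase comes from the output quoted in the post.

```python
import re

# Phrase taken from the xfs_repair output quoted in the post above.
REPLAY_MARKER = "valuable metadata changes in a log which needs to be replayed"


def classify_xfs_repair_output(output: str) -> str:
    """Return a coarse, hypothetical label for captured xfs_repair output."""
    # Collapse line wrapping so the phrase matches even if it spans lines.
    normalized = re.sub(r"\s+", " ", output)
    if REPLAY_MARKER in normalized:
        # xfs_repair itself asks for a mount/unmount before considering -L.
        return "mount-to-replay-log"
    if "ERROR" in normalized:
        return "other-error"
    return "no-error-seen"


sample = """Phase 2 - using internal log
        - zero log...
zero_log: head block 2248359 tail block 2247490
ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed.  Mount the filesystem to replay the log, and unmount it before
re-running xfs_repair."""

print(classify_xfs_repair_output(sample))
```

This only classifies text. It does not run `xfs_repair` or touch the disk, and deciding whether to use `-L` is still a manual call.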
  5. While this was not the fix, it did help me generate a useful error in the logs. Seems like something went weird on one of my disks causing I/O locks in specific containers and even some whole shares (the downloads folder being one of them). After a reboot and some clean-up this seems to be working again, though I will need to keep an eye on the drive which is normal procedure I guess (it's one of my oldest). Though thanks for trying and showing me that Chrome add-on, I'm totally using it!
  6. Hi there, maybe someone could help me out. I posted the issue on the GitHub issue tracker here because that's my default place to put issues as a developer lol: https://github.com/binhex/arch-delugevpn/issues/224. Synopsis is that the container is running, VPN is seemingly connected fine, web UI shows up, but when I try to add a torrent by torrent file (.torrent) I get "Failed to upload torrent" and there doesn't seem to be a log in sight about it. Since the UI works otherwise I can only assume the settings are correct. I've tried turning off Privoxy AND VPN altogether and still get the same issue. Magnet links work and the stuff begins to download fine, but adding .torrent files is a complete no go, and as I only use torrents from a private tracker that doesn't offer magnets, this is a deal breaker for me. At this point if I can't figure it out, I'm back on the hunt for a VPN protected torrent container after spending too much time on this one already sadly.
  7. Well, I can't thank you enough @johnnie.black! After updating the firmware I see 0 read errors and the CRC errors have completely stopped increasing on all drives. Thanks a bunch for pointing this out, I would NEVER have thought of this being an issue.
  8. OK, will try that. Thanks for pointing me in a direction @johnnie.black!
  9. What would be the path forward do you think then? I'm not sure what I should do. I have multiple disks reporting read errors, but none show issues other than the CRC errors in SMART. Should I stop the rebuild, flash the firmware, and then... what? Rebuild if a parity check goes well?
  10. Interesting. Otherwise, if you don't mind me asking, does the system look fine at a cursory glance? I do still have UNRAID reporting high read errors as well. Other than the glaring "it's rebuilding a drive" thing of course.
  11. @johnnie.black is that a newer issue or has it been that way since before v6? I only got into UNRAID as v6 was launching and don't remember there being an issue until recently?
  12. Quick update. After I finished writing this I noticed that my SATA cable based drives were also getting these errors, but not all of them. The Parity drive is reporting 0 over its entire life, but the drive nearest it is showing an increasing number of CRC errors, though increasing more slowly than on the drives on the SAS to SATA breakouts. Maybe this points to a combination of the cables and the cages? I really don't know at this point.
  13. Hi there, After digging around on Google and the forums, I believe the issues with my array come down to UDMA CRC errors on a number of my drives, but honestly I'm not sure where to begin looking for the cause. In my eyes, and from reading, I believe it could be one or a combination of 3 things:

      • My SAS to SATA cables (maybe they are cross-talking, and the likely candidate?). I've tried 2 different brands but still get the issue, though the cables from both brands looked the same, just slightly different colours. https://www.amazon.ca/gp/product/B0736J45V2/
      • My drive cages: I have a Rosewill RSV-L4412 which came with 3 drive cages (can't remember the part number for them). https://www.rosewill.com/product/rosewill-rsv-l4412-4u-rackmount-server-case-or-chassis-12-sata-sas-hot-swap-drives-5-cooling-fans-included/
      • My SAS controller, which is a Fujitsu (?) card flashed to be an LSI 9211-8i in "IT" mode.

      At this point I suspect the cables, but I'd be interested in hearing what others think. 8 of my disks connect through these breakout cables; the other 4 go directly to the motherboard SATA ports. What I find interesting is that the drives on the breakout cables seem to have the issue much worse, though this is only a short term observation since I read about this, and the cage that's wired directly currently only has 3 drives in it, while the rest are fully loaded with 4. I'm curious which of these options people think I'd be best served by trying:

      • Get different breakout cables.
      • Get new drive cages.
      • Change out the controller.

      In any case I'd be interested in seeing the recommendations people have on this. This all comes from my seeing what I think are VERY high read error counts as I'm rebuilding my array after changing out a drive. Attached is my diagnostics file from the server. It's in the middle of rebuilding that drive as I mentioned, so whatever decision I make, I'm at least a couple of days away from actually implementing it, assuming I can even get the parts to do it at this point. I'm interested to see what people think. Thanks! tower-diagnostics-20200318-1415.zip
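To compare cables, cages and controller, it helps to track how fast the CRC counter grows per drive rather than looking at the lifetime total. Below is a minimal sketch that pulls attribute 199 (UDMA_CRC_Error_Count) out of `smartctl -A` style text and diffs two snapshots. It assumes the usual ten-column attribute table; the sample line, disk labels and counts are invented for illustration and are not from the attached diagnostics.

```python
from typing import Dict, Optional


def crc_error_count(smartctl_attrs: str) -> Optional[int]:
    """Return the raw value of SMART attribute 199, or None if absent.

    Assumes the common `smartctl -A` layout:
    ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
    """
    for line in smartctl_attrs.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[0] == "199":
            return int(fields[9])
    return None


def crc_deltas(before: Dict[str, int], after: Dict[str, int]) -> Dict[str, int]:
    """Per-disk increase in CRC errors between two snapshots."""
    return {disk: after[disk] - before[disk] for disk in before if disk in after}


# Illustrative attribute line (values are made up).
sample = "199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       12"
print(crc_error_count(sample))

# Hypothetical snapshots taken some hours apart; disk labels are placeholders.
before = {"disk1": 12, "disk2": 0, "disk3": 40}
after = {"disk1": 19, "disk2": 0, "disk3": 95}
print(crc_deltas(before, after))
```

Grouping the deltas by how each disk is connected (breakout cable versus motherboard port) would show whether the growth follows the cables, the cage or the controller.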
  14. You may have found it. I thought I had updated it, and while I need to go into the BIOS to be sure this is one of the update features on the second to last update they gave: I have VT enabled. I will try these BIOS updates if the server crashes otherwise I'll try them tomorrow night by gracefully shutting it down.
  15. 1) It seems the BIOS is fully updated, but I'll check again.
      2) Brand new PSU, replaced the original PSU. Tested the PSU on my main PC and it ran it fine for 3 days.
      3) All the RAM is identical and was purchased at the same time.
      4) RAM meets the mobo's requirements and is within their spec.
      5) This is the one I can't decide on. I have 4 fans in the case: one intake over the HDDs, 1 more intake on the side, and 2 exhaust (back and top). I've had this issue happen with the fans in place and the case closed, and with the side fan removed and the case fully open.
      6) I cleaned pretty well everything before I put it in there. I moved the old PC into a new case and took that time to basically clean everything via compressed air before putting it back in.

      I'll follow up with the BIOS though. As for heat, I figured there'd be some sort of warning or error log somewhere, but I can't find anything that indicates that.
  16. I have a monitor hooked up to it and was looking at it once when it happened, there was no shutdown procedure so it was a hard power off, the screen just went black and then the BIOS boot screen, that's where my thought of it being the PSU came from originally.
  17. Hi there, I've been trying to solve an issue with my server that seems to baffle everyone I speak to and at this point, I'm running out of possibilities. As the title says, my server randomly reboots and there's seemingly no reason. I've captured logs via telnet and looked at them after the crash and found absolutely nothing. The log just stops as if I pulled the power plug. I checked my docker containers and each time there's no consistent action happening that I can point to. Below is a synopsis of my setup:

      • Intel i7-920
      • EVGA X58 motherboard
      • 12GB (6 x 2GB sticks) of DDR3 RAM
      • 4 HDDs (2 x 4TB and 2 x 2TB)
      • 1 SSD (an older 128GB Sandisk)
      • 600W EVGA 80+ Bronze PSU (brand new as I thought this might be the issue originally)

      I do live in an apartment so I have a 1500VA APC UPS between the server and the wall. As I said, I can't find any clear thing that causes the reboot, and the only reason I know it happened is PLEX is no longer available or I hear the beep from the POST succeeding. I have found some potential contributing factors though:

      1) When I'm running no docker containers the server seems fairly stable and was on for 24 straight hours, where the reboot usually happens after 6hrs (rarely less, but it has happened after only 2 hours before).
      2) At about 6 containers it seems to lag the web UI and then crash shortly after.
      3) SabNZB seems to cause it fairly quickly when it's the only container.

      During all this, I am watching the system stats on the dashboard and only a few cores ever spike to 100% and memory never passed 40%, but there's seemingly no consistency between CPU and RAM usage and the rebooting. Finally, I have run memtest86 on it, and after leaving the test to run for 2 straight days it never found an error, so I've basically ruled out memory. I now have an error with Community Applications, likely corruption (I think this happened when it rebooted in the middle of trying to create a container), but even at the start, when CA was working, I was having this issue.
Any help is appreciated as this is basically making it unusable. Edit: I have a telnet session into the server now to try and capture if/when the server reboots, this could take a while though.
  18. @Squid, OK, I'll try that this weekend and maybe move it to a new USB. I hope this might end up solving my stability issues, but from @trurl's comments, I'm not overly hopeful about that part...
  19. Yes, it never registered any power issues and my other PC that is also connected to it was perfectly fine. To note, it is an APC 1500 VA so it should be more than enough. At this point, I'm at a loss if it's not likely the USB. I've changed out the PSU, run memory checks, benchmarked the CPU and all the drives are reporting good health...
  20. Could this also be causing my issue where the server randomly restarts? Also is there a way to redo the flash drive but keep all my current settings and data? EDIT: Also yes the same error.
  21. I don't see any "bread" or any other types of errors after the system has come back up. Just tried to use the apps install again and it's the same issue.
  22. Wish I could tell you. After I made that post I went to bed and now I've awoken to the server having rebooted itself uncleanly again. Could that be caused by a bad USB stick? This has been happening for a while now and I've tested everything I can think of in the hardware and it all comes up fine so I'm running out of options on why this server is so unstable...
  23. Hey @Squid, I had initially messaged about an issue with CA here: but the scope of the issue seems wider than just the cleanup. Whenever I try to install a new application from the repository I get the following error:

      Warning: simplexml_load_file(): /boot/config/plugins/dockerMan/templates-user/my-plexrequests.xml:1: parser error : Document is empty in /usr/local/emhttp/plugins/dynamix.docker.manager/include/CreateDocker.php on line 418
      Warning: simplexml_load_file(): in /usr/local/emhttp/plugins/dynamix.docker.manager/include/CreateDocker.php on line 418
      Warning: simplexml_load_file(): ^ in /usr/local/emhttp/plugins/dynamix.docker.manager/include/CreateDocker.php on line 418
      Fatal error: Uncaught Error: Call to a member function xpath() on boolean in /usr/local/emhttp/plugins/dynamix.docker.manager/include/CreateDocker.php:419
      Stack trace:
      #0 /usr/local/emhttp/plugins/dynamix.docker.manager/include/CreateDocker.php(448): getXmlVal(false, 'Name')
      #1 /usr/local/emhttp/plugins/dynamix.docker.manager/include/CreateDocker.php(675): getUsedPorts()
      #2 /usr/local/emhttp/plugins/dynamix/include/DefaultPageLayout.php(383) : eval()'d code(17): require_once('/usr/local/emht...')
      #3 /usr/local/emhttp/plugins/dynamix/include/DefaultPageLayout.php(383): eval()
      #4 /usr/local/emhttp/plugins/dynamix/template.php(61): require_once('/usr/local/emht...')
      #5 /usr/local/src/wrap_get.php(16): include('/usr/local/emht...')
      #6 {main} thrown in /usr/local/emhttp/plugins/dynamix.docker.manager/include/CreateDocker.php on line 419

      I'd prefer to not have to re-do my Docker setup as I've just seemingly got it stable. For some reason, some of the containers in the repository seem to cause my server to hard reboot with no reason given in the logs. That is a different issue for a different thread though. Am I just missing something?
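The "Document is empty" warning above points at a zero-byte or otherwise unparseable template file, which can happen if the server goes down mid-write. Here is a minimal sketch for listing such files in a template directory. The function name is hypothetical, and it uses Python's standard XML parser rather than whatever the Docker manager plugin uses internally, so treat it as a rough check only.

```python
from pathlib import Path
import xml.etree.ElementTree as ET


def find_broken_templates(template_dir):
    """Return (filename, reason) pairs for empty or unparseable .xml files."""
    broken = []
    for path in sorted(Path(template_dir).glob("*.xml")):
        if path.stat().st_size == 0:
            broken.append((path.name, "empty file"))
            continue
        try:
            ET.parse(str(path))
        except ET.ParseError as exc:
            # Truncated or malformed XML, e.g. a write cut short by a reboot.
            broken.append((path.name, f"parse error: {exc}"))
    return broken
```

Pointing it at the directory named in the warning, for example `find_broken_templates("/boot/config/plugins/dockerMan/templates-user")`, would list candidates to inspect. It only reports; it does not delete or repair anything.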
  24. Hi @Squid, This was working well for me until today when I encountered the following:

      Warning: DOMDocument::loadXML(): Empty string supplied as input in /usr/local/emhttp/plugins/ca.cleanup.appdata/include/xmlHelpers.php on line 195
      Fatal error: Uncaught Exception: [XML2Array] Error parsing the XML string. in /usr/local/emhttp/plugins/ca.cleanup.appdata/include/xmlHelpers.php:197
      Stack trace:
      #0 /usr/local/emhttp/plugins/ca.cleanup.appdata/include/exec.php(43): XML2Array::createArray('')
      #1 /usr/local/src/wrap_post.php(27): include('/usr/local/emht...')
      #2 {main} thrown in /usr/local/emhttp/plugins/ca.cleanup.appdata/include/xmlHelpers.php on line 197

      Any help is appreciated as I can't currently easily clean up, this plugin was working too well ;p

      EDIT: Actually, digging deeper, I'm getting similar issues even trying to add new applications. I can search, but when I go to install, it throws a similar error message.
  25. Whenever I install this image I get the following in the browser: Did I do something wrong? EDIT: After re-creating the container it seems to be working now. Not sure what happened though.