Posts posted by randommonth

  1. Hi everyone, an issue has popped up recently: my Plex server appears to refuse to transcode. Many users and clients are reporting the 'Not enough CPU' error message, or files just aren't working.

     

    I've got an Unraid server running an i7 8770 with 16 GB of RAM.

    PMS is the Binhex docker container. I have had hardware acceleration working perfectly for several years, until now.

     

    Server logs attached

     

    Thanks in advance.

    Plex Media Server Logs_2023-12-30_17-28-51.zip

  2. Hi guys,

     

    I have had issues with cache drive stability in the past when I was running dual SSDs in a pool. I reduced the cache down to a single drive and have experienced excellent stability for the last 6-9 months. I could not pinpoint the root cause, but I suspect some of the SATA ports on my motherboard are loose or faulty.

     

    I upgraded to 6.12.3 about a week ago and have seen the cache instability recur consistently. Every morning I wake to find the cache has degraded somehow overnight, either disappearing completely or simply being unreadable. I have managed to resolve it quickly each time by shutting down and rebooting, sometimes needing to stop the array and reassign the cache to the pool. Each time the cache has reappeared and worked fine.

     

    Find diagnostics attached.

     

    Thanks

     

     

    bigdaddy-diagnostics-20230816-0641.zip

  3. 10 hours ago, Marcel40625 said:

    Stop the array and remove it. The data should still exist on the second cache drive, or do you have just one?

     

    Thanks but I'm a bit confused now.

     

    I have two cache drives in a pool. I want to remove one. Is it as simple as stopping the array, removing the drive then restarting the array? Or do I have to go through the process of changing the cache preferences on applicable shares and running the mover?

  4. Hi guys, following on from months of errors in my cache pool, I recently nuked the pool and started over with some great help from @JorgeB.

     

    This morning my syslog was full and I found many errors coming from one of my cache drives (sdb1).

     

    I assume this is the 256 GB ADATA drive in my pool, see below.

     

    Before I go and replace this drive, I just want to make absolutely certain that it is the problematic drive in my pool.

     

     

    image.png.da8a683bc21d45bce7266d1122c1f513.png

    bigdaddy-diagnostics-20221124-1219.zip

  5. Hi everyone,

     

    Over the last few months I've had numerous problems with my cache pool, with @JorgeB providing a lot of very helpful advice. A few weeks back I nuked the pool and started afresh, or at least I thought I had started with a clean new cache pool. The pool has performed flawlessly ever since. However, I've noticed that the second drive, 'Cache 2', appears to have been on standby the entire time since starting over.

     

    The second drive being on standby coincides with the absence of all the earlier issues, which may confirm that it was the source of them. However, I want to know why it's on standby and whether that can be fixed.

     

    What are the next steps I should take in root causing the problem?

     

     

    image.png

  6. On 5/10/2022 at 7:40 PM, elcapitano said:

    Can't get reverse proxy to work for Organizr.

     

    I am sorry to have to ask for help on this, but I am at a loss.

     

    Prereq: 

    1) CNAME subdomain: "organizr.domain.xxx" has been created at Cloudflare

    2) Using SWAG - organizr.subdomain.conf has been configured (see attached)

    3) SWAG restarted

    4) Organizr is on docker's proxynet

    5) Container name is "organizr"

     

    What am I missing?

     

    Container is up and running locally...

     

    Skærmbillede 2022-05-10 kl. 11.29.41.png

     

    Hi everyone, I was having the same problems with the same setup and found a solution that others may appreciate.

     

    With the change of the Organizr app name to organizrv2, the "set $upstream_app" line should have organizrv2 in it.

     

    I think the SWAG sample conf files need updating to reflect the change in app name.
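
For anyone making the same change, here is a minimal sketch of the edit in Python. Only the `set $upstream_app` line and the rename from organizr to organizrv2 come from the post above. The conf fragment is an illustrative stand-in rather than the real SWAG sample file, and the helper name `retarget_upstream_app` is hypothetical.

```python
import re


def retarget_upstream_app(conf_text: str, new_name: str) -> str:
    """Point every `set $upstream_app ...;` line at new_name."""
    pattern = re.compile(
        r"^(\s*set\s+\$upstream_app\s+)\S+?(\s*;)", re.MULTILINE
    )
    # A function replacement avoids backreference surprises in new_name.
    return pattern.sub(
        lambda m: f"{m.group(1)}{new_name}{m.group(2)}", conf_text
    )


# Illustrative fragment only (assumed layout, not copied from SWAG).
sample_conf = """\
location / {
    set $upstream_app organizr;
    set $upstream_port 80;
}
"""

updated_conf = retarget_upstream_app(sample_conf, "organizrv2")
print(updated_conf)
```

The value has to match the container name on the Docker network, so if your container is named something else, use that name instead. SWAG needs a restart after the conf file changes.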

     

     

  7. Thanks but I got impatient yesterday and tried to recover the remaining drive using your guide here -

     

    When I mounted the remaining drive I saw that I wasn't going to be able to recover all the data, which confused me because I thought the data in a RAID 1 cache pool was mirrored? So I formatted the drive then recovered my Appdata and both drives appear to have come back online ok.

     

    Until this morning, when my Docker server appears to have failed. I took the attached diagnostics before rebooting.

     

    I'm getting sick of the abysmal reliability of my cache. My efforts yesterday were intended to remove the troublesome SSD but just caused more drama than I expected. How can I safely reduce my cache to a single drive so I can eliminate one possible source of issues?

    bigdaddy-diagnostics-20220924-0804.zip

  8. Hi @JorgeB, I've got a two drive cache pool and just tried to remove the first of my cache drives following the guide in the FAQ and I'm seeing some strange behaviour.

    I stopped the array and unassigned the first cache drive, then restarted the array. I couldn't see any cache activity, but there was a notification that the cache had returned to normal operation. However, under Pool Devices it says "Unmountable: No File System", my Appdata folder is empty and none of my Dockers are there.

    I then tried to reallocate the first cache drive into the first slot, but the system warned me that all data would be removed from this drive if I started the array, so I chose not to. I then reallocated the second cache drive to the first cache drive's slot and restarted the array, but I'm still getting the "Unmountable: No File System" error, and when I open the cache page in the web GUI, the Balance function is unavailable, saying it is only available when the array has started.

     

    Any help appreciated!!

    bigdaddy-diagnostics-20220923-1159.zip

  9. 10 hours ago, itimpi said:

    I suspect that is the time that you have mover scheduled to run and that activity associated with that is causing the issues.

     

    That's interesting - my mover is scheduled to run each morning at 3:40 am and my cache usually fails one hour later at 4:40 am.

     

  10. 58 minutes ago, JorgeB said:

    Did you replace the cables before this like I suggested or not? If yes it suggests a cable problem, if not it might be a power issue, do the SSDs share for example a power splitter?

     

    I don't think I swapped the cables before this, and yes, the SSDs are powered by a SATA power splitter, similar to this one - https://www.centrecom.com.au/8ware-sata-power-splitter-cable-15cm-1-x-15-pin-2-x-15-pin-male-to-female?gclid=CjwKCAjwsfuYBhAZEiwA5a6CDIfGyv1DRoInRPz151Wt6xyYMQgQZnE1B5W9u2sJEJPX4KhpttIsDhoClFIQAvD_BwE

     

  11. On 8/25/2022 at 5:05 PM, JorgeB said:

    Swap cables between both SSDs, if the Crucial keeps dropping replace it.

     

    Hey @JorgeB, I followed your suggestion about the BTRFS monitoring script and started scrubbing the cache every day, which appears to have worked for a few weeks, but then my cache failed again at the same time as usual (4:40am). I know you suggested it was the Crucial drive that was the issue previously, but when I rebooted this morning, it was the ADATA drive that failed to be recognised. I swapped the cables between the SSDs and rebooted and everything started up ok. Could you confirm if the diagnostics identify the culprit here? Also, why is it always happening at the same time of day?
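
To help narrow down which SSD is misbehaving, one option is to look at the per-device error counters that `btrfs device stats <mountpoint>` prints. The Python sketch below parses text in that layout and reports only the counters above zero. Whether the monitoring script mentioned above works this way is an assumption. The device names and numbers in the sample are invented placeholders, not values from the attached diagnostics, and `nonzero_counters` is a hypothetical helper name.

```python
import re

# One line of `btrfs device stats` output: "[/dev/sdX1].read_io_errs   0"
STAT_LINE = re.compile(
    r"^\[(?P<dev>[^\]]+)\]\.(?P<counter>\w+)\s+(?P<value>\d+)$"
)


def nonzero_counters(stats_text: str) -> dict:
    """Return {device: {counter: value}} for counters above zero."""
    problems = {}
    for line in stats_text.splitlines():
        match = STAT_LINE.match(line.strip())
        if not match:
            continue  # skip anything that is not a counter line
        value = int(match.group("value"))
        if value > 0:
            device = match.group("dev")
            problems.setdefault(device, {})[match.group("counter")] = value
    return problems


# Placeholder text; device names and numbers are invented for illustration.
sample_stats = """\
[/dev/sdX1].write_io_errs    0
[/dev/sdX1].read_io_errs     0
[/dev/sdX1].flush_io_errs    0
[/dev/sdX1].corruption_errs  0
[/dev/sdX1].generation_errs  0
[/dev/sdY1].write_io_errs    12
[/dev/sdY1].read_io_errs     3
[/dev/sdY1].flush_io_errs    0
[/dev/sdY1].corruption_errs  0
[/dev/sdY1].generation_errs  0
"""

flagged = nonzero_counters(sample_stats)
print(flagged)
```

A device that keeps accumulating write or read errors while the other stays at zero is the likelier culprit, although a bad cable or power connection can produce the same counters.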

     

    Thanks!!

    bigdaddy-diagnostics-20220912-0731.zip

  12. 14 hours ago, JorgeB said:

    And it's not the first time, check/replace cables, then run a scrub and check all errors were corrected, see here for more info and better pool monitoring.

     

    Hi @JorgeB, thanks for the advice but my server failed again overnight, 24 hours after the last failure (diagnostics attached). Interestingly these failures appear to happen at the same time, 4:40am. I've gone back through my notification emails and 4:40am is the time for about 80% of them. Could this be just the time that the plugin runs its checks or could there be another process triggering it?

     

    When I restarted the server after the failure yesterday, the Docker service still appeared to be corrupted. So I deleted the Docker folder through the GUI and restarted the server, expecting to have to reload all my Dockers from templates. However upon restart, all my Docker containers were there and running fine??

     

    I ran the scrub per your advice last night and got these results:

     

    UUID: b22551d5-574f-45b3-a984-7648d094271c

    Scrub started: Wed Aug 24 22:06:11 2022

    Status: finished

    Duration: 0:09:21

    Total to scrub: 137.94GiB

    Rate: 231.54MiB/s

    Error summary: read=16457092 super=3

    Corrected: 0

    Uncorrectable: 16457092

    Unverified: 0
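
As a rough way to check a report like this automatically, the Python sketch below splits the "Key: value" lines into a dict and tests whether anything was left uncorrectable. The report text is the output quoted above. The helper names `parse_scrub_report` and `scrub_is_clean` are hypothetical, and treating a non-zero "Uncorrectable" count as the failure signal is an assumption of this sketch.

```python
def parse_scrub_report(report: str) -> dict:
    """Turn the 'Key: value' lines of a scrub report into a dict."""
    fields = {}
    for line in report.splitlines():
        # Split on the first colon only, so times like 0:09:21 stay whole.
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def scrub_is_clean(fields: dict) -> bool:
    """Assumed rule: clean means no uncorrectable errors were reported."""
    return int(fields.get("Uncorrectable", "0")) == 0


report = """\
UUID: b22551d5-574f-45b3-a984-7648d094271c
Scrub started: Wed Aug 24 22:06:11 2022
Status: finished
Duration: 0:09:21
Total to scrub: 137.94GiB
Rate: 231.54MiB/s
Error summary: read=16457092 super=3
Corrected: 0
Uncorrectable: 16457092
Unverified: 0
"""

fields = parse_scrub_report(report)
print("Corrected:", fields["Corrected"])
print("Uncorrectable:", fields["Uncorrectable"])
print("Clean:", scrub_is_clean(fields))
```

For this report the check comes back not clean: every one of the read errors is counted as uncorrectable and none were corrected.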

     

    All my cables are brand new, so what else could be the problem? Your advice yesterday (besides cables) was to set up better monitoring, but that doesn't appear to address any root cause; it would just give me earlier warning before a failure.

     

    One of the SSDs (ADATA) is much older than the other. If it is the one failing, could I try removing it from the pool and reverting to a single cache drive for a while?

     

    I'm very appreciative of all advice, just getting a bit frustrated here!!

     

    Thanks

     

    bigdaddy-diagnostics-20220825-0754.zip

  13. Hi guys,

     

    I've just discovered my server crashed again overnight and is displaying the same usual symptoms. The Docker service failed to start, with errors about being unable to write to the cache and to the Docker image. Interestingly, this time both cache drives still appeared online under the Main tab. I tried to resolve this by moving the SATA cable from one port on the motherboard to another, but that created more issues with the cache being unreadable, so I moved the cables back to the ports they were in originally and the system appears to have restarted OK. I can't help but assume that one or more SATA ports on my motherboard are faulty?

     

    bigdaddy-diagnostics-20220824-0613.zip

  14. On 6/28/2022 at 7:10 PM, JorgeB said:

    Device is dropping offline, this is usually a power/connection problem, suggest replacing/swapping both cables, power and SATA.

     

    Hi everyone, I've swapped all the power and data cables on my SSDs but I keep getting errors that crash the server roughly every 14 days. Most recently, my Dockers failed with a 'Server error 403' message. I've attached the diagnostics file from that moment. I can resolve this by shutting down, jiggling the SATA cables at their connection to the motherboard and restarting. Is it possible that my motherboard SATA connections are loose? If so, how do I confirm this before buying a new motherboard?

    bigdaddy-diagnostics-20220817-1930.zip
