Rysz

Community Developer
  • Posts

    500
  • Joined

  • Last visited

  • Days Won

    2

Posts posted by Rysz

  1. 10 minutes ago, dicry said:

    Now I can see UPS LOAD, but it is still incomplete.

    5.png

     

    That's great, because from UPS Load the other missing variables can be calculated.

     

    Now set these settings:

    grafik.png.b7aa45d63a64d616efbf082b9df4d307.png

     

    Instead of 0, enter the nominal 2000VA and 1600W of your UPS here and you should now see the other variables 🙂
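To illustrate the kind of calculation meant here: a minimal sketch, assuming the UPS reports its load as a percentage of its nominal rating, so that the current real and apparent power can be estimated by scaling the nominal values. The function name "estimate_power" and the 25% load are made up for this example, and the plugin's actual calculation may differ.

```shell
#!/bin/sh
# Sketch only: estimate the current power draw from the load percentage
# and a nominal rating. "estimate_power" is a hypothetical helper name.
estimate_power() {
    # $1 = load in percent, $2 = nominal rating (W or VA)
    awk -v load="$1" -v nominal="$2" \
        'BEGIN { printf "%.0f\n", load / 100 * nominal }'
}

# Nominal ratings from this post (1600W / 2000VA), with an assumed 25% load.
echo "Real power:     $(estimate_power 25 1600) W"
echo "Apparent power: $(estimate_power 25 2000) VA"
```

With a nominal value of 0 the result is always 0, which would explain why nothing useful can be derived until the real ratings are entered.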

     

  2. On 5/20/2024 at 8:31 PM, PilaScat said:

    Any news for the APC Back-UP UPS?

    image.png.99914ae42e0fdd3e774f21cade10be3e.png

    After reboot it works for some time, then freezes

    image.thumb.png.ab9a1396ae81de86d20818d872d38abd.png

    Tried with USB power override without success

    EDIT: Saw the post about APC, reset the config, will see about the freezing

     

    nut-debug-20240520202951.zip 258.73 kB · 0 downloads

     

    Sorry, there's no news on that. It's still a known issue with both APCUPSD and NUT, likely because the UPS firmware is different or broken somehow (the latter being the more likely scenario, in my opinion). Again in my personal opinion, it's a very low-quality UPS (which makes sense at that price) that I personally wouldn't trust to protect my devices in its current state. I haven't read many really positive reviews of it so far either, apart from the price point. I think the only option, if it's still new, is to try to return it for a better model or series (APC, ...) that's known to work better with NUT. This is my personal opinion and not that of the NUT developers.

     

  3. 24 minutes ago, itimpi said:

    I would think your best bet is to reset the array (via Tools->New Config) and build new parity based on the remaining good drives.     You now have a healthy array based on those drives with their data intact.  You can add new drives at this point if you want using normal mechanisms.   If the Data recovery people can return any drives with their data intact then you copy the data off them back to the array and having done that decide if the returned drives are trustworthy enough to be put back into the array (as new drives).

     

    Ironically enough, this problem, although catastrophic in one sense, shows one of the benefits of Unraid: when you have multiple drive failures well beyond the fault tolerance level of the array, you only lose the data on the failed drives. With a traditional RAID-based array, ALL your data would have been lost.

     

    This, but before doing that I'd make sure those drives are in fact dead and not just dropped offline because of another unrelated issue.

  4. How did 7 drives just die and how are you so accepting of the fact?

    Did you try connecting those presumed dead drives directly to a PC to ensure they are in fact really dead?

    Are you sure your LSI isn't just dropping offline because of overheating, or that this isn't some sort of cabling issue?

     

    I'd take a step back and assess the situation once again before just sending out your drives somewhere.

    It'd certainly help to get better support here if you included your diagnostics package from your server, so please do.

     

  5. On 5/22/2024 at 9:01 AM, dicry said:

    May 22 13:38:49 Tower rc.nut: Writing NUT configuration...
    May 22 13:39:27 Tower flash_backup: adding task: /usr/local/emhttp/plugins/dynamix.my.servers/scripts/UpdateFlashBackup update
    May 22 13:41:04 Tower rc.nut: Updating permissions for NUT...
    May 22 13:41:04 Tower rc.nut: Checking if the NUT Runtime Statistics Module should be enabled...
    May 22 13:41:04 Tower rc.nut: Enabling the NUT Runtime Statistics Module...
    May 22 13:41:05 Tower rc.nut: Stopping the NUT services... 
    May 22 13:41:07 Tower rc.nut: Can't open /var/run/nut/snmp-ups-ups.pid: No such file or directory
    May 22 13:41:07 Tower rc.nut: Can't open /var/run/nut/snmp-ups-192.168.50.199.pid either: No such file or directory
    May 22 13:41:07 Tower rc.nut: Network UPS Tools - UPS driver controller 2.8.2
    May 22 13:42:56 Tower ool www[1692]: /usr/local/emhttp/plugins/nut-dw/scripts/start
    May 22 13:42:56 Tower rc.nut: WARNING: NUT was user-configured to disable power management for all USB devices.
    May 22 13:42:56 Tower rc.nut: WARNING: NUT is now forcing all USB devices to permanent [on] power state as requested...
    May 22 13:42:57 Tower rc.nut: Writing NUT configuration...
    May 22 13:43:28 Tower flash_backup: adding task: /usr/local/emhttp/plugins/dynamix.my.servers/scripts/UpdateFlashBackup update
    May 22 13:45:12 Tower rc.nut: Updating permissions for NUT...
    May 22 13:45:12 Tower rc.nut: Checking if the NUT Runtime Statistics Module should be enabled...
    May 22 13:45:12 Tower rc.nut: Enabling the NUT Runtime Statistics Module...
    May 22 13:45:58 Tower rc.nut: Startup timer elapsed, continuing...
    May 22 13:45:58 Tower rc.nut: Driver [ups] PID 26668 initially exceeded maxstartdelay and is still starting
    May 22 13:45:58 Tower rc.nut: Network UPS Tools - UPS driver controller 2.8.2
    May 22 13:49:14 Tower rc.nut: No supported device detected at [ups] (host 192.168.50.199)
    May 22 13:49:14 Tower rc.nut: Network UPS Tools - Generic SNMP UPS driver 1.31 (2.8.2)
    May 22 13:57:59 Tower ool www[15025]: /usr/local/emhttp/plugins/nut-dw/scripts/stop
    May 22 13:58:00 Tower rc.nut: Writing NUT configuration...
    May 22 13:58:30 Tower flash_backup: adding task: /usr/local/emhttp/plugins/dynamix.my.servers/scripts/UpdateFlashBackup update
    May 22 14:00:17 Tower rc.nut: Updating permissions for NUT...
    May 22 14:00:17 Tower rc.nut: Checking if the NUT Runtime Statistics Module should be enabled...
    May 22 14:00:17 Tower rc.nut: Enabling the NUT Runtime Statistics Module...
    May 22 14:00:18 Tower rc.nut: Stopping the NUT services... 
    May 22 14:00:20 Tower rc.nut: Can't open /var/run/nut/snmp-ups-ups.pid: No such file or directory
    May 22 14:00:20 Tower rc.nut: Can't open /var/run/nut/snmp-ups-192.168.50.199.pid either: No such file or directory
    May 22 14:00:20 Tower rc.nut: Network UPS Tools - UPS driver controller 2.8.2
    May 22 14:18:45 Tower ool www[5571]: /usr/local/emhttp/plugins/nut-dw/scripts/start
    May 22 14:18:45 Tower rc.nut: WARNING: NUT was user-configured to disable power management for all USB devices.
    May 22 14:18:45 Tower rc.nut: WARNING: NUT is now forcing all USB devices to permanent [on] power state as requested...
    May 22 14:18:46 Tower rc.nut: Writing NUT configuration...
    May 22 14:19:32 Tower flash_backup: adding task: /usr/local/emhttp/plugins/dynamix.my.servers/scripts/UpdateFlashBackup update
    May 22 14:19:47 Tower dhcpcd[2490]: bond1: failed to renew DHCP, rebinding
    May 22 14:19:47 Tower dhcpcd[2490]: bond1: leased 192.168.50.109 for 7200 seconds
    May 22 14:20:32 Tower flash_backup: adding task: /usr/local/emhttp/plugins/dynamix.my.servers/scripts/UpdateFlashBackup update
    May 22 14:21:01 Tower rc.nut: Updating permissions for NUT...
    May 22 14:21:01 Tower rc.nut: Checking if the NUT Runtime Statistics Module should be enabled...
    May 22 14:21:01 Tower rc.nut: Enabling the NUT Runtime Statistics Module...
    May 22 14:21:47 Tower rc.nut: Startup timer elapsed, continuing...
    May 22 14:21:47 Tower rc.nut: Driver [ups] PID 27409 initially exceeded maxstartdelay and is still starting
    May 22 14:21:47 Tower rc.nut: Network UPS Tools - UPS driver controller 2.8.2
    May 22 14:25:02 Tower rc.nut: No supported device detected at [ups] (host 192.168.50.199)
    May 22 14:25:02 Tower rc.nut: Network UPS Tools - Generic SNMP UPS driver 1.31 (2.8.2)
    May 22 14:39:25 Tower ool www[22665]: /usr/local/emhttp/plugins/nut-dw/scripts/start
    May 22 14:39:25 Tower rc.nut: WARNING: NUT was user-configured to disable power management for all USB devices.
    May 22 14:39:25 Tower rc.nut: WARNING: NUT is now forcing all USB devices to permanent [on] power state as requested...
    May 22 14:39:26 Tower rc.nut: Writing NUT configuration...
    May 22 14:39:34 Tower flash_backup: adding task: /usr/local/emhttp/plugins/dynamix.my.servers/scripts/UpdateFlashBackup update
    May 22 14:41:34 Tower flash_backup: adding task: /usr/local/emhttp/plugins/dynamix.my.servers/scripts/UpdateFlashBackup update
    May 22 14:41:40 Tower rc.nut: Updating permissions for NUT...
    May 22 14:41:40 Tower rc.nut: Checking if the NUT Runtime Statistics Module should be enabled...
    May 22 14:41:40 Tower rc.nut: Enabling the NUT Runtime Statistics Module...
    May 22 14:42:26 Tower rc.nut: Startup timer elapsed, continuing...
    May 22 14:42:26 Tower rc.nut: Driver [ups] PID 1311 initially exceeded maxstartdelay and is still starting
    May 22 14:42:26 Tower rc.nut: Network UPS Tools - UPS driver controller 2.8.2
    May 22 14:45:41 Tower rc.nut: No supported device detected at [ups] (host 192.168.50.199)
    May 22 14:45:41 Tower rc.nut: Network UPS Tools - Generic SNMP UPS driver 1.31 (2.8.2)
    May 22 14:46:27 Tower ool www[27261]: /usr/local/emhttp/plugins/nut-dw/scripts/stop
    May 22 14:46:28 Tower rc.nut: Writing NUT configuration...
    May 22 14:46:34 Tower flash_backup: adding task: /usr/local/emhttp/plugins/dynamix.my.servers/scripts/UpdateFlashBackup update
    May 22 14:48:46 Tower rc.nut: Updating permissions for NUT...
    May 22 14:48:46 Tower rc.nut: Checking if the NUT Runtime Statistics Module should be enabled...
    May 22 14:48:46 Tower rc.nut: Enabling the NUT Runtime Statistics Module...
    May 22 14:48:47 Tower rc.nut: Stopping the NUT services... 
    May 22 14:48:49 Tower rc.nut: Can't open /var/run/nut/snmp-ups-ups.pid: No such file or directory
    May 22 14:48:49 Tower rc.nut: Can't open /var/run/nut/snmp-ups-192.168.50.199:161.pid either: No such file or directory
    May 22 14:48:49 Tower rc.nut: Network UPS Tools - UPS driver controller 2.8.2
    May 22 14:49:01 Tower ool www[14930]: /usr/local/emhttp/plugins/nut-dw/scripts/start
    May 22 14:49:01 Tower rc.nut: WARNING: NUT was user-configured to disable power management for all USB devices.
    May 22 14:49:01 Tower rc.nut: WARNING: NUT is now forcing all USB devices to permanent [on] power state as requested...
    May 22 14:49:02 Tower rc.nut: Writing NUT configuration...
    May 22 14:49:35 Tower flash_backup: adding task: /usr/local/emhttp/plugins/dynamix.my.servers/scripts/UpdateFlashBackup update
    May 22 14:51:15 Tower rc.nut: Updating permissions for NUT...
    May 22 14:51:15 Tower rc.nut: Checking if the NUT Runtime Statistics Module should be enabled...
    May 22 14:51:15 Tower rc.nut: Enabling the NUT Runtime Statistics Module...
    May 22 14:52:01 Tower rc.nut: Startup timer elapsed, continuing...
    May 22 14:52:01 Tower rc.nut: Driver [ups] PID 31670 initially exceeded maxstartdelay and is still starting
    May 22 14:52:01 Tower rc.nut: Network UPS Tools - UPS driver controller 2.8.2
    May 22 14:55:17 Tower rc.nut: No supported device detected at [ups] (host 192.168.50.199:161)
    May 22 14:55:17 Tower rc.nut: Network UPS Tools - Generic SNMP UPS driver 1.31 (2.8.2)

     

    This is a log, and the MIB file is attached for your reference. I am currently using 192.168.50.199:161 but still unable to connect.

    UPS-RFC1628-MIB.zip 7.46 kB · 0 downloads

     

    It should work with your UPS because it uses the RFC standard.

    Can you try changing the settings as follows:

     

    grafik.png.3ea0beb8b5c2ddb7356d427bf281b85b.png

     

     

    grafik.thumb.png.03be2bd843b6e6ca3fcb2f3ace5fe2db.png

     

     

    If this doesn't work, please run these commands and post the results:

    nut-scanner -S -s 192.168.50.198 -e 192.168.50.200 -c public0
    nut-scanner -S -s 192.168.50.198 -e 192.168.50.200 -c private0
    nut-scanner -S -s 192.168.50.198 -e 192.168.50.200

     

    Last resort, check out this Chinese guide I found for your UPS with SNMP:

    https://post.smzdm.com/p/adm62dkk/

     

    They got it working with SNMPv3. It should definitely work with SNMPv1/SNMPv2 also, but in the worst case you can try the guide from this post and use SNMPv3 (which is far more complicated, but also more secure).

     

    Please let me know what worked for you in the end!

     

  6. 8 minutes ago, dicry said:

    The SNMP address in the image is 192.168.50.199, and SNMP has been configured. However, I can't enable it in Unraid. I suspect it's due to incorrect settings, but I haven't been able to find the reason.

    At the same time, I found in the Hardware Compatibility List that the Huawei UPS5000-E supports the SNMP-UPS driver, while my UPS2000-A series only supports the Huawei-UPS2000 driver.

    1.png

    2.png


    Can you try putting this as the UPS Driver Port: 192.168.50.199:161

    Then set Start Network UPS Tools Service to No --> Apply --> Yes --> Apply

    If it doesn't work then please post the relevant SYSLOG lines so I can see what NUT does... 🙂

     

    Also please download the MIB file from that page and post it here, so I can see if it is NUT compatible!

     

  7. On 5/15/2024 at 6:37 PM, tmodev said:

    Hello guys,

    Do any of you run Network UPS Tools with the PeaNUT Homepage integration? I had it working in the past when I was trying out an APC UPS. However, with my CyberPower UPS (way lower idle draw from the UPS itself) I cannot get this widget to work.

     

    Any ideas what i might have configured wrong?

    https://gethomepage.dev/latest/widgets/services/peanut/

     

          - tmoUPS:
                icon: https://cdn-icons-png.flaticon.com/512/2138/2138730.png
                description: tmo UPS
                siteMonitor: http://192.168.178.200:9990
                statusStyle: "dot"
                widget:
                    type: peanut
                    url: http://192.168.178.200:9990
                    key: ups
                    fields: ["battery_charge", "ups_load", "ups_status"]


    grafik.png.39dae29a1fcaa1537a5357dc4b49d4ea.png

    grafik.png

     

    Apparently you don't just need the widget, you also need the PeaNUT program running. "This widget requires an additional tool, PeaNUT, as noted. Other projects exist to achieve similar results using a customapi widget, for example NUTCase." => https://github.com/Brandawg93/PeaNUT

  8. 7 minutes ago, dicry said:

    Thank you for your help, it is now working correctly. I have another question: I am currently using a 2303 to USB adapter, but I have an additional SNMP intelligent card. However, according to the Hardware Compatibility List, it seems that the UPS2000-A series does not support the SNMP intelligent card. I am not sure if it will still work properly if I switch to the SNMP card.

    ups5.png

     

    It can also work with SNMP, because for SNMP you would use another driver and not "huawei-ups2000" (that is only for a physical connection to the UPS). If you use SNMP, you would use the driver "snmp-ups" and set the UPS Port to the IP of your SNMP card (see where the red arrow is in the screenshot).

     

    This would be an example SNMP configuration:

    grafik.png.08b1cb70f2b5c346d5089e35018ec88b.png


    But beware: you need to make sure that the network connection to your UPS stays available during a power outage, so any switches in between also need to be connected to the UPS.
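For reference, the SNMP setup described above roughly corresponds to a section like the following in NUT's ups.conf. This is only a sketch: the plugin normally writes that file itself from the GUI settings, and the helper name "snmp_ups_section", the community "public", SNMP version v1 and the "ietf" (RFC 1628) MIB mapping are assumptions for illustration, not values confirmed for this UPS. The IP is the SNMP card address mentioned elsewhere in this thread.

```shell
#!/bin/sh
# Sketch: print a ups.conf section for a UPS reached through an SNMP card.
# "snmp_ups_section" is a hypothetical helper; all values are examples.
snmp_ups_section() {
    # $1 = UPS name, $2 = IP (optionally IP:port) of the SNMP card,
    # $3 = SNMP community string
    cat <<EOF
[$1]
    driver = snmp-ups
    port = $2
    community = $3
    snmp_version = v1
    mibs = ietf
EOF
}

snmp_ups_section ups 192.168.50.199 public
```

The point of the sketch is only that "port" carries the network address of the card rather than a USB or serial device, which is why the "snmp-ups" driver is needed instead of "huawei-ups2000".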

     

  9. 2 minutes ago, dicry said:

    I am using the Huawei UPS2000-A-2KTTS | 2KVA / 1600W with a built-in battery tower UPS, but when I use the NUT Device Scanner, it shows the following message:

    usb3.thumb.png.1b1be493b38ebfef64caba01d087a003.png

    NO DEVICES FOUND Please visit the 'Driver Guide' and manually choose the recommended driver for your UPS. In case of persisting problems please try another physical port, cable or the 'Support Thread'.

    However, I checked the compatibility on the NUT Hardware Compatibility List and found that my UPS is supported. I am not sure what is wrong or if my settings are incorrect. I would appreciate any help with this issue.

    ups1.png

    ups2.png

    ups4.png

     

    Your settings are not correct, please try this configuration instead and let me know:

    grafik.png.0de3732afc53ccebf86f5c8c24eb94ce.png

  10. 15 minutes ago, thatja said:

     

    How would I find out about the rootfs-ramdisk being full? or likewise if a plugin is writing to it?

     

    I have 96GB of RAM in the server. I restarted the system via reboot over SSH, using an app on my phone called Termius; only the web UI SSH isn't responsive.

     

    OK that's very interesting because if you restarted via reboot command it should show more in the syslogs. It should show it shutting down services, the array etc... but there's nothing after your last SSH login, which again makes me think that the ramdisk is full or otherwise unwritable at that point.

     

    The next time it gets stuck, don't instantly reboot, but SSH into it first and run the following commands:

    df -h

    and

    cat /etc/mtab

    and

    ls -la /mnt

    Please post the output of those commands here then, before rebooting your server.
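If it's easier, the three commands can be wrapped in a small script so that everything ends up in one block of text. This is just a sketch; the function name "print_state" is made up, and it assumes a normal shell is still usable over SSH when the server is stuck.

```shell
#!/bin/sh
# Sketch: print the three requested outputs with separators so they can
# be pasted into one forum post. "print_state" is a hypothetical name.
print_state() {
    for cmd in "df -h" "cat /etc/mtab" "ls -la /mnt"; do
        echo "=== $cmd ==="
        # Keep going if a command fails and capture its error text too.
        $cmd 2>&1 || echo "(command failed with exit code $?)"
    done
}

print_state
```

Redirecting the output to a file on the flash drive (for example somewhere under /boot/logs) rather than to the ramdisk should let it survive the reboot, assuming the flash drive is still writable at that point.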

     

    Feel free to enable mergerFS again and wait for it to get stuck again, just so we can be sure. 🙂 

    Also... where did you put the mergerFS mount commands, how are you running them?

     

  11. 10 minutes ago, thatja said:

     

    The unclean shutdown was because power was pulled from the server, this wasn't a crash related to UNRAID but a power outage on my end, sorry for the confusion regarding that.

     

    The crashes today caused by UNRAID/Something else occurred at 10:30AMish and 10:50AMish. Those are what the syslogs above cover before/after events of.

     

    Well, there's nothing in the logs to indicate a failure of any kind around those times, related to mergerFS or not. But the fact that it fails to even generate a diagnostics package makes me think that the rootfs-ramdisk (at /) is either full (with some plugin writing to it non-stop and filling it up), not accessible or otherwise broken somehow. It isn't even able to write the syslog or any other files into the diagnostics package, which brings me back to my earlier suspicion that it has something to do with the RAM. How much RAM do you have in your server? And how did you shut down your server after it crashed? There's nothing in the logs anymore after your last SSH login to the crashed server.

     

  12. 4 minutes ago, thatja said:

     

    This is where I am stuck, the clean shutdown was because I could not get into ssh OR the UI, that was the first crash at 1AM.

     

    Secondly, the last 2 syslogs provided above were from before I restarted the server (after /mnt became inaccessible). They were from /boot/logs, as I did enable the syslog server.

     

    I have had the server with nothing running at all, no docker containers but Plex, no mergerfs and it was fine. As soon as I mounted my mounts, I got another crash. 

     

    OK and where are the logs from what happened before 01am?

    Because the server seems to have crashed and rebooted at 01am, we need to know what happened before.

     

    There's no indication in the logs that mergerFS isn't operating as it should.

    Quite the opposite, actually: the fact that it's doing garbage collection until the very end of your logs shows it's still running. 🤔

  13. Something is taking down your system at night:

    May 17 01:00:35 Plexified emhttpd: unclean shutdown detected

     

    Sorry, but you really need to start listening to us and re-trace your steps on what you have changed/updated recently... especially regarding your plugins. There are a ton of additional (and a few of them of quite invasive nature) plugins installed on your server... any of which could cause this issue. The fact that it isn't even able to generate a non-empty diagnostics package further underlines the fact that there's something seriously wrong with your server at the moment.

     

    Again, you need to start listening to the advice given, try disabling your plugins one-by-one and see if and when your server starts working again. You needing some plugins for your daily business doesn't change the fact that it's impossible to diagnose the problem without disabling some plugins at least temporarily. You also, as already pointed out by @JorgeB, need to set up the syslog server to see what is happening before the crashing and not just afterwards. So far we've only seen the logs after the system reboots, not from before, which would likely show the problem.

     

  14. 7 minutes ago, thatja said:

    Hi. I didn't reboot the server at 2.45AM.

     

    As for my mergerfs, it's a pretty simple command that has been working since I started using UNRAID in December.

     

    mergerfs -o defaults,allow_other,use_ino,fsname=mergerFS /mnt/nvmedl/plexified/mounts/google/Data/MoviesSrc/0000/:/mnt/nvmedl/plexified/mounts/google/Data/MoviesSrc/0001:/mnt/nvmedl/plexified/mounts/google/Data/MoviesSrc/0002:/mnt/nvmedl/plexified/mounts/google/Data/MoviesSrc/0003:/mnt/nvmedl/plexified/mounts/google/MoviesSrc/0004/ /mnt/nvmedl/plexified/mounts/moviesrc/Movies/

     

    AND

     

    mergerfs -o defaults,allow_other,use_ino,category.create=ff,fsname=mergerFS /mnt/user/plexdata/:/mnt/nvmedl/plexified/mounts/moviesrc=NC:/mnt/nvmedl/plexified/mounts/google/Data=NC /mnt/nvmedl/plexified/mounts/secret/

     

    Worth noting, that /mnt/user/plexdata is my array, the rest are all rclone mounts merged to make /mnt/nvmedl/plexified/mounts/secret/

     

    nvmedl is the name of my cache drive and it is an nvme as the name suggests.

     

    Looks good to me - and you're running this through array_start.sh or array_start_complete.sh, I'm guessing?

    Something definitely shut down your server before 02:45am, because the log starts with a server boot at 02:45am.

    Did you notice any parity checks or anything that would indicate an unclean shutdown has happened?

     

    Honestly if you changed nothing on the mergerFS scripts and they worked since December...

    I'd start looking at the GPU driver or a general RAM issue; might be worth running an extended memtest.

    ... to see if your RAM experiences any troubles after (x) hours of testing ...


    But @JorgeB is definitely more experienced at general support than me, so take this with a grain of salt.

    I don't think mergerFS is causing this, but as suggested I would try disabling it first and see if the problems still happen.
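Since the mount commands are long, a path typo in one branch is easy to miss. As a rough sketch, a check like the following could be run before the mergerfs calls to confirm that every branch directory actually exists at that moment (for example, that the rclone mounts are already up). The function name "check_branches" is made up, and stripping a mode suffix such as "=NC" assumes the branch syntax used in the commands above.

```shell
#!/bin/sh
# Sketch: verify that each branch of a colon-separated mergerfs branch
# list exists as a directory. "check_branches" is a hypothetical helper.
# The body runs in a subshell so the IFS change does not leak out.
check_branches() (
    missing=0
    set -f      # do not expand wildcard characters in paths
    IFS=:       # split the list on colons
    for branch in $1; do
        path=${branch%%=*}   # drop an optional mode suffix such as =NC
        if [ ! -d "$path" ]; then
            echo "missing: $path"
            missing=1
        fi
    done
    exit "$missing"
)

# Check the branch list given as the first argument, if there is one.
if [ -n "${1:-}" ]; then
    check_branches "$1"
fi
```

It returns a non-zero exit code if any branch is missing, so a start script could skip the mount and log a message instead of mounting an incomplete set of branches.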

  15. Just now, JorgeB said:

    It may well be, I just wanted the user to test without mergerfs to rule that out, since there's nothing else relevant logged that I can see that would explain folders going away.

     

    Yes, that's definitely a good idea, was already thinking a step further there. 😄 

  16. Can you please post the mergerFS scripts where you are setting up your mergerFS mounts?

    I see no actual errors regarding mergerFS, but let's see your scripts just to be sure. 🙂

    mergerFS garbage collection is normal and occurs every 15 minutes by default (according to manual).

     

    Also... the log posted starts with a system reboot at 02:45am - did you do this reboot?

    ... or did the system crash and reboot itself? Since you say trouble started at 02:50am.

    ... 02:50am would be after that 02:45am reboot, so was it a crash or user-triggered reboot?

     

    Also... just to provide a timeline here - since you say the troubles started around 5 days ago:

    The mergerFS backend (the actual binary) was last updated on 26/03/2024.

    The mergerFS frontend (calling your mergerFS mount scripts) was last updated on 26/04/2024.

    Those frontend changes were minor, only introducing a timeout so that array start cannot get stuck.

    So both these updates would have been way outside of the 5 days where you experienced trouble...

     

    But please do post your mergerFS mount scripts nevertheless, you never know! 🙂 

     

    @JorgeB: Seems more like a general system problem (perhaps RAM-related?) to me.

    It's also weird that diagnostics did not include a syslog, perhaps some problems writing to the rootfs (RAM-)disk?
    Also the user said /mnt itself was inaccessible, that directory should always exist regardless of any mounts being there.

     

  17. 2 hours ago, thatja said:

    Hello, my system seems to be crashing and I think mergerfs is the cause, this has only been happening recently, is there a way to downgrade to the previous version so I can test if it is indeed the new version of mergerfs?

     

    If so, how can I achieve this?

     

    Edit: I just saw your other support topic, let's continue this there for the sake of completeness:

     
