Everything posted by xreyuk

  1. Thanks, I think it may have been my 'snapshots' dir causing it. I use SpaceInvaderOne's replication script to replicate my ZFS cache snapshots to the array. It's possible this share wasn't being cached before, as it was set up after Cache Dirs was, but I didn't think adding a share to Cache Dirs would cause such a CPU spike! I'm still seeing smaller CPU jumps over the same time period, which is only a minor issue, but is there a setting I can use to adjust these? I don't write to the array very often, so I probably don't need Cache Dirs updating this frequently.
  2. I'm now having the spiking CPU issue with the latest version of Cache Dirs. The below is from a 30-second window of CPU activity. I didn't have this problem with the old 'incompatible' version. My settings are the defaults that come with the plugin (including /mnt/user scanning turned off), except that I have excluded any directories on my cache, so only the array drives are cached by the plugin, none of which were being written to at the time (the disks are all spun down). These are the settings I've always had and it's been fine. Disabling the plugin completely stops this behaviour. Any ideas?
  3. Install the File Activity plugin - it should tell you more about what is being accessed when the disks spin up. Do you have the Recycle Bin plugin installed? If that's set to run daily, it might be spinning the disks up every night.
  4. I have not had any problems with Dockers since upgrading; however, Fix Common Problems is recommending I install this patch anyway. Is it something you'd recommend installing 'just in case'?
  5. The upgrade seems to have worked fine for me. It also seems to have stopped my custom Docker network assigning IPv6 addresses even though IPv6 was disabled, which is nice. Upgraded from 6.12.6.
  6. Thanks, the syslogs are from a period when I know the server was working up to when it crashed - e.g. I'd been using it the night before so knew it was good - and I didn't see it as relevant to post syslog information from before that. I'll have to get it set up to run in safe mode in a couple of days, as I'll have to make provisions to do that (make sure DNS is set up, etc.). All the hardware is new and the RAM was memtested when the server was built. I've seen so many posts here in the last few days with the same symptoms that I'm starting to think it's a bug in Unraid.
  7. Hi, I have Unraid running on the following hardware:

Intel Core i3-12100
16GB Corsair Vengeance 3200MHz RAM
Gigabyte H610i Mini ITX motherboard
Corsair RMx 650W PSU
4 x 4TB Seagate IronWolf HDDs in the array, formatted XFS
1 x Crucial P3 Plus 500GB NVMe in the cache pool, formatted ZFS

I have Dockers for Homebridge, Plex, Cloudflare DDNS, SABnzbd, Sonarr/Radarr/Notifiarr and a Homarr dashboard. No CPU pinning has been done except for Plex, which is pinned to cores 3 and 4; the other Dockers can use all available cores. I have one Debian VM, assigned to core 2 (to avoid it using resources Unraid might need on core 1), with 1GB RAM, running Pi-hole and Unbound. I use powertop with --auto-tune to tune the server so it can reach C10 states; however, one of the crashes happened while a PCI card was installed that prevented it going below C2, so I don't think this is the problem.

I have been experiencing frequent crashes since I installed Unraid at the start of January: I have logs for 4, but have experienced 5. 4 out of 5 crashes left Unraid unreachable and non-responsive to ping (including the VM on its own IP) while the box still had power, and I had to power off and on manually to get access back. For 1 crash I was able to connect a keyboard/mouse/monitor and get a login prompt, which seemed to bring it back to life. Parity checks are successful after the unclean shutdowns. The details and relevant diagnostics/syslogs are below. I have included syslogs from the time I know the server to have been good up to when I noticed it had crashed and needed a reboot. This covers crashes 2-5, as crash 1 left no logs and I didn't take diagnostics.

Crash 2 - 3rd Feb: Occurred in the early hours of the morning while I was sleeping. I woke up to the box not responding to any login attempts and the GUI inaccessible, although all of my Dockers and the VM were still working. I managed to connect a monitor/mouse/keyboard and this gave me an Unraid login, after which the GUI became accessible again. The server had previously been up since 31st Jan. 'Syslog 2' and 'Diagnostics 2' relate to this crash; the diagnostics were taken a bit later in the day as I didn't have a chance when I noticed the crash. The syslog suggests to me it could be related to running out of memory (for whatever reason).

Crash 3 - 3rd Feb: I noticed this one around 8PM in the evening. Unraid, all my Dockers and the VM became unresponsive, and the server would not respond to pings. No automated tasks were scheduled between the two crashes that day (except for whatever Sonarr/Radarr do) and the server had been pretty much idle most of the day. I had to power off and on via the power button to get the server back. 'Syslog 3' and 'Diagnostics 3' contain the information for this; the diagnostics were obviously taken after the reboot. The syslog seemed to me to indicate an nginx crash may be the cause here.

Crash 4 - 10th Feb: This time it crashed again in the early hours of the morning, and I woke up to exactly the same symptoms as crash 3: unresponsive to pings on both Unraid and the VM, server still powered on. Had to power off/on via the power button to get the server back. 'Syslog 4' and 'Diagnostics 4' relate to this crash, diagnostics again taken after the reboot. The syslog is literally empty, so I have no idea.

Crash 5 - 11th Feb: Occurred around 9PM in the evening, with exactly the same symptoms as the previous two crashes. 'Syslog 5' and 'Diagnostics 5' in the attached files are relevant. Again the syslog looks literally empty to me, so I have no idea.

The only 'weird' things I can see in the logs are one nginx crash, which could be related, and that my Dockers are being assigned IPv6 addresses despite IPv6 being disabled in Unraid settings (my custom Docker network also reports IPv6 as disabled). The crashes only seem to occur when the HDDs are spun down, so the system is at very light load, basically only serving DNS from Unbound/Pi-hole. Any help would be greatly appreciated.

Diagnostics 3.zip Diagnostics 4.zip Diagnostics 5.zip Syslog 2.txt Syslog 3.txt Syslog 4.txt Syslog 5.txt Diagnostics 2.zip
  8. I won't hijack this thread completely because I'll be making a new one, but I'm having exactly the same symptoms as this. Sometimes the server needs rebooting hours later, but the longest I've had it stable is just shy of 7 days. However, my syslog simply points at nothing.
  9. I keep getting crashes on Unraid which don't seem to have a consistent pattern in the logs. I believe powertop is enabling autosuspend for my USB ports - could this possibly affect Unraid and cause it to crash?
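A quick way to check this - offered only as a sketch using the standard Linux sysfs runtime-PM interface, not a confirmed diagnosis, and the "1-4" bus ID below is a hypothetical example:

    # show each USB device's runtime PM setting; 'auto' means autosuspend is allowed, 'on' keeps it awake
    for d in /sys/bus/usb/devices/*/power/control; do
        echo "$d: $(cat "$d")"
    done
    # force a specific device (e.g. the boot flash) to stay awake
    echo on > /sys/bus/usb/devices/1-4/power/control

If the crashes stop with the flash drive forced 'on', that would point at USB autosuspend; if not, it can simply be set back to 'auto'.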
  10. For what it's worth, I've just found this thread because I believe I'm having this issue running 6.12.6. The syslog shows an nginx crash, the whole server becomes unresponsive (even to pings), and I have to power off with the power button. It seems to happen when the server has been idle for a couple of hours.
  11. Confirmed it is definitely the card causing this. I started the server with the PCIe card removed and was able to instantly hit C10 on the package as a whole. I'm using the same ASM1166 card everyone else is using, it's on a later firmware than the one that fixes the power-states issue, and as you can see from the quoted post, it does support ASPM. The only other thing I've seen is that someone had to compile their own kernel to get it to go lower. Any ideas on anything I can do, or am I just stuck?
  12. No, it's GbE - and without the PCI card plugged in I am able to reach package state C10, so I think this is somehow related to the card/PCI slot. All the tuneables are listed as 'good' in powertop after the --auto-tune
  13. I'd have no idea how to pull it, but if someone could tell me how, I wouldn't mind. I bought the card off Amazon UK; it was the MZHOU one everyone else appears to be buying.
  14. So I have changed my card to an ASM1166 card which is running a high enough firmware version to support ASPM. This is the output of lspci:

00:01.0 PCI bridge: Intel Corporation 12th Gen Core Processor PCI Express x16 Controller #1 (rev 05) (prog-if 00 [Normal decode])
    LnkCap: Port #2, Speed 32GT/s, Width x16, ASPM L1, Exit Latency L1 <16us
    LnkCtl: ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+
00:1c.0 PCI bridge: Intel Corporation Alder Lake-S PCH PCI Express Root Port #1 (rev 11) (prog-if 00 [Normal decode])
    LnkCap: Port #1, Speed 8GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <1us, L1 <4us
    LnkCtl: ASPM L0s L1 Enabled; RCB 64 bytes, Disabled- CommClk-
00:1c.4 PCI bridge: Intel Corporation Alder Lake-S PCH PCI Express Root Port #5 (rev 11) (prog-if 00 [Normal decode])
    LnkCap: Port #5, Speed 8GT/s, Width x4, ASPM L1, Exit Latency L1 <64us
    LnkCtl: ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+
01:00.0 SATA controller: ASMedia Technology Inc. ASM1166 Serial ATA Controller (rev 02) (prog-if 01 [AHCI 1.0])
    LnkCap: Port #0, Speed 8GT/s, Width x2, ASPM L0s L1, Exit Latency L0s <4us, L1 <64us
    LnkCtl: ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+
03:00.0 Non-Volatile memory controller: Micron/Crucial Technology P2 [Nick P2] / P3 / P3 Plus NVMe PCIe SSD (DRAM-less) (rev 01) (prog-if 02 [NVM Express])
    LnkCap: Port #1, Speed 16GT/s, Width x4, ASPM L1, Exit Latency L1 unlimited
    LnkCtl: ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+

So I am using the same settings as before installing the card, where I could reach package C10 states. I have applied --auto-tune, the disks seem to spin up and down as expected, and all tuneables are listed as 'good'. However, now I can only reach a package state of C2 and a core state of C10. Does anyone know how I can troubleshoot why the package is no longer going to C10 with the PCI card installed? I have tried disabling all Dockers and VMs I have installed; I am still running the same plugins (no new ones) and I am still not able to reach C10 with these disabled.
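One thing worth checking here - a sketch rather than a confirmed fix - is whether the kernel's ASPM policy is the limiting factor. These are standard Linux paths; the write may fail if the BIOS has locked ASPM, and the 01:00.0 address is taken from the lspci output above:

    cat /sys/module/pcie_aspm/parameters/policy                      # active policy is shown in [brackets]
    echo powersupersave > /sys/module/pcie_aspm/parameters/policy    # most aggressive policy, allows L1 substates
    lspci -vv -s 01:00.0 | grep -E 'LnkCap|LnkCtl'                   # confirm ASPM is still reported Enabled on the ASM1166

After that, powertop's Idle stats tab should show whether the package starts dropping below C2.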
  15. Hi, I have just bought an ASM1166 card. I plugged it into a Windows PC to do the firmware upgrade; however, the firmware on the card is 221118-003E-00, which is a higher number than the 211108-0000-00 recommended in this thread. Do I need to change the firmware or not, do we think? Thanks.
  16. Could I just use the --auto-tune settings if I get an ASM1166 card and make sure the firmware is upgraded? *EDIT* I saw another post - I can. I'll return the JMB585 one and get an ASM1166 card. Thanks.
  17. Hi all, I have previously managed to achieve the C10 state on my machine with BIOS settings and using --auto-tune; however, I needed to expand my storage earlier than expected and so have had to install a PCIe-to-SATA card with the JMicron JMB585 controller. From my understanding, using --auto-tune is no longer possible, as it stops the drives connected to this card spinning up. Therefore I need to use the CLI commands and a startup script to achieve whatever power savings I can (I realise the PCIe card may stop me reaching low power states too). My output to see whether ASPM is enabled is as follows:

00:01.0 PCI bridge: Intel Corporation 12th Gen Core Processor PCI Express x16 Controller #1 (rev 05) (prog-if 00 [Normal decode])
    LnkCap: Port #2, Speed 32GT/s, Width x16, ASPM L1, Exit Latency L1 <16us
    LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk-
00:1c.0 PCI bridge: Intel Corporation Alder Lake-S PCH PCI Express Root Port #1 (rev 11) (prog-if 00 [Normal decode])
    LnkCap: Port #1, Speed 8GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <1us, L1 <4us
    LnkCtl: ASPM L0s L1 Enabled; RCB 64 bytes, Disabled- CommClk-
00:1c.4 PCI bridge: Intel Corporation Alder Lake-S PCH PCI Express Root Port #5 (rev 11) (prog-if 00 [Normal decode])
    LnkCap: Port #5, Speed 8GT/s, Width x4, ASPM L1, Exit Latency L1 <64us
    LnkCtl: ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+
01:00.0 SATA controller: JMicron Technology Corp. JMB58x AHCI SATA controller (prog-if 01 [AHCI 1.0])
    LnkCap: Port #0, Speed 8GT/s, Width x2, ASPM not supported
    LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk-
03:00.0 Non-Volatile memory controller: Micron/Crucial Technology P2 [Nick P2] / P3 / P3 Plus NVMe PCIe SSD (DRAM-less) (rev 01) (prog-if 02 [NVM Express])
    LnkCap: Port #1, Speed 16GT/s, Width x4, ASPM L1, Exit Latency L1 unlimited
    LnkCtl: ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+

All ASPM/power options are enabled in the BIOS (as previously mentioned, but double-checked), and the PCIe-to-SATA controller is plugged into the PCIe x16 slot. I can see that ASPM is disabled on the x16 controller and on the PCIe-to-SATA card - do I need to get this enabled somehow, or is this expected? (There are no explicit BIOS options to enable it.) Furthermore, here is the output of 'bad' devices from powertop before the card was installed. All of these auto-tuned correctly.
Bad  Enable SATA link power management for host6
Bad  Enable SATA link power management for host7
Bad  Enable SATA link power management for host5
Bad  Enable SATA link power management for host3
Bad  Enable SATA link power management for host1
Bad  Enable SATA link power management for host8
Bad  Enable SATA link power management for host4
Bad  Enable SATA link power management for host2
Bad  VM writeback timeout
Bad  Autosuspend for USB device Flash Drive [Samsung]
Bad  Runtime PM for disk sdd
Bad  Runtime PM for disk sda
Bad  Runtime PM for PCI Device Intel Corporation Alder Lake-S PCH SATA C
Bad  Runtime PM for disk sdc
Bad  Runtime PM for port ata1 of PCI device: Intel Corporation Alder Lak
Bad  Runtime PM for port ata2 of PCI device: Intel Corporation Alder Lak
Bad  Runtime PM for port ata3 of PCI device: Intel Corporation Alder Lak
Bad  Runtime PM for port ata4 of PCI device: Intel Corporation Alder Lak
Bad  Runtime PM for port ata5 of PCI device: Intel Corporation Alder Lak
Bad  Runtime PM for port ata6 of PCI device: Intel Corporation Alder Lak
Bad  Runtime PM for port ata7 of PCI device: Intel Corporation Alder Lak
Bad  Runtime PM for disk sde
Bad  Runtime PM for port ata8 of PCI device: Intel Corporation Alder Lak
Bad  Runtime PM for PCI Device Intel Corporation Alder Lake-S PCH Serial
Bad  Runtime PM for PCI Device Intel Corporation Alder Lake-S PCH Serial
Bad  Runtime PM for PCI Device Intel Corporation Alder Lake-S PCH Serial
Bad  Runtime PM for PCI Device Intel Corporation Ethernet Connection (17
Bad  Runtime PM for PCI Device Intel Corporation Alder Lake-S PCH PCI Ex
Bad  Runtime PM for PCI Device Intel Corporation Alder Lake-S PCH Shared
Bad  Runtime PM for PCI Device Intel Corporation Alder Lake-S PCH Serial
Bad  Runtime PM for PCI Device Intel Corporation Device 4630
Bad  Runtime PM for PCI Device Intel Corporation Alder Lake-S PCH Serial
Bad  Runtime PM for PCI Device Intel Corporation Alder Lake-S PCH Serial
Bad  Runtime PM for PCI Device Intel Corporation Alder Lake-S PCH SPI Co
Bad  Runtime PM for PCI Device Micron/Crucial Technology P2 [Nick P2] /
Bad  Runtime PM for PCI Device Intel Corporation Device 7a87
Bad  Runtime PM for disk sdb

Here is the output from after I've installed the card.
Bad  Enable SATA link power management for host6
Bad  Enable SATA link power management for host7
Bad  Enable SATA link power management for host8
Bad  Enable SATA link power management for host13
Bad  Enable SATA link power management for host11
Bad  Enable SATA link power management for host5
Bad  Enable SATA link power management for host3
Bad  Enable SATA link power management for host1
Bad  Enable SATA link power management for host12
Bad  Enable SATA link power management for host10
Bad  Enable SATA link power management for host4
Bad  Enable SATA link power management for host2
Bad  VM writeback timeout
Bad  Enable SATA link power management for host9
Bad  Autosuspend for USB device Flash Drive [Samsung]
Bad  Runtime PM for PCI Device Micron/Crucial Technology P2 [Nick P2] / P3 / P3 Plus NVMe PCIe SSD (DRAM-less)
Bad  Runtime PM for port ata6 of PCI device: Intel Corporation Alder Lake-S PCH SATA Controller [AHCI Mode]
Bad  Runtime PM for PCI Device Intel Corporation Alder Lake-S PCH Serial IO I2C Controller #3
Bad  Runtime PM for port ata5 of PCI device: Intel Corporation Alder Lake-S PCH SATA Controller [AHCI Mode]
Bad  Runtime PM for disk sda
Bad  Runtime PM for disk sdb
Bad  Runtime PM for disk sdc
Bad  Runtime PM for disk sdd
Bad  Runtime PM for disk sde
Bad  Runtime PM for disk sdf
Bad  Runtime PM for PCI Device Intel Corporation Alder Lake-S PCH SATA Controller [AHCI Mode]
Bad  Runtime PM for port ata2 of PCI device: Intel Corporation Alder Lake-S PCH SATA Controller [AHCI Mode]
Bad  Runtime PM for port ata3 of PCI device: Intel Corporation Alder Lake-S PCH SATA Controller [AHCI Mode]
Bad  Runtime PM for PCI Device JMicron Technology Corp. JMB58x AHCI SATA controller
Bad  Runtime PM for port ata1 of PCI device: Intel Corporation Alder Lake-S PCH SATA Controller [AHCI Mode]
Bad  Runtime PM for port ata4 of PCI device: Intel Corporation Alder Lake-S PCH SATA Controller [AHCI Mode]
Bad  Runtime PM for port ata7 of PCI device: Intel Corporation Alder Lake-S PCH SATA Controller [AHCI Mode]
Bad  Runtime PM for PCI Device Intel Corporation Ethernet Connection (17) I219-V
Bad  Runtime PM for PCI Device Intel Corporation Alder Lake-S PCH Serial IO I2C Controller #5
Bad  Runtime PM for PCI Device Intel Corporation Alder Lake-S PCH PCI Express Root Port #1
Bad  Runtime PM for port ata8 of PCI device: Intel Corporation Alder Lake-S PCH SATA Controller [AHCI Mode]
Bad  Runtime PM for PCI Device Intel Corporation Alder Lake-S PCH Serial IO I2C Controller #1
Bad  Runtime PM for PCI Device Intel Corporation Device 7a87
Bad  Runtime PM for PCI Device Intel Corporation Device 4630
Bad  Runtime PM for port ata10 of PCI device: JMicron Technology Corp. JMB58x AHCI SATA controller
Bad  Runtime PM for port ata11 of PCI device: JMicron Technology Corp. JMB58x AHCI SATA controller
Bad  Runtime PM for port ata12 of PCI device: JMicron Technology Corp. JMB58x AHCI SATA controller
Bad  Runtime PM for port ata13 of PCI device: JMicron Technology Corp. JMB58x AHCI SATA controller
Bad  Runtime PM for PCI Device Intel Corporation Alder Lake-S PCH SPI Controller
Bad  Runtime PM for PCI Device Intel Corporation Alder Lake-S PCH Serial IO I2C Controller #4
Bad  Runtime PM for PCI Device Intel Corporation Alder Lake-S PCH Serial IO I2C Controller #2
Bad  Runtime PM for PCI Device Intel Corporation Alder Lake-S PCH Serial IO I2C Controller #0
Bad  Runtime PM for port ata9 of PCI device: JMicron Technology Corp. JMB58x AHCI SATA controller
Bad  Runtime PM for PCI Device Intel Corporation Alder Lake-S PCH Shared SRAM

So from what I can see, I would need to run the commands for all disks except sdf, as that did not exist before and will be the new one connected to the controller (confirmed from the Main tab as well). I would need to apply the SATA link power management for hosts 1 to 8, and the runtime PM for ports ata1 to ata8. However, beyond that I am unable to identify which other tweaks to apply, or which devices listed under /sys/ they need to be applied to. I would basically like to do as much power tuning as possible without messing up the new controller and the disk currently connected to it (or future disks). Is anyone able to advise how to find these devices and then use that information to run the commands manually?
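For what it's worth, a rough sketch of the manual equivalents of those powertop tunables, limited to the onboard controller and the existing disks as described above. The host numbers, disk letters and PCI address are assumptions taken from the listings in this post - verify them on the target system (ls /sys/class/scsi_host, the Main tab, lspci) before running anything, e.g. from the go file or a User Scripts job at first array start:

    #!/bin/bash
    # SATA link power management for the onboard PCH ports only (host1-host8 above);
    # med_power_with_dipm is the conservative choice, powertop itself may use min_power
    for h in 1 2 3 4 5 6 7 8; do
        echo med_power_with_dipm > /sys/class/scsi_host/host$h/link_power_management_policy
    done
    # VM writeback timeout (what powertop's "VM writeback timeout" tunable adjusts)
    echo 1500 > /proc/sys/vm/dirty_writeback_centisecs
    # Runtime PM for the existing disks, skipping sdf (the disk on the JMB585)
    for d in sda sdb sdc sdd sde; do
        echo auto > /sys/block/$d/device/power/control
    done
    # Runtime PM for a PCI device, e.g. the onboard SATA controller; 00:17.0 is a typical
    # Alder Lake address and is an assumption - confirm the address with lspci
    echo auto > /sys/bus/pci/devices/0000:00:17.0/power/control
    # The ata port entries sit under the same controller's sysfs directory, e.g.:
    # echo auto > /sys/bus/pci/devices/0000:00:17.0/ata1/power/control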
  18. Hi, I have a server with a total of 16GB RAM. My Unraid dashboard currently shows 5.79GiB (6.2GB) of RAM in use. If I use htop, sort by RES and manually add everything up, there is around 2.5GB in use; if I add the 100% ZFS ARC usage to that (2GB, 1/8 of total RAM), that makes 4.5GB total. If I use free --mega, it shows 5.5GB used, 345MB shared, 9.66GB buff/cache. If I look at what's actually being used by the Dockers and VMs I have (reserved and reported by Unraid), I believe I'm using around 4.5GB. Why are all three showing different usage figures, and how do I know which is the most accurate? Diagnostics file: diagnostics-20240118-1442.zip
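One part of this can be read directly rather than inferred: the ZFS ARC exposes its current size through the standard OpenZFS kstats, and since the ARC is not part of the ordinary page cache it won't show up in htop's per-process RES totals. A small sketch (stock OpenZFS paths):

    # current ARC size in GiB
    awk '$1 == "size" {printf "ARC size: %.2f GiB\n", $3/1024/1024/1024}' /proc/spl/kstat/zfs/arcstats
    free --mega    # compare 'used' and 'buff/cache' against the ARC figure above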
  19. Hi all, I've set my docker image to 15GB in the Docker settings, but when I look at the share where the image is saved, I can see docker.img is 16.1GB. Could anyone advise why this is? My containers only use around 2-3GB of the image.
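One possible explanation, offered only as arithmetic to check rather than a confirmed answer: if the 15GB setting is interpreted as 15 GiB, then 15 x 1024^3 bytes is roughly 16.1 x 10^9 bytes, i.e. exactly 16.1GB when the file size is reported in decimal gigabytes.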
  20. Ahh okay, no worries, thanks. From my 'extensive' testing today, turning off turbo boost and going into power-saver mode saves about 15W under a load of SABnzbd downloading, but doesn't really have an effect at idle. However, disabling the 'boost' feature in my BIOS, for whatever reason, saves me basically 50% at idle, so I think I'll run with this for the foreseeable future, until I notice a performance impact. Thanks.
  21. Thanks. The original guide says to disable turbo boosting, but if you think that won't affect it much I'll re-enable it. I'm assuming that if I re-enable boosting in the BIOS, I can just use the Tips and Tweaks plugin to disable/re-enable it? If I do this, I'm assuming my normal governor should still be set to powersave? Finally, can I just use the tune-all command in my go file to run it at startup, rather than using the individual commands? Thank you for all your help.
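As a sketch of what that could look like - assuming the intel_pstate driver and that appending to /boot/config/go is acceptable (the Tips and Tweaks plugin can manage the governor and turbo instead):

    # appended to the end of /boot/config/go
    powertop --auto-tune                 # apply all powertop tunables at boot
    echo powersave | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor > /dev/null
    # turbo can also be toggled at runtime with intel_pstate: 1 disables, 0 enables
    # echo 0 > /sys/devices/system/cpu/intel_pstate/no_turbo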
  22. Thanks, this seems to have made it work. Idle has dropped from 20-23W to 11-13W, so a good power saving. Once I left it a bit longer after the image below, I was dropping into the C10 state. My question is: do you think this will affect Plex transcoding? I'm using hardware transcoding on the Intel iGPU - will turning boost off have any effect on that? I'll see how server performance goes otherwise with these settings enabled. I'm running Plex/SABnzbd/Sonarr/Radarr/Pi-hole/Unbound Dockers, so I think even without turbo boost I should be okay?
  23. Thanks, I'll give this a go and report back. It's not that consumption isn't already low on my current system - it's at 20W idle with the disks spun down - but I'd like to get even lower if possible.
  24. Thank you for this guide; it seems to be working for me, as I can see items being created in the sessions folder on the RamScratch disk. However, I have stopped a Plex stream halfway through and can still see items in the sessions folder on the RamScratch disk - how long does it usually take for these to be removed?
  25. I have this exact same problem; I get continuous small bursts of writes of around 400Kb/s. I have no Dockers/VMs etc. running, only an array, 2 cache drives and a few plugins. I disabled the Cache Dirs plugin and the ZFS Master refresh to see if it was one of those, and I still got the continuous small bursts. The only thing iotop shows is that kworker is triggering when the writes happen. Any ideas?
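If it helps narrow this down, iotop (already in use here) can accumulate totals so occasional bursts become attributable per process, and /proc/meminfo shows whether dirty pages are building up and being flushed on a timer - offered as a sketch, not a diagnosis:

    iotop -aoP                      # -a accumulate totals, -o only active, -P per process rather than per thread
    grep -i dirty /proc/meminfo     # Dirty / Writeback counters; watch whether they rise and then flush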