[Plugin] autotweak


Unraid plugin that enables you to adjust your Unraid system's power profile to enhance performance or improve energy efficiency.
Additionally, it fine-tunes the TCP stack settings and network interface card (NIC) interrupt affinities to optimize network performance. 
Please note: This plugin is not compatible with other plugins that alter the same settings.

On 4/6/2024 at 11:31 AM, dopeytree said:

Great plugin, would you be able to explain a bit more about how it works?

E.g - it mentions something about network optimisation.

 

I've swapped over to this from tips & tweaks.

 

I'd like to provide some clarity on the two features currently not visible in the GUI:

- Modern NICs support hardware queues, which can improve network traffic processing by distributing it across multiple CPUs. The plugin is designed to optimize this by assigning different hardware queues (and their corresponding interrupts) to various CPUs. Additionally, for systems with hybrid CPUs, the plugin aims to allocate these to E-cores. The plugin also enables the Transmit Packet Steering (XPS) feature in the kernel to further enhance network performance.

- The other feature sets some TCP buffer size parameters based on the maximum physical interface speed detected in the system.
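
For anyone who wants to see what these two features touch, here is a minimal shell sketch. It is not the plugin's code: the interface name eth0, the helper names, and the bandwidth-delay rule of thumb are assumptions for illustration only. The procfs and sysfs paths are the standard kernel ones.

```shell
# Hex CPU mask for a single CPU index, the format used by
# /proc/irq/<n>/smp_affinity and /sys/class/net/<if>/queues/tx-<q>/xps_cpus.
# Plain shell arithmetic, so only suitable for CPU indexes below 63.
cpu_mask() {
    printf '%x\n' $((1 << $1))
}

# Bandwidth-delay product in bytes for a link speed (Mbit/s) and an RTT (ms).
# Assumption: a common rule of thumb for sizing TCP buffers, not necessarily
# the formula the plugin uses.
bdp_bytes() {
    echo $(( $1 * 125 * $2 ))
}

# Read-only inspection on a live system (eth0 is a placeholder):
#   cat /sys/class/net/eth0/speed                  # link speed in Mbit/s
#   grep eth0 /proc/interrupts                     # queue IRQs, if the driver names them after the interface
#   cat /proc/irq/<n>/smp_affinity                 # CPU mask of one IRQ
#   cat /sys/class/net/eth0/queues/tx-0/xps_cpus   # XPS mask of TX queue 0
#   sysctl net.ipv4.tcp_rmem net.ipv4.tcp_wmem     # current TCP buffer limits
```

For example, cpu_mask 4 gives the mask for CPU 4, and bdp_bytes 1000 10 estimates the buffer needed to keep a 1 Gbit/s link busy at a 10 ms round-trip time.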

 

For detailed information about any setting within the plugin, you can always use the question mark icon to reveal the documentation. 

Additionally, the plugin provides detailed logging of its operations within the system log.

 

On 4/6/2024 at 11:52 PM, DuzAwe said:

Could we get the conservative power governor?

Also amd_pstate=active (with the upcoming 6.6 kernel) for Zen2 and newer AMD CPUs. With it, the performance and especially the powersave governors work much better.

Schedutil should also be preferred over ondemand/performance if amd_pstate is not available for your CPU.

It would also be good to be able to blacklist certain devices yourself. For example, my M.2 SSD doesn't like L1+ (L1.1 and L1.2), but the rest seems fine.

 

PS:

 

Quote

[c0a9:540a] 01:00.0 Non-Volatile memory controller: Micron/Crucial Technology P2 [Nick P2] / P3 / P3 Plus NVMe PCIe SSD (DRAM-less) (rev 01)

[N:0:1:1] disk CT1000P2SSD8__1 /dev/nvme0n1 1.00TB

In case you want to blacklist this M.2 for everybody.

Edited by InternetD

I tried installing this plugin to test it out, and ever since I have been having random connection drops every few minutes for a few seconds at a time. I've since tried uninstalling the plugin, restarting, and a few other things to troubleshoot what could have happened. I tried booting into safe mode with fresh BIOS settings and I'm at a loss as to what could have happened. Is there a chance the plugin could still be causing this? I've never had this issue before, so it seems curious that it only happened directly after testing it.

Syslog doesn't give any useful information. The only thing I know about when it happens is that all connections drop and ping responds with "no answer yet for icmp_seq=n".


Ignore this, it seems my router decided it wanted to mess with me on the very day I decided I wanted to install something that messed with network settings and ruin everything. Too bad I was too clueless to check that much much earlier. Was not your plugin in the end, so, apologies. Maybe I'll consider giving it another shot when I recover from the stress and ptsd that today turned out to be.

Edited by Ditiae
issue fixed

Before I installed the plugin, the power draw of my CPU was always around 1.5 to 2.5 watts. After installing it, the draw jumped between 1.5 and 4 watts. Also, the power draw display on the dashboard refreshes at strange intervals. It is set to refresh every 1000 ms and should update in a pattern like x - - - x - - - x - - - x, but it does something like x- - x x - - - x- - - x - x -. The overall CPU load while idle went from a fairly constant 1% to jumping between 1% and 3%. Uninstalling didn't revert the changes completely, but the power draw went down a little.

 

edit: I installed powertop before and set it up according to: 

 

Edited by Ruckizucki_Mann
On 4/7/2024 at 1:19 AM, Spritzup said:

This looks really promising, awesome work! I'd like to see schedutil added to the governor settings, and more visibility into what the app is doing auto-magically behind the scenes, before I would fully switch over to it.

 

For detailed information about any setting within the plugin, you can always use the question mark icon to reveal the documentation. 

Additionally, the plugin provides detailed logging of its operations within the system log.

On 4/8/2024 at 3:00 PM, InternetD said:

Also amd_pstate=active (with the upcoming 6.6 kernel) for Zen2 and newer AMD CPUs. With it, the performance and especially the powersave governors work much better.

Schedutil should also be preferred over ondemand/performance if amd_pstate is not available for your CPU.

It would also be good to be able to blacklist certain devices yourself. For example, my M.2 SSD doesn't like L1+ (L1.1 and L1.2), but the rest seems fine.

 

PS:

 

In case you want to blacklist this M.2 for everybody.

 

In the next release, the balanced profile uses schedutil on CPUs that don't have a pstate driver. Otherwise it uses powersave.

I don't have a system with an AMD CPU to test it, but I have compiled a 6.6.22 kernel myself...
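
The rule described here can be sketched in a few lines of shell. This is only an illustration, not the plugin's implementation: the function name and the driver-name patterns (intel_pstate, amd-pstate and its variants) are assumptions based on common kernel cpufreq driver names.

```shell
# Pick a governor from the cpufreq driver name, following the rule above:
# powersave when a pstate driver is active, schedutil otherwise.
# The patterns below are assumptions, not taken from the plugin.
pick_governor() {
    case "$1" in
        intel_pstate|amd-pstate*|amd_pstate*) echo powersave ;;
        *) echo schedutil ;;
    esac
}

# On a live system (read-only):
#   drv=$(cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver)
#   cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors
#   pick_governor "$drv"
```

Checking scaling_available_governors first is worthwhile, since a governor can only be selected if the running kernel offers it.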

 

ASPM is tuned through a global kernel parameter, not per device.

Did you try to update the firmware of your NVMe? I updated my WD NVMe with nvme-cli.
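
To see which global ASPM policy is currently active, the pcie_aspm module exposes it in sysfs, with the active entry shown in square brackets. A small sketch for reading it follows; the helper name is made up for illustration, and the sample line in the comment is only an example of the format.

```shell
# Extract the active policy from the bracketed entry on stdin, e.g.
# "default performance [powersave] powersupersave" -> "powersave".
active_aspm_policy() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# On a live system (read-only):
#   active_aspm_policy < /sys/module/pcie_aspm/parameters/policy
```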

 

On 4/10/2024 at 11:33 PM, Fuzzy0101 said:

 

In the next release, the balanced profile uses schedutil on CPUs that don't have a pstate driver. Otherwise it uses powersave.

I don't have a system with an AMD CPU to test it, but I have compiled a 6.6.22 kernel myself...

ASPM is tuned through a global kernel parameter, not per device.

Did you try to update the firmware of your NVMe? I updated my WD NVMe with nvme-cli.

 

 

Thanks for the answer. I have been using amd_pstate for a while without any issues on other devices with suitable hardware and, of course, a newer kernel.

As for ASPM, I wasn't sure whether blacklisting is possible, as it seems it needs to be driver-specific. For example, the amdgpu driver allows it with amdgpu.aspm=0

The NVMe firmware is up to date, so no luck there. But the drive is nearing its end of life anyway, so I will pay attention to this with the next purchase. For example, the newer KIOXIA NVMe drives with TLC and DRAM officially support L1.1 and L1.2.

On 4/12/2024 at 11:57 AM, InternetD said:

 

Thanks for the answer. I have been using amd_pstate for a while without any issues on other devices with suitable hardware and, of course, a newer kernel.

As for ASPM, I wasn't sure whether blacklisting is possible, as it seems it needs to be driver-specific. For example, the amdgpu driver allows it with amdgpu.aspm=0

The NVMe firmware is up to date, so no luck there. But the drive is nearing its end of life anyway, so I will pay attention to this with the next purchase. For example, the newer KIOXIA NVMe drives with TLC and DRAM officially support L1.1 and L1.2.

You can try the kernel I compiled; it works with Unraid and is based on version 6.6.23. I also tried to include AMD support.

https://github.com/fuzzy01/unraid_enhanced_kernel/releases/tag/6.6.23-u6.12.10

4 hours ago, InternetD said:

nvme_core.default_ps_max_latency_us=0 as a boot flag should do the trick against buggy NVMe low-power states.

 

Yes, but it will probably also make the drive run very warm. I don't know how much you have played with it to make it work, but maybe you can try disabling only the APST power states deeper than PS3.
You can see the exlat values of PS3 and PS4 at the end of the output of nvme id-ctrl /dev/nvme0
If you set nvme_core.default_ps_max_latency_us to a value between the two, you can disable only PS4.
You can check the current PS state with: nvme get-feature -f 2 -H /dev/nvme0
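
A rough sketch of how such an in-between value could be worked out from the id-ctrl output. Assumptions to note: the helper names are made up, the sample line format in the comment mirrors what nvme-cli typically prints and may differ between versions, and the sum of enlat and exlat is used because, as far as I understand the kernel's APST logic, it compares entry plus exit latency against the limit rather than exlat alone.

```shell
# Sum enlat and exlat (microseconds) of one power state from
# `nvme id-ctrl` text on stdin. Expects lines shaped like:
#   ps    3 : mp:0.0500W non-operational enlat:1000 exlat:4000 rrt:3 rrl:3
ps_total_latency() {
    awk -v s="$1" '
        $1 == "ps" && $2 == s {
            for (i = 3; i <= NF; i++) {
                if ($i ~ /^enlat:/) en = substr($i, 7)
                if ($i ~ /^exlat:/) ex = substr($i, 7)
            }
            print en + ex
        }'
}

# Midpoint between two latencies: a limit that still admits the shallower
# state but rules out the deeper one.
midpoint() {
    echo $(( ($1 + $2) / 2 ))
}

# On a live system (read-only, needs nvme-cli):
#   nvme id-ctrl /dev/nvme0 > /tmp/idctrl.txt
#   midpoint "$(ps_total_latency 3 < /tmp/idctrl.txt)" \
#            "$(ps_total_latency 4 < /tmp/idctrl.txt)"
```

The resulting number would then go into nvme_core.default_ps_max_latency_us as a boot parameter.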
 

14 hours ago, Fuzzy0101 said:

 

Yes, but it will probably also make the drive run very warm. I don't know how much you have played with it to make it work, but maybe you can try disabling only the APST power states deeper than PS3.
You can see the exlat values of PS3 and PS4 at the end of the output of nvme id-ctrl /dev/nvme0
If you set nvme_core.default_ps_max_latency_us to a value between the two, you can disable only PS4.
You can check the current PS state with: nvme get-feature -f 2 -H /dev/nvme0
 

 

Sadly, it still doesn't work. Even with 0 the NVMe keeps crashing, so I guess it's a different, PCIe-related issue with L1.1 and L1.2.

Edited by InternetD
On 4/15/2024 at 11:08 AM, InternetD said:

 

Sadly, it still doesn't work. Even with 0 the NVMe keeps crashing, so I guess it's a different, PCIe-related issue with L1.1 and L1.2.

Yes. Probably buggy firmware.

It turns out you can disable L1.1 and L1.2 per device. 

If the PCI address of the NVMe is 01:00.0, you can disable L1.2 like this:

 

echo 0 > /sys/bus/pci/devices/0000\:01\:00.0/link/l1_2_aspm 

 

There are other switches in that directory to turn off other power management features:

l1_1_pcipm  l1_2_aspm  l1_2_pcipm  l1_aspm
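
If you want to turn off all L1 substates for one device in one go, something like the following sketch could work. The function names are hypothetical, and it assumes PCI domain 0000 when only bus:device.function is given. Only switches that actually exist and are writable for the device are touched.

```shell
# Build the sysfs link directory for a PCI address such as 01:00.0.
# Assumes PCI domain 0000 when the address has no domain prefix.
aspm_link_dir() {
    case "$1" in
        ????:??:??.?) echo "/sys/bus/pci/devices/$1/link" ;;
        *)            echo "/sys/bus/pci/devices/0000:$1/link" ;;
    esac
}

# Write 0 to every L1 substate switch present for the device (needs root).
disable_l1_substates() {
    dir=$(aspm_link_dir "$1")
    for f in l1_1_aspm l1_2_aspm l1_1_pcipm l1_2_pcipm; do
        [ -w "$dir/$f" ] && echo 0 > "$dir/$f"
    done
    return 0
}

# Usage (as root):
#   disable_l1_substates 01:00.0
```

Writes to sysfs do not survive a reboot, so the commands have to be run again after each boot.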

 


Hey, the plugin looks great and helped me save some watts.

Currently I have two PCIe devices that seemingly prevent lower package states. I assume my motherboard doesn't allow ASPM on the lanes connected directly to the CPU.

Is this common behavior? Under "ASPM and PCIe PM status" some "ASPM Settings" values are displayed in red. Is there a way to force ASPM?

One example:

 

02:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983 (prog-if 02 [NVM Express])

Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)

Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-

Capabilities: Port #0, Speed 8GT/s, Width x4, ASPM L1, Exit Latency L1 <64us

Settings: ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+

Status: Speed 8GT/s, Width x4

ASPM Capabilities: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+

ASPM Settings: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
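
To pick out which substates are switched off in a listing like this, a small filter can help. This is only a sketch: the function name is made up, and it keys on the "ASPM Settings:" label as shown in the listing above, whereas raw lspci -vv output uses different field names (for example L1SubCtl1:).

```shell
# Print the substates flagged "-" (disabled) on an "ASPM Settings:" line
# read from stdin, one per output line.
disabled_substates() {
    awk '/^ASPM Settings:/ {
        for (i = 3; i <= NF; i++)
            if ($i ~ /-$/) print substr($i, 1, length($i) - 1)
    }'
}

# Usage: paste the status text into a file and run
#   disabled_substates < status.txt
```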

 

On 4/17/2024 at 12:24 AM, Fuzzy0101 said:

I uploaded the new version 1.3.11

  • Adds NVME SSD power management policy feature.
  • Improves GUI field descriptions and consistency.

 

On 4/17/2024 at 2:19 PM, Fuzzy0101 said:

Yes. Probably buggy firmware.

It turns out you can disable L1.1 and L1.2 per device. 

If the PCI address of the NVMe is 01:00.0, you can disable L1.2 like this:

 

echo 0 > /sys/bus/pci/devices/0000\:01\:00.0/link/l1_2_aspm 

 

There are other switches in that directory to turn off other power management features:

l1_1_pcipm  l1_2_aspm  l1_2_pcipm  l1_aspm

 

Still no luck for me. It must be some deeper incompatibility specific to my mainboard/NVMe combination. Thank you anyway :)

If I'm not mistaken, the kernel must have an ASPM blacklist somewhere in its code, since you can report buggy devices on the mailing list. It may be wise not to override this list unless you get desperate.

If you can get your hands on it, it may be good to implement/update it with each Unraid release, since your plugin might ignore it.

Edited by InternetD