[Plugin] IPMI for unRAID 6.1+



OK, I did some further digging and I think I found the issue preventing the script from reading my drive temperatures. The output of smartctl -A -n standby /dev/sdx on a SAS drive differs from that of a SATA drive, so the script can't parse the temperature. Here's sample output from a SAS drive:

 

smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.18.20-unRAID] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
Current Drive Temperature:     37 C
Drive Trip Temperature:        60 C

Manufactured in week 14 of year 2015
Specified cycle count over device lifetime:  50000
Accumulated start-stop cycles:  95
Specified load-unload count over device lifetime:  600000
Accumulated load-unload cycles:  823
Elements in grown defect list: 0

Vendor (Seagate) cache information
  Blocks sent to initiator = 5494811087863808
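
For anyone wanting to handle both layouts, here is a minimal sketch of a parser that accepts either one. It is written in Python for illustration only (the plugin itself is PHP), parse_drive_temp is a hypothetical name, and the SATA branch assumes the usual ATA attribute table in which attribute 190 or 194 carries the temperature in the raw-value column.

```python
import re

# SAS/SCSI layout, as in the sample output above.
SAS_RE = re.compile(r"^Current Drive Temperature:\s+(\d+)\s+C", re.MULTILINE)

# ATA layout (assumption): ID NAME FLAG VALUE WORST THRESH TYPE UPDATED
# WHEN_FAILED RAW_VALUE, with the temperature in RAW_VALUE of row 190 or 194.
ATA_RE = re.compile(
    r"^[ \t]*(?:190|194)[ \t]+(?:\S+[ \t]+){8}(\d+)", re.MULTILINE
)


def parse_drive_temp(smartctl_output):
    """Return the drive temperature in Celsius, or None if none is found."""
    for pattern in (SAS_RE, ATA_RE):
        match = pattern.search(smartctl_output)
        if match:
            return int(match.group(1))
    # No temperature line, e.g. the drive was in standby and smartctl
    # printed no SMART data at all.
    return None
```

Fed the SAS output above, this returns 37; when neither layout matches it returns None, so the caller can skip that drive.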

 

Link to comment
2 hours ago, dmacias said:

So with your board you can only control all the fans? Does everything in the readings show up? Run this command and post the output

 


dmidecode -t 2 | grep 'Manufacturer' | awk -F 'r:' '{print $2}'
 

 

 

Or just post the output of

 

dmidecode -t 2

Yep I can control the fans, read the temps. Everything shows up with the addon installed and it shows the proper fan rpms, just missing the tab to control it :). 
 

 

 

root@Navajo:~# dmidecode -t 2
Invalid type keyword: 2
Valid type keywords are:
  bios
  system
  baseboard
  chassis
  processor
  memory
  cache
  connector
  slot
 

root@Navajo:~# dmidecode -t 2 | grep 'Manufacturer' | awk -F 'r:' '{print $2}'
 Dell Inc.
 

Link to comment

Warning: I've never written anything in PHP before, but I modified the get_highest_temp function to read the temp from /var/local/emhttp/disks.ini. This allows the script to detect the temp of both SAS and SATA drives.

 

function get_highest_temp($hdds){
    global $hddignore;
    $ignore = array_flip(explode(',', $hddignore));
    $highest_temp = 0;
    // disks.ini is written by emhttp, so temps are only as fresh as unRAID's own polling
    $lines = explode("\n", file_get_contents('/var/local/emhttp/disks.ini'));
    $pattern = '/^temp="([0-9]+)"/';
    foreach ($hdds as $serial => $hdd) {
        if (!array_key_exists($serial, $ignore)) {
            $temp = 0;
            $line_number = -1;
            for ($line = 0; $line < count($lines); $line++) {
                // strict check: strpos returns 0 (falsy) for a match at the start of a line
                if (strpos($lines[$line], $hdd) !== false) {
                    // assumes the temp="..." line sits 6 lines below the matching line
                    $line_number = $line + 6;
                }
            }
            // only read the temp if the line exists and actually matches
            if (isset($lines[$line_number]) && preg_match($pattern, $lines[$line_number], $tempnum)) {
                $temp = (int)$tempnum[1];
            }
            $highest_temp = max($temp, $highest_temp);
        }
    }
    debug("Highest temp is {$highest_temp}ºC");
    return $highest_temp;
}
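
The function above depends on the temp line sitting exactly 6 lines below the matching line. A sketch of a less position-dependent approach is to group the file into sections first and then look up keys by name. This is Python for illustration only, highest_temp_from_ini is a hypothetical name, and the bracketed section headers and the id key are assumptions about the disks.ini layout (only the temp="NN" form is confirmed by the pattern above).

```python
def highest_temp_from_ini(text, ignore=()):
    """Return the highest numeric temp found in any section, or 0 if none."""
    sections = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("["):
            # A new [section] starts: collect its key/value pairs separately.
            sections.append({})
        elif "=" in line and sections:
            key, _, value = line.partition("=")
            sections[-1][key.strip()] = value.strip().strip('"')

    highest = 0
    for section in sections:
        # Assumption: each disk section carries an id key holding the serial.
        if section.get("id") in ignore:
            continue
        temp = section.get("temp", "")
        # Skip non-numeric values (e.g. a placeholder for a spun-down disk).
        if temp.isdigit():
            highest = max(highest, int(temp))
    return highest
```

The ignore argument plays the role of $hddignore here: any section whose id is in it is skipped.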

 

Link to comment

Well you gotta start somewhere, good job. If you look in the ipmi_helpers.php, I already have a function get_highest_temp that gets the highest hdd temp from the disks.ini, devs.ini and from UA's json file. This is for the webgui only.

I don't want to use the ini files for temps in the fan script because then it's tied to whatever interval unRAID updates those temps. The script needs to get the temps on its own and act accordingly. So I need the smartctl -A output from those drives.
Link to comment

 

You can try to guess the SMART device type using a function like get_smart_type in this file.

Link to comment

 

I had a feeling the whole time that I was probably reinventing the wheel... but once I started I was determined to find a way 😂 A few posts back in this thread I posted the output of smartctl -A from a SAS drive. Totally agreed that it's a better approach to get the data directly. Derived data is usually a recipe for disaster. It just takes a while to come to a boil.

Link to comment
 


Ok. I wasn't sure what command you ran before.

 

Link to comment

Hi - I have unRAID on a SuperMicro server and on the console I can do this:

# ipmi-dcmi --get-system-power-statistics
Current Power                        : 256 Watts
Minimum Power over sampling duration : 110 watts
Maximum Power over sampling duration : 404 watts
Average Power over sampling duration : 235 watts
Time Stamp                           : 02/15/2019 - 06:46:10
Statistics reporting time period     : 1181958000 milliseconds
Power Measurement                    : Active

I took a quick look at the code and I think I'd be able to hack in support for display of this (ideally it'd be a historical graph backed by a database, but we can start small). I'm mostly interested in the current power reading. How should I go about adding this? Thanks!
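
As a possible starting point, here is a small sketch that turns the output above into a dictionary and pulls out the current reading. It is Python for illustration only (the plugin is PHP), the function names are hypothetical, and it assumes every field follows the "name : value" layout shown in the sample.

```python
import re


def parse_dcmi_power(output):
    """Map each 'name : value' line of the ipmi-dcmi output to a dict entry."""
    stats = {}
    for line in output.splitlines():
        # Split on the first " : " only, so the time stamp stays intact.
        key, sep, value = line.partition(" : ")
        if sep:
            stats[key.strip()] = value.strip()
    return stats


def current_power_watts(output):
    """Return the 'Current Power' reading as an int, or None if missing."""
    value = parse_dcmi_power(output).get("Current Power", "")
    match = re.match(r"(\d+)\s*watts", value, re.IGNORECASE)
    return int(match.group(1)) if match else None
```

With the sample output above, current_power_watts returns 256. A history graph would then only need to store that number together with a timestamp at each poll.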

Edited by falcor
Link to comment

@dmacias, is this something that could be added to your plugin? It was provided by bonienl.

 

The IPMI plugin would need to add support for the dashboard display.

 

  // Only update when the dashboard provides the 'mb-temp' field
  if ($('#mb-temp').length) {
    var temp = $('span#temp').text();
    // Detect the unit, then split the combined text on it
    var unit = temp.indexOf('C')>0 ? 'C' : 'F';
    temp = temp.split(unit);
    if (temp[0]) $('#cpu-temp').html('Temperature: '+temp[0]+unit);
    if (temp[1]) $('#mb-temp').html('Temperature: '+temp[1]+unit);
  }

 

The above code extract comes from the System Temp plugin.

It checks the presence of the field 'mb-temp' and puts the temperature readings in the fields 'cpu-temp' and 'mb-temp' which get displayed on the dashboard page.

Edited by SimonF
Link to comment

Where and what were you wanting displayed? I'm not sure how many would be able to use this though. Do you need an enterprise power supply?

Link to comment

I've been meaning to look into the dashboard for 6.7+. Thanks, that helps.

Link to comment

I'd like at least a current power reading on the Dashboard, and preferably a little history graph. At least people with Supermicro gear (yourself included) should be able to use it, and I think Dell might support it too.

Link to comment
It won't work without a PMBus power supply. Mine shows all 0's and not available. I wouldn't put too much effort into hacking something for the dashboard, since it's different for 6.7+. When I look at adding dashboard support for 6.7, I'll look at this. But if you can put something together, that's cool too.
Link to comment
  • 2 weeks later...
8 hours ago, IamSpartacus said:

Is there a reason why my FANS 1-4 are all grouped together and there is no FAN 5 in my Fan Settings?  I've got the latest BIOS and IPMI BMC for my board (https://www.supermicro.com/products/motherboard/Xeon/C600/X10SRM-F.cfm).

 

s9M9FOP.jpg

 

 

There are only 2 fan settings that can be controlled: FANA and all the other fans. FAN1234 is just what I chose to call the second one, since most of the boards had only 4 fans plus FANA.

Link to comment

 

Is that a recent change? I just upgraded my board from an ASRock Rack board, and with that IPMI I had the ability to configure all 5 of my fans separately.

Link to comment
4 minutes ago, dmacias said:

No, it's a Supermicro thing. Only 2 zones are controllable.

 

Got it, that makes sense then as to why it's different. I appreciate the quick responses.

Link to comment

My main server contains an X11SSL-CF board. IPMI was running very well, but for a while now the Fan Speed Mode has been set to FULL. Hence my fans run at full speed, and only when I set it back to Standard or Optimal does the IPMI plugin take back control and the fans run at a much lower speed.

BIOS version is 2.2 from 05/23/18, Firmware is 1.48 from 06/22/18, Redfish 1.0.1

Maybe I overlooked something during the recent IPMI updates? 

Link to comment





Did you give it some time after you started the fan control? How long is your polling time? Also are your thresholds set with the editor for your fans?
Link to comment

I searched the topic, but I didn't see anything that answered my questions.

 

I've got a SM X9DRi-LN4F+

 

I see this in the OP:

###2018.05.20
- format min and max percentages to one decimal place
- add support for Supermicro X9 boards
- set Supermicro boards to full speed mode (other modes seem to interfere with script)

 

I'm trying to figure out a few things.

 

The plugin references a hard drive to poll for temp, so does that mean the script would adjust fan speeds based on the temp of a hard drive instead of a CPU? My goal is to have this run the CPU fans, which I have on "FANA" and "FANB". Which do appear to be set at full speed. I've load tested them, and they keep temps down even at 100% extended load (30mins +)... however, 100% fan isn't exactly optimal. And when I turn the IPMI fan control off and use the SM "optimal speed" setting the fans jump up and down from 400+/- RPM to full 1800 RPM... even at idle. Which doesn't seem right.

 

 

Also, if you notice below... FANA is denoted correctly, but FAN1234 shows up instead of FANB... which is seen by the system. I'm assuming I'd have to edit the config or something to get that to be referenced right?

 

image.png.d19a0e496a168b83dd0f3888a9ae125e.png

 

 

So on advanced view, do these high and low values come from a referenced HD? Can I make it reference CPU temps instead if that's true? I run the case fans on a standalone controller and really I'm looking for the best way to control the CPU fans here.

 

image.png.770cc4a1eb1cc2ac90abfef189046988.png

 

Edited by CowboyRedBeard
Link to comment
4 hours ago, CowboyRedBeard said:

setting the fans jump up and down from 400+/- RPM to full 1800 RPM

You may need to change the minimum fan speeds. These are the ones I changed to on my X9DR3. You should be able to run the command from the command line, using the IP address of the BMC port.

 

ipmitool -I lanplus -H <IPMI_IP> -U <user> -P <password> sensor thresh FAN3 lower 200 200 300

 

image.thumb.png.0dce24954a5a00f104d17d26ef03c432.png

Edited by SimonF
Link to comment

OK, thanks for that info on the thresholds. But my question is more about whether it's referencing the hard drive temp instead of the CPU... a minimum threshold isn't going to help me there.

 

I'm thinking the plugin might be the way I monitor it and issue alerts, but CPU fan control might need to be done via BMC / BIOS on the SM board. I've since switched it to "standard" via the SM IPMI:

 

image.png.69080fccba7f4d088c594e0cdd9af72b.png

 

 

And it seems to hold the fans at a nice low speed and temps are stable (at idle)

 

image.png.03141ce68e8220e704ebdd60e7815e07.png

 

And when I'm loading it up (one processor doing a video conversion via handbrake) it kicks the fans up appropriately and holds acceptable temps. Although it seems I might be getting a bit of thermal throttling, but I need to do more testing on that.

 

I guess at this point I'm most interested in knowing if I can manage the fans referencing the CPU temp instead of a HD temp which makes no sense to me... so I had assumed I'm doing something incorrectly. Which wouldn't be the first time.

 

Link to comment

@CowboyRedBeard Make sure you have the latest BIOS (3.3) and BMC firmware (3.48). The reason is that they sometimes update IPMI functionality and sensor names.

 

Supermicro boards have 2 fan zones. Usually one zone is FANA (peripherals) and the other zone covers all the numbered fans, like FAN1, FAN2, FAN3 and FAN4 (FAN1234). Some boards have more numbered fans (FAN5 or 6) that are tied to the other numbered fans or, like yours, have a FANB, which I believe is tied to the same zone as FANA.

 

The original intent of fan control was to monitor hard drive temps and control a zone of fans to cool the hard drives. Since the BIOS fan control is based on CPU temps, the fans don't usually spin up fast enough to keep hard drive temps low. E.g. during a parity check the CPU would not get hot enough to spin up the fans, and the HDDs would hit the high 40s or 50s °C. So the fan control script monitors the HDDs and uses the highest temp.

To control any zone, you have to select a temperature sensor for the fan control to monitor. If you want to use CPU temps, change that fan group from auto to a CPU temp sensor. For your board you will have 2 zones: FANA & B on one zone and FAN1-6 on another. You can select a temp sensor for each zone: any temp sensor the BMC lists, or the HDD temp. You can use the same one for both zones if you like.

 

Also, the BMC thresholds may still be relevant. These aren't the same as what's on the Fan Control webgui page. These are alert thresholds set in the BMC. You can use Config Editor and select sensors to view and edit them. They cause a warning to appear in the BMC event log if the fan RPMs go outside the thresholds set. So if the default threshold is 1000 RPM and your CPU fan drops below this, then the BMC kicks the fan to full speed and logs an alert.

 

I hope all this helps clear up everything.

 

Edit: you can change the high and low temp threshold after you select a sensor. They determine at what temp the fan will be at maximum speed and minimum speed.
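
To illustrate how a low and a high temp threshold can map a reading to a fan speed, here is a rough sketch. It is Python for illustration only, fan_percent is a hypothetical name, and the straight-line ramp between the two thresholds is an assumption; the curve the plugin actually uses may differ.

```python
def fan_percent(temp, low_temp, high_temp, min_pct=0.0, max_pct=100.0):
    """Map a temperature to a fan duty percentage between min_pct and max_pct."""
    if high_temp <= low_temp:
        raise ValueError("high_temp must be greater than low_temp")
    if temp <= low_temp:
        return min_pct  # at or below the low threshold: minimum speed
    if temp >= high_temp:
        return max_pct  # at or above the high threshold: maximum speed
    # Assumption: linear ramp between the two thresholds.
    fraction = (temp - low_temp) / (high_temp - low_temp)
    return round(min_pct + fraction * (max_pct - min_pct), 1)


# When the zone follows the HDDs, the highest drive temp is the input.
hdd_temps = [31, 35, 33]
zone_speed = fan_percent(max(hdd_temps), low_temp=30, high_temp=45,
                         min_pct=20.0, max_pct=100.0)
```

Here zone_speed comes out at 46.7, because 35 C is one third of the way from 30 C to 45 C and the range runs from 20% to 100%.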

 

 

Link to comment
