[Plugin] IPMI for unRAID 6.1+



According to the SuperMicro website, IPMI/BMC Firmware Revision R 1.13 is the latest for the X11SSL-F, and it is the version that was installed when I built the system.  However, I only began to notice the error after the 6.2.4 update.  After I updated to 6.3.0 the other day, the error remained, so I figured I should ask whether anyone else is seeing the same thing.

 

I'm using 1.26, which Supermicro support sent me because of another issue; you can try asking them for it.

 

When you requested the update, did you reference issues that were logged in the unRAID syslog, or did the BMC also log your issues?  I'm asking because, if I were to include the errors logged in the unRAID syslog, would SuperMicro support even care?  My BMC event logs have no entries regarding IPMI/BMC issues.

Link to comment

I would just switch to network connection in the ipmi settings and forget about it. You don't lose any functionality.

 

You mentioned that before.  I currently access IPMI using the Java IPMI Viewer.  My Router's DHCP service issued an IP address to the viewer, which I use to access.  Is what you're suggesting the same thing? 

Link to comment

 

 

Under Settings in the ipmi plugin, change Enable Network Connections to yes and then enter the IP, username and password. Then click apply. You'll have to reselect any sensors in the display and footer settings too.

 

However, I don't think the error will go away. I believe unRAID now has the IPMI driver enabled by default. Previously I had to enable it with modprobe.
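For anyone wondering what the local vs. network choice amounts to outside the plugin, here is a minimal sketch. It assumes ipmitool is installed, which is not necessarily what the plugin uses internally; the IP address and username are placeholders. A local connection goes through the kernel IPMI driver and its device node, while a network connection talks to the BMC over the LAN and does not need the driver at all.

```python
import os
import shlex


def ipmitool_sensor_command(host=None, user=None):
    """Build an ipmitool argv for reading sensors.

    With no host, ipmitool talks to the local BMC through the kernel
    IPMI driver. With a host, it talks to the BMC over the LAN instead.
    """
    if host is None:
        return ["ipmitool", "sensor"]
    # -I lanplus selects IPMI v2.0 over LAN; -E makes ipmitool read the
    # password from the IPMI_PASSWORD environment variable.
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-E", "sensor"]


def local_driver_present(dev_paths=("/dev/ipmi0", "/dev/ipmi/0", "/dev/ipmidev/0")):
    """Return True if one of the usual local IPMI device nodes exists."""
    return any(os.path.exists(p) for p in dev_paths)


if __name__ == "__main__":
    # Placeholder address and user; substitute your BMC's values.
    print(shlex.join(ipmitool_sensor_command()))
    print(shlex.join(ipmitool_sensor_command("192.168.1.50", "ADMIN")))
    print("local IPMI device present:", local_driver_present())
```

If the device node is missing, the local command has nothing to talk to, which is the case where loading the driver with modprobe used to be necessary.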

 

 

Link to comment

My issue at the time was not unRAID related.

Link to comment

Understood!!  I need to modify settings within the unRAID IPMI plug-in :)  Sorry, didn't catch that at first!  I'll give it a shot, and if the errors disappear then great; if not...like I said, no biggie.  The IPMI sensors are all currently recognized by the plugin, and the BMC is not reporting any events.  Not relevant, but in the past I ignored errors in my syslog because I did not understand them, and consequently my server suffered problems that I needed unRAID community support to resolve.  Moderators and the community support team have since warned me about ignoring syslog errors, even if they are not reported by Fix Common Problems.

 

Thanks again!

 

   

Link to comment
  • 3 weeks later...

I'm having some issues with this plugin. It seems to be pulling the data just fine from my SuperMicro board, but I am unable to change any of its settings. The screen simply refreshes and the settings do not appear to "stick".

 

Reviewing the Unraid logs I see this:

Unraid root: error: update.php: missing csrf_token

Doing some googling I see that there was a change to the webui in 6.3.0 RC that needs to be accounted for in plugins:

 

 

Link to comment
  • 2 weeks later...
Which page in particular is giving you that error?

Edit: My guess would be a stale browser session. I would try closing your browser and opening the webGUI again.

Link to comment

Excellent plugin.  Quick question on fan control.  I have an ASRock board, but fan control is disabled.  After reading through the change log, it looks like you make an assumption about the fan naming convention.  I assume this works for a single-socket MB, but it does not align with how things are named in a dual-socket configuration: CPU_FAN1_1, CPU_FAN1_2, CPU_FAN2_1, and CPU_FAN2_2.  Any thoughts on supporting this type of configuration?  The MB is an EP2C602-4L/D16.  Unfortunately, ASRock doesn't let you make changes via their web interface, so this was my best option to avoid having to reboot into the BIOS.

 

Currently chasing down an Upper Critical non-recoverable error for both CPUs; where one pegs at 101C, and the other is at 40C but also has the error.  I seriously doubt that the CPU actually hit 101C even if the fan stalled, and am guessing this is some config error; open to suggestions on this one as well.  There is enough airflow through from drive fans (3x140 - 1000), the other CPU fan (Hyper 212's), exhaust fan (1x120 - 1500), and side/MB fan (1x140 - 1100) to provide good passive cooling; especially since the current load at the time of the error is basically idle.

Link to comment
Yes, initially I made assumptions about fan names, but I have since written a script, tied to the Configure button, that automatically matches the actual fan names. However, this only works for single-socket boards until I find time to implement a fix for dual-socket ones. There are some discussions not far back about your board, and another thread on here too. I'll see if I can find it.

 

Edit: here it is: https://forums.lime-technology.com/index.php?/topic/46077-ASRock-server-board-(EP2C602-versions-and-any-other-with-IPMI)-and-CPU-Temp
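To make the naming difference concrete, here is a rough sketch of how both conventions could be parsed with one pattern. This is not the plugin's actual script. The pattern, the single-socket example name, and the reading of CPU_FAN1_2 as "socket 1, fan 2" are all assumptions for illustration.

```python
import re

# Hypothetical pattern: <GROUP>_FAN<n> for single-socket style names and
# <GROUP>_FAN<socket>_<n> for the dual-socket names quoted in this thread.
FAN_NAME = re.compile(r"^(?P<group>[A-Z]+)_FAN(?P<first>\d+)(?:_(?P<second>\d+))?$")


def parse_fan_name(name):
    """Return (group, socket, index), or None if the name does not match.

    socket is None for names that carry only one number.
    """
    m = FAN_NAME.match(name)
    if m is None:
        return None
    first = int(m.group("first"))
    if m.group("second") is None:
        return (m.group("group"), None, first)
    return (m.group("group"), first, int(m.group("second")))


if __name__ == "__main__":
    for name in ("CPU_FAN1", "CPU_FAN1_1", "CPU_FAN2_2", "CPU Temp"):
        print(name, "->", parse_fan_name(name))
```

Code that only expects one trailing number would miss the second form entirely, which would explain fan control staying disabled on a dual-socket board.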

 

Link to comment
1 hour ago, loond said:

Currently chasing down an Upper Critical non-recoverable error for both CPUs; where one pegs at 101C, and the other is at 40C but also has the error.  I seriously doubt that the CPU actually hit 101C even if the fan stalled, and am guessing this is some config error; open to suggestions on this one as well.  There is enough airflow through from drive fans (3x140 - 1000), the other CPU fan (Hyper 212's), exhaust fan (1x120 - 1500), and side/MB fan (1x140 - 1100) to provide good passive cooling; especially since the current load at the time of the error is basically idle.

 

Check out the E5-2670 buy thread. There's some discussion in there. This is a bug in the board. I have contacted ASRock several times regarding it. The last contact was a week or two ago. They told me they are going to try to work on a BIOS fix, but I haven't heard anything back yet. You are not alone with this bug.

 

A short-term fix that I have implemented is using this IPMI plugin to change the temperature trigger for the CPUs from 90 to 102 degrees so they don't trigger the annoying alert... Obviously only a temporary fix, as I would like to have it actually working properly :).
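For reference, roughly the same change can be made from the command line. This sketch assumes ipmitool is available and that the threshold in question is the upper critical one (`ucr`); the sensor names are hypothetical, so list the real ones with `ipmitool sensor` first. It is not what the plugin runs internally.

```python
import shlex

# Hypothetical sensor names; the real ones vary by board.
CPU_SENSORS = ("CPU1 Temp", "CPU2 Temp")


def set_upper_critical(sensor, celsius):
    """Build the ipmitool argv that sets a sensor's upper critical threshold."""
    # Arbitrary sanity range, only to catch typos before touching the BMC.
    if not 0 < celsius < 128:
        raise ValueError("threshold outside a plausible range: %r" % celsius)
    # `ucr` is ipmitool's name for the upper critical threshold.
    return ["ipmitool", "sensor", "thresh", sensor, "ucr", str(celsius)]


if __name__ == "__main__":
    # Prints the commands rather than running them.
    for name in CPU_SENSORS:
        print(shlex.join(set_upper_critical(name, 102)))
```

Some BMCs may refuse a value that conflicts with the neighbouring thresholds, so the other upper thresholds might need adjusting as well.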

Edited by DoeBoye
Link to comment

Thank you both for the quick response, and appreciate your contribution to the community.  This has turned into a bit of a lab for me; this being the 3rd iteration of the system.  Once I figured out/ understood the capability of unraid I just had to keep going; don't talk to my wife though ;-)

 

A few observations; I'm not sure whether you have experienced the same.  The plugin reports errors for both processors, whereas the web GUI only reports the primary CPU.  Also, in each instance the event lasted exactly 5 sec according to the event log in the GUI.  I had another strange problem previously where I was getting machine check errors with the RAM evenly distributed in the primary memory slots between the two CPUs.  When I was getting those errors I wasn't getting the temp errors, but one of the CPUs consistently read 5C hotter.  After verifying via memtest that the RAM was good, I "solved" the issue by adding another 32GB, maxing out the primary CPU memory slots.  Also, when I was getting the machine check errors, the 4 NICs would constantly fail over due to an IRQ 16 error.  They definitely have some issues with their BIOS.  It does make me wonder, though, whether the BIOS just expects to see all the slots full given its expected use case, and if not, just can't handle the delta.

Link to comment
1 hour ago, loond said:

 After verifying via memtest the ram was good, I "solved" the issue by adding another 32GB; maxing out the primary CPU mem slots.  Also, when I was getting the machine check errors the 4 NICs would constantly fail over due to an IRQ 16 error.  They definitely have some issues with their BIOS.  It does make me wonder though if the BIOS just expects to see all the slots full given its expected use case, and if not just can't handle the delta.

 

Thanks for the extra feedback! :). So you're still getting the incorrect temp reporting for the cpus, but the extra ram solved the machine check and irq16 problems?

 

Hmmm... I think what I'm taking from this is that you are suggesting that I have to buy more ram to fill the empty slots on my board.... Sounds like my hands are tied... No choice in the matter... ;)!

 

 

Link to comment

I still get the errors, although they didn't happen last night for some reason (I had turned network off, but event monitoring is still on; I will change it back to confirm).  However, the temps are now closer with the new RAM; maybe one processor was working harder shuffling data through the bus?  BTW, this is where I purchased all of my RAM: http://www.ebay.com/itm/162349088753?_trksid=p2057872.m2749.l2649&ssPageName=STRK%3AMEBIDX%3AIT

Link to comment
On 4/4/2017 at 8:47 AM, dmacias said:

Which page in particular is giving you that error?

Edit. My guess would be a stale browser. I would try closing your browser and opening the webgui again.

 

I've tried it even in incognito, still happens. One thing to mention, I did move the Unraid interface off port 80, everything else works just fine, just not this IPMI plugin.


Regarding which pages generate the error: both the Settings and Fan Control tabs produce it whenever I click Apply.

Link to comment
 

I'm not sure what it is. All my plugins were compliant with the 6.3 CSRF hardening even before it was released. I tried several browsers, plus my phone's browsers, against my main server on 6.3.3. I also booted up a 6.3.3 VM, changed the emhttp port to 8088, installed the plugin, and applied settings from all browsers. No errors.
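For context on what that log message means in general terms, here is a generic sketch of a CSRF check. It is not Unraid's actual update.php logic, and the function names are made up for illustration. The server hands the page a random token, and any form post that does not send the same token back is rejected.

```python
import hmac
import secrets


def new_csrf_token():
    """Generate a random token that the server would embed in each form."""
    return secrets.token_hex(16)


def check_post(form, session_token):
    """Return (accepted, reason) for a submitted form dictionary."""
    supplied = form.get("csrf_token")
    if not supplied:
        return False, "missing csrf_token"
    # Constant-time comparison so the check does not leak token contents.
    if not hmac.compare_digest(supplied.encode(), session_token.encode()):
        return False, "wrong csrf_token"
    return True, "ok"


if __name__ == "__main__":
    token = new_csrf_token()
    print(check_post({"NETWORK": "enable"}, token))
    print(check_post({"NETWORK": "enable", "csrf_token": token}, token))
```

A page served from a stale cache, or a form that never picks up the token, would end up in the first case, which fits the "settings do not stick" symptom.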

 

Maybe you could give more from the syslog or diagnostics. Also check that you don't have a dynamix.plg in /boot/config/plugins. Try to uninstall and reinstall the plugin.
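If pulling full diagnostics is a hassle, a small filter like the one below can grab just the relevant syslog lines for posting. This is only a sketch: the /var/log/syslog path and the search terms are assumptions, so adjust them as needed.

```python
def matching_lines(lines, needles=("csrf_token", "ipmi")):
    """Return the lines that contain any of the search terms, ignoring case."""
    lowered = [n.lower() for n in needles]
    return [ln.rstrip("\n") for ln in lines if any(n in ln.lower() for n in lowered)]


if __name__ == "__main__":
    # Assumed log location; change it if your syslog lives elsewhere.
    with open("/var/log/syslog", errors="replace") as fh:
        for line in matching_lines(fh):
            print(line)
```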

 

Link to comment
11 hours ago, chaosratt said:

I've also tried both Chrome & Firefox on two different computers (both Win 7).

I would remove the plugin, delete the ipmi plugin directory on your flash drive then reinstall. If that doesn't fix it then hit Ctrl+Shift+i to bring up Inspect in Firefox or Chrome and go to console.  Then hit Ctrl+F5 to clear cache and refresh page.  Report any errors in console. Try changing settings and click apply.  Report any errors. Maybe try setting up network connection instead of local.

Link to comment
