UnRAID on Asus Pro WS W680-ACE IPMI


Recommended Posts

On 2/22/2024 at 5:31 PM, jakea333 said:

It seems this problem is limited to the mATX variant (I can see at least 3 confirmed cases through this thread). Doesn't seem the ATX version is reporting the same problem. Will be interested in seeing how @Daniel15 fares with your upgrade, as you seem to have a better understanding of this process than myself.

 

Add me to the list as #4 with the same iGPU / ASPEED GPU issue on the W680-ACE SE after updating to 6.12.6/8. I currently don't use the iGPU on that server, so changing the BIOS to Primary Display [Auto] and iGPU Multi-Monitor [Disabled] has everything back to normal for me for now.

Edited by SShadow
  • Like 1
Link to comment
  • 2 weeks later...
On 2/21/2024 at 1:56 PM, mikeyosm said:

Oh no, hope not. I have the mATX variant and planned on using the slimSAS in SATA mode as well as both the m.2 slots. I guess we'll see as soon as I receive the cable.


Tested the SlimSAS breakout cable with 4 SATA devices, plus 4 devices on the SATA ports.
All M.2 slots still working, so no port sharing.

  • Thanks 1
Link to comment

Hi, I have one problem with my Asus W680 board and I was wondering if anyone has solved this as well...

 

Basic configuration, fresh installation, IPMI not connected, i5 14500, 32GB, 1GB SSD.

 

Consumption in idle mode is around 25W.
I have a network cable plugged in - this is an important note! 🙂

 

If I enable all ASPM states in the BIOS and run powertop --auto-tune, it has no effect on consumption.

Then I disconnect the network cable, monitor, and keyboard, and power consumption drops to 14W (and I haven't exhausted all the options for optimizing that).

If I then reconnect the network cable, the consumption jumps to 25W.

If I then connect the keyboard and mouse, the consumption stays at 25W.

If I disconnect the network cable, consumption drops to 17W.

If I disconnect the keyboard and mouse, I'm back to 14W.
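For anyone comparing numbers: the active ASPM policy and per-link state can be checked from the shell before and after running `powertop`. A read-only sketch (nothing board-specific; the grep pattern just matches standard `lspci` output):

```shell
# Read-only ASPM sanity checks; run before/after `powertop --auto-tune`
cat /sys/module/pcie_aspm/parameters/policy   # active policy shown in [brackets]
# Per-link ASPM state for every PCIe device (full detail needs root):
lspci -vv 2>/dev/null | grep -E 'ASPM (Disabled|L0s|L1)' || true
```

If the policy file shows `[default]` even after enabling ASPM in the BIOS, the firmware may not be handing ASPM control to the OS.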

 

The board has two 2.5Gb ports, and it doesn't matter whether one cable is connected or both. Consumption is still 25W.

 

The onboard network cards are Intel i226-LM.

EEE is disabled by default. If I try to turn it on, it works fine on one port; when I try it on the other port, it always freezes.
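For reference, EEE can be inspected and toggled per interface with `ethtool`. The interface names `eth0`/`eth1` below are assumptions, not confirmed from this thread; adjust to your system:

```shell
# Inspect the current Energy-Efficient Ethernet state on one i226 port
ethtool --show-eee eth0
# Enabling it (this is the step that reportedly freezes the second port):
# ethtool --set-eee eth0 eee on
```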

 

Does anyone have a similar experience? Do you have any advice or solution?

 

Is it normal for connecting a network cable to make a 10W difference in consumption? (without monitor and keyboard).

 

Link to comment

OK, I bought this board too a few months back and now have everything installed. I was SUPER excited about IPMI, as this is one of my biggest headaches... my server is somewhat out of reach, and when there were issues I would have to get out a keyboard and a small Raspberry Pi screen to troubleshoot...

BUT, as I read here, there are some real issues with this board... either the board itself or how it plays with Unraid...

I have it alongside an i5-14400 with the iGPU...

But when you use the IPMI card, the BIOS settings for iGPU/Multi-Monitor are not there... and of course when you get to Unraid, the Intel GPU plugin won't detect it... meaning you can't use it for Plex transcoding... the whole point was to get rid of my P2000 GPU...

I noticed that when using the IPMI card, the HDMI output doesn't work correctly either... there is only PCIe and one other option in the BIOS... no iGPU...

 

But... remove the IPMI card... and everything works...

Has anyone figured this out? There must be a way to use the iGPU and the IPMI card together...

There must be some way... a dummy HDMI plug, a setting in the IPMI??

 

Thanks for the help.

Link to comment
On 3/7/2024 at 1:53 PM, Riverfrome said:

Has anyone figured this out? There must be a way to use the iGPU and the IPMI card together...

There must be some way... a dummy HDMI plug, a setting in the IPMI??

 

Maybe check this post:

Also, I've read in a few later posts that enabling CSM hides the GPU options you are missing. So that would be my suggestion to check.

Link to comment

You want your Graphics Configuration settings to look like this: 

[screenshot: Graphics Configuration BIOS settings]

 

As above, ensure CSM is disabled. 

 

I've also changed a few other BIOS settings but I don't think they're relevant. Report back if the above doesn't work. 

I spent a bunch of time troubleshooting and finally got it working nicely. I have an HDMI dummy plug, but I don't need to use it with these settings. 

Link to comment

For those of you that have the mATX version of the board (ASUS Pro WS W680M-Ace SE): does anyone have IPMI + iGPU working together? Either I have IPMI all the way into Unraid but get an error from the SR-IOV modprobe, or the IPMI screen stops updating after the initial Unraid boot-selection screen (with SR-IOV working as intended). I don't have a dGPU, and on the mATX board the IPMI is integrated directly, so there is no external IPMI card.

 

I have tried basically all of the combinations (+ dummy plug) in the BIOS but can't seem to set it up correctly. The frustrating part: I had it working before but lost the configuration after I needed to do a CMOS reset. Now I can't seem to get it back.

 

On 2/22/2024 at 11:33 PM, Daniel15 said:

I sometimes use it to watch the boot process (to make sure nothing fails during Unraid startup) and the shutdown process (e.g. to see if VMs aren't shutting down cleanly or filesystems aren't unmounting cleanly), so losing it isn't ideal. I'll try to dig into it when I have time, but it's hard for me to get enough free time to do that.

 

Could you report whether you have that working together with SR-IOV, and if so, what your BIOS settings are?

Edited by HumanTechDesign
Link to comment
8 hours ago, HumanTechDesign said:

For those of you that have the mATX version of the board (ASUS Pro WS W680M-Ace SE): does anyone have the IPMI + iGPU working together?

I do. I've got IPMI + iGPU + SR-IOV all working. I'm still on Unraid 6.12.4 though, and I've heard from other posts in this thread that it breaks if you upgrade to 6.12.6 or above. Maybe a kernel bug? 

 

Unraid is on a fairly old kernel series (6.1.x) so I might try booting a live CD of a different distro with a newer kernel and see if it has the same issue. 

Edited by Daniel15
Link to comment
3 hours ago, Daniel15 said:

I do. I've got IPMI + iGPU + SR-IOV all working. I'm still on Unraid 6.12.4 though, and I've heard from other posts in this thread that it breaks if you upgrade to 6.12.6 or above. Maybe a kernel bug? 

I was searching through some threads after my post and I believe this could be true. I am on 6.12.8, so this change in 6.12.6 could very well affect me, yes. What I don't understand: I was already on 6.12.8 when I installed the board, and I could swear I had it working with all three. Retracing my steps, as far as I remember I only installed the SR-IOV (and gputop etc.) plugins after I had all three working (so the iGPU was recognised in Unraid, but not yet using SR-IOV).

What now breaks the setup seems to be the modified modprobe from the SR-IOV plugin (it throws an error message in the KVM if the iGPU is not recognised). Interestingly, I don't have to blacklist the ASPEED GPU like others in this thread do. I can still get to the point where the plugins are loaded (after the mcelog line).

I hope that either a kernel (or Unraid or BIOS) update fixes it in the future.
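For anyone who does need the blacklist that others in this thread mention, it is usually done by adding `modprobe.blacklist=ast` to the kernel line in syslinux.cfg. A sketch, assuming the stock Unraid flash layout (back up the file first, and check the exact indentation of your `append` line before running the `sed`):

```shell
# Blacklist the ASPEED ast DRM driver at boot on Unraid (sketch; stock path assumed)
CFG=/boot/syslinux/syslinux.cfg
cp "$CFG" "$CFG.bak"                              # keep a backup of the boot config
sed -i '/append /s/$/ modprobe.blacklist=ast/' "$CFG"
grep 'append' "$CFG"                              # verify the change before rebooting
```

This can also be done from the Unraid GUI under Main > Flash > Syslinux Configuration.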

 

Could you still tell me what your GPU settings in the BIOS are?

Link to comment

I have been following this thread and the other threads about the Asus Pro WS W680-ACE IPMI motherboard with great interest. So much so that I pulled the trigger on purchasing it today. Of course, the motherboard appears to be difficult to source right now, so I ordered directly from Asus. Starting a new build. My old UnRAID server is toast, no signs of life, and it's been down for almost 2 years now (retirement and life got in the way 🙂 ). Anyway, my build will be as follows:

 

Fractal Design Define 7 Case

RMx Series™ RM850x — 850 Watt 80 PLUS Gold Fully Modular ATX PSU

Asus Pro WS W680-ACE IPMI

Intel Core i7 14700K

Noctua NH-U12A chromax.Black

64GB Crucial DDR5-4800 UDIMM - CT2K32G48C40U5

WD_BLACK 4TB SN850X NVMe Internal Gaming SSD

3 Seagate IronWolf NAS 12TB drives: 1 for parity, 24TB for storage.

Haven't quite decided on what I'm going to use for a cache drive yet 🙂

 

My requirement is basically backup and storage. I have a Synology NAS doing primary backup and storage now. Once the UnRAID server is up and operational, it will become my main backup/storage and the Synology will be relegated to a backup role.

I have no real requirement for "remote management" but from reading through the threads it looks like the IPMI card provides some valuable stats. I'm not a PC gamer, so the onboard iGPU should be plenty for my needs. I will spin up a Windows 11 VM, a few docker containers and install some of the plugins available.

Anxiously waiting for the parts to arrive so I can get the build started. The posts I've read through so far will hopefully make my install as painless as possible, but that remains to be seen. It's been a minute since I built a PC from scratch. Should be interesting, to say the least. 

 

 

 

 

  • Like 1
Link to comment
On 3/10/2024 at 7:30 PM, Daniel15 said:

Unraid is on a fairly old kernel series (6.1.x) so I might try booting a live CD of a different distro with a newer kernel and see if it has the same issue. 

I did some digging in the kernel commits (I've never done that before, and I don't have experience with Linux internals), but I found this: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=8d6ef26501b97243ee6c16b8187c5b38cb69b77d

If I read this correctly, our issue is actually a feature rather than a bug (if this actually IS the cause). As far as I can tell, it correlates with the kernel timeline in the Unraid releases. There has been further development on the module/driver since then, but it would be interesting to see whether this has been fixed or will stay this way. That would suck, because it takes away a lot of functionality. This comment gives me at least some hope: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/gpu/drm/ast/ast_mode.c?id=8d6ef26501b97243ee6c16b8187c5b38cb69b77d#n1784

 

* FIXME: Remove this logic once user-space compositors can handle more
*        than one connector per CRTC. The BMC should always be connected.

 

If I have the time, I will also boot up a live distro with a later kernel. However, as far as I can tell, ALL boards with a BMC addressed by the ast driver (probably all ASPEED BMCs?) should run into this problem, not just this board. Shouldn't this be popping up across the whole server world?
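If anyone wants to confirm which driver is claiming the BMC on their own box before trying a newer live kernel, something like this should do it (read-only; `1a03:2000` is the usual PCI ID for the ASPEED Graphics Family):

```shell
uname -r                              # Unraid 6.12.x ships a 6.1-series kernel
lspci -nnk -d 1a03:2000               # BMC's VGA device and the driver bound to it
modinfo ast 2>/dev/null | head -n 5   # details of the ast DRM module on this kernel
```

If `lspci -nnk` shows "Kernel driver in use: ast", the BMC display is being handled by the driver this commit touches.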

 

I have started a dedicated thread for that topic here:

 

Edited by HumanTechDesign
  • Upvote 1
Link to comment

 

On 3/11/2024 at 8:31 PM, sldozier said:

Fractal Design Define 7 Case

RMx Series™ RM850x — 850 Watt 80 PLUS Gold Fully Modular ATX PSU

Asus Pro WS W680-ACE IPMI

Intel Core i7 14700K

Noctua NH-U12A chromax.Black

64GB Crucial DDR5-4800 UDIMM - CT2K32G48C40U5

WD_BLACK 4TB SN850X NVMe Internal Gaming SSD

3 Seagate IronWolf NAS 12TB drives, 1 for parity, 24TB's for storage.

Haven't quite decided on what I'm going to use for a cache drive yet 🙂

 

Your hardware looks good.

 

Since you're getting this board, you may as well get ECC RAM. That RAM you're getting is not ECC RAM - note that "on die ECC" is not the same thing as regular ECC, even if some manufacturers try to advertise it as such. All DDR5 has on-die ECC, and its purpose is mostly to increase manufacturing yield. It doesn't give you the protection that regular ECC does. 2 x 32GB Kingston KSM48E40BD8KM-32HM or KSM48E40BD8KI-32HA doesn't cost much more than the Crucial RAM you picked, and it supports ECC.
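Once ECC DIMMs are in, it's worth verifying the board actually runs them in ECC mode rather than silently falling back. A quick check (needs root, and assumes `dmidecode` is installed):

```shell
# "Error Correction Type" should read "Single-bit ECC" or "Multi-bit ECC",
# not "None", once ECC UDIMMs are installed and enabled
dmidecode -t memory | grep -i 'error correction'
```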

 

850W PSU is overkill for a server, even with a CPU like the 14700K, unless you're planning on putting a high-end graphics card in it. PSUs are usually pretty inefficient if you're only using less than 20-25% of their capacity and using a smaller PSU will save you some money (both in the cost of the PSU, and the cost of electricity). I guess you may want to have a higher-power PSU in case you ever reuse it for something else in the future?

 

I'm running my server on a 550W PSU because it's the smallest I can find, and even that is overkill (at least for me) since power draw at the wall for my system is always less than 150W. 400-450W is usually good for a server, but now that GPUs consume huge amounts of power, it's very difficult to find 'regular' PSUs that are that size, only server PSUs (which are small and have very loud fans since they're designed for 1U rackmount servers).

Edited by Daniel15
  • Thanks 1
Link to comment
On 3/16/2024 at 8:22 PM, Daniel15 said:

 

 

Your hardware looks good.

 

Since you're getting this board, you may as well get ECC RAM. That RAM you're getting is not ECC RAM - note that "on die ECC" is not the same thing as regular ECC, even if some manufacturers try to advertise it as such. All DDR5 has on-die ECC, and its purpose is mostly to increase manufacturing yield. It doesn't give you the protection that regular ECC does. 2 x 32GB Kingston KSM48E40BD8KM-32HM or KSM48E40BD8KI-32HA doesn't cost much more than the Crucial RAM you picked, and it supports ECC.

 

850W PSU is overkill for a server, even with a CPU like the 14700K, unless you're planning on putting a high-end graphics card in it. PSUs are usually pretty inefficient if you're only using less than 20-25% of their capacity and using a smaller PSU will save you some money (both in the cost of the PSU, and the cost of electricity). I guess you may want to have a higher-power PSU in case you ever reuse it for something else in the future?

 

I do realize this is way overkill for my initial goal for UnRAID, but I wanted to future-proof the build. I went back and forth on the Kingston ECC RAM, but decided to go with Crucial. I'll probably end up upgrading to the Kingston ECC anyway 😞 in the future 🤔. I have a 775W Thermaltake Toughpower XT PSU (still probably overkill) from my dead UnRAID install that ran continuously since 2015, back when everything was still Lime Tech branded. It died sometime during my retirement move from the Mid-Atlantic to the South 😞. I'll probably go with some form of eGPU in the future, but for now I'll use the motherboard's iGPU. But I digress. I do appreciate the review of the server components and the recommendations. I have most of the parts in, so sometime soon I'll start the build.

 

Link to comment
  • 2 weeks later...

@Lolight Silent corruption definitely has to do with ECC. With non-ECC RAM, data can get corrupted in memory before being written to disk, or corrupted when being transferred from the memory to the CPU. This is silent corruption because there's no way to tell that it happened, since a checksum written to disk (like what ZFS does for bitrot protection) will be a checksum of the corrupted data.

 

ECC DIMMs store 8 extra check bits for every 64-bit word (8 bytes) of data. That allows the memory controller to detect and automatically correct an error in any single bit of a word, and to detect (but not correct) errors in two bits. Corruption of more than a single bit per word is very rare. 
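On Linux you can actually watch this happening: corrected vs. uncorrected events show up as counters in the kernel's EDAC sysfs interface (paths assume the EDAC driver has loaded for your memory controller):

```shell
# CE = corrected (single-bit) errors, UE = uncorrected (multi-bit) errors,
# per memory controller; both should normally stay at 0
grep -H . /sys/devices/system/edac/mc/mc*/ce_count \
          /sys/devices/system/edac/mc/mc*/ue_count 2>/dev/null || true
```

A slowly climbing `ce_count` means ECC is quietly saving you; any nonzero `ue_count` means data was lost despite ECC.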

Link to comment
On 2/18/2024 at 7:34 PM, jakea333 said:

Glad it worked out. I am curious as to the root cause as well, as other boards with the Aspeed BMC don't seem to suffer in the same way. It's beyond my ability to troubleshoot, but I know that something changed between 6.12.4 and 6.12.6 that introduced this bug for me.

 

Maybe someone else can identify the specific fix that's needed. I'm planning to leave it blacklisted and check after each Unraid release. Hopefully it's fixed in time with kernel updates.

Have you tried 6.12.9 to see if the Aspeed blacklist is still required?

Link to comment
On 3/31/2024 at 12:24 PM, mikeyosm said:

Have you tried 6.12.9 to see if the Aspeed blacklist is still required?

 

On 4/1/2024 at 4:02 AM, jakea333 said:

Yes, I updated with the same symptoms.

 

If this is related to the mATX version of the board, we are discussing this problem in this thread:

 

Link to comment
  • 2 weeks later...

Hi all,

 

I'm about ready to pull the trigger on this motherboard, but I wanted to clarify one thing about the ongoing ast issue before I do. This is only an issue if I need the IPMI to access the Unraid console, right? Otherwise I can use the web interface without issue? Are there any other limitations currently?

 

Thank you!

 

 

Link to comment
  • 2 weeks later...

I also plan to buy this board and would like to pass my graphics card through from one of the PCIe 5.0 slots to my VDI. That naturally rules out both of those slots, leaving only the two PCIe 3.0 x4 slots. However, I still need a slot for an LSI 9201-16i HBA (PCIe 2.0 x8). Would it still function properly in one of the PCIe 3.0 slots, or should I consider opting for an LSI 9600-series card instead?

Link to comment
On 4/22/2024 at 12:43 AM, Cout99 said:

I also plan to buy this board and would like to pass my graphics card through from one of the PCIe 5.0 slots to my VDI. That naturally rules out both of those slots, leaving only the two PCIe 3.0 x4 slots. However, I still need a slot for an LSI 9201-16i HBA (PCIe 2.0 x8). Would it still function properly in one of the PCIe 3.0 slots, or should I consider opting for an LSI 9600-series card instead?

 

Are you ruling out both PCIe 5.0 slots solely due to the height of the card? You might be able to find a riser cable to make both slots accessible.
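On bandwidth, a rough back-of-envelope check (my arithmetic, not from this thread): the 9201-16i negotiates PCIe 2.0, so in a 3.0 x4 slot it links at 2.0 x4, which is still roughly enough for 16 spinning disks:

```shell
# PCIe 2.0 carries ~500 MB/s per lane after 8b/10b encoding overhead
echo $(( 4 * 500 ))    # x4 link budget: 2000 MB/s
echo $(( 16 * 125 ))   # 16 HDDs streaming ~125 MB/s each: 2000 MB/s
```

So the link only becomes the bottleneck when all 16 drives stream flat out at once (e.g. a parity check), and even then only marginally. Whether the x8 card physically fits depends on the x4 slot being x16-length or open-ended.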

 

Link to comment
