HP Proliant / Workstation & unRaid Information Thread


1812


Maybe someone in here can help me with an issue I am having. Ever since upgrading to 6.6.1, my HP ML10 v2 has had a strange issue: the CPU temperature in unRAID reads 80°C under full load (this is normal for this machine), but HP iLO only reads 50°C, so the fans stay at 20%. I am just trying to figure out what the cause could be and whether it's something with unRAID or my server itself. I am running the latest BIOS.

 

Link to comment
10 hours ago, ucliker said:

Maybe someone in here can help me with an issue I am having. Ever since upgrading to 6.6.1, my HP ML10 v2 has had a strange issue: the CPU temperature in unRAID reads 80°C under full load (this is normal for this machine), but HP iLO only reads 50°C, so the fans stay at 20%. I am just trying to figure out what the cause could be and whether it's something with unRAID or my server itself. I am running the latest BIOS.

 

I've had inconsistent temp readings from unRAID and HP servers/workstations since around/after 6.2. Even when using the IPMI plugin, the temp doesn't exactly align with coretemp, which usually doesn't work now with Dynamix System Temp. This is from an ML30 G9:

[Screenshot taken 2018-12-28]

I would trust iLO. There may be something that Limetech could load in the OS, but I haven't spent any time researching what that could be.

 

 

Link to comment
4 hours ago, 1812 said:

I've had inconsistent temp readings from unRAID and HP servers/workstations since around/after 6.2. Even when using the IPMI plugin, the temp doesn't exactly align with coretemp, which usually doesn't work now with Dynamix System Temp. This is from an ML30 G9:

[Screenshot taken 2018-12-28]

I would trust iLO. There may be something that Limetech could load in the OS, but I haven't spent any time researching what that could be.

 

 

I appreciate your input. The problem I'm having is that the fans no longer spin up when the system is under full load. In the past, when HandBrake was converting movies, the fans would spin up, but now that doesn't happen. iLO reads 40 degrees and unRAID reads 87 degrees, and I know that in this use case unRAID is the one that's correct. It's just strange.

Link to comment
1 minute ago, 1812 said:

Please define the problem. Since it’s not the onboard nic, you may just have a general non-hp issue.

Well, I am getting very slow speeds through the extra network card on my riser. After much validation, the issue is not with any of the hardware: all the NICs and cables work fine. However, when I connect my server to my PC, the speeds dip during an iperf test, starting at 20 Mbps and dropping as low as 0.

Link to comment
1 minute ago, MarkPla7z said:

Well, I am getting very slow speeds through the extra network card on my riser. After much validation, the issue is not with any of the hardware: all the NICs and cables work fine. However, when I connect my server to my PC, the speeds dip during an iperf test, starting at 20 Mbps and dropping as low as 0.

I just replied in your other thread. I don’t think it’s an hp issue. I could be wrong, but I haven’t had issues using mellanox cards in hp servers.

Link to comment
1 minute ago, 1812 said:

I just replied in your other thread. I don’t think it’s an hp issue. I could be wrong, but I haven’t had issues using mellanox cards in hp servers.

I haven't had any issues with my Mellanox cards in my hp server. For a while, I had 2 cards going to two separate desktops. It sounds like it's a Mellanox issue. I actually had horrible transfer speeds even when using a RAM disk until I updated the firmware on one of my cards. 

Link to comment
13 minutes ago, ucliker said:

I haven't had any issues with my Mellanox cards in my hp server. For a while, I had 2 cards going to two separate desktops. It sounds like it's a Mellanox issue. I actually had horrible transfer speeds even when using a RAM disk until I updated the firmware on one of my cards. 

Now that I remember, I did have a Mellanox card die 2 years ago... and it had performance issues before it completely tanked.

Link to comment
2 minutes ago, 1812 said:

Now that I remember, I did have a Mellanox card die 2 years ago... and it had performance issues before it completely tanked.

I have had one die on me but it was always running extremely hot. These cards are used enterprise cards that got hammered for years so them dying is expected. 

Link to comment
  • 5 months later...

 

I am trying to add some new hard drives to my server to replace the ones that are going bad, but for some reason the new drives are not showing up in Unassigned Devices. I bought three different hard drives, and the third one is the same brand as all the rest of the drives in this server, but it's still not showing up for me to preclear. The only way the drives show up is if I load them in a USB docking station. Does anyone have any suggestions on what I should do?

Link to comment
2 hours ago, Jmoney said:

 

I am trying to add some new hard drives to my server to replace the ones that are going bad, but for some reason the new drives are not showing up in Unassigned Devices. I bought three different hard drives, and the third one is the same brand as all the rest of the drives in this server, but it's still not showing up for me to preclear. The only way the drives show up is if I load them in a USB docking station. Does anyone have any suggestions on what I should do?

Hard to help without knowing your setup. Post your diagnostics zip file.

Link to comment
2 minutes ago, Jmoney said:

thanks here they are 

 

/config
copy all *.cfg files, go file and the super.dat file. These are configuration files.

/config/shares
copy all *.cfg files. These are user share settings files.

Syslog file(s)
copy the current syslog file and any previous existing syslog files.

System
save output of the following commands:
lsscsi, lspci, lsusb, free, lsof, ps, ethtool & ifconfig.
display of iommu groups.
display of command line parameters (e.g. pcie acs override, pci stubbing, etc).
save system variables.

SMART reports
save a SMART report of each individual disk present in your system.

Docker
save files docker.log, libvirtd.log and libvirt/qemu/*.log.

 

The diagnostics are a ZIP file containing many other files. The expectation is that you will post the entire ZIP file to the forum.

Link to comment
  • 2 months later...

Looking for any advice...

 

I have a new microserver gen10, running the latest unRAID. I have a Windows VM working (WHS 2011, as it's the only ISO I have on hand and was eager to try out creating my first VM).

 

I have a PCIe USB 3 card installed, which supports reset (SpaceInvader One's videos are invaluable!). I can pass the controller through to the VM and hot-swap USB drives all day long and access them in the VM, that is, until I stop the VM. If I try to start the VM again without rebooting the server first, unRAID freezes.

 

I learned how to set up the syslog server and save the log to a share, but nothing that seems related to the issue has a chance to get logged (that said, I am a total Linux/unRAID noob, so maybe I am missing something).

 

Link to comment
20 hours ago, ddot said:

Looking for any advice...

 

I have a new microserver gen10, running the latest unRAID. I have a Windows VM working (WHS 2011, as it's the only ISO I have on hand and was eager to try out creating my first VM).

 

I have a PCIe USB 3 card installed, which supports reset (SpaceInvader One's videos are invaluable!). I can pass the controller through to the VM and hot-swap USB drives all day long and access them in the VM, that is, until I stop the VM. If I try to start the VM again without rebooting the server first, unRAID freezes.

 

I learned how to set up the syslog server and save the log to a share, but nothing that seems related to the issue has a chance to get logged (that said, I am a total Linux/unRAID noob, so maybe I am missing something).

 

post your full diagnostics zip

Link to comment
3 hours ago, ddot said:

I tried different combinations of ACS override and unsafe interrupts to no avail; I undid all of that and the server is back to where it was (the VM can be stopped and started at will, but has no USB controller).

 

Thanks!

unraidserver-diagnostics-20190929-1724.zip

In your syslinux.cfg, add the following between append and initrd=/bzroot:

 

 vfio-pci.ids=1b73:1100

then reboot.
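
For anyone unsure where the parameter goes, here is a rough Python sketch of the edit on a single append line. It only works on a string, it does not touch the flash drive, and the function name add_kernel_params is made up for illustration. It assumes the line looks like the stock "append initrd=/bzroot" entry.

```python
def add_kernel_params(append_line: str, params: list[str]) -> str:
    """Insert params right after 'append', ahead of initrd=/bzroot.

    Hypothetical helper for illustration; parameters already on the
    line are not added a second time.
    """
    # Preserve whatever indentation the original line had.
    indent = append_line[: len(append_line) - len(append_line.lstrip())]
    tokens = append_line.split()
    if not tokens or tokens[0] != "append":
        raise ValueError("expected a line starting with 'append'")
    existing = set(tokens[1:])
    new = [p for p in params if p not in existing]
    # New parameters go directly after 'append'; the rest keep their order.
    return indent + " ".join([tokens[0]] + new + tokens[1:])


print(add_kernel_params("append initrd=/bzroot", ["vfio-pci.ids=1b73:1100"]))
```

Running it on the stock line gives "append vfio-pci.ids=1b73:1100 initrd=/bzroot", which is the shape the entry should have after editing.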

 

 

Also, I think you have a disk issue; your syslog is spammed with this:

 

Sep 29 04:40:25 UNRAIDSERVER ntfs-3g[5114]: ntfs_attr_pread_i: ntfs_pread failed: Input/output error
Sep 29 04:40:25 UNRAIDSERVER ntfs-3g[5114]: Failed to read index block: Input/output error
Sep 29 04:40:25 UNRAIDSERVER kernel: Buffer I/O error on dev sdg2, logical block 36, async page read
Sep 29 04:40:35 UNRAIDSERVER ntfs-3g[5114]: ntfs_attr_pread_i: ntfs_pread failed: Input/output error
Sep 29 04:40:35 UNRAIDSERVER ntfs-3g[5114]: Failed to read index block: Input/output error
Sep 29 04:40:35 UNRAIDSERVER kernel: Buffer I/O error on dev sdg2, logical block 36, async page read
Sep 29 04:40:45 UNRAIDSERVER ntfs-3g[5114]: ntfs_attr_pread_i: ntfs_pread failed: Input/output error
Sep 29 04:40:45 UNRAIDSERVER ntfs-3g[5114]: Failed to read index block: Input/output error
Sep 29 04:40:45 UNRAIDSERVER kernel: Buffer I/O error on dev sdg2, logical block 36, async page read
Sep 29 04:40:55 UNRAIDSERVER ntfs-3g[5114]: ntfs_attr_pread_i: ntfs_pread failed: Input/output error
Sep 29 04:40:55 UNRAIDSERVER ntfs-3g[5114]: Failed to read index block: Input/output error
Sep 29 04:40:55 UNRAIDSERVER kernel: Buffer I/O error on dev sdg2, logical block 36, async page read
Sep 29 04:41:05 UNRAIDSERVER ntfs-3g[5114]: ntfs_attr_pread_i: ntfs_pread failed: Input/output error
Sep 29 04:41:05 UNRAIDSERVER ntfs-3g[5114]: Failed to read index block: Input/output error
Sep 29 04:41:05 UNRAIDSERVER kernel: Buffer I/O error on dev sdg2, logical block 36, async page read
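
To see at a glance which device the spam points at, something like the following Python sketch can tally the kernel "Buffer I/O error" lines per device. It assumes the syslog lines follow the format shown above; the function name count_io_errors and the regular expression are illustrative only.

```python
import re
from collections import Counter

# Matches kernel lines like: "kernel: Buffer I/O error on dev sdg2, logical block 36, ..."
BUFFER_IO = re.compile(r"kernel: Buffer I/O error on dev (\w+),")


def count_io_errors(lines):
    """Return a Counter mapping device name -> number of Buffer I/O error lines."""
    counts = Counter()
    for line in lines:
        match = BUFFER_IO.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


sample = [
    "Sep 29 04:40:25 UNRAIDSERVER ntfs-3g[5114]: Failed to read index block: Input/output error",
    "Sep 29 04:40:25 UNRAIDSERVER kernel: Buffer I/O error on dev sdg2, logical block 36, async page read",
    "Sep 29 04:40:35 UNRAIDSERVER kernel: Buffer I/O error on dev sdg2, logical block 36, async page read",
]
print(count_io_errors(sample))
```

To run it against a real log, pass the lines of the syslog file from the diagnostics zip instead of the sample list.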

 

Link to comment
23 minutes ago, 1812 said:

In your syslinux.cfg, add the following between append and initrd=/bzroot:

 


 vfio-pci.ids=1b73:1100

then reboot.

 

 

Also, I think you have a disk issue; your syslog is spammed with this:

 


Sep 29 04:40:25 UNRAIDSERVER ntfs-3g[5114]: ntfs_attr_pread_i: ntfs_pread failed: Input/output error
Sep 29 04:40:25 UNRAIDSERVER ntfs-3g[5114]: Failed to read index block: Input/output error
Sep 29 04:40:25 UNRAIDSERVER kernel: Buffer I/O error on dev sdg2, logical block 36, async page read
Sep 29 04:40:35 UNRAIDSERVER ntfs-3g[5114]: ntfs_attr_pread_i: ntfs_pread failed: Input/output error
Sep 29 04:40:35 UNRAIDSERVER ntfs-3g[5114]: Failed to read index block: Input/output error
Sep 29 04:40:35 UNRAIDSERVER kernel: Buffer I/O error on dev sdg2, logical block 36, async page read
Sep 29 04:40:45 UNRAIDSERVER ntfs-3g[5114]: ntfs_attr_pread_i: ntfs_pread failed: Input/output error
Sep 29 04:40:45 UNRAIDSERVER ntfs-3g[5114]: Failed to read index block: Input/output error
Sep 29 04:40:45 UNRAIDSERVER kernel: Buffer I/O error on dev sdg2, logical block 36, async page read
Sep 29 04:40:55 UNRAIDSERVER ntfs-3g[5114]: ntfs_attr_pread_i: ntfs_pread failed: Input/output error
Sep 29 04:40:55 UNRAIDSERVER ntfs-3g[5114]: Failed to read index block: Input/output error
Sep 29 04:40:55 UNRAIDSERVER kernel: Buffer I/O error on dev sdg2, logical block 36, async page read
Sep 29 04:41:05 UNRAIDSERVER ntfs-3g[5114]: ntfs_attr_pread_i: ntfs_pread failed: Input/output error
Sep 29 04:41:05 UNRAIDSERVER ntfs-3g[5114]: Failed to read index block: Input/output error
Sep 29 04:41:05 UNRAIDSERVER kernel: Buffer I/O error on dev sdg2, logical block 36, async page read

 

Unfortunately, that's where I started: adding vfio-pci.ids=1b73:1100, which does pass the card through to the VM, but restarting the VM crashes unRAID. After that I tried the different ACS override options in the VM Manager settings, the unsafe interrupts option, and combinations of both. I also tried pcie_acs_override=id:1b73:1100; everything leads to a server crash when restarting the VM.

 

Concerning the spam: I did have an NTFS external drive attached, and I think it was device sdg, but it's been unplugged since yesterday. I was trying the Libvirt Hotplug USB plugin, so maybe it's related to that.

Link to comment

This is a shot in the dark, but are you using legacy or UEFI boot for Unraid? Whichever you're using, consider switching to the other.

 

looking back at your diagnostics again I see you've tried

 

iommo=on iommu=pt

The first one is misspelled (iommo instead of iommu) and won't do anything anyway. But I've never had to use that on any ProLiant server or workstation, only a combination of allowing unsafe interrupts and/or some variation of ACS override.
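
A typo like that is easy to miss. As a rough sketch, a few lines of Python can flag parameter names that are near-misses of ones you meant to type. The KNOWN_KEYS list is an assumption for illustration (just the names that came up in this thread), not a complete list of kernel parameters, and the function name is made up.

```python
import difflib

# Assumed list for illustration: only the parameter names mentioned in this thread.
KNOWN_KEYS = ["iommu", "vfio-pci.ids", "pcie_acs_override", "initrd"]


def suspicious_params(cmdline: str, known=KNOWN_KEYS):
    """Return (token, closest_known_key) pairs for names that look like typos."""
    flagged = []
    for token in cmdline.split():
        key = token.split("=", 1)[0]
        if key in known:
            continue  # spelled exactly like a known name
        close = difflib.get_close_matches(key, known, n=1, cutoff=0.75)
        if close:
            flagged.append((token, close[0]))
    return flagged


print(suspicious_params("iommo=on iommu=pt"))
```

On the two parameters above, this flags iommo=on as a likely misspelling of iommu and leaves iommu=pt alone. It says nothing about whether a correctly spelled parameter has a valid value.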

 

One of the first issues we need to address is that your IOMMU groupings are not separated enough to isolate just the USB card.

 

/sys/kernel/iommu_groups/2/devices/0000:00:03.0
/sys/kernel/iommu_groups/2/devices/0000:00:03.1
/sys/kernel/iommu_groups/2/devices/0000:03:00.0

which is

00:03.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 15h (Models 60h-6fh) Host Bridge [1022:157b]
00:03.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 15h (Models 60h-6fh) Processor Root Port [1022:157c]
03:00.0 USB controller [0c03]: Fresco Logic FL1100 USB 3.0 Host Controller [1b73:1100] (rev 10)
	Subsystem: Fresco Logic FL1100 USB 3.0 Host Controller [1b73:1100]
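
If you want to check the grouping yourself after each change, here is a small Python sketch that reads the same /sys/kernel/iommu_groups tree and reports what else shares a group with a given device. The function names are made up for illustration, and it assumes the usual sysfs layout of group/devices/address.

```python
from pathlib import Path


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Return {group_number: [pci_address, ...]} read from sysfs (empty if absent)."""
    groups = {}
    base = Path(root)
    if not base.is_dir():
        return groups
    for group_dir in base.iterdir():
        devices_dir = group_dir / "devices"
        if group_dir.name.isdigit() and devices_dir.is_dir():
            groups[int(group_dir.name)] = sorted(p.name for p in devices_dir.iterdir())
    return groups


def shares_group(groups, address):
    """Return (group_number, other_devices) for the given PCI address, or None."""
    for number, devices in groups.items():
        if address in devices:
            return number, [d for d in devices if d != address]
    return None


if __name__ == "__main__":
    # 0000:03:00.0 is the Fresco Logic USB controller from the listing above.
    print(shares_group(iommu_groups(), "0000:03:00.0"))
```

With the grouping shown above, the USB card would come back in group 2 alongside 0000:00:03.0 and 0000:00:03.1. An empty "other devices" list would mean the card is isolated.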

 

While the common knowledge is that you must pass through all the devices in an IOMMU group, I feel like there have been times where I was able to pass through just a single item in a group when the other item was a bridge. I'm not sure why I remember doing that once, but regardless, if you are able to do this, then restarting or shutting down a VM in this situation could be the cause.

 

 

please do the following: 

 

 

make sure your syslog server is exporting/writing to your flash drive (it shows it is unable to communicate to a network device)

modify the append line in your syslinux.cfg to include

 

vfio-pci.ids=1b73:1100 pcie_acs_override=multifunction 

reboot

click on the info tab and verify the following:

HVM: Enabled
IOMMU: Enabled

if they are not, then you need to go into your bios and make sure all virtualization settings are on. If they are both enabled, then:

 

download a copy of diagnostics.zip file (this is a pre-vm look to make sure everything is configured correctly)

go to the vm icon, open the logs window (which will show you what is going on with the vm)

start the vm with the usb card assigned to it

once the vm is up, ensure it is working properly with the usb card

shut down the vm

restart the vm

 

if the vm starts with usb, then congratulations!

 

if the vm doesn't start and shows a different error, but the server continues to work, download a new diagnostics zip after that occurs and upload only that, not the pre-vm start diags

 

if the vm doesn't start and the server locks up, copy the text from the open vm logs window into a text file

if an error shows on the screen before lockup, note that as well

reboot the server

access the flash drive where the syslog server writes and gather that text file

upload the pre-vm boot diagnostics zip, the text copy of the vm log, and the syslog server file

 

 

Link to comment
