Corsair H100i v2 - Stuck at 100c after hard drive swap


Vanum


Good morning all! 

 

This is my last hail Mary before I switch to air cooling that can be easily controlled by Unraid. Maybe with your brains combined, you can help me address this issue. 

 

I shut down my Unraid server last night to install a new 8TB drive (white label WD, Black Friday deal). I unplugged the old drive after transferring everything off of it and plugged in the new drive. I booted into the BIOS to make sure I could see it and then proceeded to launch into Unraid to configure it. I had some problems getting into the GUI right off the bat, so I restarted the machine a couple of times. 

 

Finally, I was able to get into the GUI and configure the drive. I noticed the CPU temps were sitting at 100c and my AIO fans on the radiator were going full blast. I shut down the Unraid server and went back to the BIOS. I was unable to alter the fans by any means from the BIOS. 

 

Since I couldn't control the fans there, I booted the server back up and jumped into my Windows VM. I have the iCue software that controls the AIO water cooler installed on this VM, so I attempted to control it from there. Same result. The fans and pump do change their profiles, and iCue reads the CPU temp as 72c instead of 100c.

 

Extra notes: 

 

  • This incident is not the first time this has happened and it happened exactly the same way another time. I was replacing a drive with a new larger drive, booted up the server to configure, and the temps are stuck at 100c. I replaced the AIO thinking it was just a bad AIO. But now, it has me wondering.
  •  I've had the first AIO for almost a year and a half. I've had the second one about 3 months. 
  • I am aware that I can boot into Windows using bare metal and install iCue, but I am trying to configure/fix the issue without doing that if possible.
  • I am trying to figure out how replacing a hard drive and my AIO ceasing to work could be correlated. I haven't been able to find a thread that connects them.
  • The CPU is actually being throttled down to about 2 GHz from my normal 3.7 GHz/4.3 GHz
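
For anyone who wants to confirm the throttling from the Unraid console, the per-core clocks can be read from /proc/cpuinfo. This is a minimal Python sketch, assuming the kernel exposes the usual `cpu MHz` field (as x86 kernels generally do); the function name `core_frequencies_mhz` is made up for illustration.

```python
from pathlib import Path


def core_frequencies_mhz(cpuinfo_text: str) -> list[float]:
    """Return the current clock of each logical CPU, in MHz.

    Expects text in the layout of /proc/cpuinfo, where each logical CPU
    has a line such as "cpu MHz         : 2000.000".
    """
    freqs = []
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "cpu MHz":
            freqs.append(float(value))
    return freqs


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    if path.exists():
        freqs = core_frequencies_mhz(path.read_text())
        if freqs:
            print(f"min {min(freqs):.0f} MHz, max {max(freqs):.0f} MHz")
```

If every core sits near 2000 MHz under load, that matches the throttling described above.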

 

Things I've Tried

  • Thermal paste looks good
  • Screws are tight on the mobo around the heatsink
  • I'm not using a washer to make them fit better
  • CPU fans are controlled by PWM instead of Auto
  • CPU fans are set to Full per Corsair's guidelines (changing this to any other setting doesn't change anything)
  • Created custom fan profiles on BIOS or within iCue
  • Reinstalled the drivers for the pump
  • Reinstalled iCue
  • Reinstalled the Corsair Link software (older version to manage pumps)
  • Unplugged the fan connector and USB header connected to the heatsink and replugged them
  • BIOS firmware upgrade (stock profile loaded afterward)
  • Unplugged the hard drive and attempted to boot without it (no change)
  • Radiator and fans are clean, but I did blow some air through each of them.

 

I was working on it all last night, but I did have to concede as it was approaching 2:00 AM. I would love for some suggestions to try when I get home tonight. 

 

Thank you in advance for your help! I am sure the others who haven't spoken up about their issues will appreciate it as well.

 

-Vanum

 

Diagnostics package attached!

 

 

azeroth-diagnostics-20181127-0410.zip

Link to comment

Yeah, I have considered it. I figure it would at least fail while it was running though, and not just after a restart. 

 

Also, I just checked the logs and I am getting IRQ 16: nobody cared. I've heard this can really slow down some devices. I wonder if that is somehow screwing with my devices. 

 

Nov 27 03:28:16 Azeroth kernel: irq 16: nobody cared (try booting with the "irqpoll" option)
Nov 27 03:28:16 Azeroth kernel: CPU: 8 PID: 0 Comm: swapper/8 Not tainted 4.18.8-unRAID #1
Nov 27 03:28:16 Azeroth kernel: Hardware name: Gigabyte Technology Co., Ltd. Z370 AORUS Gaming 7/Z370 AORUS Gaming 7, BIOS F11 10/30/2018
Nov 27 03:28:16 Azeroth kernel: Call Trace:
Nov 27 03:28:16 Azeroth kernel: <IRQ>
Nov 27 03:28:16 Azeroth kernel: dump_stack+0x5d/0x79
Nov 27 03:28:16 Azeroth kernel: __report_bad_irq+0x32/0xac
Nov 27 03:28:16 Azeroth kernel: note_interrupt+0x1d3/0x224
Nov 27 03:28:16 Azeroth kernel: handle_irq_event_percpu+0x4c/0x6a
Nov 27 03:28:16 Azeroth kernel: handle_irq_event+0x33/0x51
Nov 27 03:28:16 Azeroth kernel: handle_fasteoi_irq+0x8c/0xfc
Nov 27 03:28:16 Azeroth kernel: handle_irq+0x1c/0x1f
Nov 27 03:28:16 Azeroth kernel: do_IRQ+0x43/0xbf
Nov 27 03:28:16 Azeroth kernel: common_interrupt+0xf/0xf
Nov 27 03:28:16 Azeroth kernel: </IRQ>
Nov 27 03:28:16 Azeroth kernel: RIP: 0010:cpuidle_enter_state+0xe8/0x141
Nov 27 03:28:16 Azeroth kernel: Code: ff 45 84 ff 74 1d 9c 58 0f 1f 44 00 00 0f ba e0 09 73 09 0f 0b fa 66 0f 1f 44 00 00 31 ff e8 fb bc be ff fb 66 0f 1f 44 00 00 <48> 2b 1c 24 b8 ff ff ff 7f 48 b9 ff ff ff ff f3 01 00 00 48 39 cb 
Nov 27 03:28:16 Azeroth kernel: RSP: 0018:ffffc90003227ea0 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffdc
Nov 27 03:28:16 Azeroth kernel: RAX: ffff88084ec20c00 RBX: 000000032d18fc58 RCX: 000000000000001f
Nov 27 03:28:16 Azeroth kernel: RDX: 000000032d18fc58 RSI: 0000000000000000 RDI: 0000000000000000
Nov 27 03:28:16 Azeroth kernel: RBP: ffff88084ec28e40 R08: 000000258c55163c R09: 0000000000000002
Nov 27 03:28:16 Azeroth kernel: R10: 0000000000000000 R11: 071c71c71c71c71c R12: 0000000000000001
Nov 27 03:28:16 Azeroth kernel: R13: 0000000000000001 R14: ffffffff81e57778 R15: 0000000000000000
Nov 27 03:28:16 Azeroth kernel: do_idle+0x192/0x20e
Nov 27 03:28:16 Azeroth kernel: cpu_startup_entry+0x6a/0x6c
Nov 27 03:28:16 Azeroth kernel: start_secondary+0x197/0x1b2
Nov 27 03:28:16 Azeroth kernel: secondary_startup_64+0xa5/0xb0
Nov 27 03:28:16 Azeroth kernel: handlers:
Nov 27 03:28:16 Azeroth kernel: [<00000000a3d7759a>] i801_isr [i2c_i801]
Nov 27 03:28:16 Azeroth kernel: Disabling IRQ #16

 

Link to comment

I would expect a bad pump to fail to start when powered up, because it can't overcome the static friction. If it happens to start it will probably keep running. I'd get a decent air cooler instead. That would be more than adequate as long as you're not overclocking - which you shouldn't do to a server, anyway.

 

The IRQ issue is unrelated and whether you can ignore it depends on what, if anything, is using IRQ 16. If it's an unused USB controller then it doesn't really matter. If it's your NIC or your HBA then performance will suffer.
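
To see what else, if anything, shares IRQ 16 with the `i2c_i801` handler named in the log, /proc/interrupts lists the registered names for each line. Below is a rough Python sketch, assuming the common layout of one counter column per CPU followed by the controller name and the trigger type (the exact columns can differ between kernels). `handlers_for_irq` is a hypothetical helper name, and the sample row in the docstring is illustrative, not taken from this server.

```python
from pathlib import Path


def handlers_for_irq(interrupts_text: str, irq: int) -> list[str]:
    """Return the device/driver names registered on one IRQ line.

    Expects text in the layout of /proc/interrupts: a header row of CPU
    columns, then rows such as
    " 16:  12  0  IR-IO-APIC  16-fasteoi  i801_smbus, ehci_hcd:usb1".
    """
    lines = interrupts_text.splitlines()
    if not lines:
        return []
    cpu_count = len(lines[0].split())  # header row: CPU0 CPU1 ...
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        if label.strip() != str(irq):
            continue
        fields = rest.split()
        # Skip the per-CPU counters, then the controller and trigger columns.
        names = " ".join(fields[cpu_count + 2:])
        return [n.strip() for n in names.split(",") if n.strip()]
    return []


if __name__ == "__main__":
    path = Path("/proc/interrupts")
    if path.exists():
        print(handlers_for_irq(path.read_text(), 16))
```

If the only name on the line is the SMBus driver, the disabled IRQ is unlikely to be hurting storage or network throughput. If a NIC or HBA driver shows up alongside it, that is the case to worry about.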

Link to comment

Hey John_M,

 

Thank you for the quick response.

 

I didn't mention that the pump is reporting water is pumping according to the BIOS and iCue. Even so, I am still at a dead end as to why it wouldn't be cooling if both the fan and the pump are reporting numbers. 

 

I had this issue a while back when I attempted to install an expansion card for more SATA ports. It crippled the performance of my 1080 Ti, so I took it out and just got bigger drives instead. I didn't know if maybe there was a connection. 

Link to comment

If your CPU temps are sitting at 100C for more than a few seconds, it's probably something wrong with temp sensing. There are certain CPU thermal events that trigger motherboard alarms, typically called THERM#, ALERT#, and THERMTRIP#. Most CPUs can't reach 100C without tripping the last one, which sends a signal to the motherboard to shut down. So if it sits at that temp for a while, the sensor is most likely returning an incorrect temperature to the system; otherwise the machine would just turn itself off.
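
One way to check for a sensing problem is to dump every temperature the kernel exposes and compare the readings against each other. Here is a minimal Python sketch, assuming the standard Linux hwmon sysfs layout (`/sys/class/hwmon/hwmonN/tempN_input` in millidegrees Celsius, with optional `name` and `tempN_label` files). `read_hwmon_temps` is a made-up helper name, and which chips appear depends on the sensor drivers that are loaded.

```python
from pathlib import Path


def read_hwmon_temps(root: str = "/sys/class/hwmon") -> dict[str, float]:
    """Map "chip/label" to degrees Celsius for every tempN_input under root."""
    readings = {}
    for chip_dir in sorted(Path(root).glob("hwmon*")):
        name_file = chip_dir / "name"
        chip = name_file.read_text().strip() if name_file.exists() else chip_dir.name
        for temp_file in sorted(chip_dir.glob("temp*_input")):
            label_file = temp_file.with_name(temp_file.name.replace("_input", "_label"))
            label = label_file.read_text().strip() if label_file.exists() else temp_file.name
            try:
                millidegrees = int(temp_file.read_text().strip())
            except (OSError, ValueError):
                continue  # some sensors refuse reads or return non-numeric data
            readings[f"{chip}/{label}"] = millidegrees / 1000.0
    return readings


if __name__ == "__main__":
    for sensor, celsius in read_hwmon_temps().items():
        print(f"{sensor}: {celsius:.1f} C")
```

If one source reports a flat 100 while the others move with load, that points at the reading rather than the cooling.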

Link to comment

Maybe the pump motor is rotating but it has become detached from the impeller - so tacho pulses but no water flow. How do the pipes feel to the touch if you power up after letting it cool down?

 

I notice you're running an rc version of Unraid (6.6.0-rc4). You ought to upgrade to 6.6.5. It might fix the IRQ issue.

Link to comment

@ClintE,

 

That is what I thought. The system would just shut down until it cooled off. Plus, the CPU temp through iCue is only reporting 72c, while Unraid isn't changing from 100c. It is exactly 100c the entire time. 
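
A reading that is identical on every poll is a hint in itself, since a real die temperature moves around with load. Here is a small Python sketch of that check, assuming a series of readings has already been collected from some source; `looks_stuck` and its default thresholds are hypothetical, not taken from any tool.

```python
def looks_stuck(samples: list[float], min_samples: int = 10, tolerance: float = 0.0) -> bool:
    """Return True when there are enough samples and none of them differ.

    "Differ" means the spread between the highest and lowest reading is
    larger than `tolerance` degrees.
    """
    if len(samples) < min_samples:
        return False  # too little data to call it either way
    return max(samples) - min(samples) <= tolerance
```

For example, twelve consecutive readings of 100.0 would be flagged, while a series that wanders between 70 and 74 would not.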

 

@John_M

Yeah, I saw that 6.6.5 was released earlier this month. I will upgrade to that when I get home. Maybe I'll be able to use my SATA expansion card.

Link to comment
