unRAID became slow after removing data disk



Hello all

 

After replacing my data disk because of the issues explained here, my unRAID server has become extremely unresponsive. After booting, everything is fine for several minutes; then CPU usage seems to climb, the web UI takes ages to load a page, and the array rebuild speed drops from over 100 MB/s to a couple of KB/s. I can confirm this happens regardless of whether the Docker image is enabled.

 

I'm running unRAID in an ESXi VM with 16 GB of RAM assigned, backed by an AMD FX-6300 (6 × 3.5 GHz), and 7 data drives connected as raw-mapped LUNs (5 of them attached to the motherboard, the rest to a PCI-E SATA controller). The issue only started after I removed the old faulty disk and restarted the array unprotected (I figured it'd be fine for the duration of the new drive's pre-clear). I did manage to get through 2 pre-clear cycles last night without the system becoming unstable, but only because the array wasn't mounted. So the issue only appears after mounting the array.
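
In case it's useful, this is the sort of raw read test I've been running from the console to rule out a slow drive (hdparm and smartctl ship with unRAID; the device names below are just examples from my box, yours will differ):

    # Sequential read test, cached and uncached, per drive
    hdparm -tT /dev/sdb
    hdparm -tT /dev/sdc

    # SMART summary, in case a drive is failing quietly
    smartctl -a /dev/sdb | grep -i -e reallocated -e pending

The drives all looked fine in these tests, which is what makes the slowdown so strange.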

 

I managed to download the diagnostics report; please find it attached.

 

Thank you in advance :)

 

-Ziggy

 

 

 

 

EDIT: I'm definitely thinking all those odd authentication attempts have something to do with it... I tried blocking the IP, which magically solved the problem, though the attempts came back from another origin... Suggestions?
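
For reference, this is roughly how I spotted and blocked the offender from the console (the address below is a placeholder; substitute whatever shows up in your syslog):

    # Look for repeated login attempts in the syslog
    grep -i "fail" /var/log/syslog | tail -20

    # Drop all traffic from the offending address (1.2.3.4 is a placeholder)
    iptables -I INPUT -s 1.2.3.4 -j DROP

    # List the active rules to confirm
    iptables -L INPUT -n --line-numbers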

ziggy_unraid-diagnostics-20160512-2222.zip


It looks like you have more than just the FTP port exposed to the Internet. Try locking it down to just the minimum you need, and no DMZ!

 

I don't see an obvious cause, but you're right: just about everything at the end of the process list is using ridiculous amounts of CPU. It's not just one thing hogging it. The system doesn't look usable.
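
To see exactly what's actually listening on the box, something like this from the unRAID console should show it (the port list in the second command is just an example, adjust to your services):

    # List every listening TCP/UDP socket with its owning process
    netstat -tulpn

    # Or narrow it to the usual suspects
    netstat -tulpn | grep -E ':(21|22|80|443) '

Anything in that list that you didn't deliberately forward should come out of the router config.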



Thank you for your reply. Yeah, I had SSH and the unRAID web UI forwarded as well. I don't really need them since I'm proxying everything through Nginx, so I got rid of those forwards, and the flooding seems to have stopped.

 

I'm still having performance issues, though: where it used to take about 3 minutes to load a page, it now takes just over a minute. This time that's with Docker enabled and multiple containers running.
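
If it helps, this is how I've been grabbing a quick CPU snapshot from the console while it's slow (one batch iteration so top doesn't hang the session):

    # One non-interactive snapshot of the busiest processes
    top -bn1 | head -25

    # Per-container CPU usage, since Docker is running this time
    docker stats --no-stream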

 

I attached another diagnostic archive. Can anyone shed some light on this bizarre situation?

ziggy_unraid-diagnostics-20160514-1554.zip

  • 4 weeks later...

My apologies, I like to help, but I'm terrible about follow-up!

 

I've looked at the last diagnostics (old now, from May 14!), and it looks mostly OK.  What's still striking is how much CPU is being used by the various apps and tools.  Deluge is the busiest at 93%, with other apps ranging from 84% down to 53% and 15%.  What's remarkable, though, is how much ordinary tools are using: Diagnostics is at 45%, and todos at 63%!  It makes you wonder if your CPU frequency scaling is not 'governing' correctly, perhaps being 'governed' down to its lowest speed and never speeding back up.  I noticed that in both syslogs the CPU is detected as a 3.5GHz processor, but moments later in one syslog it was 3.2GHz (still 3.5GHz in the other)!  That's a second clue that controlling/maintaining CPU speed may be an issue.  Beyond that I don't have any ideas to help.
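
If you want to rule the governor out, checking it from the console only takes a second (these sysfs paths assume the cpufreq driver is loaded; inside an ESXi VM they may not exist at all, which would itself be telling):

    # Current governor and clock speed for each core
    cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
    cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq

    # What the kernel currently reports per core
    grep MHz /proc/cpuinfo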

 

One other possible issue: in the last syslog you were getting probable SYN floods on port 49153, a listener set up by one of your dockers, I don't know which.  You should probably check that out.
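
To find out which container owns that listener, something like this run on the host should narrow it down (container names and port mappings will of course be your own):

    # Which process is listening on 49153
    netstat -tulpn | grep 49153

    # Map published ports back to container names
    docker ps --format '{{.Names}}: {{.Ports}}'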



Hi Rob

 

Thank you for still following up; I didn't expect to receive a reply, and I appreciate your time and effort.

 

I wasn't able to figure out what the problem was, so I went ahead and got rid of ESXi. The hypervisor was kind of redundant anyway since I had everything set up in Docker, and ditching it for a full-blown bare-metal unRAID installation was a better idea regardless.

 

Having done that, everything seems to be running smoothly again. It was probably ESXi that wasn't managing the CPU frequency scaling correctly, like you said.

