limetech

unRAID OS version 6.4.0-rc10b available


I installed the update this morning. Suddenly, while I was soldering in the same room, I heard the server beep. I went over and it was frozen. I did a hard reboot and it's now doing a parity check. It didn't create a log file in /logs on the USB. Diagnostics attached.

 

I'm running an AMD system, and this is the first time it's done this since I installed the system before the AMD fix.

tower-diagnostics-20171029-2051.zip

1 hour ago, tstor said:

As my example that you previously looked at shows, it did not kill the remote sessions quickly enough, which resulted in an array that was not properly shut down.

Make sure that your 'Time on Battery' (that is, the time from when the power goes out until the UPS signals the server to start the shutdown) is set to a value that reflects the length of time within which the power grid might restore power automatically.  In my case, if the power is out for thirty seconds, it will be out for two to three hours at the minimum!

 

The short setting addresses three major issues.  (1) It means the battery will last long enough to guarantee a clean shutdown even if the battery is five years old.  (2) It provides enough battery reserve to handle a couple of twenty-five-second outages before the long-term outage occurs.  (3) It provides for the case where the power is restored and, after you wait half an hour to see that it is staying on, you restart the server and the power goes out fifteen minutes later!  (It takes at least eight hours to recharge the UPS battery!)

Edited by Frank1940

1 hour ago, RonUSMC said:

I installed the update this morning. Suddenly, while I was soldering in the same room, I heard the server beep. I went over and it was frozen. I did a hard reboot and it's now doing a parity check. It didn't create a log file in /logs on the USB. Diagnostics attached.

 

I'm running an AMD system, and this is the first time it's done this since I installed the system before the AMD fix.

tower-diagnostics-20171029-2051.zip

I had a crash on my Ryzen 1600X system just like this. It only happened when the CPU was overclocked. I went back to stock clocks and it's been stable for 24 hours now.

6 hours ago, tstor said:

 

 

As my example that you previously looked at shows, it did not kill the remote sessions quickly enough, which resulted in an array that was not properly shut down.

 

You can change the time to wait (default 90 sec). See Settings -> Disk Settings -> Shutdown time-out.

On 10/29/2017 at 7:52 AM, limetech said:

 

Not addressed.  The setting to change is defined in /etc/php-fpm.d/www.conf:


pm.max_children = 10

To experiment with different settings you can make your own copy of 'www.conf' and save it on the usb flash device in 'config' directory.  Next, add this line in your 'go' file before emhttp is started:


cp /boot/config/www.conf /etc/php-fpm.d/

You can also make a change to /etc/php-fpm.d/www.conf directly and then type '/etc/rc.d/rc.php-fpm restart'.

 

The original default value is 5 and we set it to 10.  Apparently that's too low.  Maybe double it again?

 

I would make an inline change in www.conf so that the other parameters do not go stale, as they would if a local copy were kept on the flash. My recommendation is to put the following line in the go file:

sed -ri 's/^(pm.max_children =) [0-9]+$/\1 20/' /etc/php-fpm.d/www.conf

The above would change the value of pm.max_children to 20.
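
For anyone who wants to try other values, the same substitution can be wrapped in a small function. This is only a sketch: the function name set_max_children is made up for illustration, it assumes GNU sed (for -r and -i), and the dot in the pattern is escaped so it matches a literal dot only.

```shell
# Hypothetical helper (not part of unRAID): rewrite pm.max_children
# in a php-fpm pool file, leaving every other parameter untouched.
set_max_children() {
  conf=$1
  value=$2
  # Refuse anything that is not a plain positive integer.
  case $value in
    ''|*[!0-9]*) echo "value must be a positive integer" >&2; return 1 ;;
  esac
  # Same substitution as the one-liner above, with the dot escaped.
  sed -ri "s/^(pm\.max_children =) [0-9]+\$/\1 ${value}/" "$conf"
}

# In the go file this would go before emhttp is started, for example:
# set_max_children /etc/php-fpm.d/www.conf 20
```

If php-fpm is already running when the file is changed, the '/etc/rc.d/rc.php-fpm restart' command quoted above would still be needed before the new value takes effect.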


 

Thanks very much. I will try setting it to 20 and will report back if the error shows up in the log again.

 

On 29.10.2017 at 7:52 AM, limetech said:

 

Not addressed.  The setting to change is defined in /etc/php-fpm.d/www.conf:


pm.max_children = 10

To experiment with different settings you can make your own copy of 'www.conf' and save it on the usb flash device in 'config' directory.  Next, add this line in your 'go' file before emhttp is started:


cp /boot/config/www.conf /etc/php-fpm.d/

You can also make a change to /etc/php-fpm.d/www.conf directly and then type '/etc/rc.d/rc.php-fpm restart'.

 

The original default value is 5 and we set it to 10.  Apparently that's too low.  Maybe double it again?

 


Hey all. I had this same issue with the last update and had to roll back. After upgrading, I cannot access the GUI. I can ping the IP all day long. When I try to access it via https://xxx.xxx.xxx.xxx, I get the login prompt and then get pushed to http://storage/main with the error "File not found". If I try https://storage, I get the login prompt and then get pushed right back to the same error on http. I included my go and identity files for reference.

 

ident.cfg

# Generated settings:
NAME="Storage"
timeZone="America/New_York"
COMMENT="Media server"
SECURITY="user"
WORKGROUP="WORKGROUP"
DOMAIN=""
DOMAIN_SHORT=""
hideDotFiles="no"
localMaster="yes"
USE_NTP="yes"
NTP_SERVER1="pool.ntp.org"
NTP_SERVER2=""
NTP_SERVER3=""
DOMAIN_LOGIN="Administrator"
DOMAIN_PASSWD=""
SYS_MODEL="Custom"
SYS_ARRAY_SLOTS="23"
SYS_CACHE_SLOTS="1"
USE_SSL="no"
PORT="80"
PORTSSL="443"

go

#!/bin/bash
modprobe i915
chmod 777 /dev/dri/*
/boot/config/enable_achi.sh
# Start the Management Utility
/usr/local/sbin/emhttp

 


What's also interesting is that after a reboot, I can get in for a couple of minutes while the drives mount. Once they mount, I can no longer get in.

11 minutes ago, bonienl said:

Try starting in safemode...

Hmm. So with safe mode I can access it even after the array starts. Well, I guess it's off to figure out which plugin is breaking it.


Would it be possible to patch the kernel so that the Elgato HD60 Pro capture card works properly when passed through to a Windows VM? The patch needed for full support of the card is here: https://gist.github.com/numinit/40fc5ad96fd0990b0a63/ . Otherwise we need to rebuild the kernel ourselves, and at the moment there are no instructions on how to build the latest stable release.

This would be greatly appreciated, and I believe a lot of users would be grateful for the added support.

5 hours ago, morbidpete said:

Hmm. So with safe mode I can access it even after the array starts. Well, I guess it's off to figure out which plugin is breaking it.

 

You should probably start with the non-Dynamix ones.  I have not heard of any of the CA ones or Fix Common Problems having any issues of this kind.

Edited by Frank1940
"or" in second sentence was "of".


I have been noticing that with the 6.4 betas, any time I type "reboot" in CMD or use the reboot in the "buttons" plugin, my array starts a parity check when it comes back online. Attached are diagnostics from when my server starts to reboot. Can anyone tell what is happening? I am not seeing anything in the logs.

 

Edited by archedraft

11 minutes ago, archedraft said:

I have been noticing that with the 6.4 betas, any time I type "reboot" in CMD or use the reboot in the "buttons" plugin, my array starts a parity check when it comes back online. Attached are diagnostics from when my server starts to reboot. Can anyone tell what is happening? I am not seeing anything in the logs.

pithos-diagnostics-20171030-1419.zip

 

Looks to me like one or both of your VMs are not shutting down within the set 60 seconds, so it forces a shutdown after that. Do they shut down if you stop the array?

 
Looks to me like one or both of your VMs are not shutting down within the set 60 seconds, so it forces a shutdown after that. Do they shut down if you stop the array?


Well, shoot. No, there is one VM that will not shut down with the “virsh shutdown” command. In the 6.3 versions I was using “user scripts” to send the “virsh destroy” command at array shutdown to fix this. Are you aware of a way to send this command right after the “reboot” gets triggered?
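
For reference, the kind of thing I mean looks roughly like the sketch below. The function name stop_all_vms and its timeout argument are made up for illustration. It asks every running VM to shut down, waits, and then uses "virsh destroy" (a hard power-off) on whatever is still running.

```shell
# Hypothetical helper: try a clean shutdown first, then force off stragglers.
# $1 is how many seconds to wait before forcing (60 here is an assumed default).
stop_all_vms() {
  timeout=${1:-60}

  # Ask every running domain to shut down cleanly.
  # Names are read line by line so that VM names containing spaces survive.
  virsh list --name | while IFS= read -r vm; do
    if [ -n "$vm" ]; then
      virsh shutdown "$vm" </dev/null || true
    fi
  done

  # Poll once a second until nothing is running or the timeout expires.
  waited=0
  while [ -n "$(virsh list --name)" ] && [ "$waited" -lt "$timeout" ]; do
    sleep 1
    waited=$((waited + 1))
  done

  # Force off anything that ignored the shutdown request.
  virsh list --name | while IFS= read -r vm; do
    if [ -n "$vm" ]; then
      virsh destroy "$vm" </dev/null || true
    fi
  done
}

# Run by hand before rebooting, for example:
# stop_all_vms 60 && reboot
```

This only covers the case where I run it myself before rebooting. Whether 6.4 has a hook that could call it automatically when "reboot" is typed is exactly what I am asking.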

6 hours ago, archedraft said:

 


Well, shoot. No, there is one VM that will not shut down with the “virsh shutdown” command. In the 6.3 versions I was using “user scripts” to send the “virsh destroy” command at array shutdown to fix this. Are you aware of a way to send this command right after the “reboot” gets triggered?

 

 

Sorry, no.

14 hours ago, morbidpete said:

What's also interesting is that after a reboot, I can get in for a couple of minutes while the drives mount. Once they mount, I can no longer get in.

 

I had the same problem. Removed the following plugins:

 

fix common problems

preclear

speed test

tips and tweaks

 

I didn't do them one by one but thought this might help in case there is a commonality between plugins you may have vs what I removed.
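
If someone does want to go one by one, a rough sketch of how that could be scripted is below. It assumes the installed plugin files are the .plg files in /boot/config/plugins and that a plugin whose .plg has been moved away is not loaded on the next boot. The function name disable_plugins and the plugins-disabled directory are made up for illustration.

```shell
# Hypothetical helper: park the named .plg files in another directory
# so they can be put back one at a time while testing.
# $1 = plugin directory, $2 = holding directory, remaining args = plugin names
disable_plugins() {
  src=$1
  dest=$2
  shift 2
  mkdir -p "$dest"
  for name in "$@"; do
    if [ -f "$src/$name.plg" ]; then
      mv "$src/$name.plg" "$dest/"
      echo "disabled $name"
    else
      echo "not found: $name" >&2
    fi
  done
}

# Example (directory and plugin name are assumptions, check your own flash drive):
# disable_plugins /boot/config/plugins /boot/config/plugins-disabled some.plugin
```

Move one file back, reboot, and check the GUI again. Repeat until the one that breaks it turns up.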

6 hours ago, Reckless Maker said:

I just started with unRAID and for some reason my system won't stay up for more than an hour. This is what posts to the screen:

59f7e39f2c321_20171030_1721581.thumb.jpg.c85894ff17c1931c135089b8d2e6beee.jpg

And this is what posts in safe mode:

59f7e421d08f3_20171030_1744451.thumb.jpg.273bab8b7023b5bcd43baad2e9deecd5.jpg

This is on a Ryzen 1600X with nothing attached but the ram and an AMD RX 550 graphics card.

tower-diagnostics-20171030-1937.zip

Check that you are up to date on the BIOS and have C-states turned off to begin with. 6.4 has a fix for them, but baby-step yourself to stable. I’ll let smarter people than I troubleshoot the diags.

4 hours ago, GroxyPod said:

 

I had the same problem. Removed the following plugins:

 

fix common problems

preclear

speed test

tips and tweaks

 

I didn't do them one by one but thought this might help in case there is a commonality between plugins you may have vs what I removed.

Odd. So I just started fresh, rebuilt the thumb drive from scratch, and upgraded to the latest beta. I got all my dockers back up and running and reinstalled all my plugins, including the 4 you listed. No issues. Just one of those things, ya know. Appreciate the input.

12 hours ago, Reckless Maker said:

I just started with unRAID and for some reason my system won't stay up for more than an hour. This is what posts to the screen:

59f7e39f2c321_20171030_1721581.thumb.jpg.c85894ff17c1931c135089b8d2e6beee.jpg

And this is what posts in safe mode:

59f7e421d08f3_20171030_1744451.thumb.jpg.273bab8b7023b5bcd43baad2e9deecd5.jpg

This is on a Ryzen 1600X with nothing attached but the ram and an AMD RX 550 graphics card.

tower-diagnostics-20171030-1937.zip

I have a 1600X on a Gigabyte Gaming 5 with the latest BIOS (F9a) and I had 2 crashes. One was from overclocking for some reason. It would crash maybe an hour later, so I guess unRAID doesn't like the OC. The other was while running the docker backup plugin, which also happened when I was using my 7700K, so it's a random thing that would happen every now and then. I've been stable besides that for the last couple of days.


Is that OC beyond the boost frequency or just the built-in boost? I'll disable both because there does seem to be an issue with the clock.

59f8a0c5a3142_20171030_2132031.thumb.jpg.4fa7f7553a1cf9b95f3a61a071b55a16.jpg

I'll also disable C-states just for good measure.
