unRAID OS version 6.5.3 available


Recommended Posts

On my Ryzen 1700X with an Asus X370 motherboard, all Windows VMs with GPU passthrough are essentially unusable. The VMs work great on the first boot after installation, but after being restarted a couple of times they start lagging and hanging, taking over 2 minutes just to load the desktop, and from there it is very difficult to do any task. Once the GPU is removed I can log in remotely without an issue and everything works fine, so I think it is an issue with GPU passthrough on the AMD side.

 

The system memory is at its default speed and other VMs are working without an issue.

 

The VM has 8 cores, 8GB RAM, and one passed-through GPU (GTX 1060 6GB), not the system GPU.

 

I eventually solved it by moving unRAID to my older i7 6700K on a Gigabyte Z170X board. The VMs are running smoothly through over 10 reboots, and while I'm still testing, nothing has gone wrong so far. I think it's an AMD-related issue.

Link to comment
2 hours ago, ryoko227 said:

@PSYCHOPATHiO out of curiosity, did you update to the newest BIOS

Always, usually on the day of release. I have a bookmark folder for daily BIOS update checks for all my boards.

Once there is an update I check the changes and start applying my settings, among them disabling C-states and enabling the virtualization and IOMMU settings. I should mention that my memory is not on the compatibility list, so I keep it at the stock 2133MHz. I've used the machine under Windows for a week with the memory set to 3000MHz with no issues.

Every kernel panic causes the system to restart a parity check, and with 8TB drives that takes forever.

Link to comment

The rc1 post for this specifically said this:
"However, we want to make this change in isolation and release to the Community in order to get testing on a wider range of hardware, especially to find out if parity operations are negatively affected."

 

I'm not sure if 6.5.3 is the culprit since my parity check only runs every two months, but I am now getting repeated parity sync errors. The first manual check fixed 53 errors, then the next one, run not even a day later, found 123.

 

I have not backed out of 6.5.3 yet to see if that's the issue, but I wanted to know if there is a way to change the pre-empt setting back in 6.5.3 so it behaves like 6.5.2 again, so I can see if the parity sync errors go away. I have already run a 24-hour memtest with no issues, and SMART is good too.
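
From what I understand the pre-empt model is a compile-time kernel option, so I'm guessing it can't simply be toggled back without a rebuilt kernel. For anyone curious, you can at least see what the running kernel was built with; the version string includes PREEMPT when full preemption is compiled in, and (assuming the kernel exposes its config via /proc/config.gz) the exact option shows up too:

uname -v
zcat /proc/config.gz | grep -i preempt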

 

I have four 4TB WD Reds and two 3TB WD Reds. The two 3TB drives are plugged into a PCIe x1 SATA card which was listed on the forums as compatible:
IO Crest 4 Port SATA III PCI-e 2.0 x1 Controller Card Marvell Non-Raid with Low Profile Bracket SI-PEX40064

https://smile.amazon.com/gp/product/B00AZ9T3OU

 

Built the system December 2017 and all parity checks were good until now.

Link to comment
2 hours ago, nickp85 said:

The rc1 post for this specifically said this:
"However, we want to make this change in isolation and release to the Community in order to get testing on a wider range of hardware, especially to find out if parity operations are negatively affected."

 

I'm not sure if 6.5.3 is the culprit since my parity check only runs every two months, but I am now getting repeated parity sync errors. The first manual check fixed 53 errors, then the next one, run not even a day later, found 123.

 

I have not backed out of 6.5.3 yet to see if that's the issue, but I wanted to know if there is a way to change the pre-empt setting back in 6.5.3 so it behaves like 6.5.2 again, so I can see if the parity sync errors go away. I have already run a 24-hour memtest with no issues, and SMART is good too.

 

I have four 4TB WD Reds and two 3TB WD Reds. The two 3TB drives are plugged into a PCIe x1 SATA card which was listed on the forums as compatible:
IO Crest 4 Port SATA III PCI-e 2.0 x1 Controller Card Marvell Non-Raid with Low Profile Bracket SI-PEX40064

https://smile.amazon.com/gp/product/B00AZ9T3OU

 

Built the system December 2017 and all parity checks were good until now.

 

Might I suggest that you open a new thread in the 'General Support' subforum and be sure to include a diagnostics file that covers the time period during which you had the parity sync errors. I can tell you that SATA cards with Marvell chipsets have had issues in the past. (I am not saying at this point that this is your problem, but there is precedent.)

Link to comment
9 minutes ago, Frank1940 said:

 

Might I suggest that you open a new thread in the 'General Support' subforum and be sure to include a diagnostics file that covers the time period during which you had the parity sync errors. I can tell you that SATA cards with Marvell chipsets have had issues in the past. (I am not saying at this point that this is your problem, but there is precedent.)

 

Thanks, I already have a support thread open. Wanted to say something here because of the comments made during the RC1 release.

Link to comment

@ryoko227 After I moved the system to another USB stick, with a single HDD for VMs only, I copied the same Windows 10 VM from the first machine over the network and I'm actually running it without an issue on an HDD. The only change is that I pushed the DDR to 3000MHz, and it's been running stable for almost 24 hours now.

On an SSD with a GTX 1060 it becomes laggy; on an HDD with a GT 710 it's going great. I'm confused lol

 

EDIT: also running the same number of cores (8).

 

Edited by PSYCHOPATHiO
Link to comment

I upgraded an older/smaller disk in my server. That went fine but it showed this:

 

unRAID Parity sync / Data rebuild: 18-06-2018 18:46
Notice [Tower] - Parity sync / Data rebuild finished (0 errors)
Duration: 4 hours, 48 minutes, 55 seconds. Average speed: 230.8 MB/s

 

Is it combining the parity and data disk speeds? Because there is no way it hit 230.8 MB/s writing.
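
For what it's worth, the numbers imply roughly 4 TB processed over the run: 4 h 48 m 55 s is about 17,335 seconds, and 17,335 s × 230.8 MB/s ≈ 4,000,000 MB. So the reported average looks like it is derived from elapsed time and position rather than the measured write rate of the new disk, though I could be reading it wrong.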

Link to comment

Just updated from 6.5.2 to 6.5.3 a couple of days ago. Initially I thought everything went smoothly, as I haven't noticed any behavioral/performance symptoms. However, when I logged in this morning I noticed a call trace in my syslog (diagnostics attached). Obviously I can't be 100% certain that this is related to 6.5.3 specifically, but I have never seen any traces at all in the past (I've run pretty much every stable release on this hardware since the unRAID v5 days), so the timing shortly after the upgrade seems more than coincidental. It seems noteworthy that the trace concerns netfilter/macvlan and that I have a somewhat "non-standard" networking configuration, in that I'm using VLAN tagging and a Mellanox ConnectX-2 10G NIC.

 

As I said, I haven't noticed any behavioral symptoms or anything like that, so I'm not too panicked about this. I just thought someone might like to take a look in case a corner case or incompatibility from some recent change wasn't covered during testing. Let me know if more information beyond the diagnostics file would be helpful. I'll monitor the logs to see if it happens again, and whether I can correlate it to a specific event.

 

Cheers,

 

-A

nas-diagnostics-20180624-0755.zip

Link to comment

I'm seeing this as well now. With UEFI boot enabled, a VM with iGPU passthrough doesn't boot, maxes out a CPU thread and totally fills the syslog with, 

kernel: vfio-pci 0000:00:02.0: BAR 2: can't reserve [mem 0xc0000000-0xcfffffff 64bit pref]

Disabling UEFI boot stops the problem and allows the VM and passthrough to return to working as expected.
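
In case anyone else wants to test this: I toggle UEFI boot from the flash device settings page ('Permit UEFI boot mode'), which as far as I know just renames the EFI folder on the flash drive between EFI and EFI-. Doing it by hand would be something like the following, renaming back to EFI to re-enable it:

mv /boot/EFI /boot/EFI-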

 

Rich

 

diagnostics-20180624-1740.zip

Edited by Rich
Link to comment

I'd put this update off for a while but I've done it today and no issues so far

 

I had one VM stutter a few times after fresh server reboots, but as it's just the one VM I'm assuming it's a Windows problem, since a reboot fixed it.

 

Other than that, all good so far 

Link to comment
8 hours ago, Rich said:

I'm seeing this as well now. With UEFI boot enabled, a VM with iGPU passthrough doesn't boot, maxes out a CPU thread and totally fills the syslog with, 


kernel: vfio-pci 0000:00:02.0: BAR 2: can't reserve [mem 0xc0000000-0xcfffffff 64bit pref]

Disabling UEFI boot stops the problem and allows the VM and passthrough to return to working as expected.

 

Rich

 

diagnostics-20180624-1740.zip

 

Most likely what is happening is that efi-framebuffer is being loaded into the area of memory that the GPU is also trying to use when unRAID is booted in UEFI mode. This thread explains the who, how, what, why and how to fix it if your issue is the same as mine was.
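
In short (going from memory here, so check the linked thread for the exact steps), the workaround is to unbind the virtual consoles and the efi-framebuffer so that memory region gets released before the VM claims the GPU:

echo 0 > /sys/class/vtconsole/vtcon0/bind
echo 0 > /sys/class/vtconsole/vtcon1/bind
echo efi-framebuffer.0 > /sys/bus/platform/drivers/efi-framebuffer/unbind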

Link to comment
10 hours ago, ryoko227 said:

 

Most likely what is happening is that efi-framebuffer is being loaded into the area of memory that the GPU is also trying to use when unRAID is booted in UEFI mode. This thread explains the who, how, what, why and how to fix it if your issue is the same as mine was.

 

Thanks for the heads up, ryoko. I gave the three commands a shot and it did sort out the syslog flooding, but sadly it didn't solve the single thread at 100% or allow the VM to boot, so it looks like I'll be continuing with UEFI disabled for the moment.

Link to comment

Is there anything in this upgrade that could have caused the following message when I try to ssh into the server?

 

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
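
I know the usual client-side fix is just to clear the stale entry, e.g. ssh-keygen -R tower (substituting your server's hostname or IP), but what I'm really wondering is whether the update could have regenerated the server's host keys. As far as I know unRAID keeps them under /boot/config/ssh on the flash and restores them at boot, so I wouldn't have expected the fingerprint to change.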
 

Link to comment
