PSYCHOPATHiO Posted June 16, 2018 On my Ryzen 1700X with an Asus X370 motherboard, all Windows VMs with GPU passthrough are non-functional. VMs work great on first boot after installation, but after being restarted a couple of times they start lagging and hanging, taking over 2 minutes just to load the desktop, and from there it is very difficult to do any task. Once the GPU is removed I can log in remotely without an issue and everything works fine, so I think it is an issue related to GPU passthrough on the AMD side. The system memory is at default speed and other VMs are working without an issue. The VM has 8 cores + 8GB RAM + 1 passed-through GPU (GTX 1060 6GB), not the system GPU. I eventually solved it by moving unRAID to my older i7 6700K on a Gigabyte Z170X board; the VMs are running smoothly with over 10 reboots, and still testing, nothing has gone wrong so far. I think it's an AMD-related issue.
ryoko227 Posted June 17, 2018 @PSYCHOPATHiO Out of curiosity, did you update to the newest BIOS for the Asus board? I've found that when unRAID is doing really weird and unexplainable stuff, it's generally related to the BIOS.
PSYCHOPATHiO Posted June 17, 2018 2 hours ago, ryoko227 said: @PSYCHOPATHiO out of curiosity, did you update to the newest BIOS Always, usually on the day of release; I have a bookmark folder for daily BIOS update checks for all my boards. Once there is an update I check the changes & start applying my settings, among them disabling C-states & enabling the VM & IOMMU settings. I have to mention that my memory is not on the compatibility list, so I keep it at the stock 2133 MHz. I've used the machine under Windows for a week with the memory set to 3000 MHz with no issues. Every kernel panic causes the system to restart a parity check, and with 8TB drives it takes forever.
nickp85 Posted June 17, 2018 The rc1 post for this specifically said: "However, we want to make this change in isolation and release to the Community in order to get testing on a wider range of hardware, especially to find out if parity operations are negatively affected." I'm not sure if 6.5.3 is the culprit, since my parity checking only happens every two months, but I am now getting repeated parity sync errors. The first manual check corrected 53 errors; the next one, run less than a day later, corrected 123. I have not backed out of 6.5.3 yet to see if that's the issue, but I wanted to know if there is a way to change the preempt setting back in 6.5.3 so it behaves like 6.5.2 again, and then I will see if the parity sync errors go away. I have already run a 24h memtest with no issues. SMART is good too. I have four 4TB WD Reds and two 3TB WD Reds. The two 3TB drives are plugged into a SATA PCIe x1 card which was listed on the forums as compatible: IO Crest 4 Port SATA III PCI-e 2.0 x1 Controller Card Marvell Non-Raid with Low Profile Bracket SI-PEX40064 https://smile.amazon.com/gp/product/B00AZ9T3OU Built the system in December 2017 and all parity checks were good until now.
BRiT Posted June 17, 2018 The entire 6.5.3.x line should have no impact on actual parity functionality, only possibly on parity check speed. No one had encountered any issues up until your report. What happens if you run a third manual parity check today?
Frank1940 Posted June 17, 2018 2 hours ago, nickp85 said: I'm not sure if 6.5.3 is the culprit since my parity checking only happens every two months but I am now getting repeated parity sync errors. First manual scan fixed 53 then the next one 123 run not even a day later. I have not backed out of 6.5.3 yet to see if that's the issue but I wanted to know if there is a way to change back the pre-empt setting in 6.5.3 so it behaves like 6.5.2 again and I will see if the parity sync errors go away. Have already run a 24h memtest with no issues. SMART is good too. Might I suggest that you open a new thread in the 'General Support' subforum, and be sure to include a diagnostics file covering the time period during which you had the parity sync errors. I can tell you that SATA cards with Marvell chipsets have had issues in the past. (I am not saying at this point that this is your problem, but there is precedent.)
nickp85 Posted June 17, 2018 9 minutes ago, Frank1940 said: Might I suggest that you open a new thread in the 'General Support' subforum, and be sure to include a diagnostics file covering the time period during which you had the parity sync errors. Thanks, I already have a support thread open. I wanted to say something here because of the comments made during the RC1 release.
PSYCHOPATHiO Posted June 18, 2018 (edited) @ryoko227 After I moved the system to another USB stick, with a single HDD for VMs only, I copied the same Windows 10 VM from the first machine over the network, and I'm actually running it without an issue on an HDD. The only change is I pushed the DDR to 3000 MHz, and it's been running stable for almost 24 hours now. On an SSD with a GTX 1060 it becomes laggy; on an HDD with a GT 710 it's going great. I'm confused lol EDIT: also running the same number of cores (8). Edited June 18, 2018 by PSYCHOPATHiO
DieFalse Posted June 18, 2018 Been running for 4 days with no ill effect. The parity check speed is the same for me and completed in 9hrs for 32TB.
Mat1926 Posted June 18, 2018 Did the update a few days ago, and everything is fine... Thanks a lot...
1812 Posted June 19, 2018 I upgraded an older/smaller disk in my server. That went fine, but it showed this:

unRAID Parity sync / Data rebuild: 18-06-2018 18:46
Notice [Tower] - Parity sync / Data rebuild finished (0 errors)
Duration: 4 hours, 48 minutes, 55 seconds. Average speed: 230.8 MB/s

Is it combining the parity and disk speed? Because there is no way it hit 230.8 MB/s writing.
ren88 Posted June 19, 2018 I just noticed the green background theme.
JorgeB Posted June 19, 2018 5 hours ago, 1812 said: Is it combining the parity and disk speed? It's an old bug: the average speed is calculated based on the parity size, not the size of the rebuilt disk.
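A quick sanity check of the numbers reported earlier in the thread is consistent with this explanation: at 230.8 MB/s over 4 h 48 m 55 s (17335 s), the implied data size is about 4 TB, which would match a 4 TB parity drive rather than the smaller rebuilt disk (the drive sizes here are an assumption; the thread doesn't state the parity size):

```shell
# Implied size = reported average speed * duration.
# 4h48m55s = 4*3600 + 48*60 + 55 = 17335 seconds.
awk 'BEGIN { mb = 230.8 * 17335; printf "%.0f MB (~%.1f TB)\n", mb, mb / 1e6 }'
# prints: 4000918 MB (~4.0 TB)
```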
1812 Posted June 19, 2018 5 hours ago, johnnie.black said: It's an old bug, average speed is calculated based on parity size, not disk rebuilt size. Thanks!
Ambrotos Posted June 24, 2018 Just updated from 6.5.2 to 6.5.3 a couple of days ago. Initially I thought everything went smoothly, as I hadn't noticed any behavioral/performance symptoms. However, when I logged in this morning I noticed a call trace in my syslog (diagnostics attached). Obviously I can't be 100% certain that this is related to 6.5.3 specifically, but I have never seen any traces at all in the past (I've run pretty much every stable release on this hardware since the unRAID v5 days). It just seems coincidental that the call trace happened shortly after the software upgrade. It seems noteworthy that the trace concerns netfilter/macvlan and that I do have a somewhat "non-standard" networking configuration, in that I'm using VLAN tagging and a Mellanox ConnectX-2 10G NIC. As I said, I haven't noticed any behavioral symptoms or anything like that, so I'm not too panicked about this. I just thought someone might like to take a look to see whether any corner cases or incompatibilities of some recent change weren't covered during testing. Let me know if more information beyond the diagnostics file would be helpful. I'll monitor the logs to see if it happens again, and whether I can correlate it to a specific event. Cheers, -A nas-diagnostics-20180624-0755.zip
JorgeB Posted June 24, 2018 27 minutes ago, Ambrotos said: It seems noteworthy that the trace is concerning netfilter/macvlan and that I do have somewhat of a "non-standard" networking configuration, in that I'm using VLAN tagging and a Mellanox ConnectX-2 10G NIC. This thread might help.
Rich Posted June 24, 2018 (edited) I'm seeing this as well now. With UEFI boot enabled, a VM with iGPU passthrough doesn't boot, maxes out a CPU thread, and totally fills the syslog with: kernel: vfio-pci 0000:00:02.0: BAR 2: can't reserve [mem 0xc0000000-0xcfffffff 64bit pref] Disabling UEFI boot stops the problem and allows the VM and passthrough to work as expected again. Rich diagnostics-20180624-1740.zip Edited June 24, 2018 by Rich
Ambrotos Posted June 24, 2018 9 hours ago, johnnie.black said: This thread might help. That definitely seems to be what I'm encountering. Thanks for the tip. I'll follow up in that thread. Cheers, -A
bigjme Posted June 24, 2018 I'd put this update off for a while, but I've done it today and no issues so far. I had one VM stutter a few times after fresh server reboots, but as it's just the one VM, I'm assuming it's a Windows problem, since a reboot fixed it. Other than that, all good so far.
ryoko227 Posted June 25, 2018 8 hours ago, Rich said: I'm seeing this as well now. With UEFI boot enabled, a VM with iGPU passthrough doesn't boot, maxes out a CPU thread and totally fills the syslog with, kernel: vfio-pci 0000:00:02.0: BAR 2: can't reserve [mem 0xc0000000-0xcfffffff 64bit pref] Disabling UEFI boot stops the problem and allows the VM and passthrough to return to working as expected. Most likely what is happening is that the efi-framebuffer is being loaded into the area of memory that the GPU is also trying to use when unRAID is booted in UEFI mode. This thread explains the who, how, what, why, and how to fix it, if your issue is the same as mine was.
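For readers hitting the same "can't reserve BAR" conflict, a commonly referenced persistent workaround is to disable the EFI framebuffer at boot so vfio-pci can reserve that memory region. A sketch, assuming unRAID's standard /boot/syslinux/syslinux.cfg layout (the `video=efifb:off` parameter is the key addition; the rest of the append line should match whatever your existing config already has, and this may not apply if your conflict has a different cause):

```shell
label unRAID OS
  menu default
  kernel /bzimage
  append video=efifb:off initrd=/bzroot
```

Reboot after editing for the change to take effect; the downside is you lose local console output on the passed-through GPU.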
Rich Posted June 25, 2018 10 hours ago, ryoko227 said: Most likely what is happening is that efi-framebuffer is being loaded into the area of memory that the GPU is also trying to use when unRAID is booted in UEFI mode. This thread explains the who, how, what, why and how to fix it if your issue is the same as mine was. Thanks for the heads up, ryoko. I gave the three commands a shot and it did sort out the syslog flooding, but sadly it didn't solve the single thread at 100% or allow the VM to boot, so it looks like I'll be continuing with UEFI disabled for the moment.
phbigred Posted June 25, 2018 Upgraded without the issues I saw in rc2. Everything is working swimmingly.
PSYCHOPATHiO Posted June 25, 2018 2 hours ago, phbigred said: swimmingly Am I reading this correctly?!
ootuoyetahi Posted June 25, 2018 Is there anything in this upgrade that could have caused the following message when I try to ssh into the server?

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
CHBMB Posted June 25, 2018 39 minutes ago, ootuoyetahi said: It is also possible that a host key has just been changed. That's by far the most likely scenario.....
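If the host key did change legitimately (as it can after a server rebuild or OS reinstall), the fix on the client is to delete the stale entry with `ssh-keygen -R` and reconnect. A self-contained sketch against a throwaway known_hosts file, so nothing real is touched; the hostname `tower` is an example, and in real usage you would simply run `ssh-keygen -R tower` (no `-f`) and then reconnect:

```shell
# Demo: build a temporary known_hosts with a fake "tower" host key.
dir=$(mktemp -d)
ssh-keygen -q -t ed25519 -N "" -f "$dir/hostkey"
printf 'tower %s\n' "$(cat "$dir/hostkey.pub")" > "$dir/known_hosts"

# Remove the stale "tower" entry (real usage: ssh-keygen -R tower).
ssh-keygen -R tower -f "$dir/known_hosts" > /dev/null 2>&1

if ! grep -q '^tower ' "$dir/known_hosts"; then echo "tower entry removed"; fi
rm -rf "$dir"
```

Only clear the entry once you're satisfied the key change was expected; the warning exists precisely to flag man-in-the-middle attacks.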