Snubbers

  1. Thanks, I will certainly do that this time! I've just been reading up on reconfiguring the array, but honestly, just jumping through the hoop of letting it rebuild Disk 3 on the old Parity drive, then upgrading that afterwards, isn't a big deal in the grand scheme!
  2. Hi, I had a drive 'disable' due to its reallocated sector count suddenly increasing (8TB Ironwolf NAS edition). The plan was to buy 2 x 14TB HDDs and:
     1. Swap the current 8TB parity drive with one of the new 14TB drives.
     2. Once swapped, pop in the other new 14TB drive as a replacement data drive.
     (My plan for the old 8TB parity drive is to actually replace the other remaining 8TB Ironwolf NAS drive in the array at a later date.) For reference, Disk 3 is the 'disabled' drive.
     So I started following the wiki for Parity Swapping:
     A. I pre-cleared both 14TB HDDs, then stopped the array.
     B. I assigned one of the new 14TB HDDs to the Parity drive slot.
     C. I assigned the old 8TB parity HDD to the missing data drive (Disk 3).
     D. Both had blue icons next to them and a tick box / 'copy' button appeared, so I ticked the box and clicked 'copy'.
     E. I waited 30 hours until copying completed!
     All good so far! So it completed copying and I think it probably did have the 'Start' button ready for me, but this is where I went off piste and messed up!
     Since I didn't want to rebuild Disk 3 on the old 8TB parity drive, but on the second new precleared 14TB drive, I simply assigned Disk 3 to the second new 14TB drive and erroneously assumed it'd keep the start array button and want to rebuild on to whatever was assigned in Disk 3. Nope.. doing this just put 2 'x's against the Parity and Disk 3 slots, with 'too many errors' preventing the array from starting.
     So I switched Disk 3 back to the old 8TB parity drive, which showed 'blue' icons against Parity/Disk 3 and reset the parity copy process back to the beginning.
     I have kicked off the parity copy again, but once it completes tomorrow, how should I proceed considering I actually want to replace the data drive with a new 14TB one? I think the long way would be to let it rebuild the old parity drive as Disk 3, then once complete, stop the array, reassign Disk 3 to the other 14TB drive, and let it rebuild that.
     Or is there a slightly better shortcut?
  3. Thanks for the help! I've fired up a fresh Win 10 VM and am instantly getting 250-320Mbps; as the Win 10 VM is updating whilst running the test, that's about right! Speedtest Tracker is still at 55-60Mbps.
     Additional checks:
     1. I've switched Speedtest Tracker to use br0 (same as the Windows 10 machine), with no change.
     2. I've checked the 'Interface' stats on the unRAID dashboard, which show the VM @ 300+Mbps and Speedtest Tracker @ ~60Mbps.
     It does look isolated to the speedtest dockers for some reason?
  4. I've always had this issue (with the speedtest app and the speedtest-tracker app). My ISP provides 350Mbit down / 35Mbit up. I can max out the connection in SABnzbd and other download dockers, but the two speedtest ones seem to cap at 60Mbit/s down. I've tried Bridge/Host network modes to match the other dockers that can max the connection out, but nothing changes; it's still capping. I've checked the speedtest endpoint, which is the one closest to me, and running speedtest from any browser on any other device in the house maxes out the connection to the same server absolutely fine. It's almost like Ookla's CLI is limiting things? I get good network speeds in the house (I can saturate the gigabit connection when transferring files etc.). Any ideas?
  5. Another happy user: I just picked up the RM850i second hand and negotiated based on the fact I presumed the 'i' part would be useless on unRAID, so got a bit knocked off. Imagine my surprise when I came across this in the app store, plugged in the USB cable internally, and it just worked! The only stat I was slightly intrigued about was the overall power draw. As I have 5 HDDs, 2 NVMe drives and 3 SSDs with a GTX 1660 Super, this is about what I'd expect when not doing too much.
  6. Nice! OT, my 1660 Super keeps dropping off the bus again within a few hours of rebooting. I've written a user script, scheduled every 10 minutes, that just runs nvidia-smi and writes a log entry saying whether the GPU is present or not, so I can see exactly when it drops off the bus. Currently you only find out if you have GPU Stats installed: when you log in to the web UI it fires up nvidia-smi, and that's when you realise. Since I log in fairly infrequently, I don't know if there is a pattern to the GPU disappearing or if it's truly random.
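As a rough illustration of the kind of check described in post 6, here is a minimal Python sketch (not the actual user script from the post). It assumes `nvidia-smi -L` is on the PATH, that a healthy card is listed as a line starting with `GPU <index>:`, and that a lost GPU gives either a non-zero exit code or no such line. The function names `gpu_present` and `log_line` are hypothetical, and scheduling every 10 minutes is left to whatever scheduler is in use.

```python
import re
import subprocess
from datetime import datetime

# Assumption: a healthy `nvidia-smi -L` prints lines such as "GPU 0: <name> (UUID: ...)".
GPU_LINE = re.compile(r"^GPU \d+:", re.MULTILINE)


def gpu_present(run=subprocess.run):
    """Return True if `nvidia-smi -L` exits cleanly and lists at least one GPU."""
    try:
        result = run(["nvidia-smi", "-L"], capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        # nvidia-smi missing or hung: treat the GPU as not visible.
        return False
    return result.returncode == 0 and bool(GPU_LINE.search(result.stdout))


def log_line(present, now=None):
    """Build one timestamped log entry describing the GPU state."""
    now = now or datetime.now()
    state = "present" if present else "MISSING"
    return f"{now:%Y-%m-%d %H:%M:%S} GPU {state}"


if __name__ == "__main__":
    # Print one entry per run; redirect this to a log file of your choice.
    print(log_line(gpu_present()))
```

Running this on a schedule gives one timestamped line per check, so the first `MISSING` entry brackets the time the card dropped off the bus to within one interval.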
  7. Or maybe: "Hi Gee1, this thread is for specific issues with the unRAID NVidia plugin. To request changes to Dockers, see the support thread for that docker (it should be in the following sub-forum: https://forums.unraid.net/forum/47-docker-containers/ ; if not, the app store entry for the docker should link to its support thread)."
  8. It looks correct to me. You will notice there are 2 bar graphs per line, representing two cores: CPUx means it thinks that's a full core, HTx means it thinks that core is a hyper-threaded (or equivalent, partial) core linked to the first one, i.e.
     CPU0 - HT1 [ ]0% [ ]0%
     So with 2 'cores' represented per line, that is 64 cores shown overall, which gives the correct total. Why it thinks half the cores are hyper-threaded might be a foible, or it might be that the Opterons generically have some architecture that shares resources between cores, so it's probably best classified as hyper-threaded.
     [edit] Ahh, OK, the Opteron 6380 is based on Piledriver cores. These are an improved version of the Bulldozer cores, which do share some resources between two threads, in which case the "HT" naming of alternate CPUs is a bit of a generalisation but warranted.
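The counting in post 8 can be checked with a small Python sketch. It assumes the display simply pairs adjacent logical CPU ids two per row (the real pairing comes from the kernel's sibling information, which may differ), and that there are 64 logical CPUs in total. The helper name `pair_rows` is hypothetical.

```python
def pair_rows(logical_cpus):
    """Group logical CPU ids two per row: the first is labelled CPUx, the second HTx."""
    rows = []
    for i in range(0, len(logical_cpus) - 1, 2):
        first, second = logical_cpus[i], logical_cpus[i + 1]
        rows.append(f"CPU{first} - HT{second}")
    return rows


rows = pair_rows(list(range(64)))
print(rows[0])                               # CPU0 - HT1
print(len(rows), "rows x 2 =", 2 * len(rows), "cores")  # 32 rows x 2 = 64 cores
```

So a display that looks like it only has 32 entries still accounts for all 64 cores once both bar graphs on each row are counted.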
  9. Just to update my own 'issue' and say I think I've tentatively found a solution after a bit more problem solving.
     Issues
     1. Multiple log entries stating "kernel: resource sanity check: requesting [mem 0x000c0000-0x000fffff], which spans more than PCI Bus 0000:00 [mem 0x000c0000-0x000dffff window]" and "kernel: caller _nv000908rm+0x1bf/0x1f0 [nvidia] mapping multiple BARs". This correlated with nvidia-smi being called (either manually or by the GPU Stats plugin).
        Solution - It seems moving the card to a different PCIe slot (I have two x16 slots, although not x16 if you use both) has stopped this error from appearing.
     2. The GPU dropping off the bus randomly and then being held in reset (fans at full speed), with nvidia-smi reporting a lost GPU.
        Solution - After trying many things:
        - Moving slots (this fixed issue 1, but had no effect on this issue).
        - Passing through to a Win 10 VM - could not get it to work (error code 43) despite the correct VBIOS and stubbing the additional 2 devices (the 1660 Super appears as 4 devices). However, I believe this also had issues with the GPU getting 'lost', as the VM would randomly fail to start and I'd need to reboot.
        - Native booting to Win 10 worked fine, no issues (I had to remove the HBA card to ensure no changes to my array could occur).
        - New BIOS revision - made no difference.
        - Changing power supplies - made no difference.
        - Finally (and why didn't I try this earlier?) I tried to run memtest from the unRAID boot menu, which would just reset the PC and never load memtest. I found out that you can't run memtest if booting using UEFI, so I disabled that in the BIOS. Memory passed 24 hours of testing. I then remembered reports that VM passthrough could be problematic with UEFI enabled, so I kept it disabled and booted unRAID in legacy mode, and it's been 3+ days now without the GPU falling over.
     I am a bit tentative; however, UEFI being flaky with GPUs for VMs and also affecting the Linux drivers is plausible, so I'll update again if it makes it to a week. I'm also running GPU Stats again (still no multiple BAR reports).
     TLDR: I think/hope I've fixed my problem and just want to share in case it helps anyone else; disabling UEFI seems to have got me a nice stable system. That just leaves the Plex issue of not managing the power modes correctly, which is definitely not an issue with this plugin!
  10. Just a quick update on my 1660 Super getting 'lost'. I've now ruled out the HW (I think):
      - Booting directly to a native Windows 10 install (I just removed the HBA controller and unRAID USB stick and used a spare 120GB SSD with a fresh Win 10 install), I have run GPU tools for 3 days solid with no issues seen.
      - I've then booted to unRAID (non-Nvidia) and passed it through as the primary GPU to a Win 10 VM, and that's run diagnostics for 2.5 days with no issues.
      All I can think is that the 440.59 Linux drivers don't sit nicely with my Asus 1660 Super OC Phoenix GPU / Ryzen 3600 / Asus B450F motherboard. I guess in the spirit of this plugin all I can do now is not use the GPU until the next unRAID build is released, and hopefully the plugin can be updated to the latest drivers. I appreciate LinuxServer are busy, so I can't expect anything more than they've stated. I have popped a +1 post in the feature request for native unRAID support for Nvidia drivers. I'll focus on fixing the one or two niggles I've not yet resolved and sit patiently with fingers crossed!
  11. Just to add another hand in the air for someone who would love not just baked-in Nvidia GPU drivers, but also the ability to update said drivers if required. I absolutely love unRAID now, I'm a bit of a convert, and love the support. I have no expectation that people should support plugins etc., so I will not be complaining that Limetech haven't 'fixed' my issues yet or anything. I just wanted to say that this feature (and presumably some ability to upgrade Nvidia drivers) would be very much appreciated, especially as it might well solve one of the last niggles I have (my 1660 Super falling off the PCI bus randomly every day or so). I like the talk in the thread about having it optional; we all use our servers for the reasons that are most important to us. I can also imagine (as someone who has been developing all manner of software and dealing with OS issues for 30 years) that opening up things like driver support for GPUs is a bit of a minefield (however, my experience of unRAID and HBA controllers has shown it's pretty good at this!). Anyway, this is just a +1 to the OP's request.
  12. I have quite a collection of USB memory sticks, so when choosing which one to use for unRAID I tried a few, across USB 2.0, USB 3.0 and USB 3.1. By far the most concerning one was the Sandisk Ultra Fit 3.1: just sat doing nothing in a USB slot, it gets inordinately hot and I worried about its longevity. So I tried all my options and found the Sandisk Ultra USB 3.0 16GB/32GB sticks to be performant with no heat issues. As mentioned, the only theoretical advantage of a faster USB drive is at boot: unRAID loads everything into RAM, so you might see a marginal improvement in boot speed using a faster USB stick. Once booted, the USB stick is unused, and how fast unRAID runs is all down to your CPU and RAM. Dockers / VMs etc. will also depend on the speed of the cache drive (if using one) and how much they access the main array, but again, nothing depends on the speed of the unRAID USB stick.
  13. I run a 3900X / X570 setup for my main gaming PC and a 3600 / B450 for unRAID.
      PSU shouldn't hugely matter:
      - The 3900X at stock + motherboard should be < 200W (much less when idling).
      - 8 HDDs (2A each at power up) would be ~200W (much less when idling).
      Assuming SSDs and other items, you'd probably be OK with a 650W. However, if it was me, I'd go for an 850W so that a GPU + overclocking and more HDDs wouldn't need a PSU upgrade.
      I can say that my 3900X on an EVGA Bronze 650 does not overclock that well. It's not the wattage; I think it's the PSU's ability to handle sudden changes in power requirements, as my Seasonic 850W Focus Platinum allows higher overclocks. I was fortunate to also pick up a cheap Corsair HX1200i PSU (a mining rig spare someone was offloading). That is obviously overkill, but like the Seasonic it gives me the same high overclocks on my 3900X.
      RAM: aside from ECC/non-ECC, the only advice I would give is that Ryzen 3rd gen supports higher memory speeds and they can make a performance difference. I have 32GB of DDR4 3600 C18 in my unRAID box (Corsair 3800 C18 as it is good VFM). I am not running ECC (I have on previous servers), but it's a consideration as your motherboard does support unbuffered ECC.
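The power figures in post 13 can be reproduced with a short Python sketch. It assumes the 2A per drive at spin-up is drawn from the 12V rail (the post gives only the current, and 2A x 12V x 8 drives matches its ~200W figure), and it takes the post's upper bound of 200W for CPU + motherboard. The function names are hypothetical, and this is a back-of-envelope estimate rather than a sizing method.

```python
RAIL_VOLTS = 12.0  # assumption: HDD spin-up current is drawn from the 12V rail


def spinup_watts(drives, amps_each=2.0, volts=RAIL_VOLTS):
    """Peak power while all drives spin up at once."""
    return drives * amps_each * volts


def peak_estimate(cpu_board_watts, drives, other_watts=0.0):
    """Worst-case draw: CPU + motherboard, all drives spinning up, plus anything else."""
    return cpu_board_watts + spinup_watts(drives) + other_watts


hdd_peak = spinup_watts(8)       # 8 drives * 2 A * 12 V
total = peak_estimate(200, 8)    # before SSDs, GPU or overclocking headroom
print(f"HDD spin-up: {hdd_peak:.0f} W, CPU + board + HDDs: {total:.0f} W")
```

With roughly 390W as the worst case before SSDs and a GPU, a 650W unit leaves margin and an 850W unit leaves room for later additions, which is the reasoning in the post.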
  14. Thanks for the help. I uninstalled the GPU Stats plugin and rebooted. The issue happened again within 10 minutes (checking with nvidia-smi I get the same GPU Lost message). I rebooted, and it lasted maybe 4 hours or so before happening again. I've used the nvidia-bug-report.sh that is mentioned when nvidia-smi loses the GPU, and also carefully checked the syslog.
      1. Despite my GTX 1660 Super being on the 440.59 supported list (checked on nvidia.com), the nvidia-bug-report.log file states "WARNING: You do not appear to have an NVIDIA GPU supported by the 440.59 NVIDIA Linux graphics driver installed in this system".
      2. Trawling through various logs, I found the error code XID79 just before the GPU went missing on one occasion. According to the Nvidia developer site, this unfortunately can be attributable to pretty much anything: HW error, driver error, temperature etc.
      3. I've been checking the temperatures / HW state of the card. After boot it's in P0 (12W out of 125W) @ 33C, and it then occasionally bumps up to P0 (26W/125W) @ 44C. So even when Plex uses the card, at 44C it's barely ticking over, so I'm pretty sure it's not temperature.
      4. I think (looking at logs) there could possibly be some correlation between drives spinning down and the GPU crashing (or it may well be coincidence). I would like to try bulk spinning the drives down/up to see if power spikes might be upsetting the GPU, as I know HDDs draw the most power when they are spinning up.
         [edit] - I found example user scripts to spin down/up all disks and tried those several times, both whilst the GPU was idle and whilst transcoding a 4K HDR file; no issues found.
      5. I did at some point (more of a quick trial) have some User Scripts to 'tweak the driver for obvious reasons' and also to bump the card back to its lowest power setting. I haven't had these enabled for some time, so I've deleted the scripts entirely and re-installed unRAID-NVidia 6.8.3 from the plugin just to 'clear' things out.
      6. With 100% repeatability, I can trigger the "caller _nv000908rm+0x1bf/0x1f0 [nvidia] mapping multiple BARs" message and the associated memory spanning message just by running nvidia-smi to check the GPU is still there. I did this every 5-10 minutes over lunch, and every time I got an associated message in syslog.
      So nothing conclusive yet: some observations, some clutching at straws, but I sense that some experimentation and discussion might prompt something of note. One test I 'may' do is to go back to the normal unRAID build, pass the GPU through to my Windows 10 VM (it's only spun up once in a blue moon), run something GPU intensive on that and see if it ever loses the GPU. Whilst this changes a few too many variables at once, it would at least indicate the HW itself is OK (power/temperature concerns etc.).
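To put numbers on how often the two kernel messages quoted in these posts appear, a log scan along these lines might help. This is a minimal Python sketch: the marker strings are taken from the messages quoted above, the names `MARKERS` and `count_markers` are hypothetical, and the syslog path in the closing comment is an assumption that may differ per system.

```python
from collections import Counter

# Lower-cased fragments of the two kernel messages quoted in the posts.
MARKERS = {
    "sanity_check": "resource sanity check: requesting [mem",
    "multiple_bars": "[nvidia] mapping multiple bars",
}


def count_markers(lines):
    """Count how many log lines contain each marker (case-insensitive)."""
    counts = Counter()
    for line in lines:
        lowered = line.lower()
        for name, marker in MARKERS.items():
            if marker in lowered:
                counts[name] += 1
    return counts


sample = [
    "kernel: resource sanity check: requesting [mem 0x000c0000-0x000fffff], "
    "which spans more than PCI Bus 0000:00 [mem 0x000c0000-0x000dffff window]",
    "kernel: caller _nv000908rm+0x1bf/0x1f0 [nvidia] mapping multiple BARs",
    "kernel: an unrelated line",
]
print(dict(count_markers(sample)))  # {'sanity_check': 1, 'multiple_bars': 1}

# Hypothetical usage against a real log (path is an assumption):
# with open("/var/log/syslog", errors="replace") as fh:
#     print(dict(count_markers(fh)))
```

Comparing the counts with the number of times nvidia-smi was run over the same period would show whether the one-message-per-call pattern from item 6 holds.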