jsebright

Everything posted by jsebright

  1. OK, so I couldn't resist having another go. This is what I've done and the issues:
     • Made sure the AMD Cezanne graphics device and its driver were uninstalled from the VM.
     • Added the GPU and its audio device to the VM.
     • Fiddled with the Advanced settings XML so that the passed-through audio device sat on the same bus as the graphics, with function 0x1 (roughly matching how the devices appear before passthrough). This might not be necessary if we're not installing drivers.
     • Booted up the VM. Black screens and nothing. But I could remote desktop into the VM, so I did that and rebooted it. It came up with a screen.
     • Made sure the keyboard & mouse were attached, and logged in.
     • Rebooted - came up straight away with the screen. Changed the resolution to match the monitor, rebooted, and it's still good. Shut it down, then rebooted, and all fine. I even tried pausing the VM from Unraid then starting it, and that worked.
     So - quite successful. But I can only see one monitor and I have two plugged in (I think both connections are plugged in at the server anyway!). The display adapter in Device Manager shows "Microsoft Basic Display Adapter". Obviously not making the most of the chip, but it's perfectly usable. I'm not going to go further and try to install any driver updates, as that has a high probability of jamming something and the server might require a hard reboot. It's possible to get this far. Good luck!
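     For illustration, the XML fiddling I mean looks something like this - the host source addresses (bus 0x30 here) and the guest bus/slot numbers are just examples from my box, so treat it as a sketch rather than something to copy verbatim:

         <hostdev mode='subsystem' type='pci' managed='yes'>
           <!-- the iGPU itself: guest function 0x0, multifunction enabled -->
           <source>
             <address domain='0x0000' bus='0x30' slot='0x00' function='0x0'/>
           </source>
           <address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x0' multifunction='on'/>
         </hostdev>
         <hostdev mode='subsystem' type='pci' managed='yes'>
           <!-- its HDMI audio: same guest bus/slot as the GPU, function 0x1 -->
           <source>
             <address domain='0x0000' bus='0x30' slot='0x00' function='0x1'/>
           </source>
           <address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x1'/>
         </hostdev>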
  2. Can't remember exactly, but the VM is Q35-5.1 (the latest Q35 for 6.9.2) with SeaBIOS. I think that some time ago I had to move from i440fx to Q35 to get a GPU passthrough working. I've not tried much recently as I don't want to crash the machine again, but I think on the first try I hadn't even done the VFIO binding. Tempted to have another try to see if I can get something basic working without the drivers that doesn't crash the system following a VM shutdown / startup. Will post back when I do that.
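     For anyone following along: in 6.9.x the VFIO binding can be done from Tools > System Devices, which (as far as I understand it) just ends up writing something like the line below to /boot/config/vfio-pci.cfg on the flash drive. The PCI address and vendor:device IDs here are only examples for a Cezanne iGPU and its audio function - check your own values on the System Devices page:

         BIND=0000:30:00.0|1002:1638 0000:30:00.1|1002:1637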
  3. Have just upgraded to this CPU with the aim of removing a graphics card and reducing power consumption, but still hoping to pass through to a Windows VM. The initial passthrough worked OK (I think). (I have not tried to mess with the GPU BIOS.) Then, trying to install the drivers, I had some problems. The driver install seemed to hang, and once I'd got round that (possibly), I found that turning off the VM caused horrible problems, with CPU usage going crazy and the system becoming unresponsive. A reboot was the only solution. Since then I've avoided this, hoping that there will be some fixes in a new driver or in 6.10 (not ready to try a beta build yet as I only have the one server). I have meddled with an Ubuntu VM, but couldn't get that to pass through. Am tempted to try a clean Windows install to see what happens, but am wary of locking everything up again; I might also see how I get on without drivers, as I'm not worried about gaming - just occasional desktop use. I hadn't considered that dumping the BIOS might help, as I haven't needed to go that route before with the standalone (very old) AMD card that I passed through. Don't really want to put the old card back in, but that thought has crossed my mind. Interested to follow this and see what progress is made.
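     If I do try the dumped BIOS route, my understanding is that the dump just gets referenced from the GPU's hostdev entry in the VM XML, something like the below - the source address and rom path are made-up examples:

         <hostdev mode='subsystem' type='pci' managed='yes'>
           <source>
             <address domain='0x0000' bus='0x30' slot='0x00' function='0x0'/>
           </source>
           <!-- point the guest at the dumped vbios file -->
           <rom file='/mnt/user/isos/vbios/cezanne.rom'/>
         </hostdev>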
  4. ing Chia...
         ey at path: /root/.chia/mnemonic.txt
         ey at path: /root/.chia/mnemonic2.txt
         ot directory "/plots".
         ot directory "/plots1".
         ot directory "/plots2".
         ot directory "/plots3".
         ot started yet
         daemon
         vester: started
         mer: started
         l_node: started
         let: started
         annot create directory '/root/.chia/flax': File exists
         ing Flax...
         ey at path: /root/.chia/mnemonic.txt
         ey at path: /root/.chia/mnemonic2.txt
         ot directory "/plots".
         ot directory "/plots1".
         ot directory "/plots2".
         ot directory "/plots3".
         ot started yet
         daemon
         vester: started
         mer: started
         l_node: started
         let: started
         ing Plotman...
         ing Chiadog...
         Chiadog...
         ing Flaxdog...
         Flaxdog...
         Machinaris API server...
         Machinaris Web server...
         d startup. Browse to port 8926.
     I might have restarted the docker since - not sure, sorry. The usual first few characters are cut off - if you know where I can get the full output, I'll update this. Still getting plotting failures. This page https://github.com/madMAx43v3r/chia-plotter/issues/574 seems to suggest it may be RAM related. I have had some other crashes with the intense processing (otherwise the server is stable, but only lightly used), so I might do a full reboot and see what I can tweak.
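     One thing I'll check after the reboot (just a guess on my part, prompted by that issue thread) is whether the kernel has been killing the plotter when memory runs out:

         # look for out-of-memory kills in the kernel log on the Unraid host
         dmesg -T | grep -i -E "out of memory|oom"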
  5. Sorry, that's something that took me a while to get right, and I couldn't see what was wrong with it. I'd suggest you try reverting to defaults and keeping things as simple as possible. Check that the drive paths are mapping to where they should be. Can't offer any more help as I don't know enough...
  6. @localh0rst I had the internal server error too. Logged into the docker and ran "flax init", which seemed to start it for me. May or may not work for you...
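     For anyone unsure how to get at it, I just opened a console on the container from the command line - the container name here is an assumption, use whatever yours is called:

         docker exec -it machinaris bash
         flax init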
  7. Ah yes. Why didn't I think of that? Has the advantage that it doesn't get killed if the docker restarts. Watching a few together in just one window is pretty good though, and a bit less to keep track of.
  8. Nice. Haven't seen that before. I couldn't seem to open two windows of the Unraid docker console, but I did find I could just run "watch -n10 du -sh /plotting /plotting2" and it watches both folders in one window.
  9. Thanks for your reply. It seems like the runs are crashing sometimes - the temp dirs have files from different runs left over in them. I've just checked my config against the wiki sample (was doing that as I saw your previous post). I've got max jobs set to 1, but had the staggers set differently - that might have been causing the issue. I've got the threads set to 4 - I've only got 5 pairs of cores available (Ryzen 2600 with a pair pinned in case a VM wants a look-in) and it does tend to run them at about 90%. Perhaps this needs taking down from the default? I'll see how it goes with the default stagger options before reducing it. Thanks again for everything you're doing with this.
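     For reference, the bits of plotman.yaml I was comparing against the wiki sample look roughly like this - the values are just what I'm currently trying on my hardware, not a recommendation, and the exact keys may differ between Plotman versions:

         scheduling:
           tmpdir_stagger_phase_major: 2
           tmpdir_stagger_phase_minor: 1
           tmpdir_max_jobs: 1
           global_max_jobs: 1
           global_stagger_m: 30
         plotting:
           type: madmax
           madmax:
             n_threads: 4
             n_buckets: 256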
  10. It also took me some time to get MadMax plotting to work. I also think there's an issue where it's not clearing up the temp folders. I've got two SSDs that I'm using, one as the primary temp and one as tmp2. They are slowly filling up until plotting stops. After it has all ground to a halt - see /plotting and /plotting2:

         root@Tower:/# df -h
         Filesystem      Size  Used Avail Use% Mounted on
         /dev/nvme0n1p1  932G  673G  259G  73% /
         tmpfs            64M     0   64M   0% /dev
         tmpfs            32G     0   32G   0% /sys/fs/cgroup
         shm              64M   60K   64M   1% /dev/shm
         shfs            3.7T  893G  2.8T  24% /plots
         /dev/sde1       447G  313G  135G  70% /plotting
         /dev/nvme0n1p1  932G  673G  259G  73% /id_rsa
         /dev/sdf1       447G  447G   24K 100% /plotting2
         /dev/sdb1       7.3T  7.2T  153G  98% /plots2
         /dev/sdc1       7.3T  7.2T  153G  98% /plots3
         tmpfs            32G     0   32G   0% /proc/acpi
         tmpfs            32G     0   32G   0% /sys/firmware

     I'm having to stop plotting, clear the files, and start it off again. Don't know if this is a MadMax or a Machinaris issue. @guy.davis ?? This is still faster than the chia plotter, but I don't have much disk space left...
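     For now I'm clearing the leftovers by hand with something along these lines, and only while no plot jobs are running - the *.tmp pattern is just my guess at what MadMax leaves behind:

         # remove leftover MadMax temp files (make sure plotting is stopped first!)
         find /plotting /plotting2 -maxdepth 1 -name "*.tmp" -delete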
  11. Just did a "check for updates" and it's available. Just waiting for some plots to finish. Looking forward to the latest version.
  12. What's the best way to upgrade the Unraid docker if it doesn't show "upgrade available"? Will an edit of the settings pull the new image? PS: many thanks to @guy.davis for making and supporting this. Good luck with your farming.
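     (From the command line I guess the equivalent would be something like the below - I'm assuming the image name here, and I'd rather do it through the GUI anyway:)

         docker pull ghcr.io/guydavis/machinaris:latest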
  13. This problem occurred again, and then I think I worked out what was going on. The device was "disappearing" when I started a VM, but only a certain one. I'd had to fiddle with that VM a day or so ago as it wouldn't start. Something must have got messed up, meaning the VM was trying to take control of the nvme drive. I could spot the device in the xml, but am not confident enough to edit it. Just saving the VM settings from the forms view didn't clear the device, but selecting all the possible usb devices and the one pcie device, saving, then clearing them all and saving again seems to have sorted it out. Thanks for the support - I know a bit more about checking disks now.
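     For anyone hitting the same thing, the quick sanity check I've been using from the Unraid terminal is to confirm the nvme controller is still bound to the nvme driver rather than vfio-pci:

         # "Kernel driver in use" should say nvme, not vfio-pci
         lspci -nnk | grep -A3 -i "non-volatile memory"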
  14. Am up to date on Bios - a reasonably new one that's been in for a few weeks before this issue. Have added the script. Will fix the other issues and see how it goes. Thanks both.
  15. Ah, thanks. Just cancelled it and rebooted. This is an unassigned drive - not part of the array. So it looks like /dev/sdh1 is OK for this. The BTRFS cache issue was the primary error (and still probably is). It's just that @JorgeB spotted another issue to do with this other drive that needs fixing. One problem turns into two...
  16. Hi @JorgeB Many thanks for your support on this - really appreciate it. Have now rebooted (switched off auto start of dockers & VMs before doing this) and scrubbed the disk again to fix the errors. Will double check the error count and zero them. Trying to fix the UD URBACKUP disk I get the following:

         root@Tower:~# xfs_repair /dev/sdh
         Phase 1 - find and verify superblock...
         bad primary superblock - bad magic number !!!
         attempting to find secondary superblock...
         .found candidate secondary superblock...
         unable to verify superblock, continuing...
         .found candidate secondary superblock...
         unable to verify superblock, continuing...
         ................................................

     The dots then continue to fill up the window - not sure how long it will take, but I'll just leave it running. When that's done I can reboot again and will take some diagnostics, hoping that it's clean. It's still unclear why the nvme drive is dropping off. I can try re-seating them, but they've been good for quite a few months at least.
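     Edit: as per the post above, the trick was to point xfs_repair at the partition rather than the whole disk (which is why it couldn't find a superblock), and it doesn't hurt to do a dry run first:

         # check only (no changes), then repair for real
         xfs_repair -n /dev/sdh1
         xfs_repair /dev/sdh1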
  17. The errors make sense - as I think you put in the FAQ, some of this is not as obvious as it could be. Diagnostics from after the cache fixing attempt are attached - but with the drive "missing", although it was showing in UD. tower-diagnostics-20210425-1050.zip
  18. Checking btrfs dev stats showed lots of errors. I ran scrub first without the "Repair corrupted blocks" option, and it showed the following (which I first thought meant no errors, but presumably doesn't):

         UUID:             e8b8d9ec-0ad2-4867-b3cf-87b43a0d9d15
         Scrub started:    Sun Apr 25 07:16:27 2021
         Status:           finished
         Duration:         0:03:07
         Total to scrub:   1.10TiB
         Rate:             6.02GiB/s
         Error summary:    verify=1438 csum=167501
           Corrected:      0
           Uncorrectable:  0
           Unverified:     0

     I then ran it with the "Repair corrupted blocks" option anyway, and it corrected loads of errors. (It seems a bit odd that the first "scrub" didn't highlight that there was something needing correction - then I looked at the line "Error summary: verify=1438 csum=167501" from the first run.) It finished, but it's odd that the "verify" number is a bit short of the first scrub run, and the corrected count doesn't match the csum count (though I don't know if it should):

         UUID:             e8b8d9ec-0ad2-4867-b3cf-87b43a0d9d15
         Scrub started:    Sun Apr 25 07:28:00 2021
         Status:           finished
         Duration:         0:03:07
         Total to scrub:   1.10TiB
         Rate:             6.02GiB/s
         Error summary:    verify=1347 csum=167501
           Corrected:      168848
           Uncorrectable:  0
           Unverified:     0

     Ran another scrub just to be sure, and this time there are clearly no errors:

         UUID:             e8b8d9ec-0ad2-4867-b3cf-87b43a0d9d15
         Scrub started:    Sun Apr 25 07:39:58 2021
         Status:           finished
         Duration:         0:03:03
         Total to scrub:   1.10TiB
         Rate:             6.15GiB/s
         Error summary:    no errors found

     Set up the script from the FAQ on the main pool and the cache pool so I will get notified of errors. Then started some dockers - OK. Started some VMs and then I had the error again - repeating every minute, just in case I missed it the first time. Not sure what to do now. Can't check cables as there are none - the nvme's are in the motherboard. Time to take some extra backups (assuming it's not too late!)
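     For the record, the dev stats check I was running by hand is just the following (using my cache pool mount point; adjust to suit):

         # show per-device error counters for the pool
         btrfs dev stats /mnt/cache
         # once the underlying problem is fixed, reset the counters to zero
         btrfs dev stats -z /mnt/cache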
  19. Thanks, have seen recent errors on that. Thought it was due to backups from locations with intermittent connectivity. Pool status looked ok, but will get some diagnostics. It's currently doing a parity check due to an unrelated unplanned reboot. Sent from my ONEPLUS A5000 using Tapatalk
  20. Suddenly had the message "Cache pool BTRFS missing device". The pool I use for VMs and Docker, running from 2 nvme drives, suddenly had one drive go missing. I closed the VMs, took a diagnostics, and rebooted. On reboot the pool appears fine and the VMs & dockers are running, but I don't know what to do to check the health of the cache pool - will BTRFS have fixed any differences automatically, is there something I should do to force checks, or is there something I must avoid doing after this issue? Diagnostics attached (there might be a mess of other issues in there as I have a tendency to fiddle...) tower-diagnostics-20210424-0857.zip
  21. I've just upgraded from 6.8.3 to this beta and Wake-on-lan is working fine for me just as it was before. I think it's more a motherboard setting than Unraid... My sleep settings show set "g" before going to sleep.
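     For context, the "g" there is the standard ethtool wake-on-LAN flag for magic packets; as far as I understand it, the sleep settings just boil down to something like this before suspending (the interface name will vary):

         # enable magic-packet wake on the NIC, then check it took
         ethtool -s eth0 wol g
         ethtool eth0 | grep Wake-on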
  22. This is great! I've been coming back to this forum multiple times a day waiting for an update. Just today I'd decided to clear & reformat my cache as I don't really trust btrfs, and this update has come just at the right time for me to upgrade with an empty cache and reconfigure things nicely. Hope that the "various non-work-related challenges" have all been resolved and you and your families are safe & well. Thanks for providing such a great system. Will be moving to this in the next day.
  23. Awesome, thanks for confirming.
  24. Any chance this could be acknowledged and marked for a future fix (even if it's not in 6.9)?
     To reproduce:
     • Make a VM with a disk (I called it TEST). Don't set it to start straight after creating.
     • Change the name of the VM (to TEST2, for example).
     • Edit the VM (I changed the memory allocation). Click "UPDATE" and wait...
     • Give up and click on the "VMS" tab to get away from the hang.
     • Note that on going back in to edit the VM, your change didn't stick.
     Workaround: when editing the VM (in "FORM VIEW"), note that the "Primary vDisk location" is set to "Auto" and has the wrong path (based on the updated name). Change that to "Manual" and it finds the right path with the original name. Make other changes and "UPDATE". Note that if you go back in to edit the VM, the "Primary vDisk location" is still on "Auto" and pointing to the incorrect location; any other changes you try to make will need the same "Manual" switch. I think the bug is that the "Primary vDisk location" sticks on Auto and won't change to Manual. (Perhaps due to the way the VM was created, or because the file is in the regular "domains" share, named vdisk1.img ... )
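     To make the Auto/Manual thing concrete, the disk definition in the XML still points at the old name, roughly like this (paths based on my TEST/TEST2 example above; the details of the disk element will vary):

         <disk type='file' device='disk'>
           <!-- still the original path under the old VM name... -->
           <source file='/mnt/user/domains/TEST/vdisk1.img'/>
           <target dev='hdc' bus='virtio'/>
         </disk>
         <!-- ...but with the location on "Auto", the form expects
              /mnt/user/domains/TEST2/vdisk1.img, which doesn't exist -->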
  25. It's been doing this for ages (for me at least). If the drive images are not in a folder that matches the machine name, then you can't update the VM because the vDisk location defaults to "Auto". Change this to "Manual" and it works fine. This really needs changing - if the location doesn't match the Auto path, could the editor show Manual and the real location instead?