Warrentheo

Everything posted by Warrentheo

  1. From his lsscsi.txt file:

         [0:0:0:0]   disk   JetFlash Transcend 16GB    1100   /dev/sda   /dev/sg0
           state=running queue_depth=1 scsi_level=7 type=0 device_blocked=0 timeout=30
           dir: /sys/bus/scsi/devices/0:0:0:0
           [/sys/devices/pci0000:00/0000:00:01.2/0000:01:00.0/0000:02:08.0/0000:05:00.3/usb4/4-4/4-4:1.0/host0/target0:0:0/0:0:0:0]
         [6:0:0:0]   disk   ATA    WDC WDS500G2B0A     90WD   /dev/sdb   /dev/sg1
           state=running queue_depth=32 scsi_level=6 type=0 device_blocked=0 timeout=30
           dir: /sys/bus/scsi/devices/6:0:0:0
           [/sys/devices/pci0000:00/0000:00:01.2/0000:01:00.0/0000:02:0a.0/0000:07:00.0/ata6/host6/target6:0:0/6:0:0:0]
         [13:0:0:0]  disk   ATA    SanDisk SSD G5 B    00WD   /dev/sdc   /dev/sg2
           state=running queue_depth=32 scsi_level=6 type=0 device_blocked=0 timeout=30
           dir: /sys/bus/scsi/devices/13:0:0:0
           [/sys/devices/pci0000:00/0000:00:08.3/0000:0c:00.0/ata13/host13/target13:0:0/13:0:0:0]

     The NVMe module may not be loaded. And from his df.txt:

         Filesystem      Size  Used  Avail  Use%  Mounted on
         rootfs           16G  610M    15G    4%  /
         tmpfs            32M  652K    32M    2%  /run
         devtmpfs         16G     0    16G    0%  /dev
         tmpfs            16G     0    16G    0%  /dev/shm
         cgroup_root     8.0M     0   8.0M    0%  /sys/fs/cgroup
         tmpfs           128M  244K   128M    1%  /var/log
         /dev/sda1        15G  215M    15G    2%  /boot
         /dev/loop0      9.2M  9.2M      0  100%  /lib/modules
         /dev/loop1      7.3M  7.3M      0  100%  /lib/firmware

     So it doesn't look like the NVMe drives are detected... just the SSDs.
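     A quick way to confirm that from the console would be to check whether the nvme driver is actually present and whether the drives show up at the PCI level (just a sketch, since lsscsi typically does not list NVMe devices either way):

         # See if the nvme module is loaded (it may also be built into the kernel)
         lsmod | grep nvme

         # Try loading it manually if it is missing
         modprobe nvme

         # NVMe drives would normally show up here rather than in lsscsi
         ls -l /dev/nvme*
         lspci -nn | grep -i "non-volatile"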
  2. Ah sorry, I knew it defaulted to 2 Parity, but I thought you could select more from a drop down underneath the 2 like you can with the data drives... Haven't been able to shut my array down to double check... Maybe you would want to replace and rebuild one drive at a time then, that way you are not without parity protection... Just in case...
  3. I think you can just shut down the pool and add as many parity drives as you wish, if I remember correctly... Once the new parity drives are up and running, you can use the "New Config" tool in the Tools pane to take drives out of the pool, and just check the "Parity drive is already up to date" option to turn the pool back on... Afterward, you can use the old drives as you see fit...
  4. Update, 10 days later... It appears that the data on every drive in the pool somehow got very corrupted... I am very miffed about this (around 11 TB possibly lost), but I want to be clear ahead of time that I am NOT expressing anger at UnRaid or Lime Technology, since I am sure about 95% of this is my fault somehow... My main goals here are to figure out how this happened in the first place and, if possible, find a way to keep it from happening to someone else... I realize there are several steps where I got punch-happy and did things I knew even at the time I should not be doing... That said, total array corruption is a pretty major failure even with it probably being mostly my fault, and I want to prevent it from happening to someone else...

     I was trying to reset the cache pool back to empty when I got stuck trying to reboot... The data corruption occurred after I forced a hard reboot when the soft reboot seemed to hang for multiple minutes... Is some part of the array/pool file table stored on the cache drive? Did wiping the cache pool somehow erase the file table for the pool? I upgraded the SSDs in my cache pool around August of 2019, and that wipe and replacement of the pool went without a hitch...

     After the corruption, my array went from around 28 shares and around 11 TB of data to only 2 shares and about 40 GB of data left... This weird partial corruption is very difficult for me to figure out, because every drive appears to have the same 2 shares left, they are not spread evenly across the 4 drives, and yet the corruption was not total; somehow 5% of the data remained...? The XFS recovery software I am using says that nearly the entire drive ended up in the "Lost+Found" folder, with some of it apparently recoverable... I think I will be able to recover the large media files that way, mostly the BluRay backup dumps... Figuring out the small files is going to be very tedious, since a decent part of the array was a Steam games folder, so a huge portion of the small files are going to be meaningless game files...

     Is this somehow caused by hardware failure of the CPU/RAM/motherboard? I suppose I need to shut down and do a full long-form hardware scan just to make sure... I realize now that forcing the hard shutdown when the kernel clearly didn't want to was a mistake, but I would expect minor bit rot, not near-total corruption, from that sort of thing... Currently I am still trying to do a post-mortem of the issue; I have been running data recovery tools, but they take around 2 days to run on a 6 TB drive, so that has been slow... So far the best solution I have found is manual data recovery where I "UnDelete" the entire drive, then manually rename a few hundred thousand files from memory and pray that the files are whole... Any feedback or assistance with this process, or even a pat on the back and commiseration, would be appreciated...
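     For the Lost+Found triage, the rough approach I am taking is to separate the big media files from the sea of small game files first (just a sketch, assuming the recovered files land in each disk's lost+found folder; the disk number and size cutoffs are only examples):

         # List recovered files over 1 GB on one array disk, largest first,
         # so the BluRay backup dumps can be identified and renamed before anything else
         find /mnt/disk1/lost+found -type f -size +1G -printf '%s\t%p\n' | sort -nr | head -50

         # Count the small files to get a feel for how much manual renaming is left
         find /mnt/disk1/lost+found -type f -size -10M | wc -l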
  5. Both the GUI, SSH, and even direct local access all show the same info: the data isn't in the root of the drives anymore. No files were moved in the main pool (I just emptied the last few files off the cache). The files were there before the hard reboot and were all gone afterward. Thank you for the replies; I am out of school, and will get home and try some of these suggestions and see where we are at...
  6. The cache was used primarily as a cache and for VM virtual disks, with some other minor knick-knack files thrown in... 98% of the data on the server was in the main pool stored on magnetic media; only 1 share was set to "Cache Only"... I disabled VMs, disabled Docker, deleted their associated image files, and had moved all other files off of the cache... It was reporting empty when I stopped the pool, removed the cache drives with the "New Config" option (and only the cache drives were dropped), and performed a blkdiscard on the cache drives. I then attempted the reboot, which locked up the computer... Those shares were there before this reboot, and after the hard reboot it appears every drive was somehow affected and the files were removed... The drives still report the correct amount of data stored on them, so I don't think it is gone, just no longer accessible through the normal shares... Is there a place where the files are stored that I can use to try and rebuild the pool, even if I have to do it manually...?
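     For reference, the wipe itself was just the standard util-linux command run against each cache SSD after they had been dropped from the config (the device names here are placeholders, not the actual ones from my system):

         # WARNING: blkdiscard irreversibly discards every block on the target device
         blkdiscard /dev/sdX   # first cache SSD
         blkdiscard /dev/sdY   # second cache SSD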
  7. I have checked the individual /mnt/disk?/ locations; they also show the same 2 shares that are reflected under /mnt/user/. I am trying to find a way to restore the original files if possible... or else find where the files went, set the pool back up, and copy things back manually...
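     In case it helps, this is roughly how I am comparing what each disk actually holds against the merged user-share view (a sketch using the standard Unraid mount points):

         # Top-level folders on each array disk should line up with the user shares
         for d in /mnt/disk*; do
             echo "== $d =="
             ls -la "$d"
             du -sh "$d"/* 2>/dev/null
         done

         # Compare against what the merged view reports
         ls -la /mnt/user/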
  8. That occurred just before the issue and was intentional, but I don't think it is related to the actual problem... I had corrupted the cache pool about a week ago and repaired it; however, I ended up deciding the repair was less than ideal and decided to format and restore the data I had on it from backup... The format worked as intended, and as I was getting ready to perform the restore I issued a reboot command, at which point the system locked up for almost 15 minutes (the longest previous reboot was around 5 minutes)... I was unable to shut down with any method, so I was forced to hard shutdown the system, and the pool appeared to be corrupted after this reboot... (The empty cache pool appears to be working correctly)
  9. Hello, I need some help. I was forced to hard reboot my server (I run a Windows gaming system on top of it as well). When the system returned, only 2 of the 25 or so file shares were still present... I have booted the array into maintenance mode and run an xfs_repair on each drive in the array, but this didn't seem to help... I know a bit about Linux, but I also know that I should ask for help before proceeding... The drives still appear to have the correct amount of space consumed on them, so I think the data is still there; I just can't seem to find a way to safely access it... Your assistance is greatly appreciated here... qw-diagnostics-20200712-1912.zip
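     For completeness, this is the sort of xfs_repair pass I ran in maintenance mode; on Unraid the array disks are addressed as /dev/mdX so that parity stays in sync (a sketch, with the dry run first so nothing is written until the report has been read):

         # Dry run: report problems on disk 1 without changing anything
         xfs_repair -n /dev/md1

         # Actual repair, repeated for each data disk (/dev/md2, /dev/md3, ...)
         xfs_repair /dev/md1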
  10. I just got notified that the plugin is not known to the Community Apps plugin or the Fix Common Problems plugin, and when I search for it now, it doesn't show up in the list of plugins that can be downloaded... When I search for "NVIDIA" in the Community Apps plugin, it no longer shows up... Rebooting the server and uninstalling the GPU Statistics plugin changed nothing... Did this plugin just go unsupported, or is there a glitch in the Community Apps plugin? If it is now unsupported, I will be very sad to see it go... 😢 Still, thank you to the author no matter which is the case...
  11. It is a 2TB pool, 80% full, and was originally in raid0; the only issue it has had is being booted once with one drive missing... I am trying to see if it is possible to start it again in raid0, or whether the array is already corrupted?
  12. That is correct; the question is whether it is possible to skip the conversion to the normal raid1 mode and just leave it in raid0 mode...
  13. Hello, I recently moved locations, and during the move one of the cache drives that I have in a raid0 btrfs pool came disconnected... The array got started with only one of the 2 drives... After troubleshooting, I discovered the issue and stopped the array; on the main screen the drive had become "unassigned"... When I re-add the drive, UnRaid now warns me that the pool will be formatted when the array is started (I think it is trying to come back up in the normal mirror raid mode, and will then make me switch it back to a raid0 pool)... Is there a way to just re-add the drive to the original raid0, or am I forced to format the drives? qw-diagnostics-20200310-0910.zip
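      Before accepting the format, I would like to check from the console whether the two devices still agree that they belong to the same pool (just a sketch; the device name is a placeholder for whichever drive was re-added):

          # Make sure the kernel has scanned all partitions for btrfs metadata
          btrfs device scan

          # Both cache devices should report the same UUID and "Total devices 2"
          btrfs filesystem show

          # Read-only inspection of one member; a raid0 pool needs both devices to mount
          btrfs check --readonly /dev/sdX1
          mkdir -p /tmp/cachetest && mount -o ro /dev/sdX1 /tmp/cachetest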
  14. I don't think Windows even thinks in terms of IOMMU groups; most likely you will need to temporarily boot off of a Linux Live USB/DVD image... The commands are:

          for d in /sys/kernel/iommu_groups/*/devices/*; do
              n=${d#*/iommu_groups/*}; n=${n%%/*}
              printf 'IOMMU Group %s ' "$n"
              lspci -nns "${d##*/}"
          done

      You have the setup I am looking into getting as well, so that info would be useful 😄
  15. I also have this bug going from rc5 to rc6: VMs show fine until one gets launched (either from the GUI or the command line), then it goes blank... I only have one VM, so I am willing to rebuild the libvirt partition for testing if needed... qw-diagnostics-20191118-1759.zip
  16. Yah, I have also been looking into going this direction; I was looking at a Crosshair VIII board and just need to know the IOMMU groups...
  17. Install the Community App Plugin, then install the version of Virt Manager maintained by djaydev (The other version is no longer maintained)... This will give you all the things you think are missing...
  18. Back up and running; so far no issues. I appreciate your help... Now we just need to get UnRaid updated to show each share's cache status on the main share page 😛
  19. Thank you for the reply, I will update with the results when it is completed 🙂
  20. I am about to swap out my M.2 Raid-0 cache and upgrade the drives in preparation for switching to X570/PCI-e 4.0... To that end, I need to completely remove all use of the cache from all shares and VMs, and migrate everything to the main spinning array so I can nuke the current cache pool so it can be sold, then set up the new pool like a new install... I want to make the move without any data mishaps, and was hoping to get some feedback on the steps needed to make sure the move goes smoothly... Any tips would be appreciated... I am sorry if this is repeated somewhere else; I have been unable to find anything else related to this out there...
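      The rough sequence I have pencilled in so far, in case anyone wants to sanity-check it (only a sketch of my understanding of how mover behaves, not an official procedure):

          # 1. In the GUI: disable Docker and VMs, and set every share that currently
          #    uses the cache to "Use cache: Yes" so mover will push its files to the array
          # 2. Then run mover from the console and confirm the cache pool is empty
          mover
          df -h /mnt/cache
          find /mnt/cache -mindepth 1 | head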
  21. I have an EVGA GTX1070 and have 6.7.3-rc2 installed... It has not given me an issue, and I also would not expect the slightly updated Linux kernel that came in rc1 to cause that sort of issue... Not much has changed in rc1 & rc2...
  22. An MSI GT240 should not need the ROM file added for it; only the newer GTX 10-series and RTX cards should need that... Older cards should boot with no special consideration... You do still need to pass the entire IOMMU group, which means you need to make sure you also pass through:

          01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GT215 [GeForce GT 240] [10de:0ca3] (rev a2)
                  Subsystem: Micro-Star International Co., Ltd. [MSI] GT215 [GeForce GT 240] [1462:1995]
          01:00.1 Audio device [0403]: NVIDIA Corporation High Definition Audio Controller [10de:0be4] (rev a1)
                  Subsystem: Micro-Star International Co., Ltd. [MSI] High Definition Audio Controller [1462:1995]

      I don't know if you did or not... I would recommend moving this to the troubleshooting thread though, since this should not be an issue with 6.7.3 (mine is working fine, and nothing changed in 6.7.3 should affect this (updated kernel and Docker revert))... If you do need to repost this, I would recommend that you include your VM XML file along with the diagnostics...
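      If you want to be certain both functions stay reserved for the VM, one option (a sketch of the manual method; the IDs come from the lspci output above) is to stub them by ID on the syslinux append line:

          # /boot/syslinux/syslinux.cfg - append line for the default boot entry
          append vfio-pci.ids=10de:0ca3,10de:0be4 initrd=/bzroot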
  23.       <hostdev mode='subsystem' type='pci' managed='yes' xvga='yes'>
              <driver name='vfio'/>
              <source>
                <address domain='0x0000' bus='0x0a' slot='0x00' function='0x0'/>
              </source>
              <alias name='hostdev0'/>
              <rom file='/mnt/user/domains/msi1030.dump'/>
              <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
            </hostdev>

      Try removing xvga='yes' from the XML line for your card... From what I have read, this setting is no longer supported, so this is my best guess... The next best guess is to ask where you got the video BIOS file for the card; try re-dumping it directly from the UnRaid command line for this specific card if it was not obtained from there in the first place... The third option is to try using the Nvidia plugin for UnRaid (read the documentation for it before you do)... And the last option is to "Create" a new VM with the same options as the current one and just point it at the same drives and image files; this resets all the settings and sometimes fixes these types of issues... This is what the line for my EVGA GTX1070 passthrough reads like:

            <hostdev mode='subsystem' type='pci' managed='yes'>
              <driver name='vfio'/>
              <source>
                <address domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
              </source>
              <alias name='hostdev0'/>
              <rom file='/mnt/cache/system/EVGA_GTX1070_FTW_DT_DUMPFROMUNRAIDCOMMANDLINE.rom'/>
              <address type='pci' domain='0x0000' bus='0x05' slot='0x01' function='0x0'/>
            </hostdev>

      Edit: This is minor, but renaming your VBIOS file from ".dump" to ".rom" lets it show up in the VM creation WebUI for UnRaid...
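      For the re-dump, the way I did mine from the UnRaid command line was through the card's sysfs rom attribute (a sketch; the PCI address and output path are examples, and it generally has to be done while nothing is bound to or using the card):

          # Use the PCI address of the GPU's VGA function as shown by lspci
          cd /sys/bus/pci/devices/0000:0a:00.0

          # Enable the rom attribute, copy it out, then disable it again
          echo 1 > rom
          cat rom > /mnt/user/domains/gt1030_redump.rom
          echo 0 > rom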
  24. Wow, that is quite the changelog! Awesome! Will begin testing immediately...
  25. Just installed 6.7.0-rc7... It has a blank button right next to the power button on the webui... When pressed, this adds a "flyout" effect with nothing in it... Removing the System Buttons plugin removes the button...