Singularity42

Members
  • Posts

    34

Everything posted by Singularity42

  1. Yep, figured that out, thanks. Do you know why this happens? I've seen it reported here quite a few times, I've had it happen a few times myself, and some friends have said it's happened to them as well. Should it retain labels?
  2. Server went AWOL overnight. CPU processes were maxed out at 99.99%. I resolved that, but the new issue is that I cannot mount my Unassigned disks, where my VM images live:

     May 31 08:25:14 Singularity kernel: sd 1:0:12:0: [sdm] 3907029168 512-byte logical blocks: (2.00 TB/1.82 TiB)
     May 31 08:25:14 Singularity kernel: sd 1:0:12:0: [sdm] Write Protect is off
     May 31 08:25:14 Singularity kernel: sd 1:0:12:0: [sdm] Mode Sense: 46 00 10 08
     May 31 08:25:14 Singularity kernel: sd 1:0:12:0: [sdm] Write cache: enabled, read cache: enabled, supports DPO and FUA
     May 31 08:25:14 Singularity kernel: sdm: sdm1
     May 31 08:25:14 Singularity kernel: sd 1:0:12:0: [sdm] Attached SCSI disk
     May 31 08:26:21 Singularity emhttpd: Samsung_SSD_870_S6R4NJ0R607499P_33001438038737393 (sdm) 512 3907029168
     May 31 08:26:21 Singularity emhttpd: read SMART /dev/sdm
     May 31 08:26:32 Singularity unassigned.devices: Disk with ID 'Samsung_SSD_870_S6R4NJ0R607499P_33001438038737393 (sdm)' is not set to auto mount.
     May 31 09:07:40 Singularity unassigned.devices: Error: Device '/dev/sdm1' mount point 'Samsung_SSD_' - name is reserved, used in the array or by an unassigned device.
     May 31 09:08:22 Singularity unassigned.devices: Error: Device '/dev/sdm1' mount point 'Samsung_SSD_' - name is reserved, used in the array or by an unassigned device.

     Any ideas on how to resolve this?

     Edit: I see, it's the mount point name, which fails when I attempt to change it.

     Edit 2: It looks like it also gave the same label to my cache disk:

     /dev/sdl1 on /mnt/cache type xfs (rw,noatime) [Samsung_SSD_]
     ...
     sdl 8:176 0 232.9G 0 disk
     └─sdl1 8:177 0 232.9G 0 part /mnt/cache

     Thanks.

     singularity-flash-backup-20210517-1342.zip
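The diagnosis in the two edits above (a rejected mount point name that is already in use as the cache disk's label) can be cross-checked from the pasted text alone. This is only a rough sketch: it assumes the syslog lines and the `mount` line have exactly the shapes quoted in the post, and the function names are made up for illustration. It does not reflect how Unassigned Devices itself picks or validates names.

```python
import re

# Shape of the Unassigned Devices error lines quoted in the post (assumed stable).
RESERVED = re.compile(
    r"unassigned\.devices: Error: Device '([^']+)' mount point '([^']+)' - name is reserved"
)
# Shape of a `mount` line that ends with a bracketed filesystem label.
MOUNTED = re.compile(r"^(\S+) on (\S+) type \S+ \([^)]*\) \[([^\]]+)\]$", re.MULTILINE)


def rejected_mount_points(syslog_text):
    """Map each rejected mount point name to the devices that asked for it."""
    rejected = {}
    for device, name in RESERVED.findall(syslog_text):
        rejected.setdefault(name, set()).add(device)
    return {name: sorted(devices) for name, devices in rejected.items()}


def label_holders(mount_text):
    """Map each filesystem label to the (device, mount path) that carries it."""
    return {label: (dev, path) for dev, path, label in MOUNTED.findall(mount_text)}


def find_conflicts(syslog_text, mount_text):
    """Pair every rejected name with the mounted device already using that label."""
    holders = label_holders(mount_text)
    return {
        name: (devices, holders[name])
        for name, devices in rejected_mount_points(syslog_text).items()
        if name in holders
    }


SYSLOG = """\
May 31 09:07:40 Singularity unassigned.devices: Error: Device '/dev/sdm1' mount point 'Samsung_SSD_' - name is reserved, used in the array or by an unassigned device.
May 31 09:08:22 Singularity unassigned.devices: Error: Device '/dev/sdm1' mount point 'Samsung_SSD_' - name is reserved, used in the array or by an unassigned device.
"""
MOUNTS = "/dev/sdl1 on /mnt/cache type xfs (rw,noatime) [Samsung_SSD_]\n"

if __name__ == "__main__":
    print(find_conflicts(SYSLOG, MOUNTS))
```

Run against the lines from this post, it pairs `/dev/sdm1` with `/dev/sdl1` on `/mnt/cache`, which is the same clash described in Edit 2.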
  3. Ah, I was looking for which subforum to use. Is there a move option under Edit, or should I just delete and repost?
  4. Server went AWOL overnight. CPU processes were maxed out at 99.99%. I resolved that, but the new issue is that I cannot mount my Unassigned disks, where my VM images live:

     May 31 08:25:14 Singularity kernel: sd 1:0:12:0: [sdm] 3907029168 512-byte logical blocks: (2.00 TB/1.82 TiB)
     May 31 08:25:14 Singularity kernel: sd 1:0:12:0: [sdm] Write Protect is off
     May 31 08:25:14 Singularity kernel: sd 1:0:12:0: [sdm] Mode Sense: 46 00 10 08
     May 31 08:25:14 Singularity kernel: sd 1:0:12:0: [sdm] Write cache: enabled, read cache: enabled, supports DPO and FUA
     May 31 08:25:14 Singularity kernel: sdm: sdm1
     May 31 08:25:14 Singularity kernel: sd 1:0:12:0: [sdm] Attached SCSI disk
     May 31 08:26:21 Singularity emhttpd: Samsung_SSD_870_S6R4NJ0R607499P_33001438038737393 (sdm) 512 3907029168
     May 31 08:26:21 Singularity emhttpd: read SMART /dev/sdm
     May 31 08:26:32 Singularity unassigned.devices: Disk with ID 'Samsung_SSD_870_S6R4NJ0R607499P_33001438038737393 (sdm)' is not set to auto mount.
     May 31 09:07:40 Singularity unassigned.devices: Error: Device '/dev/sdm1' mount point 'Samsung_SSD_' - name is reserved, used in the array or by an unassigned device.
     May 31 09:08:22 Singularity unassigned.devices: Error: Device '/dev/sdm1' mount point 'Samsung_SSD_' - name is reserved, used in the array or by an unassigned device.

     Any ideas on how to resolve this? Diagnostics attached, thanks.

     singularity-flash-backup-20210517-1342.zip
  5. Thank you. That was what I thought the issue might be. It's interesting that it works via the shell but not in the UI.
  6. Having an issue where my drives are SMART supported, but I can't get Error Counting or Self Test Logging in the UI. Any ideas what is going on here? When I query it locally it works:

     root@Singularity:~# smartctl -a -d sat /dev/sdd
     smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.10.28-Unraid] (local build)
     Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

     === START OF INFORMATION SECTION ===
     Model Family: Seagate Barracuda 7200.14 (AF)
     Device Model: ST3000DM001-1CH166
     Serial Number: Z1F4W3YF
     LU WWN Device Id: 5 000c50 066363a82
     Firmware Version: CC29
     User Capacity: 3,000,592,982,016 bytes [3.00 TB]
     Sector Sizes: 512 bytes logical, 4096 bytes physical
     Rotation Rate: 7200 rpm
     Form Factor: 3.5 inches
     Device is: In smartctl database [for details use: -P show]
     ATA Version is: ACS-2, ACS-3 T13/2161-D revision 3b
     SATA Version is: SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
     Local Time is: Mon Mar 28 11:40:49 2022 CDT
     SMART support is: Available - device has SMART capability.
     SMART support is: Enabled

     === START OF READ SMART DATA SECTION ===
     SMART overall-health self-assessment test result: PASSED

     General SMART Values:
     Offline data collection status: (0x00) Offline data collection activity was never started. Auto Offline Data Collection: Disabled.
     Self-test execution status: ( 0) The previous self-test routine completed without error or no self-test has ever been run.
     Total time to complete Offline data collection: ( 584) seconds.
     Offline data collection capabilities: (0x73) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. No Offline surface scan supported. Self-test supported. Conveyance Self-test supported. Selective Self-test supported.
     SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer.
     Error logging capability: (0x01) Error logging supported. General Purpose Logging supported.
     Short self-test routine recommended polling time: ( 1) minutes.
     Extended self-test routine recommended polling time: ( 328) minutes.
     Conveyance self-test routine recommended polling time: ( 2) minutes.
     SCT capabilities: (0x3085) SCT Status supported.

     SMART Attributes Data Structure revision number: 10
     Vendor Specific SMART Attributes with Thresholds:
     ID# ATTRIBUTE_NAME          FLAG   VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
       1 Raw_Read_Error_Rate     0x000f 119   099   006    Pre-fail Always  -           205604352
       3 Spin_Up_Time            0x0003 094   094   000    Pre-fail Always  -           0
       4 Start_Stop_Count        0x0032 096   096   020    Old_age  Always  -           4587
       5 Reallocated_Sector_Ct   0x0033 100   100   010    Pre-fail Always  -           16
       7 Seek_Error_Rate         0x000f 079   060   030    Pre-fail Always  -           94642487
       9 Power_On_Hours          0x0032 078   078   000    Old_age  Always  -           19602
      10 Spin_Retry_Count        0x0013 100   100   097    Pre-fail Always  -           0
      12 Power_Cycle_Count       0x0032 100   100   020    Old_age  Always  -           57
     183 Runtime_Bad_Block       0x0032 100   100   000    Old_age  Always  -           0
     184 End-to-End_Error        0x0032 100   100   099    Old_age  Always  -           0
     187 Reported_Uncorrect      0x0032 098   098   000    Old_age  Always  -           2
     188 Command_Timeout         0x0032 100   100   000    Old_age  Always  -           0 0 0
     189 High_Fly_Writes         0x003a 053   053   000    Old_age  Always  -           47
     190 Airflow_Temperature_Cel 0x0022 077   054   045    Old_age  Always  -           23 (3 2 26 0 0)
     191 G-Sense_Error_Rate      0x0032 100   100   000    Old_age  Always  -           0
     192 Power-Off_Retract_Count 0x0032 100   100   000    Old_age  Always  -           41
     193 Load_Cycle_Count        0x0032 001   001   000    Old_age  Always  -           472850
     194 Temperature_Celsius     0x0022 023   046   000    Old_age  Always  -           23 (128 0 0 0 0)
     197 Current_Pending_Sector  0x0012 100   100   000    Old_age  Always  -           8
     198 Offline_Uncorrectable   0x0010 100   100   000    Old_age  Offline -           8
     199 UDMA_CRC_Error_Count    0x003e 200   200   000    Old_age  Always  -           0
     240 Head_Flying_Hours       0x0000 100   253   000    Old_age  Offline -           13922h+22m+17.498s
     241 Total_LBAs_Written      0x0000 100   253   000    Old_age  Offline -           294954462232
     242 Total_LBAs_Read         0x0000 100   253   000    Old_age  Offline -           669858032080

     SMART Error Log Version: 1
     ATA Error Count: 2
     CR = Command Register [HEX]
     FR = Features Register [HEX]
     SC = Sector Count Register [HEX]
     SN = Sector Number Register [HEX]
     CL = Cylinder Low Register [HEX]
     CH = Cylinder High Register [HEX]
     DH = Device/Head Register [HEX]
     DC = Device Command Register [HEX]
     ER = Error register [HEX]
     ST = Status register [HEX]
     Powered_Up_Time is measured from power on, and printed as DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes, SS=sec, and sss=millisec. It "wraps" after 49.710 days.

     Error 2 occurred at disk power-on lifetime: 19309 hours (804 days + 13 hours)
     When the command that caused the error occurred, the device was active or idle.
     After command completion occurred, registers were:
     ER ST SC SN CL CH DH
     -- -- -- -- -- -- --
     40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455
     Commands leading to the command that caused the error were:
     CR FR SC SN CL CH DH DC Powered_Up_Time  Command/Feature_Name
     -- -- -- -- -- -- -- -- ---------------- --------------------
     60 00 00 ff ff ff 4f 00 33d+23:40:16.653 READ FPDMA QUEUED
     60 00 00 ff ff ff 4f 00 33d+23:40:16.653 READ FPDMA QUEUED
     60 00 00 ff ff ff 4f 00 33d+23:40:16.649 READ FPDMA QUEUED
     60 00 00 ff ff ff 4f 00 33d+23:40:16.646 READ FPDMA QUEUED
     60 00 00 ff ff ff 4f 00 33d+23:40:16.637 READ FPDMA QUEUED

     Error 1 occurred at disk power-on lifetime: 19309 hours (804 days + 13 hours)
     When the command that caused the error occurred, the device was active or idle.
     After command completion occurred, registers were:
     ER ST SC SN CL CH DH
     -- -- -- -- -- -- --
     40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455
     Commands leading to the command that caused the error were:
     CR FR SC SN CL CH DH DC Powered_Up_Time  Command/Feature_Name
     -- -- -- -- -- -- -- -- ---------------- --------------------
     60 00 00 ff ff ff 4f 00 33d+23:40:12.521 READ FPDMA QUEUED
     60 00 00 ff ff ff 4f 00 33d+23:40:12.516 READ FPDMA QUEUED
     60 00 00 ff ff ff 4f 00 33d+23:40:12.512 READ FPDMA QUEUED
     60 00 00 ff ff ff 4f 00 33d+23:40:12.508 READ FPDMA QUEUED
     60 00 00 ff ff ff 4f 00 33d+23:40:12.057 READ FPDMA QUEUED

     SMART Self-test log structure revision number 1
     No self-tests have been logged. [To run self-tests, use: smartctl -t]

     SMART Selective self-test log data structure revision number 1
     SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
        1       0       0 Not_testing
        2       0       0 Not_testing
        3       0       0 Not_testing
        4       0       0 Not_testing
        5       0       0 Not_testing
     Selective self-test flags (0x0):
     After scanning selected spans, do NOT read-scan remainder of disk.
     If Selective self-test is pending on power-up, resume after 0 minute delay.

     But it doesn't work within the UI. I've got a disk that's failing (obviously) and wanted to see this info in the UI. Thanks.

     singularity-diagnostics-20220328-1142.zip
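Until the UI shows these counters, the interesting rows can be pulled out of the pasted `smartctl -a` text directly. The sketch below is only illustrative: the set of "watched" attributes is my own assumption about which raw values are commonly treated as warning signs, not a rule taken from Unraid or smartmontools, and `flag_attributes` is a hypothetical helper name.

```python
import re

# Attributes whose non-zero raw value is commonly read as a sign of media
# trouble. This selection is an assumption made for this sketch.
WATCHED = {
    "Reallocated_Sector_Ct",
    "Reported_Uncorrect",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
}

# One row of the "Vendor Specific SMART Attributes" table:
# ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]{4}\s+(\d+)\s+(\d+)\s+(\d+)"
    r"\s+(\S+)\s+(\S+)\s+(\S+)\s+(\d+)"
)


def flag_attributes(smart_text):
    """Return {attribute name: raw value} for watched attributes with raw > 0."""
    flagged = {}
    for line in smart_text.splitlines():
        match = ROW.match(line)
        if match is None:
            continue  # header lines, log sections, blank lines
        name = match.group(2)
        raw = int(match.group(9))  # leading integer of the RAW_VALUE column
        if name in WATCHED and raw > 0:
            flagged[name] = raw
    return flagged


# A few rows copied from the smartctl output in the post above.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG   VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033 100   100   010    Pre-fail Always  -           16
  9 Power_On_Hours          0x0032 078   078   000    Old_age  Always  -           19602
187 Reported_Uncorrect      0x0032 098   098   000    Old_age  Always  -           2
197 Current_Pending_Sector  0x0012 100   100   000    Old_age  Always  -           8
198 Offline_Uncorrectable   0x0010 100   100   000    Old_age  Offline -           8
199 UDMA_CRC_Error_Count    0x003e 200   200   000    Old_age  Always  -           0
"""

if __name__ == "__main__":
    for name, raw in flag_attributes(SAMPLE).items():
        print(f"{name}: {raw}")
```

On the rows from this disk it flags the reallocated, pending, offline-uncorrectable, and reported-uncorrectable counts, which matches the "failing" impression in the post even though the overall self-assessment says PASSED.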
  7. I totally forgot I did this!! Thanks!
  8. Alright, I've spent hours reading and trying to get this to work, but I just can't get nvidia-smi to work with my card. Failure and card info:

     root@Singularity:~# nvidia-smi
     NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
     root@Singularity:~# lspci -k | grep -i NVI
     84:00.0 VGA compatible controller: NVIDIA Corporation GK104 [GeForce GTX 770] (rev a1)
     Kernel modules: nvidia_drm, nvidia
     84:00.1 Audio device: NVIDIA Corporation GK104 HDMI Audio Controller (rev a1)

     It is supported.

     I do not have it as part of a group attached to a VM:

     I have ACS override disabled:

     Still can't get it to communicate. Any help would be appreciated.

     singularity-diagnostics-20220104-2015.zip
  9. I'm really not sure what happened, but I ended up rebuilding all my VMs and fixing up DHCP to match the new MACs. This is a bit worrisome, since I'm seeing two libvirt dirs and now two user dirs. If anyone is able to explain what may have happened here, and why I have duplicated folders, I would greatly appreciate it. The emergency is over for now; I have everything back up and running. Always keep backups!
  10. Also, it appears I have a user and a user0 dir. I'm not really sure what is going on here. I do appear to have the vdisks for the VMs, but why is every dir doubled?
  11. On second glance, it appears that I've now lost the actual vdisks for each VM. I really need some help.
  12. Oddly, I have two directories:

      drwxr-xr-x 2 root root 40 Apr 7 13:48 libvirt/
      drwxr-xr-x 5 root root 420 Apr 7 13:48 libvirt-/

      where libvirt- appears to have the correct data:

      root@Singularity:/etc/libvirt-# ls
      hooks/              nwfilter/        virt-login-shell.conf  virtnetworkd.conf   virtqemud.conf
      libvirt-admin.conf  qemu/            virtinterfaced.conf    virtnodedevd.conf   virtsecretd.conf
      libvirt.conf        qemu-lockd.conf  virtlockd.conf         virtnwfilterd.conf  virtstoraged.conf
      libvirtd.conf       qemu.conf        virtlogd.conf          virtproxyd.conf
  13. If I change

      IMAGE_FILE="/mnt/cache/system/libvirt/libvirt.img"

      to

      IMAGE_FILE="/mnt/user/system/libvirt/libvirt.img"

      and then try to start the service, I get:

      root@Singularity:~# /etc/rc.d/rc.libvirt start
      Starting virtlockd...
      2021-09-09 18:56:39.544+0000: 22776: info : libvirt version: 6.5.0
      2021-09-09 18:56:39.544+0000: 22776: info : hostname: Singularity
      2021-09-09 18:56:39.544+0000: 22776: error : main:977 : Can't load config file: Failed to open file '/etc/libvirt/virtlockd.conf': No such file or directory: /etc/libvirt/virtlockd.conf
      Starting virtlogd...
      2021-09-09 18:56:39.574+0000: 22779: info : libvirt version: 6.5.0
      2021-09-09 18:56:39.574+0000: 22779: info : hostname: Singularity
      2021-09-09 18:56:39.574+0000: 22779: error : main:756 : Can't load config file: Failed to open file '/etc/libvirt/virtlogd.conf': No such file or directory: /etc/libvirt/virtlogd.conf
      no image mounted at /etc/libvirt
  14. I was doing some work on my Unraid server, rebooted, and it came back up without any VMs. I tried some backup libvirt.img files as well, with no luck. I'm wondering if it has anything to do with using a new version of the virtio drivers and having the ISO in a different location, but I went back to the ones listed on the VM page, with no luck either. I also had the libvirt image split between a disk on the array and the cache. I copied the image, removed it from disk2, and kept it on the cache drive. I had no issues with KVM after I did this (for about 1-2 days, until now). The vdisks are still around, so I could recreate the VMs, but that's going to be a pain with DHCP and new MACs. Any ideas on what I can try? Attached is my diagnostics. Here is a copy of the domain.conf:

      SERVICE="enable"
      IMAGE_FILE="/mnt/cache/system/libvirt/libvirt.img"
      IMAGE_SIZE="2"
      DEBUG="no"
      DOMAINDIR="/mnt/user/domains/"
      MEDIADIR="/mnt/user/isos/"
      VIRTIOISO="/mnt/user/isos/virtio-win-0.1.190-1.iso"
      BRNAME="br0"
      VMSTORAGEMODE="auto"
      HOSTSHUTDOWN="shutdown"
      TIMEOUT="60"

      singularity-diagnostics-20210909-1343.zip
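Since the trouble in these posts revolves around where `IMAGE_FILE` points (`/mnt/cache/...` versus `/mnt/user/...`), here is a rough sketch for reading the pasted settings and reporting which storage root each path lives under. It assumes the file is a flat list of shell-style `KEY="value"` pairs, as quoted above; `parse_cfg` and `storage_root` are hypothetical helper names, not part of Unraid.

```python
import shlex


def parse_cfg(text):
    """Parse shell-style KEY="value" pairs, one or many per line."""
    settings = {}
    # shlex handles the quoting; comments=True skips anything after '#'.
    for token in shlex.split(text, comments=True):
        key, sep, value = token.partition("=")
        if sep:
            settings[key] = value
    return settings


def storage_root(path):
    """Return the first two path components, e.g. '/mnt/cache' or '/mnt/user'."""
    parts = path.strip("/").split("/")
    if len(parts) < 2:
        return path
    return "/" + "/".join(parts[:2])


# Values copied from the post above.
DOMAIN_CFG = '''
SERVICE="enable"
IMAGE_FILE="/mnt/cache/system/libvirt/libvirt.img"
DOMAINDIR="/mnt/user/domains/"
MEDIADIR="/mnt/user/isos/"
VIRTIOISO="/mnt/user/isos/virtio-win-0.1.190-1.iso"
'''

if __name__ == "__main__":
    cfg = parse_cfg(DOMAIN_CFG)
    for key in ("IMAGE_FILE", "DOMAINDIR", "MEDIADIR", "VIRTIOISO"):
        print(f"{key}: {cfg[key]} -> {storage_root(cfg[key])}")
```

With the values from this post, `IMAGE_FILE` is the only path that points at `/mnt/cache` directly while the others go through `/mnt/user`. That is just an observation from the pasted config, not a confirmed cause of the missing VMs.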
  15. Thanks. I forgot to set it to HBA mode. Once I did that, Unraid could see all the drives.
  16. Just moved from a tower to an HPE Proliant 380 G9, which has a P840ar controller. I'm trying to figure out what's going on, but Unraid can't seem to see the drives. They look like they are showing up in dmesg, so at least they are being detected. I'm just not sure why Unraid can't interact with them. singularity-diagnostics-20210901-1647.zip
  17. I've got bigger issues now, and I'm really hoping someone can help me out. After removing the GeForce passthrough and attempting to boot the Windows VM, I now get "Inaccessible Boot Device" on boot. Here is the boot log: Edit: It boots in SATA mode. So something is wrong with the VirtIO drivers, for whatever reason. Cool, Windows...
  18. It's something to do with attempting to pass through the NVIDIA card. If I remove it, it boots fine.

      /sys/kernel/iommu_groups/1/devices/0000:02:00.0
      /sys/kernel/iommu_groups/1/devices/0000:00:01.0
      /sys/kernel/iommu_groups/1/devices/0000:01:00.0
      /sys/kernel/iommu_groups/1/devices/0000:00:01.1
      /sys/kernel/iommu_groups/1/devices/0000:01:00.1

      It's because my recently added expander card is in the same group. How do I fix this?

      00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor PCI Express x16 Controller (rev 06)
      00:01.1 PCI bridge: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor PCI Express x8 Controller (rev 06)
      Kernel driver in use: i801_smbus
      Kernel modules: i2c_i801
      01:00.0 VGA compatible controller: NVIDIA Corporation GK104 [GeForce GTX 770] (rev a1)
      01:00.1 Audio device: NVIDIA Corporation GK104 HDMI Audio Controller (rev a1)
      00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller (rev 06)
      02:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 03)

      The two slots I'm using share the same PCI bus. Time to physically move things, I guess.

      No luck. Same issue.
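The grouping problem described above can be read straight off the sysfs paths. The sketch below takes the kind of path listing pasted in the post and reports which other devices share an IOMMU group with a given PCI address. The helper names are illustrative, and it only parses text; it does not touch `/sys` or change any bindings.

```python
import re
from collections import defaultdict

# Matches paths such as /sys/kernel/iommu_groups/1/devices/0000:01:00.0
PATH = re.compile(
    r"/sys/kernel/iommu_groups/(\d+)/devices/"
    r"([0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7])"
)


def group_members(paths_text):
    """Map IOMMU group number -> list of PCI addresses, in listing order."""
    groups = defaultdict(list)
    for group, device in PATH.findall(paths_text):
        groups[int(group)].append(device)
    return dict(groups)


def shares_group_with(paths_text, device):
    """Return the other devices in the same IOMMU group as `device`."""
    for members in group_members(paths_text).values():
        if device in members:
            return sorted(d for d in members if d != device)
    return []


# Listing copied from the post above.
LISTING = """\
/sys/kernel/iommu_groups/1/devices/0000:02:00.0
/sys/kernel/iommu_groups/1/devices/0000:00:01.0
/sys/kernel/iommu_groups/1/devices/0000:01:00.0
/sys/kernel/iommu_groups/1/devices/0000:00:01.1
/sys/kernel/iommu_groups/1/devices/0000:01:00.1
"""

if __name__ == "__main__":
    print(shares_group_with(LISTING, "0000:01:00.0"))
```

For the GTX 770 at `0000:01:00.0`, the result includes `0000:02:00.0`, the SAS controller from the lspci output, which lines up with the observation that the newly added card landed in the same group as the GPU.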
  19. Yep, I think that's exactly what was happening. I cleared the VM's mounts to the cache drive, went through the stop/remove missing/reassign cache/start sequence, then recreated the mounts to the same disk. Working like a charm now!
  20. What I was saying is that in host mode, even when manually creating the container (-e 'TCP_PORT_8082'='8082'), it does not adhere to the container port you give it. I mean the container port, not the port mapping, because as you mention, there is no port mapping in host mode. Since another container is using port 8080, the port will be in use and the web UI won't work. It's currently working for me in bridged mode with a remapped port "8082", since no matter what I attempted, the container always created itself with port 8080 in host mode.
  21. Ya, I thought about this, but I wasn't close to these cables at all, and I was able to reproduce the issue in a very strange way:

      - Stop/start the array; the cache disk comes in as sdi (I don't know why).
      - Go mess with mounts (change the directory location for a mount) on a virtual machine that pulls scripts off the original (sdg) cache disk.
      - The disk now changes its identifier (verified by watching lsscsi) to sdg.
      - Disk IO errors, and IO is halted, due to the ID changing.

      I was able to do that 5 times, with the same result each time. I fixed it with the above steps, and it's no longer an issue. 🤷‍♂️ Now just issues with IOMMU and booting that VM, yay.
  22. After having some other issues, I now cannot get a Windows VM to boot:

      -overcommit mem-lock=off \
      -smp 8,sockets=1,cores=4,threads=2 \
      -uuid 4f226efe-37a7-9efe-9a80-e618a6412e69 \
      -display none \
      -no-user-config \
      -nodefaults \
      -chardev socket,id=charmonitor,fd=35,server,nowait \
      -mon chardev=charmonitor,id=monitor,mode=control \
      -rtc base=localtime \
      -no-hpet \
      -no-shutdown \
      -boot strict=on \
      -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x7.0x7 \
      -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x7 \
      -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x7.0x1 \
      -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x7.0x2 \
      -device ahci,id=sata0,bus=pci.0,addr=0x3 \
      -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4 \
      -blockdev '{"driver":"file","filename":"/mnt/user/isos/virtio-win.iso","node-name":"libvirt-2-storage","auto-read-only":true,"discard":"unmap"}' \
      -blockdev '{"node-name":"libvirt-2-format","read-only":true,"driver":"raw","file":"libvirt-2-storage"}' \
      -device ide-cd,bus=sata0.1,drive=libvirt-2-format,id=sata0-0-1 \
      -blockdev '{"driver":"file","filename":"/mnt/disks/OCZ-AGILITY3_OCZ-E14BO6F10C0B6V6X/vdisk1.img","node-name":"libvirt-1-storage","cache":{"direct":false,"no-flush":false},"auto-read-only":true,"discard":"unmap"}' \
      -blockdev '{"node-name":"libvirt-1-format","read-only":false,"cache":{"direct":false,"no-flush":false},"driver":"raw","file":"libvirt-1-storage"}' \
      -device ide-hd,bus=sata0.2,drive=libvirt-1-format,id=sata0-0-2,bootindex=1,write-cache=on \
      -netdev tap,fd=37,id=hostnet0,vhost=on,vhostfd=38 \
      -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:2c:f8:ae,bus=pci.0,addr=0x2 \
      -chardev pty,id=charserial0 \
      -device isa-serial,chardev=charserial0,id=serial0 \
      -chardev socket,id=charchannel0,fd=39,server,nowait \
      -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 \
      -device usb-tablet,id=input0,bus=usb.0,port=2 \
      -device vfio-pci,host=0000:01:00.0,id=hostdev0,bus=pci.0,addr=0x6 \
      -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny \
      -msg timestamp=on
      2020-05-20 15:11:43.983+0000: Domain id=9 is tainted: high-privileges
      2020-05-20 15:11:43.983+0000: Domain id=9 is tainted: host-cpu
      char device redirected to /dev/pts/3 (label charserial0)
      2020-05-20T15:11:44.028088Z qemu-system-x86_64: -device vfio-pci,host=0000:01:00.0,id=hostdev0,bus=pci.0,addr=0x6: vfio 0000:01:00.0: group 1 is not viable
      Please ensure all devices within the iommu_group are bound to their vfio bus driver.
      2020-05-20 15:11:44.050+0000: shutting down, reason=failed
  23. I was able to resolve this by:

      - stop array
      - un-assign cache device
      - remove missing device (same drive, different id)
      - assign new cache
      - start array again.
  24. This has something to do with my cache drive showing up as two devices. If I spin up the array, I see all my shares and everything works. Then the disk logs start showing IO errors, and I lose everything. When this happens, the cache disk shows up as both cache (sdi) and unassigned (sdg). I'm assuming Unraid doesn't know which is which, and that's why it halts everything and stops the cache disk. Now, how to fix it?
  25. Last night I powered down my Unraid server and added an additional disk to be used for dual parity, and now I've got issues in Unraid.

      - No shares appear in Unraid, but the shares are still accessible and mounted
      - Cache drive shows up as both a cache device and an unassigned device; it shows data in use, but there are no files/folders:

      Diag attached. Also, I'm supposed to copy and paste this message from "Apps"? Hoping someone can help me here.

      singularity-diagnostics-20200520-0855.zip