Tydell

Members
  • Posts

    32

Everything posted by Tydell

  1. @cambriancatalyst Did you ever solve this? I think I solved mine. The syslog errors have gone away and I think Bluetooth remains functional in my Home Assistant instance. It looks like the syslog flooding was a bug in bluez that was fixed in 5.64. The repos ich777 uses for un-get don't yet have anything above 5.63, but I found a Slackware repository with 5.64 (there are higher versions than that, but I didn't want to push my luck) and added it to the sources.list file for un-get. I then ran 'un-get update' followed by 'un-get upgrade' to upgrade bluez from 5.63 to 5.64, and restarted bluez. In detail:
       • Add this line to Flash/config/plugins/un-get/sources.list: https://slackware.uk/slackware/slackware64-15.0/patches/ patches
       • Terminal into Unraid and run: 'un-get update' then 'un-get upgrade'
       • Force bluez/bluetoothd to stop: sudo pkill bluetoothd
       • Restart bluez: /etc/rc.d/rc.bluetooth start
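     For anyone who wants the steps from this post in one place, here is a rough sketch. The /boot prefix is an assumption (the post only says "Flash/..."; /boot is where Unraid normally mounts the flash drive), and the function names are made up for illustration. The block only defines functions; nothing runs until you call `upgrade_bluez` yourself.

     ```shell
     #!/bin/bash
     # Assumed location: the post says "Flash/config/plugins/un-get/sources.list".
     # /boot is the usual flash mount point on Unraid; override SOURCES_FILE if yours differs.
     SOURCES_FILE="${SOURCES_FILE:-/boot/config/plugins/un-get/sources.list}"
     # Repository line quoted from the post.
     PATCHES_LINE="https://slackware.uk/slackware/slackware64-15.0/patches/ patches"

     # Append a line to a file only if that exact line is not already there.
     add_source_line() {
         local file="$1" line="$2"
         touch "$file"
         if ! grep -qxF -- "$line" "$file"; then
             # If the file has content that does not end in a newline, add one first.
             if [ -s "$file" ] && [ -n "$(tail -c1 "$file")" ]; then
                 printf '\n' >> "$file"
             fi
             printf '%s\n' "$line" >> "$file"
         fi
     }

     # The commands from the post, in the same order.
     # Assumes a root shell (the post uses sudo for the pkill step).
     upgrade_bluez() {
         add_source_line "$SOURCES_FILE" "$PATCHES_LINE"
         un-get update
         un-get upgrade
         pkill bluetoothd
         /etc/rc.d/rc.bluetooth start
     }
     ```

     The idempotent append just means re-running it won't leave duplicate repository lines in sources.list.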
  2. For what it's worth, I'm running into this as well. I installed bluez via un-get to expose bluetooth to Home Assistant, then started bluetooth via "/etc/rc.d/rc.bluetooth start" and now my syslog is being flooded with roughly 10 entries every second with this:
  3. Ah gotcha, sure thing.....here it is. I see one of my parity drives now has a CRC error count of 1...I'm sure I'm not doing my drives any favors with these hard resets. Thanks! syslog (3)
  4. I'm having an unfortunate issue as of late where my Unraid server will completely lock up. It drops off the network and is entirely unreachable. A connected keyboard, monitor, and mouse are also unresponsive when this happens, so there's no local console either. The only way for me to recover is to force a reboot. An investigation of my logs hasn't yielded much. I've set my logs to copy to my flash drive to get historical information, and there aren't any events that I can discern around the time of the freeze. The logs just kick back in at startup. This server's been pretty solid the last couple of years - not sure what's going on! Any help shedding light on the potential cause would be hugely appreciated! diagnostics-20230302-2348.zip
  5. Yep, I sure did. Certs reprovisioned successfully and ssl is set to yes. Still can only get in via http
  6. I don't know about anyone else, but after deleting my certs yesterday, turning SSL back on, and re-provisioning certs, I still can only get in via http, not https.
  7. Alternatively, you can just delete the SSL certs from the flash drive (/boot/config/ssl/certs)... I did that and was able to get back in. Not sure which is the preferred method, but just throwing it out there.
  8. Same thing happened to me today. Ultimately I got back in by SSHing in and deleting the certs on the flash drive (ssh into server, navigate to /boot/config/ssl and rename/delete the certs folder), then reboot (/sbin/reboot). After doing that, I was able to get to the HTTP UI. Upon redoing my certs (Settings/Management Access) and setting USE SSL/TLS to yes, I'm still unable to use SSL to get to my server. It redirects to port 80 and uses a self-signed cert, not the Let's Encrypt cert. The WAN Port check also fails even though I definitely have port 443 forwarded to my server. Version 6.10.2 if it matters. Also, I didn't proactively install/update anything recently, though an autoupdate of something certainly could have happened. I've also not made any changes to my network setup. To my knowledge, I didn't make any changes to my server or network leading up to this. FWIW I have NerdPack installed. Don't remember why I have it tbh.
  9. Same thing happened to me today. Ultimately I got back in by SSHing in and deleting the certs on the flash drive (ssh into server, navigate to /boot/config/ssl and rename/delete the certs folder), then reboot (/sbin/reboot). After doing that, I was able to get to the HTTP UI. Upon redoing my certs (Settings/Management Access) and setting USE SSL/TLS to yes, I'm still unable to use SSL to get to my server. It redirects to port 80 and uses a self-signed cert, not the Let's Encrypt cert. The WAN Port check also fails even though I definitely have port 443 forwarded to my server. Version 6.10.2 if it matters. Also, I didn't proactively install/update anything recently, though an autoupdate of something certainly could have happened. I've also not made any changes to my network setup. To my knowledge, I didn't make any changes to my server or network leading up to this. FWIW I have NerdPack installed. Don't remember why I have it tbh.
  10. Not sure if related or not, but today I suddenly stopped being able to get to the webgui as well. Shares and apps all still worked without issue. Navigating to http://myserverip resulted in an nginx error. HTTPS was giving me a DNS error as it was trying to go to the custom myservers URL with my hash in it. Ultimately I got back in by SSHing in and deleting the certs on the flash drive (ssh into server, navigate to /boot/config/ssl and rename/delete the certs folder), then reboot (/sbin/reboot). After doing that, I was able to get to the HTTP UI. Upon redoing my certs (Settings/Management Access) and setting USE SSL/TLS to yes, I'm still unable to use SSL to get to my server. It redirects to port 80 and uses a self-signed cert, not the Let's Encrypt cert. The WAN Port check also fails even though I definitely have port 443 forwarded to my server. Version 6.10.2 if it matters. EDIT: Also, I didn't proactively install/update anything recently, though an autoupdate of something certainly could have happened. I've also not made any changes to my network setup. To my knowledge, I didn't make any changes to my server or network leading up to this. EDIT2: FWIW I have NerdPack installed. Don't remember why I have it tbh.
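      The "rename/delete the certs folder" step from the posts above could look something like this. It's only a sketch: the function name and the ".bak-<timestamp>" suffix are made up for illustration, and it renames rather than deletes so the old certs can be put back if needed. The default path is the one mentioned in the posts.

      ```shell
      #!/bin/bash
      # Move <ssl_dir>/certs aside under a timestamped name and print the new path.
      # Default ssl_dir is the path given in the posts; the suffix format is illustrative.
      backup_ssl_certs() {
          local ssl_dir="${1:-/boot/config/ssl}"
          local stamp
          stamp="$(date +%Y%m%d-%H%M%S)"
          if [ ! -d "$ssl_dir/certs" ]; then
              echo "no certs folder found in $ssl_dir" >&2
              return 1
          fi
          mv -- "$ssl_dir/certs" "$ssl_dir/certs.bak-$stamp"
          echo "$ssl_dir/certs.bak-$stamp"
      }
      ```

      After running `backup_ssl_certs` over SSH, the remaining step from the posts is the reboot (/sbin/reboot).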
  11. Just wanted to revisit this and report that I did disable spindown for a few days and after doing so, there were no more errors. I conducted further testing by re-enabling spindown, letting the disk spin down, then initiating a transfer to the array and since this disk was nearly empty, it transferred to this drive. It immediately threw all sorts of errors. I then disabled spindown on the disk again and since then I've had no issues. For that reason, I'm going to conclude that this disk just doesn't play nicely with spinning down. Thanks @Vr2Io!
  12. I've recently made some changes to my Unraid server. I used to have an HBA with a 16-port SAS expander, and also used onboard SATA ports to fill out my 24 drive bays. I recently purchased and installed a 24-port SAS expander to eliminate my need for the onboard SATA ports, as they seemed to be flaky/inconsistent. I also installed a new 8TB drive. Unraid is now reporting errors on that drive and I'm receiving parity errors on it as well. I've checked all the physical connections and have reseated the SATA cable on the backplane. Is there anything else I might check? Here is what the recent errors look like: Any ideas as to what I might check next? I super appreciate any help! tyraid-diagnostics-20220204-1359.zip
  13. Nope, no issues....works out of the box. Bought a QNAP QSW-308-1C that works perfectly with it.
  14. The whole system has one Nvidia GPU that's passed through to a VM, and then the iGPU that isn't passed through, per se, but is set up to be used in the Plex docker container. The only thing I wish I had now was an IPMI port. That would make this whole build just about perfect. A couple of other notes: some of the PCIe slots on this motherboard share lanes with the SATA controller, so if you use some of the PCIe slots, it'll disable some of your onboard SATA. Of note though, I used a SAS expander in my last PCIe slot that only pulls power from the slot - it doesn't transfer any data over it. That particular slot did not disable the corresponding SATA ports. If I think of any other gotchas, I'll post them here.
  15. Yep, I was able to use the iGPU to transcode 4K content - it does the job just fine! I'm also currently passing a GPU I had lying around through to a VM for Ethereum mining and it works great as well. I used the ACS Override setting to break up my IOMMU groups so all the devices have their own IOMMU group. I believe I had to do that in order to get the GPU to pass through to the VM. Overall, very happy with the build.
  16. It would appear that you, sir, are correct! Copying everything over now. Thanks again for all your help! Time for a 2nd parity drive, methinks.
  17. Oops, thanks for that! That worked! Still doesn't mount though:
      Jul 5 08:59:29 Unraid kernel: XFS (sdc1): Corruption warning: Metadata has LSN (1:22491) ahead of current LSN (1:2). Please unmount and run xfs_repair (>= v4.3) to resolve.
      Jul 5 08:59:29 Unraid kernel: XFS (sdc1): log mount/recovery failed: error -22
      Jul 5 08:59:29 Unraid kernel: XFS (sdc1): log mount failed
      Jul 5 08:59:29 Unraid unassigned.devices: Mount of '/dev/sdc1' failed: 'mount: /mnt/disks/Hitachi_HDS723030ALA640_MK0311YHG033ZA: wrong fs type, bad option, bad superblock on /dev/sdc1, missing codepage or helper program, or other error. '
      Jul 5 08:59:29 Unraid kernel: XFS (sdc1): Corruption warning: Metadata has LSN (1:22491) ahead of current LSN (1:2). Please unmount and run xfs_repair (>= v4.3) to resolve.
      Jul 5 08:59:29 Unraid kernel: XFS (sdc1): log mount/recovery failed: error -22
      Jul 5 08:59:29 Unraid kernel: XFS (sdc1): log mount failed
      Jul 5 08:59:29 Unraid unassigned.devices: Mount of '/dev/sdc1' failed: 'mount: /mnt/disks/Hitachi_HDS723030ALA640_MK0311YHG033ZA: wrong fs type, bad option, bad superblock on /dev/sdc1, missing codepage or helper program, or other error. '
      Quoted: "These errors are usually a bad SATA cable, please post a SMART report for that disk."
      Attached. Lots of UDMA_CRC_Error_Count. Hitachi_HDS723030ALA640_MK0311YHG033ZA-20210705-0900.txt
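      If you want to pull just the UDMA_CRC_Error_Count number out of a SMART report rather than reading the whole attachment, something like the sketch below works on the attribute table printed by `smartctl -A`, where the raw value is the tenth column. The function name is made up for illustration.

      ```shell
      #!/bin/bash
      # Read `smartctl -A` output on stdin and print the raw value of
      # UDMA_CRC_Error_Count (10th column of the attribute table).
      # Returns non-zero if the attribute is not present in the input.
      crc_error_count() {
          awk '$2 == "UDMA_CRC_Error_Count" { print $10; found = 1 }
               END { if (!found) exit 1 }'
      }
      ```

      Usage would be along the lines of `smartctl -A /dev/sdX | crc_error_count`, with /dev/sdX standing in for the disk in question.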
  18. @JorgeB & @trurl - Thank you both for your help! My folder is back and I'll be going through now to see if there are any missing files. Lost+Found only has about 600KB of total stuff in it, and no important files that I recognize or can really see the contents of in notepad.
  19. I might have evidence now - It won't mount in unassigned devices - says it has a dup UUID. I haven't done what it suggests yet (running xfs_repair with the -L flag) I running xfs_repair -nv on it is not going well.: xfs_repair -nv /dev/sdc Phase 1 - find and verify superblock... bad primary superblock - bad magic number !!! attempting to find secondary superblock... .found candidate secondary superblock... unable to verify superblock, continuing... .found candidate secondary superblock... unable to verify superblock, continuing... ......................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................... 
The system log file also threw some errors during the check: Jul 5 08:30:37 Unraid kernel: ata4.00: exception Emask 0x10 SAct 0x180000 SErr 0x280100 action 0x6 frozen Jul 5 08:30:37 Unraid kernel: ata4.00: irq_stat 0x08000000, interface fatal error Jul 5 08:30:37 Unraid kernel: ata4: SError: { UnrecovData 10B8B BadCRC } Jul 5 08:30:37 Unraid kernel: ata4.00: failed command: READ FPDMA QUEUED Jul 5 08:30:37 Unraid kernel: ata4.00: cmd 60/3f:98:00:30:03/05:00:00:00:00/40 tag 19 ncq dma 687616 in Jul 5 08:30:37 Unraid kernel: res 50/00:c1:3f:35:03/00:02:00:00:00/40 Emask 0x10 (ATA bus error) Jul 5 08:30:37 Unraid kernel: ata4.00: status: { DRDY } Jul 5 08:30:37 Unraid kernel: ata4.00: failed command: READ FPDMA QUEUED Jul 5 08:30:37 Unraid kernel: ata4.00: cmd 60/c1:a0:3f:35:03/02:00:00:00:00/40 tag 20 ncq dma 360960 in Jul 5 08:30:37 Unraid kernel: res 50/00:c1:3f:35:03/00:02:00:00:00/40 Emask 0x10 (ATA bus error) Jul 5 08:30:37 Unraid kernel: ata4.00: status: { DRDY } Jul 5 08:30:37 Unraid kernel: ata4: hard resetting link Jul 5 08:30:37 Unraid kernel: ata4: SATA link up 6.0 Gbps (SStatus 133 SControl 300) Jul 5 08:30:37 Unraid kernel: ata4.00: configured for UDMA/133 Jul 5 08:30:37 Unraid kernel: ata4: EH complete Jul 5 08:30:39 Unraid kernel: ata4.00: exception Emask 0x10 SAct 0x80800000 SErr 0x280100 action 0x6 frozen Jul 5 08:30:39 Unraid kernel: ata4.00: irq_stat 0x08000000, interface fatal error Jul 5 08:30:39 Unraid kernel: ata4: SError: { UnrecovData 10B8B BadCRC } Jul 5 08:30:39 Unraid kernel: ata4.00: failed command: READ FPDMA QUEUED Jul 5 08:30:39 Unraid kernel: ata4.00: cmd 60/3f:b8:00:00:0e/05:00:00:00:00/40 tag 23 ncq dma 687616 in Jul 5 08:30:39 Unraid kernel: res 50/00:c1:3f:05:0e/00:02:00:00:00/40 Emask 0x10 (ATA bus error) Jul 5 08:30:39 Unraid kernel: ata4.00: status: { DRDY } Jul 5 08:30:39 Unraid kernel: ata4.00: failed command: READ FPDMA QUEUED Jul 5 08:30:39 Unraid kernel: ata4.00: cmd 60/c1:f8:3f:05:0e/02:00:00:00:00/40 tag 31 ncq dma 
360960 in Jul 5 08:30:39 Unraid kernel: res 50/00:c1:3f:05:0e/00:02:00:00:00/40 Emask 0x10 (ATA bus error) Jul 5 08:30:39 Unraid kernel: ata4.00: status: { DRDY } Jul 5 08:30:39 Unraid kernel: ata4: hard resetting link Jul 5 08:30:40 Unraid kernel: ata4: SATA link up 6.0 Gbps (SStatus 133 SControl 300) Jul 5 08:30:40 Unraid kernel: ata4.00: configured for UDMA/133 Jul 5 08:30:40 Unraid kernel: ata4: EH complete Jul 5 08:31:50 Unraid kernel: ata4.00: exception Emask 0x10 SAct 0xc00 SErr 0x280100 action 0x6 frozen Jul 5 08:31:50 Unraid kernel: ata4.00: irq_stat 0x08000000, interface fatal error Jul 5 08:31:50 Unraid kernel: ata4: SError: { UnrecovData 10B8B BadCRC } Jul 5 08:31:50 Unraid kernel: ata4.00: failed command: READ FPDMA QUEUED Jul 5 08:31:50 Unraid kernel: ata4.00: cmd 60/3f:50:00:58:4c/05:00:01:00:00/40 tag 10 ncq dma 687616 in Jul 5 08:31:50 Unraid kernel: res 50/00:c1:3f:5d:4c/00:02:01:00:00/40 Emask 0x10 (ATA bus error) Jul 5 08:31:50 Unraid kernel: ata4.00: status: { DRDY } Jul 5 08:31:50 Unraid kernel: ata4.00: failed command: READ FPDMA QUEUED Jul 5 08:31:50 Unraid kernel: ata4.00: cmd 60/c1:58:3f:5d:4c/02:00:01:00:00/40 tag 11 ncq dma 360960 in Jul 5 08:31:50 Unraid kernel: res 50/00:c1:3f:5d:4c/00:02:01:00:00/40 Emask 0x10 (ATA bus error) Jul 5 08:31:50 Unraid kernel: ata4.00: status: { DRDY } Jul 5 08:31:50 Unraid kernel: ata4: hard resetting link Jul 5 08:31:51 Unraid kernel: ata4: SATA link up 6.0 Gbps (SStatus 133 SControl 300) Jul 5 08:31:51 Unraid kernel: ata4.00: configured for UDMA/133 Jul 5 08:31:51 Unraid kernel: ata4: EH complete I think she's dead!
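      A log like the one above is easier to judge once the repeated events are tallied per port. Here's a small sketch that counts the "hard resetting link" lines for each ATA port in a syslog file; the function name is made up, and the log path you pass in (for example /var/log/syslog) is an assumption about where your copy lives.

      ```shell
      #!/bin/bash
      # Print "<port> <count>" for every ATA port that logged a
      # "hard resetting link" event in the given log file.
      count_link_resets() {
          local logfile="$1"
          grep -oE 'ata[0-9]+: hard resetting link' "$logfile" \
              | cut -d: -f1 \
              | sort \
              | uniq -c \
              | awk '{ print $2, $1 }'
      }
      ```

      Run against the excerpt in this post, it would report three resets on ata4, which at least narrows things down to the one port, cable, or drive.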
  20. I have no evidence yet - I'll definitely be investigating. I do still have the old drive, although it's unseated slightly from its hot swap bay. It's weird - when the rebuild of that first disk was happening, the error count on that drive shot through the roof. I noticed a bit later on that the drive was actually showing up in both the array AND Unassigned Devices at the same time. I didn't do anything at that time other than let the rebuild finish. I then unseated it when I replaced it in the array. Also, xfs_repair appears to have done the trick! I'll be going through and seeing if I can find any corrupted files, but it had me run the repair twice and now the folder is back, so thank you for that! Now to investigate the "failed" drive...
  21. So there are movies in the media share on a bunch of other disks that are completely fine, but if I navigate to \\unraid\media\movies, that folder in the share appears empty, even though the files are there when browsing the individual disks. The other folders under the media share appear to be unaffected - just the Movies folder appears affected. I'll certainly look at ddrescue though, much appreciated!
  22. So yesterday, I was ready to swap in a spare drive to replace one that was starting to throw SMART errors. I'd already moved all files off of it, so there was no real risk of data loss - or so I thought. After I replaced it, during the 10 hour rebuild, another drive started throwing hundreds of thousands of errors during the rebuild. This drive contained a few dozen movies in my media share. Not really knowing what else to do, I let the rebuild complete, which it did overnight. I then replaced the newly failed drive as well and then let that rebuild complete. Now, it would appear that my Movies folder under my media share has become corrupt somehow. The files still exist spread throughout my drives - including the newly replaced drive, though I'd be shocked if there wasn't significant corruption in those files, which are quite easily replaceable. My logs are now full of "Unmount and run xfs_repair" and "Metadata corruption detected at xfs_buf....." messages which eventually completely fill the log file until it stops filling up anymore. Seems like at this point, I don't have many options other than to just let it sit there and do its thing. Anything else I can do? Potentially create a new share and use Krusader or Unbalance to move the missing files to the new share? I appreciate any insights! diagnostics-20210704-2126.zip