Everything posted by Bitbass
-
[Plugin] Prometheus unRAID Plugins
Auto-update works perfectly. Thanks for taking care of this!
-
[Plugin] Prometheus unRAID Plugins
No problem! I'm thankful that you're putting the time into this!
-
[Plugin] Prometheus unRAID Plugins
Awesome! I tried downloading the plg file, editing it and then installing manually and it's not working. I'm guessing there's some additional packaging or perhaps I need the other files in the directory. Thanks for doing this for all of us, especially ich777! :)
-
[SUPPORT] GRTGBLN - DOCKER TEMPLATES
Had the smartctl plugin (from ich777) fail after the 7.2 update so I tried your smartctl Exporter. Throws permissions errors in the logs when trying to access the drives. It can enumerate them fine, just can't pull smartctl info. I have Privileged checked on, but no other variables. Hoping this is a simple variable fix.
-
[Plugin] Prometheus unRAID Plugins
The pure Prometheus Exporter plugin appears to still be working. Would it be hard to add the smartctl metrics to that instead of fixing the separate smartctl plugin? ***Edit - I'm really concerned about drive temps more than anything. I'm mapping drive temps to Grafana to see if I've had a silent fan failure. It's easy to see over time, but sometimes harder to notice in the moment. Things like IO metrics are working in the Prom Exporter, but that might not be a smart metric. Anyway, I appreciate that you built these tools and that I've been able to take advantage of them for so long. Hope things are well with you!
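For anyone wanting to catch a silent fan failure in the moment rather than by eyeballing a graph, here is a rough Python sketch of the idea: compare the average of the most recent temperature samples against an older baseline and flag a large drift. The readings, window sizes, and 5 degree threshold are made-up illustrative values, not anything taken from the plugin or from Grafana.

```python
from statistics import mean

def temp_drift(samples, baseline_n, recent_n):
    """Return how far the recent average sits above the baseline average.

    samples    : temperature readings in degrees C, oldest first
    baseline_n : number of oldest samples treated as "normal"
    recent_n   : number of newest samples treated as "now"
    """
    baseline = mean(samples[:baseline_n])
    recent = mean(samples[-recent_n:])
    return recent - baseline

# Hypothetical readings for one drive, oldest first.
readings = [34, 35, 34, 35, 36, 38, 41, 43]

# Hypothetical alert threshold in degrees C.
DRIFT_LIMIT_C = 5.0

drift = temp_drift(readings, baseline_n=4, recent_n=3)
if drift > DRIFT_LIMIT_C:
    print(f"Drive is running {drift:.1f} C above its baseline, check the fans")
```

The same comparison could likely be expressed as an alert rule in the monitoring stack itself; the sketch only shows the arithmetic.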
-
[Plugin] Prometheus unRAID Plugins
I've been using the SmartCTL exporter plugin for a while now and it seems to be broken as of 7.2.0. The full Prometheus Exporter plugin appears to be working as I'm getting other system stats. Just nothing in /metrics for "smart". I tried installing the smartctl container (not yours) to test and that's working but throwing a permissions error when it tries to grab the stats. Makes me wonder if the plugin is also running into a permissions problem with the 7.2 update. Let me know what you need. The standard diags? Thanks!
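A quick way to confirm the "nothing for smart in /metrics" observation is to save the exporter output to a file and scan the metric names. This is a minimal Python sketch, assuming the output is in the usual Prometheus text exposition format; the sample text below is invented for illustration and is not real output from my server.

```python
def find_metric_names(exposition_text, needle):
    """Return sorted metric names that contain `needle` (case-insensitive)."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # The metric name ends at the first "{" (labels) or whitespace.
        name = line.split("{", 1)[0].split()[0]
        if needle.lower() in name.lower():
            names.add(name)
    return sorted(names)

# Invented sample resembling exporter output with no SMART metrics in it.
SAMPLE = """\
# HELP node_cpu_seconds_total Seconds the CPUs spent in each mode.
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="idle"} 12345.6
node_disk_io_time_seconds_total{device="sdb"} 42.0
"""

smart = find_metric_names(SAMPLE, "smart")
print(smart if smart else "no metric names containing 'smart'")
```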
-
[Support] devzwf - Proxmox Backup Server Dockerfiles
I've started having problems with my PVE nodes crashing when the backup process runs. I suspect the cause is a recent PVE update I ran, which was quickly followed by a set of patches for the kernel and the backup services on PVE. That makes me think there was a problem with the initial 8.4 patches. I ran those quick updates this week, but had another node crash last night during the backup. At least I suspect it was when the backup started, because the only node that crashed last night was the one node that had a backup scheduled. I'm now looking at what the possible causes might be. There are no additional patches available on the no-sub repo for PVE as of this morning. I might just have to wait for that to make its way down from the Ent repo. As part of the recent patches I did update to PVE 8.4 and all of the PVE backup components were updated to the same release. I noticed that PBS has been updated to 3.4. Any estimate on when we'll get that update in this container for Unraid? I have no idea if that will help, but I'm trying to rule out whatever I can. Thanks for building this container!
-
Upgrade cache pool drives
The upgrade process was trivial! Thanks for the confirmation!
-
Upgrade cache pool drives
Because it's in the pool it'll just rebuild the pool. Makes sense. Thanks!
-
Upgrade cache pool drives
I just want to make sure it really is this simple before I blow things up and have to come begging for help. I have 2x256GB NVMe sticks in a BTRFS RAID1 pool. Balance and Scrub have never been run on the pool. I have Docker and a few shares on the cache pool. The Mover is set to hourly. I know that's more aggressive than usual but I had a reason. I can adjust that if necessary. I have 2x512GB NVMe sticks I'll be upgrading to. I have had VM turned on in the past, but it's not currently on. Looks like I do this: https://docs.unraid.net/unraid-os/manual/storage-management/#replace-a-disk-in-a-pool Simply shut the system down, replace one of the drives, start it up, assign the new drive in the missing cache pool slot, format the new drive and let it be part of the pool. Let things settle down and then repeat the process for the other drive. Should I run the Balance or Scrub for BTRFS? Anything else I'm missing? Do I need to do anything post replacement to expand the space?
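On the question of when the extra space shows up, the arithmetic can be sketched in Python. This assumes the usual rule of thumb for BTRFS RAID1 (every block is stored on two different devices, so usable space is the smaller of half the total and the total minus the largest device). It uses nominal sizes and ignores metadata overhead, so treat the numbers as rough.

```python
def raid1_usable_gb(device_sizes_gb):
    """Approximate usable space of a BTRFS RAID1 pool, in nominal GB."""
    total = sum(device_sizes_gb)
    largest = max(device_sizes_gb)
    # Two copies of everything, and each copy must land on a different device.
    return min(total / 2, total - largest)

# The three stages of the upgrade described above.
for stage, sizes in [
    ("before", [256, 256]),
    ("one drive swapped", [256, 512]),
    ("both drives swapped", [512, 512]),
]:
    print(f"{stage}: {raid1_usable_gb(sizes):.0f} GB usable")
```

By this estimate the pool stays at roughly 256 GB usable until the second drive has been replaced.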
-
[Plugin] Prometheus unRAID Plugins
Awesome! Thanks for the quick response!
-
[Plugin] Prometheus unRAID Plugins
@ich777 You had previously released a smartctl add-on plugin to the Prometheus exporter. I have it on my primary server. I'm now looking for it on my new server app store and can't find it. Was it removed, or did I install it through some other method? Thanks!
-
Drive errors followed by Unmountable for a second drive after reboot
Son of a... the culprit is attached. Appreciate you hanging in there @JorgeB. Everything is running much more responsively now and the rebuild on the drive I swapped out is blazing along at normal speeds.
-
Drive errors followed by Unmountable for a second drive after reboot
Ok, had an all-morning power outage. What else can go wrong? Upon booting up Unraid with different cables for Disk15 I now have Disk15 as unmountable. Diags attached. And, I see data loss now. I know it might still be recoverable, but it's getting less likely. unraid-diagnostics-20241016-1317.zip
-
Drive errors followed by Unmountable for a second drive after reboot
Swapped cables around a couple of times. The unmountable was jumping around and not following a cable. So, I swapped motherboards. The good news is it's more stable now. No more unmountable situation, yet. The bad news is, the repair is moving very slowly. Looking at the Syslog, I think I've traced the ATA1 errors to the sdb drive. This is a newish drive, but it is a surveillance drive. It hasn't thrown any errors yet, aside from the ATA errors. When I try to run a Short SMART on Disk15 it fails with a Host Reset. All the other disks are fine with the SMART tests. Best I can figure is that Disk15 might have been failing, or maybe not, but Disk4 had write errors, went offline, and then when I rebooted Disk15 silently broke with the ATA errors that I didn't see at first. I'm attaching the current diag file. Hopefully someone can tell me this is plausible. So, my question is, what's the best path for me to recover with minimal data loss? I have drives I can swap in. I could probably add another drive to the array now, if there's a way for me to migrate the content or rebuild onto that without data loss. Current state is that I have a new Disk4 that doesn't have content on it. I have the old Disk4 that I can mount in another system (or in Unraid if that's the right thing to do) that might have content on it, but is throwing errors. I have Disk15 with ATA errors. I have spare 8TB drives that I can swap in. unraid-diagnostics-20241016-0746.zip
-
Drive errors followed by Unmountable for a second drive after reboot
What do you recommend as a next step?
-
Drive errors followed by Unmountable for a second drive after reboot
Ok, how about I cancel the rebuild again, shut it all down, transfer it into my other motherboard and see what I get out of that? It would be a different SATA controller, obviously. Rules out a failure there. I guess what I'm asking is where am I at for potential data recovery? Is Disk2 a lost cause at this point and I need to consider that data gone? Or do I still have the data but I need to be careful about how I rebuild things?
-
Drive errors followed by Unmountable for a second drive after reboot
No, didn't try that. I can give that a shot, but the current setup is a chain of power connectors and this one is second in line. Wouldn't be crazy if that's gone bad, but unlikely. I saw in another thread that there's a way to do an XFS repair. Should I consider that, and if so, do I need to wait for the rebuild on Disk4 to complete?
-
Drive errors followed by Unmountable for a second drive after reboot
Replaced the SATA cable for Disk2. No change that I can tell. Unraid started in disk selection mode. I simply started the array. unraid-diagnostics-20241015-0804.zip
-
Drive errors followed by Unmountable for a second drive after reboot
I have spare drives to swap in. I've learned my lesson from the past and I won't rush things now (despite this mistake I probably already made). The rebuild is jumping between 8 days and 30+ days. So, I have time before I do anything else!
-
Drive errors followed by Unmountable for a second drive after reboot
Unraid started finding errors on Disk 4 last night. I didn't notice it until this morning, at which time I rebooted Unraid to see if it would clear it up. Instead, it's now showing Disk 2 as being unmountable. I suspect I have a SATA controller problem, but the array is now trying to rebuild Disk 4. It is going VERY slowly. Maybe I screwed up by starting the array and having it kick into a rebuild. How do I navigate my way out of this now? Do I need to confirm it's a SATA controller problem? If I need to, I can transfer the system into a different motherboard and SATA controller. I've done it before. I just need to understand what the risk is in stopping the rebuild, and if that's the right approach. unraid-diagnostics-20241014-1450.zip
-
[Support] binhex - Plex Pass
Thanks!
-
[Support] binhex - Plex Pass
Is this thing on? Everything ok in here?
-
[Support] binhex - Plex Pass
I have a "transcode" folder in my cache that's gotten rather large. This is for plexpass. Most of the dates are from over a year ago. Is this some stranded content? Is it safe to delete it?
-
[PLUGIN] LCD_Manager
I did, which is part of the reason I'm asking if anyone has gotten it working with the ESP32. Seems like a "should be able to" answer at this point. ***Edit - that came off kind of snarky, sorry about that, wasn't my intent. I'm interested in building some analog gauges that are "remote" via an ESP32 over the network. This plugin is obviously pulling the real time stats from Unraid, so I'm hoping someone has taken that next step to get it working over IP. I'm hoping I can figure out how to take the data meant for an LCD display and convert that into PWM for analog gauges.
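To make the "convert that into PWM" idea concrete, here is a minimal sketch of the scaling step, written in plain Python. It assumes the stat has already been pulled out as a 0 to 100 percentage; how the plugin formats its LCD data is not covered here, and the function name and 8-bit default are just illustrative choices.

```python
def percent_to_duty(percent, resolution_bits=8):
    """Scale a 0-100 percentage to a PWM duty value for the given resolution."""
    # Clamp so a bad reading cannot push the needle past either end.
    clamped = max(0.0, min(100.0, percent))
    max_duty = (1 << resolution_bits) - 1  # 255 for 8-bit, 1023 for 10-bit
    return round(clamped / 100.0 * max_duty)

# Hypothetical CPU load readings, including out-of-range values.
for load in (0, 25, 100, 150):
    print(load, "->", percent_to_duty(load))
```

On the ESP32 side the resulting number would be handed to whatever PWM call the firmware provides; real analog gauges will probably also need a calibration step because needle movement is rarely perfectly linear.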