hermy65

Members
  • Posts: 273
  • Joined
  • Last visited

Everything posted by hermy65

  1. I have 4 SSDs in my cache pool right now and two of them are showing some errors, so I'm going to replace them, but I'm not sure how to go about doing so. The two showing errors are cache drives 1 and 2, if that matters at all. Do I just stop the array, remove one of them, put the new drive in its place and start the array back up, like I do when upgrading a data drive?
  2. Over the last few days I've been running into an issue where the various containers I run become ridiculously slow and sometimes 100% unresponsive. Downloads drop to around 300kb/s and stay there. If I reboot the server everything returns to normal and downloads are back in the 40-60Mb/s range, then after an hour or two everything goes to hell again. Today I thought maybe it was an issue with my docker image, so I removed it and started rebuilding my containers, but that doesn't seem to be helping either. My machine isn't underpowered, so that shouldn't be the issue, but I'm running out of ideas. Edit: I've been adding containers for ~1.5 hours and have only managed to add maybe 10. Something is definitely not right here. Diagnostics are attached. storage-diagnostics-20181229-2350.zip
  3. Any chance you could share your buffer settings?
  4. Interesting, I may spin up an NFS share and see if that helps at all
  5. I actually had the issue with 6.6.1 as well, just bumped to 6.6.3 to see if it was any better. Should I try NFS shares instead of SMB? Is there a reason to use NFS over SMB to begin with?
  6. For the better part of 4 months I've been having an issue with my Kodi instances buffering, and I've made very little progress trying to resolve it. First off, my unRAID box runs dual Xeon E5-2630 v4's and 64GB of ECC RAM. All of my Kodi instances are on my wired gigabit LAN, yet they constantly buffer/cache; sometimes it's every 5-10 minutes, other times every 20-30 seconds. I don't notice anything on my unRAID machine that should be causing it, but I'm out of ideas. I am using mariadb to host the SQL DB my Kodi instances point to, but I can't see that being the issue. Any insight would be greatly appreciated. Thanks!
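For reference, the buffer settings asked about a couple of posts up live in Kodi's advancedsettings.xml. A minimal sketch, assuming a Kodi 17+ install (older releases used the <network><cachemembuffersize> form instead) and a stock Linux userdata path; the numbers are illustrative, not a recommendation:

```shell
# Write an advancedsettings.xml with explicit video-cache settings.
# USERDATA default is the stock Linux path; adjust for your boxes.
USERDATA="${USERDATA:-${HOME:-/tmp}/.kodi/userdata}"
mkdir -p "$USERDATA"
cat > "$USERDATA/advancedsettings.xml" <<'EOF'
<advancedsettings>
  <cache>
    <!-- RAM used for buffering, in bytes (~150 MB here; Kodi can use
         roughly three times this amount in total) -->
    <memorysize>157286400</memorysize>
    <!-- 1 = buffer all filesystems, including SMB/NFS shares -->
    <buffermode>1</buffermode>
    <!-- fill the cache at up to 10x the stream's bitrate -->
    <readfactor>10</readfactor>
  </cache>
</advancedsettings>
EOF
echo "wrote $USERDATA/advancedsettings.xml"
```

Raising readfactor is what usually helps on a LAN that is fast but bursty, since the default only reads slightly ahead of the stream's average bitrate.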
  7. I've got an issue that I can't resolve; hopefully you guys can tell me what dumb thing I did/didn't do. I have a handful of Kodi boxes up and running, pointing at a SQL database. I decided to set up this headless container to handle library updates, but I'm running into an issue. All I did was set up the advancedsettings to point at the database, then point sonarr at it, but it doesn't update the library. In the Kodi log file it says: 17:58:37.621 T:22745948530432 WARNING: Process directory 'smb://ip/folder/name of show' does not exist - skipping scan. These folders all exist, but Kodi headless doesn't appear to be able to see them. What can I do to get this up and running?
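One common approach to the headless-container problem above (hedged; it assumes the media can be mounted inside the container) is to bind-mount the shares into the container and translate the smb:// paths stored in the shared DB using Kodi path substitution. The smb://ip/folder/ and /media/folder/ paths below are placeholders:

```shell
# Sketch: map the SMB source paths (as the other Kodi boxes store them in the
# shared DB) onto a local mount inside the headless container.
USERDATA="${USERDATA:-${HOME:-/tmp}/.kodi/userdata}"
mkdir -p "$USERDATA"
cat > "$USERDATA/advancedsettings.xml" <<'EOF'
<advancedsettings>
  <pathsubstitution>
    <substitute>
      <from>smb://ip/folder/</from> <!-- path as the other Kodi boxes see it -->
      <to>/media/folder/</to>       <!-- where the share is mounted in this container -->
    </substitute>
  </pathsubstitution>
</advancedsettings>
EOF
echo "wrote $USERDATA/advancedsettings.xml"
```

The substitution only changes how this instance resolves the paths; the DB keeps the smb:// form, so the regular Kodi boxes are unaffected.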
  8. I'm getting emails from letsencrypt about my certs expiring soon. Do I need to do anything, or does it take care of it on its own?
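For what it's worth, the letsencrypt container renews certificates on its own (a cron job inside the container checks daily), so normally nothing needs doing; the expiry emails often just track an old certificate that a renewal has already superseded. A hedged way to check the cert actually on disk (the /config path is an assumption about the container's appdata mount):

```shell
# Print the subject and expiry of the certificate the container is serving.
# CERT path is an assumption; adjust to your appdata location.
CERT="${CERT:-/config/keys/letsencrypt/fullchain.pem}"
if [ -f "$CERT" ]; then
  openssl x509 -noout -subject -enddate -in "$CERT"
else
  echo "no cert at $CERT (adjust the path for your setup)"
fi
```

If the notAfter date printed is comfortably in the future, renewal is working and the emails can be ignored.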
  9. Question for everyone. I have been using this plugin for a while but completely missed the setting that lets you ignore specific file extensions, and now when the verification task runs my .nfo files are almost always flagged as modified. What's my best course of action to stop getting that every time it runs? Is there a way to remove the hash from a specific file type in batches?
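On the batch question above: the Dynamix File Integrity plugin stores its checksums as extended attributes on the files themselves, so clearing the hashes for one extension amounts to a find pass per disk. A sketch; the user.blake2 attribute name is an assumption (it depends on the hash method selected; inspect one file with getfattr -d first), and the real setfattr line is left commented so the listing can be reviewed before anything is changed:

```shell
# Dry run: list the .nfo files whose stored hash would be removed.
ROOT="${ROOT:-/mnt/disk1}"   # per-disk path; repeat for each data disk
if [ -d "$ROOT" ]; then
  find "$ROOT" -type f -name '*.nfo' -print
else
  echo "no such directory: $ROOT"
fi
# Once the list looks right, strip the attribute (name depends on hash method):
#   find "$ROOT" -type f -name '*.nfo' -exec setfattr -x user.blake2 {} +
```

With the hashes gone and the extension added to the plugin's ignore list, subsequent verification runs should stop flagging the .nfo files.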
  10. I ran into an interesting issue that hopefully I can get some insight on. I added an SMB share in Unassigned Devices and mounted it, then installed duplicati and set the destination folder to the newly mapped SMB share. I started the backup, but after ~5 minutes of it running duplicati goes unresponsive. I cannot access the webgui, I cannot stop/restart the container, and docker in general appears only partially functional afterwards. If I try to stop the array it never actually stops, because the duplicati container won't stop. If I get the machine rebooted and the array back online, then as soon as duplicati runs again the same thing happens. Any ideas?
  11. Probably not the right place to post this, but I'm not sure where else to put it. I have 2 unraid servers; one is my main machine and the other is more of an archive machine, located at another site, that holds irreplaceable data. I'm looking for a way to keep the two machines in sync, so that if something new gets added to an archive-specific directory on my production machine it syncs over to the archive machine. Any suggestions on how to automate this?
  12. This is the second week in a row this has happened to me, but it's the first time I was able to get it to create diagnostics. I have the CA Backup and CA Auto Update plugins kicking off Sunday nights at different times. The last two weeks, though, when I come and check on the machine everything is unresponsive and I have to hard-power it. Last week I wasn't even able to ssh into it; this week I was. Attached are my diagnostics. storage-diagnostics-20171009-0545.zip Edit: I also just noticed that literally all of my containers are gone...
  13. I started noticing things weren't running as well as normal yesterday, so I started digging through the syslog and found this:
Oct 5 22:04:59 Storage root: CA Docker Autostart Manager Finished
Oct 5 22:05:31 Storage shfs/user: err: shfs_write: write: (28) No space left on device
Oct 5 22:05:31 Storage shfs/user: err: shfs_write: write: (28) No space left on device
Oct 5 22:05:31 Storage shfs/user: err: shfs_write: write: (28) No space left on device
Oct 5 22:05:31 Storage shfs/user: err: shfs_write: write: (28) No space left on device
Oct 5 22:05:31 Storage shfs/user: err: shfs_write: write: (28) No space left on device
Oct 5 22:05:31 Storage shfs/user: err: shfs_write: write: (28) No space left on device
Oct 5 22:05:31 Storage shfs/user: err: shfs_write: write: (28) No space left on device
Oct 5 22:05:31 Storage kernel: loop: Write error at byte offset 7557210112, length 4096.
Oct 5 22:05:31 Storage kernel: blk_update_request: I/O error, dev loop0, sector 14760168
Oct 5 22:05:31 Storage kernel: BTRFS error (device loop0): bdev /dev/loop0 errs: wr 520, rd 0, flush 0, corrupt 0, gen 0
It's all over my syslog, so this is just a snippet. My cache pool shows 109GB free right now, so I don't understand the No space left on device error. Included are my diagnostics. storage-diagnostics-20171006-0545.zip
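A note on reading that snippet: the failing writes are on /dev/loop0, which is the docker.img loopback, so the filesystem that is actually full is the one inside the docker image, not the cache pool; that is why 109GB free on the pool does not prevent the error. Two hedged checks (paths and commands assume a stock unRAID docker setup):

```shell
# Free space inside the docker loopback, not on the pool that hosts it:
df -h /var/lib/docker 2>/dev/null || echo "docker.img not mounted here"
# What is consuming it (images, containers, volumes); needs a running docker:
docker system df 2>/dev/null || echo "docker CLI not available here"
```

If the image itself is full, growing docker.img or pruning unused images/volumes is the usual fix.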
  14. I posted here this morning about an issue I was having, which turned into something else, so I figured it warranted a different thread since it no longer matched the original issue. What started as my machine being unresponsive turned into this: upon reboot it works for a few minutes, then throws a Disabling IRQ #18 error, which makes the GUI and any containers/VMs stop functioning, although I can still access the array. One user told me to move my second NIC to another slot, which I did, but it didn't help, so I removed the second card completely and am now running onboard, and I'm experiencing the same issues without seeing the IRQ #18 error. Since the machine is unresponsive I cannot run diagnostics from the GUI; I tried running it from the command line, but it never actually finishes. I thought perhaps it was an issue with 6.3.4, so I downgraded to 6.3.3, but that didn't help either. The best I can do is give you my syslog, which I copied via the command line; hopefully that helps to some extent. Not sure if this helps now either, but here are the results of cat /proc/interrupts - prior to pulling my other NIC, line 18 had eth0 listed. EDIT: Diagnostics finally worked, so they are attached as well!
root@Storage:~# cat /proc/interrupts
           CPU0    CPU1   CPU2   CPU3   CPU4    CPU5   CPU6   CPU7
  0:         29       0      0      0      0       0      0      0  IR-IO-APIC    2-edge      timer
  1:          1       0      0      0      0       0      0      1  IR-IO-APIC    1-edge      i8042
  8:         53       0      0      1      1       0      0      0  IR-IO-APIC    8-edge      rtc0
  9:          0       0      0      0      0       0      0      0  IR-IO-APIC    9-fasteoi   acpi
 12:          4       0      0      0      0       0      0      0  IR-IO-APIC   12-edge      i8042
 16:         85       2      4      1     13       3      2      9  IR-IO-APIC   16-fasteoi   ehci_hcd:usb1
 18:          0       0      0      0      0       0      0      0  IR-IO-APIC   18-fasteoi   i801_smbus
 19:     282611   26913  18752  13007 103143  102744  81838  40706  IR-IO-APIC   19-fasteoi   ata_piix, ata_piix
 23:       3520     623    597    423   1951    1380    744    694  IR-IO-APIC   23-fasteoi   ehci_hcd:usb2
 24:          0       0      0      0      0       0      0      0  DMAR-MSI      0-edge      dmar0
 25:          0       0      0      0      0       0      0      0  DMAR-MSI      1-edge      dmar1
 28:        289      33     19     23    126      33     25     34  IR-PCI-MSI 2097152-edge   xhci_hcd
 29:          0       0      0      0      0       0      0      0  IR-PCI-MSI 2097153-edge   xhci_hcd
 30:          0       0      0      0      0       0      0      0  IR-PCI-MSI 2097154-edge   xhci_hcd
 31:          0       0      0      0      0       0      0      0  IR-PCI-MSI 2097155-edge   xhci_hcd
 32:          0       0      0      0      0       0      0      0  IR-PCI-MSI 2097156-edge   xhci_hcd
 33:          0       0      0      0      0       0      0      0  IR-PCI-MSI 2097157-edge   xhci_hcd
 34:          0       0      0      0      0       0      0      0  IR-PCI-MSI 2097158-edge   xhci_hcd
 35:          0       0      0      0      0       0      0      0  IR-PCI-MSI 2097159-edge   xhci_hcd
 36:      70877    6830   4978   3240  20018   12940   9281   6576  IR-PCI-MSI  524288-edge   mpt2sas0-msix0
 37:     372337   43280  25194  19402 123527   81982  60890  46134  IR-PCI-MSI  409600-edge   eth0
 38:      58305    4623   3762   2280  12847    9519   7089   5222  IR-PCI-MSI 1048576-edge   mpt2sas1-msix0
 39:      58951    4757   4022   2465  12296    9615   7070   5059  IR-PCI-MSI 1572864-edge   mpt2sas2-msix0
NMI:          0       0      0      0      0       0      0      0  Non-maskable interrupts
LOC:     557747  555080 546294 658922 636080  486156 435353 406441  Local timer interrupts
SPU:          0       0      0      0      0       0      0      0  Spurious interrupts
PMI:          0       0      0      0      0       0      0      0  Performance monitoring interrupts
IWI:          0       0      0      0      0       0      0      0  IRQ work interrupts
RTR:          0       0      0      0      0       0      0      0  APIC ICR read retries
RES:     210030  211826 198842 171955 289050  257279 328987 316112  Rescheduling interrupts
CAL:       8458    7961   8816   7957   9253    8715   7401   6328  Function call interrupts
TLB:       7845    7495   8179   7344   8682    8082   6781   5816  TLB shootdowns
TRM:          0       0      0      0      0       0      0      0  Thermal event interrupts
THR:          0       0      0      0      0       0      0      0  Threshold APIC interrupts
DFR:          0       0      0      0      0       0      0      0  Deferred Error APIC interrupts
MCE:          0       0      0      0      0       0      0      0  Machine check exceptions
MCP:          5       5      5      5      5       5      5      5  Machine check polls
ERR:          0
MIS:          0
PIN:          0       0      0      0      0       0      0      0  Posted-interrupt notification event
PIW:          0       0      0      0      0       0      0      0  Posted-interrupt wakeup event
syslog.txt storage-diagnostics-20170522-1331.zip
  15. Well, that didn't help. The machine was up for 3-4 minutes, then the Disabling IRQ #18 happened again and everything went unresponsive. Should I make a new post about this, since it's not really the same issue the thread title suggests?
  16. I downgraded to 6.3.3 prior to seeing your message, to see if that helps. This is what I see using your command above:
18: 601755 7851 7386 8481 9263 12039 8049 7874 IR-IO-APIC 18-fasteoi i801_smbus, eth0
  17. I was able to access the shares, but unable to access the GUI or any running containers. I just hooked a monitor up to the box and saw a Disabling IRQ #18 message on the screen. Not sure what that means, and without logs I'm not sure I'm going to be able to figure out what the hell happened.
  18. root@Storage:~# diagnostics
Starting diagnostics collection...
Been running since about 1 minute after your first post.
  19. So, any suggestions if running diagnostics has taken ~18 minutes so far and nothing is being generated?
  20. My machine has been acting up and I cannot get the GUI to load at all. I figured I would run diagnostics, but since I can't access the GUI I obviously cannot do that. Is there a way to manually run diagnostics from the command line?
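Yes; as the root@Storage:~# diagnostics capture in post 18 above shows, unRAID ships a command-line diagnostics tool. A sketch (the /boot/logs output location is an assumption; the command reports the real path when it finishes):

```shell
# Collect the same diagnostics zip the GUI produces, from ssh or a console.
if command -v diagnostics >/dev/null 2>&1; then
  diagnostics                         # writes a dated zip, typically under /boot/logs
  ls -t /boot/logs 2>/dev/null | head -n 3
else
  echo "diagnostics is an unRAID-specific command; not present on this system"
fi
```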
  21. Here are the results:
root@Storage:~# btrfs dev stats /mnt/cache
[/dev/sdf1].write_io_errs 0
[/dev/sdf1].read_io_errs 0
[/dev/sdf1].flush_io_errs 0
[/dev/sdf1].corruption_errs 0
[/dev/sdf1].generation_errs 0
[/dev/sdk1].write_io_errs 0
[/dev/sdk1].read_io_errs 0
[/dev/sdk1].flush_io_errs 0
[/dev/sdk1].corruption_errs 0
[/dev/sdk1].generation_errs 0
  22. Started having issues today where the mover was not moving anything from the cache pool to the array, so I stopped the array, put it in maintenance mode, and ran the btrfs check --readonly option after clicking on the disk in the unraid GUI. It had some info in there, but I accidentally closed it; the gist of it was something like this:
block group 422051840 has wrong amount of free space
failed to load free space cache for block group 422051840
In the log when the mover is running I see this:
May 1 16:55:11 Storage shfs/user: err: shfs_write: write: (28) No space left on device
May 1 16:55:11 Storage kernel: loop: Write error at byte offset 7035449344, length 4096.
May 1 16:55:11 Storage kernel: blk_update_request: I/O error, dev loop0, sector 13741112
May 1 16:55:11 Storage kernel: BTRFS error (device loop0): bdev /dev/loop0 errs: wr 495, rd 0, flush 0, corrupt 0, gen 0
May 1 16:56:36 Storage shfs/user: err: shfs_write: write: (28) No space left on device
I have ample space across all drives, so I know I'm not out of space. I did some searching for the "wrong amount of free space" error above and saw some stuff referencing a clear_cache flag, but nothing here on the Lime Tech forums. Any ideas on what to do here? Diagnostics are attached. storage-diagnostics-20170501-1655.zip
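On the clear_cache flag mentioned above: it is a btrfs mount option that throws away and rebuilds the free-space cache, which is exactly what the "failed to load free space cache" message points at. The sketch below only prints the commands rather than running them, since both act on a live pool device; /dev/sdf1 is assumed from the dev stats output earlier in the thread, and the offline variant needs a reasonably recent btrfs-progs:

```shell
# Print (don't run blind) the two usual ways to rebuild the free-space cache.
DEV="${DEV:-/dev/sdf1}"   # from `btrfs dev stats /mnt/cache` above
echo "mount -o clear_cache $DEV /mnt/cache    # one-time mount option; cache rebuilds on mount"
echo "btrfs check --clear-space-cache v1 $DEV # offline rebuild, newer btrfs-progs only"
```

Note the write errors in that log are on /dev/loop0 (the docker image), so the space-cache rebuild addresses the btrfs check warning rather than necessarily the mover failure itself.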