chesh

Members · Advanced Member
Content Count: 44 · Community Reputation: 1 (Neutral)
  1. That totally fixed my issue. Set it back to defaults and ran a parity check. No slowdowns and no errors in the logs. Thanks for the help!
  2. I've been troubleshooting an issue w/ my Unraid server for the last month and have mostly been avoiding it by not running a parity check. At the beginning of the month, while a parity check was running, I noticed my docker containers and VMs were running like crap. At least, that's what I eventually figured out after downgrading back to 6.5.3, thinking it was an issue w/ the new 6.6.x releases. It started out with my Windows 7 VM being unresponsive and my containers having timeout issues. I eventually found the following in my logs:

     Nov 29 11:22:08 Tower kernel: INFO: rcu_sched self-detected stall on CPU
     Nov 29 11:22:08 Tower kernel: 30-...: (60000 ticks this GP) idle=a26/140000000000001/0 softirq=1132375/1132375 fqs=14147
     Nov 29 11:22:08 Tower kernel: (t=60001 jiffies g=52263 c=52262 q=104379)
     Nov 29 11:22:08 Tower kernel: NMI backtrace for cpu 30
     Nov 29 11:22:08 Tower kernel: CPU: 30 PID: 11752 Comm: unraidd Not tainted 4.14.49-unRAID #1
     Nov 29 11:22:08 Tower kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./EP2C602-4L/D16, BIOS P1.80 01/16/2014
     Nov 29 11:22:08 Tower kernel: Call Trace:
     Nov 29 11:22:08 Tower kernel: <IRQ>
     Nov 29 11:22:08 Tower kernel: dump_stack+0x5d/0x79
     Nov 29 11:22:08 Tower kernel: nmi_cpu_backtrace+0x9b/0xba
     Nov 29 11:22:08 Tower kernel: ? irq_force_complete_move+0xf3/0xf3
     Nov 29 11:22:08 Tower kernel: nmi_trigger_cpumask_backtrace+0x56/0xd4
     Nov 29 11:22:08 Tower kernel: rcu_dump_cpu_stacks+0x8e/0xb8
     Nov 29 11:22:08 Tower kernel: rcu_check_callbacks+0x212/0x5f0
     Nov 29 11:22:08 Tower kernel: update_process_times+0x23/0x45
     Nov 29 11:22:08 Tower kernel: tick_sched_timer+0x33/0x61
     Nov 29 11:22:08 Tower kernel: __hrtimer_run_queues+0x78/0xc1
     Nov 29 11:22:08 Tower kernel: hrtimer_interrupt+0x87/0x157
     Nov 29 11:22:08 Tower kernel: smp_apic_timer_interrupt+0x75/0x85
     Nov 29 11:22:08 Tower kernel: apic_timer_interrupt+0x7d/0x90
     Nov 29 11:22:08 Tower kernel: </IRQ>
     Nov 29 11:22:08 Tower kernel: RIP: 0010:memcmp+0x2/0x1d
     Nov 29 11:22:08 Tower kernel: RSP: 0018:ffffc900077c7cd0 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10
     Nov 29 11:22:08 Tower kernel: RAX: 0000000000000000 RBX: ffff881015ec0ce8 RCX: 0000000000000fd7
     Nov 29 11:22:08 Tower kernel: RDX: 0000000000001000 RSI: ffff88103b417000 RDI: ffff881015ed7000
     Nov 29 11:22:08 Tower kernel: RBP: ffff881015ed7000 R08: 00000000000000b6 R09: ffff881015ec0d88
     Nov 29 11:22:08 Tower kernel: R10: 0000000000000fd0 R11: 0000000000000ff0 R12: ffff88103856c800
     Nov 29 11:22:08 Tower kernel: R13: 0000000000000000 R14: ffff881015ec0d60 R15: 000000000000000f
     Nov 29 11:22:08 Tower kernel: check_parity+0x27c/0x30b [md_mod]
     Nov 29 11:22:08 Tower kernel: ? ttwu_do_wakeup.isra.4+0xd/0x84
     Nov 29 11:22:08 Tower kernel: handle_stripe+0xefc/0x1293 [md_mod]
     Nov 29 11:22:08 Tower kernel: unraidd+0xb8/0x111 [md_mod]
     Nov 29 11:22:08 Tower kernel: ? md_open+0x2c/0x2c [md_mod]
     Nov 29 11:22:08 Tower kernel: ? md_thread+0xbc/0xcc [md_mod]
     Nov 29 11:22:08 Tower kernel: ? handle_stripe+0x1293/0x1293 [md_mod]
     Nov 29 11:22:08 Tower kernel: md_thread+0xbc/0xcc [md_mod]
     Nov 29 11:22:08 Tower kernel: ? wait_woken+0x68/0x68
     Nov 29 11:22:08 Tower kernel: kthread+0x111/0x119
     Nov 29 11:22:08 Tower kernel: ? kthread_create_on_node+0x3a/0x3a
     Nov 29 11:22:08 Tower kernel: ret_from_fork+0x35/0x40
     Nov 29 11:22:12 Tower kernel: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 30-... } 63749 jiffies s: 7381 root: 0x2/.
     Nov 29 11:22:12 Tower kernel: blocking rcu_node structures: l=1:16-31:0x4000/.
     Nov 29 11:22:12 Tower kernel: Task dump for CPU 30:
     Nov 29 11:22:12 Tower kernel: unraidd R running task 0 11752 2 0x80000008
     Nov 29 11:22:12 Tower kernel: Call Trace:
     Nov 29 11:22:12 Tower kernel: ? md_open+0x2c/0x2c [md_mod]
     Nov 29 11:22:12 Tower kernel: ? md_thread+0xbc/0xcc [md_mod]
     Nov 29 11:22:12 Tower kernel: ? handle_stripe+0x1293/0x1293 [md_mod]
     Nov 29 11:22:12 Tower kernel: ? md_thread+0xbc/0xcc [md_mod]
     Nov 29 11:22:12 Tower kernel: ? wait_woken+0x68/0x68
     Nov 29 11:22:12 Tower kernel: ? kthread+0x111/0x119
     Nov 29 11:22:12 Tower kernel: ? kthread_create_on_node+0x3a/0x3a
     Nov 29 11:22:12 Tower kernel: ? ret_from_fork+0x35/0x40

     Is this a bad SATA/molex power connector or a bad cable to this part of my backplane? Do I possibly have some ports going out? Any help would be much appreciated. Thanks for any help that you can provide! tower-diagnostics-20181130-1023.zip
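For anyone landing here with a similar stall: a couple of quick checks can help separate a cabling/backplane fault from a purely software problem before swapping hardware. This is a sketch under assumptions: the syslog path is stock Unraid's, and sdX is a placeholder for whichever disk sits on the suspect ports.

```shell
# Hedged first checks for a suspected SATA/backplane problem.
LOG=${LOG:-/var/log/syslog}

# RCU stalls alongside low-level ATA resets or I/O errors in the same time
# window would point at cabling/backplane rather than software.
[ -r "$LOG" ] && grep -E 'rcu_sched|ata[0-9]+\.[0-9]+|I/O error' "$LOG"

# SMART health for the suspect disk (needs root; substitute the real device):
# smartctl -H /dev/sdX
true   # keep a clean exit status if the log isn't readable here
```

If the grep only ever turns up rcu_sched lines and no ata errors, the cabling is a less likely culprit.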
  3. You can safely ignore the error; it'll be fixed in an update to the container. Read above if you want to know what's going on w/ Sickrage/Sickchill/Medusa.
  4. FYI: Converting from Sickrage to Sickchill was extremely simple. Take a backup of your configuration from before Saturday (10/13) and just restore it into Sickchill (it should be in appdata\Sickrage\backup as a .zip file). If you don't have a backup in that location, use my trick above to switch back to the 10/6 version of Sickrage, don't start the container, rename all of the .old files in your configuration directory back to .db files, and then start Sickrage. From there you can go into Settings -> Backup/Restore and make a backup. Then shut down Sickrage, start Sickchill, and restore your backup. It should be as easy as that. I had previously gone to Medusa, but it uses slightly different tables in the .db file and not everything worked as nicely as I would have liked (some shows showed up in my shows list but I couldn't access them from the dropdown, and then Sickrage started redownloading all the episodes, even though I didn't have any missing). If you follow these instructions to switch to Sickchill, the worst you may have to deal with is it not knowing about episodes downloaded since the backup was made. They'll just show up in your Schedule as missed episodes. For each show that is "missed" (but has already downloaded), just "Re-scan files" and it will find the downloaded file and not try to redownload it. It may try to post-process it again, but that won't harm anything other than moving the episode to the front of Recently Added TV.
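The "rename the .old files back to .db" step above can be sketched in the shell. The appdata path and the *.db.old naming are assumptions (check what the upgrade actually left in your config directory first), and the container must be stopped before renaming.

```shell
# Restore the renamed SickRage databases, with the container stopped.
# CONFIG path is an assumed default; point it at your real appdata share.
CONFIG=${CONFIG:-/mnt/user/appdata/sickrage}

for f in "$CONFIG"/*.db.old; do
    [ -e "$f" ] || continue       # glob matched nothing; directory is clean
    mv -v "$f" "${f%.old}"        # e.g. cache.db.old -> cache.db
done
```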
  5. Just converted to Medusa w/ a backup/restore of configuration files. Once you've done the restore in Medusa, go into appdata\binhex-medusa\restore, rename the .db file to main.db, and then restart the container. That was the only one-off fix I had to do to get my DB restored. After that, I just checked all my settings and corrected anything that didn't convert over correctly. I shut down and deleted Sickrage once that was all fixed. Everything seems to be working well now.
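The one-off rename above looks like this as a shell sketch. The restore path assumes the default binhex-medusa appdata layout, and the loop assumes exactly one .db file is sitting in that folder; adjust both to your setup.

```shell
# Rename the restored database to the name Medusa expects (main.db).
RESTORE=${RESTORE:-/mnt/user/appdata/binhex-medusa/restore}

for f in "$RESTORE"/*.db; do
    [ -e "$f" ] || continue               # nothing restored yet
    mv -v "$f" "$RESTORE"/main.db         # Medusa looks for main.db
    break                                 # only the first .db is handled
done
```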
  6. Not sure why yours doesn't have :latest. It might be a carryover from an older version of unRaid (I've been using unRaid since 5.x). Deleting an orphan is as simple as clicking on the orphaned image and selecting delete (it will be in the context menu).
  7. Edit the SickRage container, and in the repository line change binhex/arch-sickrage:latest to binhex/arch-sickrage:v2018.10.06-1-01 and hit save. You'll have to manually remove the orphan from your docker tab.
  8. You can roll back by changing latest to v2018.10.06-1-01 in your docker config for SickRage (full line should be: binhex/arch-sickrage:v2018.10.06-1-01). You can watch here: https://hub.docker.com/r/binhex/arch-sickrage/builds/ for when binhex releases the next version that will fix the issue. Once a new release is out, change binhex/arch-sickrage:v2018.10.06-1-01 back to binhex/arch-sickrage:latest and it will grab the latest version.
  9. After rebuilding just about everything today, I was finally able to figure out what was causing my slowdowns. Putting the answer here in case anyone else runs into the same problem. After starting my dockers one at a time and watching the disk throughput during a parity check, it turned out SickRage was causing the issue. I had a ton of crap in the post-processing folder, and when it went to post-process, it was extracting all of it at the same time and trying to move it to my specified folders. The weird thing about SickRage: it was also extracting movies, deciding it didn't know what they were, and moving on. I had a lot of movies in my post-process folder. So I moved everything out of that folder that had already been post-processed, started SickRage back up, and here we are almost 45 mins later with no issues. I had successfully run Deluge and Plex for most of the day with no issues, but as soon as SickRage hit the post-processing task, there went my speeds again. It must spawn a bunch of child threads when post-processing, because killing the SickRage container wouldn't bring my speed back; it would take almost 20-30 mins for it to finish what it was doing (even though the container was stopped) before things went back to normal. Hence the hard time pinning down the root cause. Anyway, I hope this helps someone else.
  10. So, in further troubleshooting, I set all of my containers to not auto-start and then rebooted my server again. Once it was back up, I kicked off another parity check and was getting the expected speeds (80 MB/sec - 140 MB/sec). It ran like this for about 30 mins and then dropped back down to 56 KB/sec. This is with literally nothing running but UnRaid. Not sure what is causing the disk usage. Attached is a picture of the current disk usage with nothing running. Any way to figure out what's eating it?
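Two generic ways to answer "what's eating the disk?" on a box where nothing should be running. Note that iotop is not part of stock Unraid, so treat its availability as an assumption about your setup:

```shell
# iotop, if installed, attributes I/O to processes directly:
# -o only active, -b batch, -a accumulate, sampled every 5s for 6 rounds.
command -v iotop >/dev/null && iotop -oba -d 5 -n 6

# Without iotop, /proc/diskstats at least shows which device the activity
# hits; sample it twice a minute apart and compare the sectors-written
# column (field 10) to see which disk is being written.
cat /proc/diskstats
```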
  11. So, I let the server run overnight and tried moving some files between shares via a local VM on UnRaid. I still have extremely slow write speeds (500 KB/sec). So, before I got in the shower this morning, I fully shut down my server, let it sit for 10 mins, and then powered it back on. That was 3 hours ago. I just tested copying some files between shares again, and I'm still having the issue. Any thoughts on troubleshooting further? Attached is a new diagnostics. tower-diagnostics-20170505-0850.zip
  12. That's just it. While the disk was rebuilding yesterday, all speeds were fine. As of today, the R/W speeds have been crap. I've rebooted 3 times tonight (including the final one to downgrade, to see if that was the issue), and it's still acting weird. I'll continue testing in the morning to see if anything changes. Since I originally posted, I've tested w/ my local Win7 VM on the UnRAID server copying files between shares, and I'm getting anywhere from 750 KB/sec to 99 MB/sec. For instance, I started watching a show from my Roku 4 through my Plex docker and it was taking forever to spool. I RDP'd into my Win7 VM and started a copy between shares; it started slow, and once it sped up, my spooling sped up too. Not sure if that's indicative of something. Like I said, I'll continue to watch and see if it speeds up any, and post a follow-up if it doesn't get better tomorrow. I've already cancelled the parity check for the night to see if things improve. I'll kick off another parity check in the morning to see if it goes any quicker. PS: Squid, thanks for being a rock in the forums. Don't ever leave. You're one of the best community support guys in here.
  13. Yesterday I replaced a 4TB Red that was having read errors. The disk I replaced it with I had already precleared (3 passes) and had sitting aside as a hot spare. At the same time, I upgraded from 6.3.1 to 6.3.3, so I'm not sure which, if either, caused this issue. But after the disk rebuild finished, I seem to be having extremely slow read/write speeds. I first noticed it when everyone connected to my Plex remotely was continuously buffering. I tried copying some files between shares via my laptop and noticed slow copy speeds there, too (fluctuating between 1 MB/sec and 20 MB/sec). So, just for the hell of it, I kicked off a parity check. It's going at a whopping 55 KB/sec. Is this due to rebuilding the disk, or do I have something else going wrong? I've attached a diagnostic report. Thanks for any help anyone can provide! tower-diagnostics-20170504-2129.zip
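One quick check worth running in this situation is a read-only, per-disk benchmark: a single slow or failing array member will drag a parity check down to speeds like those described above. The device names below are examples, not the poster's actual array; substitute your own disks.

```shell
# Sequential read test of the first 1 GiB of each disk, bypassing the page
# cache (iflag=direct) so the reported rate reflects the physical device.
# Read-only, so safe to run on live data.
for d in /dev/sdb /dev/sdc /dev/sdd; do
    echo "== $d =="
    dd if="$d" of=/dev/null bs=1M count=1024 iflag=direct 2>&1 | tail -n1
done
```

A healthy spinning disk should report somewhere north of 100 MB/s at the outer tracks; one member reporting far less than its siblings is the one to pull SMART data for.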
  14. Well, sh*t, that's all it was. Thanks, Squid!