weirdcrap

Members · Posts: 454
Everything posted by weirdcrap

  1. If you checked the "anonymize statistics" box during diagnostics creation there should be no personally identifiable information in the diagnostics so it can be left if you want.
  2. Winner! I do not. I'm not sure when I lost it, or maybe I forgot to ever set it up and just never noticed until the new dashboard UI.
  3. I am seeing what appears to be fan RPMs displayed under the motherboard temperature area. I am on the latest BIOS now (F3, this was taken before I updated) and still have the RPMs showing up in the temp area. Diagnostics: node-diagnostics-20190305-1449.zip Perl is installed and up to date. This does not happen on my server at home which is also on RC5 with the Dynamix system temp plugin but with a different motherboard.
  4. Ah, I didn't realize that came from the system temps plugin. Will do!
  5. This is being handled here as it is a plugin issue: As you can see (underlined in red), on the current RC5, which is my first RC of this version, I am seeing what appears to be fan RPMs displayed under the motherboard temperature area. I am on the latest BIOS now (F3, this was taken before I updated) and still have the RPMs showing up in the temp area. This does not happen on my server at home which is also on RC5 but with a different motherboard.
  6. After using it for a few weeks I am satisfied with how everything is working, as always thanks for your help Johnnie.
  7. Ok so I installed the plugin and even after spinning up all drives it detects one drive as spun down. I ran hdparm on each disk and one of the disks on the H200 is reporting a status of "Unknown". The offending disk is /dev/sdg and it is a Seagate NAS HDD. I have another one of the same disk model installed and it reported its status fine (sdf). EDIT: Weird, after spinning SDG down and back up it is now detecting correctly. EDIT2: Spoke too soon, left it for a minute or so and now SDG is back to unknown. It seems to snap out of it once I start copying data.
     root@Node:/home/chris# hdparm -C /dev/sdh
     /dev/sdh: drive state is: active/idle
     root@Node:/home/chris# hdparm -C /dev/sde
     /dev/sde: drive state is: active/idle
     root@Node:/home/chris# hdparm -C /dev/sdj
     /dev/sdj: drive state is: active/idle
     root@Node:/home/chris# hdparm -C /dev/sdk
     /dev/sdk: drive state is: active/idle
     root@Node:/home/chris# hdparm -C /dev/sdl
     /dev/sdl: drive state is: active/idle
     root@Node:/home/chris# hdparm -C /dev/sdb
     /dev/sdb: drive state is: active/idle
     root@Node:/home/chris# hdparm -C /dev/sdf
     /dev/sdf: drive state is: active/idle
     root@Node:/home/chris# hdparm -C /dev/sdg
     /dev/sdg: drive state is: unknown
     root@Node:/home/chris# hdparm -C /dev/sdg
     /dev/sdg: drive state is: unknown
     root@Node:/home/chris# hdparm -C /dev/sdi
     /dev/sdi: drive state is: active/idle
     root@Node:/home/chris# hdparm -C /dev/sdd
     /dev/sdd: drive state is: active/idle
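For anyone wanting to spot the odd drive out without eyeballing each result, here is a minimal sketch that scans pasted `hdparm -C` output for drives reporting "unknown". The function names and the sample text are made up for illustration; it only parses text you feed it and does not run hdparm itself.

```python
import re

# Matches "/dev/sdX: drive state is: <state>", tolerating the line break and
# extra spaces that hdparm normally prints between the device and the state.
STATE_RE = re.compile(r"(/dev/\w+):\s*drive state is:\s*(\S+)")


def parse_hdparm_states(text):
    """Return {device: state}, keeping the last state seen for each device."""
    states = {}
    for device, state in STATE_RE.findall(text):
        states[device] = state
    return states


def unknown_drives(text):
    """Return a sorted list of devices whose last reported state is 'unknown'."""
    states = parse_hdparm_states(text)
    return sorted(dev for dev, state in states.items() if state == "unknown")


# Hypothetical sample, shaped like the results in the post above.
sample = (
    "/dev/sdf: drive state is: active/idle\n"
    "/dev/sdg: drive state is: unknown\n"
    "/dev/sdi: drive state is: active/idle\n"
)

if __name__ == "__main__":
    print(unknown_drives(sample))
```

Because only the last state per device is kept, a drive that flips between "unknown" and "active/idle" across repeated runs is reported according to its most recent result.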
  8. Sweet, I'll give it a try then thanks!
  9. I was getting ready to try this out but it sounds like it may not work for me either (all my cards are Perc H200s). @Squid Is this a confirmed issue?
  10. A couple of follow-up questions, as I have been reading up on turbo mode and the plugin for auto-enabling it that Squid made. First, it sounds like this plugin is still useful as UnRAID's current "Auto" setting is to just leave turbo mode off, correct? Second, the discussion in the thread makes it sound like this is ok to use with Docker applications as long as I have a cache drive and all my shares are set to "Cache: Yes." Otherwise the disks won't ever spin down and turbo mode will always be enabled (this is what I got from the discussion). Sadly, based on some recent posts in the plugin thread it sounds like Dell H310s (and probably H200s since they use the same chipset?) may return some odd statuses to hdparm that are causing the plugin to not function as expected. So I may not be able to use it after all...
  11. Interesting, you are correct about those speeds being normal; copying between my two H200s on my home server produces the same speeds. My Parity writing mode is set to Auto. So what about the "blocking" of writes to the cache drive from docker? When the parity check was running I had movies hung up and not importing in Radarr. I SSH'd into the machine and manually copied one of the movies with MC and it copied no problem... This is all anecdotal of course, short of putting my old hardware back in and doing way more testing. I don't recall ever having performance issues importing downloads via docker containers or watching Plex regardless of whether a parity check was running or not. This is the biggest concern for me. I'm doing some testing this morning and will come back and edit my post with results. EDIT: Well after a reboot things seem to be in order... Radarr is picking up movies from NZBGet with no delay, unlike last night's 30 minutes to import a local movie file. Plex hasn't hung up or thrown a "too slow" error yet.
  12. My primary server "NODE" recently got an upgrade of sorts and I am now noticing disk performance has been degraded. Running the latest RC5. For the longest time I have been running 2x Marvell Chipset 9215 (IIRC) PCIE x1 4 port SATA HBAs without much fuss. Recently I decided to swap them out (to prevent future UnRAID issues) with a newly flashed Dell H200. My disks were split between my 6 onboard SATA ports and 6 ports spread between the two Marvell controllers. One Marvell in the PCIE x16 slot and one in the first PCIE x1 slot. With this I would see a consistent 100-120Mbps. Now I have 7 disks (including parity) running off the H200 in the PCIE x16 slot and the 4 Mobo SATA III ports all filled up. All disks are successfully reporting 6Gbps speeds. With this I am seeing sustained speeds of only about 50-60Mbps. I ran a parity check right off the bat just to see how everything would perform under load. Speeds are good (around ~150 - ~100) but I noticed Plex would throw errors about not being fast enough for playback. At first I just wrote it off until I started trying to import a movie via Radarr. Radarr downloads via NZBGet to my cache drive and it is picking up and moving from there to my array cache which should normally only take ~30 seconds for a ~5GB movie. When I logged in I had 4 movies all queued up and not moving at all until I cancelled the parity check. A few minutes after the parity check was cancelled everything imported just fine. I would only see small intermittent bursts of write activity in the system stats. This led me to check speeds copying between disks which is where I got more concerned. Whether I am copying between disks connected to the H200 or between disks on the H200 and the Mobo controller disks I see bursts starting at ~100ish but dropping down to a sustainable ~50ish. This is significantly lower than I was expecting to see. 
Copying a 4.5GB ISO from an array disk on the mobo controller to an array disk on the H200 is around 50-60Mbps. Copying array disk to array disk on the H200 is around 60Mbps. I already tried adjusting my disk tunables to what @johnnie.black recommends (up until today I had been running at the defaults on this server). Sorry for the wordiness, just trying to be thorough in my description of the issue. Attached is my diagnostics from the server. I am considering going back to my Marvell controllers just to regain performance since this is my primary heavily used/shared server. I have an H310 that should be arriving tomorrow that I could try swapping in its place once I get it flashed. What can I do to improve my performance beyond switching back to my old hardware? To summarize: I used to be able to have downloads continue to be imported in the background and/or watch Plex during parity checks and such, but since putting in the H200 it seems to cause pretty significant performance issues when doing simultaneous R/W operations. node-diagnostics-20190225-0140.zip
  13. Just noticed this as well. Glad to know you are working on it. For now I have switched back to the main rutorrent interface.
  14. Sweet, I thought it looked NIC related and am glad it is nothing to be concerned about. 256 sync_thresh and the new PSU seem to have fixed all my problems. Going to start swapping disks back into their original locations and see if they continue to behave themselves. EDIT: Final result, it appears to have been all down to an underpowered PSU. After putting the new PSU in and swapping all disks and cables back I have had zero issues. Thanks for all your help Johnnie.
  15. New call trace that is different from the others. This is the only one I have gotten so far in the last two parity checks. Dec 14 16:53:15 VOID emhttpd: cmd: /usr/local/emhttp/plugins/dynamix/scripts/tail_log syslog Dec 14 18:54:37 VOID kernel: ------------[ cut here ]------------ Dec 14 18:54:37 VOID kernel: NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out Dec 14 18:54:37 VOID kernel: WARNING: CPU: 4 PID: 31 at net/sched/sch_generic.c:461 dev_watchdog+0x150/0x1a8 Dec 14 18:54:37 VOID kernel: Modules linked in: xt_CHECKSUM iptable_mangle ipt_REJECT ebtable_filter ebtables ip6table_filter ip6_tables vhost_net tun vhost tap ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 iptable_filter ip_tables nf_nat xfs md_mod it87 hwmon_vid bonding edac_mce_amd kvm_amd ccp kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd r8169 mpt3sas fam15h_power wmi_bmof ahci mxm_wmi glue_helper i2c_piix4 k10temp i2c_core mii wmi libahci raid_class pcc_cpufreq scsi_transport_sas button acpi_cpufreq Dec 14 18:54:37 VOID kernel: CPU: 4 PID: 31 Comm: ksoftirqd/4 Not tainted 4.18.20-unRAID #1 Dec 14 18:54:37 VOID kernel: Hardware name: To be filled by O.E.M. 
To be filled by O.E.M./M5A97 R2.0, BIOS 2603 06/26/2015 Dec 14 18:54:37 VOID kernel: RIP: 0010:dev_watchdog+0x150/0x1a8 Dec 14 18:54:37 VOID kernel: Code: 15 fd 97 00 00 75 36 4c 89 ef c6 05 09 fd 97 00 01 e8 93 c5 fd ff 89 e9 4c 89 ee 48 c7 c7 ee 0f d9 81 48 89 c2 e8 53 c0 b2 ff <0f> 0b eb 0f ff c5 48 81 c2 40 01 00 00 39 cd 75 98 eb 13 48 8b 83 Dec 14 18:54:37 VOID kernel: RSP: 0018:ffffc900019cbdb0 EFLAGS: 00010282 Dec 14 18:54:37 VOID kernel: RAX: 0000000000000000 RBX: ffff88044bb363b0 RCX: 0000000000000007 Dec 14 18:54:37 VOID kernel: RDX: 0000000000000000 RSI: ffff88045ed16470 RDI: ffff88045ed16470 Dec 14 18:54:37 VOID kernel: RBP: 0000000000000000 R08: 0000000000000003 R09: 0000000000020400 Dec 14 18:54:37 VOID kernel: R10: 0000000000000987 R11: 000000000000b6a8 R12: ffff88044bb3639c Dec 14 18:54:37 VOID kernel: R13: ffff88044bb36000 R14: ffff8804482e5c80 R15: 0000000000000004 Dec 14 18:54:37 VOID kernel: FS: 0000000000000000(0000) GS:ffff88045ed00000(0000) knlGS:0000000000000000 Dec 14 18:54:37 VOID kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Dec 14 18:54:37 VOID kernel: CR2: 0000000001af7544 CR3: 0000000447f76000 CR4: 00000000000406e0 Dec 14 18:54:37 VOID kernel: Call Trace: Dec 14 18:54:37 VOID kernel: call_timer_fn+0x18/0x7b Dec 14 18:54:37 VOID kernel: ? qdisc_reset+0xc0/0xc0 Dec 14 18:54:37 VOID kernel: expire_timers+0x7f/0x8e Dec 14 18:54:37 VOID kernel: run_timer_softirq+0x72/0x120 Dec 14 18:54:37 VOID kernel: ? __switch_to_asm+0x34/0x70 Dec 14 18:54:37 VOID kernel: ? __switch_to_asm+0x40/0x70 Dec 14 18:54:37 VOID kernel: ? __switch_to+0x1fe/0x30d Dec 14 18:54:37 VOID kernel: ? __switch_to_asm+0x40/0x70 Dec 14 18:54:37 VOID kernel: __do_softirq+0xce/0x1e2 Dec 14 18:54:37 VOID kernel: ? smpboot_park_thread+0x25/0x25 Dec 14 18:54:37 VOID kernel: run_ksoftirqd+0x19/0x2d Dec 14 18:54:37 VOID kernel: smpboot_thread_fn+0x134/0x149 Dec 14 18:54:37 VOID kernel: kthread+0x10b/0x113 Dec 14 18:54:37 VOID kernel: ? 
kthread_flush_work_fn+0x9/0x9 Dec 14 18:54:37 VOID kernel: ret_from_fork+0x22/0x40 Dec 14 18:54:37 VOID kernel: ---[ end trace 3af301e239cd4c04 ]--- Dec 14 18:54:37 VOID kernel: r8169 0000:02:00.0 eth0: link up Dec 14 21:25:12 VOID kernel: mdcmd (222): spindown 13 Dec 14 23:00:01 VOID Plugin Auto Update: Checking for available plugin updates Dec 14 23:00:06 VOID Plugin Auto Update: Community Applications Plugin Auto Update finished Dec 15 00:02:43 VOID crond[1836]: exit status 1 from user root /usr/local/sbin/mover &> /dev/null Dec 15 00:10:39 VOID kernel: mdcmd (223): spindown 8 Dec 15 00:10:40 VOID kernel: mdcmd (224): spindown 10 Dec 15 00:10:41 VOID kernel: mdcmd (225): spindown 12 Dec 15 00:10:47 VOID kernel: mdcmd (226): spindown 9 Dec 15 00:10:49 VOID kernel: mdcmd (227): spindown 7 Dec 15 00:11:00 VOID kernel: mdcmd (228): spindown 15 Dec 15 02:00:02 VOID Docker Auto Update: Community Applications Docker Autoupdate running Dec 15 02:00:02 VOID Docker Auto Update: Checking for available updates Dec 15 02:00:09 VOID Docker Auto Upda
  16. down to 1024 and it still happened within an hour of adjusting the values. Dec 14 06:06:01 VOID kernel: mdcmd (58): set md_num_stripes 4096 Dec 14 06:06:01 VOID kernel: mdcmd (59): set md_sync_window 2048 Dec 14 06:06:01 VOID kernel: mdcmd (60): set md_sync_thresh 1024 Dec 14 06:06:01 VOID kernel: mdcmd (61): set md_write_method Dec 14 06:06:01 VOID kernel: mdcmd (62): set spinup_group 0 0 Dec 14 06:06:01 VOID kernel: mdcmd (63): set spinup_group 1 0 Dec 14 06:06:01 VOID kernel: mdcmd (64): set spinup_group 2 0 Dec 14 06:06:01 VOID kernel: mdcmd (65): set spinup_group 3 0 Dec 14 06:06:01 VOID kernel: mdcmd (66): set spinup_group 4 0 Dec 14 06:06:01 VOID kernel: mdcmd (67): set spinup_group 5 0 Dec 14 06:06:01 VOID kernel: mdcmd (68): set spinup_group 6 0 Dec 14 06:06:01 VOID kernel: mdcmd (69): set spinup_group 7 0 Dec 14 06:06:01 VOID kernel: mdcmd (70): set spinup_group 8 0 Dec 14 06:06:01 VOID kernel: mdcmd (71): set spinup_group 9 0 Dec 14 06:06:01 VOID kernel: mdcmd (72): set spinup_group 10 0 Dec 14 06:06:01 VOID kernel: mdcmd (73): set spinup_group 11 0 Dec 14 06:06:01 VOID kernel: mdcmd (74): set spinup_group 12 0 Dec 14 06:06:01 VOID kernel: mdcmd (75): set spinup_group 13 0 Dec 14 06:06:01 VOID kernel: mdcmd (76): set spinup_group 14 0 Dec 14 06:06:01 VOID kernel: mdcmd (77): set spinup_group 15 0 Dec 14 06:06:01 VOID kernel: mdcmd (78): set spinup_group 16 0 Dec 14 06:06:01 VOID kernel: mdcmd (79): set spinup_group 17 0 Dec 14 06:06:01 VOID kernel: mdcmd (80): set spinup_group 18 0 Dec 14 06:06:01 VOID kernel: mdcmd (81): set spinup_group 19 0 Dec 14 06:06:28 VOID emhttpd: req (8): csrf_token=****************&title=System+Log&cmd=%2FwebGui%2Fscripts%2Ftail_log&arg1=syslog Dec 14 06:06:28 VOID emhttpd: cmd: /usr/local/emhttp/plugins/dynamix/scripts/tail_log syslog Dec 14 06:59:00 VOID kernel: INFO: rcu_sched self-detected stall on CPU Dec 14 06:59:00 VOID kernel: 2-....: (60000 ticks this GP) idle=e4a/1/4611686018427387906 softirq=5969851/5969851 
fqs=14532 Dec 14 06:59:00 VOID kernel: (t=60000 jiffies g=1902670 c=1902669 q=27572) Dec 14 06:59:00 VOID kernel: NMI backtrace for cpu 2 Dec 14 06:59:00 VOID kernel: CPU: 2 PID: 8980 Comm: unraidd Not tainted 4.18.20-unRAID #1 Dec 14 06:59:00 VOID kernel: Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 R2.0, BIOS 2603 06/26/2015 Dec 14 06:59:00 VOID kernel: Call Trace: Dec 14 06:59:00 VOID kernel: Dec 14 06:59:00 VOID kernel: dump_stack+0x5d/0x79 Dec 14 06:59:00 VOID kernel: nmi_cpu_backtrace+0x71/0x83 Dec 14 06:59:00 VOID kernel: ? lapic_can_unplug_cpu+0x8e/0x8e Dec 14 06:59:00 VOID kernel: nmi_trigger_cpumask_backtrace+0x57/0xd7 Dec 14 06:59:00 VOID kernel: rcu_dump_cpu_stacks+0x91/0xbb Dec 14 06:59:00 VOID kernel: rcu_check_callbacks+0x23f/0x5ca Dec 14 06:59:00 VOID kernel: ? tick_sched_handle.isra.5+0x2f/0x2f Dec 14 06:59:00 VOID kernel: update_process_times+0x23/0x45 Dec 14 06:59:00 VOID kernel: tick_sched_timer+0x36/0x64 Dec 14 06:59:00 VOID kernel: __hrtimer_run_queues+0xb1/0x105 Dec 14 06:59:00 VOID kernel: hrtimer_interrupt+0xf4/0x20d Dec 14 06:59:00 VOID kernel: smp_apic_timer_interrupt+0x79/0x91 Dec 14 06:59:00 VOID kernel: apic_timer_interrupt+0xf/0x20 Dec 14 06:59:00 VOID kernel: Dec 14 06:59:00 VOID kernel: RIP: 0010:xor_avx_5+0x231/0x352 Dec 14 06:59:00 VOID kernel: Code: c4 c1 7d 6f 92 40 01 00 00 c4 c1 6c 57 93 40 01 00 00 c5 ec 57 93 40 01 00 00 c5 ec 57 95 40 01 00 00 c5 ec 57 90 40 01 00 00 fd 7f 90 40 01 00 00 c4 c1 7d 6f 9a 60 01 00 00 c4 c1 64 57 9b Dec 14 06:59:00 VOID kernel: RSP: 0018:ffffc900021d3c68 EFLAGS: 00000287 ORIG_RAX: ffffffffffffff13 Dec 14 06:59:00 VOID kernel: RAX: ffff88040f61ca00 RBX: ffff88040f606a00 RCX: ffff88040f606000 Dec 14 06:59:00 VOID kernel: RDX: 0000000000000000 RSI: ffff88040f61c000 RDI: 0000000000001000 Dec 14 06:59:00 VOID kernel: RBP: ffff88040f605a00 R08: ffff88040f607000 R09: ffff88040f610000 Dec 14 06:59:00 VOID kernel: R10: ffff88040f610a00 R11: ffff88040f607a00 R12: 0000000000000a00 
Dec 14 06:59:00 VOID kernel: R13: ffff88040f61c000 R14: ffff88040f605000 R15: ffff88040f606000 Dec 14 06:59:00 VOID kernel: ? xor_avx_5+0x2d/0x352 Dec 14 06:59:00 VOID kernel: check_parity+0x118/0x349 [md_mod] Dec 14 06:59:00 VOID kernel: handle_stripe+0xe8a/0x1226 [md_mod] Dec 14 06:59:00 VOID kernel: unraidd+0xbc/0x123 [md_mod] Dec 14 06:59:00 VOID kernel: ? md_open+0x2c/0x2c [md_mod] Dec 14 06:59:00 VOID kernel: md_thread+0xcc/0xf1 [md_mod] Dec 14 06:59:00 VOID kernel: ? wait_woken+0x68/0x68 Dec 14 06:59:00 VOID kernel: kthread+0x10b/0x113 Dec 14 06:59:00 VOID kernel: ? kthread_flush_work_fn+0x9/0x9 Dec 14 06:59:00 VOID kernel: ret_from_fork+0x22/0x40 Dec 14 07:01:40 VOID kernel: INFO: rcu_sched self-detected stall on CPU Dec 14 07:01:40 VOID kernel: 3-....: (59999 ticks this GP) idle=5fa/1/4611686018427387906 softirq=5826078/5826078 fqs=14552 Dec 14 07:01:40 VOID kernel: (t=60001 jiffies g=1902684 c=1902683 q=32177) Dec 14 07:01:40 VOID kernel: NMI backtrace for cpu 3 Down to 800 now to see if that will be stable. EDIT: already call traced at 800 Dec 14 07:26:02 VOID kernel: CPU: 4 PID: 8980 Comm: unraidd Not tainted 4.18.20-unRAID #1 Dec 14 07:26:02 VOID kernel: Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 R2.0, BIOS 2603 06/26/2015 Dec 14 07:26:02 VOID kernel: Call Trace: Dec 14 07:26:02 VOID kernel: <IRQ> Dec 14 07:26:02 VOID kernel: dump_stack+0x5d/0x79 Dec 14 07:26:02 VOID kernel: nmi_cpu_backtrace+0x71/0x83 Dec 14 07:26:02 VOID kernel: ? lapic_can_unplug_cpu+0x8e/0x8e Dec 14 07:26:02 VOID kernel: nmi_trigger_cpumask_backtrace+0x57/0xd7 Dec 14 07:26:02 VOID kernel: rcu_dump_cpu_stacks+0x91/0xbb Dec 14 07:26:02 VOID kernel: rcu_check_callbacks+0x23f/0x5ca Dec 14 07:26:02 VOID kernel: ? 
tick_sched_handle.isra.5+0x2f/0x2f Dec 14 07:26:02 VOID kernel: update_process_times+0x23/0x45 Dec 14 07:26:02 VOID kernel: tick_sched_timer+0x36/0x64 Dec 14 07:26:02 VOID kernel: __hrtimer_run_queues+0xb1/0x105 Dec 14 07:26:02 VOID kernel: hrtimer_interrupt+0xf4/0x20d Dec 14 07:26:02 VOID kernel: smp_apic_timer_interrupt+0x79/0x91 Dec 14 07:26:02 VOID kernel: apic_timer_interrupt+0xf/0x20 Dec 14 07:26:02 VOID kernel: </IRQ> Dec 14 07:26:02 VOID kernel: RIP: 0010:xor_avx_5+0x28d/0x352 Dec 14 07:26:02 VOID kernel: Code: c5 fd 7f 98 60 01 00 00 c4 c1 7d 6f 82 80 01 00 00 c4 c1 7c 57 83 80 01 00 00 c5 fc 57 83 80 01 00 00 c5 fc 57 85 80 01 00 00 <c5> fc 57 80 80 01 00 00 c5 fd 7f 80 80 01 00 00 c4 c1 7d 6f 8a a0 Dec 14 07:26:02 VOID kernel: RSP: 0018:ffffc900021d3c68 EFLAGS: 00000287 ORIG_RAX: ffffffffffffff13 Dec 14 07:26:02 VOID kernel: RAX: ffff88040f0f1600 RBX: ffff88040f0e3600 RCX: ffff88040f0e3000 Dec 14 07:26:02 VOID kernel: RDX: 0000000000000000 RSI: ffff88040f0f1000 RDI: 0000000000001000 Dec 14 07:26:02 VOID kernel: RBP: ffff88040f0e2600 R08: ffff88040f0e4000 R09: ffff88040f0e5000 Dec 14 07:26:02 VOID kernel: R10: ffff88040f0e5600 R11: ffff88040f0e4600 R12: 0000000000000600 Dec 14 07:26:02 VOID kernel: R13: ffff88040f0f1000 R14: ffff88040f0e2000 R15: ffff88040f0e3000 Dec 14 07:26:02 VOID kernel: ? xor_avx_5+0x2d/0x352 Dec 14 07:26:02 VOID kernel: check_parity+0x118/0x349 [md_mod] Dec 14 07:26:02 VOID kernel: handle_stripe+0xe8a/0x1226 [md_mod] Dec 14 07:26:02 VOID kernel: unraidd+0xbc/0x123 [md_mod] Dec 14 07:26:02 VOID kernel: ? md_open+0x2c/0x2c [md_mod] Dec 14 07:26:02 VOID kernel: md_thread+0xcc/0xf1 [md_mod] Dec 14 07:26:02 VOID kernel: ? wait_woken+0x68/0x68 Dec 14 07:26:02 VOID kernel: kthread+0x10b/0x113 Dec 14 07:26:02 VOID kernel: ? kthread_flush_work_fn+0x9/0x9 Dec 14 07:26:02 VOID kernel: ret_from_fork+0x22/0x40 EDIT2: I'm just going to take it all the way back to the default 192. 
EDIT3: After further tweaking, I got no more call traces with a sync_thresh value of 256 or lower. Running a second check now to ensure my results are consistent. It doesn't seem to drastically affect my parity check speed either, which is always nice.
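To confirm which tunables were actually applied at a given point, the `mdcmd (...): set md_*` entries in the syslog can be pulled out with a few lines of Python. This is only a sketch: the function name is hypothetical, and it assumes the syslog is read with one entry per line (as in the real file, not as pasted above).

```python
import re

# Matches e.g. "kernel: mdcmd (60): set md_sync_thresh 1024".
# The value is optional because some entries (md_write_method in the log
# above) are logged without one.
SET_RE = re.compile(r"mdcmd \(\d+\): set (md_\w+)(?: (\S+))?\s*$")


def tunables_from_syslog(lines):
    """Return {tunable: value}, keeping the most recent value for each name."""
    values = {}
    for line in lines:
        match = SET_RE.search(line)
        if match:
            values[match.group(1)] = match.group(2)
    return values


# Sample entries copied from the log in the post above.
sample = [
    "Dec 14 06:06:01 VOID kernel: mdcmd (58): set md_num_stripes 4096",
    "Dec 14 06:06:01 VOID kernel: mdcmd (59): set md_sync_window 2048",
    "Dec 14 06:06:01 VOID kernel: mdcmd (60): set md_sync_thresh 1024",
    "Dec 14 06:06:01 VOID kernel: mdcmd (61): set md_write_method",
    "Dec 14 06:06:01 VOID kernel: mdcmd (62): set spinup_group 0 0",
]

if __name__ == "__main__":
    for name, value in tunables_from_syslog(sample).items():
        print(name, value)
```

The `spinup_group` entries are skipped on purpose since they are per-disk settings rather than the md tunables being adjusted here.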
  17. Replaced PSU, so far so good 50% into the parity check and no more disk resets. I am still receiving call traces though, do I still need to crank sync threshold down more? Current tunables: void-syslog-20181214-0539.zip
  18. I'm thinking it might be time to buy a new PSU. Because of how my PSU is mounted in the system I could never see the wattage rating, and I couldn't quite remember what it was, nor could I find my purchase receipt for it. Finally shut the thing down and pulled it out and it is only a 650, which is kind of on the light side for everything I have hooked up in here, and I'm guessing the PSU is at least 3 years old by now: https://outervision.com/b/U6ohNP Not all my drives are 7200rpm but I couldn't recall how many, so I just went with all faster disks in the calculation. EDIT: Finally found the email receipt, I bought it back in 2012! EDIT2: Bought an 850W EVGA new in plastic on eBay for $100: https://www.newegg.com/Product/Product.aspx?Item=9SIA85V4SC8056 Should future proof me plenty as I can't fit any more disks in this thing and I don't plan on attaching anything else that is power hungry. Maybe upgrading mobo and CPU at some point but that would be it.
  19. Returned from vacation and started a parity check. Now the disk resetting is ata4, a 2TB Seagate drive NOT in the same top norco 5 bay enclosure as where my other suspect disks started at (it is in the one below it). Dec 10 07:51:54 VOID kernel: ata4.00: exception Emask 0x10 SAct 0x0 SErr 0x40d0202 action 0xe frozen Dec 10 07:51:54 VOID kernel: ata4.00: irq_stat 0x00400000, PHY RDY changed Dec 10 07:51:54 VOID kernel: ata4: SError: { RecovComm Persist PHYRdyChg CommWake 10B8B DevExch } Dec 10 07:51:54 VOID kernel: ata4.00: failed command: READ DMA EXT Dec 10 07:51:54 VOID kernel: ata4.00: cmd 25/00:78:28:14:cd/00:01:01:00:00/e0 tag 0 dma 192512 in Dec 10 07:51:54 VOID kernel: res 50/00:00:27:80:11/00:00:02:00:00/e0 Emask 0x10 (ATA bus error) Dec 10 07:51:54 VOID kernel: ata4.00: status: { DRDY } Dec 10 07:51:54 VOID kernel: ata4: hard resetting link Dec 10 07:51:59 VOID kernel: ata4: SATA link up 6.0 Gbps (SStatus 133 SControl 300) Dec 10 07:51:59 VOID kernel: ata4.00: configured for UDMA/133 Dec 10 07:51:59 VOID kernel: ata4: EH complete Dec 10 07:52:03 VOID kernel: ata4.00: exception Emask 0x10 SAct 0x0 SErr 0x90202 action 0xe frozen Dec 10 07:52:03 VOID kernel: ata4.00: irq_stat 0x00400000, PHY RDY changed Dec 10 07:52:03 VOID kernel: ata4: SError: { RecovComm Persist PHYRdyChg 10B8B } Dec 10 07:52:03 VOID kernel: ata4.00: failed command: READ DMA EXT Dec 10 07:52:03 VOID kernel: ata4.00: cmd 25/00:00:a8:49:dc/00:04:01:00:00/e0 tag 3 dma 524288 in Dec 10 07:52:03 VOID kernel: res 50/00:00:a7:b5:20/00:00:02:00:00/e0 Emask 0x10 (ATA bus error) Dec 10 07:52:03 VOID kernel: ata4.00: status: { DRDY } Dec 10 07:52:03 VOID kernel: ata4: hard resetting link Dec 10 07:52:13 VOID kernel: ata4: softreset failed (1st FIS failed) Dec 10 07:52:13 VOID kernel: ata4: hard resetting link Dec 10 07:52:20 VOID kernel: ata4: SATA link up 6.0 Gbps (SStatus 133 SControl 300) Dec 10 07:52:20 VOID kernel: ata4.00: configured for UDMA/133 Dec 10 07:52:20 VOID kernel: ata4: EH 
complete Dec 10 07:54:03 VOID kernel: ata4.00: exception Emask 0x10 SAct 0x0 SErr 0x90202 action 0xe frozen Dec 10 07:54:03 VOID kernel: ata4.00: irq_stat 0x00400000, PHY RDY changed Dec 10 07:54:03 VOID kernel: ata4: SError: { RecovComm Persist PHYRdyChg 10B8B } Dec 10 07:54:03 VOID kernel: ata4.00: failed command: READ DMA EXT Dec 10 07:54:03 VOID kernel: ata4.00: cmd 25/00:00:20:d0:33/00:04:03:00:00/e0 tag 1 dma 524288 in Dec 10 07:54:03 VOID kernel: res 50/00:00:1f:3c:78/00:00:03:00:00/e0 Emask 0x10 (ATA bus error) Dec 10 07:54:03 VOID kernel: ata4.00: status: { DRDY } Dec 10 07:54:03 VOID kernel: ata4: hard resetting link Dec 10 07:54:09 VOID kernel: ata4: SATA link up 6.0 Gbps (SStatus 133 SControl 300) Dec 10 07:54:09 VOID kernel: ata4.00: configured for UDMA/133 Dec 10 07:54:09 VOID kernel: ata4: EH complete I can't be 100% sure as the cabling runs behind the case wall for this drive and I would have to pick up and flip the entire server to get back there, but I believe it is hooked up directly to my motherboard SATA controller. void-diagnostics-20181210-0758.zip EDIT: To illustrate what has been swapped where so far I took a pic of the server and made some colored circles. Red was swapped with red, green with green, and purple is my latest troublemaker in this post.
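Since the resets have moved between ports (ata8 earlier, ata4 now), a quick tally per port makes it easier to see whether the errors follow a disk, a cable, or a slot. The sketch below counts "hard resetting link" entries per ATA port; the function name is made up, and it assumes one syslog entry per line.

```python
import re
from collections import Counter

# Matches e.g. "kernel: ata4: hard resetting link" and captures the port name.
RESET_RE = re.compile(r"kernel: (ata\d+): hard resetting link")


def count_link_resets(lines):
    """Count hard link resets per ATA port across the given syslog lines."""
    counts = Counter()
    for line in lines:
        match = RESET_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Sample entries in the shape of the logs posted in this thread.
sample = [
    "Dec 10 07:51:54 VOID kernel: ata4: hard resetting link",
    "Dec 10 07:51:59 VOID kernel: ata4: SATA link up 6.0 Gbps (SStatus 133 SControl 300)",
    "Dec 10 07:52:03 VOID kernel: ata4: hard resetting link",
    "Dec  5 08:19:03 VOID kernel: ata8: hard resetting link",
]

if __name__ == "__main__":
    for port, total in count_link_resets(sample).most_common():
        print(port, total)
```

Running it against the syslog from each diagnostics zip, before and after a swap, gives a like-for-like comparison instead of scrolling through the raw entries.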
  20. I have lowered it further but am going out of town for a long weekend so I will come back and post when I return and have tested the new settings. Thanks for all your help so far Johnnie!
  21. While I was waiting for that parity check to finish UnRAID hung (before I got to try swapping anything). WebUI, dockers, SSH/Telnet all unresponsive but for some reason my terminal still worked. When I logged in it immediately showed me a message about a "hang time value being exceeded" or something along those lines. I managed to reboot it and capture the diagnostics to the flash drive, which I will attach after the server finishes booting back up all the way. Edit: Shut the server down and swapped the parity and data disk on the asmedia controller to continue my troubleshooting. void-diagnostics-20181205-2026.zip
  22. It is, and it had the cable swapped last night with a new SATAIII cable fresh out of the plastic. On the first parity run after that (last night) the error was gone, but on the second one this morning it is back. I have a modular PSU but am out of molex connectors so I have been using sata to molex splitters to power two of the Norco bays. I'll have to buy more as I don't have any more splitters on hand. Edit: sometimes my reading comprehension is horrid lol. I'll try swapping its spot with the other asmedia data disk and see if the errors follow it.
  23. Adjusted the sync_thresh down 20 and so far 25% done and no call traces yet. Though now my parity disk link hard reset is back. First parity check after replacing the card and sata card it was gone, now the second one with no other changes and it is back. Dec 5 08:16:46 VOID kernel: mdcmd (134): check nocorrect Dec 5 08:16:46 VOID kernel: md: recovery thread: check P ... Dec 5 08:16:46 VOID kernel: md: using 8192k window, over a total of 5860522532 blocks. Dec 5 08:16:46 VOID kernel: md: recovery thread: P incorrect, sector=0 Dec 5 08:16:46 VOID kernel: md: recovery thread: P incorrect, sector=24 Dec 5 08:16:46 VOID kernel: md: recovery thread: P incorrect, sector=32 Dec 5 08:16:46 VOID kernel: md: recovery thread: P incorrect, sector=96 Dec 5 08:16:53 VOID kernel: md: recovery thread: P incorrect, sector=1474968 Dec 5 08:18:01 VOID emhttpd: req (13): csrf_token=****************&title=System+Log&cmd=%2FwebGui%2Fscripts%2Ftail_log&arg1=syslog Dec 5 08:18:01 VOID emhttpd: cmd: /usr/local/emhttp/plugins/dynamix/scripts/tail_log syslog Dec 5 08:19:03 VOID kernel: ata8.00: exception Emask 0x10 SAct 0x0 SErr 0x90002 action 0xe frozen Dec 5 08:19:03 VOID kernel: ata8.00: irq_stat 0x00400000, PHY RDY changed Dec 5 08:19:03 VOID kernel: ata8: SError: { RecovComm PHYRdyChg 10B8B } Dec 5 08:19:03 VOID kernel: ata8.00: failed command: READ DMA EXT Dec 5 08:19:03 VOID kernel: ata8.00: cmd 25/00:78:20:12:cd/00:03:01:00:00/e0 tag 30 dma 454656 in Dec 5 08:19:03 VOID kernel: res 50/00:00:1f:12:cd/00:00:01:00:00/e0 Emask 0x10 (ATA bus error) Dec 5 08:19:03 VOID kernel: ata8.00: status: { DRDY } Dec 5 08:19:03 VOID kernel: ata8: hard resetting link Dec 5 08:19:13 VOID kernel: ata8: softreset failed (1st FIS failed) Dec 5 08:19:13 VOID kernel: ata8: hard resetting link Dec 5 08:19:19 VOID kernel: ata8: SATA link up 6.0 Gbps (SStatus 133 SControl 300) Dec 5 08:19:19 VOID kernel: ata8.00: configured for UDMA/133 Dec 5 08:19:19 VOID kernel: ata8: EH complete
  24. Apparently mine were adjusted at some point in the past as they indicate they are user set rather than the default...
  25. Where are the tunable settings? I thought they used to be under tips and tricks plugin but I haven't gone looking for them in forever.