kevin_h


  1. I am running Unraid 6.8.3 Stable. I recently upgraded from a dual Xeon server board to a Ryzen 5 3600 (better performance, less power usage). I added a new dual 1Gb NIC since the new motherboard only has 1 NIC on-board. I set the on-board NIC to be a single interface with a static IP address, with bonding and bridging disabled. I then set up the new dual NIC with bonding and bridging enabled and added another static IP to the new bond. In the Docker settings pane, I set the new bridge as the default interface to use. I have set the interface rules so that all the interfaces get the same identifier each time, which is working:
     eth0 - Onboard NIC, bond & bridge disabled
     eth1 - Port 1 of dual NIC, bond enabled (eth1 & eth2 are members), bridge enabled
     eth2 - Port 2 of dual NIC, member of eth1 bond
     eth3 - Port 1 of 10Gb SFP card (no issues, DAC to Plex server with static IP)
     eth4 - Port 2 of 10Gb SFP card (currently not in use)
     The issue is that Docker is still using the eth0 interface for all of its networking when Bridge is selected for the containers, even though bonding and bridging are disabled on eth0. Is Docker set to use eth0 all the time? When I go to the Docker page in the WebUI it is showing the IP address for the non-bonded on-board NIC. Do I need to specify the new bridge for each container instead of using the Bridge selection in the drop-down?
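     If it helps to look at this outside the WebUI, the commands below are a rough sketch of how the routing and Docker networks can be inspected from the console, plus an example of a user-defined network attached to br1. The subnet, gateway, and the name br1net are placeholders, not values from this setup.

        # Docker's default "bridge" network (docker0) is NATed out via whichever
        # interface holds the host's default route, so check where that points:
        ip route show default

        # List the networks Docker currently knows about; br1 only appears here
        # if a network has been created on top of it:
        docker network ls

        # Example only: create a user-defined network bound to br1 so containers
        # get addresses on that bridge instead of leaving through eth0
        # (subnet/gateway below are placeholders for the LAN the bond sits on):
        docker network create -d macvlan \
          --subnet=192.168.10.0/24 --gateway=192.168.10.1 \
          -o parent=br1 br1net

        # Attach a container to it instead of the default Bridge selection:
        docker run -d --network br1net --name test-nginx nginx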
  2. Thanks again John. I have moved the 2 slots to be directly connected to the motherboard SATA connectors with a Molex-to-SATA power dongle to help eliminate any issues that the SAS backplane may introduce, and I am in the process of rebuilding the 2 disks again. It'll probably be ~24 hrs before I know if the change had any effect. I tried rolling back from 6.5.3 -> 6.5.2 and that did not help. Once the rebuild is complete I will upgrade back to 6.5.3. I also tried adjusting the timings within the 9211 firmware for timeouts and that had no effect. I compared the SMART attributes you mentioned and I am not seeing a dramatic difference. I have rebooted a few times since the diagnostics were first taken. From first diagnostic to current:
     Iron Wolf Pro (ST6000NE0021): Attribute 12: First 10, Current 14; Attribute 192: First 29, Current 33; Attribute 199: First 0, Current 0
     WD Red (WD4000FYYZ *W7): Attribute 12: First 71, Current 75; Attribute 192: First 64, Current 68; Attribute 199: First 0, Current 0
     I guess my next test is to use my 1U server as the host and my 4U as a DAS with a 9211-8e to see if it's motherboard related. I have a power board that can boot the backplanes/HDs with no mobo. I am reattaching my original diagnostics zip file so it is accessible again after the forum change. hoskins-fs-diagnostics-20180827-1220.zip
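     For reference, a quick way to pull just those three attributes for comparison is something like the loop below; the /dev/sdX and /dev/sdY names are placeholders and would need to match the two drives being watched.

        # Print attributes 12 (Power_Cycle_Count), 192 (Power-Off_Retract_Count)
        # and 199 (UDMA_CRC_Error_Count) for each drive of interest:
        for d in /dev/sdX /dev/sdY; do
          echo "== $d =="
          smartctl -A "$d" | awk '$1==12 || $1==192 || $1==199'
        done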
  3. So apparently I tried to post my reply last night during the maintenance period. The same 2 disks slots dropped again last night. I noticed that the HBA did lost communication with the disk and re-added them at different sdX disk names. I am able to sucessfully rebuild the disk each time. Attached are the diagnostics from right after they dropped. hoskins-fs-diagnostics-20180828-2008.zip
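     In case it is useful, a rough way to catch the moment the HBA drops a disk is to grep the syslog for the LSI driver and SCSI removal messages right after it happens. The exact message strings vary by driver and kernel version, so treat the patterns below as a starting point rather than a definitive list.

        # Look for mpt2sas events and SCSI attach/detach messages around the
        # time the slots dropped:
        grep -iE 'mpt2sas|attached scsi|synchronizing scsi cache|rejecting i/o' /var/log/syslog | tail -n 100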
  4. Thanks John. I noticed as well that my drives were getting awfully warm. I am only using 14 out of 24 bays, so I need to spread them out in the case to get better airflow over each drive. It is currently rebuilding both drives, so once that completes I will wait till they drop again and grab diagnostics as soon as I get the email notifying me of the error state. Not sure how I will test my power supply since it is a redundant hot-swap power supply with a distributor that connects everything. I made sure all 6 power connectors were fully seated when I replaced the backplane. I will double check each connector again to be sure they are all the way connected. Since they are both NAS/enterprise drives, getting power from a backplane shouldn't be an issue, should it?
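     Side note on the temperatures: a quick spot check from the console is possible with something like the loop below. The attribute is usually reported as Temperature_Celsius, but some drives use Airflow_Temperature_Cel instead, so the grep pattern may need adjusting.

        # Print the current temperature attribute for every sd device:
        for d in /dev/sd?; do
          printf '%s: ' "$d"
          smartctl -A "$d" | grep -i -m1 'temperature' || echo 'no temperature attribute'
        done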
  5. Hoping someone with more expertise can assist in figuring out why the same 2 slots keep dropping out of the array. Running unRaid Pro 6.5.3 on:
     Supermicro SC847 chassis
     X8-DT6 mobo, 2 x E5530
     32GB ECC RAM
     SAS2-846EL1 backplane
     9211-8i in IT mode
     Dual 8TB parity
     I have run extended SMART tests on both drives and no errors were reported. I just replaced my SAS1 backplane with the SAS2 hoping that would solve my issue. The data is still on the drives, because I can move them to another server and view all the contents, and no errors are reported on the other machine. I swapped out the HBA for a different 9211-8i with the same result. Moved the drives to the 12-bay backplane and no difference. I keep dropping Disk 8 and Disk 5. Disk 8 is a 3-month-old Iron Wolf Pro 6TB and Disk 5 is a WD RE 4TB. Both were purchased new. I swapped the 4TB RE for a different 4TB RE and it dropped from the same array slot number even though it was in a different hot-swap slot. Any help would be greatly appreciated. hoskins-fs-diagnostics-20180827-1220.zip
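     For anyone wanting to repeat the extended SMART test from the console, it is roughly the following; the device name is a placeholder.

        # Start a long (extended) self-test in the background:
        smartctl -t long /dev/sdX

        # After it finishes (smartctl prints an estimated duration), check the
        # self-test log and the overall health assessment:
        smartctl -l selftest /dev/sdX
        smartctl -H /dev/sdX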
  6. I recently upgraded from unRaid 6.4.0 to 6.4.1 using the built-in upgrade tool. Since then, one of my network bridges does not show up as an option for any of my Docker containers. Since 6.4 moved to Nginx for the UI, I was using another bridge for my LetsEncrypt docker to avoid having to change ports, but now br1 is not an option to map Docker containers to. Has anyone else run into this?
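     A couple of generic checks that might narrow down whether the bridge itself survived the upgrade or whether only Docker lost track of it (br1 is the bridge in question; nothing else here is specific to this box):

        # Confirm the bridge still exists at the OS level:
        ip link show br1

        # See which networks Docker is exposing to containers; if br1 is missing
        # here but present above, it is the Docker network definition that was
        # lost in the upgrade rather than the bridge itself:
        docker network ls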
  7. Thank you very much for the response. According to that command, it is in fact sdr that is causing the errors. Every other stat is reporting 0.
     btrfs dev stats /mnt/cache
     [/dev/sdr1].write_io_errs 0
     [/dev/sdr1].read_io_errs 793
     [/dev/sdr1].flush_io_errs 0
     ...
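     As a follow-up note, those counters do not reset on their own once the underlying problem is dealt with; something like the following clears them and re-verifies the pool. These are generic btrfs commands, not anything Unraid-specific.

        # Zero the per-device error counters so any new errors stand out:
        btrfs device stats -z /mnt/cache

        # Re-verify the pool; scrub reads every copy and repairs from the
        # good mirror where it can:
        btrfs scrub start /mnt/cache
        btrfs scrub status /mnt/cache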
  8. This is my first unRaid server, so I am not that familiar with file systems other than NTFS, and I am hoping someone can clear up my thinking real quick. I have a 10-drive cache pool of 300GB 10k SAS drives for a total of 1.5TB of cache inside a Supermicro SC847 chassis with 10 data disks and dual parity. The other day I ran a balance and scrub on the cache and got some unrecoverable errors, so I started looking deeper and saw that I have some BTRFS errors in the disk logs. Since the cache pool is a RAID 1 setup, I'm just a bit confused as to which disk is actually failing. I ordered 2 new disks as replacements, but I want to make sure I understand the error. When I click on the Disk Log for sdr, I get:
     Jan 4 06:04:17 FS kernel: blk_update_request: critical target error, dev sdr, sector 148276291
     Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 777, flush 0, corrupt 0, gen 0
     Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 778, flush 0, corrupt 0, gen 0
     Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 779, flush 0, corrupt 0, gen 0
     Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 780, flush 0, corrupt 0, gen 0
     Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 781, flush 0, corrupt 0, gen 0
     Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 782, flush 0, corrupt 0, gen 0
     Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 783, flush 0, corrupt 0, gen 0
     Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 784, flush 0, corrupt 0, gen 0
     Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 785, flush 0, corrupt 0, gen 0
     Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 786, flush 0, corrupt 0, gen 0
     But when I click on the Disk Log for sdq, I get:
     Jan 4 06:00:54 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 751, flush 0, corrupt 0, gen 0
     Jan 4 06:00:54 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 752, flush 0, corrupt 0, gen 0
     Jan 4 06:00:54 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 753, flush 0, corrupt 0, gen 0
     Jan 4 06:00:54 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 754, flush 0, corrupt 0, gen 0
     Jan 4 06:00:54 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 755, flush 0, corrupt 0, gen 0
     Jan 4 06:00:54 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 756, flush 0, corrupt 0, gen 0
     Jan 4 06:00:54 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 757, flush 0, corrupt 0, gen 0
     Jan 4 06:00:54 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 758, flush 0, corrupt 0, gen 0
     Jan 4 06:02:32 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 759, flush 0, corrupt 0, gen 0
     Jan 4 06:02:32 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 760, flush 0, corrupt 0, gen 0
     Jan 4 06:02:32 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 761, flush 0, corrupt 0, gen 0
     Jan 4 06:02:32 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 762, flush 0, corrupt 0, gen 0
     Jan 4 06:02:32 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 763, flush 0, corrupt 0, gen 0
     Jan 4 06:02:32 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 764, flush 0, corrupt 0, gen 0
     Jan 4 06:02:32 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 765, flush 0, corrupt 0, gen 0
     Jan 4 06:02:36 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 766, flush 0, corrupt 0, gen 0
     Jan 4 06:02:37 FS kernel: BTRFS info (device sdq1): read error corrected: ino 17489478 off 9459154944 (dev /dev/sdr1 sector 147029536)
     Jan 4 06:02:42 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 767, flush 0, corrupt 0, gen 0
     Jan 4 06:02:42 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 768, flush 0, corrupt 0, gen 0
     Jan 4 06:02:42 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 769, flush 0, corrupt 0, gen 0
     Jan 4 06:02:42 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 770, flush 0, corrupt 0, gen 0
     Jan 4 06:03:59 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 771, flush 0, corrupt 0, gen 0
     Jan 4 06:03:59 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 772, flush 0, corrupt 0, gen 0
     Jan 4 06:03:59 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 773, flush 0, corrupt 0, gen 0
     Jan 4 06:03:59 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 774, flush 0, corrupt 0, gen 0
     Jan 4 06:03:59 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 775, flush 0, corrupt 0, gen 0
     Jan 4 06:03:59 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 776, flush 0, corrupt 0, gen 0
     Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 777, flush 0, corrupt 0, gen 0
     Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 778, flush 0, corrupt 0, gen 0
     Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 779, flush 0, corrupt 0, gen 0
     Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 780, flush 0, corrupt 0, gen 0
     Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 781, flush 0, corrupt 0, gen 0
     Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 782, flush 0, corrupt 0, gen 0
     Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 783, flush 0, corrupt 0, gen 0
     Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 784, flush 0, corrupt 0, gen 0
     Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 785, flush 0, corrupt 0, gen 0
     Jan 4 06:04:17 FS kernel: BTRFS error (device sdq1): bdev /dev/sdr1 errs: wr 0, rd 786, flush 0, corrupt 0, gen 0
     So is sdr going bad, or is it sdq, or are both on their way out? I ordered 2 so I can replace both, but if only 1 is causing the errors due to the mirror, I'd rather keep the second as a spare. Thanks.
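     For anyone who lands here with the same confusion: the "(device sdq1)" part of those messages just names the filesystem (btrfs reports through one member device), while the "bdev /dev/sdr1" part identifies the member whose error counters are actually climbing, so in this log it is sdr that is failing. A rough sketch of how that can be confirmed and the bad member swapped while the pool stays mounted is below; the device names are placeholders, and this is the generic btrfs route rather than the Unraid GUI flow.

        # Per-device error counters; only the member with non-zero read/write/
        # corruption counts is the suspect:
        btrfs device stats /mnt/cache

        # Map the member devices to the filesystem:
        btrfs filesystem show /mnt/cache

        # Replace the failing member in place; the new disk (placeholder
        # /dev/sdX1 here) is rebuilt from the healthy mirror while mounted:
        btrfs replace start /dev/sdr1 /dev/sdX1 /mnt/cache
        btrfs replace status /mnt/cache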