Network Timeouts


Netbug
Solved by MAM59

UnRAID Version: 6.10.3

 

  • Case: Lian Li PC-A77FB
  • Motherboard: Asus X99-A II ATX LGA2011-3 Motherboard
  • Processor: Intel® Core™ i7-6850K
  • RAM: G.Skill Ripjaws V Series 32GB (4 x 8GB) DDR4-3200 Memory
  • Drive Cage(s): 3x Supermicro CSE-M35T-1B Hot Swap Bay
  • SATA Extender: Supermicro AOC-SASLP-MV8 SATA Extender
  • Cache Drive: Intel® SSD 660p 512GB M.2
  • Boot Device: Flash JD_FireFly - 8 GB (sda)
  • Hard Drives:
      • Parity: WDC_WD80EFAX-68KNBN0_VDHBX7VD - 8 TB (sdb)
      • Disk 1: WDC_WD20EADS-00R6B0_WD-WCAVY5374796 - 2 TB (sdf)
      • Disk 2: WDC_WD30EZRX-19D8PB0_WD-WMC4N0J7D8TT - 3 TB (sdd)
      • Disk 3: WDC_WD20EARS-00MVWB0_WD-WCAZA9034298 - 2 TB (sde)
      • Disk 4: WDC_WD80EFAX-68KNBN0_VDH2MGVD - 8 TB (sdh)
      • Disk 5: ST4000DM005-2DP166_WGY0E53A - 4 TB (sdk)
      • Disk 6: ST3000DM001-1CH166_W1F23145 - 3 TB (sdm)
      • Disk 7: WDC_WD30EZRX-00SPEB0_WD-WCC4E0NP7NLR - 3 TB (sdl)
      • Disk 8: ST3000DM001-1ER166_Z5029NBR - 3 TB (sdj)
      • Disk 9: ST4000DM005-2DP166_WDH2S61F - 4 TB (sdg)
      • Disk 10: ST4000DM004-2CV104_ZFN0BNMN - 4 TB (sdi)
      • Disk 11: ST4000DM004-2CV104_ZFN1TJV6 - 4 TB (sdc)

 

Last night I was watching content from my Emby docker when playback froze. I went to my desktop to open the GUI, but the dashboard and all the other docker GUIs timed out. After some troubleshooting, a friend who was helping recommended changing the GPU. I did that and thought everything was fine. The fix seemed to hold for about 12 hours, but the problem has now started again. Content is unwatchable because it freezes every minute or so for 2-3 minutes.

What happens is that the GUI times out in my web browser. After 2-3 minutes it reloads and works properly, then locks up again. When I check the unRAID system directly at the command line (monitor and keyboard connected to the tower), there is no loss of response and no slowdown.

When I ping the system from the local network, there are no timeouts at all.

I'm at a loss for where to look next.

Please advise.

tower-diagnostics-20221010-1851.zip

  • Solution

At first glance I can see three problems:

a) You have set your network to use IPv4+IPv6, but there is no IPv6 router on your LAN. This is not a real error, but you can avoid time wasted on failed attempts by disabling IPv6 completely in your network settings.

b) The machine tries to send you error messages but fails miserably because you have entered the wrong credentials for your Google Mail account. Without these messages it is hard to guess what is going wrong.

c) One of your disks is throwing read errors (possibly bad sectors):

Oct  9 22:06:38 Tower kernel: ata4.00: exception Emask 0x50 SAct 0x0 SErr 0x280900 action 0x6 frozen
Oct  9 22:06:38 Tower kernel: ata4.00: irq_stat 0x08000000, interface fatal error
Oct  9 22:06:38 Tower kernel: ata4: SError: { UnrecovData HostInt 10B8B BadCRC }
Oct  9 22:06:38 Tower kernel: ata4.00: failed command: READ DMA
Oct  9 22:06:38 Tower kernel: ata4.00: cmd c8/00:78:c8:00:00/00:00:00:00:00/e0 tag 7 dma 61440 in
Oct  9 22:06:38 Tower kernel:         res 50/00:00:b7:00:00/00:00:00:00:00/e0 Emask 0x50 (ATA bus error)
Oct  9 22:06:38 Tower kernel: ata4.00: status: { DRDY }
Oct  9 22:06:38 Tower kernel: ata4: hard resetting link
Oct  9 22:06:38 Tower kernel: ata4: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Oct  9 22:06:38 Tower kernel: ata4.00: configured for UDMA/133
Oct  9 22:06:38 Tower kernel: ata4: EH complete
Oct  9 22:06:39 Tower kernel: ata4.00: exception Emask 0x50 SAct 0x0 SErr 0x280900 action 0x6 frozen
Oct  9 22:06:39 Tower kernel: ata4.00: irq_stat 0x08000000, interface fatal error
Oct  9 22:06:39 Tower kernel: ata4: SError: { UnrecovData HostInt 10B8B BadCRC }
Oct  9 22:06:39 Tower kernel: ata4.00: failed command: READ DMA
Oct  9 22:06:39 Tower kernel: ata4.00: cmd c8/00:f8:48:03:00/00:00:00:00:00/e0 tag 10 dma 126976 in
Oct  9 22:06:39 Tower kernel:         res 50/00:00:47:03:00/00:00:00:00:00/e0 Emask 0x50 (ATA bus error)
Oct  9 22:06:39 Tower kernel: ata4.00: status: { DRDY }
Oct  9 22:06:39 Tower kernel: ata4: hard resetting link
Oct  9 22:06:39 Tower kernel: ata4: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Oct  9 22:06:39 Tower kernel: ata4.00: configured for UDMA/133
Oct  9 22:06:39 Tower kernel: ata4: EH complete

 

On a read, the machine stalls, resets the drive's link and retries. If this happens during playback, playback will surely stop.

The question is whether this is a real bad sector or just a loose/bad cable. Either way, it will slow everything down for a while (I guess ~10s or so).

This happens quite often, and afterwards the box tries to do a filesystem recovery (which halts playback for much longer).

So in the end, the box is busy trying to recover from your broken disk and is no longer serving data...
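
If you want to see which port is affected and how often, a few lines of Python can tally the log excerpt. This is only a rough sketch: the function name `summarize_ata_errors` and the idea of feeding it syslog lines are my own illustration, not an Unraid tool, and the regexes assume the kernel message format shown in the excerpt above.

```python
import re
from collections import Counter

# Matches libata kernel messages such as
# "... kernel: ata4.00: exception Emask ..." or "... kernel: ata4: hard resetting link".
ATA_LINE = re.compile(r"kernel: (ata\d+)(?:\.\d+)?: (.*)")
# Pulls the flag list out of "SError: { UnrecovData HostInt 10B8B BadCRC }".
SERROR = re.compile(r"SError: \{ (.*?) \}")


def summarize_ata_errors(lines):
    """Count exceptions and hard link resets per ATA port and collect SError flags."""
    exceptions = Counter()
    resets = Counter()
    flags = {}
    for line in lines:
        match = ATA_LINE.search(line)
        if not match:
            continue  # not a libata message (or a continuation line such as "res ...")
        port, message = match.groups()
        if message.startswith("exception"):
            exceptions[port] += 1
        elif message.startswith("hard resetting link"):
            resets[port] += 1
        serror = SERROR.search(message)
        if serror:
            flags.setdefault(port, set()).update(serror.group(1).split())
    return exceptions, resets, flags


# A few lines copied from the excerpt above.
sample = [
    "Oct  9 22:06:38 Tower kernel: ata4.00: exception Emask 0x50 SAct 0x0 SErr 0x280900 action 0x6 frozen",
    "Oct  9 22:06:38 Tower kernel: ata4: SError: { UnrecovData HostInt 10B8B BadCRC }",
    "Oct  9 22:06:38 Tower kernel: ata4: hard resetting link",
    "Oct  9 22:06:39 Tower kernel: ata4.00: exception Emask 0x50 SAct 0x0 SErr 0x280900 action 0x6 frozen",
]

exceptions, resets, flags = summarize_ata_errors(sample)
for port in sorted(exceptions):
    print(port, "exceptions:", exceptions[port], "resets:", resets[port],
          "flags:", sorted(flags.get(port, set())))
```

On these sample lines ata4 is the only port with exceptions. A BadCRC flag in SError is often associated with signal problems on the link (cable, connector, backplane), though it does not rule out a failing drive.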

 

Check the cabling and the drives.

 

 

Thank you for the reply.

 

I have fixed A (switched to IPv4 only) and B (corrected email configuration).

 

For C, forgive my ignorance, but how do I identify which disk has the problematic cable (or possibly drive failure)? I do not see any identifying information in the section of the log that you pasted.
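
The excerpt only names the port (ata4), so what is needed is a mapping from ATA ports to device names. Below is a rough sketch of one way to get it, assuming the usual Linux sysfs layout in which each /sys/block/sdX entry resolves to a path that contains the ataN port. The helper names and the example path in the comments are hypothetical, and drives attached through some HBAs may not show an ataN component at all, in which case the port comes back as None.

```python
import glob
import os
import re
from typing import Dict, Optional

# Looks for a path component such as ".../ata4/..." in a resolved sysfs path.
ATA_COMPONENT = re.compile(r"/(ata\d+)/")


def ata_port_from_sysfs_path(path: str) -> Optional[str]:
    """Return the ataN component of a resolved sysfs device path, or None.

    Illustrative (made-up) input:
    /sys/devices/pci0000:00/0000:00:1f.2/ata4/host3/target3:0:0/3:0:0:0/block/sdd
    """
    match = ATA_COMPONENT.search(path)
    return match.group(1) if match else None


def map_disks_to_ata_ports(sys_block: str = "/sys/block") -> Dict[str, Optional[str]]:
    """Map each sdX entry under sys_block to its ATA port (None if not found)."""
    mapping = {}
    for entry in sorted(glob.glob(os.path.join(sys_block, "sd*"))):
        # /sys/block/sdX is a symlink; the resolved path shows the controller chain.
        resolved = os.path.realpath(entry)
        mapping[os.path.basename(entry)] = ata_port_from_sysfs_path(resolved)
    return mapping


if __name__ == "__main__":
    for disk, port in map_disks_to_ata_ports().items():
        print(disk, "->", port or "no ataN component in sysfs path")
```

Whichever sdX comes back as ata4 can then be matched against the disk list in the first post.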
