Netbug Posted October 10, 2022 (edited)

UnRAID Version: 6.10.3
Case: Lian Li PC-A77FB
Motherboard: Asus X99-A II ATX LGA2011-3
Processor: Intel® Core™ i7-6850K
RAM: G.Skill Ripjaws V Series 32GB (4 x 8GB) DDR4-3200
Drive Cage(s): 3x Supermicro CSE-M35T-1B Hot Swap Bay
SATA Extender: Supermicro AOC-SASLP-MV8
Cache Drive: Intel® SSD 660p 512GB M.2
Boot Device: Flash JD_FireFly - 8 GB (sda)

Hard Drives:
Parity: WDC_WD80EFAX-68KNBN0_VDHBX7VD - 8 TB (sdb)
Disk 1: WDC_WD20EADS-00R6B0_WD-WCAVY5374796 - 2 TB (sdf)
Disk 2: WDC_WD30EZRX-19D8PB0_WD-WMC4N0J7D8TT - 3 TB (sdd)
Disk 3: WDC_WD20EARS-00MVWB0_WD-WCAZA9034298 - 2 TB (sde)
Disk 4: WDC_WD80EFAX-68KNBN0_VDH2MGVD - 8 TB (sdh)
Disk 5: ST4000DM005-2DP166_WGY0E53A - 4 TB (sdk)
Disk 6: ST3000DM001-1CH166_W1F23145 - 3 TB (sdm)
Disk 7: WDC_WD30EZRX-00SPEB0_WD-WCC4E0NP7NLR - 3 TB (sdl)
Disk 8: ST3000DM001-1ER166_Z5029NBR - 3 TB (sdj)
Disk 9: ST4000DM005-2DP166_WDH2S61F - 4 TB (sdg)
Disk 10: ST4000DM004-2CV104_ZFN0BNMN - 4 TB (sdi)
Disk 11: ST4000DM004-2CV104_ZFN1TJV6 - 4 TB (sdc)

So last night I was watching some content from my Emby docker when the content froze. I walked over to my desktop to access the GUI, and I was getting a timeout for the dashboard and all other docker GUIs. After some troubleshooting, a friend who was helping recommended changing the GPU. I did this and thought everything was good.

The fix seemed to work for about 12 hours, but the problem has now started again. Content is unwatchable, as it freezes every minute or so for 2-3 minutes.

What happens is that I get a timeout on the GUI in my web browser. After 2-3 minutes, it reloads and continues working properly, then locks up again. When I check the unRAID system directly through the command line (monitor and keyboard connected to the tower), there is no loss of response and no slowdown. When I run a ping test to the system on the local network, there are no timeouts at all.

I'm at a loss for where to start troubleshooting next. Please advise.

Attachment: tower-diagnostics-20221010-1851.zip

Edited October 10, 2022 by Netbug (Clarification)
MAM59 Posted October 11, 2022 (Solution)

From what I can see at first glance, there are 3 problems:

a) You have set your network to use IPv4+IPv6, but you do not have any IPv6 router in your LAN. Although this is not a real error, you can save some time wasted on failed attempts by disabling IPv6 completely in your network settings.

b) The machine tries to send you error messages but fails miserably because you have entered the wrong credentials for your Google Mail account. Without these messages it is hard to guess what is going wrong.

c) One of your disks has bad sectors:

Oct 9 22:06:38 Tower kernel: ata4.00: exception Emask 0x50 SAct 0x0 SErr 0x280900 action 0x6 frozen
Oct 9 22:06:38 Tower kernel: ata4.00: irq_stat 0x08000000, interface fatal error
Oct 9 22:06:38 Tower kernel: ata4: SError: { UnrecovData HostInt 10B8B BadCRC }
Oct 9 22:06:38 Tower kernel: ata4.00: failed command: READ DMA
Oct 9 22:06:38 Tower kernel: ata4.00: cmd c8/00:78:c8:00:00/00:00:00:00:00/e0 tag 7 dma 61440 in
Oct 9 22:06:38 Tower kernel: res 50/00:00:b7:00:00/00:00:00:00:00/e0 Emask 0x50 (ATA bus error)
Oct 9 22:06:38 Tower kernel: ata4.00: status: { DRDY }
Oct 9 22:06:38 Tower kernel: ata4: hard resetting link
Oct 9 22:06:38 Tower kernel: ata4: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Oct 9 22:06:38 Tower kernel: ata4.00: configured for UDMA/133
Oct 9 22:06:38 Tower kernel: ata4: EH complete
Oct 9 22:06:39 Tower kernel: ata4.00: exception Emask 0x50 SAct 0x0 SErr 0x280900 action 0x6 frozen
Oct 9 22:06:39 Tower kernel: ata4.00: irq_stat 0x08000000, interface fatal error
Oct 9 22:06:39 Tower kernel: ata4: SError: { UnrecovData HostInt 10B8B BadCRC }
Oct 9 22:06:39 Tower kernel: ata4.00: failed command: READ DMA
Oct 9 22:06:39 Tower kernel: ata4.00: cmd c8/00:f8:48:03:00/00:00:00:00:00/e0 tag 10 dma 126976 in
Oct 9 22:06:39 Tower kernel: res 50/00:00:47:03:00/00:00:00:00:00/e0 Emask 0x50 (ATA bus error)
Oct 9 22:06:39 Tower kernel: ata4.00: status: { DRDY }
Oct 9 22:06:39 Tower kernel: ata4: hard resetting link
Oct 9 22:06:39 Tower kernel: ata4: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Oct 9 22:06:39 Tower kernel: ata4.00: configured for UDMA/133
Oct 9 22:06:39 Tower kernel: ata4: EH complete

On a read, the machine stops, resets the drive and retries. If this happens during playback, playback will surely stop. The question is whether this is a real bad sector or just a loose/bad cable. Either way, it will slow everything down for a while (I guess ~10 s or so). This happens quite often, and afterwards the box tries to do a filesystem recovery (which will halt playback for much longer). So in the end, it is only trying to fix your broken disk and does not serve data anymore...

Watch the cabling and check the drives.
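The repeating pattern in the log excerpt above (exception, SError flags, hard link reset) can be tallied per port with a short script, which makes it easier to see whether a single port is responsible. This is only a rough sketch: the function name `summarize_ata_errors` and the regular expressions are my own, and the sample lines are copied from the excerpt rather than read from a live syslog.

```python
import re
from collections import Counter

# A few lines copied from the log excerpt in this thread.
SAMPLE_LOG = """\
Oct 9 22:06:38 Tower kernel: ata4.00: exception Emask 0x50 SAct 0x0 SErr 0x280900 action 0x6 frozen
Oct 9 22:06:38 Tower kernel: ata4: SError: { UnrecovData HostInt 10B8B BadCRC }
Oct 9 22:06:38 Tower kernel: ata4: hard resetting link
Oct 9 22:06:39 Tower kernel: ata4.00: exception Emask 0x50 SAct 0x0 SErr 0x280900 action 0x6 frozen
Oct 9 22:06:39 Tower kernel: ata4: SError: { UnrecovData HostInt 10B8B BadCRC }
Oct 9 22:06:39 Tower kernel: ata4: hard resetting link
"""

# Matches "kernel: ata4: msg" and "kernel: ata4.00: msg"; captures port and message.
PORT_RE = re.compile(r"kernel: (ata\d+)(?:\.\d+)?: (.*)")
# Captures the flag names between the braces of an SError line.
SERROR_RE = re.compile(r"SError: \{(.*)\}")


def summarize_ata_errors(log_text):
    """Count exceptions, hard resets and SError flags per ATA port."""
    exceptions = Counter()
    resets = Counter()
    flags = Counter()
    for line in log_text.splitlines():
        match = PORT_RE.search(line)
        if not match:
            continue
        port, message = match.groups()
        if message.startswith("exception Emask"):
            exceptions[port] += 1
        elif message.startswith("hard resetting link"):
            resets[port] += 1
        else:
            serror = SERROR_RE.search(message)
            if serror:
                for flag in serror.group(1).split():
                    flags[(port, flag)] += 1
    return exceptions, resets, flags


exceptions, resets, flags = summarize_ata_errors(SAMPLE_LOG)
for port in sorted(exceptions):
    print(port, "exceptions:", exceptions[port], "hard resets:", resets[port])
```

To run it against a real system, the text of the syslog from the diagnostics zip could be passed to `summarize_ata_errors` instead of `SAMPLE_LOG`.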
JorgeB Posted October 11, 2022

1 hour ago, MAM59 said: UnrecovData HostInt 10B8B BadCRC

This is usually a bad SATA cable.
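For anyone wondering where those flag names come from: they are the decoded bits of the `SErr 0x280900` value printed on the exception line. The sketch below decodes only the four flags seen in this log. The bit positions (8, 11, 19, 21) are my assumption based on the Linux libata SError definitions, and they are at least consistent with the logged value; the helper `decode_serr` is hypothetical, not a kernel or Unraid tool.

```python
# Bit position -> flag name, limited to the four flags that appear in this log.
# Assumption: positions follow the Linux libata SError bit definitions.
SERR_FLAGS = {
    8: "UnrecovData",   # unrecovered data error
    11: "HostInt",      # host internal error
    19: "10B8B",        # 10b-to-8b decode error on the link
    21: "BadCRC",       # CRC error on the link
}


def decode_serr(value):
    """Return (known flag names, leftover bits not covered by SERR_FLAGS)."""
    names = [name for bit, name in sorted(SERR_FLAGS.items()) if value & (1 << bit)]
    known_mask = sum(1 << bit for bit in SERR_FLAGS)
    return names, value & ~known_mask


names, leftover = decode_serr(0x280900)
print(names, hex(leftover))
```

The 10B8B and BadCRC flags describe errors on the link between controller and drive, which fits the cable explanation better than a bad sector on the platter.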
Netbug Posted October 11, 2022 (Author)

2 hours ago, MAM59 said: [...] Watch the cabling and check the drives.

Thank you for the reply.

I have fixed A (switched to IPv4 only) and B (corrected email configuration).

For C, forgive my ignorance, but how do I identify which disk has the problematic cable (or possibly drive failure)? I do not see any identifying information in the section of the log that you pasted.
Netbug Posted October 11, 2022 (Author)

1 hour ago, JorgeB said: This is usually a bad SATA cable.

Thanks. How do I identify which drive that cable is feeding?
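One possible way to approach the question above, offered as a sketch rather than a confirmed answer for this system: on Linux, the resolved sysfs path of a block device on an onboard SATA port usually contains an `ataN` component, so the `ata4` from the log can be matched to an `sdX` name. The helper names below are hypothetical, the path in the example call is illustrative and not taken from this server, and drives behind the Supermicro AOC-SASLP-MV8 may not expose an `ataN` component at all.

```python
import os
import re

# An "ataN" path component, e.g. ".../ata4/host3/...".
ATA_RE = re.compile(r"/(ata\d+)/")


def ata_port_from_sysfs_path(path):
    """Return the 'ataN' component of a resolved sysfs path, or None."""
    match = ATA_RE.search(path)
    return match.group(1) if match else None


def map_ata_ports(sys_block="/sys/block"):
    """Map 'ataN' -> list of sdX names by resolving the /sys/block symlinks."""
    mapping = {}
    if not os.path.isdir(sys_block):
        return mapping  # not a Linux system, or sysfs is not mounted
    for name in sorted(os.listdir(sys_block)):
        if not name.startswith("sd"):
            continue
        resolved = os.path.realpath(os.path.join(sys_block, name))
        port = ata_port_from_sysfs_path(resolved)
        if port:
            mapping.setdefault(port, []).append(name)
    return mapping


# Illustrative path only (hypothetical PCI address and device name).
example = "/sys/devices/pci0000:00/0000:00:1f.2/ata4/host3/target3:0:0/3:0:0:0/block/sdx"
print(ata_port_from_sysfs_path(example))
print(map_ata_ports())
```

Once the `sdX` name is known, it can be compared with the device names in the drive list from the first post (the letters can change between boots, so check the current assignment), and the model and serial can be cross-checked against the symlinks in /dev/disk/by-id before pulling any cable.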