blk_update_request & crashing issues



I have also been having this same problem since I upgraded to 6.9.2. I have been fighting this issue for over 2 months now and can't find anything that helps. As far as I can tell, the problem only affects my 8TB Seagate IronWolf Pro drives, but those are the largest drives I have, so they serve as parity and are also the data drives currently being written to. I have had the issue happen on both data and parity drives. When it happens on a data drive, the server locks up and I am unable to get logs. From what I can tell, it happens any time the cache wants to sync or the mover moves files from cache to a drive. I am at my wits' end at this point and ready to go back to Windows, as I can't keep my Unraid server up for more than 5 hours before I start losing drives. After the issue happens, I can restart the server, then remove, start, stop, and re-add the drive, and then rebuild parity with no issues. Once parity is rebuilt and I start to turn dockers and other items back on, as soon as the cache wants to sync or write to a drive I lose at least one parity drive, if not the data drive it is trying to write to. All SMART tests come back good on the drives that are seeing the issue.

 

Current config

Chassis: Supermicro CSE-836BE16

Gigabyte X570 Aorus Elite WiFi

AMD Ryzen 5 3600

4 x 8GB DDR4-3200 memory

IBM M1015 (9220-8i) HBA flashed to IT mode

Nvidia Quadro P2000

Cache: 2 x 256GB Samsung Evo SSDs

Parity: 2 x 8TB Seagate IronWolf Pro

Data: mix of Seagate desktop and NAS drives in 4, 5, and 6TB

 

 

Below are all of the steps that I have completed to date to try to correct this issue, without any success at all.

Removed the 8TB data drive from the array that started with the issue and rebuilt parity

Moved from 2 parity drives to a single parity drive

Swapped the motherboard (X370), processor (Ryzen 2600g), and memory (DDR4 2600) with parts from another, older AMD system that was not in use but is in good condition.

Removed the existing setup and ran stress tests on the hardware to confirm no issues

Removed the Nvidia card in both the current and swapped systems

Ran the HBA card in the 16x PCIe slot instead of the 4x PCIe slot

Swapped SATA cables between cache drives and MB

Swapped the HBA card for a backup card, also a 9220-8i

Reseated and Swapped ports for the HBA cables on the backplane

Removed all plugins and dockers and ran a bare system for a week on both current and swapped hardware

Moved drives around the backplane but the issue follows the drive and not the backplane port. 

The power supplies are redundant, and I have swapped between them without any effect on the issue. The backup PSU was not in use and not plugged in when the issue started.

 

The last piece of hardware that I have not replaced is the backplane. A replacement has been ordered along with new HBA cables and will be here Saturday, but I find it hard to believe that is the issue. Usually when a backplane goes bad, it either affects all the drives or the fault stays with a specific port or ports on the backplane itself; it doesn't follow the drives when they are moved around the backplane.

 

I have attached the logs from my latest failure. Any help will be greatly appreciated, as I don't know what else to do other than go back to Windows, and frankly I hate that idea.

 

Thanks


tower-diagnostics-20210825-2131.zip
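For anyone going through a syslog with this error, here is a minimal Python sketch for tallying `blk_update_request` messages per device, which can help confirm whether the errors really follow specific drives. The message layout the regex expects and the sample lines are assumptions for illustration; they are not taken from the attached diagnostics, and the exact wording varies between kernel versions.

```python
import re
from collections import Counter

# Assumed message shape: "... blk_update_request: <text> dev <name>, ..."
# Adjust the pattern if your kernel words the message differently.
PATTERN = re.compile(r"blk_update_request: .*?\bdev (\w+)")

def count_errors_by_device(lines):
    """Count blk_update_request messages per block device name."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# Hypothetical sample lines, written only to exercise the function.
sample = [
    "kernel: blk_update_request: I/O error, dev sdb, sector 100 op 0x1:(WRITE)",
    "kernel: blk_update_request: I/O error, dev sdb, sector 108 op 0x1:(WRITE)",
    "kernel: blk_update_request: I/O error, dev sdc, sector 200 op 0x0:(READ)",
    "kernel: some unrelated message",
]

for device, count in count_errors_by_device(sample).most_common():
    print(device, count)
```

To run it against a real log, pass the open file object instead of `sample`, since the function only needs an iterable of lines.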


Thank you JorgeB for the info. Your instructions are great and the changes seemed to take as expected. I have completed these steps and am in the process of rebuilding both of my 8TB parity drives. I will follow up in a day or two once the rebuild is complete and I have some time to let the drives spin down and then attempt a cache move or sync. I really appreciate the help. 

 

On another note, I'm curious whether people are seeing more issues on AMD systems with 6.9.x. I ran really solid with no issues before, but since I upgraded to 6.9.2 I have had a ton of crashes. I swapped back to an X370 board with a 2600g processor and things seem to run well. Just something I noticed in all of this.

 

Thanks


@JorgeB I completed the rebuild of both 8TB parity drives and did an array shutdown without losing a parity drive. That was guaranteed to trigger the problem before, so it is promising. I don't want to call it good yet, as I have not actually put any load on the system with reading and writing at intervals. Thank you again for the help; I don't know if I would have found that without you.

 

@ChatNoir I found that last night while I was working on completing the steps that JorgeB outlined. Most of that didn't apply to my server hardware but I will go back through all of that and see what I can adjust out. Thank you for posting that and trying to help get some stability back in my world

 

@zenmak Sorry, I kind of hijacked your post. Everyone, please see if you can help zenmak with his issue, as it doesn't seem to be related to mine; we were both just getting the same error.

 

Thanks, everyone, for the great community!

 

 

8 hours ago, Snack_Ears said:

@ChatNoir I found that last night while I was working on completing the steps that JorgeB outlined. Most of that didn't apply to my server hardware but I will go back through all of that and see what I can adjust out. Thank you for posting that and trying to help get some stability back in my world

Well, in your first diag you were running four single-rank DIMMs at 3200MT/s on a Ryzen 3600 that is only spec'd to 2933 in this configuration.

And then you complained about crashes with Ryzen, so it may or may not be your issue; but for a server, running outside of the memory controller specs doesn't seem a great idea in my opinion.
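To put a number on that, here is a small Python sketch that works out how far the configured speed sits above the rated one. It only uses the two figures mentioned above (3200MT/s configured, 2933MT/s rated for four single-rank DIMMs); the function name is made up for illustration, and limits for other DIMM layouts are deliberately left out.

```python
def overclock_margin(configured_mts, rated_mts):
    """Return how far, in percent, the configured speed exceeds the rated speed."""
    return (configured_mts - rated_mts) / rated_mts * 100.0

configured = 3200  # MT/s, as seen in the first diagnostics
rated = 2933       # MT/s, the figure quoted above for four single-rank DIMMs

margin = overclock_margin(configured, rated)
if margin > 0:
    print(f"Memory is running {margin:.1f}% above the rated speed")
else:
    print("Memory is within the rated speed")
```

A positive margin doesn't prove the memory is the cause of the crashes, but it is an easy thing to rule out by dropping the speed to the rated value.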
