TheTick077

Hard Drive Disabled - Help!

So I'm new to Unraid, and I built a system (AMD-based B450 setup; if you need the details I can provide them). This is now the 3rd time I have had errors and a hard drive disabled. The first time, I rebuilt and moved on after a clean SMART test. The 2nd time, I went to the beta version of Unraid because I read that there were some issues with AMD boards that were resolved in the beta. Now this time, I am wondering if there is something else going on... I saved off the logs from the last two times, but here is where I think things begin to get screwy (full logs also attached):

Jul 19 10:18:42 Tower kernel: ahci 0000:01:00.1: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x00000000d37ee000 flags=0x0000]
Jul 19 10:18:42 Tower kernel: ata6.00: exception Emask 0x10 SAct 0xffe000 SErr 0x0 action 0x6 frozen
Jul 19 10:18:42 Tower kernel: ata6.00: irq_stat 0x08000000, interface fatal error
Jul 19 10:18:42 Tower kernel: ata6.00: failed command: WRITE FPDMA QUEUED
Jul 19 10:18:42 Tower kernel: ata6.00: cmd 61/08:68:c0:e1:bf/00:00:00:00:00/40 tag 13 ncq dma 4096 out
Jul 19 10:18:42 Tower kernel:         res 40/00:b8:f0:b6:9f/00:00:02:00:00/40 Emask 0x10 (ATA bus error)
Jul 19 10:18:42 Tower kernel: ata6.00: status: { DRDY }

And then something similar today:

Jul 30 14:05:54 Tower kernel: ahci 0000:01:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000e address=0xfecfe000 flags=0x0000]
Jul 30 14:05:54 Tower kernel: ata6.00: exception Emask 0x10 SAct 0x1ffff0 SErr 0x0 action 0x6 frozen
Jul 30 14:05:54 Tower kernel: ata6.00: irq_stat 0x08000000, interface fatal error
Jul 30 14:05:54 Tower kernel: ata6.00: failed command: WRITE FPDMA QUEUED
Jul 30 14:05:54 Tower kernel: ata6.00: cmd 61/08:20:60:ae:5f/00:00:00:00:00/40 tag 4 ncq dma 4096 out
Jul 30 14:05:54 Tower kernel:         res 40/00:40:b8:78:65/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
Jul 30 14:05:54 Tower kernel: ata6.00: status: { DRDY }

Is my hardware failing? Is there something else going on? Not sure what I need to do, but I can't have this drive fail every 10 days... 
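
For anyone wanting to see how often these events show up across a whole syslog, here is a minimal Python sketch that counts `exception` lines per ATA port and `IO_PAGE_FAULT` lines per day. The regular expressions are modeled only on the excerpts above, and the function names (`summarize`, `summarize_file`) are made up for illustration, so treat it as a rough starting point rather than a complete parser.

```python
import re
from collections import Counter

# Patterns modeled on the kernel lines quoted in this post, e.g.
#   "Jul 19 10:18:42 Tower kernel: ata6.00: exception Emask 0x10 ..."
ATA_EXCEPTION = re.compile(
    r"^(?P<date>\w{3}\s+\d+) [\d:]+ \S+ kernel: "
    r"(?P<port>ata\d+)(?:\.\d+)?: exception "
)
IOMMU_FAULT = re.compile(
    r"^(?P<date>\w{3}\s+\d+) [\d:]+ \S+ kernel: .*IO_PAGE_FAULT"
)


def _day(raw):
    # Collapse the padding syslog uses for single-digit days ("Jul  1").
    return " ".join(raw.split())


def summarize(lines):
    """Count ATA exceptions per (day, port) and IOMMU page faults per day."""
    ata = Counter()
    faults = Counter()
    for line in lines:
        m = ATA_EXCEPTION.match(line)
        if m:
            ata[(_day(m.group("date")), m.group("port"))] += 1
            continue
        m = IOMMU_FAULT.match(line)
        if m:
            faults[_day(m.group("date"))] += 1
    return ata, faults


def summarize_file(path):
    """Convenience wrapper: summarize a syslog file on disk."""
    with open(path, errors="replace") as fh:
        return summarize(fh)
```

Calling `summarize_file` on an extracted syslog returns two counters; if the exceptions always land on the same port right after a page fault, that points at the controller path rather than at the disk itself.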

tower-syslog-20200731-1645.zip tower-syslog-20200720-1401.zip

Instead of syslog, please post diagnostics. Diagnostics includes syslog, SMART for all attached disks, and many other things that give a more complete understanding of your configuration and situation.

 

Go to Tools - Diagnostics and attach the complete Diagnostics ZIP file to your NEXT post in this thread.

Like @trurl mentioned, you should always post the complete diags, but this is a rather common problem with the onboard SATA controller on Ryzen boards. Usually using the latest beta helps, but I see you're already using it, so look for a bios update or disable IOMMU if not needed.
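
Before changing anything IOMMU-related, it can help to see which IOMMU options the running kernel was actually booted with. A minimal Python sketch follows, assuming a standard Linux `/proc/cmdline`; the helper names (`iommu_params`, `read_cmdline`) are made up for illustration. It only reports kernel command-line options and says nothing about what is set in the BIOS.

```python
# Prefixes of the IOMMU-related kernel command-line options.
IOMMU_PREFIXES = ("iommu=", "amd_iommu=", "intel_iommu=")


def iommu_params(cmdline):
    """Return the IOMMU-related tokens found in a kernel command line."""
    return [tok for tok in cmdline.split() if tok.startswith(IOMMU_PREFIXES)]


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with."""
    with open(path) as fh:
        return fh.read().strip()
```

On the server itself, `iommu_params(read_cmdline())` returns an empty list when no such option was passed, which means the kernel defaults are in effect.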

Sorry! I rebooted the machine and reran the SMART test right before reading your post. So hopefully that didn't mess anything up.

 

And I don't want to disable IOMMU since I use my windows VM as a "bare metal" daily driver and pass one of the video cards to it.

tower-diagnostics-20200731-1313.zip

Edited by TheTick077

1 hour ago, johnnie.black said:

look for a bios update

or wait for a newer Unraid release, with a newer kernel, not much more you can do, other than getting a different board.

2 hours ago, johnnie.black said:

or wait for a newer Unraid release, with a newer kernel, not much more you can do, other than getting a different board.

Well, that's not very helpful (not saying you aren't helpful; I understand there are situations outside of your control). I wish I had known this before I purchased my hardware and license. I don't see it mentioned anywhere that Ryzen boards aren't supported. So there is basically nothing I can do if my BIOS is up to date and I'm on the beta release? I'm just SOL and will have to rebuild my drive every ~10 days?

 

Ok, just checked, and there is a new BIOS available for my mobo, but I'm honestly doubtful it will help at all right now. Should I still update it?

https://www.gigabyte.com/us/Motherboard/B450-AORUS-ELITE-rev-10/support#support-dl-bios

Edited by TheTick077

Thinking about it a little more, I have a couple of questions. It is always the same drive. Should I try a different port? A different SATA cable? Could either of those things possibly solve the issue?
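
When swapping ports or cables, it helps to know which disk sits behind the `ata6` label in the kernel messages. A minimal Python sketch, assuming the usual Linux sysfs layout in which the resolved path of `/sys/block/<disk>` contains an `ataN` component for SATA disks; the function names are made up for illustration and the layout may differ on other setups.

```python
import os
import re

# An "ataN" path component identifies the libata port a disk hangs off.
ATA_IN_PATH = re.compile(r"/(ata\d+)/")


def ata_port_from_sysfs_path(resolved_path):
    """Extract the ataN component from a resolved sysfs path, or None."""
    m = ATA_IN_PATH.search(resolved_path)
    return m.group(1) if m else None


def map_disks_to_ports(sys_block="/sys/block"):
    """Map block device names (sda, sdb, ...) to their ATA port labels."""
    mapping = {}
    for name in sorted(os.listdir(sys_block)):
        resolved = os.path.realpath(os.path.join(sys_block, name))
        port = ata_port_from_sysfs_path(resolved)
        if port:
            mapping[name] = port
    return mapping
```

Running `map_disks_to_ports()` before and after moving the cable shows whether the drive really changed port. If the errors follow the drive to a new `ataN`, the drive or cable is the more likely suspect; if they stay on the old port, the controller is.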

There are many using Unraid with Ryzen. Don't know if anything here will help you or not:

 

Trying other ports, cables, might help.

 

 

1 minute ago, trurl said:

There are many using Unraid with Ryzen. Don't know if anything here will help you or not:

 

Trying other ports, cables, might help.

 

 

Thanks, I'll read through that and give them a try. But my server isn't locking up, it is just getting those IRQ errors.

10 hours ago, TheTick077 said:

I don't see it mentioned anywhere that Ryzen boards aren't supported.

They are supported, but they tend to have more issues with Linux in general, not just Unraid. Like mentioned, and AFAIK, these controller issues mostly happened with v6.8; everyone that updated to v6.9 reported no more errors, due to the newer Linux kernel. You should also update the bios, since it's not the latest.


Well, I updated the BIOS, followed that thread's suggestions, and changed the port and cable on the offending drive. So let's hope that does it. If not, I'm not sure what else to try. Thanks for the help/suggestions!
