Controller issues / drive disappeared



Thanks. So the issue is not disk 13 itself. We are getting closer to narrowing this down. Let me switch only the power cable. That way, we will know whether the problem is the power cable or the SATA cable.

 

It could be that disks 12 and 13 share the same power cable. I need to check this later.

 

The PSU per se is highly unlikely to be the issue. I have changed it twice before. The current one is quite new, over-powered, and a good unit.

Link to comment

Disk 12 is the real issue. Disk 13 never gets disabled, disappears, or is ejected. However, the errors from disk 13 seem to trigger disk 12 being disabled.

 

It is 100% consistent and reproducible. Disk 12 always gets disabled. Disk 13 only shows errors in the diagnostics (not in the GUI). After switching ports, parity shows errors in the log (not in the GUI), and yet again this leads to disk 12 being disabled.

 

I still believe it is related to the RAID card: either overheating, or simply more HD capacity attached than the card can handle in a stable manner.

 

I will explore further, though, to rule out all options. I am now adding a second RAID card and will then move four disks from the on-board controller over to it. This would address the concern of a potentially faulty port.

Link to comment

Disk13 isn't showing any problems currently, only parity, which is now using the same SATA port on the onboard controller that disk13 was on. Disk12 still has the same problems, and it is on a different controller than parity, so that is likely unrelated, though I don't remember if you already swapped disk12 around.

Link to comment

The parity and disk 13 errors must somehow be related. It's always the same sequence: one disk triggers errors (the Unraid GUI looks fine), followed by disk 12 showing errors (and disk 12 being disabled).

 

It seems, though, that I did something wrong: I wanted to move parity away from the on-board controller, which may have a faulty port, and must have gotten that wrong. Let me check and correct this.

Link to comment
1 minute ago, johnnie.black said:

Not yet, it's still rebuilding without errors on these latest diags.

Oh... I am getting paranoid about seeing errors in the GUI... I checked again and it is indeed not showing errors now. No clue what I missed. Let's wait and see whether things are OK now. I have now moved four disks to a second RAID card. I had planned for those four to come from the onboard controller, but I may have made a mistake and moved them from one RAID card to the other instead.

Link to comment

There is no point in continuing a rebuild with read errors on another disk, since it will be rebuilding garbage. You need to start swapping things around to rule them out: cables, controllers, PSU, etc. Also, I might not have the time to keep checking new diags multiple times a day, so check the syslog yourself. If there are read errors like these, there is still a problem, and the log tells you which disk it is:

 

Aug 30 17:31:49 Tower kernel: md: disk13 read error, sector=2999712512
Aug 30 17:31:49 Tower kernel: md: disk13 read error, sector=2999712520
Aug 30 17:31:49 Tower kernel: md: disk13 read error, sector=2999712528
Aug 30 17:31:49 Tower kernel: md: disk13 read error, sector=2999712536
Aug 30 17:31:49 Tower kernel: md: disk13 read error, sector=2999712544
Aug 30 17:31:49 Tower kernel: md: disk13 read error, sector=2999712552
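
If checking the syslog by eye gets tedious, something like the following might help. It is only a rough sketch: it assumes the read-error lines always look like the ones quoted above, and the function names (`count_read_errors`, `summarize`) are made up for illustration, not part of Unraid. The default path `/var/log/syslog` is also an assumption; point it at the syslog file from your diagnostics if it lives elsewhere.

```python
import re
from collections import Counter

# Matches lines of the form quoted above, e.g.
# "... kernel: md: disk13 read error, sector=2999712512"
READ_ERROR = re.compile(r"kernel: md: (disk\d+) read error, sector=(\d+)")


def count_read_errors(lines):
    """Count md read-error lines per disk in an iterable of syslog lines."""
    counts = Counter()
    for line in lines:
        match = READ_ERROR.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def summarize(path="/var/log/syslog"):
    """Print one line per disk that has read errors (path is an assumption)."""
    with open(path, errors="replace") as handle:
        counts = count_read_errors(handle)
    for disk, total in counts.most_common():
        print(f"{disk}: {total} read errors")
    return counts
```

Calling `summarize()` after each cable or controller swap should show whether the errors stay with the same disk or follow the port. An empty result only means no lines matched this one pattern, not that the array is healthy.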

 

 

Link to comment
