Read errors on parity rebuild


DevXen


I've had read errors multiple times; in my experience it's always been due to those fragile SAS breakout cables going bad. I had my server wall-mounted, and the drive trays would occasionally slip down, pinching and chafing the data cables. You will get some better advice here soon enough.

Have you run a SMART report on the disk?
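
If it helps, here is a rough Python sketch for pulling the handful of SMART attributes that tend to matter when choosing between "bad disk" and "bad cable". It assumes smartmontools is installed, that `smartctl -A` prints the usual ATA-style attribute table for the drive, and that the script runs as root. The list of watched attributes is my own pick, and drives sitting behind a RAID/HBA controller may need extra smartctl options that are not shown here. A climbing `UDMA_CRC_Error_Count` is commonly read as a link or cabling problem rather than a failing disk.

```python
import subprocess
import sys

# Attributes worth a first look; this selection is an assumption of the sketch.
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "UDMA_CRC_Error_Count")


def parse_smart_attributes(report):
    """Return {attribute_name: raw_value} from `smartctl -A` text output."""
    attrs = {}
    for line in report.splitlines():
        parts = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(parts) >= 10 and parts[0].isdigit():
            try:
                attrs[parts[1]] = int(parts[9])
            except ValueError:
                continue  # raw value is not a plain integer; skip it
    return attrs


def read_smart_attributes(device):
    """Run smartctl against a device such as /dev/sdb and parse its table."""
    result = subprocess.run(
        ["smartctl", "-A", device], capture_output=True, text=True, check=False
    )
    return parse_smart_attributes(result.stdout)


if __name__ == "__main__":
    attrs = read_smart_attributes(sys.argv[1])
    for name in WATCHED:
        print(name, attrs.get(name, "not reported"))
```

Run it with the device path as the only argument. Anything other than zero for the first two attributes points at the disk itself, while a growing CRC count is more consistent with the cable theory.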

11 minutes ago, Sardine8207 said:

I've had read errors multiple times; in my experience it's always been due to those fragile SAS breakout cables going bad. I had my server wall-mounted, and the drive trays would occasionally slip down, pinching and chafing the data cables. You will get some better advice here soon enough.

Have you run a SMART report on the disk?

 

No. I tried with the first drives that had write errors, but the SMART tests got stuck at 100% and would never finish. I did see that the diagnostics include the SMART info for each drive, though, so that's good.

 

Here's a little backstory. (I didn't put it here because I already made a separate post about it, but here it is.)

On Saturday I had 2 drives disabled for write errors. I swapped them out and rebuilt the data on each replacement separately. I took one out and ran an extensive chkdsk on it, looking for bad sectors; it didn't find any. The second one still has about 6 days to go on its check, but so far no errors.

And then today I woke up to 2 drives disabled due to read errors: one brand-new drive that I installed on Saturday, and a different drive in the array.

About 3 hours into the rebuild on one of the drives, I got 2 more drives with read errors.

 

So at that point I turned my server off, thinking it's the HBA controller card or the cables. But it would be strange for multiple cables to die at the same time. They are really thick cables. Here's a pic.

20231205_161530.jpg


Looks like the same issue as in your other thread, controller problems:

 

Dec  5 18:15:05 MediaXen kernel: aacraid: Host adapter abort request.
Dec  5 18:15:05 MediaXen kernel: aacraid: Outstanding commands on (4,1,21,0):
Dec  5 18:15:05 MediaXen kernel: aacraid: Host bus reset request. SCSI hang ?
Dec  5 18:15:05 MediaXen kernel: aacraid 0000:82:00.0: outstanding cmd: midlevel-0
Dec  5 18:15:05 MediaXen kernel: aacraid 0000:82:00.0: outstanding cmd: lowlevel-0
Dec  5 18:15:05 MediaXen kernel: aacraid 0000:82:00.0: outstanding cmd: error handler-5
Dec  5 18:15:05 MediaXen kernel: aacraid 0000:82:00.0: outstanding cmd: firmware-37
Dec  5 18:15:05 MediaXen kernel: aacraid 0000:82:00.0: outstanding cmd: kernel-0
Dec  5 18:15:05 MediaXen kernel: aacraid 0000:82:00.0: Controller reset type is 3
Dec  5 18:15:05 MediaXen kernel: aacraid 0000:82:00.0: Issuing IOP reset
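
To see how often this happens and when it started, something like the Python sketch below can tally the controller events in a saved syslog. The three message strings are copied from the excerpt above; the category labels and the function name are made up for illustration, and the log path is whatever file you pass on the command line.

```python
import sys
from collections import Counter

# Message fragments taken from the log excerpt; the labels are hypothetical.
MARKERS = {
    "Host adapter abort request": "abort",
    "Host bus reset request": "bus_reset",
    "Issuing IOP reset": "iop_reset",
}


def summarize_aacraid(lines):
    """Count aacraid abort/reset events in an iterable of syslog lines."""
    counts = Counter()
    for line in lines:
        # Only consider kernel messages from the aacraid driver.
        if "kernel: aacraid" not in line:
            continue
        for needle, label in MARKERS.items():
            if needle in line:
                counts[label] += 1
    return counts


if __name__ == "__main__":
    # Usage: python aacraid_summary.py <path-to-syslog>
    with open(sys.argv[1], errors="replace") as handle:
        for label, count in summarize_aacraid(handle).items():
            print(label, count)
```

If the resets show up no matter which drives are being read, that fits a controller problem better than several disks or cables failing at once.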

 
