Disk Read Errors - Have I messed up?


HidYn


Hi Guys,

 

Came home from a camping trip today to a horror show. I had disk errors and a disabled disk.

 

Following a few guides from the forums, I'm trying to rebuild the disk that got disabled, but I'm getting read errors from my parity disk.

 

All disks seem to pass S.M.A.R.T fine.

 

I do a parity check on the 1st of the month and it looks like that completed fine this morning from the e-mail I was sent.

 

I should also mention I'm using an HP MicroServer Gen8 and seem to have the same issue as an earlier poster, with 2 disks appearing in Unassigned Devices.

 

I didn't use my brain and take diagnostics before the initial reboot, and panicked trying to fix the issue. Stupid of me, I know, but I've never had any issues in the last 2 years.

 

I've attached a diagnostics of the current state of play in the hope that someone can help.

 

Any help would be really appreciated.

 

You live and learn.

 

Thanks in advance

alpha-diagnostics-20190901-1828.zip

Edited by HidYn
4 minutes ago, johnnie.black said:

There are no read errors on the diags posted, rebuild was canceled almost right after array start.

I stopped the rebuild when the reads/writes shot up to over 22 million and read errors went to over 82k according to the GUI.

 

I'll start a rebuild again and take a screenshot and another diagnostic.

 

Thanks for the reply.


I see, the problem is that the log is being flooded with these:

 

Sep  1 16:53:20 alpha kernel: ACPI Error: Method parse/execution failed \_SB.PMI0._PMM, AE_AML_BUFFER_LIMIT (20180810/psparse-516)
Sep  1 16:53:20 alpha kernel: ACPI Error: AE_AML_BUFFER_LIMIT, Evaluating _PMM (20180810/power_meter-338)
Sep  1 16:53:21 alpha kernel: ACPI Error: SMBus/IPMI/GenericSerialBus write requires Buffer of length 66, found length 32 (20180810/exfield-393)

 

So the log didn't catch the start of the problem. I believe there is a way to stop those messages; you might want to google it.
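While looking for a proper fix, one way to read past the flood is to filter the ACPI lines out of a saved copy of the syslog. The following is only a minimal sketch: the function name `drop_acpi_noise` is made up for illustration, the match string is taken from the three log lines quoted above, and the second sample line is a hypothetical placeholder, not real output from this server.

```python
import re

# Assumption: every flooding line contains "kernel: ACPI Error: ",
# as in the three lines quoted in this thread.
NOISE = re.compile(r"kernel: ACPI Error: ")


def drop_acpi_noise(lines):
    """Return only the lines that are not ACPI Error messages."""
    return [line for line in lines if not NOISE.search(line)]


sample = [
    "Sep  1 16:53:20 alpha kernel: ACPI Error: AE_AML_BUFFER_LIMIT, "
    "Evaluating _PMM (20180810/power_meter-338)",
    # Hypothetical placeholder line, not taken from the real diagnostics.
    "Sep  1 16:53:21 alpha kernel: example non-ACPI line (hypothetical)",
]

for line in drop_acpi_noise(sample):
    print(line)
```

This only hides the messages when reading the log. It does not stop them from being written, so the log can still fill up and rotate before the real errors are captured.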

3 hours ago, johnnie.black said:

No point in rebuilding with errors on another disk(s), likely a controller/cable/power problem.

You might be on to something. I disabled the HPE Smart Array and am now rebuilding the disk again. It looks much more promising with its estimate - 2 days to rebuild 8TB. Fingers crossed this sorts it, and thanks again for the help.
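For a rough sense of what that estimate implies, the average rate can be worked out from the two figures above. This is just back-of-the-envelope arithmetic, assuming 8TB means 8 x 10^12 bytes and a steady rate over the full 2 days; the helper name `average_rate_mb_s` is made up for illustration.

```python
def average_rate_mb_s(size_tb, days):
    """Average throughput in MB/s needed to cover size_tb terabytes in the given days."""
    size_bytes = size_tb * 1e12      # assumption: decimal terabytes
    seconds = days * 24 * 60 * 60
    return size_bytes / seconds / 1e6


print(round(average_rate_mb_s(8, 2), 1))  # prints 46.3
```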


Note that if a disk is shown as unmountable before starting the rebuild, it will still have that status on completing the rebuild. The only way to clear an unmountable status (besides wiping the disk contents) is to run a file system repair. This can be done either on the emulated drive or on the rebuilt drive.

5 minutes ago, johnnie.black said:

There are errors on all disks. It could be the miniSAS cable, the board (i.e., the SATA controller) or the PSU. I would start by replacing the miniSAS cable (or checking that it's correctly connected on the motherboard).

Thanks I'll get on this now.

 

Really do appreciate the help.
