2 Disks went DSBL at the same time

April 2, 2024

A scheduled parity check kicked off last night at 10 pm. My health check ran at 12:20 am with no problems. At 2:10 am, one of my parity disks and a data disk went DSBL. At 3 am the scheduled parity check paused so Mover could run, then resumed as it should at 3:21 am when Mover finished. I've been awake for 15 minutes and the parity check has not moved. It is stuck at 74.7% while claiming to run at 99 MB/s.

Any thoughts on whether I need to replace them? What is my best path forward to fix both disks? I do have 2 disks in Unassigned Devices on standby, precleared and ready.

Diagnostics attached

nas-diagnostics-20240402-0644.zip

Edited April 2, 2024 by dmoney517
Update to timeline of events.

Solved by JorgeB

April 2, 2024

Community Expert
Solution

Due to all the log spam, the start of the problem is missing, but it looks like an HBA problem. Reboot and post new diags after array start.

April 2, 2024

Author

I am not able to Cancel the current Parity Check. When I click cancel / OK, the Parity check dialogue does not go away and STOP Array is still greyed out.

Edited April 2, 2024 by dmoney517

April 2, 2024

Community Expert

At most it can be doing a read check, since there are disabled disks. You can cancel it.

April 2, 2024

Author

23 minutes ago, JorgeB said:

At most it can be doing a read check, since there are disabled disks. You can cancel it.

It won't let me cancel when I click the button. Nothing changes on the screen. 20 minutes after I tried, this popped up in the log. Not sure if it's related.

Apr 2 07:46:34 NAS nginx: 2024/04/02 07:46:34 [error] 22158#22158: *4835131 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 192.168.0.105, server: , request: "POST /update.htm HTTP/1.1", upstream: "http://unix:/var/run/emhttpd.socket/update.htm", host: "192.168.0.108", referrer: "http://192.168.0.108/Main"
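
For readers triaging a similar entry, here is a minimal Python sketch that pulls the key fields out of the nginx error line above. The regular expression and the `parse_nginx_upstream_error` helper are illustrative assumptions, not part of Unraid or nginx. The extracted fields show the POST to /update.htm timing out while nginx waited on the emhttpd socket, which would be consistent with the Cancel button appearing to do nothing.

```python
import re

# The log line quoted in the post above, copied verbatim.
LOG_LINE = (
    'Apr 2 07:46:34 NAS nginx: 2024/04/02 07:46:34 [error] 22158#22158: '
    '*4835131 upstream timed out (110: Connection timed out) while reading '
    'response header from upstream, client: 192.168.0.105, server: , '
    'request: "POST /update.htm HTTP/1.1", '
    'upstream: "http://unix:/var/run/emhttpd.socket/update.htm", '
    'host: "192.168.0.108", referrer: "http://192.168.0.108/Main"'
)

# Hypothetical pattern written for this one message shape only.
PATTERN = re.compile(
    r'\[(?P<level>\w+)\]'
    r'.*?upstream timed out \((?P<errno>\d+): (?P<reason>[^)]+)\)'
    r'.*?client: (?P<client>[\d.]+)'
    r'.*?request: "(?P<method>\S+) (?P<path>\S+) [^"]*"'
    r'.*?upstream: "(?P<upstream>[^"]+)"'
)


def parse_nginx_upstream_error(line):
    """Return a dict of fields from an 'upstream timed out' line, or None."""
    match = PATTERN.search(line)
    return match.groupdict() if match else None


fields = parse_nginx_upstream_error(LOG_LINE)
for name, value in fields.items():
    print(f"{name}: {value}")
```

The sketch only reads the line; it does not explain why the backend stopped answering, which in this thread was traced to the suspected HBA problem.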

April 2, 2024

Community Expert

Type reboot in the CLI. If it doesn't reboot after 5 minutes, you will need to force it.

April 2, 2024

Author

CLI reboot worked. New diagnostics attached.

nas-diagnostics-20240402-0852.zip

April 2, 2024

Community Expert

After array start please.

April 2, 2024

Author

6 minutes ago, JorgeB said:

After array start please.

Here they are...

nas-diagnostics-20240402-0906.zip

April 2, 2024

Community Expert

Emulated disk4 is mounting. Assuming the contents look correct, you can rebuild on top.

Before doing it, it may be a good idea to make sure the HBA is well seated and sufficiently cooled. You can also use a different PCIe slot if available.

April 2, 2024

Author

25 minutes ago, JorgeB said:

Emulated disk4 is mounting. Assuming the contents look correct, you can rebuild on top.

Before doing it, it may be a good idea to make sure the HBA is well seated and sufficiently cooled. You can also use a different PCIe slot if available.

Thanks...what about the Parity? Should I rebuild both? Or rebuild disk 4 first then parity?

April 2, 2024

Community Expert

You can do both at the same time.

April 2, 2024

Author

I rebooted the server, and now Parity Disk 1 is not showing at all. Ugh. I thought I was past all of this.

Edited April 2, 2024 by dmoney517

April 2, 2024

Author

6 hours ago, JorgeB said:

You can do both at the same time.

I moved cables and the missing disk is back again, albeit in DSBL mode still (obviously).

I have initiated a rebuild of both drives. Attaching diags again. Can you please take a look one more time to ensure everything is looking as it should?

Appreciate all of your help!

nas-diagnostics-20240402-1620.zip

April 3, 2024

Community Expert

Everything looks good so far. You can post new diags once it's done if you want.

April 3, 2024

Author

3 hours ago, JorgeB said:

Everything looks good so far. You can post new diags once it's done if you want.

Thanks. It finished overnight. Everything looks in order. Diags attached.

My assumption is that the port on my second HBA went bad? Both DSBL disks were on the same SFF-8087 breakout. When one of the disks stopped being read, I moved that cable to an open port on my other HBA (I'm running two 9201-8i cards but only using 3 of the 4 ports), and the disk came back.

I replaced one of my 2.5" SSDs with an M.2 during this build (unplanned, I broke it while pulling cables), so I could technically consolidate at this point down to 1 HBA (8 drives) plus the 4 onboard SATA ports, and that would cover all my current drives.

Should I do this assuming the entire HBA is bad? Or is it possible to have 1 bad port on the HBA? I'm also debating returning both of them (bought at the same time, still in the return window until May) and getting a new HBA altogether from a different seller.

As always, greatly appreciate your advice and help!

Last set of diagnostics attached!

nas-diagnostics-20240403-0717.zip

April 3, 2024

Community Expert

Still looking good.

2 minutes ago, dmoney517 said:

My assumption is that the port on my second HBA went bad?

It's possible, but it could also be, for example, the cable; just moving it might have helped. I would leave it as is for now and keep monitoring, and if you have more issues with the new port, I would suspect a cable problem.
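
The reasoning in this exchange, that two disks dropping at once points to something they share rather than to the disks themselves, can be sketched in Python. The inventory below is hypothetical: the thread only establishes that the disabled parity disk and disk4 sat on the same SFF-8087 breakout, so the HBA labels, port labels, and the other disk entries are placeholders, and `common_suspects` is an illustrative helper.

```python
# Hypothetical inventory: disk name -> (HBA label, port/breakout label).
# Only "parity and disk4 share one breakout" comes from the thread.
INVENTORY = {
    "parity": ("hba2", "port0"),
    "disk4": ("hba2", "port0"),
    "disk1": ("hba1", "port0"),
    "disk2": ("hba1", "port1"),
    "disk3": ("onboard", "sata1"),
}


def common_suspects(inventory, failed):
    """List the components shared by every failed disk, narrowest first."""
    paths = {inventory[disk] for disk in failed}
    if len(paths) == 1:
        # Same HBA and same port: the port, its cable, or the card itself.
        hba, port = next(iter(paths))
        return [f"{hba}/{port} (port or breakout cable)", f"{hba} (whole card)"]
    hbas = {hba for hba, _ in paths}
    if len(hbas) == 1:
        # Different ports on one card: only the card is shared.
        return [f"{next(iter(hbas))} (whole card)"]
    # Nothing in this inventory is shared by all failed disks.
    return []


print(common_suspects(INVENTORY, ["parity", "disk4"]))
```

From this view a bad port and a bad cable look identical, which matches the advice above: keep monitoring, and suspect the cable if problems follow it to the new port.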
