Harro

(Solved) Device-to-host register FISes sent due to a COMRESET??

Ever since changing to the LSI 9300-16i controller I have been plagued with disks dropping offline. The disks look fine in their SMART reports.

I have ordered new mini-SAS cables, since besides a bad controller they are the only component common to all the drives that go offline. I have also ordered new drives in case these turn out to be bad.

I was running an extended SMART test on disk 14 when everything went tits up. In the last week I have rebuilt disks 1 and 14 four times, only to have them go offline again within a day or so.

Does anyone see anything else in my diagnostics? What steps should I take next?

 

Update: Replaced the cables and everything has been working fine.

tower-diagnostics-20190909-1350.zip

Edited by Harro
Solved

50 minutes ago, Harro said:

Ever since changing to the LSI 9300-16i controller I have been plagued with disks dropping offline.

Do you have any suspicion it might be a counterfeit card? Have you verified the serial number with LSI?


I bought the card from ALLHDD.com, so it was not an eBay purchase. It was new in a sealed box, with the card itself vacuum-sealed. So I doubt it is counterfeit, but I will check the serial number on the LSI website.


I updated the firmware to the latest version after installing the card.

I also added the extra power to the card, as described in this thread.

 


Is it advisable to shut the server down and reseat all the cables? I now have 3 disks showing in Unassigned Devices, so I can rebuild 2 of them and copy the data off the 3rd, which is not a whole lot, maybe 4TB. Would I then shrink the array, take the 3rd drive out, and let the other 2 drives rebuild?


What was the reason for changing to the 9300-16i, and what disk controller did you use before?

 

Do you have one 9300? (lspci shows two, but I assume there should be one.)

Forget it, the 9300-16i has a PLX chip and two SAS3008 controllers.

Edited by Benson

6 minutes ago, Benson said:

What was the reason for changing to the 9300-16i, and what disk controller did you use before?

 

Do you have one 9300? (lspci shows two, but I assume there should be one.)

I had 2 HP220 controllers, each handling 8 drives. I went with a single 16i so I could replace the two and use the PCIe 3.0 x16 to gain parity check speed.

 

The 16i is actually 2 controllers on one card. At least that is what it showed when I flashed the newer firmware.

Edited by Harro


Do all the disks drop, or only half? Maybe try using only 8 ports of the 9300 and reinsert an HP220 for the other disks.


I am running 12 disks off the 9300, and the other disks are running on the onboard SATA.

I was planning on inserting my old card again and seeing what happens.

1 hour ago, Benson said:

reinsert an HP220 for the other disks.

I reinserted the HP220, put 3 disks on that card, and am now rebuilding disk 1. Disk 1 was red-balled before the restart, so I was expecting to rebuild it. All other disks are online and looking OK.

36 minutes ago, Harro said:

I reinserted the HP220, put 3 disks on that card, and am now rebuilding disk 1. Disk 1 was red-balled before the restart, so I was expecting to rebuild it. All other disks are online and looking OK.

I suggest not using the ports that dropped disks (ideally use only one of the two controllers on the 9300-16i). This just tests whether one controller has a problem or not.

 

After the rebuild, I would put the array in maintenance mode and run a parity check (no correction). This just puts load on the controller; if a disk shows errors or drops again, then you don't need to rebuild any disk. Once the load test passes, start the array as usual and test/monitor whether everything is normal.
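One way to make "shows errors" concrete during such a load test is to compare the SATA Phy event counters (the table the thread title's COMRESET line comes from) before and after the parity check. The following is only a minimal sketch: it assumes the counters were saved as text in the four-column layout that `smartctl -l sataphy` typically prints, and the two snapshots below are made-up illustrative values, not data from the attached diagnostics. The function names are hypothetical.

```python
import re

# Matches counter rows such as:
# "0x000a  2            3  Device-to-host register FISes sent due to a COMRESET"
ROW = re.compile(r"^\s*(0x[0-9a-fA-F]{4})\s+(\d+)\s+(\d+)\s+(.+?)\s*$")


def parse_phy_counters(text):
    """Return {description: value} for every counter row found in text."""
    counters = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            counters[m.group(4)] = int(m.group(3))
    return counters


def increased(before, after):
    """Return {description: delta} for counters that grew between snapshots."""
    return {
        name: after[name] - before[name]
        for name in after
        if name in before and after[name] > before[name]
    }


# Illustrative snapshots only (assumed layout, invented values).
BEFORE = """\
ID      Size     Value  Description
0x0001  2            0  Command failed due to ICRC error
0x000a  2            2  Device-to-host register FISes sent due to a COMRESET
"""
AFTER = """\
ID      Size     Value  Description
0x0001  2            4  Command failed due to ICRC error
0x000a  2            5  Device-to-host register FISes sent due to a COMRESET
"""

if __name__ == "__main__":
    deltas = increased(parse_phy_counters(BEFORE), parse_phy_counters(AFTER))
    for name, delta in deltas.items():
        print(f"+{delta}  {name}")
```

Counters that climb during the test on disks sharing one cable or one controller would point at that cable or controller rather than at the disks themselves. Whether a given drive behind the HBA reports these counters at all is something to check per drive.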

Edited by Benson

13 minutes ago, Benson said:

I suggest not using the ports that dropped disks (ideally use only one of the two controllers on the 9300-16i). This just tests whether one controller has a problem or not.

I do have 1 disk on that side of the 9300-16i. That disk has not shown any signs of problems, and the other 3 disks are on the HP220. I have a new set of mini-SAS cables coming, and I suspect the cables might be the problem. At least I am hoping it is the cables instead of the card.

 

The question that remains: do I format the disk (1) that is rebuilding, since it is showing no file system, or let it rebuild?

Edited by Harro
added txt

1 minute ago, Harro said:

At least I am hoping it is the cables instead of the card.

Sure. An HBA problem is quite troublesome. Two controllers on one card also get too hot.

11 minutes ago, Harro said:

The question that remains: do I format the disk (1) that is rebuilding, since it is showing no file system, or let it rebuild?

Too bad. If the emulated disk shows as unmountable, even a rebuild won't fix this. Wait for an expert to jump in. (Don't format the disk.)

Edited by Benson

2 minutes ago, Benson said:

Too bad. If the emulated disk shows as unmountable, even a rebuild won't fix this.

This is why I am leaning toward cancelling the Parity-Sync/Data-Rebuild, formatting the disk, and then rebuilding.


I stopped the Parity-Sync/Data-Rebuild, went into maintenance mode, and am checking the disk using the XFS repair.

 

Update: The XFS repair is done, the disk mounted once again, and it is rebuilding now.
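For anyone repeating this from a script rather than the web interface, a cautious first step is a read-only check, since `xfs_repair -n` only reports problems and does not modify the filesystem. This is a minimal sketch, not the exact procedure used above: the device path `/dev/md1` is an assumption about how the array's disk 1 is exposed in maintenance mode, and the helper name is hypothetical.

```python
import shlex


def xfs_check_command(device):
    """Build a read-only xfs_repair invocation for the given block device.

    The -n flag means "no modify": problems are reported, nothing is written.
    """
    if not device.startswith("/dev/"):
        raise ValueError(f"expected a /dev/ path, got {device!r}")
    return ["xfs_repair", "-n", device]


if __name__ == "__main__":
    # /dev/md1 is an assumed path for array disk 1; verify it on your system.
    cmd = xfs_check_command("/dev/md1")
    # Print the command instead of running it, so nothing touches a disk here.
    print(shlex.join(cmd))
```

Only after reviewing the read-only report would the actual repair (without `-n`) be run, and only with the array in maintenance mode so the filesystem is not mounted.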

Edited by Harro

