(Solved) Device-to-host register FISes sent due to a COMRESET??


Harro

Ever since changing to the LSI 9300-16i controller, I have been plagued with disks dropping offline. The disks look fine in their SMART reports.

I have ordered new mini-SAS cables, since apart from the controller itself they are the only component common to all the drives that go offline. I have also ordered new drives in case the current ones turn out to be bad.

I was running an extended SMART test on disk 14 when everything went tits up. In the last week I have rebuilt disks 1 and 14 four times, only to have them go offline again within a day or so.

Does anyone see anything else in my diagnostics? What should my next steps be?
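The message in the thread title is the description of one of the SATA Phy event counters that `smartctl -l sataphy` prints for a drive. As a rough sketch of how those counters could be compared across drives, the snippet below parses that table from saved text and lists the non-zero entries. The function names and the `SAMPLE` text are illustrative assumptions, not values from the attached diagnostics, and a non-zero COMRESET count on its own is not necessarily proof of a fault.

```python
import re

# One counter row looks like:
# "0x000a  2            3  Device-to-host register FISes sent due to a COMRESET"
# Columns: ID, size in bytes, value, description.
ROW = re.compile(r"^(0x[0-9a-fA-F]{4})\s+(\d+)\s+(\d+)\s+(.+?)\s*$")


def parse_phy_counters(text):
    """Return {description: value} for every counter row found in text."""
    counters = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            counters[match.group(4)] = int(match.group(3))
    return counters


def nonzero_counters(text):
    """Keep only the counters whose value is greater than zero."""
    return {name: value
            for name, value in parse_phy_counters(text).items()
            if value > 0}


# Illustrative sample only; NOT taken from the diagnostics in this thread.
SAMPLE = """\
SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x0001  2            0  Command failed due to ICRC error
0x000a  2            3  Device-to-host register FISes sent due to a COMRESET
"""

if __name__ == "__main__":
    for name, value in nonzero_counters(SAMPLE).items():
        print(f"{value:>6}  {name}")
```

To use it on a real drive, the saved output of `smartctl -l sataphy` for each disk would be passed in as `text`; counters that keep climbing on several drives sharing one cable would point toward the cable or port rather than the individual disks.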

 

Update: Replaced the cables and everything has been working fine.

tower-diagnostics-20190909-1350.zip

Edited by Harro
Solved

Is it advisable to shut the server down and reseat all the cables? I now have 3 disks showing up in Unassigned Devices, so I can rebuild 2 and copy the data off the 3rd, which is not a whole lot, maybe 4TB. Would I then shrink the array, take the 3rd drive out, and let the other 2 drives rebuild?

6 minutes ago, Benson said:

What was the reason for changing to the 9300-16i, and which disk controller did you use before?

 

Do you have one 9300? (lspci shows two, but I assume there should be only one.)

I had 2 HP220 controllers, each handling 8 drives. I went with a single 16i so I could replace both and use the PCIe 3.0 x16 to gain parity check speed.

 

The 16i is actually 2 controllers on one card. At least that is what it showed when I flashed the newer firmware.

Edited by Harro
36 minutes ago, Harro said:

I reinserted the HP220, put 3 disks on that card, and am now rebuilding disk 1. Disk 1 was red-balled (disabled) before the restart, so I was expecting to rebuild it. All other disks are online and looking OK.

I suggest not using the ports that dropped disks (ideally, use just one of the two controllers on the 9300-16i). This is only to test whether one of the controllers has a problem.

 

After the rebuild, I would put the array in maintenance mode and run a parity check (without correction). This just puts load on the controller; if a disk shows errors or drops again during the check, you won't need to rebuild any disk. Once the load test passes, start the array as usual and monitor whether everything behaves normally.

Edited by Benson
13 minutes ago, Benson said:

I suggest not using the ports that dropped disks (ideally, use just one of the two controllers on the 9300-16i). This is only to test whether one of the controllers has a problem.

I do have 1 disk on that side of the 9300-16i. That disk has not shown any signs of problems, and the other 3 disks are on the HP220. I have a new set of mini-SAS cables coming, since I suspect the cables might be the problem. At least I am hoping it is the cables rather than the card.

 

The question that remains: do I format disk 1, which is rebuilding and showing no file system, or do I let it rebuild?

Edited by Harro
added txt
11 minutes ago, Harro said:

The question that remains: do I format disk 1, which is rebuilding and showing no file system, or do I let it rebuild?

That's bad news. If the emulated disk shows as unmountable, even a rebuild won't fix it. Wait for an expert to jump in. (Do not format the disk.)

Edited by Benson
