rawfuls Posted February 15, 2021
disk4 was disabled overnight. The log shows tons of errors, and UnRAID shows a billion writes with tons of errors. I'm not convinced the disk has failed; it may be a loose connection. But I recently had a similar situation and lost a good chunk of data, so I'd like some guidance before I move forward. Diagnostics are attached below.
I believe the plan is to use unBalance to move all data off disk4 and let it spread that data across the rest of the array. Once unBalance is finished: New Config > Retain all config > All > leave disk4 blank, start the array, and let it rebuild. Once rebuilt, stop the array, downsize it from 11 drives to 10, then move disk5 to disk4, disk6 to disk5, etc., and I should be good?
Attached: diagnostics-20210215-1121.zip
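One thing worth checking before an unBalance run like the one planned above is whether the remaining data disks have enough free space to absorb everything on disk4. The following is a minimal Python sketch of that check, not something unBalance itself provides. It assumes the array disks are mounted at /mnt/disk1 through /mnt/disk11 (an assumption about this server's layout), and the function names `fits_on_targets` and `check_array` are hypothetical.

```python
import shutil
from pathlib import Path
from typing import Sequence


def fits_on_targets(source_used: int, target_free: Sequence[int], reserve: int = 0) -> bool:
    """Return True if source_used bytes fit into the combined free space of
    the targets, after holding back `reserve` bytes on each target."""
    usable = sum(max(free - reserve, 0) for free in target_free)
    return source_used <= usable


def check_array(source: str, targets: Sequence[str], reserve: int = 0) -> bool:
    """Compare the used space on `source` with the free space on `targets`."""
    used = shutil.disk_usage(source).used
    free = [shutil.disk_usage(t).free for t in targets]
    return fits_on_targets(used, free, reserve)


if __name__ == "__main__":
    # Assumption: disk4 is the source and the other ten array disks are targets.
    source = "/mnt/disk4"
    targets = [f"/mnt/disk{n}" for n in range(1, 12) if n != 4]
    targets = [t for t in targets if Path(t).is_dir()]
    if Path(source).is_dir() and targets:
        ok = check_array(source, targets, reserve=10 * 1024**3)  # keep 10 GiB per disk
        print("enough free space" if ok else "NOT enough free space")
```

Combined free space is a necessary condition, not a sufficient one: each individual file still has to fit on a single disk, so a near-full result deserves a closer look.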
JorgeB Posted February 15, 2021
The disk dropped offline, so there's no SMART data. Power down, check the connections, and post new diags.
rawfuls Posted February 16, 2021 (edited)
Sigh, I restarted and now I'm getting an error at start: MPT BIOS Fault 11b encountered at adapter PCI... Firmware Fault Code: 4101h.
This happened to my last LSI 9201-8i card. I wonder if the motherboard is eating these or if the cards are actually dying. I tried a different PCI slot, with the same result.
Edited February 16, 2021 by rawfuls
Kevek79 Posted February 16, 2021
Just because I am curious: do you have active cooling on the LSI board? The chips on those can get very toasty when not cooled well.
rawfuls Posted February 16, 2021
7 hours ago, Kevek79 said: "Just because I am curious: do you have active cooling on the LSI board? The chips on those can get very toasty when not cooled well."
Nope, so now we know it really is the card failing, then. I saw heatsinks on the cards and figured it would be alright. Looks like I'll be in the market for another card and will need to figure out an active cooling solution.
ChatNoir Posted February 16, 2021
Not sure that is the issue, but you could try adding cooling to see if this improves things.
23 minutes ago, rawfuls said: "Nope, so now we know it really is the card failing then"
It's possible that the card crashes because of overheating. That does not mean it is dead.
25 minutes ago, rawfuls said: "Saw heatsinks on the cards and figured it would be alright."
Those cards are built for server racks with plenty of airflow across the add-on cards. If you are using a regular case, it is possible that there is not enough airflow for just a heatsink. Maybe the card is dead, but it might be a quick test to plug a fan into the motherboard and point it at the heatsink to see if the card behaves better.
rawfuls Posted February 16, 2021
16 minutes ago, ChatNoir said: "Not sure that is the issue, but you could try adding cooling to see if this improves things. It's possible that the card crashes because of overheating. That does not mean it is dead. Those cards are built for server racks with plenty of airflow across the add-on cards. If you are using a regular case, it is possible that there is not enough airflow for just a heatsink. Maybe the card is dead, but it might be a quick test to plug a fan into the motherboard and point it at the heatsink to see if the card behaves better."
The card had cooled off and still wouldn't get past the MPT BIOS error message. Unless this turns out to be a motherboard PCI problem, it does seem that I could have killed both of these cards with heat (my previous card was in the same spot, with no active cooling).
The case is a Rosewill L4500. The card is in the side-most PCI slot, so there's no real cooling going on. I'm seeing quite a few people bolt 40mm fans onto the heatsinks, so that may be in order.
Kevek79 Posted February 17, 2021
I have a 40mm Noctua fan mounted on the main heatsink of my LSI card. I've never had any issues with the card regardless of the season, and it can get pretty hot in our apartment during the summer months.
rawfuls Posted February 20, 2021 (edited)
The new card came in and I'm still having the same issue... It appears to be the motherboard/BIOS, since both the old card and the new card work when tested in my desktop... go figure.
I've put the new card in and connected only 4x hard drives (one SAS -> 4 SATA), and it seems to boot without any issues. Once I use both SAS ports on the 9201-8i, it no longer boots and shows the above error message. It seems that with all 8x drives attached, the 9201-8i card reports a failure... what gives?
EDIT: okay.. some more info.
In the server:
(1) LSI 9201-8i w/ 2 SAS (8x hard drives via splitters) -> no boot
(1) LSI 9201-8i w/ 1 SAS (4x hard drives via splitters) -> successful boot
(2) LSI 9201-8i w/ 1 SAS each (4x hard drives via splitters) -> only recognizes (1) LSI 9201-8i card (4x drives)
Testing in my own personal desktop:
(1) LSI 9201-8i w/ 2 SAS (8x hard drives via splitters) -> successful boot
It seems to me like the motherboard isn't able to read all 8 drives anymore. Any reason to think this could be fixed with just a BIOS change?
Edited February 20, 2021 by rawfuls
rawfuls Posted February 20, 2021
Success! Kind of. The drive has failed and seems to be (shorting?) out whichever controller it's connected to. If it's connected to the LSI card, the whole adapter fails. If it's connected to the board, the whole onboard controller fails. At this point, the drive has been disconnected and parity is emulating its contents.
Is this where I now go New Config > Preserve All? Do I downsize by one drive, or start the array with the same number of drives but leave disk4 empty?
JorgeB Posted February 20, 2021
1 hour ago, rawfuls said: "Is this where I now go New Config > Preserve All."
New Config won't preserve any data on the emulated drive. If you don't want to rebuild that drive onto a different disk, first move all the data from the emulated disk to the others, then do the New Config (you'll need to re-sync parity).
rawfuls Posted February 20, 2021
15 minutes ago, JorgeB said: "New Config won't preserve any data on the emulated drive. If you don't want to rebuild that drive onto a different disk, first move all the data from the emulated disk to the others, then do the New Config (you'll need to re-sync parity)."
So: unBalance from disk4 to all other disks, then New Config > Preserve All, and reduce the total number of disks by 1?
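Because New Config discards whatever is still on the emulated disk, it may be worth confirming that disk4 really is empty after the move and before resetting the configuration. Below is a minimal Python sketch of such a check. It assumes the emulated disk is still mounted at /mnt/disk4 (an assumption about this server's layout), and `remaining_files` is a hypothetical helper name, not part of Unraid or unBalance.

```python
import os


def remaining_files(mount_point, limit=20):
    """Return up to `limit` paths of files still present under mount_point.

    Empty directories are ignored; only regular directory entries that
    os.walk reports as files count as leftovers.
    """
    found = []
    for root, _dirs, files in os.walk(mount_point):
        for name in files:
            found.append(os.path.join(root, name))
            if len(found) >= limit:
                return found
    return found


if __name__ == "__main__":
    mount = "/mnt/disk4"  # assumption: mount point of the emulated disk
    if os.path.isdir(mount):
        leftovers = remaining_files(mount)
        if leftovers:
            print("Files still on the emulated disk (showing up to 20):")
            for path in leftovers:
                print("  " + path)
        else:
            print("No files found under " + mount)
```

An empty result only says that no files were found under the mount point at the time of the check; it is a sanity check before New Config, not a substitute for a backup.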