[Solved] Suspecting my PCIe SATA controller to be broken


razr

Hey,

 

I recently bought all the parts to set up my first DIY NAS and decided on Unraid as my OS.

 

I bought the SilverStone SST-DS380B [1] as my case to put in 8x 8TB Seagate IronWolf NAS disks for my array. The board I decided for is the ASRock Z390M-ITX/AC [2].

 

Since the board only has 6 SATA ports I added an additional PCIe SATA controller, the HighPoint Rocket 640L [3]. Now with 10 SATA ports and 1 M.2 port I connected four of my array disks to the motherboard and four to the PCIe SATA controller. Then to complete my setup I put two 480 GB SSDs into the two leftover SATA ports and finally added a 500 GB NVMe SSD to the M.2 port for my docker containers and volumes.

 

At first I really struggled with overheating disks in the case, especially during parity syncs and/or data rebuilds. I managed to fix this with the help of some threads in this forum! I replaced the fans, drilled some holes in the metal backplane of the drive cage and added a "cooling duct" to get more airflow through the drive cage [4].

 

But I still have some problems with my parity syncs and data rebuilds. The parity sync after the initial setup already failed after a couple of hours. I cannot remember whether it was all the disks on the extra SATA controller, but at least two started giving errors at some point. I noticed very late, and the sync ran through and showed "Valid Parity" afterwards. I suspected that high disk temperatures were the cause, so I did the "drilling and cooling duct" stuff, and now my temperatures are OK, even when all disks are spun up and being read from or written to.

 

Nevertheless, at some point the read speed first drops to zero, and shortly afterwards all disks connected to the extra SATA controller start to throw errors. It goes as far as some disks getting disabled! After I restart the machine everything is fine again. I ran an extended SMART test on all those disks yesterday and it did not come up with any errors (I can share the results if that would help).

 

So I suspect my controller has issues when all disks start reading/writing at the same time! Does anyone have an idea how I can test this?

 

Thanks in advance for taking the time!

 

Best regards,

Max

 

[1] https://www.silverstonetek.com/product.php?pid=452

[2] https://www.asrock.com/mb/Intel/Z390M-ITXac/index.asp

[3] https://www.highpoint-tech.com/USA_new/series_r600-overview.htm

[4] https://forums.unraid.net/topic/42109-sff-silverstone-ds380-build/
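
One way to test the suspicion above (that the controller fails once all of its disks are busy at the same time) is to read from all of them in parallel and watch for errors or collapsing throughput. The following is only a minimal sketch: the function names are made up for illustration, the device paths in the usage note are hypothetical, and it only ever reads, never writes.

```python
import concurrent.futures
import time


def read_device(path, max_bytes, chunk_size=1024 * 1024):
    """Read up to max_bytes sequentially from path.

    Returns (path, bytes_read, seconds, error); error is None on success.
    """
    bytes_read = 0
    start = time.monotonic()
    try:
        # buffering=0 avoids Python-level buffering; the OS may still cache.
        with open(path, "rb", buffering=0) as f:
            while bytes_read < max_bytes:
                chunk = f.read(min(chunk_size, max_bytes - bytes_read))
                if not chunk:  # end of file or device
                    break
                bytes_read += len(chunk)
    except OSError as exc:
        # I/O errors (for example a dropped disk) end up here.
        return path, bytes_read, time.monotonic() - start, str(exc)
    return path, bytes_read, time.monotonic() - start, None


def stress_read(paths, max_bytes):
    """Read all paths at the same time, one thread per path."""
    if not paths:
        return []
    with concurrent.futures.ThreadPoolExecutor(max_workers=len(paths)) as pool:
        return list(pool.map(lambda p: read_device(p, max_bytes), paths))
```

Called as, say, `stress_read(["/dev/sdb", "/dev/sdc", "/dev/sdd", "/dev/sde"], 50 * 1024**3)` (run as root, with the paths replaced by whichever devices sit on the add-on controller), it returns one tuple per disk. If the four controller disks fail together while the same test on the mainboard disks runs clean, that points at the controller rather than the drives.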

Edited by razr
marked as solved
Link to comment
1 hour ago, razr said:

drilled some holes in the metal backplane of the drive cage and added a "cooling duct" to get more airflow through the drive cage [4].

Nice for disk cooling.

 

1 hour ago, razr said:

So I suspect my controller has issues when all disks start reading/writing at the same time! Does anyone have an idea how I can test this?

Marvell controllers are generally not recommended; you may need a different controller.

Edited by Vr2Io
Link to comment

Hey @Vr2Io,

 

thanks for your quick reply.

 

15 hours ago, Vr2Io said:

Nice for disk cooling.

I can only recommend doing that to anyone using that case! I cooled down my drives by about 20 degrees Celsius!

 

15 hours ago, Vr2Io said:

Marvell controllers are generally not recommended; you may need a different controller.

Could I have found that information somewhere? I had never heard that Marvell controllers are discouraged (didn't even know what the chipset on mine was, tbh).

Is there documentation somewhere on which controllers/chipsets are recommended or work well in general? Or could you recommend a specific PCIe controller with 4 SATA ports?

 

Thanks in advance!

 

Best regards,

Max

Link to comment
59 minutes ago, razr said:

Hey @Vr2Io,

 

thanks for your quick reply.

 

I can only recommend doing that to anyone using that case! I cooled down my drives by about 20 degrees Celsius!

 

Could I have found that information somewhere? I had never heard that Marvell controllers are discouraged (didn't even know what the chipset on mine was, tbh).

Is there documentation somewhere on which controllers/chipsets are recommended or work well in general? Or could you recommend a specific PCIe controller with 4 SATA ports?

 

Thanks in advance!

 

Best regards,

Max

Avoid any type of Marvell controllers!

Use LSI/Broadcom instead - the SAS 9207-8i is a good one and not expensive.

Edited by Zonediver
Link to comment

Hi @Zonediver,

 

22 minutes ago, Zonediver said:

Avoid any type of Marvell controllers!

Ok, something I will note down for the future :) Would be interested in the "why" though. Do you have any forum or blog posts that explain more on that?

 

26 minutes ago, Zonediver said:

Use LSI/Broadcom instead - the SAS 9207-8i is a good one and not expensive.

I just found it on Amazon for about 150 Euros. Saw that it is 16.4 cm long. Hope it will fit in the case :)

It comes with two SAS connectors, right? Would it make sense to distribute the four drives on both of the connectors? Or would it also be ok to just connect all four drives on one SAS connector? Would it make any difference?

The backplane of my drive cage has two connectors per drive, one marked SATA and the other marked SAS. Currently I have connected all drives on the SATA port. I would guess I have to connect to the SAS port with this new card, right?

 

Best regards,

Max

Link to comment
36 minutes ago, razr said:

Hi @Zonediver,

 

Ok, something I will note down for the future :) Would be interested in the "why" though. Do you have any forum or blog posts that explain more on that?

 

I just found it on Amazon for about 150 Euros. Saw that it is 16.4 cm long. Hope it will fit in the case :)

It comes with two SAS connectors, right? Would it make sense to distribute the four drives on both of the connectors? Or would it also be ok to just connect all four drives on one SAS connector? Would it make any difference?

The backplane of my drive cage has two connectors per drive, one marked SATA and the other marked SAS. Currently I have connected all drives on the SATA port. I would guess I have to connect to the SAS port with this new card, right?

 

Best regards,

Max

@ Marvell: They are known for some Firmware-Bugs dropping connected drives.

@ LSI/Broadcom: This controller type has a PCIe 3.0 x8 interface and can handle 8x SATA-Drives on 2x SFF8087-Connectors.

You can also connect 2x Port Expanders for 2x20 or total 40 HDDs(!) max 😉

@ Connection: It doesn't matter where the HDDs are connected.

On Amazon, the price for the LSI is between € 90.- and € 110.- for a new one.

INFO: This controller needs active cooling because it's a server HBA! I use the NF-A4x10 FLX.

BEWARE: DON'T connect SSDs to this LSI controller because it can't handle TRIM, so the cache SSD (and all other SSDs) must be connected to the mainboard SATA ports.

Edited by Zonediver
Link to comment

Some members also recommend the JMB585 5-port SATA controller; please search the forum.

 

30 minutes ago, razr said:

I would guess I have to connect to the SAS port with this new card, right?

 

LSI HBAs support both SATA and SAS devices. The 2nd port is the second SAS data path, which is usually used for redundancy.

 

Always connect the "SATA" port first, no matter whether it is a SATA or a SAS disk.

 

A second-hand or server-pull LSI HBA is a lot cheaper; most people use those.

Edited by Vr2Io
Link to comment

Hey,

 

49 minutes ago, Zonediver said:

@ Marvell: They are known for some Firmware-Bugs dropping connected drives.

Thanks a lot for the explanation!

 

49 minutes ago, Zonediver said:

@ LSI/Broadcom: This controller type has a PCIe 3.0 x8 interface and can handle 8x SATA-Drives on 2x SFF8087-Connectors.

You can also connect 2x Port Expanders for 2x20 or total 40 HDDs(!) max 😉

I checked again and unfortunately this card does not fit in my case, especially not with the connectors on the end. The card itself might just fit, but then the drive cage blocks the ports.

 

40 minutes ago, Vr2Io said:

Some members also recommend the JMB585 5-port SATA controller; please search the forum.

I checked Amazon and found something like this: https://www.amazon.com/IO-Crest-Non-Raid-Controller/dp/B07ST9CPND/

Would you recommend?

 

Best regards,

Max

Link to comment

Go to eBay and search for the vendor "the Art of Server". He supplies many different versions of the LSI cards in both LSI and OEM configurations. One caution about LSI cards: LSI sells the chipsets to anyone who wants to purchase them. A lot of the older LSI cards are no longer manufactured by Avago Technologies/Broadcom. (LSI was acquired by Avago Technologies in 2014.) There are manufacturers in China who are counterfeiting these older cards and labeling them as LSI. Quality is suspect for many of these cards, so buyer beware.

 

There is a large supply of the older cards available on the used market as server farms are decommissioned.  These are usually very good buys as the hardware is always made by LSI.  They often require that the firmware be changed so that they are in the IT MODE to work  with Unraid.  (Most are loaded with RAID firmware.)  You can do it yourself or purchase a card where the vendor has already done it. 

 

With all Ebay purchases, vet the vendor very carefully!  You are really 'buying' the vendor on Ebay first and the product second.

Edited by Frank1940
Link to comment
5 hours ago, razr said:

In the other thread @JorgeB said he used the card for months now without any issues. If nobody really objects I would buy that card and report back here once I have it installed.

 

Best,

Max

 

You could look at this style of LSI card with the connectors near the back panel.

 

https://www.ebay.co.uk/itm/Dell-PERC-H200-8Port-6Gb-s-Adapter-SAS-SATA-Controller-Card-9211-8I-47MCV-M1015/383315087450?hash=item593f5c105a:g:M5cAAOSwzO9d8eq0

 

You may need to cross flash or find one pre-done. PCI-E 2.0 is fine for spinning disks

 

 

Link to comment

I just had a look at ebay. Most of them are sent from Hongkong or China and need about a month to be delivered to me :( But I think I found a pre-owned Dell Perc H200 now.

 

Just as an info for anyone finding this thread later with a similar problem: There is a section about PCI SATA Controllers in the Unraid Wiki article about supported hardware components:

https://wiki.unraid.net/Hardware_Compatibility#PCI_SATA_Controllers

 

Does this card need active cooling (as @Zonediver mentioned earlier about the LSI SAS 9207-8i)? The one I found comes with a heatsink, but it looks like it could be removed and replaced with something else.

 

I did a first quick read of the crossflashing article and currently I'm a bit overwhelmed by the info in there. I hope I'll understand it by the time the card arrives :) Is there an easy way to find out if I even need to crossflash? Like just installing it, connecting all the drives and if it works it is okay? Or is it the case that if I need to crossflash and I don't, it works but "poorly"?

Link to comment
27 minutes ago, razr said:

Just as an info for anyone finding this thread later with a similar problem: There is a section about PCI SATA Controllers in the Unraid Wiki article about supported hardware components:

https://wiki.unraid.net/Hardware_Compatibility#PCI_SATA_Controllers

This list is very outdated and has not really been maintained for several years. A few things may have been added, but cards that don't work, aren't supported or are no longer available have not been removed. (Doing so would require the wisdom and knowledge of a Solomon...)

Link to comment
2 hours ago, razr said:

I just had a look at ebay. Most of them are sent from Hongkong or China and need about a month to be delivered to me :( But I think I found a pre-owned Dell Perc H200 now.

 

Just as an info for anyone finding this thread later with a similar problem: There is a section about PCI SATA Controllers in the Unraid Wiki article about supported hardware components:

https://wiki.unraid.net/Hardware_Compatibility#PCI_SATA_Controllers

 

Does this card need active cooling (as @Zonediver mentioned earlier about the LSI SAS 9207-8i)? The one I found comes with a heatsink, but it looks like it could be removed and replaced with something else.

 

I did a first quick read of the crossflashing article and currently I'm a bit overwhelmed by the info in there. I hope I'll understand it by the time the card arrives :) Is there an easy way to find out if I even need to crossflash? Like just installing it, connecting all the drives and if it works it is okay? Or is it the case that if I need to crossflash and I don't, it works but "poorly"?

I've cross flashed 3 cards now, 4th one on standby when I have the time.

 

have a read here.

 

https://wiki.unraid.net/Crossflashing_Controllers

 

Basically you need a version of SAS2FLSH which allows you to apply the stock LSI firmware to an OEM (HP/DELL) card. I think Version 7 is the last one that works.

 

Basically

-Look at stats, get the unique SAS ID (you can enter a random one later if needed)

-Wipe Card

-Flash early version

-Flash current version 20.07.00 

-Set SAS ID 
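
The first and last steps above depend on writing the SAS ID down correctly before the card is wiped. Here is a minimal sketch of a sanity check, assuming the ID is a SAS address of 16 hexadecimal digits that may be displayed with separators. The function name and the example address are made up for illustration, and the exact format your flashing tool expects is something to confirm from its own help output.

```python
import re


def normalize_sas_address(raw):
    """Strip spaces, colons and dashes and return 16 uppercase hex digits.

    Raises ValueError if what remains is not exactly 16 hex digits,
    which usually means a digit was lost or mistyped when copying.
    """
    digits = re.sub(r"[\s:\-]", "", raw).upper()
    if not re.fullmatch(r"[0-9A-F]{16}", digits):
        raise ValueError(f"not a 16-digit hex SAS address: {raw!r}")
    return digits
```

For example, `normalize_sas_address("500605b-0-1234-5678")` (a made-up address) returns `"500605B012345678"`, while an address with a missing digit raises an error before anything is written to the card.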

 

Personally I wipe the BIOS from the card. The BIOS would allow you to select a disk for booting etc.; however, it just slows down the boot process when all you need is a dumb controller.

 

My cards haven't had active cooling over the last 2 years and have been fine, though with mainly 4TB drives the controller wasn't running at full load as it would be when connected to a 20 drive array.  It is recommended to have some cooling, any arrangement that gives a little airflow should be fine. 

 

You may run into some issues with EFI motherboards, however once I wiped the card I was able to flash normally. 

 

If you search for "Dell H200 Unraid" or "Dell H200 IT mode", you may find one that has been flashed for you, though often they leave the BIOS on, so you get a slow boot. Still, it will work.

 

Quite hard to brick the cards but useful to have another PC to flash on and I find having a PCI-E x1 to PCI-E x16 cable handy for quickly plugging a device in for flashing etc.  

 

 

https://www.aliexpress.com/item/33029992462.html?spm=a2g0o.productlist.0.0.51fa7c39DfrmE9&algo_pvid=661e7606-edc9-4f2f-b9cd-628462cd8755&algo_expid=661e7606-edc9-4f2f-b9cd-628462cd8755-13&btsid=0b0a187916011612482847438e115b&ws_ab_test=searchweb0_0,searchweb201602_,searchweb201603_

 

 

Edited by Decto
Link to comment
On 9/27/2020 at 1:01 AM, Decto said:

My cards haven't had active cooling over the last 2 years and have been fine, though with mainly 4TB drives the controller wasn't running at full load as it would be when connected to a 20 drive array.  It is recommended to have some cooling, any arrangement that gives a little airflow should be fine.

Since I did my "cooling enhancements" I have a pretty decent airflow in my case. I hope this will be enough for handling four 8TB hard disks. But in any case, can I somehow monitor the temperature?

 

Also I noticed there is a little four-pin (?) connector in the lower right corner of the card. Can you tell me what that one is for?

[Attached image: image.png]

 

On 9/27/2020 at 1:01 AM, Decto said:

I've cross flashed 3 cards now, 4th one on standby when I have the time.

 

have a read here.

 

https://wiki.unraid.net/Crossflashing_Controllers

 

Basically you need a version of SAS2FLSH which allows you to apply the stock LSI firmware to an OEM (HP/DELL) card. I think Version 7 is the last one that works.

 

Basically

-Look at stats, get the unique SAS ID (you can enter a random one later if needed)

-Wipe Card

-Flash early version

-Flash current version 20.07.00 

-Set SAS ID

 

I'm still not really sure how to crossflash. The wiki article you mention is the one I looked over a couple of days ago, but I find it a bit confusing or unclear somehow.

 

I followed the link to this post in the "LSI Controller FW updates IR/IT modes" thread and downloaded the latest zip from mediafire (Update on 17.04.2017, v4). This zip now contains some instructions on how to flash (I have no Windows machines, so I'll go for the unetbootin FreeDOS USB stick).

 

The instructions contain the execution of a batch script, that "will wipe the controller clean". Is this the "Personally I wipe the BIOS from the card" part? Or do I have to do any additional things to do this?

 

I think the part of the instructions "flash the original DELL IT-firmware, call 5ITDELL.bat" is the one you mentioned as "Flash early version", right?

 

The card has not arrived yet, so I cannot yet report back on the success of my actions. But I want to be prepared :)

 

Thanks for taking the time!

Edited by razr
overwhelming was the wrong word
Link to comment

So the controller arrived today and I flashed it. I had the "Failed to initialize PAL" issue and did everything using the EFI shell. While flashing in Step 5.2, "flash the LSI 9211-8i (P7) IT-firmware, call 5ITP7.bat", it asked whether I wanted to continue although the "NVDATA versions are not matching". I accepted and it continued without any problems. The next step, "flash the LSI 9211-8i (P20) IT-firmware, call 5ITP20.bat", then executed without any issues. Finally I had to manually write back the SAS address, since there was no EFI script to do it.

 

After I re-installed everything back in my case and started the machine all disks connected to the controller are not found anymore. I then switched to the SAS port of my drive cage but same result.

 

I attached the diagnostics zip from after the reboot. I checked the files already and could not find any controller matching LSI and/or H200 at all. Would be great if someone could have a look.

coruscant-diagnostics-20200930-2210.zip
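
To check whether a card is visible on the PCIe bus at all, the text output of `lspci` can be searched for the controller. This is only a rough sketch: the function name and default keywords are assumptions chosen for this thread (the H200 is built around the LSI SAS2008 chip), and a different card would need different keywords.

```python
def find_controller_lines(lspci_text, keywords=("lsi", "sas2008", "h200")):
    """Return the lines of lspci-style text that mention any keyword.

    Matching is case-insensitive. An empty result suggests the card is
    not being enumerated on the PCIe bus at all.
    """
    matches = []
    for line in lspci_text.splitlines():
        lowered = line.lower()
        if any(keyword.lower() in lowered for keyword in keywords):
            matches.append(line.strip())
    return matches
```

If this returns nothing for the `lspci` output of the running system, the operating system never saw the card, so the problem lies before any driver or firmware setting comes into play.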

Link to comment
32 minutes ago, razr said:

 

After I re-installed everything back in my case and started the machine all disks connected to the controller are not found anymore.

Same issue with my Dell H310 (same as H200 but with ports at back of card) after flashing to IT firmware.

 

Taping over pins 5 and 6 with Kapton tape fixed it and that card and all attached drives were immediately recognized.

Link to comment
49 minutes ago, Hoopster said:

Same issue with my Dell H310 (same as H200 but with ports at back of card) after flashing to IT firmware.

 

Taping over pins 5 and 6 with Kapton tape fixed it and that card and all attached drives were immediately recognized.

Did your machine boot up? Because mine does. It also shows the other seven drives connected to the mainboard. But not the ones connected to the card. Are there any experiences with the taping in such a case?

Link to comment
2 minutes ago, razr said:

Did your machine boot up? Because mine does. It also shows the other seven drives connected to the mainboard. But not the ones connected to the card. Are there any experiences with the taping in such a case?

Yes, the machine booted up without issue. Same thing; all drives connected to MB showed up, HBA connected drives did not.

 

Apparently, those pins have something to do with the card being used on a Dell MB in a Dell server but on many non-Dell boards there can be issues with the card not initializing properly without the pins taped.

Link to comment
13 minutes ago, razr said:

Alright. Then I will give it a try tomorrow! First need to check where I can get Kapton tape though :D Will report back once I tried.

You can use regular electrical tape.  I did to start with even though it is a little thick.  When I got some Kapton tape (a cheap knock-off version from Amazon), I replaced the electrical tape.

Link to comment
2 hours ago, Hoopster said:

You can use regular electrical tape.  I did to start with even though it is a little thick.  When I got some Kapton tape (a cheap knock-off version from Amazon), I replaced the electrical tape.

The adhesive on electrical tape tends to be kinda nasty after a bit, so replace with Kapton ASAP. Also, electrical is WAY thicker, so it could spread out the contacts far enough to cause issues if you need those pins to make connection in the future.

Link to comment
3 minutes ago, jonathanm said:

The adhesive on electrical tape tends to be kinda nasty after a bit, so replace with Kapton ASAP. Also, electrical is WAY thicker, so it could spread out the contacts far enough to cause issues if you need those pins to make connection in the future.

True.  I only used it for a couple of days while I waited for some Kapton from Amazon.  I wanted to see if the tape trick would fix the problem (it did) and then  ordered some Kapton.

 

It is definitely much thicker than Kapton.

Link to comment
