Posts posted by gumby327

  1. My computer is finally starting to act like the three grand I dumped into it.  All the headaches stemmed from a 12 Gb/s SAS HBA controller.  I have since picked up an older one and have it in the main server now; I like it, and I bought a second one for the other machine.  I spent triple on the double-speed card with the newer chip, but I think it overheated somewhere in its 5-year life and is completely flaky.  It may barely survive a parity check and then die hours later; on a lot of boots it lasted 3 hours and died.

  2. Last night, after replacing the entire computer around the HBA SAS controller, I continued to see CRC failures.  So, me being me, and having a rock-solid backup server, I swapped the controllers between my backup and my main computer.  And now we have the answer.  These controllers are all old crap, and the best crappy one I have is actually the worst.  It has nothing to do with unRAID or CPUs or motherboards.  It was the $232 SAS controller card all along.  So now my backup server has a ton of CRC errors after a full night of parity checking.
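The swap test above boils down to simple bookkeeping: if the CRC counts keep climbing on whichever machine holds a given controller, the controller (or its cabling) is the suspect rather than the drives. A minimal sketch of that tally in Python, assuming you note each drive's UDMA CRC count before and after a parity check. The serial numbers, controller labels, counts, and the `crc_growth_by_controller` helper are all made up for illustration:

```python
from collections import defaultdict

def crc_growth_by_controller(before, after, controller_of):
    """Sum the increase in CRC error counts per controller.

    before/after: dict mapping drive serial -> UDMA CRC error count.
    controller_of: dict mapping drive serial -> controller label.
    """
    growth = defaultdict(int)
    for serial, new_count in after.items():
        old_count = before.get(serial, 0)
        # Clamp at zero in case a drive was replaced between snapshots
        # and the new drive's counter is lower than the old one's.
        growth[controller_of[serial]] += max(0, new_count - old_count)
    return dict(growth)

# Hypothetical counts before and after an overnight parity check.
before = {"DRIVE-A": 0, "DRIVE-B": 3, "DRIVE-C": 0}
after = {"DRIVE-A": 41, "DRIVE-B": 57, "DRIVE-C": 0}
controller_of = {
    "DRIVE-A": "sas-hba",
    "DRIVE-B": "sas-hba",
    "DRIVE-C": "onboard-sata",
}

print(crc_growth_by_controller(before, after, controller_of))
```

If all of the growth lands on one controller label after the cards have been swapped between machines, that points at the card, which is the conclusion reached here.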

  3. It has been running for about a week now, and the improvements are:

      1.  The HBA controller, same old one, is now in an x8 slot.

      2.  The HBA controller has a 1.5-inch fan cooling it.

      3.  The full ATX Strix motherboard is a ton more flexible and works better in unRAID for my needs.

      4.  The NODE 804 is now relegated to my backup PC.  The Fractal Design R5 is the right box for the job because every hard drive has fresh air forced over it.

      5.  I went to Version: 6.10.0-rc3, up from the stable release.  That made my thermals start working.

     

    What I am seeing is a full parity check and no problems yet.

  4. 4 hours ago, JonathanM said:

    Has it always been actively cooled? Server grade parts expect lots of airflow, sometimes when they are used in normal desktop style cases instead of rack mount, they need a dedicated fan.

    No, and my guess is that the HBA controller overheated on me.  It all works fine with 7 drives.  When I add an 8th, things work for a bit, then erode fast, then poof, it locks the disk.  The replacement HBA controller is on its way.  I have a RAID controller here as well; I may just force it to HBA mode.

  5. On 3/6/2022 at 3:31 AM, JorgeB said:

    As mentioned, using an x4 slot won't cause that kind of problem.

     

    Not surprising, the problem could be the HBA itself or another issue, like power, etc.

    Well, brand new motherboard and the same old problem, so the only thing remaining is that the HBA controller card is bad.
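A rough back-of-the-envelope check of the point about the x4 slot, assuming a PCIe 3.0 link (8 GT/s per lane with 128b/130b encoding) and about 250 MB/s sequential throughput per hard drive. Both figures are assumptions for illustration, not measurements from this server:

```python
# PCIe 3.0 signalling: 8 GT/s per lane, 128b/130b line encoding.
GT_PER_S = 8e9
ENCODING = 128 / 130

def pcie3_bandwidth_mb_s(lanes):
    """Theoretical one-direction bandwidth of a PCIe 3.0 link in MB/s."""
    bits_per_s = GT_PER_S * ENCODING * lanes
    return bits_per_s / 8 / 1e6

def hdd_demand_mb_s(drives, per_drive_mb_s=250):
    """Worst-case demand if every drive streams at once (assumed rate)."""
    return drives * per_drive_mb_s

link = pcie3_bandwidth_mb_s(4)
demand = hdd_demand_mb_s(8)
print(f"x4 link: {link:.0f} MB/s, 8 drives: {demand} MB/s")
```

Under those assumptions an x4 link has roughly twice the bandwidth that eight spinning drives can use during a parity check, which fits the advice quoted above that the slot width alone would not produce this kind of problem.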

  6. So, the experiment of placing it in an x16 slot, loading it up with 8 drives and running it was a fail.  Shortly after the array rebuilt its failed drive, another one died.  So I know that when this becomes my backup server, there will be no parity, and it will have only 5 drives.  In fact, for that case it does not even need the SAS HBA, so I will probably just set that card aside.  Either the drive allocation in the motherboard or chip is bad, or the card has a flaw.  The new motherboard has a lot more strength and will get a full mini-SAS cable, not these tiny ones.  But that board has 8 SATA ports out of the gate, so all I will need off the add-on card is 3 or 4 more lanes.

  7. 12 hours ago, gumby327 said:

    I really wish someone had helped me catch my mistake.  That is RAID, not HBA.  I just thank my lucky stars I did not install it and find out afterwards, once it had initialized a RAID array and all my data was gone.

     


  8. I have had Gumby running steady with no errors for over two days and have locked in the actual problem.  It was an HBA SAS PCIe x8 card resting in a PCIe x4 slot.  I always knew it would have restricted bandwidth in that slot, but what I had never guessed is that the strategy it was using was port multiplier.  In effect it says: half the lanes, so multiply for the other SAS rail.  It was fairly OK with 7 drives, but when you put an 8th drive in you started getting CRC errors.  Today I am going to confirm that as the problem by placing it in an x16 GPU slot, hooking up the 8th drive and doing a parity check.  That will tell me whether the machine would make a good backup server.

     

    Second thing I will do is build out Gumby with more premium specs.

     

    I ordered a new parts list for Gumby:

       https://www.amazon.com/gp/product/B09GP7V2W5/ref=ppx_yo_dt_b_asin_title_o03_s00?ie=UTF8&psc=1

       https://www.amazon.com/gp/product/B00Q2Z11QE/ref=ppx_yo_dt_b_asin_title_o02_s00?ie=UTF8&psc=1

       https://www.amazon.com/gp/product/B07V6132NX/ref=ppx_yo_dt_b_asin_image_o01_s00?ie=UTF8&psc=1

       https://www.amazon.com/gp/product/B00HS23QZO/ref=ppx_yo_dt_b_asin_title_o00_s00?ie=UTF8&psc=1

     

    The existing Node 804 is being demoted to the backup "Pokey" server and is getting one of my two HBA SAS controller cards.  Since its only second x16-size slot runs at x4, it can never be a VM machine as well as a large array server.  So, I have a spare AMD Ryzen 5 5600X 6-core, 12-thread processor.  That will make it stable and powerful beyond its needs as a backup and reverse proxy.

  9. So, that is the final deal.  The PCIe architecture of that Micro ATX B550M has a limit and I flat out exceeded it.  The machine cannot run a Windows VM and also handle HBA loads at the same time, and it seems to be about one express lane short in overall bandwidth.

  10. New drive is in.  What I did was take array drive8 and assign it to the slot for drive1.  So, it is on the motherboard SATA now.  It has been running clean for a couple of hours.  What I am not sure about is that the LSI Broadcom SAS 9300-8i is x8 and it is in an x4 slot.  I wonder if it is doing a multiplier for the 4 lanes, and if that is the case, that has been noted as a problem with unRAID.
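One way to see what the card actually negotiated, rather than guessing, is to read the link width Linux reports for each PCI device. A small sketch, assuming the standard `current_link_width` and `max_link_width` sysfs attributes are present under `/sys/bus/pci/devices`; the function names and the `sysfs_root` parameter are only for illustration:

```python
from pathlib import Path

def read_link_widths(sysfs_root="/sys/bus/pci/devices"):
    """Return (address, current_width, max_width) for each PCIe device."""
    results = []
    root = Path(sysfs_root)
    if not root.is_dir():
        return results  # not Linux, or sysfs is not mounted
    for dev in sorted(root.iterdir()):
        cur = dev / "current_link_width"
        mx = dev / "max_link_width"
        if not (cur.is_file() and mx.is_file()):
            continue  # device exposes no PCIe link attributes
        try:
            results.append((dev.name, int(cur.read_text()), int(mx.read_text())))
        except (OSError, ValueError):
            continue  # unreadable file or unexpected contents
    return results

def downgraded(links):
    """Devices running on fewer lanes than they support."""
    return [(addr, cur, mx) for addr, cur, mx in links if 0 < cur < mx]

for addr, cur, mx in downgraded(read_link_widths()):
    print(f"{addr}: running x{cur}, capable of x{mx}")
```

A card that shows x4 out of a possible x8 is simply running on fewer lanes. That lowers its peak bandwidth but is a normal, supported PCIe configuration.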

  11. New update, and you are probably not going to believe this.  The NODE 804, when fully loaded with 12 hard drives, may have too much case twist.  Unfortunately I broke the IT rules: I tried two things at once.  #1, I put half the questionable RAM in Pokey and the other half in Gumby.  Then I swapped the SAS connection spots and pressed the controller card down into the slot ... it was not seated correctly.  Now I look at a screenshot of my UDMA CRC Errors in Unraid System Dashboard V2 and I see we are now rock steady.  For five minutes.  I am even watching 4 movies at the same time, ran mover, and fired up my surveillance FTP server ...  It has been going for 10 minutes now and has not had a single CRC error detected.

     

    So, to me right now it looks like the NODE 804 case is too weak to be carried up my ladder and placed on the shelf when I am done working on it, unless I check all the cards for seating every time.
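The CRC counter being watched here is normally SMART attribute 199 (UDMA_CRC_Error_Count) on SATA drives. A minimal sketch for pulling its raw value out of `smartctl -A` text output, assuming the usual ten-column attribute table. The sample text is fabricated for illustration and is not taken from these drives:

```python
def udma_crc_count(smartctl_output):
    """Return the raw value of SMART attribute 199, or None if absent."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with the numeric ID; the raw value is the
        # tenth column of smartctl's standard attribute table.
        if len(fields) >= 10 and fields[0] == "199":
            try:
                return int(fields[9])
            except ValueError:
                return None
    return None

# Made-up sample in the shape of `smartctl -A` output.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       10
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       37
"""

print(udma_crc_count(SAMPLE))
```

To use it on a real drive you would capture the output of `smartctl -A /dev/sdX` (the device name is a placeholder) and compare the value before and after a stress run; the counter only matters if it keeps rising.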

  12. I wonder what would happen if I placed this array on the other board (the Intel one) and moved this dongle (flash drive) over there.  It is a different chipset, much older, and has fewer cores.  But that is supposed to be the power of unRAID.

  13. That is haunting me, but the pattern I am seeing is that disk 1 and/or 3 start throwing CRC errors.  I take them out and put in a brand new drive, and today it lasted about 7 hours and then all of a sudden started going nuts with CRCs.  I have moved the two cables around, but nothing works; whatever drive I stick in that spot fails.

     

    NODE 804:

     

               == parity1      == disk4

               == parity2     == disk3

    air --->  == disk6        == disk2  air --->

               == disk5        == disk1

                                 |                    |

              wires           |       PSU       |

                                _______________

     

    Disk 1 is closest to the case cover.

     

    I don't know.  disk1 is literally 10 hours old now and it went out right away.  disk3 is still fine.  7 and 8 are SATA and on the floor in front of the case.  So... let me take the RAM out of server "Pokey" and place it in "Gumby", and take all of the RAM out of "Gumby" and set it aside.  I don't want to kill Pokey since it is my backup for the media library.  The problem I may have is the thermal pad on my CPU; the cooler has to come off to get at my RAM.

     

    Did you ever get your CPU temp to report out?
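On the CPU temperature question, the kernel's hwmon interface is typically where sensor readings come from. A sketch that lists whatever temperature inputs the running kernel exposes, assuming the standard `/sys/class/hwmon/hwmon*/temp*_input` layout with values in millidegrees Celsius; the function name and `hwmon_root` parameter are illustrative:

```python
from pathlib import Path

def read_hwmon_temps(hwmon_root="/sys/class/hwmon"):
    """Return a list of (chip_name, input_file, degrees_c) tuples."""
    readings = []
    root = Path(hwmon_root)
    if not root.is_dir():
        return readings  # not Linux, or no hwmon class present
    for chip in sorted(root.iterdir()):
        try:
            chip_name = (chip / "name").read_text().strip()
        except OSError:
            chip_name = chip.name  # fall back to the directory name
        for temp in sorted(chip.glob("temp*_input")):
            try:
                millideg = int(temp.read_text())
            except (OSError, ValueError):
                continue  # sensor not readable right now
            readings.append((chip_name, temp.name, millideg / 1000))
    return readings

for chip_name, sensor, deg in read_hwmon_temps():
    print(f"{chip_name} {sensor}: {deg:.1f} C")
```

If nothing is printed, no sensor driver has bound to the hardware on that kernel, which would match the symptom of the processor sensors never being detected.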

  14. I had an AMD Ryzen 7 5700X and then a *G.  The problems mounted to the point where the whole thing is no longer worth experimenting on.  I went from bit rot to CRC failures on over 10 drives, and I had drives error out with little explanation.  I changed from Marvell to HBA SAS, I changed cables; nothing would last more than a week.  It was a total waste of time and thousands of dollars.  I have an Intel machine across the workshop that is stable and has never had a single problem.  It even works fine with the disks that failed on either of my two AMD Ryzen chips.  The processor sensors have never been detected, I cannot load drivers... At the end of this horrible experience I have embraced change.

     

    So, it is time for a clean slate.  I am thinking of building an Intel server.  My parts list is:

    https://www.amazon.com/dp/B09D1HDPQT/?coliid=IPY2LP8W0O5IH&colid=2VOOCJZIV4PU5&psc=1&ref_=lv_ov_lig_dp_it

    https://www.amazon.com/dp/B086MN2XYL/?coliid=I1NXKFT148T5KX&colid=2VOOCJZIV4PU5&psc=1&ref_=lv_ov_lig_dp_it (oops, I need the one with the iGPU)

     

    Are there any problems with this combination?

  15. It is funny, there is nobody over there in Motherboards and Processors, so I found myself talking to .... well, me.

     

    So, I was having fun with it, thanks Gary, no problem ... but that was right and left brain.

     

     

  16. If and when the tweak I am doing above fails ... can someone recommend an Intel build (processor and MB) that matches this:

    • Micro-ATX
    • 16 threads or better
    • PCIe: 1 x16 double-wide, 1 x1 or better, 1 x8 or better (would settle for an x4 like I have now)
    • 2.5 Gb/s Ethernet
    • Extra internal USB 2.0 port for the thumb drive
    • Has all the necessary Unraid drivers to do the VM

     

    In fact, this is the exact match:

     

    https://www.amazon.com/dp/B09D1HDPQT/?coliid=IPY2LP8W0O5IH&colid=2VOOCJZIV4PU5&psc=1&ref_=lv_ov_lig_dp_it

    https://www.amazon.com/dp/B086MN2XYL/?coliid=I1NXKFT148T5KX&colid=2VOOCJZIV4PU5&psc=1&ref_=lv_ov_lig_dp_it

     

    Will that work with Unraid?