neural

Posts posted by neural

  1. Here is what we did, and it's slow:

     

    Setup

    1. Array set up with one parity drive and set to "Turbo" write mode
    2. A cache (512 GB NVMe) for a share called "LargeFiles"
    3. Attached a 16 TB HDD, mounted it, and used Krusader to migrate data from the directly connected HDD to the share called "LargeFiles"

     

    We tested with a 200 GB folder of smaller documents

    We tested with large 1 GB files

    We tested with a mix of large and small files

    ... all are slow
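
    To put a number on "slow", a copy can be timed and reported in MB/s. This is only a minimal sketch: the function name `copy_throughput` is made up for illustration, and the demo runs on temporary directories rather than the real disks. On an Unraid box the paths would be something like the mounted HDD and the "LargeFiles" share, which is an assumption about this setup.

```python
import shutil
import tempfile
import time
from pathlib import Path


def copy_throughput(src: Path, dst: Path) -> float:
    """Copy the directory tree at src to dst and return the rate in MB/s.

    dst must not exist yet (shutil.copytree creates it).
    """
    total_bytes = sum(p.stat().st_size for p in src.rglob("*") if p.is_file())
    start = time.perf_counter()
    shutil.copytree(src, dst)
    elapsed = time.perf_counter() - start
    # Guard against a zero elapsed time on very small test trees.
    return total_bytes / 1_000_000 / max(elapsed, 1e-9)


if __name__ == "__main__":
    # Demo on throwaway data: a few small files plus one larger file,
    # loosely mirroring the "mixed large and small" test.
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "src"
        src.mkdir()
        for i in range(20):
            (src / f"small_{i}.bin").write_bytes(b"x" * 4096)
        (src / "large.bin").write_bytes(b"x" * 8_000_000)
        rate = copy_throughput(src, Path(tmp) / "dst")
        print(f"{rate:.1f} MB/s")
```

    Running the same measurement once for the small-file folder and once for the 1 GB files would show whether the slowdown is per-file overhead or raw write speed.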

     

    Is it best practice to use the network as the main conduit for migrating data into the array?

     

    Thanks

  2. UPDATE #3 - Kernel Panic - Stable for >48 hrs

     

    What's inside the server now

    1. Supermicro AOC-SASLP-MV8 Rev1.01 with new Supermicro 8087 to SATA cables - CBL-0097L-03
    2. Nvidia Quadro 4000
    3. USB (Plugged into the motherboard USB slot)

    Array: All clear

    • Resolved the drive issue and added another 3 TB drive
    • Parity is clear; the array shows no errors

    Next use case:

    • Will now begin to set up Unraid with VMs and add the K80 (once I get the riser cable and K80 cooler)
  3. UPDATE #2 - Kernel Panic - Continues 

     

     

    What's inside the server now at the 3rd Kernel Panic - Only these cards at the point of failure 

     

    LSI, HBA (with new Supermicro 8087 to SATA cables - CBL-0097L-03) not the red ones as shown in the picture

    Radeon Graphics Card (Small temp discrete solution for testing) 

    USB (Plugged into the motherboard USB slot)

    ** Memtest86 v5.01 ran for 16 hrs without errors. I exited and moved ahead, considering the memory cleared.

     

    After Kernel Panic #3:

     

    Powered off the server, as it just hangs when the kernel panic occurs

    Moved the USB drive from the motherboard slot to the back panel, thinking heat might be the cause

    Removed the GFX card and replaced it with the Quadro 4000

    Removed the LSI HBA and replaced it

    New HBA - Supermicro AOC-SASLP-MV8 Rev1.01

     

    Also: the "bad" drive is not bad. Why is Unraid not reflecting this?

    1407772400_UnraidDrivewithoutIssuesSnip-Screenshot2022-04-14170951.jpg.c4cba8a45f6bed63256020642cee442e.jpg

    - I tested with 3 sets of SATA cables - the drive shows no SMART errors and works in secondary tests

    - I tested with 3 controllers (onboard, LSI #1 and Supermicro) and also tested the drive with the HBAs in different slots

     

    Troubleshooting: 

     

    #1 - Is LSI HBA activity causing the kernel panic? Removed it to test with an alternative HBA

     

    Test case - Recreate the kernel panic:

    Copy over the network to a share and copy from a mounted disk to the same share at the same time.
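
    The two-copies-at-once condition can be scripted so it is repeatable between hardware swaps. A rough sketch only: `concurrent_copy` is a hypothetical helper, and the demo uses temporary directories standing in for the network source, the mounted disk, and the share.

```python
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def concurrent_copy(jobs: list[tuple[Path, Path]]) -> list[Path]:
    """Run every (source, destination) tree copy at the same time.

    Each destination must not exist yet. Returns the destination paths.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(jobs))) as pool:
        futures = [pool.submit(shutil.copytree, src, dst) for src, dst in jobs]
        # result() re-raises any error hit by a copy thread.
        return [Path(f.result()) for f in futures]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Stand-ins (assumptions) for "network source" and "mounted disk".
        for name in ("from_network", "from_disk"):
            (root / name).mkdir()
            (root / name / "data.bin").write_bytes(b"x" * 1_000_000)
        share = root / "share"
        share.mkdir()
        done = concurrent_copy([
            (root / "from_network", share / "net_copy"),
            (root / "from_disk", share / "disk_copy"),
        ])
        print([p.name for p in done])
```

    Pointing both jobs at real paths and leaving it running would load the HBA the same way each time, which makes it easier to compare one controller against another.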

     

    HBA Supermicro.jpg

  4. Update #1 - Kernel Panic 

     

    What's inside the server now - Only these cards at the point of failure 

    1. LSI, HBA (with new Supermicro 8087 to SATA cables - CBL-0097L-03) not the red ones as shown in the picture
    2. Radeon Graphics Card (Small temp discrete solution for testing) 
    3. USB (Plugged into the motherboard USB slot)

    Unraid: 

    • Setup: added 4 drives, cleared all, added one as parity and three for storage.
    • Added shares, one Docker container (Krusader), and two utilities (for mounting drives)
    • File movement - Moved 3 TB to the array while also moving 1 TB over the network at the same time
    • In the Unraid Web UI, one drive "dropped" or unmounted and the UI became unresponsive; then on the Unraid Linux server we saw the kernel panic error again
    • Server was online for about 24 hours without any activity; then, approx. 2 hrs into a drive clearing and file copy, it crashed.

    After Kernel Panic:

    1. Powered off server as it just hangs when Kernel Panic error occurs 
    2. Moved the USB drive from the motherboard slot to the back panel, thinking heat might be the cause
    3. Removed the GFX card and replaced with the Quadro 4000
    4. Left the HBA in place
    5. Started server and started Memtest86 v5.01 (still running test #8, pass 34% / test 58%) still no errors

     

    Troubleshooting: 

     

    #1 - Is LSI HBA activity causing the kernel panic? Will double-check ALL BIOS settings for both the onboard BIOS and the card's BIOS (and recreate the same conditions)

    - Copy over the network to a share and copy from a mounted disk to the same share at the same time.

  5. Hi All,

     

    I wanted to share this journey as it unfolds and gain the collective experience from the community as it goes forward. I do hope to solve this and return to the Unraid community (first build was back in Jan 2013)

     

    I will explain my equipment, versions and use case, then describe the problem and the steps taken to solve it.

     

    Equipment:

     

    1. Supermicro X9DA7/E with latest BIOS v3.3
    2. Intel® Xeon® CPU E5-2620 0 @ 2.00GHz x 2 (dual CPU)
    3. 64 GiB DDR3 Multi-bit ECC (all Memtested)
    4. LSI SAS9207-8i (HBA flashed in IT mode)
    5. USB - Unraid - Patriot AUTOBAHN 8 GB "Low Profile"

     

    Equipment to be added for Docker and VM workloads (use cases):

     

    1. Nvidia Quadro 4000
    2. Nvidia K80 

     

    Special note: The onboard Supermicro raid is disabled and we are using the LSI SAS9207-8i card as the primary HBA

     

    Unraid: Version 6.9.2 2021-04-07

     

    (No Docker containers, no add-ons, nothing but a vanilla install to date)

     

    Situation: 

    • All cards are installed (as seen in the image).
    • We log in to the Web UI and try to build the initial array with one parity drive (8 TB) and one data drive (6 TB); the server locks up with this error.
    • It does not reboot.

     

    Error: 

     

    KERNEL PANIC - NOT SYNCING: TIMEOUT: NOT ALL CPUS ENTERED BROADCAST EXCEPTION HANDLER
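
    As far as I know, this wording comes from the Linux kernel's x86 machine-check (MCE) handling, so any earlier hardware-error lines in a saved syslog may be worth pulling out. A minimal sketch, assuming a log has been saved as text: `find_mce_lines` and the sample log lines below are invented for illustration and are not from this server.

```python
import re

# Phrases that commonly accompany machine-check reports in kernel logs.
MCE_PATTERN = re.compile(
    r"mce:|machine check|hardware error|broadcast exception handler",
    re.IGNORECASE,
)


def find_mce_lines(log_text: str) -> list[str]:
    """Return the log lines that mention a machine-check related phrase."""
    return [line for line in log_text.splitlines() if MCE_PATTERN.search(line)]


if __name__ == "__main__":
    # Made-up sample text, only to show the filter working.
    sample = (
        "kernel: usb 1-1: new high-speed USB device\n"
        "kernel: mce: [Hardware Error]: Machine check events logged\n"
        "kernel: Kernel panic - not syncing: Timeout: "
        "Not all CPUs entered broadcast exception handler\n"
    )
    for line in find_mce_lines(sample):
        print(line)
```

    Since the server hangs at the panic, the log would have to be captured somewhere that survives a power-off before a filter like this has anything to read.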

     

    Troubleshooting plan:

     

    Worked: (One of three)
    Removed all cards, then re-added one card (LSI SAS9207-8i) and reconnected two drives to it.

    Started the server and created a NEW parity drive and one data drive; after 12 hours it is OK/healthy with NO errors.

    Now we will add one card at a time and restart the troubleshooting.

     

    Kernel panic error 01 - Smaller.jpg

    MoBo and Cards - Label.jpg

  6. Weird question, but

     

    Bay Area - San Jose etc...

     

    Are there any more brick-and-mortar places to get equipment?

     

    * Micro Center - Closed

    * Fry's is OK but lacks many of the parts

    * Central Computer - has some items

     

    Just wondering if there is a GREAT place we do not know about.
