
IDontBelongHere


Posts posted by IDontBelongHere

  1. 10 hours ago, jrprinty said:

    Thank you so much for the documentation. I just ran into this same issue with a DS4246 and the NetApp HBA, and out of desperation I bought the Dell controllers and an LSI 9201-16e yesterday. This now confirms I’m not crazy, nor the one person that’s had this issue. I just wish I had found this post a week ago, before spending all the time researching a problem that nobody else seemed to have. I have even gone so far as to RMA the HBA, DS4246, and the QSFP cable. The new ones of those will be here Saturday, but the Dell and LSI stuff won’t be here until Monday.
     

    Also I did get my drives formatted but during boot/starting of the array drives would drop out randomly and I’d have to pull them out and reinsert them hoping they’d show up correctly. 

    You're welcome. Glad it's helped somebody save time/money.

  2. 19 hours ago, crowdx42 said:

    I didn't get the shelf, with the limit of 30 drives for an array I decided not to go the shelf route and instead upgrade drives when needed, I still have a few 4tb drives which I will end up upgrading either to 8tb or 10tb, keeping in mind that 10tb would require me to upgrade the parity drives to 10tb first.
     

    Ah! I must've missed the 24-bay being full part. Yeah. You probably made the best move. 
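
    The parity constraint mentioned in the quote (a data drive cannot be larger than the parity drives) can be written as a one-line check. This is only a minimal sketch: the function name and the drive sizes in the usage note are hypothetical, since the actual parity sizes are not stated in the thread.

```python
def parity_blocks_upgrade(parity_sizes_tb, new_data_size_tb):
    """Return True if any parity drive is smaller than the proposed data drive.

    Assumption: sizes are plain numbers in TB; real drives of the same
    nominal size can differ slightly in usable capacity.
    """
    return any(parity < new_data_size_tb for parity in parity_sizes_tb)
```

    With two hypothetical 8TB parity drives, `parity_blocks_upgrade([8, 8], 10)` is `True` (parity must be upgraded first), while `parity_blocks_upgrade([8, 8], 8)` is `False`.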

  3. On 1/14/2020 at 10:27 PM, ridley said:

    Out of curiosity, now that I have fired the diskshelf up, are they always this noisy?

    Mine is decently quiet. I have two PSUs plugged in.

    Just a note, I had a thread on here from when I first got my shelf. You might want to read through it. It might save you a lot of time, money, and frustration.

     

    If you intend to stay on your current path, I have dual IOM6s for sale if you're interested. :D

    Also the NetApp HBA and QSFP cables. 

  4. On 11/4/2019 at 12:59 PM, crowdx42 said:

    Thanks for the reply, one last question, what was the reason for changing the controllers? Was it purely bandwidth, or is there a reliability issue with the original NetApp controller?

    Also, my setup is in our spare bedroom. Are there any hacks to make these units quieter?

    Patrick

    This thread is essentially a log of all the work I did to my shelf and why. I changed the controller because, as noted in the title, my preclears kept failing while writing the signature. Instead of spending more time trying to figure out if it was the NetApp controller or the NetApp HBA, I decided to just replace both at the same time. 

  5. 14 hours ago, crowdx42 said:

    Hi,

     so I am thinking of getting the same disk shelf, do you happen to have a link to the actual setup that worked for you? Am I correct in reading that only a single cable is needed between the controller card and the disk shelf?
    Thanks in advance.

    Patrick

    Dell Compellent HB-SBB2-E601-COMP 0952913-07 6GB SAS EBOD Controller 

    (https://www.ebay.com/itm/Dell-Compellent-HB-SBB2-E601-COMP-0952913-07-6GB-SAS-EBOD-Controller/372797485227)

     

    LSI SAS9200-16e 16-Port External HBA Full-Height PCIe P20 IT Mode ZFS FreeNAS  (http://www.ebay.com/itm/323724809849)

     

    CABLEDECONNMini SAS26P SFF-8088 to SFF-8088 1M External Cable Attached SCSI (https://www.amazon.com/dp/B00S7KTXW6)

  6. UPDATE: So, after trying 3 or 4 different IOM6 controllers, an IOM3 controller, and two different NetApp HBAs, I decided to take the shelf with me back home to Atlanta when I went over the weekend. I picked up another DS4243 along the way for a friend back home. I tried using the same HBAs and controllers as I had used previously, but with the new shelf. His server gave the option to format the drives, which was greyed out on my server.

    I also had attached my original DS424* to his server using a second port on the NetApp HBA. It didn't seem to be recognized at all. 

    I ran a preclear on all 4 drives, and when the fastest drive got to 99% of the clear portion, the JBOD crapped its pants and reassigned the drive to a new assignment. 

    Luckily, a Dell Compellent controller that had been delivered the day I left was waiting for me at home, as well as an LSI SAS9200-16e HBA and an SFF-8088 cable to connect the two. Upon returning home, I swapped those parts in and started a preclear on all 4 drives. 19 hours later, my fastest drive precleared and post-read successfully!

     

    So, I'm not sure if it was the NetApp HBA, or the NetApp controllers, but one of those two, if not both, was the culprit. 

     

  7. Well, my 300GB continue to clear fine, but my 6TB drive continues to fail at the same point. I got another 6TB in the mail today. Currently preclearing it as well. Hopefully I have better luck with it. 

    
     	
    ############################################################################################################################
    #                                                                                                                          #
    #                                     unRAID Server Preclear of disk 5000cca23c0ff7ec                                      #
    #                                       Cycle 1 of 1, partition start on sector 64.                                        #
    #                                                                                                                          #
    #                                                                                                                          #
    #   Step 1 of 4 - Verifying unRAID's Preclear signature:                                                           FAIL    #
    #   Step 2 of 4 - Writing unRAID's Preclear signature:                                                          SUCCESS    #
    #   Step 3 of 4 - Zeroing the disk:                                                       [10:31:36 @ 158 MB/s] SUCCESS    #
    #                                                                                                                          #
    #                                                                                                                          #
    #                                                                                                                          #
    #                                                                                                                          #
    #                                                                                                                          #
    #                                                                                                                          #
    #                                                                                                                          #
    #                                                                                                                          #
    #                                                                                                                          #
    ############################################################################################################################
    #                              Cycle elapsed time: 10:31:37 | Total elapsed time: 10:31:37                                 #
    ############################################################################################################################
    
    --> FAIL: unRAID's Preclear signature not valid.
    
    
    ssmtp: Cannot open smtp.live.com:587
    cat: /tmp/.preclear/sdi/smart_error: No such file or directory
    ls: cannot access '/sys/block/sdi': No such file or directory
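
    For anyone saving these summaries to compare runs, the per-step results can be pulled out of the box with a small parser. This is a rough sketch that assumes the line layout shown above ("Step N of M - description: [optional timing] STATUS"); the function names are made up for illustration and are not part of the preclear plugin.

```python
import re

# Matches summary lines such as:
#   "#   Step 3 of 4 - Zeroing the disk:   [10:31:36 @ 158 MB/s] SUCCESS    #"
STEP_RE = re.compile(
    r"Step\s+(\d+)\s+of\s+(\d+)\s+-\s+(.+?):\s+(?:\[[^\]]*\]\s+)?(SUCCESS|FAIL)\b"
)


def parse_preclear_steps(text):
    """Return a list of (step_number, description, status) tuples.

    Lines without a final SUCCESS/FAIL status (for example a step that is
    still in progress) are skipped.
    """
    results = []
    for line in text.splitlines():
        match = STEP_RE.search(line)
        if match:
            results.append(
                (int(match.group(1)), match.group(3).strip(), match.group(4))
            )
    return results


def failed_steps(text):
    """Return only the steps whose status is FAIL."""
    return [step for step in parse_preclear_steps(text) if step[2] == "FAIL"]
```

    Run against the output above, `failed_steps` would report only step 1, the signature verification.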
    

     

  8. Well, this looks promising. I accidentally disconnected from my wifi during the preclear from terminal and killed the connection. The preclear plugin kept it going, but it still failed. 

    I tried plugging my SAS drive into my server case using a mini-SAS to SAS cable to my LSI SAS2008, but it did not recognize my SAS drives at all. That cable will be going back tomorrow.

    I swapped the IOM6 controller for the original IOM3 it came with, and tried preclearing both my 6TB and a 300GB sas drive that came with it. (I know, why didn't I try all of these operations on that? I told you, I don't belong here.) This seems even more promising than eliminating the multipathing, lol. 

     

    It successfully verified unRAID's Preclear signature, which it would always fail on previously.

     

    #                                     unRAID Server Preclear of disk 5000cca02a47b314                                      #
    #                                       Cycle 1 of 1, partition start on sector 64.                                        #
    #                                                                                                                          #
    #                                                                                                                          #
    #   Step 1 of 4 - Verifying unRAID's Preclear signature:                                                        SUCCESS    #
    #   Step 2 of 4 - Writing unRAID's Preclear signature:                                                          SUCCESS    #
    #   Step 3 of 4 - Zeroing the disk:                                                        [0:45:14 @ 110 MB/s] SUCCESS    #
    #   Step 4 of 4 - Post-Read in progress:                                                                     (50% Done)    #
    #                                                                                                                          #

     

    Fingers crossed I can cancel my eBay orders of the Dell Compellent HB-SBB2-E601-COMP and the LSI SAS9200-16e I ordered 45 minutes ago out of desperation to get this thing working properly before I leave town for the weekend. Lol. 

  9. 2 hours ago, johnnie.black said:

    Dual connection on those shelves is for redundancy or SAS multipath, not for additional lanes, and it will cause problems with Unraid and dual port SAS disks since multipath is not supported.

    Ah! Gotcha. Well, I took the extra cable out, and it removed the redundant device, but it still seems to reinitialize to the system after every mbr clear during the last part of the clearing process. 

     

    I currently have some mini-SAS to SAS connectors on the way so that I can try preclearing the drive inside my server itself to see if it gives me any better results.

     

    I also am in the process of formatting it to 4K, as it's been formatted 512e the whole time. I highly doubt it'll have any effect on my issue, but I figured, one, it couldn't hurt, and two, if I do finally get this working, might as well have it in its native format.
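
    As a quick sanity check on the 512e vs. 4K question, the arithmetic can be sketched as below. The block count is the one the kernel log in this thread reports for the 6TB drive (11721045168 512-byte logical blocks). Whether the drive reports exactly the same total capacity after a reformat is an assumption, and the function name is only illustrative.

```python
LOGICAL_BLOCKS_512 = 11721045168  # from the kernel log line for this drive


def capacity_summary(blocks, block_size=512):
    """Convert a logical block count into byte, TB, TiB and 4K-block figures."""
    total_bytes = blocks * block_size
    return {
        "bytes": total_bytes,
        "tb": total_bytes / 10**12,          # decimal terabytes
        "tib": total_bytes / 2**40,          # binary tebibytes
        "blocks_4k": total_bytes // 4096,    # block count if addressed in 4096-byte blocks
        "evenly_divisible_by_4k": total_bytes % 4096 == 0,
    }


summary = capacity_summary(LOGICAL_BLOCKS_512)
print(f"{summary['tb']:.2f} TB / {summary['tib']:.2f} TiB")
print(f"4K blocks: {summary['blocks_4k']}")
```

    The TB and TiB figures line up with the "6.00 TB/5.46 TiB" the kernel prints, and the capacity divides evenly into 4096-byte blocks (one eighth of the 512-byte count).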

  10. Okay. So I got home and disconnected my bottom IOM6 controller. I then rebooted the server and power cycled the shelf. The shelf's noise level went back down to normal, but the drive still showed up twice. I unplugged the second QSFP cable and then the second drive assignment dropped off from the preclear plugin screen. Also, the Disk Log Information screen stopped showing the message "Oct 3 17:08:37 TOWER emhttpd: device /dev/sdq problem getting id"

    Quote

     

    Oct 3 17:08:15 TOWER kernel: sd 10:0:1:0: [sdq] Spinning up disk...
    Oct 3 17:08:33 TOWER kernel: sd 10:0:1:0: [sdq] 11721045168 512-byte logical blocks: (6.00 TB/5.46 TiB)
    Oct 3 17:08:33 TOWER kernel: sd 10:0:1:0: [sdq] 4096-byte physical blocks
    Oct 3 17:08:33 TOWER kernel: sd 10:0:1:0: [sdq] Write Protect is off
    Oct 3 17:08:33 TOWER kernel: sd 10:0:1:0: [sdq] Mode Sense: f7 00 10 08
    Oct 3 17:08:33 TOWER kernel: sd 10:0:1:0: [sdq] Write cache: enabled, read cache: enabled, supports DPO and FUA
    Oct 3 17:08:33 TOWER  kernel: sd 10:0:1:0: [sdq] Attached SCSI disk
    Oct 3 17:08:34 TOWER  emhttpd: device /dev/sdq problem getting id
    Oct 3 17:08:37 TOWER emhttpd: device /dev/sdq problem getting id
    OMITTED BC THIS MESSAGE REPEATS EVERY SECOND
    Oct 3 17:13:10 TOWER emhttpd: device /dev/sdq problem getting id

    Oct 3 17:13:11 TOWER emhttpd: device /dev/sdq problem getting id
    Oct 3 17:13:12 TOWER emhttpd: HUH728060AL5200_XXXXXXXX (sdq) 512 11721045168
    Oct 3 17:20:48 TOWER  preclear.disk: Pausing preclear of disk 'sdq'
    Oct 3 17:21:04 TOWER preclear.disk: Resuming preclear of disk 'sdq'
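
    Since the "problem getting id" message repeats every second, counting it per device is easier than reading the raw log. A minimal sketch, assuming the syslog text is available as a string and the message format matches the lines quoted above; the function name is hypothetical.

```python
import re
from collections import Counter

# Matches e.g. "Oct 3 17:08:37 TOWER emhttpd: device /dev/sdq problem getting id"
ID_ERROR_RE = re.compile(r"emhttpd: device (/dev/\w+) problem getting id")


def count_id_errors(log_text):
    """Count 'problem getting id' messages per device path."""
    counts = Counter()
    for line in log_text.splitlines():
        match = ID_ERROR_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

    A device with a large count here is one that emhttpd could not identify for a while, which matches the duplicate drive entry described in this post.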

     

     

    So looks like the bottom controller must be disconnected, as well as only one QSFP cable being connected.

     

    I will post updates once the preclear finishes. It's looking very promising though.

  11. 53 minutes ago, jonathanm said:

    Is it possible you have redundant or multipath connections to the shelf? If the same device shows up twice, unraid isn't going to be happy.

    I currently do have two qsfp cables connected to the top controller only, which should only give it more lanes for speed. I read that connecting the second IOM6 controller would result in redundancy, so I ensured that I did not connect the second one at any point. But it is plugged into the shelf itself. 

     

    Originally, when this first started happening, I only had one cable attached server <-> NetApp, so unless it's actually the second controller, I'm not sure that's it. 

  12. I recently bought a NetApp DS4243 disk shelf and upgraded to IOM6 controllers. I'm using the NetApp HBA x2065a-r6 with two QSFP cables running from the top controller to the card. Both controllers are currently installed, and the top two PSUs are on, if that makes any difference.

    Unraid sees the shelf fine, as well as the drives. It allows me to format the drives that came with it to a 512-byte sector size, and I can begin preclearing a 6TB HGST SAS drive I bought off eBay. However, whenever it gets to 99%, the drive assignments switch, and it can no longer find the drive by its old drive assignment and fails to write the MBR signature.

    When I go to preclear disks, it lists the drive twice under two different dev assignments (e.g., /dev/sdh and /dev/sdg).
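
    One way to confirm that two device names are really the same physical disk is to group devices by a stable identifier such as the WWN. The sketch below is a rough illustration, not an Unraid tool: `group_by_identifier` works on any name-to-identifier mapping, and `read_wwids` assumes a Linux kernel that exposes `/sys/block/<dev>/device/wwid` for SCSI/SAS disks (devices without that attribute are skipped).

```python
import os
from collections import defaultdict


def group_by_identifier(device_ids):
    """Return identifiers that map to more than one device name.

    device_ids: dict of device name (e.g. 'sdg') -> identifier (WWN/serial).
    """
    groups = defaultdict(list)
    for dev, ident in device_ids.items():
        groups[ident].append(dev)
    return {ident: sorted(devs) for ident, devs in groups.items() if len(devs) > 1}


def read_wwids(sys_block="/sys/block"):
    """Best-effort read of the wwid attribute for each block device (assumed sysfs layout)."""
    ids = {}
    if not os.path.isdir(sys_block):
        return ids
    for dev in os.listdir(sys_block):
        path = os.path.join(sys_block, dev, "device", "wwid")
        try:
            with open(path) as handle:
                ids[dev] = handle.read().strip()
        except OSError:
            continue  # not a SCSI disk, or the attribute is missing
    return ids


if __name__ == "__main__":
    for ident, devs in group_by_identifier(read_wwids()).items():
        print(f"{ident} appears as: {', '.join(devs)}")
```

    Any identifier printed with two device names points to a second path to the same disk, which is the multipath situation Unraid does not handle.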

    I've removed all other disks from the shelf to see if it shows up properly and finishes, but I get the same issue. I'm leaning towards it maybe being the HBA, but I'm merely taking stabs in the dark. I do have a second identical HBA I have not tried yet, and will probably do that once I get home from work today.

    Attached is my diagnostics zip. I'm hoping to get to the bottom of this before my return windows on some of these items close if possible. 

     

    Thank you for any assistance you can provide. 

    dragonsnest-diagnostics-20191003-1334.zip

  13. 34 minutes ago, trurl said:

    Of course that is going to compromise the rebuilds. Probably we should have suggested rebuilding to new disks so you didn't mess with the originals.

     

    I guess you can check connections of those disks and try again, but

     

    Do you have backups of anything important and irreplaceable? If not that should probably take priority over even the rebuilding. You must have another copy of anything important and irreplaceable on another system.

    Yes. I have off-site backups of my most important stuff. 
