[SOLVED] Can't write MBR signature after preclear using DS4246



I recently bought a NetApp DS4243 disk shelf and upgraded it to IOM6 controllers. I'm using the NetApp HBA x2065a-r6 with two QSFP cables running from the top controller to the card. Both controllers are currently installed, and the top two PSUs are on, if that makes any difference.

Unraid sees the shelf fine as well as the drives. It allows me to format the drives that came with it to 512b size, and I can begin preclearing a 6TB HGST SAS drive I bought off eBay. However, whenever it gets to 99%, the drive assignments switch and it can no longer find the drive by its old drive assignment and fails to write the MBR signature. 

When I go to preclear disks, it lists the drive twice under two different dev assignments (e.g. /dev/sdh and /dev/sdg).

I've removed all other disks from the shelf to see if it shows up properly and finishes, but I get the same issue. I'm leaning towards it being the HBA, but I'm merely taking stabs in the dark. I do have a second identical HBA I have not tried yet, and will probably do that once I get home from work today.

Attached is my diagnostics zip. I'm hoping to get to the bottom of this before my return windows on some of these items close if possible. 

 

Thank you for any assistance you can provide. 

dragonsnest-diagnostics-20191003-1334.zip

Edited by IDontBelongHere
Solved
53 minutes ago, jonathanm said:

Is it possible you have redundant or multipath connections to the shelf? If the same device shows up twice, unraid isn't going to be happy.

I currently do have two QSFP cables connected to the top controller only, which should only give it more lanes for speed. I read that connecting the second IOM6 controller would result in redundancy, so I made sure not to cable the second one at any point. It is plugged into the shelf itself, though.

 

Originally, when this first started happening, I only had one cable attached server <-> NetApp, so unless it's actually the second controller, I'm not sure that's it. 


Okay. So I got home and disconnected my bottom IOM6 controller. I then rebooted the server and power cycled the shelf. The shelf's noise level went back down to normal, but the drive still showed up twice. I unplugged the second QSFP cable and then the second drive assignment dropped off from the preclear plugin screen. Also, the Disk Log Information screen stopped showing the message "Oct 3 17:08:37 TOWER emhttpd: device /dev/sdq problem getting id"

Quote

 

Oct 3 17:08:15 TOWER kernel: sd 10:0:1:0: [sdq] Spinning up disk...
Oct 3 17:08:33 TOWER kernel: sd 10:0:1:0: [sdq] 11721045168 512-byte logical blocks: (6.00 TB/5.46 TiB)
Oct 3 17:08:33 TOWER kernel: sd 10:0:1:0: [sdq] 4096-byte physical blocks
Oct 3 17:08:33 TOWER kernel: sd 10:0:1:0: [sdq] Write Protect is off
Oct 3 17:08:33 TOWER kernel: sd 10:0:1:0: [sdq] Mode Sense: f7 00 10 08
Oct 3 17:08:33 TOWER kernel: sd 10:0:1:0: [sdq] Write cache: enabled, read cache: enabled, supports DPO and FUA
Oct 3 17:08:33 TOWER  kernel: sd 10:0:1:0: [sdq] Attached SCSI disk
Oct 3 17:08:34 TOWER  emhttpd: device /dev/sdq problem getting id
Oct 3 17:08:37 TOWER emhttpd: device /dev/sdq problem getting id
OMITTED BC THIS MESSAGE REPEATS EVERY SECOND
Oct 3 17:13:10 TOWER emhttpd: device /dev/sdq problem getting id

Oct 3 17:13:11 TOWER emhttpd: device /dev/sdq problem getting id
Oct 3 17:13:12 TOWER emhttpd: HUH728060AL5200_XXXXXXXX (sdq) 512 11721045168
Oct 3 17:20:48 TOWER  preclear.disk: Pausing preclear of disk 'sdq'
Oct 3 17:21:04 TOWER preclear.disk: Resuming preclear of disk 'sdq'
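For anyone wanting to check their own syslog for the same symptom, here is a minimal sketch that counts the "problem getting id" messages per device. The regular expression is only an assumption based on the lines quoted above, and `count_id_failures` is a hypothetical helper name, not part of Unraid.

```python
import re
from collections import Counter

# Pattern assumed from the emhttpd lines quoted in this post.
ID_FAILURE = re.compile(r"emhttpd: device (/dev/\w+) problem getting id")

def count_id_failures(log_text):
    """Return a Counter mapping device node -> number of 'problem getting id' lines."""
    return Counter(match.group(1) for match in ID_FAILURE.finditer(log_text))

# A few lines copied from the log above, used as sample input.
sample_log = """\
Oct 3 17:08:34 TOWER  emhttpd: device /dev/sdq problem getting id
Oct 3 17:08:37 TOWER emhttpd: device /dev/sdq problem getting id
Oct 3 17:13:10 TOWER emhttpd: device /dev/sdq problem getting id
Oct 3 17:13:12 TOWER emhttpd: HUH728060AL5200_XXXXXXXX (sdq) 512 11721045168
"""

for device, count in count_id_failures(sample_log).items():
    print(f"{device}: {count} id failures")
```

A device that keeps racking up these messages is a reasonable candidate for the duplicate path, though that is an inference from this thread rather than documented behavior.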

 

 

So it looks like the bottom controller must be disconnected, and only one QSFP cable can be connected.

 

I will post updates once the preclear finishes. It's looking very promising though.

Edited by IDontBelongHere
Clarity
On 10/3/2019 at 6:28 PM, IDontBelongHere said:

I currently do have two qsfp cables connected to the top controller only, which should only give it more lanes for speed.

Dual connection on those shelves is for redundancy or SAS multipath, not for additional lanes, and it will cause problems with Unraid and dual-port SAS disks since multipath is not supported.
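As a rough illustration of how a multipathed disk shows up, the sketch below groups device nodes by WWN and reports any WWN that appears on more than one node. The two WWNs are the disk IDs that appear in the preclear output in this thread, but pairing them with `sdg`, `sdh` and `sdi` is purely hypothetical, and `find_multipath_duplicates` is an illustrative name. On a real system the pairs could be collected from something like `lsblk -o NAME,WWN`, assuming the drives report a WWN.

```python
from collections import defaultdict

def find_multipath_duplicates(devices):
    """Given (device_name, wwn) pairs, return {wwn: [names]} for WWNs seen more than once."""
    names_by_wwn = defaultdict(list)
    for name, wwn in devices:
        names_by_wwn[wwn].append(name)
    return {
        wwn: sorted(names)
        for wwn, names in names_by_wwn.items()
        if len(names) > 1
    }

# Hypothetical inventory: one disk reachable over two paths, one over a single path.
sample_devices = [
    ("sdg", "5000cca02a47b314"),
    ("sdh", "5000cca02a47b314"),
    ("sdi", "5000cca23c0ff7ec"),
]

print(find_multipath_duplicates(sample_devices))
```

An empty result would suggest each disk is visible over a single path only, which is what Unraid expects.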

2 hours ago, johnnie.black said:

Dual connection on those shelves is for redundancy or SAS multipath, not for additional lanes, and it will cause problems with Unraid and dual-port SAS disks since multipath is not supported.

Ah! Gotcha. Well, I took the extra cable out, and it removed the redundant device, but the drive still seems to reinitialize to the system after every MBR clear during the last part of the clearing process.

 

I currently have some mini-SAS to SAS connectors on the way so that I can try preclearing the drive inside my server itself to see if it gives me any better results.

 

I am also in the process of formatting it to 4K, as it's been formatted 512e the whole time. I highly doubt it'll have any effect on my issue, but I figured, one, it couldn't hurt, and two, if I do finally get this working, I might as well have it in its native format.
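For reference, the numbers behind the 512e vs. 4K difference can be worked out from the block count in the kernel log earlier in the thread. This is only back-of-the-envelope arithmetic, and it assumes the total capacity stays the same after a reformat to 4K sectors, which may not hold exactly on every drive.

```python
# Logical block count reported by the kernel for this drive (512-byte blocks).
LOGICAL_BLOCKS_512 = 11721045168

def capacity_summary(blocks, block_size=512, new_block_size=4096):
    """Return total bytes, decimal TB, binary TiB, and the block count at the new size."""
    total_bytes = blocks * block_size
    return {
        "bytes": total_bytes,
        "TB": round(total_bytes / 1e12, 2),
        "TiB": round(total_bytes / 2**40, 2),
        # Assumption: capacity is unchanged by the reformat.
        "blocks_at_new_size": total_bytes // new_block_size,
    }

print(capacity_summary(LOGICAL_BLOCKS_512))
```

The TB and TiB figures line up with the "(6.00 TB/5.46 TiB)" the kernel printed, and the 4K block count is simply one eighth of the 512-byte count.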


I bought two NetApp HBAs. I just noticed they're slightly different versions. The one I had before was P8001 rev. 3. The one I swapped it with was rev. 5. I'm running another preclear, skipping the pre-read for the sake of testing.

 

If this still fails, the next step is to try a different controller. I bought 4 IOM6 controllers, and the shelf came with 2 IOM3 controllers.


 

9 hours ago, johnnie.black said:

It's likely related to the enclosure, but if you are using the preclear plugin, try the script instead just to rule out some plugin issue.

Trying that now. My mini-SAS to SAS cable should be here today and should hopefully allow me to clear them. Maybe the enclosure can't preclear them, but hopefully it can still host already cleared drives.


Well, this looks promising. I accidentally disconnected from my wifi during the preclear from terminal and killed the connection. The preclear plugin kept it going, but it still failed. 

I tried plugging my SAS drive into my server case using a mini-SAS to SAS cable to my LSI SAS2008, but it did not recognize my SAS drives at all. That cable will be going back tomorrow.

I swapped the IOM6 controller for the original IOM3 it came with, and tried preclearing both my 6TB and a 300GB SAS drive that came with it. (I know, why didn't I try all of these operations on that? I told you, I don't belong here.) This seems even more promising than eliminating the multipathing, lol.

 

It successfully verified unRAID's Preclear signature, which it would always fail on previously.

 

#                                     unRAID Server Preclear of disk 5000cca02a47b314                                      #
#                                       Cycle 1 of 1, partition start on sector 64.                                        #
#                                                                                                                          #
#                                                                                                                          #
#   Step 1 of 4 - Verifying unRAID's Preclear signature:                                                        SUCCESS    #
#   Step 2 of 4 - Writing unRAID's Preclear signature:                                                          SUCCESS    #
#   Step 3 of 4 - Zeroing the disk:                                                        [0:45:14 @ 110 MB/s] SUCCESS    #
#   Step 4 of 4 - Post-Read in progress:                                                                     (50% Done)    #
#                                                                                                                       

 

Fingers crossed I can cancel my eBay orders of the Dell Compellent HB-SBB2-E601-COMP and the LSI SAS9200-16e I ordered 45 minutes ago out of desperation to get this thing working properly before I leave town for the weekend. Lol. 


Well, my 300GB drive continues to clear fine, but my 6TB drive continues to fail at the same point. I got another 6TB in the mail today. I'm currently preclearing it as well. Hopefully I have better luck with it.


 	
############################################################################################################################
#                                                                                                                          #
#                                     unRAID Server Preclear of disk 5000cca23c0ff7ec                                      #
#                                       Cycle 1 of 1, partition start on sector 64.                                        #
#                                                                                                                          #
#                                                                                                                          #
#   Step 1 of 4 - Verifying unRAID's Preclear signature:                                                           FAIL    #
#   Step 2 of 4 - Writing unRAID's Preclear signature:                                                          SUCCESS    #
#   Step 3 of 4 - Zeroing the disk:                                                       [10:31:36 @ 158 MB/s] SUCCESS    #
#                                                                                                                          #
#                                                                                                                          #
#                                                                                                                          #
#                                                                                                                          #
#                                                                                                                          #
#                                                                                                                          #
#                                                                                                                          #
#                                                                                                                          #
#                                                                                                                          #
############################################################################################################################
#                              Cycle elapsed time: 10:31:37 | Total elapsed time: 10:31:37                                 #
############################################################################################################################

--> FAIL: unRAID's Preclear signature not valid.


ssmtp: Cannot open smtp.live.com:587
cat: /tmp/.preclear/sdi/smart_error: No such file or directory
ls: cannot access '/sys/block/sdi': No such file or directory
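The last line above is the giveaway that the drive dropped off and came back under a different name. A minimal sketch for finding where a disk ended up is to resolve its persistent link instead of trusting the `sdX` letter. This assumes udev has created `wwn-0x...` links under `/dev/disk/by-id` for the drive, which may not be the case on every setup; `current_node_for_wwn` is a hypothetical helper name.

```python
import os

def current_node_for_wwn(wwn, by_id_dir="/dev/disk/by-id"):
    """Return the kernel name (e.g. 'sdj') a WWN link currently points to, or None."""
    link = os.path.join(by_id_dir, f"wwn-0x{wwn}")
    if not os.path.islink(link):
        # No persistent link found: drive absent, or udev did not create one.
        return None
    # The link target is a node such as ../../sdj; keep only the final component.
    return os.path.basename(os.path.realpath(link))

# WWN taken from the preclear banner above; prints None if the link is absent.
print(current_node_for_wwn("5000cca23c0ff7ec"))
```

If the name returned here differs from the one preclear started with, the device was re-enumerated partway through the run.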

 


UPDATE: So, after trying 3 or 4 different IOM6 controllers, an IOM3 controller, and two different NetApp HBAs, I decided to take the shelf with me back home to Atlanta when I went over the weekend. I picked up another DS4243 along the way for a friend back home. I tried using the same HBAs and controllers as I had used previously, but with the new shelf. His server gave the option to format the drives, which was greyed out on my server.

I also had attached my original DS424* to his server using a second port on the NetApp HBA. It didn't seem to be recognized at all. 

I ran a preclear on all 4 drives, and when the fastest drive got to 99% of the clear portion, the JBOD crapped its pants and reassigned the drive to a new assignment. 

Luckily, I had waiting for me at home a Dell Compellent controller that had been delivered the day I left, as well as an LSI SAS9200-16e HBA and an SFF-8088 cable to connect the two. Upon returning home, I swapped those parts in and started a preclear on all 4 drives. 19 hours later, my fastest drive precleared and post-read successfully!

 

So, I'm not sure if it was the NetApp HBA or the NetApp controllers, but one of the two, if not both, was the culprit.

 

Edited by IDontBelongHere
  • 3 weeks later...
14 hours ago, crowdx42 said:

Hi,

So I am thinking of getting the same disk shelf. Do you happen to have a link to the actual setup that worked for you? Am I correct in reading that only a single cable is needed between the controller card and the disk shelf?
Thanks in advance.

Patrick

Dell Compellent HB-SBB2-E601-COMP 0952913-07 6GB SAS EBOD Controller 

(https://www.ebay.com/itm/Dell-Compellent-HB-SBB2-E601-COMP-0952913-07-6GB-SAS-EBOD-Controller/372797485227)

 

LSI SAS9200-16e 16-Port External HBA Full-Height PCIe P20 IT Mode ZFS FreeNAS  (http://www.ebay.com/itm/323724809849)

 

CABLEDECONNMini SAS26P SFF-8088 to SFF-8088 1M External Cable Attached SCSI (https://www.amazon.com/dp/B00S7KTXW6)

  • JorgeB changed the title to [SOLVED] Can't write MBR signature after preclear using DS4246
