
Transplanted Array has drives inconsistently missing plus other issues



I am trying to get an array going in a new case/motherboard/CPU (everything but the HDDs), plus I added a couple of drives, which seemed to work. Now every time I restart, either things are good and I see all but one drive (which may really be bad), or 4-5 drives are not found and I am shown the serial number of each drive that was expected. I am tearing my hair out and feeling sick from trying to figure this out, so it's time to ask for help.

 

I had the drives that weren't in the existing array running preclears in the new chassis before the transplant. Some didn't pass, but those did not get incorporated with the previous array's drives. The one drive that is missing when things are mostly working was precleared just in the past couple of days. Unraid says the drive is missing and that it is emulated. When I browse the contents of that drive, there is nothing.

 

I have 2 LSI 9211-8i SAS cards which are new to me (the eBay seller says they are new). At first they would freeze on their BIOS screens, so I removed the BIOS using sas2flash. I had to do this in another computer, as I couldn't get sas2flash to find any cards on my NAS.

 

New System:

Norco 2442 24 Drive bay chassis

1920X Threadripper

Noctua NH-U9 TR4-SP3

ASrock Fatal1ty X399 Professional Gaming

CORSAIR Vengeance LPX 32GB (4 x 8GB) 288-Pin DDR4 SDRAM DDR4 3000 (PC4 24000) AMD X399 Compatible Desktop Memory Model CMK32GX4M4C3000C15 * I initially had this inserted not all the way (I hate single sided ram clips), after fixing it I ran memtest for maybe 2 hours, which I know is too short, no issues

2x LSI 9211-8i SAS cards, plus 2 reverse breakout SATA->SAS cables from the motherboard's 8 SATA ports

3 120mm and 2 80mm Noctuas

1 old and 1 new AMD graphics cards

SAMSUNG 970 EVO M.2 2280 500GB PCIe Gen3. X4, NVMe 1.3 64L V-NAND 3-bit MLC Internal Solid State Drive (SSD) MZ-V7E500BW

 

pcie order from the top (io shield end) to bottom

1 newish amd card

2 lsi card

3 old amd card

4 lsi card

 

One thing I noticed while trying to figure out what was broken (which drive bays) was that Unraid can take 3-5 minutes to boot. Sometimes it doesn't boot at all, and sometimes it does boot, but in the GUI boot mode the webpage won't load; it just says something like localhost can't be found. During boot there are a bunch of SAS-related (I think) errors, which it says it is correcting.
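One way to narrow down whether the dropouts follow a particular controller or set of bays is to note which serials go missing on each boot and tally them per controller. The following is only a minimal sketch of that bookkeeping: the serials, the controller labels, and the function name `count_missing_by_controller` are hypothetical placeholders, not values taken from the attached diagnostics.

```python
from collections import Counter


def count_missing_by_controller(boots, drive_to_controller):
    """Tally missing-drive events per controller across several boots.

    boots: list of sets; each set holds the serials reported missing on one boot.
    drive_to_controller: dict mapping a drive serial to a controller label.
    Serials without a known controller are counted under "unknown".
    """
    counts = Counter()
    for missing in boots:
        for serial in missing:
            counts[drive_to_controller.get(serial, "unknown")] += 1
    return counts


# Hypothetical example data: replace with your own notes from each boot.
drive_to_controller = {
    "SERIAL_A": "lsi-1",
    "SERIAL_B": "lsi-1",
    "SERIAL_C": "lsi-2",
    "SERIAL_D": "onboard-sata",
}
boots = [
    {"SERIAL_A"},
    {"SERIAL_A", "SERIAL_B", "SERIAL_C"},
    {"SERIAL_A", "SERIAL_B"},
]

counts = count_missing_by_controller(boots, drive_to_controller)
for controller, n in counts.most_common():
    print(controller, n)
```

If the misses cluster on one controller or cable, that points at that path; if they are spread across everything attached to both cards, a board-level or slot-level problem becomes more likely.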

 

The only thing I can think to do is try upgrading the SAS card firmware. I didn't update it in sas2flash when I cleared the BIOS. I took a picture of the config utility, and it says they are both on firmware 20.00.00.00-IT. I believe there is a newer version, which I am happy to try if instructed.

 

My primary goal is to not lose any files. My secondary goal is to get the new hardware running as perfectly as my old hardware, with 24 drive bays in use.

 

Thank you so much to anyone who tries to help!

tower-diagnostics-20181006-2215.zip


Small update: I wanted to get a diagnostics file from when all but the one trouble drive show up. I have been booting into GUI mode, but these couple of times I was not, and it froze during boot. So there seems to be an issue when I don't boot in GUI mode. Also, I just noticed it says I could do the following to the trouble drive:

Format will create a file system in all Unmountable disks, discarding all data currently on those disks.

 

I don't think anything was stored on it, unless Docker put something there. Is this safe to do? Of course, the next time I boot I will probably have 5 missing drives...

 

Edit: I just remembered that the SAS config utility said I had 9200-8e cards. The cards have internal connectors, so the "e" is weird. Is this a problem? I found that firmware version 20-20.00.07.00 exists, so should I try it? Did they flash my cards wrong?

tower-diagnostics-20181006-2326.zip

MVIMG_20181003_201126.jpg

Edited by bobobeastie

Update 2: I updated the firmware to 20.00.07.00, with no change. The first time it wouldn't boot. The second time it looked good except for the 1 disk, and the array was started. I wanted to see what my options were if I stopped the array. After I did that, the errors rolled in, and the screen where you choose drives now looks like it does at the times when I boot and immediately have issues.

tower-diagnostics-20181007-0156.zip


Ugh, now I'm reading that ASRock X399 boards are not compatible with LSI SAS controllers. I suppose my options are to replace the motherboard or the SAS controller. If I go with replacing the motherboard, I wonder what Newegg will do? 15% restocking fee maybe... OK, not too bad. If I were to replace the SAS controller, are there working alternatives with this board?

 

I would love to get some advice from someone more practiced in these things. Thanks.


Just chatted with Newegg, and they will let me return it with a 15% restocking fee. So now I need to choose a replacement. My criteria are ATX and 8 SATA ports, and I would choose 10 GbE if there were any boards that matched all 3. I'm leaning toward the Designare, but I don't see my RAM on its QVL. With the Designare's or Aorus 7's 5 PCIe slots, I might even be able to add a 10 GbE NIC alongside 2 LSI SAS cards and 2 graphics cards.


I ordered a Designare and sent back the ASRock. I am wondering what I should do with my array tomorrow when I hook it up. I think 1 of the drives had an attempted write after having been precleared, and the write didn't work, so Unraid emulated the drive from parity. Should I rebuild parity? Can I go back to before I added precleared drives to the array? If I can do this, I will just do another round of preclear on all new drives. The only things I did on the ASRock board were preclear drives, add them to the array, and set cache to yes. It was after adding them to the array that it became clear there were real issues. Thanks.


If I'm reading the situation correctly, I think what I would do if I were you is add all the drives you wish to use as data drives, and omit all parity drives for the first test. See if things work as expected. Maybe do a read check to see if any drives pop errors.

 

If I misunderstood and you have a failed drive that was being emulated, then I'd assign all the drives exactly the way they were assigned when that occurred, and see if it still emulates properly.
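Reassigning the drives exactly as they were is easier to get right if the slot-to-serial assignments are written down and compared against what the new board actually detects. Below is a minimal sketch of that comparison, assuming you keep your own notes of which serial was in which slot; the slot names, serials, and the helper `compare_assignments` are hypothetical and are not read from Unraid itself.

```python
def compare_assignments(saved, detected_serials):
    """Compare recorded slot assignments against currently detected drives.

    saved: dict mapping a slot name (e.g. "disk1") to the serial recorded for it.
    detected_serials: iterable of serials the system currently sees.
    Returns (missing, unassigned):
      missing    -> dict of slot -> serial for recorded drives not detected
      unassigned -> sorted list of detected serials not recorded in any slot
    """
    detected = set(detected_serials)
    missing = {slot: serial for slot, serial in saved.items()
               if serial not in detected}
    unassigned = sorted(detected - set(saved.values()))
    return missing, unassigned


# Hypothetical notes taken before the hardware swap.
saved = {"parity": "SERIAL_P", "disk1": "SERIAL_1", "disk2": "SERIAL_2"}
# Hypothetical list of serials seen after the swap.
detected_serials = ["SERIAL_P", "SERIAL_2", "SERIAL_NEW"]

missing, unassigned = compare_assignments(saved, detected_serials)
print("missing:", missing)
print("unassigned:", unassigned)
```

Any slot listed as missing should be sorted out before starting the array, and any unassigned serial is a drive that was not part of the recorded layout.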


Great, thank you so much for answering. I think what you are saying is that in the drive selection screen, when the array is not mounted, I should select none for the parity drive, then follow the instructions for checking a file system, which I just found here: https://wiki.unraid.net/Check_Disk_Filesystems. Is that correct? I have no reason to think that any of the drives were written to, but possibly Unraid was doing something, or maybe something was incorrectly "corrected" as a result of the SAS card errors. So this sounds good. I think it is pretty unlikely that any of the 99-100% full drives were written to, and the newly precleared drives were empty.

 

Do I need to do anything to the drive which had just been precleared by the unstable system and was marked with a red X? I'm guessing I can address the check disk results, maybe preclear the red X drive, and then add 1 or 2 parity drives after I am satisfied with the results? Until now I have only added precleared drives over the course of a year and once replaced a drive with a larger one, so this is new to me.

 

Thank you.  

