
Help Needed - Can't Run Parity Check



Running unRAID v5.0 Pro

Controllers: 2 X LSI SAS 9207-8i (MPT2BIOS v7.29.00.00 - 2012-11-12 / FW Rev 15.00.00.00-IT)

 

I think the main problem is this: "kernel: mpt2sas0: _base_fault_reset_work : SAS host is non-operational !!!!"

 

Please note that the attached syslog has been truncated.  I cut a LOT of repeated lines from the end (the same disk read error over and over), since they aren't needed to see the problem.
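
For anyone digging through a long syslog like this one, here is a minimal Python sketch that collapses runs of repeated messages and pulls out the controller fault lines, rather than trimming the file by hand. It assumes the usual "Mon DD HH:MM:SS host message" syslog layout; the function names and the file path in the usage note are made up for illustration and are not part of unRAID.

```python
import re
from itertools import groupby

# Assumption: each line starts with "Mon DD HH:MM:SS hostname ".
# That prefix is stripped so lines differing only by timestamp compare equal.
PREFIX = re.compile(r"^\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2}\s+\S+\s+")


def collapse_repeats(lines):
    """Return (message, count) pairs for runs of consecutive identical messages."""
    messages = (PREFIX.sub("", line.rstrip("\n")) for line in lines)
    return [(msg, sum(1 for _ in run)) for msg, run in groupby(messages)]


def find_controller_faults(lines, needle="SAS host is non-operational"):
    """Return every line containing the mpt2sas fault text quoted above."""
    return [line.rstrip("\n") for line in lines if needle in line]
```

To use it, read the file with something like `lines = open("syslog").readlines()` (the path is a placeholder), then print each pair from `collapse_repeats(lines)`. Only consecutive duplicates are merged, so the order of events is preserved.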

 

I removed 6 x 3 TB drives from my array, re-initialized the array, and re-assigned the drives.  I started the array and it began a parity check.  It ran fine for a few minutes, and I came back an hour or so later to check on the status.  The main status window shows huge read/write/error numbers.  It also says that parity is valid.  It is not, so I need to recalculate the parity.  All data on all 8 x 4 TB drives is valid and present.

 

- I think there is a FW v17 available for the SAS controllers, but I don't know how to flash them (I haven't researched yet).  Nor do I know if I *need* to upgrade the FW.

- I replaced ALL cables including the SAS-to-SATA cables with brand new ones.

- Parity drive and Cache drive are connected to motherboard via SATA cables; not to the SAS controllers.

- I reseated all power cables to all drives.  All drives show up in the BIOS and are accessible if I mount them manually.

- The 6 x 3 TB drives were empty.  I moved the data to another server.  They were removed from the main share, but not from the array until I did the re-init of the array.

 

My gut feeling is that one of the controllers is not configured correctly, is going bad, or isn't compatible with unRAID.  Or I have a bad motherboard or a bad PCIe slot on the motherboard.  Since I'm down to only 8 data drives, I could move all of them over to the other controller (the one that was used with the 3 TB drives) and see if I get the same problem.  I also have another controller, new in the box, that I could swap in.

 

I've had this problem before, but each time I shut down, opened the case, made sure all cables were pushed in and secure, rebooted, and re-ran parity without problems.

 

Any help or suggestions would be appreciated.  Thanks!

syslog.zip

Main_Screen_Snapshot.jpg.fdb348dae9c28b3c7323ac348c1e4951.jpg


I verified that my motherboard BIOS is up-to-date.

 

I have downloaded the latest BIOS and FW files for my controllers.  I'll need to create a USB boot disk of some sort, or move the controllers to a Windows machine with a PCIe slot, to perform the updates.  Funny how laptops and SFF machines have now taken over my house, leaving me with no spare PCIe slots.  :(


I updated the Firmware then updated the BIOS without problems.  unRAID started up just fine and now parity is being rebuilt.  It has been going for about 3 hours without problems.

 

One thing I did notice is that the LSI controllers were getting HOT, even in the short time it took to do the firmware/BIOS upgrade.  Afterwards, I powered off and made sure the cards were seated, and they felt hot to the touch.  So I moved a 120mm fan so that it blows directly down on the two controllers.  I've got good airflow in the case, but I think they may have needed a little more help.  I'm wondering if this could have been the problem all along.  However, I had just moved 18 TB of data, so you would think they would have overheated during those transfers.

 

Once parity finishes, I have 2 more drives to preclear and add to the array.  I'll give it a week and multiple reboots before I feel comfortable.

Fan.jpg.43465774b71747896ac1f1bf184d3dcf.jpg


Parity has been running for at least 24 hrs now and is at 40%.  I was getting about 157 MB/s, and then it reached the point where it needed to recalculate parity and dropped to about 11 MB/s.  I don't remember the throughput ever going that low, but I'm going to let it continue to the end.
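
To put those two speeds in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes a 4 TB parity drive (matching the 4 TB data drives, which is my assumption, not a measured value), decimal units (1 TB = 10^12 bytes), and a constant speed for the remainder of the run, which real parity syncs don't hold.

```python
def hours_remaining(total_tb, fraction_done, speed_mb_s):
    """Estimate hours left, assuming a constant speed for the rest of the run."""
    remaining_bytes = total_tb * 1e12 * (1 - fraction_done)
    seconds = remaining_bytes / (speed_mb_s * 1e6)
    return seconds / 3600


# Assumed: 4 TB parity drive, 40% complete, at the two speeds seen so far.
for speed in (157, 11):
    print(f"{speed} MB/s: about {hours_remaining(4, 0.40, speed):.1f} h left")
```

Under those assumptions the remaining 60% would take roughly 4 hours at 157 MB/s but about 60 hours at 11 MB/s, so the slowdown matters a lot if it persists.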

 

I'll run some drive speed checks, smart tests, and reiserfs checks once everything settles down.

 

"It might not have been seated properly the whole time."

 

Well, at one point I removed the controllers and verified there was no debris in the PCIe slots, no damage to the card edges, etc.  I've checked and wiggled them several times to make sure they were seated, and each time they were.  That was one of my initial concerns.


