[SOLVED] Parity Drive Red Balled during Parity Check

June 21, 2014

I am in the process of migrating an existing array created under 4.7 Plus to 5.0.5 after experiencing data drive failure on the old setup. The array had been reset to a new config that didn't include any of the previously failed disks, and the original parity drive was re-assigned as the parity drive in the new config. A parity sync was running with an estimated completion time of around 4 hours (the drives are all 2TB). It had been running for less than 30 minutes when I heard a drive spin down, and I noticed the sync had stopped and the parity drive was now red balled.

I have included a syslog and smartctl reports for all the connected drives, but it looks like the parity drive has died: it could not be read. While the parity sync was running I was adjusting the NFS shares; I'm not sure whether this interfered with the sync in some way.

Syslog_21062014_.txt

smart_reports.zip

June 21, 2014

Author

I have checked the drive in my Windows PC and it looks OK; I have included a screenshot of the SMART values below.

This drive is connected to the main backplane of my server, and all power connections and cabling have previously been checked, so I don't know why the drive would suddenly drop out like that. I've already sent 2 drives off for RMA this week; surely I can't be so unlucky as to have a 3rd die :-\ ?

June 21, 2014

It looks as if whatever drive is /dev/sdc dropped off-line.

Since it subsequently came back without any obvious errors indicated in the smart report I would expect this to be a cable or power issue rather than a failing disk. It could be an insufficient power issue - what model is your power supply and what is it rated to supply? I would also re-check your cabling. Assuming these all look OK you should probably simply try the parity sync again.

June 21, 2014

Author

It didn't come back though itimpi. What I did was remove it from the server and place it in my PC to get those SMART values in the screenshot. There's lots of write errors at the end of the Syslog I posted, I assume they relate to the parity drive.

I have an HP N36L Microserver with a 150W PSU (one of those SFF proprietary jobs). I did have the new disk plugged in but not assigned, and it's a 7200RPM drive whereas all my others, with the exception of the cache (a laptop drive), are 5900RPM. I expect the 7200RPM drive will draw a bit more power, but given I have removed one drive from the overall count I figured it would still be OK.

I have 5 x 3.5" drives and 1 x 2.5" drive plugged in currently; the old setup before all the trouble started had 6 x 3.5" and 1 x 2.5" and had run happily for 3 years that way. I'll check the cabling again to be sure. Each port on the backplane has its own power connection.

June 21, 2014

It didn't come back though itimpi. What I did was remove it from the server and place it in my PC to get those SMART values in the screenshot. There's lots of write errors at the end of the Syslog I posted, I assume they relate to the parity drive.

Are you saying that it did not come back when you powered down the server, checked the connections, and then powered it back up? My experience is that once a drive goes offline a power-cycle is required to get it running again. The syslog shows that it dropped offline just before you started getting the write errors.
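To check the ordering described here (the drive dropping offline first, the write errors following), one option is to filter the syslog down to the lines that mention the device. What follows is only a minimal sketch in Python: the function name `lines_mentioning` is made up for illustration, it assumes the syslog is a plain text file, and it matches on the device name alone (`sdc` in this thread) rather than on any particular kernel message wording. Device letters can change between boots, so the name should be confirmed against the current assignment first.

```python
import re
from typing import Iterable, Iterator, Tuple


def lines_mentioning(lines: Iterable[str], device: str) -> Iterator[Tuple[int, str]]:
    """Yield (line_number, text) for every line that names the given device.

    `device` is a bare kernel name such as "sdc" (an assumption: adjust it
    to whatever the array currently reports). Partition names such as
    "sdc1" also match, but longer names such as "sdca" do not.
    """
    pattern = re.compile(
        r"(?<![A-Za-z0-9])" + re.escape(device) + r"\d*(?![A-Za-z0-9])"
    )
    for number, line in enumerate(lines, start=1):
        if pattern.search(line):
            yield number, line.rstrip("\n")


def first_mention(lines: Iterable[str], device: str):
    """Return the first (line_number, text) naming the device, or None."""
    return next(lines_mentioning(lines, device), None)
```

Passing an open file works because a file object iterates line by line, for example `lines_mentioning(open("Syslog_21062014_.txt", errors="replace"), "sdc")`. Reading the matches in order should show where the device first reports trouble relative to the later write errors.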

June 21, 2014

Author

Are you saying that it did not come back when you powered down the server, checked the connections, and then powered it back up? My experience is that once a drive goes offline a power-cycle is required to get it running again. The syslog shows that it dropped offline just before you started getting the write errors.

I didn't think to try a re-boot, just powered down and took the drive out. Thanks for looking at the syslog, I tried to read it but got a bit lost. I'm just letting a surface scan complete on the affected drive at the moment then will try it in the UnRAID server again. I hope my PSU isn't failing, they don't seem that easy to replace.

June 21, 2014

OK. I would have been surprised if a reboot had not brought it up since you stated it worked fine on another machine.

I must admit that a 150W supply does seem borderline for the number of drives you have. It probably struggles when trying to initially spin them up, but as long as there are no subsequent issues it can keep them spinning. However, any glitch might mean a drive can drop offline. Perhaps others who use this hardware can say how many drives they run, so we can see whether a similar number works for them without problems. I do know (from my own experience) that a power supply running at or over its maximum rating can cause a drive to unpredictably drop offline, particularly when the system is under maximum load when creating or checking parity.

Just for interest, what is the 2.5" drive being used for? If it is something like a cache drive then you might want to consider running without it, or use your current problems as an excuse to see if you can pick up a cheap SSD for this purpose, as that should slightly reduce maximum power consumption.
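The spin-up versus steady-state point can be put into rough numbers. The sketch below is in Python and every wattage in it is an assumed placeholder, not a measurement of this hardware: check the drive datasheets and the PSU label for real values. The names `DriveClass` and `power_estimate` are invented for illustration. It simply adds a base system load to the per-drive draw and compares the totals with the PSU rating.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class DriveClass:
    name: str
    spinup_watts: float  # assumed peak draw while the platters spin up
    active_watts: float  # assumed draw during sustained reads and writes


# ASSUMED illustrative figures only; real drives vary by model.
LARGE_HDD = DriveClass('3.5" HDD', spinup_watts=25.0, active_watts=8.0)
SMALL_HDD = DriveClass('2.5" HDD', spinup_watts=5.0, active_watts=2.5)


def power_estimate(
    drives: Iterable[Tuple[DriveClass, int]],
    base_watts: float,
    psu_watts: float,
) -> Dict[str, float]:
    """Estimate total draw with all drives spinning up at once, and with
    all drives busy (as in a parity sync), against the PSU rating.

    `base_watts` stands for the board, CPU, RAM and fans and is another
    assumption to be replaced with a measured value.
    """
    drives = list(drives)
    spinup = base_watts + sum(d.spinup_watts * n for d, n in drives)
    active = base_watts + sum(d.active_watts * n for d, n in drives)
    return {
        "spinup_watts": spinup,
        "active_watts": active,
        "spinup_headroom": psu_watts - spinup,
        "active_headroom": psu_watts - active,
    }
```

Calling `power_estimate([(LARGE_HDD, 5), (SMALL_HDD, 1)], base_watts=35.0, psu_watts=150.0)` models the current 5 x 3.5" plus 1 x 2.5" layout. A negative headroom value means the assumed load exceeds the rating in that phase. An ageing PSU could be modelled by passing a lower `psu_watts`.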

June 21, 2014

Author

I must admit that a 150W supply does seem borderline for the number of drives you have.

To be fair the box only comes with 4 internal HDD bays and I have been naughty and squeezed 7 drives in mine. It's quite a common thing in the Microserver world though. I'll ask around and see if others have had issues, there's also the age of the PSU to consider (3yo now).

Just for interest, what is the 2.5" drive being used for?

It's my cache drive, a WD Black. I added it when I found that write speeds were a bit on the slow side running cacheless. An SSD might be the go, but I'm wondering if the amount of writes generated in a cache role might kill it prematurely. Also, my cache drive is 500GB, which is still not cheap in SSD land.

June 21, 2014

I must admit that a 150W supply does seem borderline for the number of drives you have.

To be fair the box only comes with 4 internal HDD bays and I have been naughty and squeezed 7 drives in mine. It's quite a common thing in the Microserver world though. I'll ask around and see if others have had issues, there's also the age of the PSU to consider (3yo now).

I would guess then that the power supply is not rated for that many drives being used in anger. If only being used in something like Windows, then you are probably never hammering all drives simultaneously so get away with it.

I used to have a TeraStation NAS with a 150W supply. This has 4 x 3.5" 7200 rpm drives configured as a RAID-5 array (so all drives are in use simultaneously). I found that I could just get away with adding a 2.5" drive (for running local apps), but that an additional 3.5" drive would cause problems.

June 21, 2014

Author

I would guess then that the power supply is not rated for that many drives being used in anger. If only being used in something like Windows, then you are probably never hammering all drives simultaneously so get away with it.

I used to have a TeraStation NAS with a 150W supply. This has 4 x 3.5" 7200 rpm drives configured as a RAID-5 array (so all drives are in use simultaneously). I found that I could just get away with adding a 2.5" drive (for running local apps), but that an additional 3.5" drive would cause problems.

I've had 2 drives preclearing at once in it before and previous parity syncs went OK. The age of the PSU may be playing a part.

My plan is to replace all the 2TBs with 4TBs over time so eventually it would just be the 4 x 3.5" and the 1 x 2.5". Can't do anything till I at least get the parity swapped out though. The surface scan is about halfway through now, no red blocks so far.

June 21, 2014

I have had 7 drives in my N54L since I got it: 6 x 3.5" 5900rpm drives and 1 x 7200rpm laptop drive. Earlier in its life I had 4 x 3.5" 5400rpm drives, 6 x 5400rpm laptop drives and 1 x 7200rpm laptop drive in it, all with no problems. That was pushing it. The N36L is designed for 4 7200rpm drives and 1 ODD so it should be able to handle 6 7200 rpm drives. But the power supply in a N36L would be getting older and weaker now so the number of drives possible would drop.

June 21, 2014

Author

The N36L is designed for 4 7200rpm drives and 1 ODD so it should be able to handle 6 7200 rpm drives. But the power supply in a N36L would be getting older and weaker now so the number of drives possible would drop.

Thanks BobPhoenix for the input. I am looking at a replacement 300W PSU on eBay that is the same form factor; I'm just trying to get info on the cable lengths to make sure everything will reach.

After reseating all the power connections I tried the resync again and it appears to have worked. All the drives are now green balled and parity is listed as being in sync. I have attached the syslog; from my limited understanding it looks good.

syslog22062014.txt

July 6, 2014

Author

Marking this as solved because the upgrade to 5.0.5 and swapping out the bad disks has worked. The original parity drive has been swapped out for a larger drive as part of the upgrade process; however, I believe the original issue was power related.
