More problems with Supermicro server - three failed drives, on parity one data,



When it happened before, more than these two drives dropped off the array, there were additional issues, and I had an error message on my console. This time it is consistent: the same two drives have come up with errors again.

 

If I pull the second parity drive, run a preclear on it, and it passes, I can put it back in the array later as a new second parity drive, correct?

 

So what I am going to do is pull the second parity drive out of the server and run a preclear on it to verify whether it's bad. At the same time, I am going to do a new config and run a parity check with just one parity drive and see what happens.

 

Sound like a good idea?

Link to comment

This was on my old system dual core i5 and 16GB of RAM, not the current dual quad core Xeon with 52GB of ECC.

 

Note that your Xeon is probably an appreciably slower CPU than your i5 was.

 

You didn't mention the specific i5 you were using, but I suspect it probably had a PassMark score roughly equal to the Xeon's 4875 ... but with half the cores; so the per-core performance of the i5 was nearly double that of your Xeon.    Not sure that's the reason, but the 2nd parity calculation is far more CPU intensive than the basic parity used for the first parity disk.
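To illustrate why the second parity costs more CPU, here is a minimal sketch. It assumes the second parity behaves like the Q syndrome in a RAID-6-style P+Q scheme (GF(2^8) with polynomial 0x11d and generator 2); that is an assumption for illustration, not something confirmed in this thread, and the function names are made up. P is a plain XOR per byte, while Q needs a Galois-field multiply per byte on top of the XOR.

```python
# Sketch only: RAID-6-style P+Q parity for one byte position across the data disks.
# Assumption: GF(2^8) with polynomial 0x11d and generator 2, as in common P+Q schemes.

def gf_mul2(x: int) -> int:
    """Multiply one byte by 2 in GF(2^8), reducing with polynomial 0x11d."""
    x <<= 1
    if x & 0x100:
        x ^= 0x11D
    return x & 0xFF


def compute_p(data_bytes: list[int]) -> int:
    """First parity: XOR of the bytes from every data disk."""
    p = 0
    for d in data_bytes:
        p ^= d
    return p


def compute_q(data_bytes: list[int]) -> int:
    """Second parity: sum of (2**i * d_i) in GF(2^8), evaluated with Horner's rule."""
    q = 0
    for d in reversed(data_bytes):
        q = gf_mul2(q) ^ d  # one field multiply plus one XOR per byte
    return q


if __name__ == "__main__":
    stripe = [0x01, 0x02, 0x03]  # one byte from each of three data disks
    print(hex(compute_p(stripe)), hex(compute_q(stripe)))
```

Real implementations vectorize this heavily, so the gap depends on the CPU's SIMD support as well as its clock speed.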

 

Link to comment

So the PassMark score of my i5-2500 is 6245 with a single-thread rating of 1871; the PassMark score of my current Xeon is 4875 with a single-thread rating of 1075, FYI.

 

Interestingly enough, the speed on the parity check on my new server is now up to 88MB/s which is the fastest I've seen it so far with this card.

 

I have found a drive on my array that is showing errors now. I hope it doesn't crap out my parity check, which is 12.5 hrs away from finishing.

 

#    Attribute name            Flag    Value  Worst  Thresh  Type      Updated  Failed       Raw value
1    Raw read error rate       0x000f  118    100    006     Pre-fail  Always   Never        185481712
3    Spin up time              0x0003  091    091    000     Pre-fail  Always   Never        0
4    Start stop count          0x0032  097    097    020     Old age   Always   Never        3429
5    Reallocated sector count  0x0033  100    100    010     Pre-fail  Always   Never        0
7    Seek error rate           0x000f  078    060    030     Pre-fail  Always   Never        17424691335
9    Power on hours            0x0032  077    077    000     Old age   Always   Never        20942 (2y, 4m, 19d, 14h)
10   Spin retry count          0x0013  100    100    097     Pre-fail  Always   Never        0
12   Power cycle count         0x0032  100    100    020     Old age   Always   Never        111
183  Runtime bad block         0x0032  098    098    000     Old age   Always   Never        2
184  End-to-end error          0x0032  100    100    099     Old age   Always   Never        0
187  Reported uncorrect        0x0032  080    080    000     Old age   Always   Never        20
188  Command timeout           0x0032  100    001    000     Old age   Always   Never        1 1 161
189  High fly writes           0x003a  100    100    000     Old age   Always   Never        0
190  Airflow temperature cel   0x0022  066    038    045     Old age   Always   In the past  34 (8 35 34 28 0)
194  Temperature celsius       0x0022  034    062    000     Old age   Always   Never        34 (0 21 0 0 0)
195  Hardware ECC recovered    0x001a  118    100    000     Old age   Always   Never        185481712
197  Current pending sector    0x0012  100    100    000     Old age   Always   Never        0
198  Offline uncorrectable     0x0010  100    100    000     Old age   Offline  Never        0
199  UDMA CRC error count      0x003e  200    200    000     Old age   Always   Never        0
240  Head flying hours         0x0000  100    253    000     Old age   Offline  Never        3103h+23m+01.071s
241  Total lbas written        0x0000  100    253    000     Old age   Offline  Never        74467708364
242  Total lbas read           0x0000  100    253    000     Old age   Offline  Never        509130527037

Link to comment

Alright, so the parity check finally finished and all is good; it only found 33 errors this time. The preclear on what used to be my second parity disk finished one pass successfully, so that is good. Now I'm just not sure whether to put that 8TB disk back as a second parity disk or as a data disk.

 

Thoughts?

Link to comment

I think I'd investigate the disk that was showing errors in reply #66. Maybe use your spare disk to replace it? I'd also want to do file system checks. On an array with that many disks I'd be keen to have a second parity disk. Glad the new cable seems to have helped. Are you planning to stick with the SAS2LP?

 

Link to comment

I'm glad it finished, but IMO it's not really conclusive:

 

Same cables/backplane and controller as last time. You are not using parity2, but that disk seems fine, and the other disk that failed twice before didn't fail this time. My money is still on the SAS2LP and some issue or incompatibility with the expander/backplane.

 

Link to comment

So the PassMark score of my i5-2500 is 6245 with a single-thread rating of 1871; the PassMark score of my current Xeon is 4875 with a single-thread rating of 1075, FYI.

 

Interestingly enough, the speed on the parity check on my new server is now up to 88MB/s which is the fastest I've seen it so far with this card.

 

Not really surprising then, as your old CPU was considerably faster, especially on single thread performance.
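For a rough sense of the gap, the quoted PassMark figures can be compared directly. This is just arithmetic on the numbers posted above; the dictionary and variable names are only for illustration.

```python
# PassMark figures as quoted in this thread.
i5_2500 = {"overall": 6245, "single_thread": 1871}
xeon = {"overall": 4875, "single_thread": 1075}

# Ratio > 1 means the old i5 scores higher than the current Xeon.
overall_ratio = i5_2500["overall"] / xeon["overall"]
single_thread_ratio = i5_2500["single_thread"] / xeon["single_thread"]

print(f"overall:       i5 is {overall_ratio:.2f}x the Xeon")
print(f"single thread: i5 is {single_thread_ratio:.2f}x the Xeon")
```

So the i5 is ahead by roughly a quarter overall and by roughly three quarters per thread, which matters if the parity calculation is bound by single-thread speed.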

 

I have found a drive on my array that is showing errors now. I hope it doesn't crap out my parity check, which is 12.5 hrs away from finishing.

 

Although never good, those errors on some Seagates are more like reallocated sectors. You can see they were already there in the first diagnostics you posted in this thread, so they are not new, but keep an eye on that disk, or replace it if you want to play it safe.

 

Device Model:     ST5000DM000-1FK178
Serial Number:    W4J01GH7

  9 Power_On_Hours          0x0032   077   077   000    Old_age   Always       -       20867
183 Runtime_Bad_Block       0x0032   098   098   000    Old_age   Always       -       2
187 Reported_Uncorrect      0x0032   080   080   000    Old_age   Always       -       20
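If you want to keep an eye on those counters between diagnostics, something like the following sketch could help. It assumes the usual ten-column attribute layout shown above (ID, name, flag, value, worst, thresh, type, updated, when failed, raw value); the set of IDs to watch and the function name are my own choices, not anything built into unRAID or smartctl.

```python
# Sketch only: flag watched SMART attributes whose raw value is non-zero.
# Assumption: attribute lines follow the ten-column layout shown above.

WATCH_IDS = {5, 183, 187, 197, 198}  # hypothetical watch list, adjust to taste


def flag_nonzero(text: str, watch_ids: set[int] = WATCH_IDS) -> dict[str, int]:
    """Return {attribute_name: raw_value} for watched attributes with raw > 0."""
    flagged = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip headers, blank lines, and anything that is not an attribute row.
        if len(parts) < 10 or not parts[0].isdigit():
            continue
        if int(parts[0]) not in watch_ids:
            continue
        try:
            raw = int(parts[9])  # first token of the raw value column
        except ValueError:
            continue  # raw value is not a plain integer
        if raw > 0:
            flagged[parts[1]] = raw
    return flagged


sample = """
  9 Power_On_Hours          0x0032   077   077   000    Old_age   Always       -       20867
183 Runtime_Bad_Block       0x0032   098   098   000    Old_age   Always       -       2
187 Reported_Uncorrect      0x0032   080   080   000    Old_age   Always       -       20
"""

print(flag_nonzero(sample))
```

Saving the result after each parity check and comparing it with the previous run would show whether the counts are stable or growing.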

 

Link to comment
