Parity not valid, disk 10 appears to be unformatted - help!

I'm running 4.5 beta 4.

 

Everything was going well until last night, when there was a write error on my disk 10. I didn't know what had happened, except that the array became unresponsive, and I couldn't telnet or use the main page to make any changes. I restarted the server and tried to copy my file again; I canceled the parity sync that started as a result of the unclean shutdown. This time I got a message stating there was a write error on disk 10. I restarted again, and this time I was confronted with a parity drive that was invalid and needed to be rebuilt. I decided that it would be a good idea to let the array do its thing and to replace disk 10 asap once parity was done. Well, it crashed during the parity check and upon restart, disk 10 appears unformatted and parity is not valid.

I tried following the reiserfsck instructions. I was told that there was a hardware error (block 16) and that I need to use the -B option (if I felt brave), but when I try to do that, I get no confirmation option and it seems to stop straight away. I even tried the --rebuild-tree and --fix-fixable modes, but none seem to work with -B or on their own without it.

 

All I really want is to get my data off the drive and replace it with one of the 6 new 1 TB drives that I now have. However, since my system says parity is not valid, I won't be able to rebuild the data, will I? I do have about 200 GB of the media still on my temporary drive, but that still means 500 GB+ of re-ripping DVDs, BluRay and HD-DVDs again. :(

 

Can anyone suggest anything useful? I'm at work now, so no syslogs etc. for the time being.

This is not as bad as it might seem.  You can recover.

 

First and most important.  DO NOT FORMAT the drive that says it is unformatted.  Also, there is a special sequence of steps you can use to get the array to think that disk10 is failed and needs to be rebuilt, and that parity is good and can be trusted.

 

1.  Get a copy of your syslog, before you do ANYTHING else.  It might have some clues to what is happening.  It could be as simple as a loose cable to disk10, or, it could actually be a failed disk.  For now, assume it is a failed disk.

 

2. Stop your array (if not already stopped) and power down.

3. Replace the failed drive with one of your new ones.  Hopefully, you already have a 1TB parity drive and your replacement 1TB Data drive will not be bigger than the parity drive.

4. Power back up.  If a parity check starts (very unlikely) cancel it.

5. Assign the new replacement drive to disk10 on the Drive Assignments page.

6. FOLLOW THE NEXT STEP EXACTLY...

7. Check the checkbox under the "Restore" button and press the button, but DO NOT START THE ARRAY... DO NOT PRESS "Start".  Not yet, anyway, because we first have to issue a command to make the server think that disk10 is the one that has failed, and that parity is good and can be used to reconstruct it.

When you press the "Restore" button, all your disk indicators will turn "BLUE", as if it is an entirely new array.  The array status will be "Stopped: Initial configuration".  DO NOT START IT.  "Start" is not to be pressed until after step 8 is completed.  This is critical.  (Did I warn you enough to not start the array at this time... good ;))

8. Log on via the system console, or via telnet and type:

cd

mdcmd set invalidslot 10

As described in the "Trust My Parity" procedure in the wiki, you should get the following response:

cmdOper=set

cmdResult=ok
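If you would rather not eyeball it, here is a minimal Python sketch of the same check, assuming you have pasted the two response lines into a string on any machine with Python. The helper name mdcmd_ok is made up for illustration and is not part of unRAID; it only inspects text and never talks to the server.

```python
def mdcmd_ok(output: str) -> bool:
    """Return True only if the pasted mdcmd output reports cmdOper=set and cmdResult=ok."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep:  # ignore blank lines and anything without an "=" sign
            fields[key] = value
    return fields.get("cmdOper") == "set" and fields.get("cmdResult") == "ok"


# The response quoted above, pasted as a string (blank line included).
response = "cmdOper=set\n\ncmdResult=ok\n"
print(mdcmd_ok(response))
```

Anything other than a True here is a reason to stop and ask before pressing "Start".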

 

Now, I know there are a lot of warnings on the wiki page... I put them there to prevent use of the procedure in the wrong circumstances... The wiki page was originally written to simply keep from having to re-calculate parity.  We are using the same mdcmd command to tell it that slot 10 in the array needs to be rebuilt.

 

Now, once you see the cmdOper=set, cmdResult=ok response, you can press the "Start" button on the management interface.  All the disk status indicators should turn Green; the system state should be Started; and there should be a disk rebuild of disk10 in progress.

 

If you refresh the management console you will see parity and all the other data disks being read, and disk10 being written to. It will probably take 6 hours or more to rebuild your disk10.  (a little bit longer than the time it usually takes to do a full parity check)
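For a rough feel of where a figure like "6 hours or more" comes from, here is a back-of-the-envelope sketch in Python. The average speeds are assumptions chosen for illustration, not measurements from this server; the real rate depends on the slowest disk and the controller.

```python
def rebuild_hours(capacity_bytes: float, mb_per_second: float) -> float:
    """Hours needed to write capacity_bytes at a steady mb_per_second."""
    seconds = capacity_bytes / (mb_per_second * 1_000_000)
    return seconds / 3600


ONE_TB = 1_000_000_000_000  # decimal terabyte, as drive makers count it

for speed in (30, 45, 60):  # assumed average MB/s, not measured values
    print(f"{speed} MB/s -> {rebuild_hours(ONE_TB, speed):.1f} h")
```

At an assumed 45 MB/s average, a 1 TB rebuild works out to a little over 6 hours, which is in line with the estimate above.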

 

If you have ANY questions about this procedure... ask first before doing anything... and post your syslog when you get home.

 

I have a class to attend early this evening, but I'll check back in later when I get back home.

 

Oh yes... here are two other people who were able to perform a similar series of steps to recover and rebuild a specific disk when they had a failure like yours (or had done something stupid or careless without thinking or asking for guidance):

http://lime-technology.com/forum/index.php?topic=3716.0

and

http://lime-technology.com/forum/index.php?topic=3367.0

 

Edit: fixed link to second thread.

 

Joe L.

  • Author

Thanks a lot, Joe!

I stayed up really late last night looking for info, and I suspected that I could do exactly what you said (based on other people's problems) but wasn't sure. My only concern is that perhaps the parity is no longer valid, because it did start to build parity a couple of times while it thought disk 10 was unformatted. Last time I tried to start the array it wouldn't come online this morning. I'll check all cables etc. then read the FAQ about getting a syslog, as I've never done that before either.

And yes, my parity drive is a 1 TB one. :)

If it did try to build parity while it thought disk10 was "unformatted" then you are correct... You might lose some data.  It all depends on how far it got before you cancelled it.

 

Worst case, it 'read zeros' for every byte and updated parity in its entirety... then when you rebuild disk10, it will be empty.

 

Better case... only part of it was zeroed... you will need to do a reiserfsck to rebuild the part of the file-system overwritten to get it to be sane... This has proven to be pretty decent, even when a drive was partly cleared in error.

See this post for a horror-story that ended pretty decently http://lime-technology.com/forum/index.php?topic=3367.0

 

Only way to know for sure will be when you start the array.  At that point your disk10 will be simulated and you can browse it to see what is there, even as it is rebuilding it to the new physical disk10 from parity and the other data disks.  (Just don't write to it until things are back stable and the rebuild is complete)
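To make the worst case and better case concrete, here is a toy Python sketch of single-parity reconstruction, assuming plain XOR parity across the data disks. The four-byte "disks" and their contents are made up; only the idea carries over to real drives.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three toy data disks of four bytes each (made-up contents).
disk1 = bytes([0x11, 0x22, 0x33, 0x44])
disk2 = bytes([0xA0, 0xB0, 0xC0, 0xD0])
disk10 = bytes([0x0F, 0x1E, 0x2D, 0x3C])

# Good parity, computed while all disks were readable.
parity = xor_blocks([disk1, disk2, disk10])

# disk10 fails: it is simulated (and rebuilt) from parity plus the survivors.
rebuilt = xor_blocks([parity, disk1, disk2])

# A parity sync that covered only the first 2 bytes while disk10 read as zeros.
zeros = bytes(2)
bad_parity = xor_blocks([disk1[:2], disk2[:2], zeros]) + parity[2:]
damaged = xor_blocks([bad_parity, disk1, disk2])

print(rebuilt == disk10)   # rebuild from good parity matches the original
print(damaged.hex())       # zeros where parity was overwritten, data elsewhere
```

The part of parity that was recalculated with zeros rebuilds as zeros; everything past the point where the sync was cancelled still comes back intact.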

 

Joe L.

  • Author

Here's the syslog in two parts (154 kb total). And I believe the parity was stopped between 0.1 and 1.0 %, so hopefully not too much damage done. Thanks for all your help!

The syslog shows the array exactly as Joe has analyzed, with serious media errors (UNC - UNCorrectable) on Disk 10.  It does show that a parity sync did start, but did not run for long (about 50 seconds), and was hampered by all of the exception handling of the media errors, so may not have traversed very far.  There will still be some damage, so once you rebuild Disk 10 on a new drive, you will still have some fixing to do.  For now, I would recommend not using the array AT ALL, including Starting it, until you have followed Joe's instructions and finished rebuilding Disk 10.
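As a rough upper bound on how much could have been touched, here is a small Python sketch. It assumes the sync works through the disk from the beginning and that the percentage shown is relative to the 1 TB parity drive; both are assumptions for illustration, and the 50 seconds seen in the syslog suggests the real figure may be lower.

```python
def synced_bytes(disk_bytes: int, percent_done: float) -> int:
    """Bytes covered by a sync that reached percent_done of disk_bytes."""
    return round(disk_bytes * percent_done / 100)


PARITY_BYTES = 1_000_000_000_000  # 1 TB parity drive, decimal terabyte

for pct in (0.1, 1.0):  # the range reported for the cancelled sync
    gb = synced_bytes(PARITY_BYTES, pct) / 1_000_000_000
    print(f"{pct}% -> about {gb:.0f} GB from the start of the disk")
```

So the region in question is somewhere between roughly 1 GB and 10 GB at the start of the disk.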

  • Author

Thanks Rob. I was lost by some of your comments until I realised you were talking about Joe L.- my real name's also Joe, you see. :)

So, the array is off. I will do as Joe L. suggests. I'll then see how much I've lost. Thanks both of you.

I've hardly slept since Monday, and fell asleep with my 12 1/2 week old baby instead of opening up the server to rearrange disks etc!

 

Oh, there were also a few prior parity checks, but I stopped them pretty quick also.

  • Author

Thank you Joe L. for the excellent instructions; not only did you tell me sequentially what to do, but also what feedback to expect from the telnet session and the browser console too. And Rob J for encouragement and confirming everything: Awesome! The missing drive is currently being re-built.

 

Permit me to go on a hardware tangent in my sleep-deprived euphoric state...

 

Since I have drive cages (non-hot-swap), and it's a pain to rearrange things using them in my CoolerMaster Centurion 590 case:

- and I accidentally napped most of the evening away

- and I received an almost brand new SansDigital 4 bay tower raid from forum member GoChris

- and I was feeling lazy but bold,

 

I simply disconnected disk10, leaving it in place in the tower. Instead, I replaced my Broadcom PCIe x1 gigabit card with the PCIe x1 card that comes with the SansDigital box, installed four of my new 1 TB green drives into it and connected it up to the tower via a multi-lane cable. All that in less time than it would have taken to replace the drive in the tower.

 

Well, anyway, the data is being rebuilt, and for sure most - if not all - of the files appear even after 0.9% rebuild. I hope that not too many are corrupted. Thank you unRAID, and thank you forums!

 

Lastly, a question for you guys: occasionally I 'lose' my disk 3, and it's absolutely as a result of a loose cable. Can I use the same technique to prevent the data being rebuilt from parity on disk 3 once I correct the problem?

 

 

  • Author

Arrgghhh! I jinxed myself, and suddenly disk 3 appears to be missing during the rebuild. I've shut down the server, but perhaps I shouldn't have?  :'(

 

I'm going to try and redo the whole series of steps above again.

 

Okay... it's working again now. Keeping fingers crossed.

 

Once this is all done, I'll clear the other three new 1 TB drives. Eventually I'll move them into the Centurion case and I'll put four of the remaining six 750 GB drives into the SansDigital box, so that they'll be much easier to upgrade one at a time than if they were in the Centurion case. The plan is to switch to 1, 1.5 and 2 TB drives exclusively by the middle of next year.

You are very brave... but as long as you are back up running, fine.  Personally, I would not have introduced new hardware into the mix until things are stable.

But, now that you have... I hope they work properly.

As soon as the array was started you could get access to your files from the "simulated" drive provided by parity and the other data drives.

This is NOT a good time to lose a second drive.  You really should have dealt with the loose cable LONG ago, when you first noticed it... before it became critical to a rebuild operation.

 

Let's hope it works for the next few hours.

 

Joe L.

You are very lucky...  You should seriously consider replacing the cable to drive3 (after it has finished the rebuild of disk10) 

Don't touch anything while the rebuild is in progress...  Take a nap... you probably need one.

 

Joe L.

It takes surprisingly little to dislodge a cable and create an intermittent problem.  Every time you open the case you introduce the opportunity.  Even moving the server about can cause this to happen.

 

Touching a cable and causing it to become slightly askew in a SATA socket can be all it takes.  I have found that locking cables (for ports that support them) are a smart investment.

 

If you are having flakey behavior in the array (occasional hangs, parity check won't complete, lockups, etc.) don't ignore it!  You don't want to get into a drive failure scenario with these types of problems.

  • Author

To be honest, the first two times it happened, I replaced the disk. Upon checking the drive I'd taken out, I found it to be fine. It was the third time recently when I suspected it was the cable - all three failures had come after I'd opened the case. So I figured I'd just unassign and reassign the disk and rebuild it. Last night's 'missing disk' clinched it for me - it must be the cable. I'll have to replace it asap, once the rebuild is done. Thanks all for the advice.

There are 2 cable paths involved.  The more likely is the SATA data cable, but it could also be the power cabling, including any splitters in the path.  And if the drive is installed in a backplane, then it is also possible that the backplane is defective, or its connections are loose, or the drive is not seated well.

  • Author

You know I think in part it's a symptom of incrementally increasing one's server's capacity; unless you're really diligent each time you add disks, things can begin to get sloppy. Now that I have the external hot-swap device as part of my array, and will soon have the hardware to completely max out the storage capacity of my tower (without spending $$$ on 5-in-3 backplanes), I'm going to completely rebuild my server, rearranging the disks, fixing the cabling well etc. As 1TB drives hit closer to $50 (they've been as low as $85 recently), I think the next server I build will be full of drives from day one, and I'll just expand externally like I just did.

  • Author

One last question:

 

The drive that failed... can I still reformat it and use it elsewhere? Or is it likely to be troublesome no matter what I do with it, in which case I'll RMA it?

 

Thanks for any insights.

All of the media errors I saw certainly make that troubling, but the decision MUST rest on the SMART report, not the syslog.  Let's take a look at the SMART report for that drive.  See the Troubleshooting page, Obtaining a SMART report section.
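Once you have the report, a common rule of thumb (an assumption here, not an official threshold) is to look first at the raw values of the attributes that count bad or pending sectors. A minimal Python sketch, using the attribute names as smartctl prints them; the numbers in the example are hypothetical and are not from the poster's drive.

```python
# Attributes commonly watched for failing media, named as smartctl prints them.
WATCHED = (
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
)


def worrying_attributes(raw_values: dict) -> list:
    """Return the watched attributes whose raw value is above zero."""
    return [name for name in WATCHED if raw_values.get(name, 0) > 0]


# Hypothetical raw values for illustration only.
example = {
    "Reallocated_Sector_Ct": 0,
    "Current_Pending_Sector": 12,
    "Offline_Uncorrectable": 3,
}
print(worrying_attributes(example))
```

An empty list does not prove the drive is healthy, and a non-empty one does not prove it is dead, but non-zero counts that keep growing are a reasonable case for an RMA.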

Archived

This topic is now archived and is closed to further replies.
