Unformatted disk after data rebuild


25 minutes ago, dogfluffy said:

At this point, if I disconnect the 4 drives that are connected to the other onboard controller and connect these last 3 drives, I could then pull the SMART logs, and it wouldn't hurt anything or cause any more errors since the bad controller isn't connected to anything? Sound good?

Yes that should be fine.

 


Gah, lag. Yes in every sense, I think this is it. I had a power Molex connecting to a string of SATA power connections, and it looks like it might have wiggled loose and been causing problems. I ALSO seem to have a bad drive, although that could be the result of this loose connection. I pulled disk13 and connected it via USB with its own power, and it says the SMART read failed and SMART was disabled unless connected via ATA. When I had it connected via ATA, it would report that SMART was disabled. I tried to enable it from the console but was unsuccessful. I can connect it back up to a different power rail and reboot inside the case again. I think this was the drive that was clicking, though. I was able to get the SMART reports off disk8 and disk10 and attached them. Disk13 is still connected via USB right now.

 

Of course, 5 minutes before that I ordered the Supermicro controller card. I should try to find direct SATA power cables for my Seasonic modular power supply so I can toss these Molex-to-SATA power wyes.

smartreportpartial.zip


And yeah, I connected disk13 to the Supermicro controller and a native SATA power connection, and it shows DISK_DISABLE_NP in unMENU and isn't even available in the disk management pulldown to try to run a SMART report on. I also had it connected to another wye by accident; I rebooted with the intended native SATA power, and now it just says DSK_DSBL in unMENU and "not installed" on the normal unRAID screen. smartctl still won't run and returns failed. So I suppose I should order another disk? I had been thinking about buying a 6 or 8 TB drive to be my next parity drive down the road, if I was going to upgrade someday. It's looking like someday is here.

 

Statistics for /dev/sdd ST3000DM001-1CH166_W1F2JKX7
smartctl -a -d ata /dev/sdd
smartctl version 5.38 [i486-slackware-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

Smartctl: Device Read Identity Failed (not an ATA/ATAPI device)

A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.

Edited by dogfluffy

SMART for those other 2 disks looks OK.

 

I thought this Disk13 was a new disk. Of course new disks can be bad.

 

Not sure what you have in mind thinking about buying a larger disk. Your current parity is only 3TB so none of your array disks can be larger than that. You could replace parity with a larger disk and rebuild parity, but then you wouldn't be able to rebuild Disk13.

 

But you may not be able to rebuild Disk13 anyway, and even if you did, it might not give any better results than if you just tried to use the original Disk13.

 

I am going to start another post so hold on.

 


https://www.rosewill.com/product/rosewill-rsv-l4500-4u-rackmount-server-case-or-chassis-15-internal-bays-8-cooling-fans-included/

 

Oh no this thing is all ripped apart right now. I disconnected the drives attached physically to one of the onboard controllers and also a string of SATA power on a molex splitter. It's a mess, but I was just trying to get a SMART report off disk13. I thought it was new, but is apparently trashed regardless. I have 2 other 3TB drives but they have files on them I may need, I don't know yet. It's sounding like this may be trashed...but I could order another 3TB and I already ordered the Supermicro controller. It might be best to just let this sit and try to isolate what is causing the bad mojo. Those 3 disks were connected to the same controller with the 4th cable unused, and I also had a whole series of wyes and power adapters going on I couldn't see very well along the fan rail. I can't tell how many because I think I needed a couple just as extensions for a longer run and it was all tucked away out of sight.


Excellent. Yes I have a project now. I ordered a fresh drive and I'll label these to the unraid serial number and re-inspect these power connections before I put it all back together. If I bring it all online with the new Supermicro controller and the new drive ready to become disk13 during a rebuild is that the optimal plan going forward next week hopefully?

39 minutes ago, dogfluffy said:

If I bring it all online with the new Supermicro controller and the new drive ready to become disk13 during a rebuild is that the optimal plan going forward next week hopefully?

Yes I think that will be fine.

 

If you have everything hooked up, the array will probably start but with nothing assigned as disk13. You would then have to stop the array and assign the new disk13; starting the array again will begin the rebuild.

 

Whether or not it works like this let us know and we can figure out where to go from there.


Thanks to advances in modern logistics, I'm already back up and running! I still have some housekeeping to do hardware-wise to clean all this up, but the new drive and controller are installed and running. I assigned the new disk13 and rebuilt last night with 0 errors. So I'm back to the Unformatted Disk10 and an essentially blank Disk11. I'm not sure how best to proceed, but I can start viewing files and trying to piece together what data is missing and what data I have on previous unRAID drives. There was one console message about a reiserfs error on device md10 (disk10), probably due to my scan of that disk, I gather, but no messages since and no bad noises. Also, I forgot about a lost+found on disk9. This is probably the useful bit of info you need?

 

Feb  4 16:34:20 Calculon logger: mount: /dev/md10: can't read superblock
Feb  4 16:34:20 Calculon emhttp: _shcmd: shcmd (42): exit status: 32 (Other emhttp)
Feb  4 16:34:20 Calculon emhttp: disk10 mount error: 32 (Errors)
Feb  4 16:34:20 Calculon emhttp: shcmd (43): rmdir /mnt/disk10 (Other emhttp)
Feb  4 16:34:20 Calculon emhttp: shcmd (44): mkdir /mnt/disk11 (Routine)
Feb  4 16:34:20 Calculon emhttp: shcmd (45): set -o pipefail ; mount -t reiserfs -o user_xattr,acl,noatime,nodiratime /dev/md11 /mnt/disk11 |$stuff$ logger (Other emhttp)
Feb  4 16:34:20 Calculon kernel: REISERFS error (device md10): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [1 2 0x0 SD] (Errors)
Feb  4 16:34:20 Calculon kernel: REISERFS (device md10): Remounting filesystem read-only (Drive related)
Feb  4 16:34:20 Calculon kernel: REISERFS (device md10): Using r5 hash to sort names (Routine)
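
For anyone skimming a longer syslog for the same kind of trouble, here is a minimal Python sketch that keeps only the lines tagged "Errors" or "Drive related". It assumes every line of interest ends with a parenthesized category tag like the ones in the excerpt above; the function name `lines_with_category` and the choice of categories are illustrative only and are not part of unRAID or unMENU.

```python
import re

# A trailing "(Category)" tag, as seen at the end of the syslog lines above.
# Assumption: the category is the last parenthesized group on the line.
TAGGED_LINE = re.compile(r"^(?P<message>.*\S)\s+\((?P<category>[^()]+)\)\s*$")


def lines_with_category(log_text, wanted=("Errors", "Drive related")):
    """Return (category, message) pairs for lines tagged with a wanted category."""
    matches = []
    for line in log_text.splitlines():
        found = TAGGED_LINE.match(line)
        if found and found.group("category") in wanted:
            matches.append((found.group("category"), found.group("message")))
    return matches


# Three lines copied from the excerpt above.
sample = """\
Feb  4 16:34:20 Calculon emhttp: disk10 mount error: 32 (Errors)
Feb  4 16:34:20 Calculon emhttp: shcmd (44): mkdir /mnt/disk11 (Routine)
Feb  4 16:34:20 Calculon kernel: REISERFS (device md10): Remounting filesystem read-only (Drive related)
"""

for category, message in lines_with_category(sample):
    print(f"[{category}] {message}")
```

Lines without a trailing tag (such as the first `mount:` line) are simply skipped, so this is only a rough filter and not a substitute for reading the full log.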
 

 

Do you need any logs or more info from me?

Screenshot 2019-02-05 at 6.25.25 AM.png

Edited by dogfluffy
8 minutes ago, dogfluffy said:

It's working! Counting down from a very large number. I'm going to grab some lunch. Thanks again for the help. 

 

I assume this is processing the lost+found folder or does it run off parity? I'm not sure what handles lost+found?

This is using parity plus the other data disks to work out what each sector on the emulated disk should contain. It is then reading every sector on the emulated disk, trying to find file structures and reconstruct the folder/file name information. Only at the end of this process does it decide whether there are files with no related directory information, and those are the ones that go into the lost+found folder.
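
As a rough illustration of the "parity plus the other data disks" part, here is a minimal Python sketch of single-parity emulation, assuming parity is a plain XOR across the data disks. The disk contents and the helper name `xor_blocks` are made up for the example; the real driver works on whole sectors of real disks.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Toy "sectors" from three data disks (made-up contents).
disk_a = bytes([0x11, 0x22, 0x33, 0x44])
disk_b = bytes([0xA0, 0xB0, 0xC0, 0xD0])
disk_c = bytes([0x0F, 0x1E, 0x2D, 0x3C])

# Parity is the XOR of all data disks.
parity = xor_blocks([disk_a, disk_b, disk_c])

# Pretend disk_b is missing: emulate it from parity plus the surviving disks.
emulated_b = xor_blocks([parity, disk_a, disk_c])
print(emulated_b == disk_b)
```

Note that parity only reproduces the raw bytes of the missing disk. If the filesystem on that disk was already corrupt when parity was built, the emulated disk is corrupt in exactly the same way, which is why the filesystem check is still needed.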

I assume this is processing the lost+found folder or does it run off parity? I'm not sure what handles lost+found?

Parity won't be used for a filesystem check unless you are doing it on an emulated disk. A lost+found folder might be created by reiserfsck if it finds some files, complete or partial, and doesn't know what folder they were in.

 

 


Disk10 isn't disabled so it isn't emulated. Parity is not involved except for the fact that any changes made by the repair are writes to the data disk that will make parity update, as I already explained here:

On 1/30/2019 at 3:29 PM, trurl said:

Repairing a filesystem is also a write operation. It writes corrections to the filesystem metadata.

 

But if you are working at the command line, there are 2 different ways to refer to the filesystem.

 

You can repair the partition on the sd device, which is what you did way back in the first post. Doing it this way leaves parity out of it and so parity becomes invalid when you take this approach.

 

Or you can repair the md device, which includes parity when writing those corrections, and so parity is maintained.
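
The difference between the two paths can be sketched in a few lines of Python. This is a simplified model, assuming single XOR parity and a toy array with two data disks; the names `xor_bytes` and `parity_after_write` are hypothetical and only stand in for what the md driver does internally.

```python
def xor_bytes(left, right):
    """XOR two equal-length byte strings."""
    return bytes(a ^ b for a, b in zip(left, right))


def parity_after_write(parity, old_data, new_data):
    """Parity after one disk's block changes from old_data to new_data."""
    return xor_bytes(parity, xor_bytes(old_data, new_data))


# Toy two-data-disk array (made-up contents).
other_disk = bytes([0x10, 0x20, 0x30])
damaged = bytes([0xDE, 0xAD, 0x00])
parity = xor_bytes(damaged, other_disk)

repaired = bytes([0xDE, 0xAD, 0xBE])  # what the repair tool writes back

# Write via the md device: parity is updated along with the data.
parity_via_md = parity_after_write(parity, damaged, repaired)

# Write via the sd partition: the data changes, parity is left as it was.
parity_via_sd = parity

expected = xor_bytes(repaired, other_disk)
print("md path parity valid:", parity_via_md == expected)
print("sd path parity valid:", parity_via_sd == expected)
```

In this model the md path keeps parity in step with the repaired data, while the sd path leaves parity describing the old, damaged contents until a parity sync corrects it.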

 

Since you are repairing the md device, parity is updated when the repairs are written. But parity itself has none of your data as mentioned before, and since you aren't working with the emulated disk, parity and the other disks aren't even read. Emulation (if it were involved here, which it isn't) could not possibly be the reason for:

42 minutes ago, dogfluffy said:

why disk13 and disk3 (I believe) had very similar data

 

 


Not yet, but I watched the console scroll off and on. Looks promising! I didn't want to screw it up now by mounting something wrong, although I think we're basically finished now? Before, I think I had it backwards how parity works, but now I understand it. I also have the 2 disks I pulled out along the way. So once I restart the array I guess I can copy it back over? I also had a weird, counterintuitive disk assignment for my data share, where I always assigned new disks as available to the share, thinking it would always be large enough. But it seems to have scattered some data across the nearly blank disks too. I'm still piecing it back together, but it looks good and again no errors on the unRAID main screen. I suppose I may have had a flaky controller or loose power, or all 3 at once going on. Thanks for all your help with this, and for teaching me how it actually works. It's parity magic.
