Read errors while rebuilding new drive



Hello,

 

I'm rebuilding onto a new 6 TB drive (disk2) right now (an upgrade from a failed 4 TB drive), and I just got a read error on disk1. What is the best thing I can do right now, and did I lose any data?

Error on disk 1:

Nov 18 21:47:41 Behemoth kernel: blk_update_request: critical medium error, dev sdg, sector 883130016 op 0x0:(READ) flags 0x0 phys_seg 12 prio class 0
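
For anyone scanning a long syslog for lines like this one, here is a minimal Python sketch that pulls the error kind, device and sector out of a `blk_update_request` entry. The regular expression and the function name `parse_blk_error` are illustrative assumptions based only on the single line quoted above; other kernel versions may format the message differently.

```python
import re

# Pattern modelled on the single syslog line quoted in this post (an assumption,
# not an exhaustive description of every kernel's message format).
BLK_ERROR = re.compile(
    r"blk_update_request: (?P<kind>.+?), dev (?P<dev>\w+), sector (?P<sector>\d+)"
)


def parse_blk_error(line):
    """Return the error kind, device and sector, or None if the line does not match."""
    match = BLK_ERROR.search(line)
    if match is None:
        return None
    return {
        "kind": match["kind"],
        "dev": match["dev"],
        "sector": int(match["sector"]),
    }


line = (
    "Nov 18 21:47:41 Behemoth kernel: blk_update_request: critical medium error, "
    "dev sdg, sector 883130016 op 0x0:(READ) flags 0x0 phys_seg 12 prio class 0"
)
print(parse_blk_error(line))
```

Counting how many distinct sectors show up per device gives a rough idea of whether the errors are isolated or spreading.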

 

I attached diagnostics; I hope they help.

behemoth-diagnostics-20211118-2228.zip

1 minute ago, rvoosterhout said:

failed 4 TB

How did you determine original disk2 had failed?

 

Looks like disk1 is the one that I would have replaced. Do any of your other disks have SMART warnings on the Dashboard page? Do you have Notifications set up to alert you immediately by email or another agent as soon as a problem is detected? You must deal with a disk problem immediately so you don't end up with multiple disk problems and data loss.

Just now, trurl said:

How did you determine original disk2 had failed?

 

Looks like disk1 is the one that I would have replaced. Do any of your other disks have SMART warnings on the Dashboard page? Do you have Notifications set up to alert you immediately by email or another agent as soon as a problem is detected? You must deal with a disk problem immediately so you don't end up with multiple disk problems and data loss.

Disk 2 was marked as failed in Unraid: it had a red cross in front of it. A long SMART self-test on disk 2 would not succeed, which is why I concluded the disk was broken.

3 minutes ago, rvoosterhout said:

Disk 2 was marked as failed in Unraid: it had a red cross in front of it. A long SMART self-test on disk 2 would not succeed, which is why I concluded the disk was broken.

Bad connections are much more common than bad disks. In what way did the extended test not succeed? It might have aborted due to spindown. Disable spindown on the disk to let the extended test complete.

 

I'd like to see the SMART report for original disk2 if you can get it.

 

Do you know when disk2 became disabled and whether you have written anything to your server since then?

 

 

Just now, trurl said:

Bad connections are much more common than bad disks. In what way did the extended test not succeed? It might have aborted due to spindown. Disable spindown on the disk to let the extended test complete.

 

I'd like to see the SMART report for original disk2 if you can get it.

 

Do you know when disk2 became disabled and whether you have written anything to your server since then?

 

 

I don't think it is a bad connection. I'm using an R710, which has a SAS backplane with two SAS cables running to a SAS card. If I had a connection problem, more drives would have been affected.

 

I don't have the SMART report for the original disk 2. Could I stop the current rebuild, remove the new drive, put the original disk 2 back, assign it as disk 2 again, and create a SMART report? Or would that guarantee data loss?

 

It became disabled a few days ago. I shut down the server when it became disabled and tried a few SMART checks. I received the new 6 TB drive today.

1 hour ago, rvoosterhout said:

So should I cancel the rebuild, put this drive back and start a rebuild on disk 1 with the new 6tb disk?

Not that simple. Unraid thinks disk2 needs rebuilding and won't let you rebuild disk1 instead without jumping through a few hoops.

 

How much of your data do you consider important and irreplaceable? You don't have any place to copy that?

 

Might as well let the rebuild continue if it will and we can decide if the result is good or if original disk2 contents would be better.

 

Emulated disk2 was mounted in those diagnostics you posted earlier. Is that still the case?

 

Don't let anything get written to your server until we decide. Anything written to the emulated disk will not be on the original disk, and will make parity out-of-sync with the original disk.
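
To illustrate why a write to the emulated disk matters, here is a toy Python sketch. It assumes single parity computed as a byte-wise XOR across the data disks, and it uses two made-up 3-byte "disks" purely for illustration; the helper `xor_blocks` and all values are hypothetical, not anything Unraid exposes.

```python
from functools import reduce


def xor_blocks(*blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Toy contents (hypothetical): two data disks and their XOR parity.
disk1 = bytes([0x11, 0x22, 0x33])
disk2_original = bytes([0xA0, 0xB0, 0xC0])
parity = xor_blocks(disk1, disk2_original)

# While disk2 is disabled, its contents are emulated from parity and the other disks.
emulated_disk2 = xor_blocks(parity, disk1)
assert emulated_disk2 == disk2_original

# A write to the emulated disk can only be stored by changing parity.
new_disk2 = bytes([0xA0, 0xFF, 0xC0])
parity = xor_blocks(disk1, new_disk2)
emulated_disk2 = xor_blocks(parity, disk1)

# The physical original disk2 never saw that write, so it no longer matches parity.
in_sync = xor_blocks(disk1, disk2_original) == parity
print(in_sync)
```

The same arithmetic shows why a read error on disk1 during the rebuild is a concern: the emulated contents of disk2 for that sector depend on reading disk1 correctly.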

9 hours ago, trurl said:

Not that simple. Unraid thinks disk2 needs rebuilding and won't let you rebuild disk1 instead without jumping through a few hoops.

 

How much of your data do you consider important and irreplaceable? You don't have any place to copy that?

 

Might as well let the rebuild continue if it will and we can decide if the result is good or if original disk2 contents would be better.

 

Emulated disk2 was mounted in those diagnostics you posted earlier. Is that still the case?

 

Don't let anything get written to your server until we decide. Anything written to the emulated disk will not be on the original disk, and will make parity out-of-sync with the original disk.

Yes, emulated disk 2 is still mounted. I'm creating backups to Dropbox right now using Duplicati. I paused the rebuild last night and resumed it this morning; it still has 11 hours to go. It's 10 am here now. My Docker containers and VMs are stopped, so nothing is being written to the server right now.

46 minutes ago, rvoosterhout said:

if the restore went good?

All disks mountable, does your data look OK?

 

46 minutes ago, rvoosterhout said:

disk1 still shows read errors

Where are you seeing that?

 

187 Reported_Uncorrect      -O--CK   098   098   000    -    2
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

Click on disk1 to get to its page, disable spindown, and run an extended SMART test. It will take several hours.
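
If you want to track an attribute like this over time, here is a small Python sketch that splits one attribute row into named fields. The column order (ID, name, flags, value, worst, threshold, fail, raw value) is an assumption taken from the row quoted above, and `parse_smart_row` is a hypothetical helper, not part of smartmontools.

```python
def parse_smart_row(row):
    """Split one SMART attribute row (layout as quoted above) into named fields."""
    fields = row.split()
    return {
        "id": int(fields[0]),
        "name": fields[1],
        "flags": fields[2],
        "value": int(fields[3]),   # normalized value, e.g. "098" becomes 98
        "worst": int(fields[4]),
        "thresh": int(fields[5]),
        "fail": fields[6],         # "-" in the quoted row
        "raw": " ".join(fields[7:]),  # raw values can contain spaces on some drives
    }


row = "187 Reported_Uncorrect      -O--CK   098   098   000    -    2"
attr = parse_smart_row(row)
print(attr["name"], attr["raw"])
```

Comparing the raw value between reports is a simple way to see whether the count of uncorrectable errors is still growing.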

 

 

13 minutes ago, trurl said:

It passed, and except for a small Reported_Uncorrect count the attributes look OK. But those syslog entries do look like problems with the disk and not something else.

 

You could check filesystem on the rebuilt disk2 but I expect it is OK.

 

If you have finished with the backups I guess you could replace disk1.

Yes, the backups have finished, but I don't have a second new drive to replace disk1 as well. Disk1 is a 6 TB drive; disk2 used to be 4 TB but is now a new 6 TB drive. I don't think I can replace disk1 (6 TB) with the old disk2 drive (4 TB)?

Just now, rvoosterhout said:

I don't think I can replace disk1 (6 TB) with the old disk2 drive (4 TB)?

No. Keep an eye on disk1, and if it continues to have problems you will have to get a replacement. Do you have Notifications set up to alert you immediately by email or another agent as soon as a problem is detected?

 

How much data is on disk1? If it would all fit on old disk2 then maybe you could copy it all there as an Unassigned Device and then New Config that disk into the array in place of disk1 and rebuild parity. Don't do anything with original disk2 though until you are satisfied with the rebuild results.
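
A rough way to answer the "would it all fit" question is to compare the used space on the source with the free space on the target. The Python sketch below is only an illustration: the function names, the 5% safety margin, and the idea of passing mount points as arguments are assumptions, and the actual mount paths on your server need to be filled in by you.

```python
import shutil


def fits(used_bytes, free_bytes, margin=0.05):
    """True if used_bytes fits into free_bytes while leaving `margin` of it unused.

    The 5% default margin is an arbitrary assumption for filesystem overhead.
    """
    return used_bytes <= free_bytes * (1 - margin)


def can_copy(source_mount, target_mount):
    """Compare used space on the source mount with free space on the target mount."""
    used = shutil.disk_usage(source_mount).used
    free = shutil.disk_usage(target_mount).free
    return fits(used, free)


# Example with plain numbers: 3 TB of data against 4 TB of free space.
print(fits(3_000_000_000_000, 4_000_000_000_000))
```

If the check fails, copying only part of the data will not help with this approach, since the target disk is meant to replace disk1 entirely.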

3 minutes ago, trurl said:

No. Keep an eye on disk1, and if it continues to have problems you will have to get a replacement. Do you have Notifications set up to alert you immediately by email or another agent as soon as a problem is detected?

 

How much data is on disk1? If it would all fit on old disk2 then maybe you could copy it all there as an Unassigned Device and then New Config that disk into the array in place of disk1 and rebuild parity. Don't do anything with original disk2 though until you are satisfied with the rebuild results.

I understand. I will run the filesystem check on the rebuilt drive; when that's good I'll start up my Docker containers and VMs and see how everything responds. I'll see if I can fit disk1's data on the old disk2, but I think disk1 is almost full, so that probably won't work.

 

Thank you very, very much for your help and quick responses.
