[6.10.3] UNRAID reported read errors on existing Drive 4 - Replaced it and now I have read errors on many more drives?



15 hours ago, trurl said:

Disk4 was going to need repair whether or not rebuild was good, but that is worse.

 

Diagnostics smart folder, and system/vars.txt, still show disk4 invalid, and syslog doesn't show rebuild completed yet. Maybe rebuild could be tried again if disk2 problems are fixed.

 

SMART report for disk2 looks OK, but it has never had extended self-test.

 

Not sure whether syslog entries about disk2 are a problem with the disk or something else.

Aug  7 13:25:18 TaFlix-UNRAID kernel: sd 1:0:4:0: [sdf] tag#948 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=DRIVER_OK cmd_age=3s
Aug  7 13:25:18 TaFlix-UNRAID kernel: sd 1:0:4:0: [sdf] tag#948 Sense Key : 0x2 [current] 
Aug  7 13:25:18 TaFlix-UNRAID kernel: sd 1:0:4:0: [sdf] tag#948 ASC=0x4 ASCQ=0x0 
Aug  7 13:25:18 TaFlix-UNRAID kernel: sd 1:0:4:0: [sdf] tag#948 CDB: opcode=0x88 88 00 00 00 00 00 05 27 86 70 00 00 01 98 00 00
Aug  7 13:25:18 TaFlix-UNRAID kernel: blk_update_request: I/O error, dev sdf, sector 86476400 op 0x0:(READ) flags 0x0 phys_seg 51 prio class 0
Aug  7 13:25:18 TaFlix-UNRAID kernel: md: disk2 read error, sector=86476336
Aug  7 13:25:18 TaFlix-UNRAID kernel: md: disk2 read error, sector=86476344
Aug  7 13:25:18 TaFlix-UNRAID kernel: md: disk2 read error, sector=86476352
Aug  7 13:25:18 TaFlix-UNRAID kernel: md: disk2 read error, sector=86476360
Aug  7 13:25:18 TaFlix-UNRAID kernel: md: disk2 read error, sector=86476368
...
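
For anyone reading these lines later, here is a minimal sketch of how to decode them. Sense key 0x2 is NOT READY and ASC 0x04 / ASCQ 0x00 is "logical unit not ready, cause not reportable", which is different from a medium error (sense key 0x3). That would be consistent with a connection or power problem, though it does not prove one. The function names below are made up for illustration; the CDB bytes and sector numbers are copied from the log.

```python
import re

# CDB bytes copied from the kernel line above (opcode 0x88 = READ(16)).
CDB_HEX = "88 00 00 00 00 00 05 27 86 70 00 00 01 98 00 00"


def decode_read16(cdb_hex):
    """Return (starting LBA, transfer length in blocks) for a READ(16) CDB."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 16 or cdb[0] != 0x88:
        raise ValueError("not a READ(16) CDB")
    lba = int.from_bytes(cdb[2:10], "big")      # bytes 2-9: 64-bit LBA
    length = int.from_bytes(cdb[10:14], "big")  # bytes 10-13: block count
    return lba, length


def md_error_sectors(log_text):
    """Collect sector numbers from 'md: diskN read error, sector=...' lines."""
    pattern = r"md: disk\d+ read error, sector=(\d+)"
    return [int(s) for s in re.findall(pattern, log_text)]


if __name__ == "__main__":
    lba, length = decode_read16(CDB_HEX)
    print(lba, length)  # 86476400 408
```

The decoded LBA matches the sector in the blk_update_request line. The first md sector (86476336) is 64 lower, which I assume is the partition start offset on this disk.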

 

 

I'm running an extended test for Disk 2.

On 8/8/2022 at 8:20 AM, trurl said:

SMART report for disk2 looks OK, but it has never had extended self-test.

 

 

Hi, I just completed the extended test for Disk 2, it said:

Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      1752         -

 

I'm assuming the full details will be in the diags?  Attached a fresh copy.

taflix-unraid-diagnostics-20220810-0835.zip

 

I'm leaning towards this being a data or power cable issue?  I've ordered new cables and they should be delivered by this weekend.  I already replaced the cables once, but as with most Amazon stuff, quality control is hit or miss.

Edited by taflix
On 8/12/2022 at 1:52 AM, JorgeB said:

Check filesystem on disk4

 

Results

 

Phase 1 - find and verify superblock...
        - block cache size set to 462432 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 38823 tail block 38823
        - scan filesystem freespace and inode maps...
ir_freecount/free mismatch, inode chunk 0/128, freecount 7 nfree 1
finobt ir_freecount/free mismatch, inode chunk 0/108071680, freecount 5 nfree 3
agi_freecount 19, counted 11 in ag 0
agi_freecount 19, counted 7 in ag 0 finobt
agi unlinked bucket 3 is 131 in ag 0 (inode=131)
agi unlinked bucket 26 is 128617114 in ag 0 (inode=128617114)
sb_ifree 368, counted 376
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
bad inode format in inode 131
bad inode format in inode 131
would have cleared inode 131
imap claims in-use inode 108073626 is free, correcting imap
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - agno = 12
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 2
        - agno = 6
entry "Movies" in shortform directory 128 references free inode 131
        - agno = 7
would have junked entry "Movies" in directory inode 128
        - agno = 1
bad inode format in inode 131
would have cleared inode 131
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - agno = 12
No modify flag set, skipping phase 5
Inode allocation btrees are too corrupted, skipping phases 6 and 7
No modify flag set, skipping filesystem flush and exiting.

        XFS_REPAIR Summary    Sat Aug 13 08:27:40 2022

Phase		Start		End		Duration
Phase 1:	08/13 08:27:38	08/13 08:27:38
Phase 2:	08/13 08:27:38	08/13 08:27:39	1 second
Phase 3:	08/13 08:27:39	08/13 08:27:40	1 second
Phase 4:	08/13 08:27:40	08/13 08:27:40
Phase 5:	Skipped
Phase 6:	Skipped
Phase 7:	Skipped

Total run time: 2 seconds
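
In case it helps when reading a long dry-run report like the one above, here is a rough sketch that pulls out only the "would have ..." lines (what the repair would change if run without the no-modify flag) and any "too corrupted" warnings. The function name and the shortened sample are my own; the sample lines are copied from the output above.

```python
def summarize_dry_run(output):
    """Return (planned_actions, warnings) from xfs_repair no-modify output."""
    actions = []
    warnings = []
    for line in output.splitlines():
        text = line.strip()
        if text.startswith("would have "):
            if text not in actions:      # the same action can be reported twice
                actions.append(text)
        elif "too corrupted" in text:
            warnings.append(text)
    return actions, warnings


# Shortened sample taken from the report above.
SAMPLE = """\
bad inode format in inode 131
would have cleared inode 131
entry "Movies" in shortform directory 128 references free inode 131
would have junked entry "Movies" in directory inode 128
would have cleared inode 131
Inode allocation btrees are too corrupted, skipping phases 6 and 7
"""

if __name__ == "__main__":
    planned, warned = summarize_dry_run(SAMPLE)
    for item in planned + warned:
        print(item)
```

Here the interesting line is the "Movies" entry that would be junked from directory inode 128. Anything under a junked entry may end up disconnected, which is one possible route into lost+found.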

 

What do you think?

23 minutes ago, trurl said:

On the User Shares page, click Compute... for lost+found share, wait for the results, then post a screenshot

 

After starting the array, it looks like it is rebuilding now. Disk 4 is mounted and there are no errors so far.

 

image.thumb.png.28e74a2071d8c254c15171fef2507709.png

 

Should I still do the Compute thing?

11 minutes ago, trurl said:

I really only wanted Compute... for lost+found. That shows the repair put 2.94TB it couldn't figure out in lost+found. Take a look at that share, you won't like it.

 

Do you still have the original disk4?

 

The lost+found folder looks like this: a ton of numbered folders but no files?

 

image.thumb.png.a8c631f5405ff26a2b2e690a1037b77a.png

 

1092 objects: 1092 directories, 0 files (0 B total)
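
The same kind of count can be reproduced from a terminal. This is a minimal sketch; the path in the demo is an assumption (use wherever the lost+found share actually lives on your system), and the function name is made up.

```python
import os


def tree_summary(root):
    """Return (directories, files, total_bytes) found below root."""
    dirs = files = total = 0
    for current, dirnames, filenames in os.walk(root):
        dirs += len(dirnames)
        for name in filenames:
            files += 1
            # lstat so a symlink is counted by its own size, not its target's
            total += os.lstat(os.path.join(current, name)).st_size
    return dirs, files, total


if __name__ == "__main__":
    # Hypothetical location of the share; adjust to your system.
    d, f, b = tree_summary("/mnt/user/lost+found")
    print(f"{d + f} objects: {d} directories, {f} files ({b} B total)")
```

If it also reports 0 files, the numbered folders really are empty rather than just slow to list.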

 

Yes I do.

Edited by taflix

OK, I just reviewed the thread.

 

Let rebuild finish then you can see if original disk mounts as Unassigned Device and if you can recover any of the lost+found.

 

Linux 'file' command can sometimes tell you what kind of data is in a file so you can try to open it, but often not worth the trouble if you have a lot of lost+found.
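
A minimal sketch of what that could look like if you want to script it rather than run 'file' by hand. It assumes the 'file' utility is installed and simply shells out to it; the function names and the demo path are made up for illustration.

```python
import os
import shutil
import subprocess


def identify(path):
    """Return the description printed by `file --brief`, or None if `file` is missing."""
    if shutil.which("file") is None:
        return None
    result = subprocess.run(
        ["file", "--brief", path],
        capture_output=True, text=True, check=False,
    )
    return result.stdout.strip()


def identify_tree(root):
    """Yield (path, description) for every regular file below root."""
    for current, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(current, name)
            yield path, identify(path)


if __name__ == "__main__":
    # Hypothetical location; point this at the real lost+found folder.
    for path, kind in identify_tree("/mnt/user/lost+found"):
        print(f"{path}: {kind}")
```

With thousands of nameless entries this only tells you the file type, not the original name, which is why it is often not worth the trouble.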

 

Do you have backups of anything important and irreplaceable?

17 minutes ago, trurl said:

OK, I just reviewed the thread.

 

Let rebuild finish then you can see if original disk mounts as Unassigned Device and if you can recover any of the lost+found.

 

Linux 'file' command can sometimes tell you what kind of data is in a file so you can try to open it, but often not worth the trouble if you have a lot of lost+found.

 

Do you have backups of anything important and irreplaceable?

 

  1.  Do you mean let rebuild finish on the new drive that I bought to replace the original Disk 4?
  2. No, unfortunately, I do not have another backup.
Edited by taflix
42 minutes ago, taflix said:

Do you mean let rebuild finish on the new drive that I bought to replace the original Disk 4?

Yes, it's always better to rebuild to a spare instead of the original disk. Then you still have the original disk in case rebuild has problems.

 

43 minutes ago, taflix said:

No, unfortunately, I do not have another backup.

You must always have another copy of anything important and irreplaceable. You get to decide what is important and irreplaceable.

 

Parity is not a substitute for backups.

45 minutes ago, trurl said:

Yes, it's always better to rebuild to a spare instead of the original disk. Then you still have the original disk in case rebuild has problems.

 

You must always have another copy of anything important and irreplaceable. You get to decide what is important and irreplaceable.

 

Parity is not a substitute for backups.

 

Okay, I'll allow it to rebuild to the spare new drive, that makes a lot of sense.

 

I do have the original Disk 4.  I plugged it into my Linux Mint box and it has all the files.  I can now see all the files that should have been in the lost+found folder.  I found an old external HDD and I'm backing up about 8TB of data onto it.

 

After the rebuild is completed, can I:

  1. Delete everything in the lost+found folder?
  2. Mount spare HDD
  3. Copy the bad data to Disk 4 from my spare HDD?
Edited by taflix
20 hours ago, taflix said:
  1. Delete everything in the lost+found folder?
  2. Mount spare HDD
  3. Copy the bad data to Disk 4 from my spare HDD?

Not entirely clear.

 

You might leave lost+found for later after you are sure you have everything.

 

By "spare" do you mean original disk4? And you would copy data from original disk4 to newly rebuilt disk4?

20 hours ago, trurl said:

Not entirely clear.

 

You might leave lost+found for later after you are sure you have everything.

 

By "spare" do you mean original disk4? And you would copy data from original disk4 to newly rebuilt disk4?

 

There are no files and 0 bytes in the lost+found folder.

 

Yes, I mean the original disk.  It still has all the data.  I was able to successfully copy everything to a spare HDD.

 

Can I use the spare HDD or the original HDD to copy the contents that were unrecoverable in the lost+found folder?

2 minutes ago, taflix said:

There are no files and 0 bytes in the lost+found folder

 

Do you mean you already deleted it all? Your earlier screenshot showed

On 8/13/2022 at 3:44 PM, trurl said:

the repair put 2.94TB it couldn't figure out in lost+found

 

On 8/13/2022 at 3:47 PM, taflix said:

That lost+found folder looks like this, ton of folders with numbers but no files?

 

Did you look in those folders?

 

4 minutes ago, taflix said:

copy the contents

Unassigned Devices and Dynamix File Manager

12 minutes ago, trurl said:

 

Do you mean you already deleted it all? Your earlier screenshot showed

 

 

Did you look in those folders?

 

Unassigned Devices and Dynamix File Manager

 

Yes.  All of those subfolders were empty.  The entire contents of the lost+found folder was 0 bytes.

 

So I should use Unassigned Devices and Dynamix File Manager to copy the original data over?

  • Mount the HDD using Unassigned Devices?
  • Copy files using Dynamix File Manager?

Just wanting to confirm?
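
For reference, the same mount-then-copy idea can be sketched from the command line. This is only an illustration, not what was recommended in the thread (that was the two plugins above). Both paths in the demo are assumptions, and the function name is made up. It copies only files that do not already exist on the destination and defaults to a dry run so you can review the list first.

```python
import os
import shutil


def copy_missing(src_root, dst_root, dry_run=True):
    """Copy files that exist under src_root but not under dst_root.

    Returns the sorted relative paths that were (or would be) copied.
    Existing destination files are never overwritten.
    """
    copied = []
    for current, _dirnames, filenames in os.walk(src_root):
        rel_dir = os.path.relpath(current, src_root)
        target_dir = os.path.normpath(os.path.join(dst_root, rel_dir))
        for name in filenames:
            src = os.path.join(current, name)
            dst = os.path.join(target_dir, name)
            if os.path.exists(dst):
                continue
            copied.append(os.path.relpath(src, src_root))
            if not dry_run:
                os.makedirs(target_dir, exist_ok=True)
                shutil.copy2(src, dst)  # copy2 also keeps timestamps
    return sorted(copied)


if __name__ == "__main__":
    # Hypothetical mount points: the old disk mounted by Unassigned Devices
    # and the rebuilt array disk. Adjust both before using.
    for rel in copy_missing("/mnt/disks/original_disk4", "/mnt/disk4"):
        print("would copy:", rel)
```

Run it once as a dry run, check the list, then call it again with dry_run=False.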
