(solved again) Array has 5 disks with read errors.


As the title says.

The screenshot shows the disks; diagnostics logs are included.

Also:

When using Krusader I cannot access Disk 3.

/mnt/disk3

is somehow a zero-byte file and not a folder.
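
One way to confirm what the path really is, independent of the file manager, is to stat it directly. This is only a minimal sketch in Python: the helper name `describe_path` is made up for illustration, and `/mnt/disk3` assumes Unraid's usual mount point for Disk 3.

```python
import os
import stat


def describe_path(path):
    """Return a short description of what kind of object `path` is."""
    try:
        info = os.lstat(path)  # lstat: inspect the path itself, do not follow symlinks
    except FileNotFoundError:
        return "missing"
    if stat.S_ISDIR(info.st_mode):
        return "directory"
    if stat.S_ISLNK(info.st_mode):
        return "symlink"
    if stat.S_ISREG(info.st_mode):
        return f"regular file, {info.st_size} bytes"
    return "other"


if __name__ == "__main__":
    # Assumed path: Unraid normally mounts Disk 3 here.
    print(describe_path("/mnt/disk3"))
```

A result of `regular file, 0 bytes` would match the symptom described here; a healthy, mounted disk should report `directory`.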

2021-05-18_disk errors.png

silverstone-diagnostics-20210518-2122.zip

 

I put Unraid in maintenance mode and ran a short SMART test on Disk 3; it completed with no errors.

1352393918_2021-05-18_21.35_Unraid-disk3aftershortSMARTtest.thumb.png.df086e7ef3242beca3df955b76939e9e.png


 

 

Disk 3 XFS check with -n

**********************************************************************************************************

 

    Phase 1 - find and verify superblock...
    Phase 2 - using internal log
            - zero log...
    ALERT: The filesystem has valuable metadata changes in a log which is being
    ignored because the -n option was used.  Expect spurious inconsistencies
    which may be resolved by first mounting the filesystem to replay the log.
            - scan filesystem freespace and inode maps...
    sb_fdblocks 456165658, counted 458312794
            - found root inode chunk
    Phase 3 - for each AG...
            - scan (but don't clear) agi unlinked lists...
            - process known inodes and perform inode discovery...
            - agno = 0
            - agno = 1
            - agno = 2
            - agno = 3
            - agno = 4
            - agno = 5
            - agno = 6
            - agno = 7
            - process newly discovered inodes...
    Phase 4 - check for duplicate blocks...
            - setting up duplicate extent list...
            - check for inodes claiming duplicate blocks...
            - agno = 0
            - agno = 1
            - agno = 2
            - agno = 3
            - agno = 6
            - agno = 4
            - agno = 7
            - agno = 5
    No modify flag set, skipping phase 5
    Phase 6 - check inode connectivity...
            - traversing filesystem ...
            - traversal finished ...
            - moving disconnected inodes to lost+found ...
    Phase 7 - verify link counts...
    No modify flag set, skipping filesystem flush and exiting.

 

**********************************************************************************************************
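
The two lines in that output that matter are the ALERT about the ignored log and the `sb_fdblocks` mismatch. As a rough sketch (Python, with a made-up helper name `summarize_xfs_check`), they can be pulled out of a saved copy of the output like this. The sample text is copied from the check above; nothing here calls `xfs_repair` itself.

```python
import re


def summarize_xfs_check(output):
    """Pull the log warning and free-block mismatch out of xfs_repair -n output."""
    summary = {"log_ignored": False, "fdblocks_delta": None}
    # The ALERT means -n skipped log replay, so later findings may be spurious.
    if "valuable metadata changes in a log" in output:
        summary["log_ignored"] = True
    match = re.search(r"sb_fdblocks (\d+), counted (\d+)", output)
    if match:
        recorded, counted = (int(g) for g in match.groups())
        summary["fdblocks_delta"] = counted - recorded
    return summary


SAMPLE = """\
ALERT: The filesystem has valuable metadata changes in a log which is being
ignored because the -n option was used.  Expect spurious inconsistencies
which may be resolved by first mounting the filesystem to replay the log.
sb_fdblocks 456165658, counted 458312794
"""

if __name__ == "__main__":
    print(summarize_xfs_check(SAMPLE))
```

For this check the free-block counter is off by 2147136 blocks. Because the log was not replayed, that mismatch may be one of the "spurious inconsistencies" the ALERT warns about rather than real damage.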

Edit: I added the SMART reports for all 5 disks with errors, after running a short SMART test on each of them. Disk numbers are appended to the filenames (Disk 1, etc.).

Disk 3 is also running the extended SMART test at the moment.

WDC_WD80EMAZ-00W_7HKJT7EJ_35000cca257f1e771-20210518-2240 - Disk 3.txt

WDC_WD80EMAZ-00W_7HKJWUXJ_35000cca257f1f4f1-20210518-2244 - Disk 4.txt

WDC_WD80EZAZ-11T_2SG8U7JJ_35000cca27dc401ba-20210518-2245 Disk 7.txt

WDC_WD80EZAZ-11T_2SG9465F_35000cca27dc4271a-20210518-2244 Disk 6.txt

WDC_WD80EZAZ-11T_7HJJ6AVF_35000cca257e38cc8-20210518-2243 Disk 1.txt

What should I do?

How bad is this?

Edited by Pjhal
Link to comment
13 hours ago, JorgeB said:

I don't see any controller issues logged, so this is most likely a power/connection problem. Power down the server, check all connections, and power back up; the array should be accessible after that.

Thank you for your response.

I have rebooted, and Unraid then reported zero errors. I then started the array in maintenance mode and am now running a parity check (read only).

After that I'll try starting the array normally.

 

 

Link to comment
24 minutes ago, Pjhal said:

OK, it got worse. I finished the parity check with no errors and then tried to start the array normally; now I have 6 unmountable disks.

That is every data disk except Disk 5...

Edit: I included new diagnostics.

silverstone-diagnostics-20210520-2253.zip

1902219284_Schermafbeelding2021-05-20225605.thumb.png.f254e1f94bed2dddf29ecbae19b7f2d9.png

 

I'm having the same issues. Are these shucked drives?

Edited by TechTitus
Link to comment

Yes they are, but the disks themselves are fine according to SMART. This happened after upgrading to 6.9.2 and then downgrading again to 6.8.3, so I am hoping that it is just some limited file inconsistency and not a major failure of the hard drives or the whole array.

Link to comment
Yep, I'm having the exact same issue, and UDMA CRC errors as well. I'm going to swap power supplies to see if it's a power issue.

Link to comment

Read errors on multiple disks:

 

May 20 22:48:29 Silverstone kernel: md: disk4 read error, sector=8
May 20 22:48:29 Silverstone kernel: md: disk4 read error, sector=16
May 20 22:48:29 Silverstone kernel: md: disk4 read error, sector=24
May 20 22:48:29 Silverstone kernel: md: disk7 read error, sector=8
May 20 22:48:29 Silverstone kernel: md: disk7 read error, sector=16
May 20 22:48:29 Silverstone kernel: md: disk7 read error, sector=24
May 20 22:48:29 Silverstone kernel: md: disk6 read error, sector=8
May 20 22:48:29 Silverstone kernel: md: disk6 read error, sector=16
May 20 22:48:29 Silverstone kernel: md: disk6 read error, sector=24
May 20 22:48:29 Silverstone kernel: Buffer I/O error on dev md1, logical block 0, async page read
### [PREVIOUS LINE REPEATED 1 TIMES] ###
May 20 22:48:29 Silverstone kernel: md: disk1 read error, sector=32
May 20 22:48:29 Silverstone kernel: md: disk1 read error, sector=40
May 20 22:48:29 Silverstone kernel: md: disk1 read error, sector=48

 

This is likely a power, connection, or controller problem.
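
To see at a glance how the errors are spread across disks, the `md: diskN read error` lines can be tallied per disk. A minimal sketch in Python follows; the helper name `count_read_errors` is made up, and the sample lines are the ones from the syslog excerpt above.

```python
import re
from collections import Counter

READ_ERROR = re.compile(r"md: (disk\d+) read error, sector=(\d+)")


def count_read_errors(lines):
    """Count md read errors per array disk in an iterable of syslog lines."""
    counts = Counter()
    for line in lines:
        match = READ_ERROR.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


SAMPLE = """\
May 20 22:48:29 Silverstone kernel: md: disk4 read error, sector=8
May 20 22:48:29 Silverstone kernel: md: disk4 read error, sector=16
May 20 22:48:29 Silverstone kernel: md: disk4 read error, sector=24
May 20 22:48:29 Silverstone kernel: md: disk7 read error, sector=8
May 20 22:48:29 Silverstone kernel: md: disk7 read error, sector=16
May 20 22:48:29 Silverstone kernel: md: disk7 read error, sector=24
May 20 22:48:29 Silverstone kernel: md: disk6 read error, sector=8
May 20 22:48:29 Silverstone kernel: md: disk6 read error, sector=16
May 20 22:48:29 Silverstone kernel: md: disk6 read error, sector=24
May 20 22:48:29 Silverstone kernel: Buffer I/O error on dev md1, logical block 0, async page read
May 20 22:48:29 Silverstone kernel: md: disk1 read error, sector=32
May 20 22:48:29 Silverstone kernel: md: disk1 read error, sector=40
May 20 22:48:29 Silverstone kernel: md: disk1 read error, sector=48
""".splitlines()

if __name__ == "__main__":
    for disk, count in sorted(count_read_errors(SAMPLE).items()):
        print(disk, count)
```

Four disks failing in the same second, each on the first few sectors, is the pattern that points away from individual drive faults and toward something the disks share.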

Link to comment
5 hours ago, JorgeB said:

This is likely a power, connection, or controller problem.

But this issue appeared after downgrading from 6.9.2 back to 6.8.3; nothing else changed. I also read that some people had compatibility issues with the newer version.

I use this HBA:

https://www.broadcom.com/products/storage/host-bus-adapters/sas-9300-8i

What can I do to fix this? I understand that it is hypothetically possible that my power supply or a cable failed, but it seems incredibly unlikely to me that this would happen at the exact moment I ran into OS issues from upgrading and then downgrading my OS version.

Edit: OK, I disconnected and reconnected the HBA and my array is back, so maybe it was a badly seated connector?

 

 

Schermafbeelding 2021-05-21 141120.png

Edited by Pjhal
Link to comment
  • Pjhal changed the title to (Solved) Array has 5 disks with read errors
  • Pjhal changed the title to Array has 5 disks with read errors.(no longer solved)
9 hours ago, JorgeB said:

Still looks like a power/connection issue.

I shut down the server, re-plugged the HBA and all disks, then started it up again.

After some time, new errors appeared:


May 22 18:02:58 Silverstone kernel: mdcmd (58): spindown 7
May 22 18:09:15 Silverstone kernel: mdcmd (59): spindown 6
May 22 18:15:53 Silverstone kernel: sd 13:0:6:0: [sdh] tag#1409 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
May 22 18:15:53 Silverstone kernel: sd 13:0:6:0: [sdh] tag#1409 Sense Key : 0x5 [current]
May 22 18:15:53 Silverstone kernel: sd 13:0:6:0: [sdh] tag#1409 ASC=0x20 ASCQ=0x0
May 22 18:15:53 Silverstone kernel: sd 13:0:6:0: [sdh] tag#1409 CDB: opcode=0x88 88 00 00 00 00 01 0b b7 0b 50 00 00 00 08 00 00
May 22 18:15:53 Silverstone kernel: print_req_error: critical target error, dev sdh, sector 4491512656
May 22 18:15:53 Silverstone kernel: md: disk6 read error, sector=4491512592
May 22 18:15:53 Silverstone kernel: sd 13:0:5:0: [sdg] tag#1414 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
May 22 18:15:53 Silverstone kernel: sd 13:0:5:0: [sdg] tag#1414 Sense Key : 0x5 [current]
May 22 18:15:53 Silverstone kernel: sd 13:0:5:0: [sdg] tag#1414 ASC=0x20 ASCQ=0x0
May 22 18:15:53 Silverstone kernel: sd 13:0:5:0: [sdg] tag#1414 CDB: opcode=0x88 88 00 00 00 00 01 0b b7 0b 50 00 00 00 08 00 00
May 22 18:15:53 Silverstone kernel: print_req_error: critical target error, dev sdg, sector 4491512656
May 22 18:15:53 Silverstone kernel: md: disk7 read error, sector=4491512592
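
For anyone decoding these lines: opcode 0x88 is the SCSI READ(16) command, and in the SCSI standard sense key 0x5 with ASC 0x20 / ASCQ 0x00 stands for ILLEGAL REQUEST, INVALID COMMAND OPERATION CODE. The sketch below (Python, helper name `decode_read16` is made up) unpacks the CDB bytes from the log to show which sectors the failed read was asking for.

```python
def decode_read16(cdb_hex):
    """Decode a SCSI READ(16) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 16 or cdb[0] != 0x88:
        raise ValueError("not a 16-byte READ(16) CDB")
    lba = int.from_bytes(cdb[2:10], "big")      # bytes 2-9: logical block address
    blocks = int.from_bytes(cdb[10:14], "big")  # bytes 10-13: transfer length
    return lba, blocks


if __name__ == "__main__":
    # CDB copied from the sdh/sdg log lines above.
    lba, blocks = decode_read16("88 00 00 00 00 01 0b b7 0b 50 00 00 00 08 00 00")
    print("device sector:", lba, "length:", blocks)
    # md-level sector reported by the "md: disk6 read error" line.
    print("offset to md sector:", lba - 4491512592)
```

The decoded address matches the `sector 4491512656` in the `print_req_error` line, and the request is for 8 blocks. The md-level sector is 64 lower, which would be consistent with the data partition starting 64 sectors into the device; that is an inference from the two numbers, not something the log states.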

The weird thing that stands out to me is that the errors occur after the two disks spin down. Could that be related?

Also, if it is a hardware defect: I don't have a spare HBA, a suitably sized power supply, or a SAS cable to test with (by swapping them out), so I am at a loss as to how I should handle this right now.

Is there anything I can do?
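
To put a number on that, the gap between each `spindown` entry and the first read error on the same disk can be computed from the log. This is a rough Python sketch with made-up helper names; it assumes the number after `spindown` is the array slot (so `spindown 7` refers to disk7), and it uses 2021 as the year because syslog timestamps do not carry one.

```python
import re
from datetime import datetime

SPINDOWN = re.compile(r"^(\w{3} +\d+ [\d:]{8}) .*mdcmd \(\d+\): spindown (\d+)")
READ_ERROR = re.compile(r"^(\w{3} +\d+ [\d:]{8}) .*md: disk(\d+) read error")


def _stamp(text):
    # Assumption: the year is 2021, taken from the diagnostics filenames.
    return datetime.strptime(f"2021 {text}", "%Y %b %d %H:%M:%S")


def seconds_from_spindown_to_error(lines):
    """Map slot number to seconds between a spindown and the next read error."""
    last_spindown = {}
    delays = {}
    for line in lines:
        m = SPINDOWN.search(line)
        if m:
            last_spindown[m.group(2)] = _stamp(m.group(1))
            continue
        m = READ_ERROR.search(line)
        if m and m.group(2) in last_spindown:
            # Only the first error after each spindown is measured.
            delta = _stamp(m.group(1)) - last_spindown.pop(m.group(2))
            delays[m.group(2)] = int(delta.total_seconds())
    return delays


SAMPLE = """\
May 22 18:02:58 Silverstone kernel: mdcmd (58): spindown 7
May 22 18:09:15 Silverstone kernel: mdcmd (59): spindown 6
May 22 18:15:53 Silverstone kernel: md: disk6 read error, sector=4491512592
May 22 18:15:53 Silverstone kernel: md: disk7 read error, sector=4491512592
""".splitlines()

if __name__ == "__main__":
    print(seconds_from_spindown_to_error(SAMPLE))
```

For this excerpt the first errors arrive 398 seconds after disk 6 spun down and 775 seconds after disk 7 did. Only disks that had been spun down show errors here, which fits the suspicion, though two data points do not prove it.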

 

silverstone-diagnostics-20210522-1828.zip

Link to comment
On 5/23/2021 at 11:39 AM, JorgeB said:

It could, though I don't remember spin-down issues with WDs. Try disabling spin down to see if it changes anything.

After disabling spin down on all disks and restarting the server, it has now been running for 1 day and 3 hours without any errors, so I am assuming it is fixed.

Link to comment
  • Pjhal changed the title to (solved again) Array has 5 disks with read errors.
