1 Drive Redballed and 2 Drives Unmountable


Hi Guys, 

 

Starting a new thread, as my original post was about one redballed drive, and things seem to have spiraled. 

 

1 Drive is Red Balled

2 Drives are Unmountable

 

All 3 drives are on the same SAS card, but across two sets of breakout cables

 

I've tried to run XFS check on the unmountable drives, but they both come back with the following error:

 

Phase 1 - find and verify superblock...
superblock read failed, offset 0, size 524288, ag 0, rval -1
fatal error -- Input/output error

I've tried plugging working SATA cables into the drives -- cables that are working fine going directly to MOBO -- but the XFS check comes back the same. 

 

I've ordered new breakout cables for both slots on my card, just in case. I will have them tomorrow. 

 

Not sure how to proceed

 

Diagnostics attached

tower-diagnostics-20170606-1328.zip

Edited by newoski

  • Community Expert
12 minutes ago, newoski said:

I've tried to run XFS check on the unmountable drives

Did you do this from the webUI or from the command line? If from the command line, what was the exact command?

  • Author

WebGUI. I tried the default parameter, as well as -v

 

Same result with both

  • Community Expert

There's something strange going on here:

 

Jun  6 13:13:04 Tower kernel: Buffer I/O error on dev md11, logical block 1953506608, async page read
Jun  6 13:13:04 Tower kernel: Buffer I/O error on dev md12, logical block 1953506608, async page read
Jun  6 13:13:05 Tower kernel: Buffer I/O error on dev md18, logical block 1953506608, async page read

It looks like these 3 disks are not online. Can you reboot and post new diags?

  • Community Expert

OK, I understand what's happening now. Both disks 11 and 12 were disabled at some point in the past, and since disk18 is also disabled, you have 3 invalid disks. Dual parity can't emulate that many, hence the errors.
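To make the arithmetic explicit, here is a minimal Python sketch of the rule described above: the array can only emulate invalid disks while their count does not exceed the number of parity disks. The function name `can_emulate` is made up for illustration and is not part of unRAID.

```python
def can_emulate(invalid_disks: int, parity_disks: int) -> bool:
    """True if parity can still stand in for every invalid disk.

    Assumption: each parity disk covers at most one invalid disk,
    as described in this thread.
    """
    return invalid_disks <= parity_disks


# The situation in this thread: disks 11, 12 and 18 invalid, dual parity.
print(can_emulate(invalid_disks=3, parity_disks=2))  # False
# With only disk18 invalid, emulation would still work.
print(can_emulate(invalid_disks=1, parity_disks=2))  # True
```

With 3 invalid disks and 2 parity disks the check fails, which matches the unmountable disks and the I/O errors on md11, md12 and md18.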

  • Author
Just now, johnnie.black said:

OK, I understand what's happening now. Both disks 11 and 12 were disabled at some point in the past, and since disk18 is also disabled, you have 3 invalid disks. Dual parity can't emulate that many, hence the errors.


What order of operations would you recommend to rectify this?

  • Community Expert

Your best bet is doing a new config. I assume disk18 was the 1st disk to get disabled.

 

-Tools -> New Config
-assign all disks, previous disk order needs to be maintained, double check all disks are in the correct slots
-check both "parity is already valid" and "maintenance mode" before starting the array
-start the array
-stop array, unassign disk18
-start array, check emulated disk18 mounts and contents look correct (check that disks 11 and 12 mount also)
-if all looks good, stop array, reassign disk18
-start array to begin rebuild
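
As an illustration of the "double check all disks are in the correct slots" step, here is a minimal Python sketch that compares a record of the old assignments with the new ones. The slot names, the placeholder serials and the `misplaced_slots` helper are all hypothetical. unRAID does not provide this function, and you would fill in both mappings by hand, for example from a screenshot or old diagnostics.

```python
def misplaced_slots(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """Return the slots whose assigned disk differs between the two mappings."""
    slots = set(old) | set(new)
    return sorted(s for s in slots if old.get(s) != new.get(s))


# Hypothetical example data: slot name -> disk serial (placeholders, not real serials).
old_assignments = {"disk11": "SERIAL-A", "disk12": "SERIAL-B", "disk18": "SERIAL-C"}
new_assignments = {"disk11": "SERIAL-B", "disk12": "SERIAL-A", "disk18": "SERIAL-C"}

print(misplaced_slots(old_assignments, new_assignments))  # ['disk11', 'disk12']
```

An empty list means every slot holds the same disk as before, which is what you want to see before checking "parity is already valid".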

  • Author

1 minute ago, johnnie.black said:

Your best bet is doing a new config. I assume disk18 was the 1st disk to get disabled.

 

-Tools -> New Config
-assign all disks, previous disk order needs to be maintained, double check all disks are in the correct slots
-check both "parity is already valid" and "maintenance mode" before starting the array
-start the array
-stop array, unassign disk18
-start array, check emulated disk18 mounts and contents look correct (check that disks 11 and 12 mount also)
-if all looks good, stop array, reassign disk18
-start array to begin rebuild

 

I should wait until I get the new SATA breakout cables, and replace those first, correct?

 

(THANK YOU)

  • Community Expert

If you don't know why disks 11 and 12 got disabled (e.g. a touched cable), that's probably best.

  • Author
6 minutes ago, johnnie.black said:

If you don't know why disks 11 and 12 got disabled (e.g. a touched cable), that's probably best.

 

How would new config help with the drives being unmountable?

 

(I believe the cables got touched, most likely)

  • Community Expert
10 minutes ago, johnnie.black said:

OK, I understand what's happening now. Both disks 11 and 12 were disabled at some point in the past, and since disk18 is also disabled, you have 3 invalid disks. Dual parity can't emulate that many, hence the errors.

So shouldn't that really be 3 "redballed" disks instead of one?

 

newoski, do you have Notifications set up? Even with dual parity you should attend to problems immediately and not let them accumulate.

  • Author

 

Just now, trurl said:

So shouldn't that really be 3 "redballed" disks instead of one?

 

newoski, do you have Notifications set up? Even with dual parity you should attend to problems immediately and not let them accumulate.

 

I had 1 redballed. I was responding immediately; when I rebooted and followed johnnie.black's original instructions, the other 2 drives became unmountable. I'm sure the cables got bumped.

 

Which is to say, yes, I have notifications, and yes, I responded immediately. I would never let more than 1 drive sit redballed.

 

( :

  • Community Expert
6 minutes ago, newoski said:

the other 2 drives became unmountable

 

They became unmountable because they are disabled, and there's no way unRAID can emulate their data since you have 3 disabled disks in total.

Edited by johnnie.black

  • Author
1 minute ago, johnnie.black said:

 

They became unmountable because they are disabled, and there's no way unRAID can emulate their data since you have 3 disabled disks in total.

 

I understand that Unraid can only emulate 1 drive, per parity drive... but if the sata cable was bumped and disconnected... wouldn't they show up as Missing?

 

They showed up as Unmountable, which to me, meant that they were seen but Unraid wasn't able to read them or something... 

 

I have 2 parity drives. Therefore, if they were both truly missing, wouldn't Unraid have emulated at least one of the drives?

  • Community Expert

A disk is only disabled if a write to it fails, so if a cable was bumped, it was bumped while the array was mounted, and in that case the disk gets disabled.

 

You are one disk past redundancy, so no disk can be emulated.

  • Community Expert

If by any chance you have it, I would really like to see the syslog from when disks 11 and 12 got disabled. I suspect they were disabled during the disk18 rebuild, possibly as a result of a controller issue.

 

When errors happen on multiple disks, say unRAID loses contact with 8 disks on the same controller, the 1st one that it can't write to (or the 1st two if you have dual parity) gets disabled. The remaining disks will show read errors but don't get disabled past current redundancy. But if a disk is rebuilding, I suspect it's not counted as disabled, and since you have dual parity, 2 disks got disabled. Adding the third invalid disk leaves the user in a more complicated situation, so if this is what happened maybe it could be improved, but there's no way to know for sure without the logs.
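
The suspected behaviour can be modelled in a few lines of Python. This is only a sketch of the rule as described in this post, not unRAID's actual driver logic; the `disks_to_disable` function and its parameters are invented for illustration.

```python
def disks_to_disable(failed_writes: list[int], parity_count: int,
                     already_disabled: int = 0) -> list[int]:
    """Return the disks that get disabled, in the order their writes failed.

    Assumed rule: disks are only disabled while the number of disabled
    disks stays within the redundancy provided by parity.
    """
    budget = max(parity_count - already_disabled, 0)
    return failed_writes[:budget]


# Losing contact with 8 disks on one controller, dual parity: only the first two are disabled.
print(disks_to_disable([1, 2, 3, 4, 5, 6, 7, 8], parity_count=2))  # [1, 2]

# Suspected case: the rebuilding disk18 is NOT counted, so both 11 and 12 get disabled.
print(disks_to_disable([11, 12], parity_count=2, already_disabled=0))  # [11, 12]

# If the rebuilding disk were counted, only disk 11 would be disabled.
print(disks_to_disable([11, 12], parity_count=2, already_disabled=1))  # [11]
```

In the second case the two disabled disks plus the still-invalid disk18 add up to 3, one more than dual parity can cover.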

  • Author
3 minutes ago, johnnie.black said:

If by any chance you have it, I would really like to see the syslog from when disks 11 and 12 got disabled. I suspect they were disabled during the disk18 rebuild, possibly as a result of a controller issue.

 

When errors happen on multiple disks, say unRAID loses contact with 8 disks on the same controller, the 1st one that it can't write to (or the 1st two if you have dual parity) gets disabled. The remaining disks will show read errors but don't get disabled past current redundancy. But if a disk is rebuilding, I suspect it's not counted as disabled, and since you have dual parity, 2 disks got disabled. Adding the third invalid disk leaves the user in a more complicated situation, so if this is what happened maybe it could be improved, but there's no way to know for sure without the logs.

 

I believe you're asking for a syslog from the moment those drives disappeared. At this point I'm not sure I have one, or which of the many it would be, but in the spirit of gratitude, here are all the syslogs I have from today for your perusal.

tower-diagnostics-20170606-1328.zip

tower-smart-20170606-0721.zip

tower-smart-20170606-0725.zip

tower-smart-20170606-0836.zip

  • Community Expert

Unfortunately it's not there, but thanks anyway, I can simulate this on a test server, so I'll do that when I get the chance.

  • Author
2 hours ago, johnnie.black said:

Unfortunately it's not there, but thanks anyway, I can simulate this on a test server, so I'll do that when I get the chance.

 

Sorry and also... Hmmmmm... So I followed those steps, and everything seemed to work for a bit... then I started getting lots of errors in the syslog, and disk18 has totally disappeared from Explorer. It still shows up as Emulated in the GUI, but it's not being exported to Windows.

 

 

tower-diagnostics-20170606-1817.zip

  • Author
7 minutes ago, johnnie.black said:

You need to check the filesystem on the emulated disk18. Do it before rebuilding, or the actual disk18 will have the same issues.

 

https://wiki.lime-technology.com/Check_Disk_Filesystems#Drives_formatted_with_XFS


Thoughts?

 


root@Tower:~# xfs_repair -v /dev/md18
Phase 1 - find and verify superblock...
        - block cache size set to 2965960 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 769204 tail block 769200
ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed.  Mount the filesystem to replay the log, and unmount it before
re-running xfs_repair.  If you are unable to mount the filesystem, then use
the -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a mount
of the filesystem before doing this.
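
For anyone comparing the two different xfs_repair failures in this thread, here is a minimal Python sketch that tells them apart by their message text. The function name `classify_xfs_repair_output` and the labels it returns are made up for illustration; only the matched strings come from the output posted above.

```python
def classify_xfs_repair_output(output: str) -> str:
    """Roughly classify xfs_repair output using messages seen in this thread."""
    if "superblock read failed" in output or "Input/output error" in output:
        # The device could not be read at all, so there is nothing to repair yet.
        return "io_error"
    if "valuable metadata changes in a log" in output:
        # The log is dirty: try mounting first, use -L only if that fails.
        return "dirty_log"
    return "other"


first_attempt = """Phase 1 - find and verify superblock...
superblock read failed, offset 0, size 524288, ag 0, rval -1
fatal error -- Input/output error"""

second_attempt = """ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed.  Mount the filesystem to replay the log, and unmount it before
re-running xfs_repair."""

print(classify_xfs_repair_output(first_attempt))   # io_error
print(classify_xfs_repair_output(second_attempt))  # dirty_log
```

The first case points at the device being unreadable (here, disks that could not be emulated) rather than at the filesystem itself. The second is the case where xfs_repair itself suggests mounting to replay the log, with -L as the fallback if the mount fails.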
 

  • Author
8 minutes ago, johnnie.black said:

Use -L

 

Thanks again!!

  • Author
On 6/6/2017 at 6:48 PM, johnnie.black said:

Use -L

 

So using -L worked. I replaced the SATA breakout cables and rebuilt the drive. Everything's been going fine for the last 7 hours. While there are no red balled drives, I happened to walk past my NAS and saw a bunch of metadata errors pop up on the screen... Looks like there might still be a few issues leftover on the data side?

 

How should I address this?

 

Should I run XFS check on each drive on my array?

 

tower-diagnostics-20170608-2038.zip

Edited by newoski

Archived

This topic is now archived and is closed to further replies.
