[SOLVED] Help me identify: drive failure or controller failure?



Here's the log:

 

Mar 23 18:39:43 Tower kernel: sd 2:0:4:0: [sdm] command f713f540 timed out (Drive related)

Mar 23 18:39:43 Tower kernel: sas: Enter sas_scsi_recover_host busy: 1 failed: 1 (Drive related)

Mar 23 18:39:43 Tower kernel: sas: trying to find task 0xf71c7400 (Drive related)

Mar 23 18:39:43 Tower kernel: sas: sas_scsi_find_task: aborting task 0xf71c7400 (Drive related)

Mar 23 18:39:43 Tower kernel: sas: sas_scsi_find_task: task 0xf71c7400 is aborted (Drive related)

Mar 23 18:39:43 Tower kernel: sas: sas_eh_handle_sas_errors: task 0xf71c7400 is aborted (Errors)

Mar 23 18:39:43 Tower kernel: sas: ata13: end_device-2:4: cmd error handler (Errors)

Mar 23 18:39:43 Tower kernel: sas: ata9: end_device-2:0: dev error handler (Drive related)

Mar 23 18:39:43 Tower kernel: sas: ata10: end_device-2:1: dev error handler (Drive related)

Mar 23 18:39:43 Tower kernel: sas: ata11: end_device-2:2: dev error handler (Drive related)

Mar 23 18:39:43 Tower kernel: sas: ata12: end_device-2:3: dev error handler (Drive related)

Mar 23 18:39:43 Tower kernel: sas: ata13: end_device-2:4: dev error handler (Drive related)

Mar 23 18:39:43 Tower kernel: sas: ata14: end_device-2:5: dev error handler (Drive related)

Mar 23 18:39:43 Tower kernel: ata13.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen (Errors)

Mar 23 18:39:43 Tower kernel: ata13.00: failed command: READ FPDMA QUEUED (Minor Issues)

Mar 23 18:39:43 Tower kernel: ata13.00: cmd 60/08:00:68:ed:b6/00:00:a3:00:00/40 tag 0 ncq 4096 in (Drive related)

Mar 23 18:39:43 Tower kernel:          res 40/00:ff:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout) (Errors)

Mar 23 18:39:43 Tower kernel: ata13.00: status: { DRDY } (Drive related)

Mar 23 18:39:43 Tower kernel: ata13: hard resetting link (Minor Issues)

Mar 23 18:39:45 Tower kernel: mvsas 0000:02:00.0: Phy6 : No sig fis (Drive related)

Mar 23 18:39:46 Tower kernel: drivers/scsi/mvsas/mv_sas.c 1527:mvs_I_T_nexus_reset for device[4]:rc= 0 (System)

Mar 23 18:39:46 Tower kernel: sas: sas_form_port: phy6 belongs to port4 already(1)! (Drive related)

Mar 23 18:39:51 Tower kernel: ata13.00: qc timeout (cmd 0x27) (Drive related)

Mar 23 18:39:51 Tower kernel: ata13.00: failed to read native max address (err_mask=0x4) (Minor Issues)

Mar 23 18:39:51 Tower kernel: ata13.00: HPA support seems broken, skipping HPA handling (Minor Issues)

Mar 23 18:39:51 Tower kernel: ata13.00: revalidation failed (errno=-5) (Minor Issues)

Mar 23 18:39:51 Tower kernel: ata13: hard resetting link (Minor Issues)

Mar 23 18:39:53 Tower kernel: drivers/scsi/mvsas/mv_sas.c 1527:mvs_I_T_nexus_reset for device[4]:rc= 0 (System)

Mar 23 18:39:53 Tower kernel: sas: sas_ata_task_done: SAS error 8a (Errors)

Mar 23 18:39:53 Tower kernel: sas: sas_ata_task_done: SAS error 8a (Errors)

Mar 23 18:39:53 Tower kernel: ata13.00: both IDENTIFYs aborted, assuming NODEV (Drive related)

Mar 23 18:39:53 Tower kernel: ata13.00: revalidation failed (errno=-2) (Minor Issues)

Mar 23 18:39:54 Tower kernel: mvsas 0000:02:00.0: Phy6 : No sig fis (Drive related)

Mar 23 18:39:58 Tower kernel: sas: sas_form_port: phy6 belongs to port4 already(1)! (Drive related)

Mar 23 18:39:58 Tower kernel: ata13: hard resetting link (Minor Issues)

Mar 23 18:40:04 Tower kernel: ata13.00: qc timeout (cmd 0xef) (Drive related)

Mar 23 18:40:04 Tower kernel: ata13.00: failed to set xfermode (err_mask=0x4) (Minor Issues)

Mar 23 18:40:04 Tower kernel: ata13.00: disabled (Errors)

Mar 23 18:40:04 Tower kernel: ata13.00: device reported invalid CHS sector 0 (Drive related)

Mar 23 18:40:04 Tower kernel: ata13: hard resetting link (Minor Issues)

Mar 23 18:40:06 Tower kernel: drivers/scsi/mvsas/mv_sas.c 1527:mvs_I_T_nexus_reset for device[4]:rc= 0 (System)

Mar 23 18:40:06 Tower kernel: ata13: EH complete (Drive related)

Mar 23 18:40:06 Tower kernel: sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1 (Drive related)

Mar 23 18:40:06 Tower kernel: sd 2:0:4:0: [sdm] Unhandled error code (Errors)

Mar 23 18:40:06 Tower kernel: sd 2:0:4:0: [sdm]  (Drive related)

Mar 23 18:40:06 Tower kernel: Result: hostbyte=0x04 driverbyte=0x00 (System)

Mar 23 18:40:06 Tower kernel: sd 2:0:4:0: [sdm] CDB:  (Drive related)

Mar 23 18:40:06 Tower kernel: cdb[0]=0x28: 28 00 a3 b6 ed 68 00 00 08 00

Mar 23 18:40:06 Tower kernel: end_request: I/O error, dev sdm, sector 2746674536 (Errors)

Mar 23 18:40:06 Tower kernel: md: disk5 read error, sector=2746674472 (Errors)

Mar 23 18:40:06 Tower kernel: sd 2:0:4:0: [sdm] READ CAPACITY(16) failed (Drive related)

Mar 23 18:40:06 Tower kernel: sd 2:0:4:0: [sdm]  (Drive related)

Mar 23 18:40:06 Tower kernel: Result: hostbyte=0x04 driverbyte=0x00 (System)

Mar 23 18:40:06 Tower kernel: sd 2:0:4:0: [sdm] Sense not available. (Drive related)

Mar 23 18:40:06 Tower kernel: sd 2:0:4:0: [sdm] READ CAPACITY failed (Drive related)

Mar 23 18:40:06 Tower kernel: sd 2:0:4:0: [sdm]  (Drive related)

Mar 23 18:40:06 Tower kernel: Result: hostbyte=0x04 driverbyte=0x00 (System)

Mar 23 18:40:06 Tower kernel: sd 2:0:4:0: [sdm] Sense not available. (Drive related)

Mar 23 18:40:06 Tower kernel: sd 2:0:4:0: [sdm] Asking for cache data failed (Drive related)

Mar 23 18:40:06 Tower kernel: sd 2:0:4:0: [sdm] Assuming drive cache: write through (Drive related)

Mar 23 18:40:06 Tower kernel: sdm: detected capacity change from 1500301910016 to 0 (Drive related)

Mar 23 18:40:06 Tower kernel: mvsas 0000:02:00.0: Phy6 : No sig fis (Drive related)

Mar 23 18:40:10 Tower kernel: sas: sas_form_port: phy6 belongs to port4 already(1)! (Drive related)

Mar 23 18:40:19 Tower kernel: md: disk5 write error, sector=2746674472 (Errors)

Mar 23 18:40:19 Tower kernel: REISERFS warning: reiserfs-5090 is_tree_node: node level 14614 does not match to the expected one 1 (Minor Issues)

Mar 23 18:40:19 Tower kernel: REISERFS error (device md5): vs-5150 search_by_key: invalid format found in block 343334309. Fsck? (Errors)

Mar 23 18:40:19 Tower kernel: md: recovery thread woken up ... (unRAID engine)

Mar 23 18:40:19 Tower kernel: REISERFS (device md5): Remounting filesystem read-only (Drive related)

Mar 23 18:40:19 Tower kernel: REISERFS error (device md5): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [1274 10843 0x0 SD] (Errors)

Mar 23 18:40:20 Tower kernel: md: recovery thread has nothing to resync (unRAID engine)
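The failing request in the log above can be cross-checked by hand. The sketch below decodes the READ(10) CDB printed by the kernel (`cdb[0]=0x28: 28 00 a3 b6 ed 68 00 00 08 00`) and compares its block address with the sector reported by the md driver. The variable names are only for illustration. The difference comes out as 64 sectors, which would be consistent with the data partition starting at sector 64; that the old disk used that offset is an assumption based on the "partition 1 offset 64" line logged later for the replacement disk.

```python
# Decode the READ(10) CDB from the kernel log and relate it to the md sector.
cdb = bytes.fromhex("28 00 a3 b6 ed 68 00 00 08 00")

opcode = cdb[0]                            # 0x28 is READ(10)
lba = int.from_bytes(cdb[2:6], "big")      # bytes 2..5: logical block address
blocks = int.from_bytes(cdb[7:9], "big")   # bytes 7..8: transfer length in blocks

md_sector = 2746674472                     # from "md: disk5 read error, sector=..."
offset = lba - md_sector                   # gap between raw-disk and md addressing

print(hex(opcode), lba, blocks, blocks * 512, offset)
```

The decoded address matches the "end_request: I/O error, dev sdm, sector 2746674536" line, and 8 blocks of 512 bytes matches the "ncq 4096 in" transfer size, so all three messages describe the same single 4 KiB read.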

 

 

 

I had a drive go down a couple of weeks ago, but that turned out to be a bad controller. This drive is one of the oldest in the array, so a failure would make sense, and it is connected to a different controller than the one I replaced a couple of weeks ago.
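One rough way to separate a drive problem from a controller problem is to count which ATA ports actually report faults. The sketch below does that for a few lines copied from the log above (trailing category tags dropped). The list of phrases treated as a fault is a hypothetical choice, not an official classification, and faults concentrated on one port can still come from a bad cable or a bad port rather than the drive itself.

```python
import re
from collections import Counter

# A few lines copied from the log above (trailing tags dropped).
LOG = """\
Mar 23 18:39:43 Tower kernel: sas: ata12: end_device-2:3: dev error handler
Mar 23 18:39:43 Tower kernel: sas: ata13: end_device-2:4: dev error handler
Mar 23 18:39:43 Tower kernel: ata13.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
Mar 23 18:39:43 Tower kernel: ata13.00: failed command: READ FPDMA QUEUED
Mar 23 18:39:43 Tower kernel: ata13: hard resetting link
Mar 23 18:39:51 Tower kernel: ata13.00: qc timeout (cmd 0x27)
Mar 23 18:40:04 Tower kernel: ata13.00: disabled
"""

# Hypothetical set of phrases that count as a real fault on a port.
FAULT = re.compile(r"\b(ata\d+)(?:\.\d+)?: (exception|failed command|qc timeout|disabled)")

def faults_per_port(text):
    """Count fault lines per ATA port, ignoring routine error-handler passes."""
    counts = Counter()
    for line in text.splitlines():
        match = FAULT.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

counts = faults_per_port(LOG)
print(counts)
```

In this excerpt every fault lands on ata13, while the other ports on the same controller only show the routine "dev error handler" pass.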

 

 

Link to comment

So most likely it's a bad drive.

 

I couldn't get a SMART report from the drive at first, but after powering down and powering back up, a short SMART test gives:

 

smartctl 6.2 2013-07-26 r3841 [i686-linux-3.9.11p-unRAID] (local build)

Copyright © 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

 

=== START OF INFORMATION SECTION ===

Model Family:    Western Digital Caviar Green

Device Model:    WDC WD15EADS-00P8B0

Serial Number:    WD-WMAVU0039302

LU WWN Device Id: 5 0014ee 056d2dfe2

Firmware Version: 01.00A01

User Capacity:    1,500,301,910,016 bytes [1.50 TB]

Sector Size:      512 bytes logical/physical

Device is:        In smartctl database [for details use: -P show]

ATA Version is:  ATA8-ACS (minor revision not indicated)

SATA Version is:  SATA 2.6, 3.0 Gb/s

Local Time is:    Tue Mar 24 21:59:10 2015 EDT

SMART support is: Available - device has SMART capability.

SMART support is: Enabled

 

=== START OF READ SMART DATA SECTION ===

SMART overall-health self-assessment test result: PASSED

 

General SMART Values:

Offline data collection status:  (0x84) Offline data collection activity
                                        was suspended by an interrupting command from host.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (33600) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 383) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x303f) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

 

SMART Attributes Data Structure revision number: 16

Vendor Specific SMART Attributes with Thresholds:

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       872
  3 Spin_Up_Time            0x0027   192   180   021    Pre-fail  Always       -       5366
  4 Start_Stop_Count        0x0032   098   098   000    Old_age   Always       -       2184
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   047   047   000    Old_age   Always       -       39064
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       79
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       22
193 Load_Cycle_Count        0x0032   199   199   000    Old_age   Always       -       3876
194 Temperature_Celsius     0x0022   122   105   000    Old_age   Always       -       28
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       2
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   199   000    Old_age   Offline      -       47

 

SMART Error Log Version: 1

No Errors Logged

 

SMART Self-test log structure revision number 1

Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error

# 1  Short offline      Completed without error      00%    39064        -

# 2  Extended offline    Completed without error      00%      4624        -

# 3  Extended offline    Completed without error      00%      4456        -

# 4  Extended offline    Completed without error      00%      4289        -

# 5  Extended offline    Completed without error      00%      4121        -

# 6  Extended offline    Completed without error      00%      3954        -

# 7  Extended offline    Completed without error      00%      3786        -

# 8  Extended offline    Completed without error      00%      3619        -

# 9  Extended offline    Completed without error      00%      3451        -

#10  Extended offline    Completed without error      00%      3285        -

#11  Extended offline    Completed without error      00%      3116        -

#12  Extended offline    Completed without error      00%      2948        -

#13  Extended offline    Completed without error      00%      2783        -

#14  Extended offline    Completed without error      00%      2613        -

#15  Extended offline    Completed without error      00%      2445        -

#16  Extended offline    Completed without error      00%      2278        -

#17  Extended offline    Completed without error      00%      2110        -

#18  Extended offline    Completed without error      00%      1943        -

#19  Extended offline    Completed without error      00%      1775        -

#20  Extended offline    Completed without error      00%      1610        -

#21  Extended offline    Completed without error      00%      1445        -

 

SMART Selective self-test log data structure revision number 1

SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS

    1        0        0  Not_testing

    2        0        0  Not_testing

    3        0        0  Not_testing

    4        0        0  Not_testing

    5        0        0  Not_testing

Selective self-test flags (0x0):

  After scanning selected spans, do NOT read-scan remainder of disk.

If Selective self-test is pending on power-up, resume after 0 minute delay.

 

 

 

So I see 2 pending sectors (Current_Pending_Sector), so the drive is probably on its way out.
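Picking the interesting rows out of a SMART report by eye is error-prone, so here is a minimal sketch that scans attribute rows like the ones above and reports any watched attribute with a non-zero raw value. The `WATCH` set and the function name are hypothetical choices for illustration, and a non-zero count is a prompt to look closer, not proof that the drive is dying.

```python
# Attributes worth a second look when their raw value is non-zero (illustrative choice).
WATCH = {
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
    "UDMA_CRC_Error_Count",
}

# Rows copied from the smartctl report above.
REPORT = """\
  5 Reallocated_Sector_Ct  0x0033  200  200  140    Pre-fail  Always      -      0
197 Current_Pending_Sector  0x0032  200  200  000    Old_age  Always      -      2
198 Offline_Uncorrectable  0x0030  200  200  000    Old_age  Offline      -      0
199 UDMA_CRC_Error_Count    0x0032  200  200  000    Old_age  Always      -      0
"""

def nonzero_watched(report):
    """Return {attribute_name: raw_value} for watched attributes that are non-zero."""
    found = {}
    for line in report.splitlines():
        fields = line.split()
        # A data row has 10 columns: ID, name, flag, value, worst, thresh,
        # type, updated, when_failed, raw value.
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in WATCH:
            raw = int(fields[9])
            if raw:
                found[fields[1]] = raw
    return found

flagged = nonzero_watched(REPORT)
print(flagged)
```

For this drive only Current_Pending_Sector is flagged; the reallocated and CRC error counts are all zero.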

 

I bought a new drive and swapped it in for this one, but when I selected the new drive and started the array, unRAID began mounting the disks and then just sat there. Normally it tells me it's rebuilding the array, but right now it isn't.

 

Mar 24 21:22:14 Tower emhttp: writing GPT on disk (sdn), with partition 1 offset 64, erased: 0 (Drive related)

Mar 24 21:22:14 Tower emhttp: shcmd (88): sgdisk -Z /dev/sdn $stuff$> /dev/null (Drive related)

Mar 24 21:22:16 Tower emhttp: shcmd (89): sgdisk -o -a 64 -n 1:64:0 /dev/sdn |$stuff$ logger (Drive related)

Mar 24 21:22:16 Tower kernel:  sdn: unknown partition table (Drive related)

Mar 24 21:22:17 Tower logger: Creating new GPT entries.

Mar 24 21:22:17 Tower logger: The operation has completed successfully.

Mar 24 21:22:17 Tower emhttp: shcmd (90): udevadm settle (Other emhttp)

Mar 24 21:22:17 Tower kernel:  sdn: sdn1 (Drive related)

Mar 24 21:22:17 Tower emhttp: Start array... (Other emhttp)

Mar 24 21:22:17 Tower kernel: mdcmd (52): start RECON_DISK (unRAID engine)

Mar 24 21:22:17 Tower kernel: unraid: allocating 207168K for 4096 stripes (12 disks)

Mar 24 21:22:17 Tower kernel: md1: running, size: 1953514552 blocks (Drive related)

Mar 24 21:22:17 Tower kernel: md2: running, size: 1953514552 blocks (Drive related)

Mar 24 21:22:17 Tower kernel: md3: running, size: 3907018532 blocks (Drive related)

Mar 24 21:22:17 Tower kernel: md4: running, size: 2930266532 blocks (Drive related)

Mar 24 21:22:17 Tower kernel: md5: running, size: 3907018532 blocks (Drive related)

Mar 24 21:22:17 Tower kernel: md6: running, size: 1465138552 blocks (Drive related)

Mar 24 21:22:17 Tower kernel: md7: running, size: 1465138552 blocks (Drive related)

Mar 24 21:22:17 Tower kernel: md8: running, size: 1953514552 blocks (Drive related)

Mar 24 21:22:17 Tower kernel: md9: running, size: 1953514552 blocks (Drive related)

Mar 24 21:22:17 Tower kernel: md10: running, size: 1953514552 blocks (Drive related)

Mar 24 21:22:17 Tower kernel: md11: running, size: 3907018532 blocks (Drive related)

Mar 24 21:22:17 Tower emhttp: get_filesystem_status: statfs: /mnt/user/Console No such file or directory (Other emhttp)

Mar 24 21:22:17 Tower emhttp: get_filesystem_status: statfs: /mnt/user/Music No such file or directory (Other emhttp)

Mar 24 21:22:17 Tower emhttp: get_filesystem_status: statfs: /mnt/user/Personal No such file or directory (Other emhttp)

Mar 24 21:22:17 Tower emhttp: get_filesystem_status: statfs: /mnt/user/Pictures No such file or directory (Other emhttp)

Mar 24 21:22:17 Tower emhttp: get_filesystem_status: statfs: /mnt/user/Videos No such file or directory (Other emhttp)

Mar 24 21:22:17 Tower emhttp: get_filesystem_status: statfs: /mnt/user/mysql No such file or directory (Other emhttp)

Mar 24 21:22:17 Tower emhttp: shcmd (91): udevadm settle (Other emhttp)

Mar 24 21:22:17 Tower emhttp: shcmd (92): /usr/local/sbin/emhttp_event array_started (Other emhttp)

Mar 24 21:22:17 Tower emhttp_event: array_started (Other emhttp)

Mar 24 21:22:17 Tower emhttp: Mounting disks... (Other emhttp)

Mar 24 21:22:17 Tower emhttp: shcmd (93): mkdir /mnt/disk1 (Routine)

Mar 24 21:22:17 Tower emhttp: shcmd (94): set -o pipefail ; mount -t reiserfs -o user_xattr,acl,noatime,nodiratime /dev/md1 /mnt/disk1 |$stuff$ logger (Other emhttp)

Mar 24 21:22:17 Tower kernel: REISERFS (device md1): found reiserfs format "3.6" with standard journal (Routine)

Mar 24 21:22:17 Tower kernel: REISERFS (device md1): using ordered data mode (Routine)

Mar 24 21:22:17 Tower kernel: reiserfs: using flush barriers

Mar 24 21:22:17 Tower kernel: REISERFS (device md1): journal params: device md1, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30 (Routine)

Mar 24 21:22:17 Tower kernel: REISERFS (device md1): checking transaction log (md1) (Routine)

Mar 24 21:22:18 Tower kernel: REISERFS (device md1): Using r5 hash to sort names (Routine)

Mar 24 21:22:18 Tower emhttp: shcmd (95): mkdir /mnt/disk2 (Routine)

Mar 24 21:22:18 Tower emhttp: shcmd (96): set -o pipefail ; mount -t reiserfs -o user_xattr,acl,noatime,nodiratime /dev/md2 /mnt/disk2 |$stuff$ logger (Other emhttp)

Mar 24 21:22:18 Tower kernel: REISERFS (device md2): found reiserfs format "3.6" with standard journal (Routine)

Mar 24 21:22:18 Tower kernel: REISERFS (device md2): using ordered data mode (Routine)

Mar 24 21:22:18 Tower kernel: reiserfs: using flush barriers

Mar 24 21:22:18 Tower kernel: REISERFS (device md2): journal params: device md2, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30 (Routine)

Mar 24 21:22:18 Tower kernel: REISERFS (device md2): checking transaction log (md2) (Routine)

Mar 24 21:22:18 Tower kernel: REISERFS (device md2): Using r5 hash to sort names (Routine)

Mar 24 21:22:18 Tower emhttp: shcmd (97): mkdir /mnt/disk3 (Routine)

Mar 24 21:22:18 Tower emhttp: shcmd (98): set -o pipefail ; mount -t reiserfs -o user_xattr,acl,noatime,nodiratime /dev/md3 /mnt/disk3 |$stuff$ logger (Other emhttp)

Mar 24 21:22:18 Tower kernel: REISERFS (device md3): found reiserfs format "3.6" with standard journal (Routine)

Mar 24 21:22:18 Tower kernel: REISERFS (device md3): using ordered data mode (Routine)

Mar 24 21:22:18 Tower kernel: reiserfs: using flush barriers

Mar 24 21:22:18 Tower kernel: REISERFS (device md3): journal params: device md3, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30 (Routine)

Mar 24 21:22:18 Tower kernel: REISERFS (device md3): checking transaction log (md3) (Routine)

Mar 24 21:22:18 Tower kernel: REISERFS (device md3): Using r5 hash to sort names (Routine)

Mar 24 21:22:18 Tower emhttp: shcmd (99): mkdir /mnt/disk4 (Routine)

Mar 24 21:22:18 Tower emhttp: shcmd (100): set -o pipefail ; mount -t reiserfs -o user_xattr,acl,noatime,nodiratime /dev/md4 /mnt/disk4 |$stuff$ logger (Other emhttp)

Mar 24 21:22:18 Tower kernel: REISERFS (device md4): found reiserfs format "3.6" with standard journal (Routine)

Mar 24 21:22:18 Tower kernel: REISERFS (device md4): using ordered data mode (Routine)

Mar 24 21:22:18 Tower kernel: reiserfs: using flush barriers

Mar 24 21:22:18 Tower kernel: REISERFS (device md4): journal params: device md4, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30 (Routine)

Mar 24 21:22:18 Tower kernel: REISERFS (device md4): checking transaction log (md4) (Routine)

Mar 24 21:22:18 Tower kernel: REISERFS (device md4): Using r5 hash to sort names (Routine)

Mar 24 21:22:18 Tower emhttp: shcmd (101): mkdir /mnt/disk5 (Routine)

Mar 24 21:22:18 Tower emhttp: shcmd (102): set -o pipefail ; mount -t reiserfs -o user_xattr,acl,noatime,nodiratime /dev/md5 /mnt/disk5 |$stuff$ logger (Other emhttp)

Mar 24 21:22:18 Tower kernel: REISERFS (device md5): found reiserfs format "3.6" with standard journal (Routine)

Mar 24 21:22:18 Tower kernel: REISERFS (device md5): using ordered data mode (Routine)

Mar 24 21:22:18 Tower kernel: reiserfs: using flush barriers

Mar 24 21:22:18 Tower kernel: REISERFS (device md5): journal params: device md5, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30 (Routine)

Mar 24 21:22:18 Tower kernel: REISERFS (device md5): checking transaction log (md5) (Routine)

Mar 24 21:22:19 Tower kernel: REISERFS (device md5): Using r5 hash to sort names (Routine)

 

 

 

 

 

I'm hesitant to do anything at this point for fear of destroying my data, but I've never seen it take this long at this stage. This is a brand new drive that was not precleared. Should I just leave it alone?

 

 

Link to comment

Sorry - not quite clear what you did.

 

The 2 pending sectors might very well have resolved if you had run a parity check. Although a pending sector can interfere with a drive rebuild, my experience is that they can also be harmless or go away.

 

But if you decided to replace the disk, that should also be fine.

 

I am not clear what disk you replaced. Was it disk6?

 

I am wondering if you have a bad or loose cable and did not secure disk6 fully, because it seems to have mounted disk5 and then stopped according to the log.

 

This is not a good situation, but I do not think you are at serious risk of losing data.

 

Does the WebGui come up? Is it responsive? I think I would try to run the powerdown script, shut down the server, make sure all of the drives are properly secured, and boot again. I am not sure exactly what state you will find, but likely it will be prepared to rebuild the removed disk.

 

Sorry I can't be more help. I will be offline until tomorrow, but I am sure someone can help further if you need additional assistance.

Link to comment

Well, it looks like it just needed some time. It finished mounting the rest of my disks and is now rebuilding the drive. Weird.
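To put a number on "some time": the last md5 mount message in the earlier excerpt is stamped 21:22:19, and the next emhttp message, "resized: /mnt/disk5", is stamped 22:05:24. The sketch below computes the gap. Syslog timestamps carry no year, so 2015 is assumed from the "Local Time" line of the smartctl output.

```python
from datetime import datetime

YEAR = 2015  # assumption: syslog omits the year; taken from the smartctl "Local Time" line

def parse(stamp):
    """Parse a syslog-style 'Mon DD HH:MM:SS' stamp, supplying the assumed year."""
    return datetime.strptime(f"{YEAR} {stamp}", "%Y %b %d %H:%M:%S")

start = parse("Mar 24 21:22:19")  # "REISERFS (device md5): Using r5 hash to sort names"
end = parse("Mar 24 22:05:24")    # "emhttp: resized: /mnt/disk5"
gap = end - start

print(gap, int(gap.total_seconds()))
```

So the array start sat on disk5 for about 43 minutes with nothing logged. The log does not say what happened in that window, only that a resize of /mnt/disk5 completed at the end of it.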

 

Is the disk24 mount error anything to be concerned about, though?

 

Mar 24 22:05:24 Tower emhttp: resized: /mnt/disk5 (Other emhttp)

Mar 24 22:05:24 Tower emhttp: shcmd (103): mkdir /mnt/disk6 (Routine)

Mar 24 22:05:24 Tower emhttp: shcmd (104): set -o pipefail ; mount -t reiserfs -o user_xattr,acl,noatime,nodiratime /dev/md6 /mnt/disk6 |$stuff$ logger (Other emhttp)

Mar 24 22:05:24 Tower kernel: REISERFS (device md6): found reiserfs format "3.6" with standard journal (Routine)

Mar 24 22:05:24 Tower kernel: REISERFS (device md6): using ordered data mode (Routine)

Mar 24 22:05:24 Tower kernel: reiserfs: using flush barriers

Mar 24 22:05:24 Tower kernel: REISERFS (device md6): journal params: device md6, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30 (Routine)

Mar 24 22:05:24 Tower kernel: REISERFS (device md6): checking transaction log (md6) (Routine)

Mar 24 22:05:24 Tower kernel: REISERFS (device md6): Using r5 hash to sort names (Routine)

Mar 24 22:05:24 Tower emhttp: shcmd (105): mkdir /mnt/disk7 (Routine)

Mar 24 22:05:24 Tower emhttp: shcmd (106): set -o pipefail ; mount -t reiserfs -o user_xattr,acl,noatime,nodiratime /dev/md7 /mnt/disk7 |$stuff$ logger (Other emhttp)

Mar 24 22:05:24 Tower kernel: REISERFS (device md7): found reiserfs format "3.6" with standard journal (Routine)

Mar 24 22:05:24 Tower kernel: REISERFS (device md7): using ordered data mode (Routine)

Mar 24 22:05:24 Tower kernel: reiserfs: using flush barriers

Mar 24 22:05:24 Tower kernel: REISERFS (device md7): journal params: device md7, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30 (Routine)

Mar 24 22:05:24 Tower kernel: REISERFS (device md7): checking transaction log (md7) (Routine)

Mar 24 22:05:24 Tower kernel: REISERFS (device md7): Using r5 hash to sort names (Routine)

Mar 24 22:05:24 Tower emhttp: shcmd (107): mkdir /mnt/disk8 (Routine)

Mar 24 22:05:24 Tower emhttp: shcmd (108): set -o pipefail ; mount -t reiserfs -o user_xattr,acl,noatime,nodiratime /dev/md8 /mnt/disk8 |$stuff$ logger (Other emhttp)

Mar 24 22:05:24 Tower kernel: REISERFS (device md8): found reiserfs format "3.6" with standard journal (Routine)

Mar 24 22:05:24 Tower kernel: REISERFS (device md8): using ordered data mode (Routine)

Mar 24 22:05:24 Tower kernel: reiserfs: using flush barriers

Mar 24 22:05:24 Tower kernel: REISERFS (device md8): journal params: device md8, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30 (Routine)

Mar 24 22:05:24 Tower kernel: REISERFS (device md8): checking transaction log (md8) (Routine)

Mar 24 22:05:24 Tower kernel: REISERFS (device md8): Using r5 hash to sort names (Routine)

Mar 24 22:05:24 Tower emhttp: shcmd (109): mkdir /mnt/disk9 (Routine)

Mar 24 22:05:24 Tower emhttp: shcmd (110): set -o pipefail ; mount -t reiserfs -o user_xattr,acl,noatime,nodiratime /dev/md9 /mnt/disk9 |$stuff$ logger (Other emhttp)

Mar 24 22:05:24 Tower kernel: REISERFS (device md9): found reiserfs format "3.6" with standard journal (Routine)

Mar 24 22:05:24 Tower kernel: REISERFS (device md9): using ordered data mode (Routine)

Mar 24 22:05:24 Tower kernel: reiserfs: using flush barriers

Mar 24 22:05:24 Tower kernel: REISERFS (device md9): journal params: device md9, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30 (Routine)

Mar 24 22:05:24 Tower kernel: REISERFS (device md9): checking transaction log (md9) (Routine)

Mar 24 22:05:24 Tower kernel: REISERFS (device md9): Using r5 hash to sort names (Routine)

Mar 24 22:05:25 Tower emhttp: shcmd (111): mkdir /mnt/disk10 (Routine)

Mar 24 22:05:25 Tower emhttp: shcmd (112): set -o pipefail ; mount -t reiserfs -o user_xattr,acl,noatime,nodiratime /dev/md10 /mnt/disk10 |$stuff$ logger (Other emhttp)

Mar 24 22:05:25 Tower kernel: REISERFS (device md10): found reiserfs format "3.6" with standard journal (Routine)

Mar 24 22:05:25 Tower kernel: REISERFS (device md10): using ordered data mode (Routine)

Mar 24 22:05:25 Tower kernel: reiserfs: using flush barriers

Mar 24 22:05:25 Tower kernel: REISERFS (device md10): journal params: device md10, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30 (Routine)

Mar 24 22:05:25 Tower kernel: REISERFS (device md10): checking transaction log (md10) (Routine)

Mar 24 22:05:25 Tower kernel: REISERFS (device md10): Using r5 hash to sort names (Routine)

Mar 24 22:05:25 Tower emhttp: shcmd (113): mkdir /mnt/disk11 (Routine)

Mar 24 22:05:25 Tower emhttp: shcmd (114): set -o pipefail ; mount -t reiserfs -o user_xattr,acl,noatime,nodiratime /dev/md11 /mnt/disk11 |$stuff$ logger (Other emhttp)

Mar 24 22:05:25 Tower kernel: REISERFS (device md11): found reiserfs format "3.6" with standard journal (Routine)

Mar 24 22:05:25 Tower kernel: REISERFS (device md11): using ordered data mode (Routine)

Mar 24 22:05:25 Tower kernel: reiserfs: using flush barriers

Mar 24 22:05:25 Tower kernel: REISERFS (device md11): journal params: device md11, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30 (Routine)

Mar 24 22:05:25 Tower kernel: REISERFS (device md11): checking transaction log (md11) (Routine)

Mar 24 22:05:25 Tower kernel: REISERFS (device md11): Using r5 hash to sort names (Routine)

Mar 24 22:05:25 Tower emhttp: shcmd (115): mkdir /mnt/cache (Other emhttp)

Mar 24 22:05:25 Tower emhttp: _shcmd: shcmd (115): exit status: 1 (Other emhttp)

Mar 24 22:05:25 Tower emhttp: shcmd (116): set -o pipefail ; mount -t reiserfs -o user_xattr,acl,noatime,nodiratime /dev/sdb1 /mnt/cache |$stuff$ logger (Drive related)

Mar 24 22:05:25 Tower logger: mount: /dev/sdb1 already mounted or /mnt/cache busy (Drive related)

Mar 24 22:05:25 Tower logger: mount: according to mtab, /dev/sdb1 is already mounted on /mnt/cache (Drive related)

Mar 24 22:05:25 Tower emhttp: _shcmd: shcmd (116): exit status: 32 (Other emhttp)

Mar 24 22:05:25 Tower emhttp: disk24 mount error: 32 (Errors)

Mar 24 22:05:25 Tower emhttp: shcmd (117): rmdir /mnt/cache (Other emhttp)

Mar 24 22:05:25 Tower emhttp: _shcmd: shcmd (117): exit status: 1 (Other emhttp)

Mar 24 22:05:26 Tower emhttp: get_filesystem_status: statfs: /mnt/user/Console No such file or directory (Other emhttp)

Mar 24 22:05:26 Tower emhttp: get_filesystem_status: statfs: /mnt/user/Music No such file or directory (Other emhttp)

Mar 24 22:05:26 Tower emhttp: get_filesystem_status: statfs: /mnt/user/Personal No such file or directory (Other emhttp)

Mar 24 22:05:26 Tower emhttp: get_filesystem_status: statfs: /mnt/user/Pictures No such file or directory (Other emhttp)

Mar 24 22:05:26 Tower emhttp: get_filesystem_status: statfs: /mnt/user/Videos No such file or directory (Other emhttp)

Mar 24 22:05:26 Tower emhttp: get_filesystem_status: statfs: /mnt/user/mysql No such file or directory (Other emhttp)

Mar 24 22:05:26 Tower emhttp: shcmd (118): /usr/local/sbin/emhttp_event disks_mounted (Other emhttp)

Mar 24 22:05:26 Tower emhttp_event: disks_mounted (Other emhttp)

Mar 24 22:05:26 Tower kernel: mdcmd (53): check CORRECT (unRAID engine)

Mar 24 22:05:26 Tower kernel: md: recovery thread woken up ... (unRAID engine)

Mar 24 22:05:26 Tower kernel: md: recovery thread rebuilding disk5 ... (unRAID engine)

Mar 24 22:05:26 Tower kernel: md: using 5120k window, over a total of 3907018532 blocks. (unRAID engine)

Mar 24 22:05:27 Tower emhttp: shcmd (119): :>/etc/samba/smb-shares.conf (Other emhttp)

Mar 24 22:05:27 Tower avahi-daemon[1280]: Files changed, reloading.

Mar 24 22:05:27 Tower emhttp: Restart SMB... (Other emhttp)

Mar 24 22:05:27 Tower emhttp: shcmd (120): killall -HUP smbd (Minor Issues)

Mar 24 22:05:27 Tower emhttp: shcmd (121): cp /etc/avahi/services/smb.service- /etc/avahi/services/smb.service (Other emhttp)

Mar 24 22:05:27 Tower avahi-daemon[1280]: Files changed, reloading.

Mar 24 22:05:27 Tower emhttp: shcmd (122): ps axc | grep -q rpc.mountd (Other emhttp)

Mar 24 22:05:27 Tower avahi-daemon[1280]: Service group file /services/smb.service changed, reloading.

Mar 24 22:05:27 Tower emhttp: Stop NFS... (Other emhttp)

Mar 24 22:05:27 Tower emhttp: shcmd (123): /etc/rc.d/rc.nfsd stop |$stuff$ logger (Other emhttp)

Mar 24 22:05:27 Tower mountd[7570]: Caught signal 15, un-registering and exiting.

Mar 24 22:05:27 Tower avahi-daemon[1280]: Service "Tower" (/services/smb.service) successfully established.

Mar 24 22:05:28 Tower emhttp: shcmd (124): /usr/local/sbin/emhttp_event svcs_restarted (Other emhttp)

Mar 24 22:05:28 Tower emhttp_event: svcs_restarted (Other emhttp)

Mar 24 22:05:28 Tower emhttp: shcmd (125): /usr/local/sbin/emhttp_event started (Other emhttp)

Mar 24 22:05:28 Tower kernel: nfsd: last server has exited, flushing export cache

Mar 24 22:05:28 Tower emhttp_event: started (Other emhttp)

Mar 24 22:05:43 Tower avahi-daemon[1280]: Invalid response packet from host 192.168.0.119.

Mar 24 22:07:23 Tower avahi-daemon[1280]: Invalid response packet from host 192.168.0.119.

Mar 24 22:09:04 Tower avahi-daemon[1280]: Invalid response packet from host 192.168.0.119.

Mar 24 22:10:44 Tower avahi-daemon[1280]: Invalid response packet from host 192.168.0.119.

Mar 24 22:22:18 Tower emhttp: shcmd (126): /usr/sbin/hdparm -y /dev/sdb $stuff$> /dev/null (Drive related)

Mar 24 22:23:12 Tower avahi-daemon[1280]: Invalid response packet from host 192.168.0.119.

 

 

Link to comment

I'm confused about disk24. Looks like highest number array disk is 11?

 

Is the rebuild happening?

 

Can you access the contents of the disk being rebuilt?

 

At least some unRAID 6.0 betas (including 14b) have issues with 24-disk arrays. What version are you using?

Link to comment

Yes, disk 11 is the highest. I only have enough bays for 16 drives, hence my confusion about the disk24 mount reference. I'm using unRAID 5.0.3, I think.

 

It says the rebuild is happening, but oddly I can't access my shares; SMB doesn't seem to be running.

 

I have a few other warnings in the log above relating to mounting the cache drive and such. I have been ignoring them since everything seems to be working, but is there something fishy?
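On the cache warnings: in the log above, the mount of /dev/sdb1 is refused because mtab already lists it on /mnt/cache, mount exits with status 32 (which mount(8) documents as "mount failure"), and emhttp then logs "disk24 mount error: 32". Going by the log order, disk24 looks like emhttp's label for the cache device, though that mapping is inferred. A minimal sketch for checking whether a device is already mounted, using sample text in /proc/mounts format (the option fields in the sample are illustrative, not taken from the log):

```python
# Sample in /proc/mounts format; the option fields are illustrative, not from the log.
MOUNTS = """\
/dev/md1 /mnt/disk1 reiserfs rw,noatime,nodiratime 0 0
/dev/sdb1 /mnt/cache reiserfs rw,noatime,nodiratime 0 0
"""

def mount_point_of(device, mounts_text):
    """Return the mount point for a device, or None if it is not mounted."""
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == device:
            return fields[1]
    return None

print(mount_point_of("/dev/sdb1", MOUNTS))
```

On a live system the same function can be fed the contents of /proc/mounts, for example `mount_point_of("/dev/sdb1", open("/proc/mounts").read())`.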

Link to comment
