April 8, 2013

I am using unRAID 4.7. I have an array with five 2 TB drives. I had to replace data drive #3. After I rebuilt the array, I got a message that data drives #3 and #4 were unformatted. I don't know how to proceed. Drive #3 is about half full, and I had just started putting data on drive #4.

This is my first unRAID experience. I am an experienced user, but I do not have any experience getting into the operating system. I followed the instructions when replacing the drive, and it seemed to be working properly.

What is my next step? Can I save my data? The drives are still under warranty, and I would like to get this straightened out before the warranty expires. I read that there seem to be a lot of hard drives that are not working properly (DOA or fail). I'm afraid that I may have one or two bad drives. I did not preclear the hard drive. How do you preclear a hard drive?

Please help. What are the next steps that I should take?

My syslog is at http://pastebin.com/5SQLWJMi

Howard
April 8, 2013

What type of failure led you to replace disk3?

The replacement disk is not correctly partitioned, or has file-system corruption and cannot be mounted. (To unRAID, any disk that cannot be mounted is "unformatted.")

[quote]
Mar 13 10:35:17 Tower logger: mount: wrong fs type, bad option, bad superblock on /dev/md3,
Mar 13 10:35:17 Tower logger: missing codepage or helper program, or other error
Mar 13 10:35:17 Tower logger: In some cases useful info is found in syslog - try
Mar 13 10:35:17 Tower logger: dmesg | tail or so
Mar 13 10:35:17 Tower logger:
Mar 13 10:35:17 Tower emhttp: _shcmd: shcmd (22): exit status: 32
Mar 13 10:35:17 Tower emhttp: disk3 mount error: 32
Mar 13 10:35:17 Tower emhttp: shcmd (23): rmdir /mnt/disk3
[/quote]

Right now, disk 4 has tons of media errors (unreadable sectors) and it has been marked as needing file-system repair.

[code]
Mar 13 10:35:50 Tower kernel: ata5.00: irq_stat 0x40000001
Mar 13 10:35:50 Tower kernel: ata5.00: failed command: READ DMA EXT
Mar 13 10:35:50 Tower kernel: ata5.00: cmd 25/00:00:c7:31:00/00:04:00:00:00/e0 tag 0 dma 524288 in
Mar 13 10:35:50 Tower kernel: res 51/40:6f:48:33:00/00:02:00:00:00/e0 Emask 0x9 (media error)
Mar 13 10:35:50 Tower kernel: ata5.00: status: { DRDY ERR }
Mar 13 10:35:50 Tower kernel: ata5.00: error: { UNC }
Mar 13 10:35:50 Tower kernel: ata5.00: configured for UDMA/133
Mar 13 10:35:50 Tower kernel: sd 5:0:0:0: [sde] Unhandled sense code
Mar 13 10:35:50 Tower kernel: sd 5:0:0:0: [sde] Result: hostbyte=0x00 driverbyte=0x08
Mar 13 10:35:50 Tower kernel: sd 5:0:0:0: [sde] Sense Key : 0x3 [current] [descriptor]
Mar 13 10:35:50 Tower kernel: Descriptor sense data with sense descriptors (in hex):
Mar 13 10:35:50 Tower kernel: 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
Mar 13 10:35:50 Tower kernel: 00 00 33 48
Mar 13 10:35:50 Tower kernel: sd 5:0:0:0: [sde] ASC=0x11 ASCQ=0x4
Mar 13 10:35:50 Tower kernel: sd 5:0:0:0: [sde] CDB: cdb[0]=0x28: 28 00 00 00 31 c7 00 04 00 00
Mar 13 10:35:50 Tower kernel: end_request: I/O error, dev sde, sector 13128
Mar 13 10:35:50 Tower kernel: ata5: EH complete
Mar 13 10:35:50 Tower kernel: md: disk4 read error
Mar 13 10:35:50 Tower kernel: handle_stripe read error: 13064/4, count: 1
Mar 13 10:35:50 Tower kernel: md: disk4 read error
Mar 13 10:35:50 Tower kernel: handle_stripe read error: 13072/4, count: 1
Mar 13 10:35:50 Tower kernel: md: disk4 read error
Mar 13 10:35:50 Tower kernel: handle_stripe read error: 13080/4, count: 1
Mar 13 10:35:50 Tower kernel: md: disk4 read error
Mar 13 10:35:50 Tower kernel: handle_stripe read error: 13088/4, count: 1
Mar 13 10:35:50 Tower kernel: md: disk4 read error
Mar 13 10:35:50 Tower kernel: handle_stripe read error: 13096/4, count: 1
Mar 13 10:35:50 Tower kernel: md: disk4 read error
Mar 13 10:35:50 Tower kernel: handle_stripe read error: 13104/4, count: 1
Mar 13 10:35:50 Tower kernel: md: disk4 read error
Mar 13 10:35:50 Tower kernel: handle_stripe read error: 13112/4, count: 1
Mar 13 10:35:50 Tower kernel: md: disk4 read error
Mar 13 10:35:17 Tower logger: mount: /dev/md4: can't read superblock
Mar 13 10:35:17 Tower emhttp: _shcmd: shcmd (24): exit status: 32
Mar 13 10:35:17 Tower emhttp: disk4 mount error: 32
Mar 13 10:35:17 Tower emhttp: shcmd (25): rmdir /mnt/disk4
Mar 13 10:35:17 Tower kernel: REISERFS warning: reiserfs-5090 is_tree_node: node level 0 does not match to the expected one 1
Mar 13 10:35:17 Tower kernel: REISERFS error (device md4): vs-5150 search_by_key: invalid format found in block 8211. Fsck?
Mar 13 10:35:17 Tower kernel: REISERFS (device md4): Remounting filesystem read-only
Mar 13 10:35:17 Tower kernel: REISERFS error (device md4): vs-13070 reiserfs_read_locked_inode: i/o failure
[/code]

In addition, disk3 is identified as the "wrong" disk.

I would put the original disk3 back in, and then see if you can get to where you only have one failed disk. Right now, you've basically got two with issues.
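For anyone reading a long syslog like this one, it can help to pull out just the failing sectors rather than scrolling through every line. The following is a minimal sketch, not part of the original reply: the function names `failed_sectors` and `media_error_count` are hypothetical, and the patterns simply match the kernel messages quoted above. It assumes the syslog has been saved to a file.

```shell
# Sketch: summarize disk read errors recorded in a saved syslog.
# Function names are hypothetical; the patterns match the kernel
# messages quoted in this thread.

# Print each failing sector reported for one device, one per line.
# Usage: failed_sectors SYSLOG_FILE DEVICE   (e.g. failed_sectors syslog.txt sde)
failed_sectors() {
    grep "end_request: I/O error, dev $2," "$1" |
        sed 's/.*sector \([0-9][0-9]*\).*/\1/' |
        sort -n -u
}

# Count the kernel lines flagged as media errors (unreadable sectors).
# Usage: media_error_count SYSLOG_FILE
media_error_count() {
    grep -c 'media error' "$1"
}
```

A short list of sectors clustered together suggests a localized bad patch; a long, scattered list points toward a drive that is failing more broadly.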
April 8, 2013 (Author)

Joe L.,

Thanks for getting back to me. The original replacement disk was returned under the warranty. It's gone. What would be my next step? Am I basically screwed?

Howard
April 9, 2013

[quote]
Joe L. Thanks for getting back to me. The original replacement disk was returned under the warranty. It's gone. What would be my next step? Am I basically screwed? Howard
[/quote]

Basically, yeah...

Your only choice now is to try to repair the corrupt file systems on the two disks.

Do you remember if you partitioned them to start on sector 63 or on sector 64?

Did disk3 ever finish its reconstruction? If not, you are not likely to have much luck there.

Personally, I'd concentrate on disk4, the one with the media errors. Perhaps you will have more luck there.

Step 1: get a SMART report from disk4 to see how bad it is:

    smartctl -a /dev/sde

Then I would run badblocks in non-destructive read/write mode:

    badblocks -n -v -s -o /boot/badblocks_list.txt /dev/sde

This will take MANY MANY HOURS on a 2TB drive. Make sure you do not close the terminal session. (It is probably best done from the system console, as it is not going to go to sleep the way a PC with no activity would.)

Then, after it is through, get ANOTHER SMART report to see the effect of what it did. (We are hoping it will allow the SMART firmware on the disk to reallocate the unreadable sectors.)
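One way to follow the before/after SMART comparison suggested above is to save both reports to files and pull out the attributes that track bad sectors. This is a minimal sketch under assumptions: it expects the usual `smartctl -a` attribute table, where the attribute name is the second column and the raw value is the tenth. The names `smart_raw` and `smart_compare` and the report file names are made up for illustration.

```shell
# Sketch: compare bad-sector attributes between two saved SMART reports.
# Assumed workflow (file names are hypothetical):
#   smartctl -a /dev/sde > /boot/smart_before.txt
#   ... run badblocks ...
#   smartctl -a /dev/sde > /boot/smart_after.txt
#   smart_compare /boot/smart_before.txt /boot/smart_after.txt

# Print the RAW_VALUE column of one attribute from a saved report.
# Usage: smart_raw REPORT_FILE ATTRIBUTE_NAME
smart_raw() {
    awk -v name="$2" '$2 == name { print $10 }' "$1"
}

# Print "attribute: before -> after" for the attributes tied to bad sectors.
# Usage: smart_compare BEFORE_FILE AFTER_FILE
smart_compare() {
    for attr in Reallocated_Sector_Ct Current_Pending_Sector Offline_Uncorrectable; do
        printf '%s: %s -> %s\n' "$attr" \
            "$(smart_raw "$1" "$attr")" "$(smart_raw "$2" "$attr")"
    done
}
```

If the badblocks pass did what is hoped for, the pending count should drop while the reallocated count rises. Not every drive reports all three attributes, so a blank value just means the attribute was not in that report.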
April 9, 2013

Then you can see if the first partition on the disk can be mounted. Either reboot and see if the disk mounts, or type:

    mkdir /mnt/disk4
    mount -t reiserfs -r /dev/sde1 /mnt/disk4

(Note that the device is /dev/sde1, with a "1" at the end of the name denoting the first partition.)

If the mount is successful, you can then reboot by typing

    reboot

and see if unRAID can mount it. (No need to stop the array before the reboot this time, since it is already stopped.)
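The manual mount test above can be wrapped in a small helper that reports clearly whether it worked. This is only a sketch: `try_ro_mount` is a hypothetical name, and the `MOUNT_CMD` variable is an assumption added so the function can be dry-run without touching a real disk. It defaults to the real `mount` command.

```shell
# Sketch: attempt the read-only ReiserFS mount and report the outcome.
# try_ro_mount and MOUNT_CMD are hypothetical names used for illustration.
# Usage: try_ro_mount /dev/sde1 /mnt/disk4
try_ro_mount() {
    ro_dev=$1    # first partition of the disk, e.g. /dev/sde1
    ro_dir=$2    # mount point, e.g. /mnt/disk4
    mkdir -p "$ro_dir" || return 1
    # -r requests a read-only mount, as in the command given above
    if "${MOUNT_CMD:-mount}" -t reiserfs -r "$ro_dev" "$ro_dir"; then
        echo "mounted $ro_dev read-only on $ro_dir"
    else
        echo "could not mount $ro_dev; the file system probably needs repair" >&2
        return 1
    fi
}
```

Setting `MOUNT_CMD=echo` before the call prints the mount arguments instead of running them, which is a safe way to check the device name first.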
April 10, 2013 (Author)

Joe L.,

I did not partition disks 3 or 4 to start at any particular sector. I thought that it would start with the first sector.

As far as I know, disk 3 finished reconstruction. I put the disk into the array. It was recognized and I got the blue dot. It asked if I was sure that I wanted to rebuild the array, and then about 18 hours later (the next morning) I got the unformatted message. Disk 4 had been in the array and was fine until this trouble occurred.

You suggested running a SMART report on disk 4. Is the command you sent (which I need because I am not good with the command line) specific to disk 4? I am assuming I run the commands that you gave from the console.

Thanks,
Howard
April 15, 2013 (Author)

Joe L.,

It has been at least 5 years since I have done any Linux command-line work. I realized just how stupid my last response was.

I ran the "smartctl -a /dev/sde" command. After that I tried several times to run the "badblocks -n -v -s -o /boot/badblocks_list.txt /dev/sde" command. Each time I got the message "/dev/sde is apparently in use by the system; It's not safe to run badblocks". I turned the server off for a while and got the same response.

Should I just go back to square one on these drives and try to format them? Will I lose any data on my other drives? (I don't think so.) I want to find out if the drives are bad and return them under warranty if they are.

Thanks for your help
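The badblocks message appears to mean that the tool found the device busy. A first check, sketched here under assumptions, is whether any partition of the disk is still mounted, for example from the manual read-only mount suggested earlier. The function name `mounts_on_disk` is hypothetical, and the optional second argument exists only so the function can be tried on a saved copy of the mount table.

```shell
# Sketch: list mounted file systems that live on a given disk.
# mounts_on_disk is a hypothetical helper name.
# Usage: mounts_on_disk sde              (reads /proc/mounts)
#        mounts_on_disk sde saved_copy   (reads a saved copy instead)
mounts_on_disk() {
    # Print "device mountpoint" for every entry whose device name
    # starts with /dev/<disk>, e.g. /dev/sde1.
    awk -v dev="/dev/$1" 'index($1, dev) == 1 { print $1, $2 }' \
        "${2:-/proc/mounts}"
}
```

If this prints a line such as `/dev/sde1 /mnt/disk4`, unmounting it with `umount /mnt/disk4` would be the obvious next step. Note that the syslog in this thread shows unRAID mounting array disks through /dev/md3 and /dev/md4, so an empty result does not prove the disk is free; the array itself would presumably also need to be stopped before badblocks will run.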