Advice on disk issue

December 5, 2010

Hello,

I'm troubleshooting an issue and looking for some advice. I have unRAID 5beta2 installed with a Supermicro AOC-SAT2-MV8 card. I have 5 drives in my array plus a cache drive. When I boot up, I have an issue with md4. I thought the drive was bad, so I added a new drive, but I get the same issue. The web GUI wants to format the drive, but when I say OK it fails with the error shown in the log below.

I thought it might be the SATA cable, so I swapped it out. No improvement.

I then thought it might be a dodgy card, so I switched 4, 5, 6 and 7 over to the internal SATA on the motherboard. Same issue.

I was running inside ESXi, so I booted natively to see if that was the problem, but I get the same issue.

I am now looking for advice - what else can I try?

Jon

Dec 5 17:51:47 Tower emhttp: shcmd (337): killall -HUP smbd
Dec 5 17:51:47 Tower emhttp: shcmd (338): /usr/local/emhttp/emhttp_event svcs_restarted
Dec 5 17:51:47 Tower emhttp_event: svcs_restarted
Dec 5 17:51:50 Tower emhttp: shcmd (339): mkreiserfs -q /dev/md4 2>&1 | logger
Dec 5 17:51:50 Tower logger: mkreiserfs 3.6.21 (2009 www.namesys.com)
Dec 5 17:51:50 Tower logger:
Dec 5 17:51:50 Tower logger: The problem has occurred looks like a hardware problem. If you have
Dec 5 17:51:50 Tower logger: bad blocks, we advise you to get a new hard drive, because once you
Dec 5 17:51:50 Tower logger: get one bad block that the disk drive internals cannot hide from
Dec 5 17:51:50 Tower logger: your sight,the chances of getting more are generally said to become
Dec 5 17:51:50 Tower logger: much higher (precise statistics are unknown to us), and this disk
Dec 5 17:51:50 Tower logger: drive is probably not expensive enough for you to you to risk your
Dec 5 17:51:50 Tower logger: time and data on it. If you don't want to follow that follow that
Dec 5 17:51:50 Tower logger: advice then if you have just a few bad blocks, try writing to the
Dec 5 17:51:50 Tower logger: bad blocks and see if the drive remaps the bad blocks (that means
Dec 5 17:51:50 Tower logger: it takes a block it has in reserve and allocates it for use for
Dec 5 17:51:50 Tower logger: of that block number). If it cannot remap the block, use badblock
Dec 5 17:51:50 Tower logger: option (-B) with reiserfs utils to handle this block correctly.
Dec 5 17:51:50 Tower logger:
Dec 5 17:51:50 Tower logger: bread: Cannot read the block (0): (Input/output error).
Dec 5 17:51:50 Tower logger:
Dec 5 17:51:50 Tower emhttp: shcmd (340): mkdir /mnt/disk4
Dec 5 17:51:50 Tower emhttp: shcmd (341): set -o pipefail ; mount -t reiserfs -o noacl,nouser_xattr,noatime,nodiratime /dev/md4 /mnt/disk4 2>&1 | logger
Dec 5 17:51:50 Tower logger: mount: wrong fs type, bad option, bad superblock on /dev/md4,
Dec 5 17:51:50 Tower logger: missing codepage or helper program, or other error
Dec 5 17:51:50 Tower logger: In some cases useful info is found in syslog - try
Dec 5 17:51:50 Tower logger: dmesg | tail or so
Dec 5 17:51:50 Tower logger:
Dec 5 17:51:50 Tower emhttp: _shcmd: shcmd (341): exit status: 32
Dec 5 17:51:50 Tower emhttp: disk4 mount error: 32
Dec 5 17:51:50 Tower emhttp: shcmd (342): rmdir /mnt/disk4
Dec 5 17:51:50 Tower kernel: REISERFS warning (device md4): sh-2006 read_super_block: bread failed (dev md4, block 2, size 4096)
Dec 5 17:51:50 Tower kernel: REISERFS warning (device md4): sh-2006 read_super_block: bread failed (dev md4, block 16, size 4096)
Dec 5 17:51:50 Tower kernel: REISERFS warning (device md4): sh-2021 reiserfs_fill_super: can not find reiserfs on md4
Dec 5 17:51:50 Tower emhttp: shcmd (343): rm /etc/samba/smb-shares.conf >/dev/null 2>&1
Dec 5 17:51:50 Tower emhttp: shcmd (344): cp /etc/exports- /etc/exports
Dec 5 17:51:50 Tower emhttp: get_config_idx: fopen /boot/config/shares/Downloads.cfg: No such file or directory - assigning defaults
Dec 5 17:51:50 Tower emhttp: get_config_idx: fopen /boot/config/shares/Jon.cfg: No such file or directory - assigning defaults
Dec 5 17:51:50 Tower emhttp: get_config_idx: fopen /boot/config/shares/Kids.cfg: No such file or directory - assigning defaults
Dec 5 17:51:50 Tower emhttp: get_config_idx: fopen /boot/config/shares/Movies.cfg: No such file or directory - assigning defaults
Dec 5 17:51:50 Tower emhttp: get_config_idx: fopen /boot/config/shares/Music.cfg: No such file or directory - assigning defaults
Dec 5 17:51:50 Tower emhttp: get_config_idx: fopen /boot/config/shares/Software.cfg: No such file or directory - assigning defaults
Dec 5 17:51:50 Tower emhttp: get_config_idx: fopen /boot/config/shares/Time Machine.cfg: No such file or directory - assigning defaults
Dec 5 17:51:50 Tower emhttp: get_config_idx: fopen /boot/config/shares/Videos.cfg: No such file or directory - assigning defaults
Dec 5 17:51:50 Tower emhttp: get_config_idx: fopen /boot/config/shares/esxi-backups.cfg: No such file or directory - assigning defaults
Dec 5 17:51:50 Tower emhttp: Restart CIFS...
Dec 5 17:51:50 Tower emhttp: shcmd (345): killall -HUP smbd
Dec 5 17:51:50 Tower emhttp: shcmd (346): /usr/local/emhttp/emhttp_event svcs_restarted
Dec 5 17:51:50 Tower emhttp_event: svcs_restarted
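Most of the log above is routine service-restart noise; the lines that matter are the handful that report a failed read or mount. A minimal Python sketch for pulling those out of a longer syslog follows. It is not an unRAID tool: the function name `find_io_errors` is hypothetical, and the pattern list is an assumption built only from the messages that appear in this log, so it is illustrative rather than exhaustive.

```python
import re
from collections import defaultdict

# Patterns taken from the messages in the log above; illustrative only,
# not a complete catalogue of kernel or reiserfs error strings.
ERROR_PATTERNS = [
    re.compile(r"Input/output error"),
    re.compile(r"bread failed"),
    re.compile(r"mount error"),
    re.compile(r"can not find reiserfs"),
]

# Matches md device names such as "md4" and unRAID disk names such as "disk4".
DEVICE_RE = re.compile(r"\b(md\d+|disk\d+)\b")


def find_io_errors(lines):
    """Group suspicious log lines by the first device name they mention."""
    hits = defaultdict(list)
    for line in lines:
        if not any(p.search(line) for p in ERROR_PATTERNS):
            continue  # routine line, ignore it
        match = DEVICE_RE.search(line)
        device = match.group(1) if match else "unknown"
        hits[device].append(line.rstrip("\n"))
    return dict(hits)
```

It accepts any iterable of lines, so an open file handle on a saved copy of the syslog can be passed in directly. Lines that name no device, such as the `bread: Cannot read the block (0)` message, are grouped under `unknown`.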


December 8, 2010

Author

I've fixed this, but I lost some data. I tried a new mobo, so by that point I had changed everything (disk, cable, mobo, card). I tried an initconfig and it worked immediately: it actually saw that the drive was already formatted, and the array came up with all my data. Then, about an hour later, unRAID fell over. Each time it rebooted, it fell over again. If I removed /dev/md4 it was OK. So I tried adding a blank drive and letting parity rebuild my missing disk. Same issue: it would not format md4. In the end I had to do another initconfig with my new drive, which lost parity and therefore 600 GB of data. It's all working now, but I'm feeling bad about the loss. But I am using a beta, I guess...

Jon


December 8, 2010

What you did (typing "initconfig" with a failed disk) would have caused data loss regardless of the unRAID version. Also, from your description, your failures have nothing at all to do with the beta version of unRAID; they point to a hardware issue on your server. Do not blame the loss on the beta version. I do not think it was the cause of your loss.

I would look at the cables used for disk4 (both power and data) or the port on the disk controller involved. It could as easily be a memory issue. (Have you done a memory test? Are the voltage, timing, and clock speed set as specifically needed by your make/model of memory strips?) It could be a heat-related issue. Did you install the heat-sink properly on your CPU? Is it still properly secured? It could be an over-stressed power supply, a poor connection, or a bad "Y" power splitter.
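When swapping parts one at a time like this, it helps to have a quick, read-only check to repeat after each change. The log shows reads of blocks 0, 2 and 16 (4096 bytes each) failing on md4, so one option is to try reading those same blocks and see whether the I/O error persists. The following is a minimal Python sketch under several assumptions: Python is available on the machine (a stock unRAID install may not include it), the script runs as root, and `check_blocks` is a hypothetical helper, not part of unRAID. It only reads and never writes to the device.

```python
import os

BLOCK_SIZE = 4096  # block size reported in the REISERFS kernel warnings


def check_blocks(path, block_numbers, block_size=BLOCK_SIZE):
    """Try to read each listed block from `path`.

    Returns a dict mapping block number to None (read succeeded)
    or a short error string (read failed or came back short).
    """
    results = {}
    fd = os.open(path, os.O_RDONLY)  # read-only: nothing is written
    try:
        for block in block_numbers:
            try:
                data = os.pread(fd, block_size, block * block_size)
                if len(data) < block_size:
                    results[block] = "short read (past end of device?)"
                else:
                    results[block] = None
            except OSError as exc:
                # e.g. "Input/output error" when the drive cannot return the block
                results[block] = exc.strerror
    finally:
        os.close(fd)
    return results
```

For example, `check_blocks("/dev/md4", [0, 2, 16])` would probe the blocks named in the log. If the same blocks fail with a different disk, cable, and controller, that points away from the drive itself and toward memory, power, or heat, as suggested above.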

I feel for your loss of data, but if you've not over-written the disk you were using it might still be possible to recover it. See here in the wiki: http://lime-technology.com/forum/index.php?topic=5087.msg47070#msg47070 (this will still work even if you re-formatted the disk)

Joe L.
