December 19, 2010
Attached is the syslog; you will see a lot of errors related to the cache drive (sdj). The problem I am having is that the drive seems to have been remounted read-only, but I can't find any reference to that happening in the syslog. If anyone would be so kind as to look through the syslog and let me know what my next step should be, that would be great! Thanks
Attachment: syslog-2010-12-19.txt.zip
December 20, 2010
I can't help ya, it's way above my head, but doesn't unRAID lock the disk when the DMA error occurs so you can no longer write to it, or does that not apply to a cache disk? This would be the error I'm talking about:
Dec 18 23:13:32 Andromeda kernel: ata9.00: exception Emask 0x10 SAct 0x0 SErr 0x4010000 action 0xe frozen
Dec 18 23:13:32 Andromeda kernel: ata9.00: irq_stat 0x00400040, connection status changed
Dec 18 23:13:32 Andromeda kernel: ata9: SError: { PHYRdyChg DevExch }
Dec 18 23:13:32 Andromeda kernel: ata9.00: failed command: READ DMA
Dec 18 23:13:32 Andromeda kernel: ata9.00: cmd c8/00:20:17:36:29/00:00:00:00:00/e0 tag 0 dma 16384 in
Dec 18 23:13:32 Andromeda kernel:          res 50/00:00:7e:a9:33/00:00:00:00:00/e0 Emask 0x10 (ATA bus error)
Dec 18 23:13:32 Andromeda kernel: ata9.00: status: { DRDY }
Dec 18 23:13:32 Andromeda kernel: ata9: hard resetting link
Dec 18 23:13:35 Andromeda kernel: ata9: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Dec 18 23:13:35 Andromeda kernel: ata9.00: configured for UDMA/133
Dec 18 23:13:35 Andromeda kernel: ata9: EH complete
Dec 18 23:13:35 Andromeda kernel: ata9: exception Emask 0x10 SAct 0x0 SErr 0x4010000 action 0xe frozen
Dec 18 23:13:35 Andromeda kernel: ata9: irq_stat 0x00400040, connection status changed
Dec 18 23:13:35 Andromeda kernel: ata9: SError: { PHYRdyChg DevExch }
Dec 18 23:13:35 Andromeda kernel: ata9: hard resetting link
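With a syslog this noisy, it can help to pull out only the messages for one ATA port and count how often the link was reset. The following is a minimal sketch, not something posted in the thread: the function names `ata_events` and `count_link_resets` are hypothetical, and the syslog path is whatever file you extracted from the attachment.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative only.

# Print every kernel message tagged with one ATA port (e.g. ata9 or ata9.00).
# $1 = port name, $2 = path to a syslog file.
ata_events() {
    port=$1
    logfile=$2
    grep -E "kernel: ${port}(\.[0-9]+)?: " "$logfile"
}

# Count how many times the kernel hard-reset the link on that port.
count_link_resets() {
    ata_events "$1" "$2" | grep -c "hard resetting link"
}
```

For example, `count_link_resets ata9 syslog-2010-12-19.txt` would report the number of resets on ata9. Repeated resets with "connection status changed" often point at cabling or power rather than the platters, though that is only a rule of thumb.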
December 20, 2010
I suggest you shut down and replug the cables to and from this disk. Then power up and take a SMART report. If the file system gets corrupted, unRAID will mount a disk read-only. This would be the first time I've heard of this happening to the cache drive, but I believe running reiserfsck would be the right next step.
December 20, 2010 Author
The odd thing about this is that the output of mount shows it as rw:
root@Andromeda:~# mount
fusectl on /sys/fs/fuse/connections type fusectl (rw)
usbfs on /proc/bus/usb type usbfs (rw)
/dev/sdd1 on /boot type vfat (rw,noatime,nodiratime,umask=0,shortname=mixed)
/dev/sdj1 on /mnt/cache type reiserfs (rw,noatime,nodiratime,noacl,nouser_xattr)
/dev/md8 on /mnt/disk8 type reiserfs (rw,noatime,nodiratime,noacl,nouser_xattr)
/dev/md2 on /mnt/disk2 type reiserfs (rw,noatime,nodiratime,noacl,nouser_xattr)
/dev/md4 on /mnt/disk4 type reiserfs (rw,noatime,nodiratime,noacl,nouser_xattr)
/dev/md1 on /mnt/disk1 type reiserfs (rw,noatime,nodiratime,noacl,nouser_xattr)
/dev/md5 on /mnt/disk5 type reiserfs (rw,noatime,nodiratime,noacl,nouser_xattr)
/dev/md7 on /mnt/disk7 type reiserfs (rw,noatime,nodiratime,noacl,nouser_xattr)
/dev/md6 on /mnt/disk6 type reiserfs (rw,noatime,nodiratime,noacl,nouser_xattr)
/dev/md3 on /mnt/disk3 type reiserfs (rw,noatime,nodiratime,noacl,nouser_xattr)
shfs on /mnt/user type fuse.shfs (rw,nosuid,nodev,noatime,allow_other,default_permissions)
shfs on /mnt/user0 type fuse.shfs (rw,nosuid,nodev,noatime,allow_other,default_permissions)
I would love some input from the Linux gurus so we can figure out what might be going on.
If I were to run reiserfsck, what would I run it on (sdj1, I assume?), and would I need to stop Samba and all that good stuff first? Thanks
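One possible explanation for mount still showing rw: on systems where /etc/mtab is a regular file, `mount` prints what was recorded at mount time, while /proc/mounts reflects the kernel's current view. Whether that applies to this unRAID release is an assumption. The sketch below reads the options for one mount point from /proc/mounts; `mount_opts` and `is_readonly` are hypothetical helper names.

```shell
#!/bin/sh
# Print the mount options for one mount point.
# $1 = mount point, $2 = mounts table (defaults to /proc/mounts).
mount_opts() {
    awk -v mp="$1" '$2 == mp { print $4 }' "${2:-/proc/mounts}"
}

# Succeed (exit status 0) if the option list for the mount point contains "ro".
is_readonly() {
    case ",$(mount_opts "$1" "$2")," in
        *,ro,*) return 0 ;;
        *)      return 1 ;;
    esac
}
```

For example, `is_readonly /mnt/cache && echo "cache is read-only"` checks the cache mount from this thread against the kernel's table rather than mtab.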
December 20, 2010
You would need to unmount the cache drive. If something is holding it busy, you will not be able to unmount it (and umount will tell you it cannot). Yes, reiserfsck would be run on /dev/sdj1 once the disk is unmounted.
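The sequence described above can be sketched as a dry run. This is only an illustration under assumptions: the mount point and device come from the mount output earlier in the thread, `fuser` may not be installed on every unRAID release, and the `run` and `check_cache` helpers are hypothetical. By default the script only prints the commands; setting RUN=1 executes them and needs root.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of running them.
# Set RUN=1 in the environment to execute for real (requires root).
MOUNTPOINT=${MOUNTPOINT:-/mnt/cache}   # taken from the mount output in this thread
DEVICE=${DEVICE:-/dev/sdj1}            # taken from the mount output in this thread

run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

check_cache() {
    run fuser -vm "$MOUNTPOINT"            # list processes holding the mount busy
    run umount "$MOUNTPOINT" || return 1   # stops here if the mount is still busy
    run reiserfsck --check "$DEVICE"       # check only; does not modify the disk
}
```

Calling `check_cache` with RUN unset shows the three commands in order. `reiserfsck --check` asks for confirmation before it starts, so it is meant to be run interactively.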
December 20, 2010 Author
I figured as much. I just found it extremely odd that I could not find, in the syslog, where the cache drive was remounted read-only. I stopped everything on the cache drive and unmounted it, then ran a check. It found 3 corruptions and told me to run --rebuild-tree. So I am doing that now; it should not take too long, as the cache drive is only 200GB.
December 20, 2010 Author
So, it started running and then aborted at about 80%. It looks like 2 offline_uncorrectable have appeared in the SMART report, and there are now 2 pending sectors. I can't get a --rebuild-tree to complete, though reiserfsck suggested running "badblocks" to build a file I could pass to it so that it knows which blocks are bad. I did so, and a restart of --rebuild-tree proved fruitless again. Apart from this one thing, the drive has been working OK; hell, I even put it through 3 rounds of preclear 6ish months ago. Any help/guidance is appreciated!
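To watch the two counters mentioned above without reading the whole SMART report, the attribute table from `smartctl -A` can be filtered. A minimal sketch, assuming smartmontools is installed and that the table uses the standard ten-column layout with the raw value last; `smart_bad_sector_counts` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the name and raw value of the pending and offline-uncorrectable
# attributes. Reads `smartctl -A` output on standard input.
smart_bad_sector_counts() {
    awk '$2 == "Current_Pending_Sector" || $2 == "Offline_Uncorrectable" { print $2, $10 }'
}

# Usage (as root), for the whole disk from this thread:
#   smartctl -A /dev/sdj | smart_bad_sector_counts
```

If the raw values keep climbing between runs, that supports replacing the drive rather than repairing the file system on it.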
December 20, 2010
Sounds like it aborted on the disk errors. The "badblocks" program does not exist on unRAID, but it does much the same as the preclear script in reading the entire disk and optionally writing it. Just run reiserfsck again.
Joe L.
December 20, 2010 Author
Quote: "The "badblocks" program does not exist on unRAID but it does much the same as the preclear script in reading the entire disk and optionally writing it."
If I run "which badblocks" on my 5.0b2 box, I get "/sbin/badblocks", so it appears that "badblocks" does indeed exist on 5.0b2.
Quote: "Just run reiserfsck again."
Doing that now! Thanks for the help, and I will let you know what happens.
EDIT: Shit, that was a little too quick... Output from the --rebuild-tree:
Loading on-disk bitmap .. ok, 22535351 blocks marked used
Skipping 9701 blocks (super block, journal, bitmaps) 22525650 blocks will be read
0% left 22385347, 4251 /sec
The problem has occurred looks like a hardware problem. If you have bad blocks, we advise you to get a new hard drive, because once you get one bad block that the disk drive internals cannot hide from your sight,the chances of getting more are generally said to become much higher (precise statistics are unknown to us), and this disk drive is probably not expensive enough for you to you to risk your time and data on it. If you don't want to follow that follow that advice then if you have just a few bad blocks, try writing to the bad blocks and see if the drive remaps the bad blocks (that means it takes a block it has in reserve and allocates it for use for of that block number). If it cannot remap the block, use badblock option (-B) with reiserfs utils to handle this block correctly.
bread: Cannot read the block (148523): (Input/output error).
Aborted
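The -B option mentioned in that output takes a file of bad block numbers, and those numbers only line up if badblocks is run with the file system's block size (badblocks defaults to 1024-byte blocks). The sketch below is an illustration under assumptions, not what was run in this thread: 4096 bytes is the usual reiserfs block size but should be confirmed (for example with debugreiserfs), /tmp/badblocks.txt is a hypothetical path, and the helper names are made up. It only prints the commands and does the offset arithmetic.

```shell
#!/bin/sh
BLOCK_SIZE=4096   # assumption: common reiserfs block size; verify before use

# Byte offset of a file system block within the partition.
block_to_byte_offset() {
    echo $(( $1 * BLOCK_SIZE ))
}

# Print (do not run) the commands for building a bad-block list with a
# matching block size and handing it to reiserfsck.
# $1 = partition, $2 = path for the bad-block list.
print_badblocks_plan() {
    dev=$1
    list=$2
    echo "badblocks -b $BLOCK_SIZE -o $list $dev"
    echo "reiserfsck --rebuild-tree -B $list $dev"
}
```

For the block reported above, `block_to_byte_offset 148523` gives the byte position inside /dev/sdj1 under the 4096-byte assumption, and `print_badblocks_plan /dev/sdj1 /tmp/badblocks.txt` shows the two commands. With pending sectors already showing up in SMART, the advice in the reiserfsck message to replace the drive still applies.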