Read-only cache drive

December 19, 2010

Attached is the syslog; you will see a lot of errors related to the cache drive (sdj). The problem I am having is that the drive seems to have been remounted read-only, but I can't find any reference to that happening in the syslog.

If anyone would be so kind as to look through the syslog for me and let me know what my next step should be, that would be great!

Thanks

syslog-2010-12-19.txt.zip

December 20, 2010

I can't help ya, it's way above my head, but doesn't unRAID lock the disk when a DMA error occurs so you can no longer write to it? Or does that not apply to a cache disk?

This would be the error I'm talking about...

Dec 18 23:13:32 Andromeda kernel: ata9.00: exception Emask 0x10 SAct 0x0 SErr 0x4010000 action 0xe frozen
Dec 18 23:13:32 Andromeda kernel: ata9.00: irq_stat 0x00400040, connection status changed
Dec 18 23:13:32 Andromeda kernel: ata9: SError: { PHYRdyChg DevExch }
Dec 18 23:13:32 Andromeda kernel: ata9.00: failed command: READ DMA
Dec 18 23:13:32 Andromeda kernel: ata9.00: cmd c8/00:20:17:36:29/00:00:00:00:00/e0 tag 0 dma 16384 in
Dec 18 23:13:32 Andromeda kernel:          res 50/00:00:7e:a9:33/00:00:00:00:00/e0 Emask 0x10 (ATA bus error)
Dec 18 23:13:32 Andromeda kernel: ata9.00: status: { DRDY }
Dec 18 23:13:32 Andromeda kernel: ata9: hard resetting link
Dec 18 23:13:35 Andromeda kernel: ata9: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Dec 18 23:13:35 Andromeda kernel: ata9.00: configured for UDMA/133
Dec 18 23:13:35 Andromeda kernel: ata9: EH complete
Dec 18 23:13:35 Andromeda kernel: ata9: exception Emask 0x10 SAct 0x0 SErr 0x4010000 action 0xe frozen
Dec 18 23:13:35 Andromeda kernel: ata9: irq_stat 0x00400040, connection status changed
Dec 18 23:13:35 Andromeda kernel: ata9: SError: { PHYRdyChg DevExch }
Dec 18 23:13:35 Andromeda kernel: ata9: hard resetting link
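For anyone digging through a long syslog for lines like these, here is a rough Python sketch that counts link resets and failed commands per ATA port. The function name and the two event labels are my own; the regular expression is only an assumption based on the line format shown in the excerpt above, so other kernels may log differently.

```python
import re
from collections import Counter

# Matches kernel lines such as
# "... kernel: ata9.00: failed command: READ DMA"
# The pattern is inferred from the excerpt in this thread, nothing more.
ATA_LINE = re.compile(r"kernel: (ata\d+)(?:\.\d+)?: (.*)$")


def summarize_ata_events(lines):
    """Count hard link resets and failed commands per ATA port."""
    counts = Counter()
    for line in lines:
        match = ATA_LINE.search(line)
        if not match:
            continue
        port, message = match.groups()
        if message.startswith("hard resetting link"):
            counts[(port, "hard reset")] += 1
        elif message.startswith("failed command"):
            counts[(port, "failed command")] += 1
    return counts


# A few lines copied from the excerpt above.
sample = [
    "Dec 18 23:13:32 Andromeda kernel: ata9.00: failed command: READ DMA",
    "Dec 18 23:13:32 Andromeda kernel: ata9: hard resetting link",
    "Dec 18 23:13:35 Andromeda kernel: ata9: EH complete",
    "Dec 18 23:13:35 Andromeda kernel: ata9: hard resetting link",
]
summary = summarize_ata_events(sample)
for (port, event), count in sorted(summary.items()):
    print(f"{port}: {event} x{count}")
```

To run it on a real file, pass an open file object instead of the sample list, for example `summarize_ata_events(open("syslog.txt"))`, where the filename is a placeholder.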

December 20, 2010

I suggest you shut down and replug the cables to and from this disk. Then power up and take a SMART report.

If the file system gets corrupted, unRAID will mount a disk read-only. This would be the first time I've heard of this happening to the cache drive, but I believe running reiserfsck would be the right next step.

December 20, 2010

Author

The odd thing about this is that the output of mount shows it as rw:

root@Andromeda:~# mount
fusectl on /sys/fs/fuse/connections type fusectl (rw)
usbfs on /proc/bus/usb type usbfs (rw)
/dev/sdd1 on /boot type vfat (rw,noatime,nodiratime,umask=0,shortname=mixed)
/dev/sdj1 on /mnt/cache type reiserfs (rw,noatime,nodiratime,noacl,nouser_xattr)
/dev/md8 on /mnt/disk8 type reiserfs (rw,noatime,nodiratime,noacl,nouser_xattr)
/dev/md2 on /mnt/disk2 type reiserfs (rw,noatime,nodiratime,noacl,nouser_xattr)
/dev/md4 on /mnt/disk4 type reiserfs (rw,noatime,nodiratime,noacl,nouser_xattr)
/dev/md1 on /mnt/disk1 type reiserfs (rw,noatime,nodiratime,noacl,nouser_xattr)
/dev/md5 on /mnt/disk5 type reiserfs (rw,noatime,nodiratime,noacl,nouser_xattr)
/dev/md7 on /mnt/disk7 type reiserfs (rw,noatime,nodiratime,noacl,nouser_xattr)
/dev/md6 on /mnt/disk6 type reiserfs (rw,noatime,nodiratime,noacl,nouser_xattr)
/dev/md3 on /mnt/disk3 type reiserfs (rw,noatime,nodiratime,noacl,nouser_xattr)
shfs on /mnt/user type fuse.shfs (rw,nosuid,nodev,noatime,allow_other,default_permissions)
shfs on /mnt/user0 type fuse.shfs (rw,nosuid,nodev,noatime,allow_other,default_permissions)
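To check the options for one mount point without eyeballing the whole list, a small Python sketch along these lines could parse `mount` output in the format shown above. The function names are made up for illustration. One caveat, stated as a possibility rather than a diagnosis: on some systems `mount` prints from /etc/mtab, which may lag behind the kernel's own view in /proc/mounts, so an "rw" here may not be the whole story.

```python
def parse_mount_line(line):
    """Split one line of `mount` output into (device, mount point, type, options)."""
    device, rest = line.split(" on ", 1)
    mount_point, rest = rest.split(" type ", 1)
    fs_type, options = rest.split(" (", 1)
    # Drop the closing parenthesis and any trailing whitespace.
    return device, mount_point, fs_type, options.rstrip(")\n ").split(",")


def is_read_only(line):
    """True if the option list of this mount line contains 'ro'."""
    return "ro" in parse_mount_line(line)[3]


# The cache line exactly as it appears in the listing above.
cache_line = (
    "/dev/sdj1 on /mnt/cache type reiserfs "
    "(rw,noatime,nodiratime,noacl,nouser_xattr)"
)
device, mount_point, fs_type, options = parse_mount_line(cache_line)
print(mount_point, "read-only" if is_read_only(cache_line) else "read-write")
```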

I would love some input from the Linux gurus so we can figure out what might be going on. If I were to run reiserfsck, what would I run it on (sdj1, I assume?), and would I need to stop Samba and all that good stuff?

Thanks

December 20, 2010

You would need to unmount the cache drive. If it is being held busy you will not be able to unmount it (and it will tell you it cannot).

Yes, reiserfsck would be run on /dev/sdj1 once the disk is unmounted.
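The order of operations can be sketched in Python as a dry run that only prints the commands and executes nothing. The helper name `plan_check` is hypothetical, and the `fuser -vm` step is my own addition for seeing what holds the mount busy; it assumes `fuser` is installed, which I have not verified on unRAID.

```python
def plan_check(device, mount_point):
    """Return the commands for a read-only file system check, in order."""
    return [
        # Optional: list processes keeping the mount busy (assumes fuser exists).
        ["fuser", "-vm", mount_point],
        # The file system must be unmounted before it is checked.
        ["umount", mount_point],
        # --check only reports problems; it does not repair anything.
        ["reiserfsck", "--check", device],
    ]


commands = plan_check("/dev/sdj1", "/mnt/cache")
for argv in commands:
    # Dry run: print each command instead of executing it.
    print(" ".join(argv))
```

Printing first and running by hand keeps a typo in the device name from doing any damage.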

December 20, 2010

Author

I figured as much. I just found it extremely odd that I could not find, in the syslog, where the cache drive was remounted as read-only.

I stopped everything on the cache drive, unmounted it, and ran a check. It found 3 corruptions and told me to run --rebuild-tree, so I am doing that now. It should not take too long, as the cache drive is only 200GB.

December 20, 2010

Author

So, it started running and then aborted at about 80%. It looks like 2 offline_uncorrectable have appeared in the SMART report, and there are now 2 pending sectors.
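For keeping an eye on those two counters between runs, here is a rough Python sketch that pulls their raw values out of a `smartctl -A` style attribute table. The sample table is made up for illustration (only the two raw values of 2 come from this thread), and the column layout is an assumption about how smartctl prints the table on this box.

```python
# The two attributes mentioned in this thread, spelled as smartctl names them.
WATCHED = {"Current_Pending_Sector", "Offline_Uncorrectable"}


def watched_raw_values(report_text):
    """Return {attribute name: raw value} for the watched SMART attributes."""
    values = {}
    for line in report_text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID followed by the attribute name;
        # the raw value is assumed to be the tenth column.
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in WATCHED:
            values[fields[1]] = int(fields[9])
    return values


# Illustrative sample only, NOT the real report from this drive.
sample_report = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       2
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       2
"""
raw = watched_raw_values(sample_report)
for name, value in sorted(raw.items()):
    print(f"{name}: {value}")
```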

I can't get --rebuild-tree to complete. reiserfsck suggested running "badblocks" to build a file I could pass to it so that it knows which blocks are bad. I did so, but restarting --rebuild-tree proved fruitless again.

Apart from this one thing, the drive has been working OK. Hell, I even ran it through 3 rounds of preclear about 6 months ago.

Any help/guidance is appreciated!

December 20, 2010

Sounds like it aborted on the disk errors.

The "badblocks" program does not exist on unRAID but it does much the same as the preclear script in reading the entire disk and optionally writing it.

Just run reiserfsck again.

Joe L.

December 20, 2010

Author

Quote: The "badblocks" program does not exist on unRAID but it does much the same as the preclear script in reading the entire disk and optionally writing it.

If I run "which badblocks" on my 5.0b2 box, I get an output of "/sbin/badblocks". It appears that "badblocks" does indeed exist on 5.0b2.

Quote: Just run reiserfsck again.

Doing that now!

Thanks for the help and I will let you know what happens

EDIT: Shit, that was a little too quick...

output from the --rebuild-tree:


Loading on-disk bitmap .. ok, 22535351 blocks marked used
Skipping 9701 blocks (super block, journal, bitmaps) 22525650 blocks will be read
0%                                                     left 22385347, 4251 /sec
The problem has occurred looks like a hardware problem. If you have
bad blocks, we advise you to get a new hard drive, because once you
get one bad block  that the disk  drive internals  cannot hide from
your sight,the chances of getting more are generally said to become
much higher  (precise statistics are unknown to us), and  this disk
drive is probably not expensive enough  for you to you to risk your
time and  data on it.  If you don't want to follow that follow that
advice then  if you have just a few bad blocks,  try writing to the
bad blocks  and see if the drive remaps  the bad blocks (that means
it takes a block  it has  in reserve  and allocates  it for use for
of that block number).  If it cannot remap the block,  use badblock
option (-B) with  reiserfs utils to handle this block correctly.

bread: Cannot read the block (148523): (Input/output error).

Aborted
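If the run is repeated and more of these "bread" lines turn up, a small Python sketch like the following could collect the failing block numbers from saved output. The function names are my own. The byte offset assumes a 4096-byte file system block size, which I have not confirmed for this disk, and it would be relative to the start of the partition (sdj1), not the whole drive.

```python
import re

# Pattern taken from the error line in the output above.
BREAD_ERROR = re.compile(r"bread: Cannot read the block \((\d+)\)")


def failing_blocks(output):
    """Return every block number reported as unreadable, in order of appearance."""
    return [int(number) for number in BREAD_ERROR.findall(output)]


def byte_offset(block, block_size=4096):
    """Offset of a block from the start of the partition.

    block_size=4096 is an assumption, not something read from this disk.
    """
    return block * block_size


# The error line copied from the output above.
output = "bread: Cannot read the block (148523): (Input/output error)."
blocks = failing_blocks(output)
for block in blocks:
    print(block, byte_offset(block))
```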
