MrD1234 Posted July 8, 2011

I had 2 read / 2 write errors occur on a disk. The statistics for that drive do not update but writes to that drive are still allowed to proceed via a samba share on a user share.

Jul 6 23:26:13 Tower2 kernel: sd 1:0:2:0: [sdc] Device not ready
Jul 6 23:26:13 Tower2 kernel: sd 1:0:2:0: [sdc] Result: hostbyte=0x00 driverbyte=0x08
Jul 6 23:26:13 Tower2 kernel: sd 1:0:2:0: [sdc] Sense Key : 0x2 [current]
Jul 6 23:26:13 Tower2 kernel: sd 1:0:2:0: [sdc] ASC=0x4 ASCQ=0x2
Jul 6 23:26:13 Tower2 kernel: sd 1:0:2:0: [sdc] CDB: cdb[0]=0x28: 28 00 00 04 00 40 00 00 08 00
Jul 6 23:26:13 Tower2 kernel: end_request: I/O error, dev sdc, sector 262208
Jul 6 23:26:13 Tower2 kernel: md: disk2 read error
Jul 6 23:26:13 Tower2 kernel: handle_stripe read error: 262144/2, count: 1
Jul 6 23:26:14 Tower2 kernel: sd 1:0:2:0: [sdc] Device not ready
Jul 6 23:26:14 Tower2 kernel: sd 1:0:2:0: [sdc] Result: hostbyte=0x00 driverbyte=0x08
Jul 6 23:26:14 Tower2 kernel: sd 1:0:2:0: [sdc] Sense Key : 0x2 [current]
Jul 6 23:26:14 Tower2 kernel: sd 1:0:2:0: [sdc] ASC=0x4 ASCQ=0x2
Jul 6 23:26:14 Tower2 kernel: sd 1:0:2:0: [sdc] CDB: cdb[0]=0x2a: 2a 00 00 04 00 40 00 00 08 00
Jul 6 23:26:14 Tower2 kernel: end_request: I/O error, dev sdc, sector 262208
Jul 6 23:26:14 Tower2 kernel: md: disk2 write error
Jul 6 23:26:14 Tower2 kernel: handle_stripe write error: 262144/2, count: 1
Jul 6 23:26:14 Tower2 kernel: md: recovery thread woken up ...
Jul 6 23:26:14 Tower2 kernel: md: recovery thread has nothing to resync
Jul 6 23:26:15 Tower2 kernel: sd 1:0:2:0: [sdc] Device not ready
Jul 6 23:26:15 Tower2 kernel: sd 1:0:2:0: [sdc] Result: hostbyte=0x00 driverbyte=0x08
Jul 6 23:26:15 Tower2 kernel: sd 1:0:2:0: [sdc] Sense Key : 0x2 [current]
Jul 6 23:26:15 Tower2 kernel: sd 1:0:2:0: [sdc] ASC=0x4 ASCQ=0x2
Jul 6 23:26:15 Tower2 kernel: sd 1:0:2:0: [sdc] CDB: cdb[0]=0x28: 28 00 00 00 00 c0 00 00 08 00
Jul 6 23:26:15 Tower2 kernel: end_request: I/O error, dev sdc, sector 192
Jul 6 23:26:15 Tower2 kernel: md: disk2 read error
Jul 6 23:26:15 Tower2 kernel: handle_stripe read error: 128/2, count: 1
Jul 6 23:26:15 Tower2 kernel: sd 1:0:2:0: [sdc] Device not ready
Jul 6 23:26:15 Tower2 kernel: sd 1:0:2:0: [sdc] Result: hostbyte=0x00 driverbyte=0x08
Jul 6 23:26:15 Tower2 kernel: sd 1:0:2:0: [sdc] Sense Key : 0x2 [current]
Jul 6 23:26:15 Tower2 kernel: sd 1:0:2:0: [sdc] ASC=0x4 ASCQ=0x2
Jul 6 23:26:15 Tower2 kernel: sd 1:0:2:0: [sdc] CDB: cdb[0]=0x2a: 2a 00 00 00 00 c0 00 00 08 00
Jul 6 23:26:15 Tower2 kernel: end_request: I/O error, dev sdc, sector 192
Jul 6 23:26:15 Tower2 kernel: md: disk2 write error
Jul 6 23:26:15 Tower2 kernel: handle_stripe write error: 128/2, count: 1
Jul 6 23:37:01 Tower2 crond[1148]: ignoring /var/spool/cron/crontabs/root- (non-existent user)

I've read the system should be in a read-only mode. This feels like a bug.

Also, given this is a test array, what is the best way to "clear" the error without replacing the drive?

EDIT: and the parity check buttons are gone.

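For anyone reading the log above, the CDB lines can be decoded by hand. The following is a minimal sketch, assuming the standard 10-byte READ(10)/WRITE(10) layout (opcode, flags, 4-byte big-endian LBA, group, 2-byte transfer length, control). The function name `decode_cdb10` is made up for illustration and is not part of any tool mentioned in this thread.

```python
import struct

# Opcodes of the two 10-byte commands that appear in the log.
OPCODES = {0x28: "READ(10)", 0x2A: "WRITE(10)"}


def decode_cdb10(hex_bytes: str) -> dict:
    """Decode a 10-byte READ(10)/WRITE(10) CDB given as space-separated hex."""
    cdb = bytes.fromhex(hex_bytes)
    if len(cdb) != 10:
        raise ValueError("expected a 10-byte CDB")
    # opcode, flags, 4-byte big-endian LBA, group, 2-byte length, control
    opcode, _flags, lba, _group, blocks, _control = struct.unpack(">BBIBHB", cdb)
    return {"command": OPCODES.get(opcode, hex(opcode)), "lba": lba, "blocks": blocks}


print(decode_cdb10("28 00 00 04 00 40 00 00 08 00"))
print(decode_cdb10("2a 00 00 00 00 c0 00 00 08 00"))
```

The decoded LBAs (262208 and 192) match the sector numbers in the `end_request: I/O error` lines, and each request covers 8 blocks. So the log shows a failed read followed by a failed write of the same 8 sectors, twice.
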
lionelhutz Posted July 8, 2011

That's exactly how it should work. The disk is being simulated using all the other array disks. The writes are done by updating the parity only. The whole purpose of the parity is to allow any single data disk to be simulated or rebuilt without losing data.

The parity check button is gone because you need every data disk to be healthy before you can perform a successful parity build or parity check.

Not sure what you mean by "clearing" the error. You have to either replace the disk and let the data be rebuilt or abandon the disk and all the data on it by initializing the array without it.

Peter

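The simulation described in this reply can be illustrated with plain XOR parity. This is a rough sketch under the assumption of a single XOR parity disk. The helper names (`xor_blocks`, `simulate_read`, `simulate_write`) and the sample bytes are invented for the example and say nothing about how the unRAID driver is actually written.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR several equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def simulate_read(parity, healthy):
    """Recover the missing disk's block from parity plus the healthy disks."""
    return xor_blocks([parity, *healthy])


def simulate_write(new_data, healthy):
    """Return the parity that makes the missing disk appear to hold new_data."""
    return xor_blocks([new_data, *healthy])


disk1, disk2, disk3 = b"\x01\x02", b"\x10\x20", b"\xaa\xbb"  # disk2 is the failed one
parity = xor_blocks([disk1, disk2, disk3])

# Reading the failed disk: rebuilt from parity and the other disks.
assert simulate_read(parity, [disk1, disk3]) == disk2

# Writing to the failed disk: only the parity changes.
parity = simulate_write(b"\xff\x00", [disk1, disk3])
assert simulate_read(parity, [disk1, disk3]) == b"\xff\x00"
```

This is why writes to the failed disk still succeed: the new data lives only in the parity until the disk is rebuilt. It also shows why a second failure at that point would lose data.
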
MrD1234 Posted July 8, 2011

OK, that makes sense.

By "clear" I mean simulate replacing the drive. I don't believe the drive is bad. I think it's an ESXi issue.

Joe L. Posted July 8, 2011

Quote (MrD1234): ok that makes sense. By "clear" i mean simulate replacing the drive. I don't believe the drive is bad. I think it's a ESXi issue.

Same method as if it was actually defective, except you need to fool unRAID into thinking the old disk has been replaced. To do that you must get it to forget the model/serial number of the old disk.

To do that:

1. Stop the array.
2. Un-assign the failed disk.
3. Start the array with it un-assigned. (This will cause unRAID to forget the model/serial number of the old disk.)
4. Stop the array.
5. Re-assign the old disk. unRAID will think it is a replacement.
6. Start the array. unRAID will re-construct the contents of the old disk onto itself.

Joe L.

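Why the un-assign/re-assign trick works can be shown with a toy model. This is only a sketch of the behaviour Joe L. describes (a slot remembers a disk identity, and a disk it does not recognise is treated as a replacement). The class `ArraySlot`, its fields, and the returned state names are all hypothetical and are not unRAID's real code or terminology.

```python
class ArraySlot:
    """Toy model of one array slot. Hypothetical, not unRAID's real logic."""

    def __init__(self, serial):
        self.remembered = serial  # identity on record for this slot
        self.assigned = serial    # disk currently assigned to the slot
        self.disabled = False     # set once the disk has been failed out

    def start_array(self):
        if self.assigned is None:
            # Starting with the slot empty drops the remembered identity.
            self.remembered = None
            return "simulated"
        if self.assigned != self.remembered:
            # A disk the slot does not recognise is treated as a replacement.
            self.remembered = self.assigned
            self.disabled = False
            return "rebuilding"
        return "simulated" if self.disabled else "normal"


slot = ArraySlot("DISK-A")
slot.disabled = True            # the disk was failed out after the I/O errors
print(slot.start_array())       # still the remembered, disabled disk
slot.assigned = None
print(slot.start_array())       # started un-assigned: identity forgotten
slot.assigned = "DISK-A"
print(slot.start_array())       # same physical disk now looks like a new one
```

In this model, simply restarting the array changes nothing because the slot still recognises the disabled disk. Only the intermediate start with the slot empty makes the same disk look new.
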
MrD1234 Posted July 8, 2011

Thanks! There is something weird going on for sure.

I added another disk to the LSI 9211 controller, added it as a raw virtual disk and mapped it into unRAID. It completed the re-construction successfully, but when I checked the array after I got home, I saw the same errors on a different physical disk.

I am wondering if there is some kind of spin-up delay that the controller/driver is not waiting long enough for, so the driver thinks it is a physical error.

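The sense data in the first post's log can be checked against this spin-up theory. Below is a small sketch that pulls the sense key and ASC/ASCQ out of kernel log text and looks them up in a deliberately tiny table of standard SCSI codes. The function names and the choice of table entries are assumptions made for the example, and the table is nowhere near complete.

```python
import re

# A few standard SCSI sense keys and ASC/ASCQ pairs (far from a full table).
SENSE_KEYS = {0x2: "NOT READY", 0x3: "MEDIUM ERROR", 0x4: "HARDWARE ERROR"}
ASC_ASCQ = {
    (0x04, 0x01): "Logical unit is in process of becoming ready",
    (0x04, 0x02): "Logical unit not ready, initializing command required",
    (0x11, 0x00): "Unrecovered read error",
}


def parse_sense(log_text):
    """Extract (sense key, ASC, ASCQ) from sd driver log lines."""
    key = re.search(r"Sense Key : (0x[0-9a-fA-F]+)", log_text)
    asc = re.search(r"ASC=(0x[0-9a-fA-F]+) ASCQ=(0x[0-9a-fA-F]+)", log_text)
    if key is None or asc is None:
        raise ValueError("no sense data found")
    return int(key.group(1), 16), int(asc.group(1), 16), int(asc.group(2), 16)


def describe_sense(key, asc, ascq):
    """Turn numeric sense data into a readable string."""
    name = SENSE_KEYS.get(key, f"sense key {key:#x}")
    detail = ASC_ASCQ.get((asc, ascq), f"ASC/ASCQ {asc:#04x}/{ascq:#04x}")
    return f"{name}: {detail}"


sample = (
    "Jul 6 23:26:13 Tower2 kernel: sd 1:0:2:0: [sdc] Sense Key : 0x2 [current]\n"
    "Jul 6 23:26:13 Tower2 kernel: sd 1:0:2:0: [sdc] ASC=0x4 ASCQ=0x2\n"
)
print(describe_sense(*parse_sense(sample)))
```

Sense key 0x2 with ASC 0x04 / ASCQ 0x02 is the generic "not ready, initializing command required" condition, which a drive commonly reports while it is spun down and waiting for a start command. It is not a media error such as an unrecovered read. That fits the spin-up theory but does not prove it.
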