EMKO Posted July 12, 2013

I'm trying to clear a drive. The first time, I came back after many hours and the server was unresponsive over both the web interface and SSH. The second time, it was at 20% when I last checked, so I left it overnight. When I checked again it looked like it had stopped: the page showed the list of drives, and the drive was no longer in the disk 6 drop-down, so I had to reboot. The logs were massive. I'm running another clear right now; so far the logs show this:

tail -n 40 -f /var/log/syslog
Jul 12 07:26:59 Tower kernel: sd 0:0:0:0: [sdf]
Jul 12 07:26:59 Tower kernel: Result: hostbyte=0x04 driverbyte=0x00
Jul 12 07:26:59 Tower kernel: sd 0:0:0:0: [sdf] CDB:
Jul 12 07:26:59 Tower kernel: cdb[0]=0x8a: 8a 00 00 00 00 00 03 f9 84 00 00 00 00 40 00 00
Jul 12 07:26:59 Tower kernel: sd 0:0:0:0: [sdf] READ CAPACITY failed
Jul 12 07:26:59 Tower kernel: sd 0:0:0:0: [sdf]
Jul 12 07:26:59 Tower kernel: Result: hostbyte=0x04 driverbyte=0x00
Jul 12 07:26:59 Tower kernel: sd 0:0:0:0: [sdf] Sense not available.
Jul 12 07:26:59 Tower kernel: sd 0:0:0:0: [sdf] Write Protect is on
Jul 12 07:26:59 Tower kernel: sd 0:0:0:0: [sdf] Mode Sense: 80 bf 1d c5
Jul 12 07:26:59 Tower kernel: sd 0:0:0:0: [sdf] Truncating mode parameter data from 32961 to 512 bytes
Jul 12 07:26:59 Tower kernel: sd 0:0:0:0: [sdf] Got wrong page
Jul 12 07:26:59 Tower kernel: sd 0:0:0:0: [sdf] Assuming drive cache: write through
Jul 12 07:26:59 Tower kernel: sdf: detected capacity change from 3000592982016 to 0
Jul 12 07:26:59 Tower kernel: sd 0:0:0:0: [sdf] READ CAPACITY(16) failed
Jul 12 07:26:59 Tower kernel: sd 0:0:0:0: [sdf]
Jul 12 07:26:59 Tower kernel: Result: hostbyte=0x04 driverbyte=0x00
Jul 12 07:26:59 Tower kernel: sd 0:0:0:0: [sdf] Sense not available.
Jul 12 07:26:59 Tower kernel: sd 0:0:0:0: [sdf] READ CAPACITY failed
Jul 12 07:26:59 Tower kernel: sd 0:0:0:0: [sdf]
Jul 12 07:26:59 Tower kernel: Result: hostbyte=0x04 driverbyte=0x00
Jul 12 07:26:59 Tower kernel: sd 0:0:0:0: [sdf] Sense not available.
Jul 12 07:26:59 Tower kernel: sd 0:0:0:0: [sdf] Write Protect is off
Jul 12 07:26:59 Tower kernel: sd 0:0:0:0: [sdf] Mode Sense: 00 00 00 00
Jul 12 07:26:59 Tower kernel: sd 0:0:0:0: [sdf] Asking for cache data failed
Jul 12 07:26:59 Tower kernel: sd 0:0:0:0: [sdf] Assuming drive cache: write through
Jul 12 07:26:59 Tower ata_id[3467]: HDIO_GET_IDENTITY failed for '/dev/sdf'
Jul 12 07:27:00 Tower kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Jul 12 07:27:02 Tower last message repeated 17 times
Jul 12 07:27:02 Tower kernel: sas: sas_form_port: phy0 belongs to port0 already(1)!
Jul 12 07:27:03 Tower kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Jul 12 07:27:34 Tower last message repeated 132 times
Jul 12 07:28:20 Tower last message repeated 103 times
Jul 12 07:28:21 Tower emhttp: clear: 3% complete
Jul 12 07:28:22 Tower kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Jul 12 07:28:53 Tower last message repeated 68 times
Jul 12 07:29:50 Tower last message repeated 134 times
Jul 12 07:29:51 Tower emhttp: clear: 4% complete
Jul 12 07:29:53 Tower kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Jul 12 07:30:24 Tower last message repeated 84 times
Jul 12 07:31:21 Tower last message repeated 149 times
Jul 12 07:31:23 Tower emhttp: clear: 5% complete
Jul 12 07:31:23 Tower kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Jul 12 07:31:54 Tower last message repeated 84 times
EMKO Posted July 12, 2013

I ran another clear. It got to 100%, then the logs started growing again, the web page changed back to the drop-down list, and in the disk 6 drop-down I can choose "(sdf) 0". The log just keeps repeating until it crashes the server.

tail -n 40 -f /var/log/syslog
Jul 12 09:56:04 Tower emhttp: WDC_WD20EADS-00R6B0_WD-WCAVY2267178 (hdc) 1953514584
Jul 12 09:56:04 Tower emhttp: ST4000DM000-1F2168_Z300D00R (sdb) 3907018584
Jul 12 09:56:04 Tower emhttp: WDC_WD30EZRX-00DC0B0_WD-WMC1T0564002 (sdc) 2930266584
Jul 12 09:56:04 Tower emhttp: WDC_WD20EARS-00MVWB0_WD-WMAZA1789360 (sdd) 1953514584
Jul 12 09:56:04 Tower emhttp: WDC_WD15EADS-00P8B0_WD-WMAVU0108337 (sde) 1465138584
Jul 12 09:56:04 Tower emhttp: (sdf) 0
Jul 12 09:56:04 Tower emhttp: WDC_WD20EARX-00PASB0_WD-WCAZA8276467 (sdg) 1953514584
Jul 12 09:56:04 Tower emhttp: ST3500418AS_9VM5HX2G (sdh) 488386584
Jul 12 09:56:04 Tower kernel: mdcmd (1): import 0 8,16 3907018532 ST4000DM000-1F2168_Z300D00R
Jul 12 09:56:04 Tower kernel: md: import disk0: [8,16] (sdb) ST4000DM000-1F2168_Z300D00R size: 3907018532
Jul 12 09:56:04 Tower kernel: mdcmd (2): import 1 22,0 1953514552 WDC_WD20EADS-00R6B0_WD-WCAVY2267178
Jul 12 09:56:04 Tower kernel: md: import disk1: [22,0] (hdc) WDC_WD20EADS-00R6B0_WD-WCAVY2267178 size: 1953514552
Jul 12 09:56:04 Tower emhttp: shcmd (109): /usr/local/sbin/emhttp_event driver_loaded
Jul 12 09:56:04 Tower kernel: mdcmd (3): import 2 8,64 1465138552 WDC_WD15EADS-00P8B0_WD-WMAVU0108337
Jul 12 09:56:04 Tower kernel: md: import disk2: [8,64] (sde) WDC_WD15EADS-00P8B0_WD-WMAVU0108337 size: 1465138552
Jul 12 09:56:04 Tower kernel: mdcmd (4): import 3 8,48 1953514552 WDC_WD20EARS-00MVWB0_WD-WMAZA1789360
Jul 12 09:56:04 Tower kernel: md: import disk3: [8,48] (sdd) WDC_WD20EARS-00MVWB0_WD-WMAZA1789360 size: 1953514552
Jul 12 09:56:04 Tower kernel: mdcmd (5): import 4 8,96 1953514552 WDC_WD20EARX-00PASB0_WD-WCAZA8276467
Jul 12 09:56:04 Tower kernel: md: import disk4: [8,96] (sdg) WDC_WD20EARX-00PASB0_WD-WCAZA8276467 size: 1953514552
Jul 12 09:56:04 Tower kernel: mdcmd (6): import 5 8,32 2930266532 WDC_WD30EZRX-00DC0B0_WD-WMC1T0564002
Jul 12 09:56:04 Tower kernel: md: import disk5: [8,32] (sdc) WDC_WD30EZRX-00DC0B0_WD-WMC1T0564002 size: 2930266532
Jul 12 09:56:04 Tower kernel: mdcmd (7): import 6 0,0
Jul 12 09:56:04 Tower kernel: mdcmd (8): import 7 0,0
Jul 12 09:56:04 Tower kernel: mdcmd (9): import 8 0,0
Jul 12 09:56:04 Tower kernel: mdcmd (10): import 9 0,0
Jul 12 09:56:04 Tower kernel: mdcmd (11): import 10 0,0
Jul 12 09:56:04 Tower kernel: mdcmd (12): import 11 0,0
Jul 12 09:56:04 Tower kernel: mdcmd (13): import 12 0,0
Jul 12 09:56:04 Tower kernel: mdcmd (14): import 13 0,0
Jul 12 09:56:04 Tower kernel: mdcmd (15): import 14 0,0
Jul 12 09:56:04 Tower kernel: mdcmd (16): import 15 0,0
Jul 12 09:56:04 Tower kernel: mdcmd (17): import 16 0,0
Jul 12 09:56:04 Tower kernel: mdcmd (18): import 17 0,0
Jul 12 09:56:04 Tower kernel: mdcmd (19): import 18 0,0
Jul 12 09:56:04 Tower kernel: mdcmd (20): import 19 0,0
Jul 12 09:56:04 Tower kernel: mdcmd (21): import 20 0,0
Jul 12 09:56:04 Tower kernel: mdcmd (22): import 21 0,0
Jul 12 09:56:04 Tower kernel: mdcmd (23): import 22 0,0
Jul 12 09:56:04 Tower kernel: mdcmd (24): import 23 0,0
Jul 12 09:56:04 Tower emhttp_event: driver_loaded
Jul 12 09:56:06 Tower emhttp: shcmd (110): rmmod md-mod |& logger
Jul 12 09:56:06 Tower emhttp: shcmd (111): modprobe md-mod super=/boot/config/super.dat slots=24 |& logger
Jul 12 09:56:06 Tower kernel: md: unRAID driver removed
Jul 12 09:56:06 Tower emhttp: shcmd (112): udevadm settle
Jul 12 09:56:06 Tower kernel: md: unRAID driver 2.1.6 installed
Jul 12 09:56:06 Tower emhttp: Device inventory:
Jul 12 09:56:06 Tower emhttp: WDC_WD20EADS-00R6B0_WD-WCAVY2267178 (hdc) 1953514584
Jul 12 09:56:06 Tower emhttp: ST4000DM000-1F2168_Z300D00R (sdb) 3907018584
Jul 12 09:56:06 Tower emhttp: WDC_WD30EZRX-00DC0B0_WD-WMC1T0564002 (sdc) 2930266584
Jul 12 09:56:06 Tower emhttp: WDC_WD20EARS-00MVWB0_WD-WMAZA1789360 (sdd) 1953514584
Jul 12 09:56:06 Tower emhttp: WDC_WD15EADS-00P8B0_WD-WMAVU0108337 (sde) 1465138584
Jul 12 09:56:06 Tower emhttp: (sdf) 0
Jul 12 09:56:06 Tower emhttp: WDC_WD20EARX-00PASB0_WD-WCAZA8276467 (sdg) 1953514584
Jul 12 09:56:06 Tower emhttp: ST3500418AS_9VM5HX2G (sdh) 488386584
Jul 12 09:56:06 Tower kernel: mdcmd (1): import 0 8,16 3907018532 ST4000DM000-1F2168_Z300D00R
Jul 12 09:56:06 Tower kernel: md: import disk0: [8,16] (sdb) ST4000DM000-1F2168_Z300D00R size: 3907018532
Jul 12 09:56:06 Tower kernel: mdcmd (2): import 1 22,0 1953514552 WDC_WD20EADS-00R6B0_WD-WCAVY2267178
Jul 12 09:56:06 Tower kernel: md: import disk1: [22,0] (hdc) WDC_WD20EADS-00R6B0_WD-WCAVY2267178 size: 1953514552
Jul 12 09:56:06 Tower kernel: mdcmd (3): import 2 8,64 1465138552 WDC_WD15EADS-00P8B0_WD-WMAVU0108337
Jul 12 09:56:06 Tower kernel: md: import disk2: [8,64] (sde) WDC_WD15EADS-00P8B0_WD-WMAVU0108337 size: 1465138552
Jul 12 09:56:06 Tower kernel: mdcmd (4): import 3 8,48 1953514552 WDC_WD20EARS-00MVWB0_WD-WMAZA1789360
Jul 12 09:56:06 Tower kernel: md: import disk3: [8,48] (sdd) WDC_WD20EARS-00MVWB0_WD-WMAZA1789360 size: 1953514552
Jul 12 09:56:06 Tower kernel: mdcmd (5): import 4 8,96 1953514552 WDC_WD20EARX-00PASB0_WD-WCAZA8276467
Jul 12 09:56:06 Tower kernel: md: import disk4: [8,96] (sdg) WDC_WD20EARX-00PASB0_WD-WCAZA8276467 size: 1953514552
Jul 12 09:56:06 Tower emhttp: shcmd (113): /usr/local/sbin/emhttp_event driver_loaded
Jul 12 09:56:06 Tower kernel: mdcmd (6): import 5 8,32 2930266532 WDC_WD30EZRX-00DC0B0_WD-WMC1T0564002
Jul 12 09:56:06 Tower kernel: md: import disk5: [8,32] (sdc) WDC_WD30EZRX-00DC0B0_WD-WMC1T0564002 size: 2930266532
Jul 12 09:56:06 Tower kernel: mdcmd (7): import 6 0,0
Jul 12 09:56:06 Tower kernel: mdcmd (8): import 7 0,0
Jul 12 09:56:06 Tower kernel: mdcmd (9): import 8 0,0
Jul 12 09:56:06 Tower kernel: mdcmd (10): import 9 0,0
Jul 12 09:56:06 Tower kernel: mdcmd (11): import 10 0,0
Jul 12 09:56:06 Tower kernel: mdcmd (12): import 11 0,0
Jul 12 09:56:06 Tower kernel: mdcmd (13): import 12 0,0
Jul 12 09:56:06 Tower kernel: mdcmd (14): import 13 0,0
Jul 12 09:56:06 Tower kernel: mdcmd (15): import 14 0,0
Jul 12 09:56:06 Tower kernel: mdcmd (16): import 15 0,0
Jul 12 09:56:06 Tower kernel: mdcmd (17): import 16 0,0
Jul 12 09:56:06 Tower kernel: mdcmd (18): import 17 0,0
Jul 12 09:56:06 Tower kernel: mdcmd (19): import 18 0,0
Jul 12 09:56:06 Tower kernel: mdcmd (20): import 19 0,0
Jul 12 09:56:06 Tower kernel: mdcmd (21): import 20 0,0
Jul 12 09:56:06 Tower kernel: mdcmd (22): import 21 0,0
Jul 12 09:56:06 Tower kernel: mdcmd (23): import 22 0,0
Jul 12 09:56:06 Tower kernel: mdcmd (24): import 23 0,0
Jul 12 09:56:06 Tower emhttp_event: driver_loaded
Jul 12 09:56:06 Tower emhttp: shcmd (114): rmmod md-mod |& logger
Jul 12 09:56:07 Tower emhttp: shcmd (115): modprobe md-mod super=/boot/config/super.dat slots=24 |& logger
Jul 12 09:56:07 Tower kernel: md: unRAID driver removed
Jul 12 09:56:07 Tower kernel: md: unRAID driver 2.1.6 installed
Jul 12 09:56:07 Tower emhttp: shcmd (116): udevadm settle
madburg Posted July 12, 2013

@EMKO, please post your full syslog when you have a moment. Download the hdparm and smartmontools updates from this post: http://lime-technology.com/forum/index.php?topic=28382.msg252874#msg252874 (unfortunately the original poster of these pulled them, long story, or I would have had you reference his). Update your system with those two packages, then retry your test.
EMKO Posted July 12, 2013

@madburg, here is my log. What I tried was preclear: it was working during the pre-read, but once it started to write, disk 4 showed up as red and the cache drive just stopped working. In preclear it was writing at 350MB/s. These drives might be hooked up to a Supermicro AOC-SASLP-MVL8; is there any way to tell if they are? I'm going to install those packages now.

syslog.txt
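On the question of telling which controller a drive is attached to: one possible approach on Linux is to resolve the drive's sysfs entry, whose full path includes the PCI address of the controller it hangs off, and then look that address up with lspci. The following is only a sketch, assuming sysfs is mounted at /sys and that readlink and lspci are available on the server; the function name pci_addr_from_path is made up for illustration.

```sh
#!/bin/sh
# Hypothetical helper: pull the last PCI address (domain:bus:device.function)
# out of a resolved sysfs path. The last address in the path is the device
# the disk hangs off; any earlier ones are bridges on the way there.
pci_addr_from_path() {
    printf '%s\n' "$1" |
        grep -oE '[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f]' |
        tail -n 1
}

# Example use on the server itself (sdf is the drive from the logs above):
#   path=$(readlink -f /sys/block/sdf)
#   lspci -s "$(pci_addr_from_path "$path")"
```

If two drives resolve to the same PCI address, they share a controller. The lspci line for that address should name the card's chipset, which may not match the retail name printed on the card.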
EMKO Posted July 12, 2013

I'm stuck. I rebooted the server and disk 4 is still red, so I can't start the array.
EMKO Posted July 13, 2013

I just took the server apart. All the drives that failed once I added this drive are on the same controller, the Supermicro AOC-SASLP-MVL8. Disk 4 (red ball) and the cache drive failed while I was trying to clear disk 6, which is also on that controller. How do I tell unRAID that this hard drive is fine so I don't have to rebuild it? And what's causing this problem? The controller?
Thornwood Posted July 13, 2013

I saw a problem with my card and 4 TB drives; it seems the spin-up wait was taking too long. See if there is a setting in the card BIOS to wait longer. It also seemed to happen to me more after a restart than after a full power-down. See if this helps. I moved my biggest drives off the card and onto my motherboard and have not had any more problems.
RobJ Posted July 14, 2013

I initially labeled this thread for v4.6 because the syslog you attached involved UnRAID v4.6. Later I saw that it was from January 23, so it was probably the wrong syslog. It is also for a system with only 4 SATA ports and no SAS card.

Without a syslog it is hard to diagnose much, but there are a few things I can say from the tails you posted. In the first one, drive sdf has completely stopped responding and is not even answering queries about its identity. You probably should have aborted right there and captured the syslog. What we needed to see were the very first error messages when it began to have issues. That would possibly have told us what was wrong right there. Since it is sdf, we know it's the 6th drive to be assigned, and it was on sd 0:0:0:0, so it was probably attached to a large SAS card. A clear was being performed, so you were adding a new drive, not PreCleared, and hopefully not the one known as sdf.

The only way to recover a non-responding drive is to reboot, after which you can determine whether the drive is responding. If not, check the connections. If according to the syslog it is responding normally, run SMART tests on it to determine whether it is still a good drive. If the drive is good, turn your attention to its cables or backplane connection, and its disk controller.

If I could emphasize one point: whenever you have an issue, capture the syslog right then!
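The "capture the syslog right then" advice could be sketched as a pair of small shell helpers. This is a minimal sketch rather than an official unRAID tool: the function names save_syslog and first_device_lines are hypothetical, and it assumes /boot is the flash drive (so the copy survives a reboot) and that the kernel tags the drive's messages with "[sdf]" as in the tails above.

```sh
#!/bin/sh
# Hypothetical helper: copy a syslog into a directory under a timestamped
# name and print the path of the copy.
save_syslog() {
    src=$1
    dest_dir=$2
    out="$dest_dir/syslog-$(date +%Y%m%d-%H%M%S).txt"
    cp "$src" "$out" && printf '%s\n' "$out"
}

# Hypothetical helper: print the first N lines (default 20) that carry the
# kernel's "[device]" tag, i.e. the earliest messages about that drive.
first_device_lines() {
    dev=$1
    log=$2
    count=${3:-20}
    grep -F "[$dev]" "$log" | head -n "$count"
}

# Example use on the server, before rebooting:
#   saved=$(save_syslog /var/log/syslog /boot)
#   first_device_lines sdf "$saved"
# After the reboot, if the drive responds, SMART checks with smartmontools:
#   smartctl -a /dev/sdf          # identity, attributes, error log
#   smartctl -t short /dev/sdf    # start a short self-test
#   smartctl -l selftest /dev/sdf # read the self-test results later
```

Saving before the reboot matters because /var/log lives in RAM on a typical unRAID install, so the messages from the moment the drive dropped out are otherwise lost.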