JustinChase Posted August 26, 2014

So, I finally settled into a new place and hooked up my server. I noticed some sketchy performance when watching video, and some investigation led me to discover that I had a drive red-balled. The array had started fine, but one drive seems to be bad.

I took off the cover and reseated all the cables, and when I restarted, the array wouldn't even start; it couldn't even find the drive. I then reseated all the cables again and refreshed the array display, and the array started, but the drive is still red-balled.

I have a 3TB drive on order, and the red-balled drive is a 2TB drive. I don't want to upgrade to beta7 just yet, and I'm not sure if there are any other troubleshooting steps I should take before just replacing the drive.

The drive on order is actually meant to replace a 1TB drive that is showing some errors, not the one that has red-balled, so I suspect I need to order yet another drive. But I don't want to change too much until I get some guidance on how I should proceed to at least get back to having a protected array.

Thoughts, ideas, suggestions? Thanks!

jphipps Posted August 26, 2014

I would check the SMART report on the drive, and run a short and a long test. I had a drive get red-balled in the past, and it was from some flaky issue at boot where the system didn't see the drive; something must have written to the array, and the drive would not come back. I had to re-install the same drive in the same slot and let unRAID rebuild it to get it back up and working.

If you want to be safe, rebuild onto the new disk, and then run a preclear on the red-balled disk to make sure there are no errors on the drive.

JustinChase Posted August 26, 2014 (Author)

Hmmm... I can't find any way to run a SMART test on the drive at this time. I may just be missing it, but it might be unavailable due to the red ball. I'm not sure.

jphipps Posted August 26, 2014

You would have to run it against the Linux device from a telnet or SSH shell (replace /dev/sda with your drive's device), such as:

To review the SMART report:
    smartctl -a /dev/sda

To run a long test:
    smartctl --test=long /dev/sda

JustinChase Posted August 26, 2014 (Author)

Ah, okay. Are you sure doing so won't 'harm' the drive further? I'm guessing not, but I want to be sure I don't make matters worse by trying to make them better.

sgibbers17 Posted August 26, 2014

There is always a chance that your drive may fail completely during testing (I had a drive do that once), but running a SMART test will not change the data on the drive.

jphipps Posted August 26, 2014

The report and the test are totally non-disruptive. You can monitor the test from the SMART report. It will run about 3 hours, so make sure you aren't going to shut down the server in that time.

JustinChase Posted August 26, 2014 (Author)

jphipps said: ...It will run about 3 hours...

or less...

root@media:~# smartctl -a /dev/sdm
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.15.0-unRAID] (local build)
Copyright © 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

Smartctl open device: /dev/sdm failed: No such device

root@media:~# smartctl --test=long /dev/sdm
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.15.0-unRAID] (local build)
Copyright © 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

Smartctl open device: /dev/sdm failed: No such device

itimpi Posted August 26, 2014

If smartctl cannot see the drive, then as far as the system is concerned the drive is offline. I would check the cables again, both SATA and power, as bad connections are the most common cause of this. However, if the system still does not see the drive after powering off and on, then maybe it has really failed.

trurl Posted August 26, 2014

If unRAID can't see the drive, how can it even be "sdm"? What about:

    ls -l /dev/disk/by-id

JustinChase Posted August 26, 2014 (Author)

trurl said: If unRAID can't see the drive, how can it even be "sdm"? What about? ls -l /dev/disk/by-id

root@media:~# ls -l /dev/disk/by-id
total 0
lrwxrwxrwx 1 root root 9 Aug 26 11:15 ata-Hitachi_HDS5C3030ALA630_MJ1323YNG1SUPC -> ../../sdg
lrwxrwxrwx 1 root root 10 Aug 26 11:15 ata-Hitachi_HDS5C3030ALA630_MJ1323YNG1SUPC-part1 -> ../../sdg1
lrwxrwxrwx 1 root root 9 Aug 26 11:15 ata-Hitachi_HDS5C3030ALA630_MJ1323YNG1TLWC -> ../../sdf
lrwxrwxrwx 1 root root 10 Aug 26 11:15 ata-Hitachi_HDS5C3030ALA630_MJ1323YNG1TLWC-part1 -> ../../sdf1
lrwxrwxrwx 1 root root 9 Aug 26 11:15 ata-Hitachi_HDS5C3030ALA630_MJ1323YNG1U3PC -> ../../sdd
lrwxrwxrwx 1 root root 10 Aug 26 11:15 ata-Hitachi_HDS5C3030ALA630_MJ1323YNG1U3PC-part1 -> ../../sdd1
lrwxrwxrwx 1 root root 9 Aug 26 11:17 ata-Hitachi_HDS5C3030ALA630_MJ1323YNG1UB8C -> ../../sdj
lrwxrwxrwx 1 root root 10 Aug 26 11:17 ata-Hitachi_HDS5C3030ALA630_MJ1323YNG1UB8C-part1 -> ../../sdj1
lrwxrwxrwx 1 root root 9 Aug 26 11:15 ata-SAMSUNG_HD102UJ_S1ZUJDWS308586 -> ../../sdb
lrwxrwxrwx 1 root root 10 Aug 26 11:15 ata-SAMSUNG_HD102UJ_S1ZUJDWS308586-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 9 Aug 26 11:15 ata-SAMSUNG_HD102UJ_S1ZUJDWS308596 -> ../../sdc
lrwxrwxrwx 1 root root 10 Aug 26 11:15 ata-SAMSUNG_HD102UJ_S1ZUJDWS308596-part1 -> ../../sdc1
lrwxrwxrwx 1 root root 9 Aug 26 11:15 ata-ST3500830AS_6QG123RD -> ../../sdh
lrwxrwxrwx 1 root root 10 Aug 26 11:15 ata-ST3500830AS_6QG123RD-part1 -> ../../sdh1
lrwxrwxrwx 1 root root 9 Aug 26 11:15 ata-ST4000DM000-1F2168_S30099K3 -> ../../sde
lrwxrwxrwx 1 root root 10 Aug 26 11:15 ata-ST4000DM000-1F2168_S30099K3-part1 -> ../../sde1
lrwxrwxrwx 1 root root 9 Aug 26 11:15 ata-ST4000DM000-1F2168_W300LGJ3 -> ../../sdi
lrwxrwxrwx 1 root root 10 Aug 26 11:15 ata-ST4000DM000-1F2168_W300LGJ3-part1 -> ../../sdi1
lrwxrwxrwx 1 root root 9 Aug 26 11:26 ata-WDC_WD20EACS-11BHUB0_WD-WCAZA3758422 -> ../../sdk
lrwxrwxrwx 1 root root 10 Aug 26 11:26 ata-WDC_WD20EACS-11BHUB0_WD-WCAZA3758422-part1 -> ../../sdk1
lrwxrwxrwx 1 root root 9 Aug 26 14:35 ata-WDC_WD20EARS-00MVWB0_WD-WCAZA2130725 -> ../../sdn
lrwxrwxrwx 1 root root 10 Aug 26 14:35 ata-WDC_WD20EARS-00MVWB0_WD-WCAZA2130725-part1 -> ../../sdn1
lrwxrwxrwx 1 root root 9 Aug 26 11:15 ata-WL3000GSA6472C_WOL240243956 -> ../../sdl
lrwxrwxrwx 1 root root 10 Aug 26 11:15 ata-WL3000GSA6472C_WOL240243956-part1 -> ../../sdl1
lrwxrwxrwx 1 root root 9 Aug 26 11:15 usb-_Patriot_Memory_07B80F151F7902B3-0:0 -> ../../sda
lrwxrwxrwx 1 root root 10 Aug 26 11:15 usb-_Patriot_Memory_07B80F151F7902B3-0:0-part1 -> ../../sda1
lrwxrwxrwx 1 root root 9 Aug 26 11:15 wwn-0x5000c50069e7f8b4 -> ../../sdi
lrwxrwxrwx 1 root root 10 Aug 26 11:15 wwn-0x5000c50069e7f8b4-part1 -> ../../sdi1
lrwxrwxrwx 1 root root 9 Aug 26 11:15 wwn-0x5000c5006d3c4a5f -> ../../sde
lrwxrwxrwx 1 root root 10 Aug 26 11:15 wwn-0x5000c5006d3c4a5f-part1 -> ../../sde1
lrwxrwxrwx 1 root root 9 Aug 26 11:15 wwn-0x5000cca228c0cdd2 -> ../../sdg
lrwxrwxrwx 1 root root 10 Aug 26 11:15 wwn-0x5000cca228c0cdd2-part1 -> ../../sdg1
lrwxrwxrwx 1 root root 9 Aug 26 11:15 wwn-0x5000cca228c0d0c0 -> ../../sdf
lrwxrwxrwx 1 root root 10 Aug 26 11:15 wwn-0x5000cca228c0d0c0-part1 -> ../../sdf1
lrwxrwxrwx 1 root root 9 Aug 26 11:15 wwn-0x5000cca228c0d2aa -> ../../sdd
lrwxrwxrwx 1 root root 10 Aug 26 11:15 wwn-0x5000cca228c0d2aa-part1 -> ../../sdd1
lrwxrwxrwx 1 root root 9 Aug 26 11:17 wwn-0x5000cca228c0d395 -> ../../sdj
lrwxrwxrwx 1 root root 10 Aug 26 11:17 wwn-0x5000cca228c0d395-part1 -> ../../sdj1
lrwxrwxrwx 1 root root 9 Aug 26 11:26 wwn-0x50014ee2057a96e9 -> ../../sdk
lrwxrwxrwx 1 root root 10 Aug 26 11:26 wwn-0x50014ee2057a96e9-part1 -> ../../sdk1
lrwxrwxrwx 1 root root 9 Aug 26 14:35 wwn-0x50014ee2afd6fe55 -> ../../sdn
lrwxrwxrwx 1 root root 10 Aug 26 14:35 wwn-0x50014ee2afd6fe55-part1 -> ../../sdn1
lrwxrwxrwx 1 root root 9 Aug 26 11:15 wwn-0x50014eef0255eba8 -> ../../sdl
lrwxrwxrwx 1 root root 10 Aug 26 11:15 wwn-0x50014eef0255eba8-part1 -> ../../sdl1
lrwxrwxrwx 1 root root 9 Aug 26 11:15 wwn-0x50024e900126cff6 -> ../../sdb
lrwxrwxrwx 1 root root 10 Aug 26 11:15 wwn-0x50024e900126cff6-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 9 Aug 26 11:15 wwn-0x50024e900126d0af -> ../../sdc
lrwxrwxrwx 1 root root 10 Aug 26 11:15 wwn-0x50024e900126d0af-part1 -> ../../sdc1

Device Identification Temp. Size Used Free Reads Writes Errors View
[spin Up] Parity ST4000DM000-1F2168_W300LGJ3 (sdi) 3907018532 * 4 TB - - 6253 6361 0
[spin Up] Disk 1 Hitachi_HDS5C3030ALA630_MJ1323YNG1UB8C (sdj) 2930266532 * 3 TB 2.77 TB 232 GB 966 33 0 [browse /mnt/disk1]
[spin Up] Disk 2 Hitachi_HDS5C3030ALA630_MJ1323YNG1U3PC (sdd) 2930266532 * 3 TB 2.76 TB 236 GB 573 20 0 [browse /mnt/disk2]
[spin Up] Disk 3 Hitachi_HDS5C3030ALA630_MJ1323YNG1TLWC (sdf) 2930266532 * 3 TB 2.77 TB 227 GB 1401 12 0 [browse /mnt/disk3]
[spin Down] Disk 4 WDC_WD20EACS-11BHUB0_WD-WCAZA3758422 (sdk) 1953514552 30 °C 2 TB 1.50 TB 500 GB 777 105 0 [browse /mnt/disk4]
[spin Down] Disk 5 Hitachi_HDS5C3030ALA630_MJ1323YNG1SUPC (sdg) 2930266532 33 °C 3 TB 2.32 TB 682 GB 6689 6054 0 [browse /mnt/disk5]
[spin Down] Disk 6 SAMSUNG_HD102UJ_S1ZUJDWS308596 (sdc) 976762552 27 °C 1 TB 750 GB 250 GB 904 16 0 [browse /mnt/disk6]
[spin Up] Disk 7 SAMSUNG_HD102UJ_S1ZUJDWS308586 (sdb) 976762552 * 1 TB 668 GB 333 GB 443 16 0 [browse /mnt/disk7]
[spin Up] Disk 8 WL3000GSA6472C_WOL240243956 (sdl) 2930266532 * 3 TB 2.64 TB 363 GB 798 16 0 [browse /mnt/disk8]
[spin Down] Disk 9 WDC_WD20EARS-00MVWB0_WD-WCAZA2130725 (sdm) 1953514552 * 2 TB 1.82 TB 178 GB 1226 16 0 [browse /mnt/disk9]
[spin Down] Disk 10 ST4000DM000-1F2168_S30099K3 (sde) 3907018532 27 °C 4 TB 3 TB 999 GB 1827 97 0 [browse /mnt/disk10]
[spin Down] Cache ST3500830AS_6QG123RD (sdh) 488386552 34 °C 500 GB 272 GB 228 GB 5047 24,476 0 [browse /mnt/cache]

jphipps Posted August 26, 2014

Are there any messages in your syslog relating to sdm?

garycase Posted August 26, 2014

"... after driving my server almost 3000 miles " ==> Clearly the server's been "shaken and stirred" a bit recently.

In addition to reseating all of the cables (and the drive itself if it's in a caddy), did you try replacing the SATA cable for that drive? If you did and the drive still isn't seen, then it's likely really bad. I'd replace it with your new 3TB unit and let the rebuild complete while you have the chance. Meanwhile, order another drive to replace your 1TB unit as well, so you don't have any "known flaky" drives in the system.

I assume your parity drive is at least 3TB, so you don't also have the issue of needing to upgrade parity at the same time -- correct?

JonathanM Posted August 26, 2014

JustinChase said:

root@media:~# ls -l /dev/disk/by-id
total 0
lrwxrwxrwx 1 root root 9 Aug 26 14:35 ata-WDC_WD20EARS-00MVWB0_WD-WCAZA2130725 -> ../../sdn
lrwxrwxrwx 1 root root 10 Aug 26 14:35 ata-WDC_WD20EARS-00MVWB0_WD-WCAZA2130725-part1 -> ../../sdn1

Device Identification Temp. Size Used Free Reads Writes Errors View
[spin Down] Disk 9 WDC_WD20EARS-00MVWB0_WD-WCAZA2130725 (sdm) 1953514552 * 2 TB 1.82 TB 178 GB 1226 16 0 [browse /mnt/disk9]

??? Were these two snippets captured during the same boot session?

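The mismatch JonathanM points out (the web page says sdm while /dev/disk/by-id says sdn) suggests the drive dropped off the bus and came back under a new letter. A minimal sketch of looking a drive up by serial number instead of by letter follows. The function name `dev_for_serial` and its optional directory argument are hypothetical, made up for this illustration; only `readlink` and `basename` are standard tools.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print the kernel device name,
# e.g. "sdn", that a drive serial number currently maps to.
#   $1 = serial number as it appears at the end of the by-id link name
#   $2 = directory to search (assumed default: /dev/disk/by-id)
dev_for_serial() {
    dfs_serial=$1
    dfs_dir=${2:-/dev/disk/by-id}
    # Whole-disk links end with the serial; "-partN" links do not, so the
    # glob below skips partitions.
    for dfs_link in "$dfs_dir"/*"$dfs_serial"; do
        # If nothing matched, the glob stays literal and is not a symlink.
        [ -L "$dfs_link" ] || continue
        basename "$(readlink "$dfs_link")"
        return 0
    done
    return 1
}

# Example (on the server itself):
#   dev=$(dev_for_serial WD-WCAZA2130725) && smartctl -a "/dev/$dev"
```

Looking the device up this way right before each smartctl call avoids testing a stale letter such as /dev/sdm after the drive has re-attached.
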
JustinChase Posted August 26, 2014 (Author)

garycase said: "... after driving my server almost 3000 miles " ==> Clearly the server's been "shaken and stirred" a bit recently.

You can say that again! Almost half of that was in Mexico, with several hours of terrible roads that felt like gravel roads, even though they were actually highways. That's why I even mentioned the miles at all.

garycase said: I assume your parity drive is at least 3TB, so you don't also have the issue of needing to upgrade parity at the same time -- correct?

Yes, parity is actually 4TB. I only bought a 3TB drive now because Fry's had one for 89 delivered to my door, which I couldn't pass up. Unfortunately, it was only one per household, or I'd have already gotten another for the known flaky drive.

I'll shut down again right now, remove and reseat all drives, and replace the cable for that bad drive as well. The drives are in 'caddies', and by that I mean they all have a plastic 'band' around them with posts that fit into the screw holes on the drives; the 'bands' then slide into grooves in the server and lock into place. However, they are only connected with SATA cables and the power cables. I guess I'm just not sure what qualifies as a caddy.

JonathanM said: Were these two snippets captured during the same boot session?

Yes, without any reboot. The drive originally showed as missing. I reseated all cables while the server was running, then refreshed the view, and the array started, but the drive was still red-balled, though it did show the sdm designation. Then, a bit later, I ran the commands and posted the results. I have not rebooted yet.

jphipps said: Are there any messages in your syslog relating to sdm?

Yeah, several messages. I've attached the entire syslog, in case there's more going on here.

syslog.zip

garycase Posted August 26, 2014

When you use a hot-swap caddy, the drive is actually connecting to the caddy -- NOT to the SATA cable you're using. It sounds like that's the case, so be sure you remove the drive from the caddy (pull it out a bit) and then re-seat it. That connection can get loose just like a SATA connector can. The more connections you have, the more potential failure points -- and to check them you have to reseat them all.

jphipps Posted August 26, 2014

Looks like that disk is not a happy camper from your syslog...

Aug 26 11:34:19 media kernel: sd 14:0:0:0: Attached scsi generic sg12 type 0
Aug 26 11:34:19 media kernel: sd 14:0:0:0: [sdm] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)
Aug 26 11:34:19 media kernel: sd 14:0:0:0: [sdm] Write Protect is off
Aug 26 11:34:19 media kernel: sd 14:0:0:0: [sdm] Mode Sense: 00 3a 00 00
Aug 26 11:34:19 media kernel: sd 14:0:0:0: [sdm] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Aug 26 11:34:19 media kernel: sdm: sdm1
Aug 26 11:34:19 media kernel: sd 14:0:0:0: [sdm] Attached SCSI disk
Aug 26 11:34:25 media kernel: ata14: exception Emask 0x10 SAct 0x0 SErr 0x90002 action 0xe frozen
Aug 26 11:34:25 media kernel: ata14: irq_stat 0x00400000, PHY RDY changed
Aug 26 11:34:25 media kernel: ata14: SError: { RecovComm PHYRdyChg 10B8B }
Aug 26 11:34:25 media kernel: ata14: hard resetting link
Aug 26 11:34:28 media kernel: ata12: exception Emask 0x10 SAct 0x0 SErr 0x10002 action 0xe frozen
Aug 26 11:34:28 media kernel: ata12: irq_stat 0x00400000, PHY RDY changed
Aug 26 11:34:28 media kernel: ata12: SError: { RecovComm PHYRdyChg }
Aug 26 11:34:28 media kernel: ata12: hard resetting link
Aug 26 11:34:35 media kernel: ata14: softreset failed (device not ready)
Aug 26 11:34:35 media kernel: ata14: hard resetting link
Aug 26 11:34:36 media kernel: ata12: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Aug 26 11:34:36 media kernel: ata12.00: configured for UDMA/33
Aug 26 11:34:36 media kernel: ata12: EH complete
Aug 26 11:34:45 media kernel: ata14: softreset failed (device not ready)
Aug 26 11:34:45 media kernel: ata14: hard resetting link
Aug 26 11:34:46 media kernel: ata14: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Aug 26 11:34:46 media kernel: ata14.00: configured for UDMA/133
Aug 26 11:34:46 media kernel: ata14: EH complete
Aug 26 11:34:52 media kernel: ata14: exception Emask 0x10 SAct 0x0 SErr 0x90002 action 0xe frozen
Aug 26 11:34:52 media kernel: ata14: irq_stat 0x00400000, PHY RDY changed
Aug 26 11:34:52 media kernel: ata14: SError: { RecovComm PHYRdyChg 10B8B }
Aug 26 11:34:52 media kernel: ata14: hard resetting link
Aug 26 11:34:53 media kernel: ata12: exception Emask 0x10 SAct 0x0 SErr 0x10002 action 0xe frozen
Aug 26 11:34:53 media kernel: ata12: irq_stat 0x00400000, PHY RDY changed
Aug 26 11:34:53 media kernel: ata12: SError: { RecovComm PHYRdyChg }
Aug 26 11:34:53 media kernel: ata12: hard resetting link

If all connections look good and you have a USB-to-SATA dock, you could try connecting the drive to it to make sure it is working. Or, if you get a new drive, connect it on that same connection and see if the new drive is recognized. If so, the old drive has probably failed.

JustinChase Posted August 26, 2014 (Author)

So, I've removed all the drives (and re-labeled them all, so I can see what is what more easily), moved a few around to organize things a bit better, and then reinstalled all the drives. The result is that the red-balled drive is in a new location and is connected to a different SATA port.

When I booted unRAID, it booted really quickly, and the array started. I figured that was a good thing, but then I took a look, and it shows Disk 9 as not installed. It now also shows the WD drive as just a normal, available drive in the list.

I want to think this is a good thing, and that I can just stop the array, assign the drive to Disk 9, then restart. But before I do, I figure I should check and see if that's actually a dumb idea.

Screenshot attached... thoughts?

trurl Posted August 27, 2014

This sounds similar to the scenario where people intentionally make unRAID forget about the drive so they can rebuild it onto itself. I think if you assign it, unRAID may want to rebuild it. Whether that is what you should do is unclear. Have you got another drive you could rebuild onto?

jphipps Posted August 27, 2014

Can you run the smartctl command against /dev/sdb (the disk outside the array)?

My guess is that the server came up once without the disk installed, so the slot is set to missing, and the disk is now showing up as separate from the array. If the SMART test passes and the drive still looks to be in good health, you could try mounting the /dev/sdb1 partition to verify that it holds the data that should be on that disk.

The safest option would be to rebuild slot 9 on another drive; then you still have the original data on that disk if the rebuild fails for some reason.

The last option, if you are sure all the data is correct on the drives, is to start with a new config and rebuild parity from all the data drives. It sounds hairy, but I have done that a few times and got my array back. You have to remember that with unRAID, all array disks are just individual stand-alone data drives that are part of the array, and the parity disk is only really used for emulating a failed disk and rebuilding.

JustinChase Posted August 27, 2014 (Author)

trurl said: This sounds similar to the scenario where people intentionally make unRAID forget about the drive so they can rebuild it onto itself. I think if you assign it unRAID may want to rebuild it. Whether that is what you should do is unclear. Have you got another drive you could rebuild onto?

Not until sometime tomorrow. I don't NEED to get it all fixed tonight for any particular reason, so I'll wait until I have a new drive to rebuild onto, then see if I can preclear the red-balled drive and whether it will work okay moving forward.

Obviously, if someone else has any ideas, thoughts, or suggestions, I'm all ears. Thanks, everyone, for all your help so far!

JustinChase Posted August 27, 2014 (Author)

jphipps said: Can you run the smartctl command against /dev/sdb ( the disk outsite the array )

root@media:~# smartctl -a /dev/sdb
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.15.0-unRAID] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family: Western Digital Caviar Green (AF)
Device Model: WDC WD20EARS-00MVWB0
Serial Number: WD-WCAZA2130725
LU WWN Device Id: 5 0014ee 2afd6fe55
Firmware Version: 51.0AB51
User Capacity: 2,000,398,934,016 bytes [2.00 TB]
Sector Size: 512 bytes logical/physical
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS (minor revision not indicated)
SATA Version is: SATA 2.6, 3.0 Gb/s
Local Time is: Tue Aug 26 19:30:14 2014 CDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x84) Offline data collection activity was suspended by an interrupting command from host. Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed without error or no self-test has ever been run.
Total time to complete Offline data collection: (40260) seconds.
Offline data collection capabilities: (0x7b) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. Offline surface scan supported. Self-test supported. Conveyance Self-test supported. Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported. General Purpose Logging supported.
Short self-test routine recommended polling time: ( 2) minutes.
Extended self-test routine recommended polling time: ( 388) minutes.
Conveyance self-test routine recommended polling time: ( 5) minutes.
SCT capabilities: (0x3035) SCT Status supported. SCT Feature Control supported. SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG   VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f 200   200   051    Pre-fail Always  -           0
  3 Spin_Up_Time            0x0027 253   163   021    Pre-fail Always  -           1108
  4 Start_Stop_Count        0x0032 096   096   000    Old_age  Always  -           4205
  5 Reallocated_Sector_Ct   0x0033 200   200   140    Pre-fail Always  -           0
  7 Seek_Error_Rate         0x002e 200   200   000    Old_age  Always  -           0
  9 Power_On_Hours          0x0032 084   084   000    Old_age  Always  -           12054
 10 Spin_Retry_Count        0x0032 100   100   000    Old_age  Always  -           0
 11 Calibration_Retry_Count 0x0032 100   100   000    Old_age  Always  -           0
 12 Power_Cycle_Count       0x0032 099   099   000    Old_age  Always  -           1471
192 Power-Off_Retract_Count 0x0032 199   199   000    Old_age  Always  -           1381
193 Load_Cycle_Count        0x0032 182   182   000    Old_age  Always  -           54340
194 Temperature_Celsius     0x0022 116   098   000    Old_age  Always  -           34
196 Reallocated_Event_Count 0x0032 200   200   000    Old_age  Always  -           0
197 Current_Pending_Sector  0x0032 200   200   000    Old_age  Always  -           0
198 Offline_Uncorrectable   0x0030 200   200   000    Old_age  Offline -           0
199 UDMA_CRC_Error_Count    0x0032 200   200   000    Old_age  Always  -           0
200 Multi_Zone_Error_Rate   0x0008 200   200   000    Old_age  Offline -           7

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
   1       0       0 Not_testing
   2       0       0 Not_testing
   3       0       0 Not_testing
   4       0       0 Not_testing
   5       0       0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

I'm not sure whether running the long test on that drive will disrupt my ability to stream the video I'm currently trying to watch, so I may wait until it's done to start that test.

jphipps Posted August 27, 2014

It shouldn't impact using the disk. The SMART report looks pretty clean...

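For readers wondering what "pretty clean" might mean in practice, here is a minimal sketch that scans a SMART attribute table for a few counters that people often watch. The choice of IDs (5, 197, 198, 199) is a common rule of thumb and an assumption here, not something stated in the thread, and the function name `smart_watch` is hypothetical. It relies only on the column layout shown in the report above, where the raw value is the tenth field.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): read smartctl attribute rows on
# stdin and print NAME=RAW for each watched attribute with a non-zero raw
# value. Returns 1 if anything was printed, 0 otherwise.
# Watched IDs are an assumed rule of thumb:
#   5 Reallocated_Sector_Ct, 197 Current_Pending_Sector,
#   198 Offline_Uncorrectable, 199 UDMA_CRC_Error_Count
smart_watch() {
    awk '$1 ~ /^(5|197|198|199)$/ && $10 + 0 != 0 {
             print $2 "=" $10
             bad = 1
         }
         END { exit (bad ? 1 : 0) }'
}

# Example (on the server itself):
#   smartctl -A /dev/sdb | smart_watch && echo "watched counters are all zero"
```

Rows from other parts of a full `smartctl -a` report that happen to start with the same numbers (such as the selective self-test spans) have no tenth field, so they are not flagged.
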
RobJ Posted August 27, 2014

Aug 26 11:34:25 media kernel: ata14: exception Emask 0x10 SAct 0x0 SErr 0x90002 action 0xe frozen
Aug 26 11:34:25 media kernel: ata14: irq_stat 0x00400000, PHY RDY changed
Aug 26 11:34:25 media kernel: ata14: SError: { RecovComm PHYRdyChg 10B8B }
Aug 26 11:34:25 media kernel: ata14: hard resetting link
Aug 26 11:34:28 media kernel: ata12: exception Emask 0x10 SAct 0x0 SErr 0x10002 action 0xe frozen
Aug 26 11:34:28 media kernel: ata12: irq_stat 0x00400000, PHY RDY changed
Aug 26 11:34:28 media kernel: ata12: SError: { RecovComm PHYRdyChg }
Aug 26 11:34:28 media kernel: ata12: hard resetting link
Aug 26 11:34:35 media kernel: ata14: softreset failed (device not ready)
Aug 26 11:34:35 media kernel: ata14: hard resetting link
Aug 26 11:34:36 media kernel: ata12: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Aug 26 11:34:36 media kernel: ata12.00: configured for UDMA/33
Aug 26 11:34:36 media kernel: ata12: EH complete
Aug 26 11:34:45 media kernel: ata14: softreset failed (device not ready)
Aug 26 11:34:45 media kernel: ata14: hard resetting link
Aug 26 11:34:46 media kernel: ata14: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Aug 26 11:34:46 media kernel: ata14.00: configured for UDMA/133
Aug 26 11:34:46 media kernel: ata14: EH complete
Aug 26 11:34:52 media kernel: ata14: exception Emask 0x10 SAct 0x0 SErr 0x90002 action 0xe frozen
Aug 26 11:34:52 media kernel: ata14: irq_stat 0x00400000, PHY RDY changed
Aug 26 11:34:52 media kernel: ata14: SError: { RecovComm PHYRdyChg 10B8B }
Aug 26 11:34:52 media kernel: ata14: hard resetting link
Aug 26 11:34:53 media kernel: ata12: exception Emask 0x10 SAct 0x0 SErr 0x10002 action 0xe frozen
Aug 26 11:34:53 media kernel: ata12: irq_stat 0x00400000, PHY RDY changed
Aug 26 11:34:53 media kernel: ata12: SError: { RecovComm PHYRdyChg }
Aug 26 11:34:53 media kernel: ata12: hard resetting link

The SATA error flags you are seeing (RecovComm and PHYRdyChg) are typical of a bad connection, perhaps loose and vibrating, or a poor backplane connection, or perhaps poor or noisy power. You have 2 separate drives with the same issues, connected to channels ata12 and ata14. I understand you've reconnected drives to different ports, so that in itself may have improved all connections. In cases like these, the drives themselves are usually completely fine.

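To see at a glance which channels are affected, as RobJ does by reading the log, one could tally link resets per ATA port. The sketch below is only an illustration: the function name `count_link_resets` is hypothetical, and it assumes the kernel message wording shown in the excerpt above ("ataN: hard resetting link").

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): count "hard resetting link"
# events per ATA port in syslog text read from stdin.
# Output: one "ataN COUNT" line per port, sorted by port name.
count_link_resets() {
    awk '/hard resetting link/ {
             # Find the "ataN:" token on the line and strip its colon.
             for (i = 1; i <= NF; i++)
                 if ($i ~ /^ata[0-9]+:$/) {
                     port = $i
                     sub(/:$/, "", port)
                     n[port]++
                 }
         }
         END { for (p in n) print p, n[p] }' | sort
}

# Example (the syslog path is an assumption about the server's layout):
#   count_link_resets < /var/log/syslog
```

Run against the excerpt quoted in this thread, it would report 2 resets for ata12 and 4 for ata14. A port that keeps resetting after the cables have been swapped would point more toward the drive, the port, or the power feed.
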
garycase Posted August 27, 2014

The drive looks fine on the SMART report. I suspect you could simply assign it to Disk 9 and it would rebuild onto itself just fine ... HOWEVER -- it's safer to do that with a different disk. That way, if anything should go awry with the rebuild, you'll still have the original disk to copy your data off of.

This topic is now archived and is closed to further replies.