garycase Posted August 4, 2013
"Here is a HD that got replaced by WD. Is this one good? Might keep this one as a warm spare will see...."
Looks fine.
niven Posted August 5, 2013
Hello! I just finished preclearing 3 disks, and I'm wondering if anybody could take a look at the results and see if everything is okay. Also, what should you look for in the preclear report when a disk is finished? There is a lot of information there, so what is most important to inspect? I want to learn a little so I don't have to post my results every time, hehe.
Attached: preclear_results.txt
garycase Posted August 5, 2013
Your report looks fine for all 3 disks.
There are 3 files generated for each disk you preclear: the "before" SMART report (preclear_start), the "after" SMART report (preclear_finish), and the summary report (preclear_rpt). I always look at the summary first. The key things to look at are the SMART parameters that have changed and the re-allocated sector summary at the end. If there have been no significant changes in the SMART data, and the re-allocation counters are all zeroes, then the disk looks pretty good. You should then look at the final SMART report to be sure there weren't any attributes that were already failing and simply didn't change (so they wouldn't have been listed in the summary as changed attributes). But in general, if there aren't any notable changes shown in the summary, and you see six zeroes at the end (the pending and re-allocated sector counts), then the disk is fine.
Note that there are a few attributes that can SEEM troublesome but really aren't. Some attributes have a very narrow margin between their normal value and their threshold, so they'll be shown as "near threshold" even though they haven't changed and are fine. You may want to read a bit more about SMART parameters, although I can tell you there is little definitive information available, so you just have to learn what matters and what doesn't. [And that can vary by drive manufacturer.]
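For anyone who would rather script the "six zeroes" check, here is a minimal Python sketch. It is only an illustration: the helper name `summarize_preclear_report` is made up, and the line patterns are assumed from the preclear_rpt wording posted later in this thread (script version 1.13), so other versions may need different patterns.

```python
import re

# Pattern assumed from the preclear_rpt lines quoted in this thread; other
# versions of the script may word these lines differently.
SECTOR_LINE = re.compile(
    r"(\d+) sectors? (?:were|are|had been) (?:pending re-allocation|re-allocated)"
)


def summarize_preclear_report(text):
    """Return (sector_counts, looks_clean) for a preclear summary report.

    Hypothetical helper: looks_clean is True only when the report says no
    attribute is FAILING_NOW and every sector count found is zero.
    """
    counts = [int(m.group(1)) for m in SECTOR_LINE.finditer(text)]
    no_failures = "No SMART attributes are FAILING_NOW" in text
    looks_clean = no_failures and bool(counts) and all(c == 0 for c in counts)
    return counts, looks_clean


# Sample text copied from the end of a report posted in this thread.
SAMPLE = """No SMART attributes are FAILING_NOW
0 sectors were pending re-allocation before the start of the preclear.
0 sectors were pending re-allocation after pre-read in cycle 1 of 1.
0 sectors were pending re-allocation after zero of disk in cycle 1 of 1.
0 sectors are pending re-allocation at the end of the preclear,
    the number of sectors pending re-allocation did not change.
0 sectors had been re-allocated before the start of the preclear.
0 sectors are re-allocated at the end of the preclear,
    the number of sectors re-allocated did not change.
"""

counts, clean = summarize_preclear_report(SAMPLE)
print(counts, clean)
```

This only automates the first pass over the summary. It does not replace reading the final SMART report for attributes that were already bad before the preclear started.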
WorriedAboutDataLoss Posted August 6, 2013
I just ran a preclear (report attached) on an old 2TB disk that I have (pulled out of an external enclosure). I ran just 1 cycle, as I had been using this disk without errors for almost 4 years now. I did notice that a Hardware_ECC_Recovered SMART attribute was reported, which I've not seen when preclearing previous disks. Is this anything to be concerned about? I'm planning to use this as a parity disk until I get a new WD Red 3TB put in. Thanks!
Attached: preclear.txt
garycase Posted August 6, 2013
A high number of ECC recoveries indicates there are data errors that have been corrected via the ECC capability, i.e. one bit was bad, but the error recovery mechanism could correct the data. When this gets high enough to start dropping the SMART value (in your case to 36), it's something you definitely want to watch. The interesting thing is that it WAS at 23, but improved during the pre-clear. It may simply mean the disk needs to be re-written a few times to refresh the sectors.
If you're going to use that for parity, I'd first zero the drive 2 or 3 more times and see if that value continues to improve. You can run "preclear_disk.sh -n /dev/sdx" and it will do JUST the actual pre-clear (zeroing) of the disk; this takes about 1/4 the time of a full cycle. Then look at the SMART report and see how it looks.
WorriedAboutDataLoss Posted August 7, 2013
Thanks Gary! As suggested, I ran another preclear on the disk. The final report (diff) doesn't show the ECC recovered value, so I'm guessing it didn't change from start to finish. I'm attaching all 3 reports here (start, finish and rpt). Any suggestions welcome!!
Attached: preclear_start__2.txt preclear_finish_2.txt preclear_rpt__2.txt
Joe L. Posted August 7, 2013
ALL disks have hardware error corrections; some report them, some do not. It is normal. The important thing to check is whether the "normalized" value is above the associated failure threshold. If it is, the drive is working as expected.
All that said, the current normalized value is 031, the worst value it has been is 006, and the failure threshold is 0. Keep an eye on the disk. If the value continues to drop into the single digits, it might be an indication of a drive nearing the end of its life. (It has been spinning for 16127 hours.)
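The comparison Joe describes can be written down as a small Python sketch. The function name `assess_attribute` and the `WATCH_MARGIN` of 10 are assumptions made for illustration (the margin is loosely based on the "single digits" remark). The failure rule itself, normalized value at or below a non-zero threshold, is the usual SMART convention.

```python
WATCH_MARGIN = 10  # hypothetical margin, not a SMART standard


def assess_attribute(value, worst, threshold):
    """Classify one SMART attribute from its normalized numbers (lower = worse)."""
    # A threshold of 0 means the attribute can never formally fail.
    if threshold > 0 and value <= threshold:
        return "FAILING_NOW"
    if threshold > 0 and worst <= threshold:
        return "failed in the past"
    # Not failing, but the current or worst value has come close to the threshold.
    if min(value, worst) - threshold < WATCH_MARGIN:
        return "watch"
    return "ok"


# Hardware_ECC_Recovered numbers from the post above: VALUE 31, WORST 6, THRESH 0.
print(assess_attribute(31, 6, 0))
```

With the numbers from this drive the sketch returns "watch", which lines up with the advice to keep an eye on the disk rather than replace it immediately.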
WorriedAboutDataLoss Posted August 7, 2013
Thanks Joe! Will keep that in mind. I'll build parity on this drive for now, but will replace it as soon as I receive my new drives and have them precleared. Maybe I'll repurpose this drive to a cache drive after that.
garycase Posted August 7, 2013
"Maybe I'll repurpose this drive to a cache drive after that"
Good plan. Note that Seagate reports a lot more of the "raw" data than most other manufacturers do in their SMART data, so it's not uncommon to see very high raw read error counts and ECC correction counts. As I noted before, and as Joe just noted as well, what you need to watch for more than the raw counts is changes in the "value" (lower = worse), and especially values that are approaching their thresholds.
rd48sec Posted August 17, 2013
I am concerned about the results but not sure if it is failing. Please advise. Thanks so much!
Attached: preclear_start_1.txt preclear_rpt_1.txt preclear_finish_1.txt
RobJ Posted August 17, 2013
"I am concerned about the results but not sure if it is failing. Please advise. Thanks so much!"
You are right to be concerned. RMA it ASAP. The initial SMART report looked fine, but the subsequent one showed that the critical attribute Raw_Read_Error_Rate has bottomed out, far below its threshold value. Send it back with a copy of that last SMART report. The drive is already considered failed according to its SMART system, even if it appears to be working somewhat.
KB36 Posted August 19, 2013
========================================================================1.13
== invoked as: ./preclear_disk.sh -A /dev/sdb
== WDC WD20EARS-00MVWB0 WD-WCAZA3663146
== Disk /dev/sdb has been successfully precleared
== with a starting sector of 64
== Ran 1 cycle
==
== Using :Read block size = 8225280 Bytes
== Last Cycle's Pre Read Time  : 6:27:54 (85 MB/s)
== Last Cycle's Zeroing time   : 6:02:21 (92 MB/s)
== Last Cycle's Post Read Time : 12:19:16 (45 MB/s)
== Last Cycle's Total Time     : 24:50:31
==
== Total Elapsed Time 24:50:31
==
== Disk Start Temperature: 28C
==
== Current Disk Temperature: 27C,
==
============================================================================
** Changed attributes in files: /tmp/smart_start_sdb /tmp/smart_finish_sdb
ATTRIBUTE             NEW_VAL OLD_VAL FAILURE_THRESHOLD STATUS RAW_VALUE
Temperature_Celsius =     123     122                 0 ok     27
No SMART attributes are FAILING_NOW

0 sectors were pending re-allocation before the start of the preclear.
0 sectors were pending re-allocation after pre-read in cycle 1 of 1.
0 sectors were pending re-allocation after zero of disk in cycle 1 of 1.
0 sectors are pending re-allocation at the end of the preclear,
    the number of sectors pending re-allocation did not change.
0 sectors had been re-allocated before the start of the preclear.
0 sectors are re-allocated at the end of the preclear,
    the number of sectors re-allocated did not change.
============================================================================

========================================================================1.13
== invoked as: ./preclear_disk.sh -A /dev/sdc
== WDC WD20EARS-00MVWB0 WD-WCAZA4671510
== Disk /dev/sdc has been successfully precleared
== with a starting sector of 64
== Ran 1 cycle
==
== Using :Read block size = 8225280 Bytes
== Last Cycle's Pre Read Time  : 6:39:54 (83 MB/s)
== Last Cycle's Zeroing time   : 6:05:22 (91 MB/s)
== Last Cycle's Post Read Time : 12:27:12 (44 MB/s)
== Last Cycle's Total Time     : 25:13:27
==
== Total Elapsed Time 25:13:27
==
== Disk Start Temperature: 29C
==
== Current Disk Temperature: 28C,
==
============================================================================
** Changed attributes in files: /tmp/smart_start_sdc /tmp/smart_finish_sdc
ATTRIBUTE             NEW_VAL OLD_VAL FAILURE_THRESHOLD STATUS RAW_VALUE
Temperature_Celsius =     122     121                 0 ok     28
No SMART attributes are FAILING_NOW

0 sectors were pending re-allocation before the start of the preclear.
0 sectors were pending re-allocation after pre-read in cycle 1 of 1.
0 sectors were pending re-allocation after zero of disk in cycle 1 of 1.
0 sectors are pending re-allocation at the end of the preclear,
    the number of sectors pending re-allocation did not change.
0 sectors had been re-allocated before the start of the preclear.
0 sectors are re-allocated at the end of the preclear,
    the number of sectors re-allocated did not change.
============================================================================

========================================================================1.13
== invoked as: ./preclear_disk.sh -A /dev/sdd
== WDC WD20EARX-00PASB0 WD-WMAZA9033613
== Disk /dev/sdd has been successfully precleared
== with a starting sector of 64
== Ran 1 cycle
==
== Using :Read block size = 8225280 Bytes
== Last Cycle's Pre Read Time  : 6:55:21 (80 MB/s)
== Last Cycle's Zeroing time   : 6:02:40 (91 MB/s)
== Last Cycle's Post Read Time : 13:39:26 (40 MB/s)
== Last Cycle's Total Time     : 26:38:26
==
== Total Elapsed Time 26:38:26
==
== Disk Start Temperature: 28C
==
== Current Disk Temperature: 29C,
==
============================================================================
** Changed attributes in files: /tmp/smart_start_sdd /tmp/smart_finish_sdd
ATTRIBUTE             NEW_VAL OLD_VAL FAILURE_THRESHOLD STATUS RAW_VALUE
Seek_Error_Rate     =     100     200                 0 ok     0
Temperature_Celsius =     121     122                 0 ok     29
No SMART attributes are FAILING_NOW

0 sectors were pending re-allocation before the start of the preclear.
0 sectors were pending re-allocation after pre-read in cycle 1 of 1.
0 sectors were pending re-allocation after zero of disk in cycle 1 of 1.
0 sectors are pending re-allocation at the end of the preclear,
    the number of sectors pending re-allocation did not change.
0 sectors had been re-allocated before the start of the preclear.
0 sectors are re-allocated at the end of the preclear,
    the number of sectors re-allocated did not change.
============================================================================

This is as good as it gets?!?
garycase Posted August 19, 2013
Yes, those look very good. No real issues with changing parameters, and all zeroes on the reallocated sector counts. Definitely what you want to see after a pre-clear.
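As a rough sanity check on the speeds in the first report (/dev/sdb), the MB/s figures can be recomputed from the elapsed times. This Python sketch assumes a capacity of 2,000,398,934,016 bytes, which is typical for 2TB drives but is not stated in the reports, and assumes MB means 1,000,000 bytes. The helper names are made up for the example.

```python
ASSUMED_BYTES = 2_000_398_934_016  # assumption: typical 2TB capacity, not from the report


def hms_to_seconds(hms):
    """Convert an 'H:MM:SS' elapsed time string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def throughput_mb_s(total_bytes, hms):
    """Average throughput in MB/s (1 MB = 1,000,000 bytes) over the elapsed time."""
    return total_bytes / hms_to_seconds(hms) / 1_000_000


# Elapsed times taken from the /dev/sdb report above.
phases = [("pre-read", "6:27:54"), ("zeroing", "6:02:21"), ("post-read", "12:19:16")]
for label, elapsed in phases:
    print(label, round(throughput_mb_s(ASSUMED_BYTES, elapsed), 1))
```

Under those assumptions the results land within about 1 MB/s of the reported 85, 92 and 45 MB/s, so the timings and speeds in the report are at least consistent with each other.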
pengrus Posted August 20, 2013
OK, in the category of what you do NOT want to see, I present the following:
I had some significant problems with read speed on this drive when it was living in my MacBook, but it wouldn't fail a SMART test, and I had little time to drill down as much as I wanted, so I cleared everything off and tried to preclear. I thought it would be a good test, and would maybe clear out some bad sectors or whatever. It wouldn't be up to array snuff, but certainly sufficient for a non-array drive or random traveling drive, what-have-you.
The first preclear for this 1TB drive took almost 77 hours. The report is attached below.
I did that back in July, then moved on to bigger and better things, returning last night to try it again after, for some reason, the sectors pending reallocation jumped up to 24. It's just been sitting there not doing anything in a drive tray for a month or so, so I'm not sure how that happened, but perhaps the bits, like the smurfs, decided to all band together and solve their problems. So I cleared it again.
Oops. I don't even know what the hell happened to this thing; I think it actually tried to open the tray and leave the case. I'll let you all decide, but I can't quite make out what the problem is from the syslog, and the preclear report is decidedly less than helpful, as you'll see below!
Of particular fear to me are these warnings about md12...I didn't clear md12!! As it happens, I didn't clear sdh either...
If any kind soul needs further information from me, as always, please let me know.
Thanks for reading, and unless you live in AK or HI, fear not, Monday is almost over! None too soon, by the way, what with all the positivity in here today (what is the emoticon for awkward cringe?).
Pengrus
Attached: Archive.zip
Fireball3 Posted August 20, 2013
I noticed this thread shortly after posting my results here: http://lime-technology.com/forum/index.php?topic=28989.0
Sorry for that, but perhaps you can have a look at it?
RobJ Posted August 20, 2013
"I'll let you all decide, but I can't quite make out what the problem is from the syslog, and the preclear report is decidedly less than helpful, as you'll see below!"
Aug 19 01:02:59 Tower kernel: sd 1:0:0:0: [sdh] command c21f9f00 timed out    (first indication of loss of contact)
...
Aug 19 01:03:19 Tower kernel: ata7.00: disabled    (drive marked as disabled by kernel)
(You can ignore all subsequent errors for this drive.)

Short drive history (SAMSUNG HM100UI, 1TB):
First PreClear started at Jul 7 19:03:31 2013 PDT, Power_On_Hours = 5147.
- Pre Read took about 7 hours (39 MB/s)
- Zeroing took about 53 hours (5 MB/s), indicating probable issues
- Post Read took almost 17 hours (16 MB/s)
- Final SMART report could not be obtained, indicating drive was lost, so Post Read may have aborted
- Later SMART report indicates 24 Pending sectors, which may have been found during this PreClear
SMART short test was run at Power_On_Hours = 5742, indicating possible drive issues at that time, but the short test found no issues.
Second PreClear started at Aug 18 21:03:38 2013 PDT, Power_On_Hours = 6157.
- Pre Read took about 4 hours (69 MB/s), may have aborted (so speed is suspect)
- Zeroing aborted almost immediately, indicating drive was lost
- Post Read did not occur, drive not present
- Final SMART report could not be obtained, indicating drive not present

At Aug 19 01:02:59, the kernel lost communications completely with the drive. It tried repeatedly to recover the drive and to reset the SAS card, but failed, and only 20 seconds later marked the drive as lost ('disabled'). You can completely ignore all subsequent errors from this drive, because from the kernel's point of view, the drive is no longer present. When a drive is lost this way, you cannot conclude where the fault is: it could be the drive, the SAS card, the cable, power issues, the driver, etc. The drive does appear to be having problems, but they may not be related to the loss of communications here. I believe it happened with both PreClears you ran on this drive: during the current Pre Read, and during the Post Read of the earlier PreClear. Part of the confusion here is that PreClear failed to notice that the drive had failed, and proceeded with final reports anyway, basing them on incomplete info.
SMART info on this drive is rather odd. The only 3 attributes marked as critical are either perfect or very good. The Multi_Zone_Error_Rate has bottomed out with a VALUE of 001 and a count of 53348. If the drive had considered that to be a critical item, the drive would be considered FAILED, but the manufacturers seem to have stopped marking new SMART attributes as critical. Calibration_Retry_Count has a VALUE of 046 and a count of 55072, which similarly looks very serious to me, but its SMART controller does not seem too worried about it.
If you still want to use this drive, I would move it to a motherboard SATA port, use a different power cable, and try PreClearing it again. I personally believe you should consider the drive as failed and unreliable, and not use it.
Joe L, I can't help being concerned about PreClear here. It failed to notice that a drive was lost, that a Pre Read had aborted, that a Zeroing failed to start, that a Post Read may have aborted, and that the final SMART report could not be obtained. Could I respectfully request you recheck how error returns are being handled? Perhaps some additional checks for the continuing existence of the drive are warranted.

"Of particular fear to me are these warnings about md12...I didn't clear md12!! As it happens, I didn't clear sdh either..."
Aug 19 01:04:16 Tower kernel: REISERFS error (device md12): vs-7000 search_by_entry_key: search_by_key returned item position == 0
Aug 19 01:04:16 Tower kernel: REISERFS (device md12): Remounting filesystem read-only
This occurred only a minute after the resetting and disabling of the PreClearing drive sdh. I would not normally associate a physical drive issue with a file system issue like this, but it seems too coincidental not to in this case. I also note that the same SAS card has sdh, md12, and md13. Disk 13 did not show any issues. It is possible that the resetting of the SAS card caused the loss of a packet that was being written to Disk 12 (md12). The only other cause I can think of is a strong power spike that affected both sdh and md12, except that sdh had had a similar loss of comm during its earlier PreClear.
As you can see above, once the Reiser file system detected corruption, it remounted the file system as read-only, blocking any further modifications until after rebooting. You will need to run Check Disk File systems on Disk 12.
pengrus Posted August 20, 2013
RobJ,
Thanks very much for the insight! Yeah, this drive is a little wonky. I'd only had it for a few months, running in my MBP, and all of a sudden it started taking FOREVER to read. I was watching a movie with XBMC and it stuttered every few seconds. I couldn't agree more on not using it in the array; I was just hoping that the combined stress of preclearing and the drive's own mechanisms for fixing its problems would make it useful for something besides a coaster. (Now there's an idea: old drives encased in Lexan for drink coasters? Who doesn't need a frosty adult beverage whilst constructing the perfect parity-protected server?!)
I'm running the fsck now, but I do have a few more questions. If the kernel detected a lost packet, why did the 3TB drive (md12) not redball? As you see from the dates, this was a couple of days ago, and the array still showed all green after the issue.
What could, and I know this is probably all guesswork, cause the 1TB drive to "fail" in this manner?
And thirdly, I'd like to second the respectful request to Joe L. to take a look at the preclear script. These don't look like 'PASS'es to me! Don't get me wrong here: I, like probably everyone else on this forum, am indebted to Joe multiple times over for his advice and unmenu and everything else. Just a question/bug (maybe).
Thanks!
Pengrus
p.s. Oh, I forgot one more...is there any way to tell what "command c21f9f00" is?
Joe L. Posted August 21, 2013
"And thirdly, I'd like to second the respectful request to Joe L. to take a look at the preclear script. These don't look like 'PASS'es to me!"
I agree, the preclear script should do better at detecting the complete failure of a drive. It is not easy, however, as the "dd" commands used do not always give clear indications of a failure. (Preclear does not look in the syslog.)
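To illustrate the kind of "is the drive still there?" check RobJ asked about, here is a minimal Python sketch. It is not how preclear_disk.sh works (that is a shell script built around dd); the function name `block_device_present` is made up, and the idea that a wrapper could call it between phases is an assumption. A device node can also linger after the kernel disables a drive, so a check like this would only catch some failures.

```python
import os
import stat


def block_device_present(path):
    """Return True if `path` currently exists and is a block device node."""
    try:
        mode = os.stat(path).st_mode
    except OSError:
        # Covers "No such file or directory" and similar errors.
        return False
    return stat.S_ISBLK(mode)


# Hypothetical usage between phases of a long-running disk job:
for phase in ("pre-read", "zeroing", "post-read"):
    if not block_device_present("/dev/sdX"):  # placeholder device name
        print("device missing before", phase, "- stop and check the syslog")
        break
```

A more thorough check would also try a small read from the device and look at the command's exit status, since a missing or disabled drive usually makes reads fail outright.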
RobJ Posted August 21, 2013
"I'm running the fsck now, but I do have a few more questions. If the kernel detected a lost packet, why did the 3TB drive (md12) not redball? As you see from the dates, this was a couple of days ago, and the array still showed all green after the issue."
There were no lost packets detected, nor any other disk errors. That was just my speculation about a possible linkage: the resetting of the card for recovery of one attached drive somehow affecting the I/O of another attached drive. Resetting should be completely transparent and safe, and it *was* transparent in that operations continued without issue to the other drives (except for a small delay). But in this case, something caused file system corruption at essentially the same time as the resetting and disabling of the other attached drive. It seems linked to me, but I don't have an actual error or fault to point to, just a bit of speculation.
"What could, and I know this is probably all guesswork, cause the 1TB drive to "fail" in this manner?"
It's usually hard to say for sure. The SMART wiki page says about Multi-Zone Error Rate: "The count of errors found when writing a sector. The higher the value, the worse the disk's mechanical condition is." That is, it associates a higher Multi-Zone Error Rate with deteriorating mechanical condition. I would not be surprised if the drive sounds noisier than the others.
"p.s. Oh, I forgot one more...is there any way to tell what "command c21f9f00" is?"
That's part of the exception handler for drive issues, and that's probably a local address within it. If you really want to know, you'll have to research that yourself, somewhere within the Linux kernel source code, possibly libata.c or wherever the ATA exception handling code is.
cyan Posted August 22, 2013
I just precleared 2 x RED 3TB.
The first HDD finished its 1st cycle in 36 hours and its 2nd cycle in 37 hours.
The second HDD finished its 1st cycle in 41 hours and its 2nd cycle in 42 hours.
Both show no errors (0 in Reallocated_Sector_Ct and 0 in Current_Pending_Sector).
How come the second HDD takes 4-5 hours longer? Should I worry and run a 3rd cycle?
BobPhoenix Posted August 22, 2013
"How come the second HDD takes 4-5 hours longer? Should I worry and run a 3rd cycle?"
Drives vary. I have some that are slow as well, including a Seagate drive that can only do 35MB/s read and write as a standalone drive. Nothing shows on the SMART reports for it, and since it came with my N40L I didn't think I would be able to return it. If you feel it is too slow, you can return it to WD as a performance problem; at least, that option was still available the last time I returned a drive to WD.
Darts Posted August 22, 2013
Hello! I'm currently preclearing a new WD RED 3TB disk (sdg). It all went fine until the last step (post read). At around 70% completion, the speed went down to around 300 KB/sec and I noticed the following errors in the syslog (I'm using screen):
Aug 22 09:49:39 Alpha kernel: Buffer I/O error on device sdg, logical block 557200641
Aug 22 09:49:39 Alpha kernel: Buffer I/O error on device sdg, logical block 557200642
Aug 22 09:49:39 Alpha kernel: Buffer I/O error on device sdg, logical block 557200643
Aug 22 09:49:39 Alpha kernel: Buffer I/O error on device sdg, logical block 557200644
Aug 22 09:49:39 Alpha kernel: Buffer I/O error on device sdg, logical block 557200645
Aug 22 09:49:39 Alpha kernel: Buffer I/O error on device sdg, logical block 557200646
Aug 22 09:49:39 Alpha kernel: Buffer I/O error on device sdg, logical block 557200647
Aug 22 09:49:39 Alpha kernel: Buffer I/O error on device sdg, logical block 557200648
Aug 22 09:49:39 Alpha kernel: Buffer I/O error on device sdg, logical block 557200649
Aug 22 09:49:39 Alpha kernel: Buffer I/O error on device sdg, logical block 557200650
Aug 22 09:49:39 Alpha kernel: ata6: EH complete
Aug 22 09:50:08 Alpha kernel: ata6.00: exception Emask 0x0 SAct 0x3 SErr 0x0 action 0x0
Aug 22 09:50:08 Alpha kernel: ata6.00: irq_stat 0x40000008
Aug 22 09:50:08 Alpha kernel: ata6.00: failed command: READ FPDMA QUEUED
Aug 22 09:50:08 Alpha kernel: ata6.00: cmd 60/00:00:98:ad:b1/01:00:09:01:00/40 tag 0 ncq 131072 in
Aug 22 09:50:08 Alpha kernel: res 41/40:00:68:ae:b1/00:00:09:01:00/40 Emask 0x409 (media error) <F>
Aug 22 09:50:08 Alpha kernel: ata6.00: status: { DRDY ERR }
Aug 22 09:50:08 Alpha kernel: ata6.00: error: { UNC }
Aug 22 09:50:08 Alpha kernel: ata6.00: configured for UDMA/133
Aug 22 09:50:08 Alpha kernel: sd 6:0:0:0: [sdg] Unhandled sense code
Aug 22 09:50:08 Alpha kernel: sd 6:0:0:0: [sdg]
Aug 22 09:50:08 Alpha kernel: Result: hostbyte=0x00 driverbyte=0x08
Aug 22 09:50:08 Alpha kernel: sd 6:0:0:0: [sdg]
Aug 22 09:50:08 Alpha kernel: Sense Key : 0x3 [current] [descriptor]
Aug 22 09:50:08 Alpha kernel: Descriptor sense data with sense descriptors (in hex):
Aug 22 09:50:08 Alpha kernel: 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 01
Aug 22 09:50:08 Alpha kernel: 09 b1 ae 68
Aug 22 09:50:08 Alpha kernel: sd 6:0:0:0: [sdg]
Aug 22 09:50:08 Alpha kernel: ASC=0x11 ASCQ=0x4
Aug 22 09:50:08 Alpha kernel: sd 6:0:0:0: [sdg] CDB:
Aug 22 09:50:08 Alpha kernel: cdb[0]=0x88: 88 00 00 00 00 01 09 b1 ad 98 00 00 01 00 00 00
Aug 22 09:50:08 Alpha kernel: end_request: I/O error, dev sdg, sector 4457606760
Aug 22 09:50:08 Alpha kernel: quiet_error: 8 callbacks suppressed

Could you please let me know if I'm facing a DOA disk and need to RMA it? Or is it possible to "save" the current state of preclearing and restart only the post-read?
FYI, memory seems OK:
             total       used       free     shared    buffers     cached
Mem:       2071540    1811684     259856          0     634120    1104488
Low:        889388     686068     203320
High:      1182152    1125616      56536
-/+ buffers/cache:      73076    1998464
Swap:            0          0          0
Any insight would be helpful, thanks.
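The hex bytes in that syslog excerpt can be decoded by hand, and doing so shows the log is self-consistent. This Python sketch assumes the standard layouts: in descriptor-format sense data the last 8 bytes of the information descriptor hold the failing LBA, and in a READ(16) CDB (opcode 0x88) bytes 2-9 are the starting LBA and bytes 10-13 the transfer length. The helper name `int_from_hex_bytes` is made up for the example.

```python
def int_from_hex_bytes(hex_bytes):
    """Interpret space-separated hex bytes as one big-endian integer."""
    return int(hex_bytes.replace(" ", ""), 16)


# Last 8 bytes of the sense data: the LBA the drive could not read.
failed_lba = int_from_hex_bytes("00 00 00 01 09 b1 ae 68")

# READ(16) CDB from the log: 88 00 | 8-byte LBA | 4-byte length | 00 00
first_lba = int_from_hex_bytes("00 00 00 01 09 b1 ad 98")
sector_count = int_from_hex_bytes("00 00 01 00")

print(failed_lba)              # 4457606760, same as the end_request line
print(failed_lba - first_lba)  # 208 sectors into the request
print(sector_count * 512)      # 131072 bytes, same as "ncq 131072 in"
```

So the kernel asked for 256 sectors starting at LBA 4457606552 and the drive reported an unreadable sector 208 sectors into that request. Together with "media error" and "UNC" in the log, that points at the disk surface rather than at memory.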
itimpi Posted August 22, 2013
It looks as though that disk may have dropped offline. This could be a problem with the disk, but it may be something else. When (if) you get the disk back online, you can run a smartctl command to check the SMART information. I would also carefully check all cabling to see that nothing has worked its way loose.
Darts Posted August 22, 2013
Hello itimpi,
Thank you for helping me on this one.
Indeed, the disk seems to have dropped offline:
Smartctl open device: sdg failed: No such device
Any way I can bring it back online?
EDIT >> My bad, here's the SMART short report (thanks unMenu):
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   134   134   051    Pre-fail  Always       -       70899
  3 Spin_Up_Time            0x0027   100   253   021    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       3
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       44
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       3
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       2
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0022   118   114   000    Old_age   Always       -       32
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

Seems good, no?
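A table like the one above is easy to read by machine as well. The following Python sketch is an illustration only: the function names are made up, it assumes the 10-column layout shown in this post, and it assumes the raw values of the three sector counters are plain integers (true here, but not guaranteed for every drive).

```python
def parse_smart_attributes(text):
    """Parse rows of a smartctl-style attribute table into a list of dicts."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) >= 10 and fields[0].isdigit():
            rows.append({
                "id": int(fields[0]),
                "name": fields[1],
                "value": int(fields[3]),
                "worst": int(fields[4]),
                "thresh": int(fields[5]),
                "raw": fields[9],
            })
    return rows


def sector_counters(rows):
    """Return the raw counts of the attributes that track bad sectors."""
    watched = {"Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable"}
    return {r["name"]: int(r["raw"]) for r in rows if r["name"] in watched}


# A few rows copied from the report above.
SAMPLE = """ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   134   134   051    Pre-fail  Always       -       70899
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
"""

rows = parse_smart_attributes(SAMPLE)
print(sector_counters(rows))
print([r["name"] for r in rows if r["thresh"] > 0 and r["value"] <= r["thresh"]])
```

For this drive all three sector counters are zero and no attribute sits at or below its threshold.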
garycase Posted August 22, 2013
The SMART values all look fine. What does the PreClear report look like? [It will be in the preclear_reports folder on your flash drive.]