November 28, 2016
This drive trashed?

I haven't touched the cables in months; I checked, and everything looks to be plugged in fine. I tried running smartctl from the command line and got the same error. I have extra space on the server, so I thought I'd use the "remove drive, rebuild parity" method until I can get another hard drive in to replace this one in a week or so.

smartctl 6.5 2016-05-07 r4318 [x86_64-linux-4.4.30-unRAID] (local build)
Copyright © 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               /2:0:3:0
Product:
Compliance:           SPC-5
User Capacity:        600,332,565,813,390,450 bytes [600 PB]
Logical block size:   774843950 bytes
Physical block size:  1903784304 bytes
Lowest aligned LBA:   14896
scsiModePageOffset: response length too short, resp_len=47 offset=50 bd_len=46
scsiModePageOffset: response length too short, resp_len=47 offset=50 bd_len=46
>> Terminate command early due to bad response to IEC mode page
A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.
November 28, 2016
The controller has lost contact with it. Power down and check cables, then power back up and try again to get a SMART report.
November 28, 2016 (Author)
Well, I followed the "rebuild parity" method, and lost all the data on that one drive. Lame!
November 28, 2016 (Community Expert)
Did you read this warning in the instructions?

"This method does not keep the drive's data within the array. If the drive to be removed has data you want to stay in the array, you must move it yourself to the other data drives. Parity will be built based entirely and only on the remaining drives and their contents."
November 28, 2016 (Author)
Oh boy, there it is, just below where I was paying close attention. One of those things you really should pay close attention to! I think I'll do some wiki editing here...
November 28, 2016
I'd think it's fairly obvious that if you do a New Config that does NOT include one of your previous drives, the data from that drive isn't going to be in the array. Also, since you're doing a New Config, you're clearly losing any parity information that would have allowed you to emulate the contents of a failed drive.

Re the specific drive you're referring to here: Was it shown as "Disabled" on your array? If so, you COULD (can't any more) have simply copied the contents of it to another location -- the drive would have been emulated using the other drives plus parity to reconstruct the information. That, of course, is no longer possible.

However ... if it was NOT shown as disabled, it MAY still be readable. Connect it to a SATA controller, and see if you can access it via the Unassigned Devices plugin. You might be able to read the data on it => if that's the case, you can simply copy it all to your array.

In the future, however, if you have questions about how to proceed, you might want to wait for some responses before trying anything. A bit of dialogue about just what the issue is would likely have prevented you from making the mistake of eliminating the ability to emulate the drive.
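To illustrate the emulation described in the post above, here is a minimal sketch. It assumes simple single-drive XOR parity and shrinks each drive to a hypothetical 4-byte block; the names `disk1`, `disk2`, `disk3` and the helper `xor_blocks` are made up for the example and are not unRAID internals.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data drives, each reduced to a tiny 4-byte block.
disk1 = b"\x10\x20\x30\x40"
disk2 = b"\x0a\x0b\x0c\x0d"
disk3 = b"\xff\x00\xff\x00"

# Parity as it stood while all three drives were in the array.
parity = xor_blocks([disk1, disk2, disk3])

# disk2 drops out: its contents can be emulated from the survivors plus parity.
emulated_disk2 = xor_blocks([disk1, disk3, parity])

# After a New Config without disk2, parity is rebuilt from the remaining
# drives only, so it no longer carries any information about disk2.
new_parity = xor_blocks([disk1, disk3])
after_new_config = xor_blocks([disk1, disk3, new_parity])

print(emulated_disk2 == disk2)    # emulation works with the old parity
print(after_new_config == disk2)  # but not with the rebuilt parity
```

With the old parity the missing block comes back exactly; with the rebuilt parity the same calculation yields only zeros, which is why the data has to be copied off (or the disk rebuilt) before the New Config.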
November 28, 2016 (Author)
Thank you @garycase and others. I've mounted the drive via the Unassigned Devices plugin, and am copying files back to the array (I turned off "use cache drive" for all affected shares). I will work on updating the Shrink Array wiki also.

Here is the current SMART report from the drive in question. Does it look like a goner? The parts that jump out in yellow:

197 Current pending sector  0x0032  198  198  000  Old age  Always   Never  800
198 Offline uncorrectable   0x0030  200  200  000  Old age  Offline  Never  319

Full report:

smartctl 6.5 2016-05-07 r4318 [x86_64-linux-4.4.30-unRAID] (local build)
Copyright © 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Caviar Green (AF)
Device Model:     WDC WD20EARS-00MVWB0
Serial Number:    WD-WMAZA4627276
LU WWN Device Id: 5 0014ee 002c1cc62
Firmware Version: 51.0AB51
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS (minor revision not indicated)
SATA Version is:  SATA 2.6, 3.0 Gb/s
Local Time is:    Mon Nov 28 12:31:48 2016 PST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84) Offline data collection activity
                                        was suspended by an interrupting command from host.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (38460) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 371) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x3035) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       33
  3 Spin_Up_Time            0x0027   171   166   021    Pre-fail  Always       -       6441
  4 Start_Stop_Count        0x0032   087   087   000    Old_age   Always       -       13151
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   043   043   000    Old_age   Always       -       42056
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       144
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       57
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       302820
194 Temperature_Celsius     0x0022   125   101   000    Old_age   Always       -       25
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   198   198   000    Old_age   Always       -       800
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       319
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   174   174   000    Old_age   Offline      -       7188

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     42056         -
# 2  Short offline       Completed without error       00%     34732         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
November 28, 2016
800 pending sectors ==> Definitely a goner.

But the good news is that you can apparently read at least most of the data -- so you'll be able to get it back into your array.
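The reasoning behind that verdict can be sketched in a few lines of Python. This is only an illustration: the three rows are copied from the attribute table in the report, while the `WATCHED` set and the "any nonzero raw value is worth a look" rule are assumptions made for the example, not a smartmontools or unRAID policy.

```python
# Rows copied from the attribute table in the SMART report.
SMART_ROWS = """\
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   198   198   000    Old_age   Always       -       800
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       319
"""

# Assumption for this sketch: these are the attributes to watch.
WATCHED = {"Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable"}


def parse_attributes(text):
    """Map attribute name -> raw value for each well-formed table row."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # A data row has ten columns and starts with the numeric attribute ID.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attrs[fields[1]] = int(fields[9])
    return attrs


def worrying(attrs):
    """Return the watched attributes whose raw value is above zero."""
    return {name: raw for name, raw in attrs.items() if name in WATCHED and raw > 0}


print(worrying(parse_attributes(SMART_ROWS)))
```

Note that the report still says PASSED: the overall-health result compares the normalized VALUE column against THRESH, and 198 is well above 000, so the raw counts are what give the drive away here.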
November 28, 2016 (Community Expert)
We welcome people who will help with the wiki, and it's probably true that some of it makes more sense if you already understand unRAID. But be careful you don't edit the wiki and create misunderstandings of unRAID. Many things about unRAID make sense if you understand how parity works and is used by unRAID. The mistake you made would have been less likely if you had that understanding.
November 28, 2016 (Author)
True, BUT at the top of the instructions it said "without losing data" and then it gave the instructions, which were then followed by "this method loses data," or something to that effect. I'm making some corrections there now. I welcome help, and in fact, I have some questions about how best to do stuff - I will split that off into a new post. Thanks folks.
November 28, 2016
Looking at the Wiki article you referenced, it seems the only thing that needs to be done is to move the 5th line in the caveats before the actual procedure to the top of that list. Granted, it SHOULDN'T make a difference -- it's reasonable to assume folks will read ALL of the caveats before proceeding ... but clearly in this case you skipped over that last caveat.

Note, by the way, that it does NOT say "... this method loses data ..." ==> what it notes is that it "... does not keep the drive's data within the array ..." ==> there's a BIG difference between those two statements.

... and, of course, it DID "preserve the contents of the data drive you are removing from the array" -- exactly as advertised. In fact, you're copying those contents back to the array right now (or may have already finished doing so).
November 28, 2016 (Community Expert)
Also note that the "standard" way of dealing with this situation would have been to rebuild to a new data disk using parity. Just what unRAID was meant to do.
November 28, 2016 (Author)
Yes, all points well taken. I will revert the wiki, and just make that item stand out a little more. I'll think about it in a few days. Thanks for the help again.
This topic is now archived and is closed to further replies.